This page pertains to UD version 2.

UD Coptic Scriptorium

Language: Coptic (code: cop)
Family: Afro-Asiatic, Egyptian

This treebank has been part of Universal Dependencies since the UD v1.4 release.

The following people have contributed to making this treebank part of UD: Mitchell Abrams, Elizabeth Davidson, Amir Zeldes.

Repository: UD_Coptic-Scriptorium
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.14

License: CC BY 4.0

Genre: bible, fiction, nonfiction

Questions, comments? General annotation questions (either Coptic-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [amir • zeldes (æt) georgetown • edu]. Development of the treebank happens outside the UD repository, so a bug must be fixed in either the original data source or the conversion procedure. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas annotated manually
UPOS annotated manually in non-UD style, automatically converted to UD
XPOS annotated manually
Features assigned by a program, not checked manually
Relations annotated manually, natively in UD style
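
UD treebanks are distributed in the CoNLL-U format, so the annotation layers listed above correspond to columns of each token line (LEMMA, UPOS, XPOS, FEATS, DEPREL). The following is a minimal sketch of reading those columns in Python. The helper name `parse_conllu` and the sample tokens are illustrative assumptions: the forms and lemmas are invented placeholders, not actual Coptic data, and the XPOS and FEATS columns are left empty rather than guessed.

```python
# The ten column names defined by the CoNLL-U format.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Return a list of sentences; each sentence is a list of token dicts.

    Comment lines (starting with '#') are skipped. Multiword-token range
    lines and empty-node lines are kept as ordinary rows in this sketch.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        current.append(dict(zip(CONLLU_FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical two-token sentence; placeholder forms, not real treebank data.
sample_rows = [
    ["1", "tok1", "lem1", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "tok2", "lem2", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = ("# sent_id = hypothetical-1\n"
          + "\n".join("\t".join(row) for row in sample_rows)
          + "\n\n")

sentences = parse_conllu(sample)
for tok in sentences[0]:
    print(tok["form"], tok["lemma"], tok["upos"], tok["deprel"])
```

To run this on the treebank itself, the `sample` string would be replaced by the contents of one of the `.conllu` files in the repository.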


UD Coptic contains manually annotated Sahidic Coptic texts, including Biblical texts, sermons, letters, and hagiography.

The Coptic Universal Dependency Treebank is a manually annotated corpus of Sahidic Coptic texts, currently containing excerpts from the Sahidic New Testament Gospel of Mark, works by Archimandrite Shenoute of Atripe, the Letters of Besa, the lives of Sts. Cyrus and Onnophrius, the Epistle of Pseudo-Ephrem, the Dormition of John the Apostle, and short stories from the Apophthegmata Patrum (Sayings of the Desert Fathers). Detailed information about the treebank is available here:


The data was either newly digitized or already available in digital format, and was annotated manually for part of speech within the Coptic Scriptorium project. For individual credit and further information see:


Coptic POS tags come from the Coptic Scriptorium tag set, which is available from the project and treebank websites.


The underlying POS-tagged material was produced as part of the projects Coptic Scriptorium, KOMeT and KELLIA, funded by the NEH in the USA and by the BMBF and DFG in Germany (see http://copticscriptorium.org/ for more details). Treebank annotation was done mainly by Mitchell Abrams, Liz Davidson and Amir Zeldes. Thanks are also due to Israel Avrahamy, Asael Benyami, Yinon Kahan and Oran Szachter for their contributions.

Statistics of UD Coptic Scriptorium

POS Tags






Tokenization and Word Segmentation



Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features


Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
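
The restriction stated above can be sketched as a simple filter over dependency edges: keep a relation only when the child is a noun or pronoun and its head is a verb. This is an illustrative sketch, not the script used to produce the official statistics. The function name `verb_nominal_relations`, the choice of `VERB` as the parent tag, the choice of `NOUN`, `PROPN` and `PRON` as child tags, and the toy sentence are all assumptions.

```python
from collections import Counter

# Assumed UPOS sets; the exact tags used for the statistics are not stated.
VERB_TAGS = {"VERB"}
NOMINAL_TAGS = {"NOUN", "PROPN", "PRON"}


def verb_nominal_relations(sentence):
    """Count relations whose parent is a verb and whose child is nominal.

    `sentence` is a list of (id, upos, head, deprel) tuples with string ids,
    as they would appear in the corresponding CoNLL-U columns.
    """
    upos_by_id = {tok_id: upos for tok_id, upos, _, _ in sentence}
    counts = Counter()
    for tok_id, upos, head, deprel in sentence:
        # The root has head "0", which is not in the map, so it is skipped.
        if upos in NOMINAL_TAGS and upos_by_id.get(head) in VERB_TAGS:
            counts[deprel] += 1
    return counts


# Hypothetical sentence structure; not taken from the treebank.
toy = [
    ("1", "PRON", "2", "nsubj"),
    ("2", "VERB", "0", "root"),
    ("3", "DET", "4", "det"),
    ("4", "NOUN", "2", "obj"),
    ("5", "ADP", "6", "case"),
    ("6", "NOUN", "4", "nmod"),
]
relation_counts = verb_nominal_relations(toy)
print(dict(relation_counts))
```

In the toy sentence, the `nmod` edge is excluded because its parent is a noun, and the `det` edge is excluded because its child is a determiner.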

Relations Overview