
This page pertains to UD version 2.

UD Ancient Hebrew PTNK

Language: Ancient Hebrew (code: hbo)
Family: Afro-Asiatic, Semitic

This treebank has been part of Universal Dependencies since the UD v2.10 release.

The following people have contributed to making this treebank part of UD: Daniel Swanson.

Repository: UD_Ancient_Hebrew-PTNK
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.14

License: CC BY-NC 4.0

Genre: bible

Questions, comments? General annotation questions (either Ancient Hebrew-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [awesomeevildudes (æt) gmail • com]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas: annotated manually in non-UD style, automatically converted to UD
UPOS: annotated manually in non-UD style, automatically converted to UD
XPOS: annotated manually
Features: annotated manually in non-UD style, automatically converted to UD
Relations: annotated manually, natively in UD style


UD Ancient Hebrew PTNK contains portions of the Biblia Hebraica Stuttgartensia with morphological annotations from ETCBC.

This treebank contains portions of the Hebrew Bible as digitized and annotated in the Biblia Hebraica Stuttgartensia (Amstelodamensis) by the Eep Talstra Centre for Bible and Computer at Vrije Universiteit Amsterdam. Those annotations are licensed under Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

The corpus can be found at github.com/etcbc/bhsa. The dependency annotations were generated using VISL CG-3 and manually verified by Daniel Swanson. The code for generating them can be found at https://github.com/mr-martian/hbo-UD. Errors in the data should be reported to that repository.



title = "A {U}niversal {D}ependencies Treebank of {A}ncient {H}ebrew",
author = "Swanson, Daniel and
Tyers, Francis",
booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
month = jun,
year = "2022",
address = "Marseille, France",
publisher = "European Language Resources Association",
url = "https://aclanthology.org/2022.lrec-1.252",
pages = "2353--2361",
abstract = "In this paper we present the initial construction of a Universal Dependencies treebank with morphological annotations of Ancient Hebrew containing portions of the Hebrew Scriptures (1579 sentences, 27K tokens) for use in comparative study with ancient translations and for analysis of the development of Hebrew syntax. We construct this treebank by applying a rule-based parser (300 rules) to an existing morphologically-annotated corpus with minimal constituency structure and manually verifying the output and present the results of this semi-automated annotation process and some of the annotation decisions made in the process of applying the UD guidelines to a new language.",

Statistics of UD Ancient Hebrew PTNK

POS Tags






Tokenization and Word Segmentation



Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features


Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
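
The restriction above can be illustrated with a minimal sketch that counts dependency relations in CoNLL-U data (the standard UD file format) whose parent is a VERB and whose child is a noun or pronoun. This is not the script used to produce the official statistics. The function names are hypothetical, the sample tokens are made-up placeholders rather than treebank data, and treating PROPN as a noun is an assumption.

```python
from collections import Counter

# Made-up CoNLL-U fragment with placeholder forms (not real treebank data).
SAMPLE = """\
# sent_id = demo-1
1\tw1\tl1\tVERB\t_\t_\t0\troot\t_\t_
2\tw2\tl2\tNOUN\t_\t_\t1\tnsubj\t_\t_
3\tw3\tl3\tADP\t_\t_\t4\tcase\t_\t_
4\tw4\tl4\tPRON\t_\t_\t1\tobl\t_\t_
5\tw5\tl5\tADJ\t_\t_\t2\tamod\t_\t_

# sent_id = demo-2
1\tw1\tl1\tVERB\t_\t_\t0\troot\t_\t_
2\tw2\tl2\tPROPN\t_\t_\t1\tobj\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if rows:
                yield rows
                rows = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep plain tokens only; skip multiword ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                rows.append(cols)
    if rows:
        yield rows


def verb_argument_relations(text, child_tags=("NOUN", "PROPN", "PRON")):
    """Count DEPREL values for verb (parent) -> noun/pronoun (child) pairs.

    Including PROPN among the child tags is an assumption of this sketch.
    """
    counts = Counter()
    for sent in read_sentences(text):
        upos_by_id = {row[0]: row[3] for row in sent}
        for row in sent:
            head, deprel = row[6], row[7]
            if row[3] in child_tags and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
    return counts


counts = verb_argument_relations(SAMPLE)
for deprel, n in sorted(counts.items()):
    print(deprel, n)
```

To apply the same count to the treebank, read one of its `.conllu` files as UTF-8 text and pass the contents to `verb_argument_relations`.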

Relations Overview