This page pertains to UD version 2.

UD for Ancient Hebrew

Tokenization and Word Segmentation

No tokens in the Ancient Hebrew treebank should contain whitespace. The following are made into separate tokens:

Prepositions (ב, כ, ל, מ)
Possessive and object pronouns (ני, נו, ו, ם, …)
- The corresponding independent pronoun is used as the lemma
Conjunction ו
Definite determiner ה
- This includes ה when it appears as demonstrative agreement on adjectives, participles, and demonstrative determiners
- Since the text includes vowel diacritics, ה is included as a token even when it does not correspond to a full character in the consonantal text.
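
The segmentation rules above can be illustrated with a minimal sketch. It assumes the treebank records each split with the standard CoNLL-U multiword-token convention, where a range line such as `1-2` carries the surface form and is followed by one line per syntactic word. The sample rows are hand-built for illustration (unvocalized, most columns left as `_`) and are not copied from the treebank; `group_words` is a hypothetical helper name.

```python
def group_words(conllu_lines):
    """Return (surface_form, [word_forms]) pairs for one sentence."""
    groups = []
    span_end = 0  # last word ID covered by the most recent range line
    for line in conllu_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        tok_id, form = cols[0], cols[1]
        if "-" in tok_id:
            # Range line: one surface token covering several syntactic words.
            span_end = int(tok_id.split("-")[1])
            groups.append((form, []))
        elif "." in tok_id:
            continue  # empty nodes are not part of the surface text
        elif int(tok_id) <= span_end:
            groups[-1][1].append(form)  # word inside the current range
        else:
            groups.append((form, [form]))  # unsegmented token
    return groups


# Illustrative rows only: a prefixed preposition split from its noun.
sample = [
    "1-2\tבראשית\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tב\t_\tADP\t_\t_\t_\t_\t_\t_",
    "2\tראשית\t_\tNOUN\t_\t_\t_\t_\t_\t_",
    "3\tברא\t_\tVERB\t_\t_\t_\t_\t_\t_",
]
groups = group_words(sample)
```

Each pair keeps the surface token next to the words it was split into, which makes it easy to check that a prefixed preposition, conjunction, or determiner ended up as its own word.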

Morphology

Tags

All tags are used except X and SYM. AUX is used for the copula היה.

The positive and negative existentials ישׁ and אין are tagged VERB.

Participles are tagged either VERB or NOUN. A participle that has arguments or obliques is tagged VERB. A participle without them is tagged NOUN if it participates in a nominal phrase.

The correspondences between XPOS (BHSA feature sp) and UPOS are listed below. Rows prefixed with → indicate that the part of speech tag’s correspondence is conditioned by the BHSA lexical set feature.

BHSA tag	BHSA name	UPOS	Notes
`adjv`	adjective	ADJ	Also NOUN in certain situations
→ `ordn`	ordinal	NUM
`advb`	adverb	ADV
`art`	article	DET	Also SCONJ in certain situations
`conj`	conjunction	CCONJ or SCONJ
`inrg`	interrogative particle	ADV or PART
`intj`	interjection	INTJ
`nega`	negative particle	ADV
`nmpr`	proper noun	PROPN
`prde`	demonstrative pronoun	PRON
`prep`	preposition	ADP
`prin`	interrogative pronoun	PRON
`prn`	pronominal suffix	PRON	Tag added in conversion process
`prps`	personal pronoun	PRON
`punct`	punctuation	PUNCT	Tag added in conversion process
`subs`	noun	NOUN
→ `card`	cardinal	NUM
→ `nmcp`	copulative noun	VERB	These are the existential verbs
→ `padv`	potential adverb	ADV	Sometimes
→ `ppre`	potential preposition	ADP	Sometimes
`verb`	verb	VERB	Also NOUN in certain situations
→ `vbcp`	copulative verb	AUX
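
As a rough sketch, the table above can be encoded as a lookup that returns the UPOS candidates for a BHSA part of speech and an optional lexical set. Two things here are assumptions rather than statements from the table: each → row is read as a refinement of the nearest preceding unprefixed row (so `ordn` refines `adjv`, and `card`, `nmcp`, `padv`, `ppre` refine `subs`), and the rows marked "Sometimes" keep the parent tag as a fallback. The function name `candidate_upos` and its parameter names are hypothetical; choosing among several candidates still depends on context the table does not spell out.

```python
# UPOS candidates per BHSA `sp` value, transcribed from the table above.
SP_TO_UPOS = {
    "adjv": ("ADJ", "NOUN"),
    "advb": ("ADV",),
    "art": ("DET", "SCONJ"),
    "conj": ("CCONJ", "SCONJ"),
    "inrg": ("ADV", "PART"),
    "intj": ("INTJ",),
    "nega": ("ADV",),
    "nmpr": ("PROPN",),
    "prde": ("PRON",),
    "prep": ("ADP",),
    "prin": ("PRON",),
    "prn": ("PRON",),
    "prps": ("PRON",),
    "punct": ("PUNCT",),
    "subs": ("NOUN",),
    "verb": ("VERB", "NOUN"),
}

# Arrow rows, keyed by (sp, lexical set). The pairing with a parent `sp`
# is an assumption based on row order in the table.
LS_OVERRIDES = {
    ("adjv", "ordn"): ("NUM",),
    ("subs", "card"): ("NUM",),
    ("subs", "nmcp"): ("VERB",),
    ("subs", "padv"): ("ADV", "NOUN"),  # "Sometimes": NOUN kept as fallback (assumed)
    ("subs", "ppre"): ("ADP", "NOUN"),  # "Sometimes": NOUN kept as fallback (assumed)
    ("verb", "vbcp"): ("AUX",),
}


def candidate_upos(sp, ls=None):
    """Return the tuple of UPOS tags the table allows for this sp/ls pair."""
    return LS_OVERRIDES.get((sp, ls), SP_TO_UPOS[sp])
```

For example, `candidate_upos("subs", "card")` yields `("NUM",)`, while `candidate_upos("conj")` yields both `CCONJ` and `SCONJ` because the table does not condition that choice on a lexical set.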

Features

The following universal features are in use:

The following language-specific features are in use:

HebBinyan: VERB (AUX, NOUN)

The following MISC features are present:

Cantillation
- The names of any cantillation marks that appear on a word, e.g. Cantillation=Etnahta
- When multiple marks appear on a word, they are separated with commas and listed in the order they appear, e.g. Cantillation=Pazer,Geresh
- The names follow the spellings used by Unicode
Gloss
- Currently taken from the BHSA gloss feature
LexDomain[SDBH]
- ID of the semantic domain(s) corresponding to the value of LId[SDBH]
LId[SDBH]
- ID of the (mostly) disambiguated word in MARBLE’s Semantic Dictionary of Biblical Hebrew
LId[Strongs]
- Number of the word root in Strong’s Concordance
- The values come from the MACULA corpus, which assigns non-numeric values to function words (conjunctions, prefixed prepositions) that are not listed in the original concordance
Ref
- The values are formatted as BOOK_CHAPTER.VERSE, e.g. GEN_1.1
- The book abbreviations are listed below
Ref[BHSA]
- The numeric ID of the word in the BHSA corpus
Ref[MACULA]
- The ID of the word in the MACULA corpus
SpaceAfter=No
Translit
- The value of this field follows the Library of Congress romanization standard
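
A minimal sketch of reading these MISC features, assuming the standard CoNLL-U encoding of the column as `Key=Value` pairs joined by `|`, and assuming no value itself contains `|`. The input string is assembled from the example values quoted in the list above and is not a real treebank row; `parse_misc` is a hypothetical helper name.

```python
def parse_misc(misc):
    """Split a CoNLL-U MISC string into a dict; Cantillation becomes a list."""
    fields = {}
    if misc == "_":
        return fields  # "_" marks an empty MISC column
    for item in misc.split("|"):
        # Split on the first "=" only, so keys such as LId[SDBH] stay intact.
        key, _, value = item.partition("=")
        fields[key] = value
    if "Cantillation" in fields:
        # Multiple marks are comma-separated, in order of appearance.
        fields["Cantillation"] = fields["Cantillation"].split(",")
    return fields


example = parse_misc("Cantillation=Pazer,Geresh|Ref=GEN_1.1|SpaceAfter=No")
```

Keeping `Cantillation` as an ordered list preserves the order in which the marks appear on the word.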

Book	`Ref` Abbreviation
Genesis	`GEN`
Exodus	`EXOD`
Leviticus	`LEV`
Numbers	`NUM`
Deuteronomy	`DEUT`
Joshua	`JOSH`
Judges	`JUDG`
Ruth	`RUTH`
1 Samuel	`1SAM`
2 Samuel	`2SAM`
1 Kings	`1KGS`
2 Kings	`2KGS`
1 Chronicles	`1CHR`
2 Chronicles	`2CHR`
Ezra	`EZRA`
Nehemiah	`NEH`
Esther	`ESTH`
Job	`JOB`
Psalms	`PS`
Proverbs	`PROV`
Ecclesiastes	`ECCL`
Song of Songs	`SONG`
Isaiah	`ISA`
Jeremiah	`JER`
Lamentations	`LAM`
Ezekiel	`EZEK`
Daniel	`DAN`
Hosea	`HOS`
Joel	`JOEL`
Amos	`AMOS`
Obadiah	`OBAD`
Jonah	`JONAH`
Micah	`MIC`
Nahum	`NAH`
Habakkuk	`HAB`
Zephaniah	`ZEPH`
Haggai	`HAG`
Zechariah	`ZECH`
Malachi	`MAL`
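
The `Ref` format and the abbreviations above can be combined in a small parser. This is a sketch only: `parse_ref`, `BOOKS`, and the regular expression are illustrative names and choices, and the pattern assumes that chapter and verse are always plain integers.

```python
import re

# Abbreviation -> book name, transcribed from the table above.
BOOKS = {
    "GEN": "Genesis", "EXOD": "Exodus", "LEV": "Leviticus", "NUM": "Numbers",
    "DEUT": "Deuteronomy", "JOSH": "Joshua", "JUDG": "Judges", "RUTH": "Ruth",
    "1SAM": "1 Samuel", "2SAM": "2 Samuel", "1KGS": "1 Kings", "2KGS": "2 Kings",
    "1CHR": "1 Chronicles", "2CHR": "2 Chronicles", "EZRA": "Ezra",
    "NEH": "Nehemiah", "ESTH": "Esther", "JOB": "Job", "PS": "Psalms",
    "PROV": "Proverbs", "ECCL": "Ecclesiastes", "SONG": "Song of Songs",
    "ISA": "Isaiah", "JER": "Jeremiah", "LAM": "Lamentations", "EZEK": "Ezekiel",
    "DAN": "Daniel", "HOS": "Hosea", "JOEL": "Joel", "AMOS": "Amos",
    "OBAD": "Obadiah", "JONAH": "Jonah", "MIC": "Micah", "NAH": "Nahum",
    "HAB": "Habakkuk", "ZEPH": "Zephaniah", "HAG": "Haggai", "ZECH": "Zechariah",
    "MAL": "Malachi",
}

# BOOK_CHAPTER.VERSE, e.g. GEN_1.1
REF_PATTERN = re.compile(r"^([0-9A-Z]+)_(\d+)\.(\d+)$")


def parse_ref(ref):
    """Return (book name, chapter, verse) for a Ref value such as GEN_1.1."""
    match = REF_PATTERN.match(ref)
    if match is None or match.group(1) not in BOOKS:
        raise ValueError(f"unrecognized Ref value: {ref!r}")
    book, chapter, verse = match.groups()
    return BOOKS[book], int(chapter), int(verse)


print(parse_ref("GEN_1.1"))  # ('Genesis', 1, 1)
```

Rejecting unknown abbreviations up front makes typos in a `Ref` value surface as errors instead of silently producing a wrong book.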

Syntax

The subtypes acl:relcl, compound:smixut, nmod:poss, nsubj:outer, and obl:npmod are used. The relation compound is currently unused.

The relation clf is unused.

The relations list, goeswith, reparandum, and dep are currently unused, but may be used in future.

Detailed discussion of the relations that are used can be found via the list of Ancient Hebrew relations.

Treebanks

There is 1 Ancient Hebrew UD treebank:

Ancient Hebrew-PTNK