
This page pertains to UD version 2.

UD for Old English

Currently, UD Old English is a small effort with one live treebank, UD Old English-Cairo, which consists of the 20 Cairo sentences. Because the Cairo sentences are translations of modern sentences, the treebank is anachronistic and is not representative of Old English syntax and dependency structure as the language was actually used.

Tokenization and Word Segmentation

Morphology

Tags

UD Old English allows the full inventory of UPOS tags, but not all of them are attested in the corpus because of its small size. Currently, 5 tags are not attested in the treebank.
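A check of which tags are unattested can be sketched as follows. This is a minimal illustration, not the project's own tooling: it assumes the treebank is available as CoNLL-U text, the function names are hypothetical, and the two-token sample (including its UPOS values) is invented for demonstration rather than taken from the treebank.

```python
# The 17 universal part-of-speech tags defined by UD v2.
UPOS_INVENTORY = frozenset({
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
})


def attested_upos(conllu_text):
    """Return the set of UPOS tags used on ordinary word lines."""
    tags = set()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges and empty nodes
        tags.add(cols[3])  # column 4 (index 3) holds the UPOS tag
    return tags


def unattested_upos(conllu_text):
    """Return the inventory tags that never occur in the text."""
    return UPOS_INVENTORY - attested_upos(conllu_text)


# Hypothetical two-token sample; the UPOS values are assumptions.
SAMPLE = "\n".join([
    "# text = ȝebroht ȝierstan-dæg",
    "\t".join(["1", "ȝebroht", "_", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["2", "ȝierstan-dæg", "_", "NOUN", "_", "_", "1",
               "obl:unmarked", "_", "_"]),
    "",
])

if __name__ == "__main__":
    print(sorted(unattested_upos(SAMPLE)))
```

Run against the full treebank file instead of the sample, the same function would list the tags that the corpus does not yet cover.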

Features

Old English has a richer feature inventory than modern English. In addition to the feature space in UD English, UD Old English provides case (nominative, accusative, genitive, dative, and instrumental) and gender (masculine, feminine, neuter) information.

Old English-Cairo contains case, gender, number, person, verb form, mood, tense, degree, and possessive features.
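In CoNLL-U, these features are stored in the FEATS column as `Name=Value` pairs separated by `|`, with `_` for an empty set. The sketch below shows one way to gather the attested feature space from such strings. The function names are hypothetical and the example strings are made up for illustration (they use the standard UD values such as `Case=Ins` and `Gender=Neut`); they are not taken from the treebank.

```python
def parse_feats(feats):
    """Parse one FEATS string into a {name: value} dictionary."""
    if not feats or feats == "_":
        return {}  # "_" marks a token without features
    result = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        result[name] = value
    return result


def collect_feature_space(feats_column):
    """Map each feature name to the set of values seen for it."""
    space = {}
    for feats in feats_column:
        for name, value in parse_feats(feats).items():
            # A feature may carry several comma-separated values.
            space.setdefault(name, set()).update(value.split(","))
    return space


# Made-up FEATS strings, for illustration only.
EXAMPLE_FEATS = [
    "Case=Nom|Gender=Masc|Number=Sing",
    "Case=Ins|Gender=Neut|Number=Sing",
    "_",
]

if __name__ == "__main__":
    for name, values in sorted(collect_feature_space(EXAMPLE_FEATS).items()):
        print(name, sorted(values))
```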

Some features are possible in UD Old English but are not attested in the treebank.

The complete feature space is as follows:

Syntax

Standard deprels are used. As in UD English, obl:unmarked is used for oblique nominals without prepositions (e.g. ȝebroht ȝierstan-dæg "brought yesterday", where ȝierstan-dæg is oblique without a preposition).
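To make the obl:unmarked example concrete, the sketch below encodes the two words as a CoNLL-U fragment and retrieves every dependent carrying that relation together with its head. The fragment is an assumption built only from the example above: the UPOS values are guesses, lemmas and features are left as `_`, and the helper name `find_deprel` is hypothetical.

```python
def find_deprel(conllu_text, deprel):
    """Return (dependent form, head form) pairs for a given relation."""
    pairs = []
    sentence = []
    # The extra empty string flushes the last sentence if the text
    # does not end with a blank line.
    for line in conllu_text.splitlines() + [""]:
        if line.startswith("#"):
            continue  # sentence-level comment
        if not line.strip():
            forms = {cols[0]: cols[1] for cols in sentence}
            for cols in sentence:
                if cols[7] == deprel:  # column 8 (index 7) is DEPREL
                    pairs.append((cols[1], forms.get(cols[6], "ROOT")))
            sentence = []
            continue
        cols = line.split("\t")
        # Keep ordinary words only; ranges ("1-2") and empty nodes
        # ("1.1") have non-numeric IDs and are skipped.
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append(cols)
    return pairs


# Hypothetical annotation of the example; UPOS values are assumptions.
FRAGMENT = "\n".join([
    "# text = ȝebroht ȝierstan-dæg",
    "\t".join(["1", "ȝebroht", "_", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["2", "ȝierstan-dæg", "_", "NOUN", "_", "_", "1",
               "obl:unmarked", "_", "_"]),
])

if __name__ == "__main__":
    print(find_deprel(FRAGMENT, "obl:unmarked"))
```

Here ȝierstan-dæg attaches directly to ȝebroht with no case-marking dependent of its own, which is the configuration that the obl:unmarked subtype labels.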

Treebanks

There is 1 Old English UD treebank: Old English-Cairo.