
This page pertains to UD version 2.

UD Upper Sorbian UFAL

Language: Upper Sorbian (code: hsb)
Family: Indo-European, Slavic

This treebank has been part of Universal Dependencies since the UD v2.1 release.

The following people have contributed to making this treebank part of UD: Daniel Zeman, Anna Nedoluzhko.

Repository: UD_Upper_Sorbian-UFAL
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.14

License: CC BY-SA 4.0

Genre: wiki, nonfiction

Questions, comments? General annotation questions (either Upper Sorbian-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [zeman (æt) ufal • mff • cuni • cz]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS annotated manually
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style
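
UD treebanks are distributed in the CoNLL-U format, where each token line carries ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), so the annotation layers listed above map directly onto columns. The following is a minimal sketch of reading one such line in plain Python. The function name `parse_token_line` and the example line are illustrative assumptions: the line is invented English toy data, not a sentence from this treebank.

```python
# The ten CoNLL-U columns, in file order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_token_line(line):
    """Split one CoNLL-U token line into a dict keyed by column name.

    Hypothetical helper for illustration; not part of any UD tool.
    """
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    token = dict(zip(FIELDS, values))
    # FEATS is either "_" (no features) or "Name=Value|Name=Value".
    if token["feats"] == "_":
        token["feats"] = {}
    else:
        token["feats"] = dict(
            pair.split("=", 1) for pair in token["feats"].split("|")
        )
    return token


# Invented toy line (English, NOT taken from the treebank); XPOS left empty.
example = "\t".join(
    ["2", "dogs", "dog", "NOUN", "_", "Number=Plur", "3", "nsubj", "_", "_"]
)
token = parse_token_line(example)
print(token["lemma"], token["upos"], token["feats"], token["deprel"])
```

The sketch keeps HEAD as a string and does not handle comment lines, multiword token ranges, or empty nodes; a real reader would need to skip or treat those separately.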


A small treebank of Upper Sorbian based mostly on Wikipedia.

The Upper Sorbian sentences are taken from the W2C corpus (Martin Majliš). They were manually filtered and then morphologically and syntactically annotated by Dan Zeman; the lemmatization was done by Anna Nedoluzhko.

Sentences in the W2C corpus are shuffled.


Statistics of UD Upper Sorbian UFAL

POS Tags






Tokenization and Word Segmentation



Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features


Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
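
The restriction above can be sketched as a filter over CoNLL-U data: keep a dependency only when the parent's UPOS is VERB and the child's UPOS is nominal, then count the relation labels. This is a rough illustration, not the script that produced the statistics on this page. The function name `verb_nominal_relations`, the choice to count PROPN alongside NOUN and PRON, and the toy sentence (invented English, not from the treebank) are all assumptions.

```python
from collections import Counter


def verb_nominal_relations(conllu_text, nominal_upos=("NOUN", "PROPN", "PRON")):
    """Count DEPREL labels on edges from a VERB parent to a nominal child.

    Hypothetical helper; including PROPN among the nominals is an assumption.
    """
    counts = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = []
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword ranges (1-2) and empty nodes (1.1)
            rows.append(cols)
        upos_by_id = {cols[0]: cols[3] for cols in rows}
        for cols in rows:
            upos, head, deprel = cols[3], cols[6], cols[7]
            if upos in nominal_upos and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
    return counts


# Invented toy sentence (English, NOT taken from the treebank).
SAMPLE = "\n".join(
    "\t".join(cols)
    for cols in [
        ["1", "She", "she", "PRON", "_", "_", "2", "nsubj", "_", "_"],
        ["2", "reads", "read", "VERB", "_", "_", "0", "root", "_", "_"],
        ["3", "books", "book", "NOUN", "_", "_", "2", "obj", "_", "_"],
        [".", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
    ]
).replace("\n.\t", "\n4\t")  # give the punctuation token its numeric ID

counts = verb_nominal_relations(SAMPLE)
print(dict(counts))
```

Run against the treebank's own .conllu files, the same counter would give a per-relation tally comparable in spirit to the overview in this section, though the exact figures depend on which UPOS tags are treated as nominal.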

Reflexive Verbs

Reflexive Passive

Verbs with Reflexive Core Objects

Relations Overview