This is part of archived UD v1 documentation. See for the current version.

The BulTreeBank tagset is described in detail, in English, in the stylebook by Simov, Osenova and Slavcheva (2004).

The tagset is positional: each tag encodes the part of speech and, where applicable, its grammatical features, with each character position standing for one category. Because Bulgarian is a morphologically rich language, the tagset contains nearly 700 tags.

Note that the symbol `#`, used in the Universal POS section, is a placeholder for an arbitrary number of feature positions. These positions are suppressed in the respective tag because they are irrelevant when the BulTreeBank tagset is mapped to the Universal one.
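As a rough illustration of how such `#` patterns can be read, the following Python sketch treats `#` as "any number of further characters" and matches full tags against a small pattern table. The table `EXAMPLE_MAPPING`, the helper names, and the sample tags are assumptions made for this example only; they are not the actual conversion table.

```python
import re

def pattern_to_regex(pattern):
    """Turn a tag pattern such as 'Nc#' into a compiled regular expression.

    Assumption: '#' stands for zero or more arbitrary characters.
    """
    literal_parts = [re.escape(part) for part in pattern.split("#")]
    return re.compile(".*".join(literal_parts))

# Hypothetical pattern table, for illustration only.
EXAMPLE_MAPPING = [
    ("Nc#", "NOUN"),
    ("Np#", "PROPN"),
    ("A#", "ADJ"),
]

def map_tag(tag):
    """Return the universal tag of the first pattern that matches the whole tag."""
    for pattern, universal_tag in EXAMPLE_MAPPING:
        if pattern_to_regex(pattern).fullmatch(tag):
            return universal_tag
    return None  # no pattern in the example table applies

print(map_tag("Ncmsi"))
print(map_tag("Vpitf-r3s"))
```

Patterns are tried in order, so a more specific pattern should be listed before a more general one that would also match.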