
This page still pertains to UD version 1.


The tagset behind BulTreeBank is described in detail, in English, in the stylebook by Simov, Osenova and Slavcheva (2004): http://www.bultreebank.org/TechRep/BTB-TR03.pdf

The tagset is positional: each tag encodes the part of speech together with its grammatical features (when available). Because Bulgarian is a morphologically rich language, the tagset contains nearly 700 tags.

Note that the symbol `#`, used in the Universal POS section, is a placeholder for an arbitrary number of features. These features are present in the BulTreeBank tag but are suppressed in the notation because they are irrelevant to the mapping onto the Universal tagset.
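
As a rough illustration of how the `#` placeholder can be read, the following Python sketch treats a trailing `#` as "any further feature characters, possibly none". The pattern list, the example tags, and the function names `matches` and `to_universal` are assumptions made for this example only; they are not taken from the stylebook or from the actual mapping table, and the sketch assumes that `#` only ever appears at the end of a pattern.

```python
# Illustrative patterns only (an assumption for this sketch);
# the authoritative mapping is the one given in the Universal POS section.
PATTERNS = [
    ("Nc#", "NOUN"),
    ("Np#", "PROPN"),
]


def matches(pattern, tag):
    """Return True if `tag` fits `pattern`.

    A trailing '#' stands for an arbitrary number of further
    feature characters, including none (assumed here).
    """
    if pattern.endswith("#"):
        return tag.startswith(pattern[:-1])
    return tag == pattern


def to_universal(tag, patterns=PATTERNS):
    """Return the Universal POS of the first matching pattern, or None."""
    for pattern, upos in patterns:
        if matches(pattern, tag):
            return upos
    return None


if __name__ == "__main__":
    for tag in ("Ncmsi", "Npfsi", "Xyz"):
        print(tag, "->", to_universal(tag))
```

Under this reading, every tag that begins with the fixed prefix maps to the same Universal tag, regardless of the features that follow the prefix.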