
This page pertains to UD version 2.

UD for Belarusian

Tokenization and Word Segmentation

Low-level tokenization generally follows the RNC standard.

Some special cases worth mentioning:

The Belarusian UD treebank does not contain multiword tokens.
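
The absence of multiword tokens can be checked mechanically. The following is a minimal sketch, assuming the treebank is read as CoNLL-U text, where a multiword token is a line whose ID column holds a range such as 1-2. The function name `find_multiword_tokens`, the helper `make_line`, and both sample sentences are illustrative assumptions and are not taken from the treebank itself.

```python
def make_line(token_id, form):
    """Build a 10-column CoNLL-U line; unused columns are left as '_'."""
    return "\t".join([token_id, form] + ["_"] * 8)


def find_multiword_tokens(conllu_text):
    """Return (line_number, id, form) for every multiword-token line.

    In CoNLL-U, multiword tokens carry an ID range (e.g. "1-2"),
    while empty nodes use a decimal ID (e.g. "8.1") and are not reported.
    """
    hits = []
    for line_number, line in enumerate(conllu_text.splitlines(), start=1):
        # Skip sentence separators and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        if "-" in columns[0]:
            hits.append((line_number, columns[0], columns[1]))
    return hits


# Hypothetical Belarusian sentence: one syntactic word per token line.
belarusian_sample = "\n".join([
    "# text = Я чытаю кнігу .",
    make_line("1", "Я"),
    make_line("2", "чытаю"),
    make_line("3", "кнігу"),
    make_line("4", "."),
])

# Contrasting example from another language, with a multiword token range.
contrast_sample = "\n".join([
    make_line("1-2", "vámonos"),
    make_line("1", "vamos"),
    make_line("2", "nos"),
])

print(find_multiword_tokens(belarusian_sample))
print(find_multiword_tokens(contrast_sample))
```

Running the same function over a full treebank file should return an empty list if the statement above holds for that release.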

Instruction: Describe the general rules for delimiting words (for example, based on whitespace and punctuation) and exceptions to these rules. Specify whether words with spaces and/or multiword tokens occur. Include links to further language-specific documentation if available.




Other Lexical Features

Language-Specific Features


Core Arguments, Oblique Arguments and Adjuncts

Non-verbal Clauses

Relation Subtypes

Instruction: Give criteria for identifying core arguments (subjects and objects), and describe the range of copula constructions in nonverbal clauses. List all subtype relations used. Include links to language-specific relations definitions if any.


There is one Belarusian UD treebank:

Instruction: Treebank-specific pages are generated automatically from the README file in the treebank repository and from the data in the latest release. Link to the respective *-index.html page in the treebanks folder, using the language code and the treebank code in the file name.