This page pertains to UD version 2.

UD for Tatar

This is a work-in-progress overview of the UD annotation for Tatar.

Tokenization and Word Segmentation


Tatar is an agglutinative, suffixing language with rich inflectional and derivational morphology. This morphological complexity causes some conflicts with the current UD v2 feature inventory. For example, a verb's voice can have more than one value. To handle this, the treebank temporarily adopts the approach proposed by the Turkish treebanks and simply concatenates the values in alphabetical order.
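As a rough illustration of the concatenation convention described above, the following Python sketch joins several values of one feature into a single value in alphabetical order. The function names are hypothetical and not part of any UD tool. The value labels `Cau`, `Pass`, and `Rcp` are taken from the UD v2 `Voice` inventory and are used here only as examples. Joining without a separator is an assumption based on the wording "concatenates the values together".

```python
def combine_feature_values(values):
    """Join multiple values of one feature into a single value.

    Duplicates are dropped and the remaining values are sorted
    alphabetically before being concatenated without a separator
    (the separator-free form is an assumption, see the text above).
    """
    return "".join(sorted(set(values)))


def format_feature(name, values):
    """Render a feature as a Name=Value string (hypothetical helper)."""
    return f"{name}={combine_feature_values(values)}"


# A verb that is both passive and causative gets one combined value.
print(format_feature("Voice", ["Pass", "Cau"]))
```

A verb with a single voice value is unaffected: `format_feature("Voice", ["Pass"])` yields `Voice=Pass`.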

Tags and Features

This is an overview only. For more detailed discussion and examples, see the list of Tatar POS tags.


Relations Overview

However, additional relation subtypes are under consideration for later versions, such as:


There is one Tatar UD treebank: