This page pertains to UD version 2.

UD Russian SynTagRus

Language: Russian (code: ru)
Family: Indo-European, Slavic

This treebank has been part of Universal Dependencies since the UD v1.3 release.

The following people have contributed to making this treebank part of UD: Kira Droganova, Olga Lyashevskaya, Daniel Zeman.

Repository: UD_Russian-SynTagRus
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.6

License: CC BY-NC-SA 4.0

Genre: news, nonfiction, fiction

Questions or comments? General annotation questions (either Russian-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [zeman (æt) ufal • mff • cuni • cz, droganova (æt) ufal • mff • cuni • cz]. Development of the treebank happens outside the UD repository, so a bug must be fixed either in the original data source or in the conversion procedure. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas annotated manually in non-UD style, automatically converted to UD
UPOS annotated manually in non-UD style, automatically converted to UD
XPOS not available
Features annotated manually in non-UD style, automatically converted to UD
Relations annotated manually in non-UD style, automatically converted to UD

Description

Russian data from the SynTagRus corpus.

The SynTagRus dependency treebank is being developed by the Computational Linguistics Laboratory, A.A.Kharkevich Institute of Information Transmission Problems, Russian Academy of Sciences, located in Moscow.

Currently the treebank contains over 1,000,000 tokens (over 66,000 sentences) from texts in a variety of genres (contemporary fiction, popular science, newspaper and journal articles dated between 1960 and 2016, online news texts, etc.).

SynTagRus is a human-corrected corpus of Russian with comprehensive morphological annotation and with syntactic annotation in the form of a complete dependency tree for every sentence. The original version of SynTagRus also contains other types of annotation, most notably lexical-functional annotation based on lexical functions as defined in the Meaning-Text model.

It is an integral but fully autonomous part of the Russian National Corpus developed in a nationwide research project and can be freely consulted on the Web: http://www.ruscorpora.ru/instruction-syntax.html

For more details, see the recently published paper (in Russian):

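Дяченко П.В., Иомдин Л.Л., Лазурский А.В., Митюшин Л.Г., Подлесская О.Ю., Сизов В.Г., Фролова Т.И., Цинман Л.Л. Современное состояние глубоко аннотированного корпуса текстов русского языка (СинТагРус) // Сборник «Национальный корпус русского языка: 10 лет проекту». Труды Института русского языка им. В.В. Виноградова. М., 2015. Вып. 6. С. 272-299.

[English translation: Dyachenko P.V., Iomdin L.L., Lazursky A.V., Mityushin L.G., Podlesskaya O.Yu., Sizov V.G., Frolova T.I., Tsinman L.L. The current state of the deeply annotated corpus of Russian texts (SynTagRus). In: "Russian National Corpus: 10 Years of the Project". Proceedings of the V.V. Vinogradov Russian Language Institute. Moscow, 2015. Issue 6, pp. 272-299.]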

References

Acknowledgments

Statistics of UD Russian SynTagRus

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X

Features

Animacy, Aspect, Case, Degree, Foreign, Gender, Mood, Number, Person, Polarity, Tense, Variant, VerbForm, Voice

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, compound, conj, cop, csubj, csubj:pass, dep, det, discourse, expl, fixed, flat, flat:foreign, flat:name, iobj, mark, nmod, nsubj, nsubj:pass, nummod, nummod:entity, nummod:gov, obj, obl, orphan, parataxis, punct, root, vocative, xcomp
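
The POS tag, feature, and relation inventories above can in principle be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script used to generate this page. It assumes the standard ten-column CoNLL-U layout; the `SAMPLE` sentence and the function name `count_annotations` are hypothetical and serve only as an illustration (the sentence is not taken from the treebank). Note that the XPOS column holds `_`, consistent with XPOS not being available.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U format (NOT taken from the treebank).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = (
    "# text = Мама мыла раму.\n"
    "1\tМама\tмама\tNOUN\t_\tAnimacy=Anim|Case=Nom|Gender=Fem|Number=Sing\t2\tnsubj\t_\t_\n"
    "2\tмыла\tмыть\tVERB\t_\tAspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin|Voice=Act\t0\troot\t_\t_\n"
    "3\tраму\tрама\tNOUN\t_\tAnimacy=Inan|Case=Acc|Gender=Fem|Number=Sing\t2\tobj\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)


def count_annotations(conllu_text):
    """Count UPOS tags, feature names, and relation labels in CoNLL-U text."""
    upos, feats, deprel = Counter(), Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        upos[cols[3]] += 1
        deprel[cols[7]] += 1
        if cols[5] != "_":
            for pair in cols[5].split("|"):
                feats[pair.split("=", 1)[0]] += 1  # count the feature name only
    return upos, feats, deprel


upos, feats, deprel = count_annotations(SAMPLE)
print(sorted(upos))    # UPOS tags seen in the sample
print(sorted(feats))   # feature names seen in the sample
print(sorted(deprel))  # relation labels seen in the sample
```

To run this on real data, read one of the treebank's `.conllu` files as UTF-8 text and pass its contents to `count_annotations` in place of `SAMPLE`.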

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
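
The restriction to verb parents and noun or pronoun children can be sketched as a filter over CoNLL-U rows. This is an illustrative sketch under stated assumptions, not the procedure used to compute the statistics on this page: "verb" is taken to mean UPOS `VERB` and "noun or pronoun" to mean UPOS `NOUN` or `PRON` (whether `PROPN` or `AUX` should be included is left open here). The `SAMPLE` sentence and the function name `verb_nominal_relations` are hypothetical.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U format (NOT taken from the treebank).
SAMPLE = (
    "# text = Мама мыла раму.\n"
    "1\tМама\tмама\tNOUN\t_\tAnimacy=Anim|Case=Nom|Gender=Fem|Number=Sing\t2\tnsubj\t_\t_\n"
    "2\tмыла\tмыть\tVERB\t_\tAspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin|Voice=Act\t0\troot\t_\t_\n"
    "3\tраму\tрама\tNOUN\t_\tAnimacy=Inan|Case=Acc|Gender=Fem|Number=Sing\t2\tobj\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

# Assumed tag sets; adjust if PROPN or AUX should be included.
VERB_TAGS = {"VERB"}
NOMINAL_TAGS = {"NOUN", "PRON"}


def verb_nominal_relations(conllu_text):
    """Count relation labels where the parent is a verb and the child a noun or pronoun."""
    counts = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep only ordinary word rows (skip multiword ranges and empty nodes).
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        upos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            child_upos, head_id, relation = r[3], r[6], r[7]
            # HEAD "0" (the root) has no entry, so .get() returns None and the test fails.
            if child_upos in NOMINAL_TAGS and upos_by_id.get(head_id) in VERB_TAGS:
                counts[relation] += 1
    return counts


print(verb_nominal_relations(SAMPLE))
```

In the sample, the punctuation mark also attaches to the verb, but it is excluded because its UPOS is neither `NOUN` nor `PRON`.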

Relations Overview