Universal Dependencies

Introduction to Universal Dependencies

This is the online documentation for Universal Dependencies, version 1 (2014-10-01). Version 2 was planned for November 2016 but has been postponed until the spring of 2017. If you plan to start using the scheme, we recommend waiting for version 2.

UD Treebanks

Amharic - - ? -
Ancient Greek 244K
Ancient Greek-PROIEL 206K -
Arabic 242K -
Basque 121K
Bulgarian 156K
Buryat 5K -
Catalan 530K
Chinese 123K
Coptic 4K
Croatian 87K -
Czech 1,503K
Czech-CAC 493K
Czech-CLTT 35K
Danish 100K
Dutch 209K -
Dutch-LassySmall 98K -
English 254K
English-ESL 97K
English-LinES 82K
Estonian 234K -
Faroese 119K -
Finnish 181K
Finnish-FTB 159K -
French 390K
Galician 138K
German 293K -
Gothic 56K -
Greek 59K
Hebrew 115K -
Hindi 351K -
Hungarian 42K
Indonesian 121K -
Irish 23K
Italian 252K
Japanese-KTC 267K
Kazakh 4K
Korean - - - -
Latin 47K -
Latin-ITTB 291K -
Latin-PROIEL 165K -
Latvian 20K -
Norwegian 311K
Old Church Slavonic 57K -
Persian 151K
Polish 83K -
Portuguese 209K -
Portuguese-BR 298K -
Romanian 145K
Russian 99K
Russian-SynTagRus 1,032K
Sanskrit 1K -
Serbian - - ?
Slovenian 140K
Slovenian-SST 29K
Spanish 423K
Spanish-AnCora 547K
Swedish 96K
Swedish-LinES 79K
Swedish Sign Language - - ?
Tamil 8K -
Turkish 56K
Ukrainian - - -
Urdu - - ?
Uyghur 45K -
Vietnamese 43K -

Download

The data is released through LINDAT/CLARIN.

Query online

You can query the UD treebanks on-line using

Stay up to date

If you want to receive news about Universal Dependencies, you can subscribe to the UD mailing list.

See also the list of open issues and what was decided about them at the Uppsala meeting. New: preparing v2.

Contribute to UD

If you want to add a new language/treebank, please read the instructions for adding a new language and encoding its metadata.

If you want to make a release of an existing treebank, please follow the steps in the release checklist and make sure your data shows as validating in the format validation runs. Check the content validation for any suspicious patterns in the data; there are direct links to our treebank search that you can use to browse the suspicious data points. Check that the list of contributors is correct; this is gathered from the metadata in the READMEs.

General instructions for contributing to the online documentation can be found here.
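Before relying on the format validation runs mentioned above, it can help to run a rough local pre-check on a treebank file. The sketch below is not the official UD validator and covers only a small part of what it checks. It assumes the data is in the CoNLL-U format, where each token line has 10 tab-separated fields, comment lines start with `#`, and blank lines separate sentences. The function name `find_malformed_lines` and the sample sentence are illustrative only.

```python
from typing import Iterable, List, Tuple

# Assumption: CoNLL-U token lines carry exactly 10 tab-separated fields.
EXPECTED_FIELDS = 10


def find_malformed_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) pairs for lines failing the basic check.

    Hypothetical helper for illustration; it does not replace the
    official format validation.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank sentence separators and comment lines
        fields = line.split("\t")
        if len(fields) != EXPECTED_FIELDS:
            problems.append(
                (number, f"expected {EXPECTED_FIELDS} fields, found {len(fields)}")
            )
        elif any(field == "" for field in fields):
            problems.append((number, "empty field (use '_' for unspecified values)"))
    return problems


# Illustrative input: the second token line is missing its last field.
sample = [
    "# sent_id = 1\n",
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n",
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\n",
    "\n",
]

for line_number, reason in find_malformed_lines(sample):
    print(f"line {line_number}: {reason}")
```

To check a real file, pass an open file handle instead of `sample`, for example `find_malformed_lines(open(path, encoding="utf-8"))`, where `path` points to your own treebank file.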

Tools and additional documentation

There is a separate page about tools that are available for work with UD data.

Direct links to the experimental language family documentation: