
This page pertains to UD version 2.

Universal Dependencies v2

Executive summary of changes from v1 to v2

This is the online documentation for Universal Dependencies, version 2 (2016-12-01). Note: the treebanks listed below still follow the v1 guidelines.

Want to know more about UD?

If you want to receive news about Universal Dependencies, you can subscribe to the UD mailing list.

UD Treebanks

Ancient Greek 244K
Ancient Greek-PROIEL 206K
Arabic 242K
Basque 121K
Bulgarian 156K
Buryat 9K
Catalan 530K
Chinese 123K
Coptic 5K
Croatian 139K
Czech 1,503K
Czech-CAC 493K
Czech-CLTT 35K
Danish 100K
Dutch 209K
Dutch-LassySmall 98K
English 254K
English-ESL 97K
English-LinES 82K
Estonian 234K
Faroese 132K
Finnish 181K
Finnish-FTB 159K
French 391K
Galician 138K
Galician-TreeGal 24K
German 293K
Gothic 56K
Greek 59K
Hebrew 115K
Hindi 351K
Hungarian 42K
Indonesian 121K
Irish 23K
Italian 272K
Japanese 92K
Japanese-KTC 267K
Kazakh 6K
Latin 47K
Latin-ITTB 291K
Latin-PROIEL 165K
Latvian 20K
Norwegian-Bokmaal 310K
Old Church Slavonic 57K
Persian 151K
Polish 83K
Portuguese 209K
Portuguese-BR 298K
Portuguese-Bosque 227K
Romanian 218K
Russian 99K
Russian-SynTagRus 1,068K
Sanskrit 1K
Slovak 106K
Slovenian 140K
Slovenian-SST 29K
Spanish 423K
Spanish-AnCora 547K
Swedish 96K
Swedish-LinES 79K
Swedish Sign Language <1K
Tamil 8K
Turkish 56K
Ukrainian 1K
Uyghur 6K
Vietnamese 43K

Upcoming UD Treebanks

Amharic
Arabic-LDC
Cantonese
Chinese-HK
Korean
Kurmanji
Norwegian-Nynorsk
Serbian
Somali
Sorani
Urdu

Disclaimer: Our use of flags to symbolise languages is only intended as a visual enhancement of the website and should not be interpreted as a political statement in any way.

Download

The data is released through LINDAT/CLARIN.

Query online

You can query the UD treebanks online.

Language family documentation (experimental)