This page pertains to UD version 2.

Universal Dependencies v2

Executive summary of changes from v1 to v2

This is the online documentation for Universal Dependencies, version 2 (2016-12-01). Note: The treebanks listed below still follow the v1 guidelines available here.

Upcoming UD-related events

Want to know more about UD?

If you want to receive news about Universal Dependencies, you can subscribe to the UD mailing list.

UD Treebanks

Ancient Greek 202K
Ancient Greek-PROIEL 211K -
Arabic 242K -
Arabic-NYUAD 629K -
Arabic-PUD 20K - ?
Basque 121K
Belarusian 6K -
Bulgarian 156K
Buryat 10K - ?
Catalan 530K
Chinese 123K
Chinese-PUD 21K - ?
Coptic 3K
Croatian 197K -
Czech 1,330K
Czech-CAC 493K
Czech-CLTT 37K
Czech-PUD 18K ?
Danish 100K
Dutch 209K -
Dutch-LassySmall 101K -
English 254K
English-ESL 88K
English-LinES 82K
English-PUD 21K ?
English-ParTUT 49K
Estonian 47K -
Finnish 202K
Finnish-FTB 159K -
Finnish-PUD 15K - ?
French 391K
French-FTB 556K - ?
French-PUD 24K - ?
French-ParTUT 27K
French-Sequoia 68K -
Galician 138K
Galician-TreeGal 23K
German 287K -
German-PUD 20K - ?
Gothic 55K -
Greek 61K
Hebrew 115K -
Hindi 351K -
Hindi-PUD 23K - ?
Hungarian 42K
Indonesian 121K -
Indonesian-PUD 25K - ?
Irish 23K
Italian 273K
Italian-PUD 22K - ?
Italian-ParTUT 39K
Japanese 186K
Japanese-KTC 189K
Japanese-PUD 26K - ?
Kazakh 10K
Korean 74K
Korean-PUD 22K - ?
Korean-Sejong 89K - ?
Kurmanji 10K - ?
Latin 29K
Latin-ITTB 291K -
Latin-PROIEL 171K -
Latvian 54K -
Lithuanian 5K -
North Sami 55K - ?
Norwegian-Bokmaal 310K
Norwegian-Nynorsk 301K
Old Church Slavonic 57K -
Persian 151K
Polish 82K -
Portuguese 210K
Portuguese-BR 297K -
Portuguese-PUD 21K - ?
Romanian 218K
Russian 99K
Russian-PUD 19K - ?
Russian-SynTagRus 1,107K
Sanskrit 1K -
Slovak 106K -
Slovenian 140K
Slovenian-SST 29K
Spanish 423K
Spanish-AnCora 547K
Spanish-PUD 22K - ?
Swedish 96K
Swedish-LinES 79K
Swedish-PUD 19K - ?
Swedish Sign Language <1K -
Tamil 8K -
Thai-PUD 23K - ?
Turkish 56K
Turkish-PUD 16K - ?
Ukrainian 25K
Upper Sorbian 11K - ?
Urdu 138K -
Uyghur 13K -
Vietnamese 43K -

Upcoming UD Treebanks

Amharic - - ? -
Cantonese - -
Chinese-CFL - ?
Chinese-HK - -
Dargwa - - ?
Faroese - -
Lithuanian-Alksnis - - ?
Marathi - -
Romansh - - ?
Romansh-Sursilv - - ?
Serbian - -
Somali - -
Sorani - - ?

Disclaimer: Our use of flags to symbolise languages is only intended as a visual enhancement of the website and should not be interpreted as a political statement in any way.

Download

The data is released through LINDAT/CLARIN.

Query online

You can query the UD treebanks online using

Language family documentation (experimental)