Universal Dependencies v2
- Tokenization and word segmentation
- CoNLL-U format
This is the online documentation for Universal Dependencies, version 2 (2016-12-01). Note: The treebanks listed below still follow the v1 guidelines available here.
Upcoming UD-related events
- CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
- NoDaLiDa Workshop on Universal Dependencies (UDW 2017)
Want to know more about UD?
If you want to receive news about Universal Dependencies, you can subscribe to the UD mailing list.
Upcoming UD Treebanks
Disclaimer: Our use of flags to symbolise languages is only intended as a visual enhancement of the website and should not be interpreted as a political statement in any way.
The data is released through LINDAT/CLARIN.
- Version 1.4 treebanks are available at http://hdl.handle.net/11234/1-1827. 64 treebanks, 47 languages, released November 15, 2016.
- Version 1.3 treebanks are archived at http://hdl.handle.net/11234/1-1699. 54 treebanks, 40 languages, released May 15, 2016.
- Version 1.2 treebanks are archived at http://hdl.handle.net/11234/1-1548. 37 treebanks, 33 languages, released November 15, 2015.
- Version 1.1 treebanks are archived at http://hdl.handle.net/11234/LRT-1478. 19 treebanks, 18 languages, released May 15, 2015.
- Version 1.0 treebanks are archived at http://hdl.handle.net/11234/1-1464. 10 treebanks, 10 languages, released January 15, 2015.
- In general, we intend to have regular treebank releases every six months. However, the next release will be brought forward to facilitate the use of Version 2.0 in one of the CoNLL 2017 Shared Tasks. The release is tentatively scheduled for March 1, 2017.
You can query the UD treebanks online using
- SETS treebank search maintained by the University of Turku, or
- PML Tree Query maintained by Charles University in Prague.