Universal Dependencies v2
- Tokenization and word segmentation
- CoNLL-U format
This is the online documentation for Universal Dependencies, version 2 (2016-12-01). Note: The treebanks listed below still follow the v1 guidelines, which remain available.
Want to know more about UD?
If you want to receive news about Universal Dependencies, you can subscribe to the UD mailing list.
Upcoming UD Treebanks
Disclaimer: Our use of flags to symbolise languages is only intended as a visual enhancement of the website and should not be interpreted as a political statement in any way.
The data is released through LINDAT/CLARIN.
- Version 1.4 treebanks are available at http://hdl.handle.net/11234/1-1827. 64 treebanks, 47 languages, released Nov 15, 2016.
- Version 1.3 treebanks are archived at http://hdl.handle.net/11234/1-1699. 54 treebanks, 40 languages, released May 15, 2016.
- Version 1.2 treebanks are archived at http://hdl.handle.net/11234/1-1548. 37 treebanks, 33 languages, released Nov 15, 2015.
- Version 1.1 treebanks are archived at http://hdl.handle.net/11234/LRT-1478. 19 treebanks, 18 languages, released May 15, 2015.
- Version 1.0 treebanks are archived at http://hdl.handle.net/11234/1-1464. 10 treebanks, 10 languages, released Jan 15, 2015.
- In general, we intend to release treebanks every six months. However, the next release will be brought forward to facilitate the use of Version 2.0 in one of the CoNLL 2017 Shared Tasks. It will probably take place in February 2017.
You can query the UD treebanks online using
- SETS treebank search, maintained by the University of Turku, or
- PML Tree Query, maintained by Charles University in Prague.