Universal Dependencies v1
Introduction to Universal Dependencies
- Tokenization
- Morphology
- Syntax
- CoNLL-U format
This is the online documentation for Universal Dependencies, version 1 (2014-10-01).
Version 1 was replaced by version 2 on December 1, 2016.
UD Treebanks
Disclaimer: Our use of flags to symbolise languages is only intended as a visual enhancement of the website and should not be interpreted as a political statement in any way.
Download
The data is released through LINDAT/CLARIN.
- Version 1.4 treebanks are available at http://hdl.handle.net/11234/1-1827. 64 treebanks, 47 languages, released November 15, 2016.
- Version 1.3 treebanks are archived at http://hdl.handle.net/11234/1-1699. 54 treebanks, 40 languages, released May 15, 2016.
- Version 1.2 treebanks are archived at http://hdl.handle.net/11234/1-1548. 37 treebanks, 33 languages, released November 15, 2015.
- Version 1.1 treebanks are archived at http://hdl.handle.net/11234/LRT-1478. 19 treebanks, 18 languages, released May 15, 2015.
- Version 1.0 treebanks are archived at http://hdl.handle.net/11234/1-1464. 10 treebanks, 10 languages, released January 15, 2015.
- In general, we intend to release treebanks regularly, every six months. However, the next release will be brought forward to facilitate the use of version 2.0 in one of the CoNLL 2017 Shared Tasks; it will probably take place in February 2017.
Query online
You can query the UD treebanks online using
- SETS treebank search maintained by the University of Turku, or
- PML Tree Query maintained by Charles University in Prague.
Stay up to date
If you want to receive news about Universal Dependencies, you can subscribe to the UD mailing list.
New: A draft of UD v2 is now available.
Contribute to UD
If you want to add a new language/treebank, please read the instructions for adding a new language and encoding its metadata.
If you want to release an existing treebank, please follow the steps in the release checklist and make sure your data is reported as valid in the format validation runs. Check the content validation for any suspicious patterns in the data; it provides direct links to our treebank search that you can use to browse the suspicious data points. Also check that the list of contributors is correct; this list is gathered from the metadata in the READMEs.
General instructions for contributing to the online documentation can be found here.
Tools and additional documentation
There is a separate page about tools that are available for work with UD data.
Direct links to the experimental language family documentation: