Introduction
The UD Dutch treebank is based on the Alpino treebank. The data were used in the CoNLL-X Shared Task on dependency parsing (2006); the CoNLL version was then converted to the Prague dependency style as part of HamleDT (since 2011). Later versions of HamleDT added conversions to Stanford dependencies (2014) and to Universal Dependencies (HamleDT 3.0, 2015). The conversion path from the original Alpino data still goes through the CoNLL-X format and the Prague dependencies, which may occasionally lead to loss of information. The first release of Universal Dependencies to include this treebank was UD v1.2, in November 2015. It is essentially the HamleDT conversion, but the data are not identical to HamleDT 3.0 because the conversion procedure has since been further improved.
Links
- Alpino
- HamleDT
- Treex is the software used for conversion
- Interset was used to convert POS tags and features
References
- Leonoor van der Beek, Gosse Bouma, Jan Daciuk, Tanja Gaustad, Robert Malouf, Gertjan van Noord, Robbert Prins, Begoña Villada. 2002. Chapter 5. The Alpino Dependency Treebank. In: Algorithms for Linguistic Processing NWO PIONIER Progress Report, Groningen, the Netherlands.