Bug in text files (sent 2017-03-13 21:11 CET)
Dear shared task participants,
we have found a bug in the script used to extract raw text from the CoNLL-U files (thanks to Tianze Shi for reporting it!). As a result, a few .txt files in the UD 2.0 release contain errors. The main files in the CoNLL-U format are not affected by the bug. You can download the fixed data from the following URLs:
- http://ufal.mff.cuni.cz/~zeman/soubory/ud-treebanks-conll2017.tgz
- http://ufal.mff.cuni.cz/~zeman/soubory/ud-treebanks-v2.0.tgz
The fixed script can be downloaded from https://github.com/UniversalDependencies/tools/blob/master/conllu_to_text.pl
Sorry for the inconvenience.
Best regards,
Dan Zeman
on behalf of the costocom :) (CoNLL shared task organizing committee) http://universaldependencies.org/conll17/