Test data conflict (sent 2017-05-09 17:46 CET)

Dear shared task participants,

Unfortunately, we have only now discovered a conflict between UD_Italian and UD_Italian-ParTUT: parts of one treebank’s training data occur in the other treebank’s test data, and vice versa. We had to modify the test data: the conflicting sentences have been removed from the UD_Italian test set, and the whole UD_Italian-ParTUT test set has been withdrawn. You can still use models trained on UD_Italian-ParTUT if you want to!

In TIRA, you will now see the old test set crossed out and deprecated. Please don’t use it for any new system runs. However, if you already have a system running on the old test set, you can leave it running. If this turns out to be your final run, we will have to re-score it and make sure that the bad sentences are omitted. If your run on the old test data has already finished, please re-run it on the new data; there is still plenty of time!

Don’t hesitate to ask if you have any questions. We apologize for the inconvenience.

A note on a different matter: please do not use the “name” field of the metadata.json file in your system. This field is not available for the test data. Use “lcode” and “tcode” instead.
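For illustration only, here is a minimal Python sketch of identifying a treebank by “lcode” and “tcode” without touching “name”. It assumes that metadata.json parses to a list of objects, one per test file; the sample entries, the tcode values, and the helper name treebank_key are hypothetical, so please check them against the actual file.

```python
import json

# Hypothetical excerpt of metadata.json. The layout and the tcode values are
# assumptions; note that "name" is absent, as it will be for the test data.
SAMPLE = """
[
  {"lcode": "it", "tcode": "0"},
  {"lcode": "cs", "tcode": "cac"}
]
"""


def treebank_key(entry):
    """Identify a treebank by its lcode and tcode only; never read "name"."""
    return (entry["lcode"], entry["tcode"])


# Map every entry to the key a system could use to pick its model.
keys = [treebank_key(entry) for entry in json.loads(SAMPLE)]
print(keys)
```

A system that selects its model by such a key keeps working when “name” is missing, whereas a lookup of entry["name"] would fail with a KeyError.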

Best regards,

Dan Zeman on behalf of the costocom :) (CoNLL shared task organizing committee) http://universaldependencies.org/conll17/