TIRA (sent 2017-04-15 22:20 CET)

Dear shared task participants,

Here is a summary of the news from the past few weeks.

UDPipe 1.1. As some of you noticed, the published baseline models could not be read by UDPipe 1.0. The new version, UDPipe 1.1.0, was officially released on March 29; see http://ufal.mff.cuni.cz/udpipe.

Baseline results. The README.txt file in the package with the UDPipe baseline models lists their scores on the development data. For your convenience, these scores have now also been published on the shared task website; see http://universaldependencies.org/conll17/baseline.html.

TIRA Virtual Machines. Many of you already have your virtual machines on TIRA. If you have not responded to Martin Potthast’s e-mail asking about your operating system preferences, please do not postpone this step any longer. You will likely need some time to familiarize yourself with the TIRA platform and to deploy your system in the VM.

There is now more information about TIRA on the shared task website. To quickly address some frequently asked questions: the typical workflow is to train your models offline, i.e. on your own hardware. (Training directly in the TIRA VM is not explicitly prohibited, but the resources there are limited, hence it is not suitable for training a decent dependency parser.) Once ready, you upload both your system and the models to the VM. Then you proceed to the TIRA web interface (same login as for your VM), register the shell command that runs your system, and run it. Note that your VM will not be accessible while your system is running: it will be “sandboxed”, detached from the internet, and after the run the VM will be restored to the state it was in before the run. Your run can then be reviewed and evaluated by the organizers.

Although all of you probably test your systems against the development data on the machines where you develop them, you should also run them “the official way” on TIRA, so that you are familiar with the process and know that everything works. The sooner the better :-) so we can sort out any problems in time.

Processing the data on TIRA. Within your VM, you can see the development and trial data mounted read-only at /media/training-datasets/universal-dependency-learning (the trial data is a small subset of the development data that you can use for quick debugging, without having your VM sandboxed for too long). First try running your system on these datasets from within your VM (no sandboxing), then try the same through the web interface (everything as in the test phase, i.e. including sandboxing). When invoked from the web interface, your system will be given the path to the input folder and the path to the output folder in which it is supposed to generate all output files. When you run the system on the development or trial data, the input path will lead to the location mentioned above; during the test phase, it will be a different path to which you normally don’t have access. And while the TIRA folder with the development data also contains the gold-standard files, the trial and test folders contain only the two permitted inputs: raw text and CoNLL-U files preprocessed by UDPipe.
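For illustration, here is a minimal Python sketch of just this calling convention. Nothing beyond the two directory arguments is prescribed; the comments indicate where your own code takes over:

    #!/usr/bin/env python3
    # Minimal sketch of a TIRA entry point: the platform passes two paths.
    import sys

    input_dir, output_dir = sys.argv[1], sys.argv[2]
    # Read the permitted input files from input_dir and write every required
    # output file into output_dir. Which files to read, and what the outputs
    # must be named, is listed in metadata.json (see the next paragraph).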

As you will see, the file names differ slightly from the UD release, and the files for all languages are in one folder. There are two extra files, metadata.json and README.txt (which documents the fields in metadata.json). Your system should start by reading metadata.json, which lists the input files that must be processed and the names of the corresponding output files that must be generated in the output folder. The metadata also gives the language code and treebank code of each input file (the codes typically appear in the file names as well, but the metadata file is the proper place for your system to read them from). For test files that correspond to a UD 2.0 treebank, these codes will match those you know from the UD release. But remember that you are also expected to process 1. unknown treebank codes for known languages, and 2. even unknown language codes (the surprise languages). If your system fails to provide valid CoNLL-U output for an input file, its score on that part will be zero. Even a random tree is better than zero, so make sure to generate something even if the surprise languages are not your focus in this task; the sketch below shows one way to do that.
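To make this concrete, below is an illustrative Python sketch of a metadata-driven main loop with such a safety net. The metadata field names used here (psegmorfile, outfile, lcode) are only a guess at the structure; README.txt next to metadata.json documents the actual fields, and my_parser is a placeholder for your real system. The fallback attaches every token to the first token of its sentence, which is a valid tree and therefore scores above zero:

    #!/usr/bin/env python3
    # Metadata-driven processing with a safety net: every input listed in
    # metadata.json gets *some* valid CoNLL-U output, so no file scores zero.
    # NOTE: the metadata field names below are assumptions; README.txt in the
    # data folder documents the real ones.
    import json
    import os
    import sys

    def fallback_parse(conllu_text):
        """Turn UDPipe-preprocessed CoNLL-U into a trivial but valid parse:
        token 1 becomes the root, all other tokens attach to it."""
        out = []
        for line in conllu_text.splitlines():
            cols = line.split('\t')
            if len(cols) == 10 and cols[0].isdigit():  # plain word lines only
                cols[6] = '0' if cols[0] == '1' else '1'       # HEAD
                cols[7] = 'root' if cols[0] == '1' else 'dep'  # DEPREL
                out.append('\t'.join(cols))
            else:
                out.append(line)  # comments, ranges, empty nodes, blank lines
        return '\n'.join(out) + '\n'

    def my_parser(conllu_text, lcode):
        """Placeholder: call your real tagger/parser here."""
        raise NotImplementedError

    def main():
        input_dir, output_dir = sys.argv[1], sys.argv[2]
        with open(os.path.join(input_dir, 'metadata.json'), encoding='utf-8') as f:
            entries = json.load(f)
        for entry in entries:
            # 'psegmorfile' = the UDPipe-preprocessed input (assumed field name)
            with open(os.path.join(input_dir, entry['psegmorfile']), encoding='utf-8') as f:
                text = f.read()
            try:
                result = my_parser(text, entry['lcode'])
            except Exception:
                result = fallback_parse(text)  # never leave an output file missing
            with open(os.path.join(output_dir, entry['outfile']), 'w', encoding='utf-8') as f:
                f.write(result)

    if __name__ == '__main__':
        main()

Of course, replace the flat attachment and the generic 'dep' relation with anything smarter; the point is only that every output file listed in metadata.json must exist and contain valid CoNLL-U.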

If you want to test your system locally, you can download the data with the folder structure used on TIRA, and with the input files preprocessed by UDPipe, from http://ufal.mff.cuni.cz/~zeman/soubory/tira-data-participants.zip.

Best regards, Dan Zeman on behalf of the costocom :) (the CoNLL shared task organizing committee) http://universaldependencies.org/conll17/