Baseline results
Two open-source systems have been run on the development data under comparable conditions to provide baseline results: UDPipe and SyntaxNet.
UDPipe
UDPipe results reflect end-to-end processing: tokenization, word segmentation, sentence segmentation, prediction of lemmas, UPOS tags, XPOS tags and morphological features, and parsing (prediction of labeled dependency relations). Data preprocessed by the lower levels of UDPipe (everything except parsing) are available to participants who do not want to train their own segmentation and morphology models.
Baseline UDPipe models have been released and can be downloaded from http://hdl.handle.net/11234/1-1990 (note that you need UDPipe 1.1.0 to load them). The package also contains the data splits and hyperparameters needed to re-train the models in the same way. Furthermore, it contains UD 2.0 training and development data with morphology predicted by UDPipe.
The evaluation scores, measured on development data, have been published together with the models (the README.txt file in udpipe-ud-2.0-conll17-170315.tar). For convenience, we also reproduce the relevant part of the README.txt here:
Baseline model performance
We measure the performance using the attached official evaluation script conll17_ud_eval.py (version 1.0).
We report not only the official LAS metric, but also all other metrics, in order for you to be able to decide whether to use UDPipe for segmentation/tokenization/POS tagging/lemmatization. The description of the metrics is available in the evaluation script.
We measure the performance in three settings: first, when only the raw texts are available (this is the official setting; F1 score is reported); then, when gold segmentation and tokenization are available; and finally, when gold segmentation, tokenization, lemmatization and morphology are available (accuracy is reported in the latter two cases).
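The difference between the F1 and accuracy figures can be illustrated with a minimal sketch. It assumes the usual definitions (precision = correct / system units, recall = correct / gold units); the function name and the counts are hypothetical, and the official conll17_ud_eval.py remains the reference for how the scores are actually computed.

```python
def f1_score(correct: int, system_total: int, gold_total: int) -> float:
    """Harmonic mean of precision and recall over matched units."""
    if system_total == 0 or gold_total == 0:
        return 0.0
    precision = correct / system_total
    recall = correct / gold_total
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Raw-text setting (hypothetical counts): the system segmented the text
# into 98 words, the gold standard has 100, and 90 words are correct.
raw = f1_score(correct=90, system_total=98, gold_total=100)

# Gold tokenization (hypothetical counts): system and gold contain the
# same 100 words, so precision == recall and F1 reduces to plain accuracy.
gold_tok = f1_score(correct=90, system_total=100, gold_total=100)

print(f"raw text F1:          {100 * raw:.2f}")
print(f"gold tokenization F1: {100 * gold_tok:.2f}")
```

When the system and gold word counts are equal, precision and recall coincide, which is why accuracy can be reported in the two gold-tokenization settings.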
F1 scores when processing raw texts
Language             | Tokens | Sentences | Words  | UPOS  | XPOS  | Feats  | AllTags | Lemmas | UAS   | LAS
---------------------+--------+-----------+--------+-------+-------+--------+---------+--------+-------+------
Ancient_Greek-PROIEL | 100.00 |     41.95 | 100.00 | 95.94 | 96.11 |  88.56 |   87.35 |  92.09 | 72.15 | 66.88
Ancient_Greek        |  99.98 |     99.17 |  99.98 | 81.52 | 72.15 |  86.85 |   72.15 |  83.45 | 62.11 | 55.21
Arabic               |  99.99 |     77.99 |  93.86 | 88.58 | 82.85 |  82.97 |   81.77 |  87.08 | 70.14 | 64.13
Basque               |  99.99 |     99.00 |  99.99 | 92.78 | 99.99 |  87.30 |   84.90 |  93.31 | 74.54 | 69.18
Bulgarian            |  99.84 |     92.41 |  99.84 | 97.49 | 94.26 |  95.20 |   93.36 |  94.40 | 86.86 | 82.14
Catalan              |  99.97 |     98.77 |  99.96 | 98.12 | 98.12 |  97.46 |   96.83 |  98.13 | 88.29 | 85.21
Chinese              |  88.95 |     97.60 |  88.95 | 82.37 | 82.32 |  87.66 |   81.03 |  88.94 | 59.98 | 56.00
Croatian             |  99.98 |     97.23 |  99.98 | 96.11 | 99.98 |  85.46 |   84.34 |  95.16 | 81.74 | 76.16
Czech-CAC            | 100.00 |     99.09 | 100.00 | 98.78 | 90.76 |  89.75 |   89.14 |  97.21 | 86.73 | 83.40
Czech-CLTT           |  98.65 |     74.11 |  98.65 | 90.35 | 78.60 |  79.49 |   78.50 |  91.70 | 70.02 | 66.46
Czech                |  99.96 |     92.41 |  99.96 | 98.41 | 93.26 |  92.41 |   92.05 |  98.08 | 87.38 | 83.93
Danish               |  99.68 |     84.36 |  99.68 | 94.80 | 99.68 |  93.96 |   92.84 |  94.86 | 76.86 | 73.24
Dutch-LassySmall     |  99.90 |     79.31 |  99.90 | 95.26 | 99.90 |  94.67 |   93.35 |  96.56 | 76.62 | 72.64
Dutch                |  99.87 |     92.11 |  99.87 | 93.94 | 90.64 |  91.61 |   89.63 |  91.91 | 78.92 | 73.91
English-LinES        |  99.93 |     87.36 |  99.93 | 94.74 | 92.85 |  99.93 |   91.15 |  99.93 | 78.48 | 74.25
English-ParTUT       |  99.46 |     97.62 |  99.46 | 93.99 | 93.68 |  92.94 |   91.64 |  96.37 | 77.81 | 73.81
English              |  98.69 |     76.35 |  98.69 | 93.10 | 92.42 |  93.82 |   90.95 |  95.87 | 79.24 | 75.80
Estonian             |  99.79 |     84.91 |  99.79 | 87.28 | 89.70 |  81.35 |   79.12 |  79.14 | 67.89 | 58.98
Finnish-FTB          |  99.94 |     82.52 |  99.93 | 92.00 | 90.95 |  92.38 |   89.10 |  88.86 | 78.60 | 74.02
Finnish              |  99.69 |     86.47 |  99.69 | 94.22 | 95.36 |  90.97 |   89.98 |  86.52 | 77.99 | 73.43
French-ParTUT        |  99.68 |     97.56 |  98.34 | 90.44 | 89.91 |  88.02 |   85.96 |  90.53 | 75.37 | 70.70
French-Sequoia       |  99.59 |     90.20 |  98.65 | 95.21 | 98.65 |  93.94 |   92.82 |  96.35 | 82.47 | 79.80
French               |  99.81 |     97.09 |  99.18 | 96.28 | 99.18 |  96.04 |   95.13 |  96.89 | 87.65 | 85.16
Galician-TreeGal     |  99.65 |     81.74 |  98.76 | 90.58 | 86.78 |  88.22 |   85.54 |  92.21 | 69.76 | 64.07
Galician             |  99.93 |     98.04 |  99.93 | 96.71 | 96.07 |  99.74 |   95.70 |  96.76 | 80.48 | 77.01
German               |  99.93 |     92.25 |  99.91 | 91.27 | 95.13 |  80.25 |   76.21 |  96.06 | 78.21 | 73.11
Gothic               | 100.00 |     23.51 | 100.00 | 93.29 | 94.18 |  85.61 |   83.54 |  92.19 | 65.74 | 58.31
Greek                |  99.87 |     88.67 |  99.87 | 94.03 | 94.03 |  89.43 |   87.52 |  91.93 | 81.52 | 77.84
Hebrew               |  99.89 |     98.57 |  88.15 | 84.30 | 84.30 |  81.45 |   80.80 |  84.79 | 66.16 | 61.38
Hindi                | 100.00 |     98.46 | 100.00 | 95.79 | 94.81 |  90.21 |   87.63 |  98.11 | 91.21 | 86.82
Hungarian            |  99.91 |     94.55 |  99.91 | 92.23 | 99.91 |  70.56 |   69.72 |  88.32 | 71.09 | 64.62
Indonesian           |  99.99 |     90.83 |  99.99 | 93.35 | 99.99 |  99.51 |   93.34 |  99.99 | 80.44 | 73.81
Irish                |  99.06 |     92.44 |  99.06 | 88.95 | 86.91 |  74.92 |   71.60 |  85.33 | 73.33 | 62.69
Italian-ParTUT       |  99.65 |     95.72 |  99.49 | 94.83 | 94.51 |  94.94 |   93.33 |  95.24 | 81.28 | 77.79
Italian              |  99.82 |     93.20 |  99.70 | 96.92 | 96.51 |  97.01 |   95.62 |  97.28 | 86.86 | 84.08
Japanese             |  89.53 |     99.71 |  89.53 | 87.12 | 89.53 |  89.50 |   87.12 |  88.98 | 74.40 | 73.49
Kazakh               |  97.70 |    100.00 |  97.73 | 62.50 | 63.64 |  62.50 |   57.95 |  71.59 | 34.09 | 20.45
Korean               |  99.45 |     91.10 |  99.45 | 93.07 | 87.45 |  99.11 |   87.45 |  98.86 | 62.62 | 55.37
Latin-ITTB           |  99.88 |     77.38 |  99.88 | 96.79 | 87.52 |  90.21 |   85.49 |  97.13 | 73.62 | 68.73
Latin-PROIEL         |  99.99 |     19.76 |  99.99 | 95.07 | 95.35 |  87.80 |   86.80 |  94.33 | 64.54 | 58.83
Latin                |  99.75 |     93.33 |  99.75 | 86.66 | 70.90 |  74.85 |   70.90 |  60.71 | 56.86 | 46.95
Latvian              |  98.91 |     96.48 |  98.91 | 89.78 | 75.89 |  82.49 |   75.45 |  86.64 | 67.84 | 61.20
Norwegian-Bokmaal    |  99.89 |     96.91 |  99.89 | 97.21 | 99.89 |  95.90 |   95.15 |  97.00 | 87.03 | 83.90
Norwegian-Nynorsk    |  99.92 |     93.05 |  99.92 | 96.55 | 99.92 |  95.38 |   94.34 |  96.80 | 84.76 | 81.51
Old_Church_Slavonic  | 100.00 |     37.09 | 100.00 | 93.73 | 93.95 |  86.95 |   85.72 |  90.19 | 70.05 | 63.33
Persian              | 100.00 |     97.14 |  99.69 | 95.89 | 95.71 |  95.81 |   95.12 |  88.74 | 83.23 | 79.18
Polish               | 100.00 |     99.56 |  99.87 | 95.22 | 82.15 |  82.27 |   81.33 |  93.01 | 84.82 | 78.66
Portuguese-BR        |  99.96 |     96.65 |  99.83 | 97.21 | 97.21 |  99.65 |   97.19 |  98.77 | 88.04 | 85.90
Portuguese           |  99.82 |     89.27 |  99.74 | 96.59 | 73.16 |  93.39 |   72.01 |  96.59 | 87.18 | 84.09
Romanian             |  99.55 |     95.16 |  99.55 | 96.56 | 95.52 |  95.73 |   95.20 |  95.88 | 84.49 | 79.08
Russian-SynTagRus    |  99.68 |     97.67 |  99.68 | 97.85 | 99.68 |  93.35 |   92.91 |  95.51 | 89.13 | 86.31
Russian              |  99.92 |     96.18 |  99.92 | 94.86 | 94.08 |  84.09 |   81.99 |  74.78 | 79.80 | 74.77
Slovak               | 100.00 |     77.85 | 100.00 | 92.97 | 74.87 |  77.05 |   74.65 |  84.61 | 79.43 | 73.46
Slovenian-SST        |  99.70 |     14.33 |  99.70 | 89.13 | 81.70 |  81.51 |   79.44 |  91.29 | 52.73 | 45.01
Slovenian            |  99.94 |     99.59 |  99.94 | 96.29 | 88.18 |  88.45 |   86.46 |  94.78 | 84.05 | 80.61
Spanish-AnCora       |  99.98 |     96.33 |  99.94 | 98.10 | 98.10 |  97.53 |   96.82 |  98.09 | 87.31 | 84.33
Spanish              |  99.91 |     98.07 |  99.80 | 96.08 | 99.79 |  96.80 |   94.33 |  95.44 | 87.14 | 83.98
Swedish-LinES        |  99.97 |     87.28 |  99.97 | 94.62 | 92.07 |  99.97 |   90.81 |  99.97 | 79.15 | 74.21
Swedish              |  99.77 |     95.59 |  99.77 | 94.88 | 93.21 |  94.04 |   92.05 |  95.45 | 77.37 | 73.13
Turkish              |  99.87 |     96.98 |  97.88 | 90.59 | 89.59 |  85.60 |   83.39 |  88.17 | 60.79 | 53.48
Ukrainian            |  99.41 |     68.75 |  99.41 | 85.31 | 65.44 |  65.38 |   64.26 |  83.87 | 61.97 | 53.70
Urdu                 |  99.99 |     98.37 |  99.99 | 92.43 | 90.46 |  80.29 |   76.19 |  93.19 | 82.99 | 76.12
Uyghur               | 100.00 |     70.00 | 100.00 | 70.19 | 74.04 | 100.00 |   70.19 | 100.00 | 55.77 | 34.62
Vietnamese           |  83.99 |     96.28 |  83.99 | 75.60 | 73.62 |  83.84 |   73.61 |  83.09 | 44.98 | 40.45
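Because the rows of the raw-text table are pipe-delimited, they are easy to filter when deciding whether to rely on UDPipe for a given processing step. The following is a minimal sketch, not part of the released package: the helper name `parse_table` and the threshold of 50 are illustrative assumptions, and the excerpt contains four rows copied from the table above. It assumes every cell is numeric, so it would need adjusting for tables that mark missing values with dashes.

```python
TABLE_EXCERPT = """\
Language | Tokens | Sentences | Words | UPOS | XPOS | Feats | AllTags | Lemmas | UAS | LAS
---------+--------+-----------+-------+------+------+-------+---------+--------+-----+----
Catalan | 99.97 | 98.77 | 99.96 | 98.12 | 98.12 | 97.46 | 96.83 | 98.13 | 88.29 | 85.21
Gothic | 100.00 | 23.51 | 100.00 | 93.29 | 94.18 | 85.61 | 83.54 | 92.19 | 65.74 | 58.31
Latin-PROIEL | 99.99 | 19.76 | 99.99 | 95.07 | 95.35 | 87.80 | 86.80 | 94.33 | 64.54 | 58.83
Slovenian-SST | 99.70 | 14.33 | 99.70 | 89.13 | 81.70 | 81.51 | 79.44 | 91.29 | 52.73 | 45.01
"""


def parse_table(text: str) -> dict:
    """Map each treebank name to a {metric name: score} dictionary."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = {}
    for line in lines[1:]:
        # Skip the separator row, which consists only of '-' and '+'.
        if set(line.strip()) <= set("-+"):
            continue
        cells = [cell.strip() for cell in line.split("|")]
        rows[cells[0]] = {
            name: float(value) for name, value in zip(header[1:], cells[1:])
        }
    return rows


scores = parse_table(TABLE_EXCERPT)

# Illustrative threshold: treebanks whose sentence segmentation F1 is
# below 50 may call for a custom sentence splitter.
weak_segmentation = sorted(
    name for name, row in scores.items() if row["Sentences"] < 50.0
)
print(weak_segmentation)
```

For the four excerpted rows, the three treebanks with very low sentence segmentation scores are selected and Catalan is not.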
Accuracies when processing gold tokenized texts (FORMs only)
Language             | UPOS  | XPOS   | Feats | AllTags | Lemmas | UAS   | LAS
---------------------+-------+--------+-------+---------+--------+-------+------
Ancient_Greek-PROIEL | 96.01 |  96.16 | 88.73 |   87.60 |  92.12 | 77.35 | 72.03
Ancient_Greek        | 81.54 |  72.17 | 86.88 |   72.17 |  83.47 | 62.12 | 55.23
Arabic               | 94.57 |  88.80 | 88.92 |   87.71 |  92.20 | 80.13 | 73.04
Basque               | 92.80 |  ----- | 87.32 |   84.91 |  93.31 | 74.59 | 69.23
Bulgarian            | 97.72 |  94.52 | 95.43 |   93.61 |  94.62 | 88.02 | 83.22
Catalan              | 98.15 |  98.15 | 97.51 |   96.88 |  98.17 | 88.37 | 85.28
Chinese              | 91.21 |  91.12 | 98.72 |   89.84 |  99.98 | 74.03 | 68.75
Croatian             | 96.15 |  ----- | 85.52 |   84.40 |  95.17 | 81.99 | 76.40
Czech-CAC            | 98.78 |  90.81 | 89.79 |   89.19 |  97.21 | 86.81 | 83.50
Czech-CLTT           | 92.06 |  80.18 | 81.31 |   80.09 |  93.33 | 72.65 | 69.01
Czech                | 98.48 |  93.34 | 92.49 |   92.13 |  98.12 | 88.14 | 84.68
Danish               | 95.19 |  ----- | 94.32 |   93.21 |  95.15 | 78.40 | 74.74
Dutch-LassySmall     | 95.65 |  ----- | 95.15 |   93.88 |  96.79 | 79.59 | 75.46
Dutch                | 94.07 |  90.86 | 91.83 |   89.84 |  92.06 | 79.57 | 74.55
English-LinES        | 94.75 |  92.86 | ----- |   91.15 |  ----- | 79.00 | 74.68
English-ParTUT       | 94.39 |  94.08 | 93.33 |   92.01 |  96.86 | 78.26 | 74.23
English              | 94.43 |  93.80 | 95.25 |   92.37 |  97.03 | 83.83 | 80.13
Estonian             | 87.52 |  89.93 | 81.43 |   79.15 |  79.30 | 69.26 | 60.16
Finnish-FTB          | 92.34 |  91.37 | 92.72 |   89.54 |  89.03 | 80.72 | 76.10
Finnish              | 94.52 |  95.65 | 91.31 |   90.32 |  86.76 | 79.97 | 75.37
French-ParTUT        | 92.34 |  91.61 | 89.45 |   87.65 |  91.88 | 77.82 | 73.67
French-Sequoia       | 96.60 |  ----- | 95.20 |   94.08 |  97.62 | 84.35 | 81.93
French               | 97.08 | 100.00 | 96.82 |   95.90 |  97.70 | 88.72 | 86.36
Galician-TreeGal     | 92.08 |  88.09 | 89.53 |   86.71 |  93.52 | 72.32 | 66.43
Galician             | 96.77 |  96.12 | 99.82 |   95.75 |  96.83 | 80.66 | 77.17
German               | 91.39 |  95.29 | 80.39 |   76.35 |  96.16 | 79.40 | 74.11
Gothic               | 94.22 |  95.01 | 86.13 |   84.55 |  92.14 | 76.42 | 68.92
Greek                | 94.17 |  94.17 | 89.56 |   87.63 |  92.03 | 82.37 | 78.69
Hebrew               | 95.72 |  95.72 | 92.46 |   91.81 |  95.62 | 83.93 | 78.03
Hindi                | 95.79 |  94.82 | 90.23 |   87.64 |  98.12 | 91.29 | 86.90
Hungarian            | 92.31 |  ----- | 70.62 |   69.78 |  88.40 | 71.52 | 65.04
Indonesian           | 93.36 |  ----- | 99.52 |   93.35 |  ----- | 80.76 | 74.08
Irish                | 90.04 |  88.00 | 76.00 |   72.68 |  86.11 | 74.57 | 63.47
Italian-ParTUT       | 95.16 |  94.85 | 95.27 |   93.63 |  95.59 | 82.02 | 78.47
Italian              | 97.23 |  96.83 | 97.30 |   95.93 |  97.52 | 87.77 | 85.04
Japanese             | 96.72 |  ----- | 99.94 |   96.72 |  99.33 | 94.31 | 92.94
Kazakh               | 65.17 |  66.29 | 64.04 |   59.55 |  74.16 | 37.08 | 23.60
Korean               | 93.68 |  88.05 | 99.68 |   88.05 |  99.37 | 63.71 | 56.41
Latin-ITTB           | 96.86 |  87.58 | 90.27 |   85.52 |  97.23 | 75.95 | 71.07
Latin-PROIEL         | 95.43 |  95.65 | 88.38 |   87.49 |  94.39 | 75.31 | 69.11
Latin                | 86.64 |  70.95 | 75.01 |   70.95 |  60.99 | 56.93 | 47.13
Latvian              | 90.81 |  76.79 | 83.45 |   76.34 |  87.62 | 69.41 | 62.68
Norwegian-Bokmaal    | 97.34 |  ----- | 96.02 |   95.28 |  97.11 | 87.52 | 84.38
Norwegian-Nynorsk    | 96.74 |  ----- | 95.55 |   94.53 |  96.93 | 85.79 | 82.49
Old_Church_Slavonic  | 94.07 |  94.27 | 87.46 |   86.23 |  90.42 | 80.41 | 73.19
Persian              | 96.17 |  96.00 | 96.10 |   95.40 |  88.98 | 83.89 | 79.81
Polish               | 95.34 |  82.28 | 82.41 |   81.44 |  93.15 | 85.18 | 79.01
Portuguese-BR        | 97.40 |  97.40 | 99.82 |   97.38 |  98.95 | 88.37 | 86.26
Portuguese           | 97.00 |  73.40 | 93.65 |   72.25 |  96.87 | 88.37 | 85.20
Romanian             | 96.98 |  95.91 | 96.12 |   95.59 |  96.32 | 85.22 | 79.66
Russian-SynTagRus    | 98.20 |  ----- | 93.64 |   93.22 |  95.78 | 89.69 | 86.84
Russian              | 94.95 |  94.17 | 84.11 |   82.02 |  74.85 | 80.13 | 75.07
Slovak               | 93.14 |  75.12 | 77.25 |   74.90 |  84.61 | 81.81 | 75.55
Slovenian-SST        | 90.00 |  82.96 | 83.16 |   81.09 |  91.93 | 63.71 | 55.39
Slovenian            | 96.34 |  88.24 | 88.50 |   86.50 |  94.82 | 84.16 | 80.72
Spanish-AnCora       | 98.16 |  98.16 | 97.59 |   96.89 |  98.17 | 87.53 | 84.54
Spanish              | 96.24 | 100.00 | 96.99 |   94.48 |  95.61 | 87.50 | 84.29
Swedish-LinES        | 94.63 |  92.16 | ----- |   90.89 |  ----- | 79.72 | 74.72
Swedish              | 95.17 |  93.48 | 94.29 |   92.32 |  95.68 | 77.94 | 73.64
Turkish              | 92.25 |  91.15 | 87.25 |   84.89 |  89.84 | 63.41 | 55.70
Ukrainian            | 86.19 |  66.07 | 65.94 |   64.83 |  84.35 | 62.66 | 54.17
Urdu                 | 92.45 |  90.47 | 80.28 |   76.20 |  93.20 | 83.05 | 76.15
Uyghur               | 69.23 |  73.08 | ----- |   69.23 |  ----- | 62.50 | 38.46
Vietnamese           | 88.68 |  86.29 | 99.77 |   86.28 |  98.78 | 63.99 | 56.34
Accuracies when processing gold tokenized and gold POS files (i.e., all columns except HEAD and DEPREL are gold-standard)
Language             | UAS   | LAS
---------------------+-------+------
Ancient_Greek-PROIEL | 79.83 | 75.72
Ancient_Greek        | 66.91 | 61.65
Arabic               | 83.13 | 78.11
Basque               | 81.03 | 76.88
Bulgarian            | 91.86 | 87.56
Catalan              | 90.82 | 88.35
Chinese              | 82.14 | 79.37
Croatian             | 84.66 | 80.76
Czech-CAC            | 88.87 | 86.57
Czech-CLTT           | 81.58 | 78.95
Czech                | 90.72 | 88.19
Danish               | 83.82 | 81.13
Dutch-LassySmall     | 83.53 | 80.34
Dutch                | 85.99 | 82.43
English-LinES        | 83.36 | 80.51
English-ParTUT       | 83.75 | 81.29
English              | 87.68 | 85.82
Estonian             | 81.19 | 76.37
Finnish-FTB          | 87.63 | 85.14
Finnish              | 84.67 | 82.12
French-ParTUT        | 84.22 | 80.61
French-Sequoia       | 88.11 | 86.66
French               | 90.77 | 89.02
Galician-TreeGal     | 79.19 | 74.48
Galician             | 82.93 | 80.55
German               | 86.20 | 84.06
Gothic               | 81.35 | 76.51
Greek                | 85.63 | 83.71
Hebrew               | 87.32 | 83.18
Hindi                | 94.23 | 91.07
Hungarian            | 77.80 | 73.98
Indonesian           | 82.55 | 78.43
Irish                | 77.89 | 71.09
Italian-ParTUT       | 87.33 | 85.16
Italian              | 90.27 | 88.44
Japanese             | 95.94 | 95.48
Kazakh               | 44.94 | 34.83
Korean               | 68.10 | 62.06
Latin-ITTB           | 81.40 | 77.91
Latin-PROIEL         | 78.57 | 74.36
Latin                | 65.33 | 60.04
Latvian              | 76.66 | 72.71
Norwegian-Bokmaal    | 91.12 | 88.78
Norwegian-Nynorsk    | 90.55 | 87.99
Old_Church_Slavonic  | 84.28 | 79.44
Persian              | 88.10 | 85.16
Polish               | 90.88 | 87.35
Portuguese-BR        | 91.04 | 89.57
Portuguese           | 91.11 | 89.45
Romanian             | 87.36 | 82.25
Russian-SynTagRus    | 91.16 | 89.63
Russian              | 83.73 | 80.84
Slovak               | 87.17 | 83.83
Slovenian-SST        | 73.76 | 67.31
Slovenian            | 90.45 | 89.15
Spanish-AnCora       | 90.10 | 87.55
Spanish              | 89.01 | 86.69
Swedish-LinES        | 85.10 | 81.38
Swedish              | 84.07 | 80.40
Turkish              | 66.02 | 60.27
Ukrainian            | 74.03 | 69.30
Urdu                 | 87.13 | 81.62
Uyghur               | 75.00 | 53.85
Vietnamese           | 69.63 | 66.22
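To make the UAS and LAS columns concrete, here is a minimal sketch of how the two accuracies can be computed when tokenization is gold, so that system and gold words line up one to one. It is an illustration, not the official scorer: the function names and the three-word example are hypothetical, and the sketch compares DEPREL strings verbatim, whereas the official conll17_ud_eval.py may normalize labels differently.

```python
def read_heads_and_labels(conllu: str) -> list:
    """Collect (HEAD, DEPREL) for every syntactic word in a CoNLL-U string."""
    pairs = []
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        pairs.append((cols[6], cols[7]))  # columns 7 and 8: HEAD, DEPREL
    return pairs


def uas_las(gold: str, system: str) -> tuple:
    """Return (UAS, LAS) in percent; requires identical tokenization."""
    gold_pairs = read_heads_and_labels(gold)
    sys_pairs = read_heads_and_labels(system)
    if len(gold_pairs) != len(sys_pairs) or not gold_pairs:
        raise ValueError("gold and system must contain the same words")
    total = len(gold_pairs)
    uas_hits = sum(g[0] == s[0] for g, s in zip(gold_pairs, sys_pairs))
    las_hits = sum(g == s for g, s in zip(gold_pairs, sys_pairs))
    return 100 * uas_hits / total, 100 * las_hits / total


def make_row(idx: int, form: str, head: int, deprel: str) -> str:
    # ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    return "\t".join(
        [str(idx), form, "_", "_", "_", "_", str(head), deprel, "_", "_"]
    )


# Hypothetical example: the system attaches "books" to the right head
# but assigns the wrong relation label.
GOLD = "\n".join([
    "# text = She reads books",
    make_row(1, "She", 2, "nsubj"),
    make_row(2, "reads", 0, "root"),
    make_row(3, "books", 2, "obj"),
])
SYSTEM = "\n".join([
    make_row(1, "She", 2, "nsubj"),
    make_row(2, "reads", 0, "root"),
    make_row(3, "books", 2, "nmod"),
])

uas, las = uas_las(GOLD, SYSTEM)
print(f"UAS={uas:.2f} LAS={las:.2f}")  # prints "UAS=100.00 LAS=66.67"
```

A word counts toward UAS when its head is correct and toward LAS only when both head and label are correct, so LAS can never exceed UAS, as in every row of the tables above.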
SyntaxNet
See the SyntaxNet repository on GitHub for a table of SyntaxNet scores: https://github.com/tensorflow/models/blob/master/syntaxnet/g3doc/conll2017/README.md. Note that these scores were obtained by parsers run on gold-standard segmentation, i.e. they are comparable to the last table above for UDPipe, but they are not realistic with respect to the shared task setup.