Multilingual Parsing from Raw Text to Universal Dependencies

A CoNLL 2018 shared task.

The proposed task is a follow-up to the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. We first summarize the aspects that will be new in 2018; then we provide a more detailed description of the shared task for readers who are not familiar with the 2017 task.

There will be three main evaluation metrics. None of them is more important than the others and we will not combine them into a single ranking. Participants who want to decrease task complexity may concentrate on improvements in just one metric; however, all participating systems will be evaluated with all three metrics, and participants are strongly encouraged to output all relevant annotation (syntax + morphology), even if they just copy values predicted by the baseline model.

A detailed description of the three metrics is given separately. All three include word segmentation and labeled dependency relations. One of them is identical to the 2017 main metric so that results can be compared. Of the other two, which focus on content words, one also includes morphological features and the other lemmatization.
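The labeled-dependency component shared by all three metrics can be illustrated with a simplified score. The sketch below is not the official evaluation script. It assumes that each word is identified by its character span in the raw text, so that system words and gold words can be matched even when segmentation differs, and it counts a word as correct only if both its head and its relation label match. The function name `labeled_attachment_f1` and the data layout are hypothetical.

```python
def labeled_attachment_f1(gold, system):
    """Simplified labeled attachment F1.

    gold, system: dicts mapping a word's character span (start, end)
    to a pair (head_span, relation_label); head_span is None for the root.
    This layout is an assumption made for the illustration.
    """
    # A system word is correct if the same span exists in the gold data
    # with the same head and the same relation label.
    correct = sum(1 for span, dep in system.items() if gold.get(span) == dep)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# "Dogs bark." with one wrongly labeled relation in the system output.
gold = {
    (0, 4): ((5, 9), "nsubj"),
    (5, 9): (None, "root"),
    (9, 10): ((5, 9), "punct"),
}
system = {
    (0, 4): ((5, 9), "obj"),
    (5, 9): (None, "root"),
    (9, 10): ((5, 9), "punct"),
}
score = labeled_attachment_f1(gold, system)
```

Because precision and recall are computed over system words and gold words separately, a segmentation error lowers the score even if the remaining words are attached correctly.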

Instead of surprise languages, there will be a category of low-resource languages that have little or no training data. The names of these languages, as well as whatever sample data may be available, will be announced in advance rather than kept as a surprise.

There will be new languages that were not part of the 2017 evaluation (Afrikaans and Serbian already satisfy the requirements; others may be available when the training data is released).

Description of the Shared Task

The focus of the task is learning syntactic dependency parsers that can work in a real-world setting, starting from raw text, and that can work over many typologically different languages, even low-resource languages for which there is little or no training data, by exploiting a common syntactic annotation standard. This task has been made possible by the Universal Dependencies initiative (UD), which has developed treebanks for 60+ languages with cross-linguistically consistent annotation and recoverability of the original raw texts.

Participating systems will have to find labeled syntactic dependencies between words, i.e. a syntactic head for each word and a label classifying the type of the dependency relation. In addition to syntactic dependencies, prediction of morphology and lemmatization will be evaluated. There will be multiple test sets in various languages, but all data sets will adhere to the common annotation style of UD. Participants will be asked to parse raw text where no gold-standard pre-processing (tokenization, lemmas, morphology) is available. We will provide data preprocessed by a baseline system (UDPipe) so that participants can focus on improving just one part of the processing pipeline if they want to. We believe that this makes the task reasonably accessible for everyone.
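To make the expected annotation concrete, the sketch below reads one sentence in CoNLL-U, the tab-separated format used by UD treebanks, and extracts the head and relation label of each word. The sample sentence and the function name `read_dependencies` are illustrative assumptions, not part of the task definition.

```python
# CoNLL-U columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
SAMPLE = "\n".join([
    "# text = Dogs bark.",
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_",
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No",
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])


def read_dependencies(conllu_sentence):
    """Return a list of (id, form, head, deprel) for one CoNLL-U sentence."""
    deps = []
    for line in conllu_sentence.splitlines():
        # Skip blank lines and comment lines such as "# text = ...".
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword token ranges ("1-2") and empty nodes ("1.1");
        # only plain integer IDs are syntactic words.
        if not cols[0].isdigit():
            continue
        deps.append((int(cols[0]), cols[1], int(cols[6]), cols[7]))
    return deps


dependencies = read_dependencies(SAMPLE)
```

A head value of 0 marks the root of the sentence; every other word points to the ID of its syntactic head.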

We do not plan on running separate open and closed tracks. All our tracks will be formally closed, but the list of permitted resources is rather broad and includes large raw corpora and parallel corpora (see the Data description).

The task is open to everyone. As is usual in large shared tasks, the organizers rely on the honesty of all participants who might have some prior knowledge of part of the data that will eventually be used for evaluation, and trust them not to use such knowledge unfairly. The only exception is the co-chairs of the organizing team, who cannot submit a system and who will serve as an authority to resolve any disputes concerning ethical issues or completeness of system descriptions.


The organization of the shared task was partially supported by the following projects:

  • Czech Science Foundation (GAČR) project No. 15-10472S.
  • CRACKER, an EU H2020 project.