1. Team name + primary software system ID
2. Tokenization, word and sentence segmentation
2a. Comments on tokenization, word and sentence segmentation (optional)
3. Morphology (lemmas, UPOS, XPOS, features)
3a. Comments on morphology (lemmas, UPOS, XPOS, features) (optional)
4. Parsing: describe whether your system is based on a single parser or an ensemble; identify all parsers used
5. Word embeddings
5a. Comments on word embeddings (optional)
6. Additional data: apart from the UD 2.0 training data and the raw data mentioned above, my system uses the following data permitted in the shared task (write "None" if no such data is used)
7. Multilinguality and surprise languages
7a. Comments on multilinguality and surprise languages (optional)
C2L2 Software 5 Baseline UDPipe
Baseline UDPipe. No morphology; we used UPOS tags as an auxiliary task, and morphological features only for the delexicalized parsers.
We used an ensemble of first-order MST parsers and arc-eager and arc-hybrid transition-based parsers. No vectors; we have dense vector representations for words, but they are initialized randomly and trained along with the parsing tasks.
None. Cross for surprise, Cross-lingual for small treebanks, One fixed model for PUD. We trained delexicalized parsers as our "cross-lingual technique".
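Several responses in this appendix mention delexicalized parsers. For readers unfamiliar with the technique, here is a minimal sketch of delexicalizing a CoNLL-U treebank by overwriting the lexical columns with the UPOS tag, so that a parser trained on the result depends only on POS and morphology and can be applied across languages. Function and file names are illustrative, not any team's actual code.

```python
# Delexicalize a CoNLL-U treebank: replace FORM and LEMMA with the UPOS tag,
# so a parser trained on the output uses no language-specific vocabulary.
def delexicalize_conllu(src_path, dst_path):
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.rstrip("\n")
            cols = line.split("\t")
            # Pass through comments, blank lines, multiword-token ranges
            # (IDs like "1-2") and empty nodes (IDs like "1.1") unchanged.
            if len(cols) != 10 or not cols[0].isdigit():
                dst.write(line + "\n")
                continue
            cols[1] = cols[3]   # FORM  <- UPOS
            cols[2] = cols[3]   # LEMMA <- UPOS
            dst.write("\t".join(cols) + "\n")

delexicalize_conllu("cs-ud-train.conllu", "cs-ud-train.delex.conllu")
```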
CLCL (Genève) software2-P Baseline UDPipe
Baseline UDPipe
Our system is a single parser. We train our own word embeddings as part of training the parser, using only the training treebank. We tried initialising with Facebook embeddings on a sample of languages, but random initialisation works better for us. None. Cross for surprise. Cross-lingual technique used: we identified the most similar languages to the surprise language with a string-based technique, concatenated their treebanks, and trained and tested on the surprise languages.
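CLCL describe their string-based technique only at this level of detail; the following is a minimal sketch of one plausible instantiation, character-trigram cosine similarity between raw text of the surprise-language sample and each candidate treebank. All file names are hypothetical.

```python
# Rank candidate source languages by character-trigram cosine similarity
# to a surprise-language sample. One possible "string-based technique";
# not CLCL's actual implementation.
from collections import Counter
from math import sqrt

def char_ngrams(text, n=3):
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm

surprise = char_ngrams(open("hsb-sample.txt", encoding="utf-8").read())
candidates = {"cs": "cs-train.txt", "pl": "pl-train.txt", "sl": "sl-train.txt"}
scores = {lang: cosine(surprise, char_ngrams(open(path, encoding="utf-8").read()))
          for lang, path in candidates.items()}
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # most similar first
```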
darc (Tübingen) + software1 Retrained UDPipe myself
Own
We implemented a transition-based parser. UD vectors
None. Cross for surprise, One fixed model for PUD
fbaml Own
Own. No lemmas in our system. A single parser. Baseline vectors
None. Train on sample
HIT-SCIR+software4 Own
No morphology
We used either a single parser, an ensemble parser, or a transferred parser for different languages, according to performance on the development set. Baseline vectors
We used the OPUS parallel data for a subset of language pairs. Cross for surprise, Cross-lingual for small treebanks, Domain adaptation for PUD, One fixed model for PUD
IMS + Software2 Improved UDPipe: we only apply our own tokenization for ar, he, ja, vi, zh, and our own sentence segmentation for ar, cu, en, et, got, grc_proiel, la, la_proiel, la_ittb, nl_lassysmall, sl_sst; otherwise UDPipe. Own
We reparse with the outputs of multiple different parsers: one graph-based perceptron (Carreras decoder), one transition-based perceptron with beam search (with the Swap system), and one greedy transition-based neural parser with a character model (Swap). Baseline vectors
None. Cross for surprise, Train on sample, Union of models for PUD
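IMS's "reparsing" combines the outputs of several parsers. A standard formulation (Sagae and Lavie, 2006) lets each parser vote for arcs and decodes a tree from the vote-weighted graph; the sketch below shows only the voting step, with a simple per-token majority that is not guaranteed to yield a well-formed tree (a real system would run an MST decoder over the weighted arcs).

```python
# Arc voting over multiple parsers' head predictions for one sentence.
from collections import Counter

def vote_heads(predictions):
    """predictions: one head sequence per parser,
    e.g. [[2, 0, 2], [2, 0, 2], [3, 0, 2]] for a 3-token sentence."""
    n_tokens = len(predictions[0])
    voted = []
    for i in range(n_tokens):
        votes = Counter(parse[i] for parse in predictions)
        voted.append(votes.most_common(1)[0][0])  # may not form a tree!
    return voted

print(vote_heads([[2, 0, 2], [2, 0, 2], [3, 0, 2]]))  # -> [2, 0, 2]
```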
Koç University, software3 Baseline UDPipe
Baseline UDPipe. Only used UPOS; did not yet explore other features. Single parser. Crawled data own vectors. Word and context embeddings were produced by a BLSTM trained on raw data. None. Cross for surprise, Cross-lingual for small treebanks
LATTICE Software7 Baseline UDPipe
Baseline UDPipe
We are using a single parser, but different features were used for each language. Baseline vectors, Facebook vectors, I computed my own vectors on other permitted data (please indicate in the comment). 1. Mainly, we used the pre-computed word embeddings provided by the organizers.
2. We've used multilingual word embeddings built from Europarl and WMT data; the resources can be found on the OPUS website.
3. We've also used embeddings from Facebook and converted them to a multilingual space for the surprise languages.
In detail:
- Wikipedia dumps for the surprise languages
- Europarl v7 in OPUS for generating multilingual word embeddings to build one model for all the *_partut treebanks
- WMT 2011 in OPUS for generating multilingual word embeddings (English only)
- Bilingual dictionaries in OPUS for the surprise languages
- Language one-hot encodings made by hand
Cross for surprise, Cross-lingual for small treebanks, Domain adaptation for PUD, Union of models for PUD. Actually, we found that using multilingual embeddings was effective even for resource-rich languages. However, it was almost impossible to build multilingual embeddings from the (monolingual) Facebook ones because the file sizes were too big, so we could not apply the multilingual approach to some languages.
Anyway, thanks for your service! It was really smooth :)
LIMSI 2 Baseline UDPipe, Improved UDPipe. Does not care about segmentation in general, with 4 exceptions: for vi, a tokenization trick on top of UDPipe's tokenization; for ja/ja_pud/zh, reads plain-text sentences from UDPipe's output and retokenizes them. Baseline UDPipe. When using custom tokenization, morphology is reannotated, but still with the official UDPipe model. It depends on the language. For big treebanks: the official UDPipe model. For medium treebanks: a single PanParser model (an in-house ArcEager structured perceptron with a dynamic oracle). For the smallest treebanks (and a few larger ones, tuning this choice on the dev sets): a custom combination of UDPipe and PanParser, specially retrained so that each one annotates part of the sentence (with additional ensembling). Baseline vectors
None. Cross for surprise, Cross-lingual for small treebanks, One fixed model for PUD
LyS-FASTPARSE + software5 Baseline UDPipe
Baseline UDPipe
Single parser: a bidirectional-LSTM non-projective transition-based parser. Baseline vectors. The word embeddings pre-computed by the organizers were skipped for the largest models (roughly the 20 largest treebanks) due to lack of resources/time. None. Cross for surprise, One fixed model for PUD
Mengest software1-P Baseline UDPipe
Baseline UDPipe. My system only uses lemmas, UPOS and XPOS. My system is based on a single parser: the transition-based BIST parser. Crawled data own vectors
None. Canonical surprise, One fixed model for PUD
MetaRomance Baseline UDPipe
Baseline UDPipe
Single parser: an unsupervised strategy based on cross-lingual rules for Romance languages. We applied the same parser with the same rules to all languages. No vectors
None. Canonical surprise
METU + Software 2 Baseline UDPipe
Baseline UDPipe
A single parser: Chen & Manning (2014). Baseline vectors
PTB + CCGbank. Cross-lingual for small treebanks, One fixed model for PUD
MQuni Baseline UDPipe
No morphology
Our system is based on a single parser. No vectors. We do not use any external resources such as pre-trained word embeddings. We use a fixed random seed and a fixed set of hyper-parameters for all treebanks. None. Train on sample, One fixed model for PUD
MQuni + software2 Baseline UDPipe
No morphology
My system is based on a single parser. In fact, it is a joint system for POS tagging and graph-based dependency parsing, trained with a fixed set of hyper-parameters (i.e. no hyper-parameter tuning) for every language. I do not use *pre-trained* word vectors in any experiment; I only use a fixed random seed.
None. Train on sample, One fixed model for PUD. Mean rank or MRR should be officially used to rank participating systems; LAS should only be used to compute each system's rank for each language.
NAIST SATO (software1) Baseline UDPipe
Baseline UDPipe
Single parser. Baseline vectors
None. Canonical surprise, One fixed model for PUD. We use an adversarial training technique for datasets from different domains (but the same language).
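The response does not say which adversarial training method was used. One standard realization of domain-adversarial training is a gradient reversal layer (Ganin and Lempitsky, 2015): a domain classifier is trained on the shared features, but its gradient is negated on the way back into the encoder, pushing the features toward domain invariance. A minimal PyTorch sketch, with illustrative dimensions and not NAIST's actual code:

```python
# Gradient reversal layer for domain-adversarial training.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)          # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None  # negated gradient to encoder

encoder = nn.Linear(100, 64)         # shared feature extractor (toy sizes)
domain_clf = nn.Linear(64, 2)        # predicts which domain/treebank

x = torch.randn(8, 100)              # a batch of input features
h = torch.relu(encoder(x))
domain_logits = domain_clf(GradReverse.apply(h, 1.0))
# Training domain_clf on domain_logits now pushes the encoder's features
# to be *less* domain-discriminative, i.e. domain-invariant.
```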
OpenU-NLP-Lab software6 Baseline UDPipe, Improved UDPipe
Baseline UDPipe; UDPipe used for a handful of languages and the surprise languages, otherwise we did our own morphology
Single parser, yap. No vectors
None. Cross-lingual for small treebanks, One fixed model for PUD
Orange-Deskiñ software1-P Baseline UDPipe
No morphology. We used lemmas where available, but didn't have the time to integrate morphological features (meanwhile it's done, and gives a few points more...). A modified BistParser with a modified underlying pyCNN neural network library.
Trained individually on all languages; we then chose the best configuration for each language.
Crawled data own vectors made the difference! None. Cross for surprise, One fixed model for PUD. For the surprise languages we chose typologically close languages (cs for hsb, fi for sme, fa for kmr, and hi (!) for bxr). We tried the same approach for kk and ug (using tr), but surprisingly for kk and ug this did not work as expected.
ParisNLP (software1) Baseline UDPipe, Improved UDPipe, Own. We have 4 configurations depending on the language: (i) full UDPipe, (ii) UDPipe tokenization/segmentation + our own custom tagging and morphology (using only a subset of the morphological features), (iii) fully custom (tokenization, sentence segmentation, tagging + morphology), (iv) same as (iii) but using UDPipe's sentence segmentation instead of ours. Our tokenizer/segmenter is a data-driven word model. Word segmentation (e.g. building "2-3 gonna / 2 going / 3 to" from the token "gonna") was extracted from the UDPipe baseline after we synchronized it with the output of our tokenization. Baseline UDPipe, Own, Apertium/Giellatekno. All lemmas were extracted from UDPipe. Whenever possible, our taggers were enriched with lexical features coming from lexicons extracted from raw and bilingual corpora and from Apertium's and Giellatekno's morphological analysers. In a few cases, the extracted lexicons were extended using the word embeddings provided. Among the corpora for which we used our own tagger, most (except for Japanese, 2 small languages and all 4 surprise languages) took advantage of such lexical features. Our tagger is an extension of the MElt tagger (Denis and Sagot 2012) where the learning component is now done with Vowpal Wabbit in OAA mode. Single parser: DyalogSR (de la Clergerie 2013), extended from a transition-based model with an MST model (so 2 classical models), both of which were also "neuralized" with Dynet (classic, bi-LSTM, char models, etc.); the char model was not used for all languages. All of our models are feature-rich (including the neural ones). Models with different beams and other hyperparameters were tested on the dev set and we used the best as the final model.
Note: as you may know, our official run used the all-generic delexicalized models (see multilinguality below) instead of the best models, because of our working assumption that the dev and trial metadata would be the same as the test metadata. We were in fact invited not to use the missing information 3 days before the initial deadline, but that information was lost on us. Should there be a second edition, please tell the organizers that the trial metadata should be the same as the test metadata; there is no way to know otherwise. It may look funny, but it is not when one has invested a lot of effort for nothing.
Baseline vectors, Crawled data own vectors. We also extracted pseudo Brown clusters from the pre-computed embeddings (see the sketch after this response). Because it was impossible to process all the provided data, we could not retrain the embeddings on our own tokenization, so where we used our own tokenization the embeddings were very likely suboptimal.
The embeddings we re-computed ourselves only cover several languages, and were only used to extend a few of the external lexicons used by our tagger.
For POS tagging, see above.
For parsing: none
We did not even use the UD_French treebank: since no one asked for it, we decided it wouldn't be fair to use it to artificially improve our French results by augmenting the training data (it would have worked well for the Sequoia treebank; we'll test that if we have time before the deadline).
Cross for surprise, Cross-lingual for small treebanks, One fixed model for PUD. We relied on language typology and trained delexicalized models using typologically related languages (South Slavic corpora for Upper Sorbian, as the result of a typo: it should have been West Slavic corpora; and Finnish+Estonian for North Saami) or, as a default, all 46 corpora.
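ParisNLP mention extracting "pseudo Brown clusters" from the pre-computed embeddings. One simple way to obtain Brown-cluster-like discrete features from embeddings, sketched below under the assumption of a word2vec-style text file and k-means clustering (the actual procedure is not described), is:

```python
# Turn word embeddings into discrete, Brown-cluster-like features by
# k-means clustering; the file name and number of clusters are illustrative.
import numpy as np
from sklearn.cluster import KMeans

words, vecs = [], []
with open("cs.vectors.txt", encoding="utf-8") as f:  # word2vec text format
    next(f)                                          # skip the header line
    for line in f:
        parts = line.rstrip().split(" ")
        words.append(parts[0])
        vecs.append([float(x) for x in parts[1:]])

kmeans = KMeans(n_clusters=256, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(np.array(vecs))
pseudo_brown = dict(zip(words, cluster_ids))         # word -> cluster feature
```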
RACAI - software1 Own
Own
Our system uses the RBG parser. Crawled data own vectors
None. Cross for surprise
Stanford software1-P Baseline UDPipe
Own. Our system ignores lemmas and features, predicting and utilizing only UPOS and XPOS tags. For the surprise languages, it uses UDPipe's predicted UPOS tags. Single parser for regular languages, an ensemble of delexicalized parsers for the surprise languages. Baseline vectors, Facebook vectors. Gothic had no pre-trained CoNLL vectors, so we used Facebook vectors for that language only. None. Cross for surprise, One fixed model for PUD
Team: IIT Kharagpur NLP Research Group. Primary software id: software3 Baseline UDPipe
Baseline UDPipe
We used the Parsito parser implemented in the UDPipe pipeline. Baseline vectors
None. Cross for surprise, Union of models for PUD, One fixed model for PUD. If the additional parallel test set is in a language for which there are multiple training treebanks, we choose between a combined model of all the treebanks and a model trained on a single treebank, based on the performance of the models on the development sets of the available treebanks.

For the surprise languages we used delexicalized parser models trained on the combination of source treebanks that gave the best performance on the sample data. We also applied syntactic transformations to some of the source-language treebanks, based on WALS features of the surprise languages, to improve performance.
TRL software1-P Own. Used in-house systems for some languages and applied general ones for others. Own. Used in-house systems for some languages and naive mappers for others. Simple delexicalized statistics (just PoS and distance) + deterministic rules. No vectors
None, but used some in-house tokenizers. One fixed model for PUD
TurkuNLP Software1 Baseline UDPipe
Own. We predicted UPOS and features using UDPipe in the standard way; in the XPOS field we concatenated UPOS+features (overwriting the previous XPOS) and predicted this with UDPipe as well. Single parser, UDPipe with pre-trained embeddings. Crawled data own vectors. We pool crawled data + treebank data for each language, and the syntactic analyses are used to create word embeddings. None. One fixed model for PUD. Surprise languages: I just pick one existing parsing model for each surprise language. No cross-lingual or cross-treebank training, except that the word embeddings were created from data which is sometimes parsed with a parser trained on a different treebank
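TurkuNLP's XPOS trick is easy to reproduce; a minimal sketch that overwrites the XPOS column of a CoNLL-U file with the concatenation of UPOS and FEATS (file names are illustrative, not TurkuNLP's pipeline):

```python
# Rewrite XPOS as UPOS|FEATS so a standard XPOS tagger predicts both at once.
def upos_feats_to_xpos(src_path, dst_path):
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.rstrip("\n")
            cols = line.split("\t")
            if len(cols) == 10 and cols[0].isdigit():  # ordinary word lines only
                cols[4] = cols[3] + "|" + cols[5]      # XPOS <- UPOS|FEATS
                line = "\t".join(cols)
            dst.write(line + "\n")

upos_feats_to_xpos("fi-ud-train.conllu", "fi-ud-train.xpos.conllu")
```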
UALing + Software1 Baseline UDPipe
Baseline UDPipe
We focus on corpus compression by measuring similarity between the dev and training data. If there is no dev data, we use the entire training data. For the surprise languages, we measure similarity between the sample and the entire training set (64 languages). Overall, we use only 76% of the original training data. Parsing, including all preprocessing, is entirely dependent on UDPipe. No vectors
None. Cross for surprise
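UALing do not specify the similarity measure. A minimal sketch of one plausible instantiation: score each training sentence by its word overlap with the dev vocabulary and keep the highest-scoring fraction (the 76% figure above is their global total across languages, not a per-treebank threshold; the fraction here is illustrative).

```python
# Keep the training sentences most similar to the dev data, by dev-vocabulary
# overlap. Tokenization and threshold are illustrative assumptions.
def select_training(train_sents, dev_sents, keep_fraction=0.76):
    dev_vocab = {w for s in dev_sents for w in s.split()}
    def score(sent):
        toks = sent.split()
        return sum(w in dev_vocab for w in toks) / max(len(toks), 1)
    ranked = sorted(train_sents, key=score, reverse=True)
    return ranked[:int(len(ranked) * keep_fraction)]

train = ["the dog barks", "quantum flux capacitor", "a dog sleeps"]
dev = ["the cat sleeps", "a dog runs"]
print(select_training(train, dev, keep_fraction=0.67))
# -> ['a dog sleeps', 'the dog barks']
```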
ÚFAL – UDPipe 1.2, software1 Own. I use UDPipe 1.2, which has a higher GRU dimension; plus I use additional segmentation tricks for `sl_sst`, `got` and `la_proiel`. Own. The morphology is basically the baseline one, just trained on the whole training data and using the development data for hyperparameter search. Single-model UDPipe. UD vectors. I precompute word embeddings on the UD data itself, so no additional data is needed for the word embeddings. None. Cross for surprise, One fixed model for PUD. If there are multiple treebanks for one language, I try to extend the training set for a treebank by using a limited amount of training data from the other treebanks of the same language.
UParse + 1 Baseline UDPipe
Baseline UDPipe
We use a combination of monolingual and multilingual parsers, and also the UDPipe parser for treebanks where our system does not outperform the baseline on the development set. Facebook vectors, I computed my own vectors on other permitted data (please indicate in the comment). We also use OPUS parallel data to extend the coverage of the word embeddings. None. Cross for surprise, One fixed model for PUD
Uppsala Software 1 Own. We model joint sentence and word segmentation as a sequence labelling problem. No morphology
We use a single parser Crawled data own vectors
None. Cross for surprise, Cross-lingual for small treebanks, One fixed model for PUD
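Uppsala's joint formulation can be made concrete with a per-character label set; the scheme below (inside-token / token-final / sentence-final) is an illustration of the general idea, not necessarily their label inventory.

```python
# Joint word and sentence segmentation as per-character sequence labelling:
#   I = inside a token, T = token ends here, S = sentence (and token) ends here.
# A tagger predicts one label per character; decoding back into tokens and
# sentences is a single left-to-right pass.
def decode(chars, labels):
    sents, sent, tok = [], [], ""
    for ch, lab in zip(chars, labels):
        if not ch.isspace():
            tok += ch
        if lab in ("T", "S") and tok:
            sent.append(tok)
            tok = ""
        if lab == "S":
            sents.append(sent)
            sent = []
    return sents

text = "Hi there. Bye."
#          H   i  ' '  t   h   e   r   e   .  ' '  B   y   e   .
labels = ["I","T","I","I","I","I","I","T","S","I","I","I","T","S"]
print(decode(text, labels))   # [['Hi', 'there', '.'], ['Bye', '.']]
```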
UT Baseline UDPipe
Own, Apertium/Giellatekno. Morphology was exported from Apertium as a UDPipe-compatible 5-column TSV file. Ensemble of UDPipe and BiST. Facebook vectors
None. Cross for surprise. We looked for the model that got the best results on the sample and then added the Apertium morphology.
Wanghao-ftd-SJTU Software 2 Own
Baseline UDPipe
One treebank, one parser. No vectors
None. Delexicalized & gold PoS for the surprise languages