LAS Ranking

    1. HIT-SCIR (Harbin)             75.84 ± 0.14 [OK]  (p<0.001)
    2. TurkuNLP (Turku)              73.28 ± 0.14 [OK]  (p=0.039)
  3-5. UDPipe Future (Praha)         73.11 ± 0.13 [OK]  (p=0.221)
  3-5. LATTICE (Paris)               73.02 ± 0.14 [OK]  (p=0.461)
  3-5. ICS PAS (Warszawa)            73.02 ± 0.14 [OK]  (p<0.001)
    6. CEA LIST (Paris)              72.56 ± 0.14 [OK]  (p=0.036)
  7-8. Uppsala (Uppsala)             72.37 ± 0.15 [OK]  (p=0.191)
  7-8. Stanford (Stanford)           72.29 ± 0.14 [OK]  (p<0.001)
 9-10. AntNLP (Shanghai)             70.90 ± 0.15 [OK]  (p=0.242)
 9-10. NLP-Cube (București)          70.82 ± 0.14 [OK]  (p=0.032)
   11. ParisNLP (Paris)              70.64 ± 0.14 [OK]  (p<0.001)
   12. SLT-Interactions (Bengaluru)  69.98 ± 0.14 [OK]  (p<0.001)
   13. IBM NY (Yorktown Heights)     69.11 ± 0.16 [OK]  (p<0.001)
   14. UniMelb (Melbourne)           68.66 ± 0.15 [OK]  (p=0.002)
   15. LeisureX (Shanghai)           68.31 ± 0.16 [OK]  (p<0.001)
   16. KParse (İstanbul)             66.58 ± 0.16 [OK]  (p=0.015)
   17. Fudan (Shanghai)              66.34 ± 0.15 [OK]  (p<0.001)
   18. BASELINE UDPipe 1.2 (Praha)   65.80 ± 0.15 [OK]  (p=0.048)
   19. Phoenix (Shanghai)            65.61 ± 0.16 [OK]  (p<0.001)
   20. CUNI x-ling (Praha)           64.87 ± 0.16 [OK]  (p<0.001)
   21. BOUN (İstanbul)               63.54 ± 0.15 [OK]  (p<0.001)
   22. ONLP lab (Ra'anana)           58.35 ± 0.15 [81]  (p<0.001)
   23. iParse (Pittsburgh)           55.83 ± 0.11 [65]  (p<0.001)
   24. HUJI (Yerushalayim)           53.69 ± 0.15 [80]  (p<0.001)
   25. ArmParser (Yerevan)           47.02 ± 0.11 [66]  (p<0.001)
   26. SParse (İstanbul)              1.95 ± 0.00 [2]

MLAS Ranking

    1. UDPipe Future (Praha)         61.25 ± 0.13  (p=0.007)
  2-3. TurkuNLP (Turku)              60.99 ± 0.14  (p=0.254)
  2-3. Stanford (Stanford)           60.92 ± 0.13  (p<0.001)
    4. ICS PAS (Warszawa)            60.25 ± 0.13  (p<0.001)
    5. CEA LIST (Paris)              59.92 ± 0.14  (p<0.001)
    6. HIT-SCIR (Harbin)             59.78 ± 0.14  (p<0.001)
    7. Uppsala (Uppsala)             59.20 ± 0.15  (p<0.001)
    8. NLP-Cube (București)          57.32 ± 0.14  (p<0.001)
    9. LATTICE (Paris)               57.01 ± 0.14  (p<0.001)
   10. AntNLP (Shanghai)             55.92 ± 0.13  (p=0.034)
   11. ParisNLP (Paris)              55.74 ± 0.14  (p<0.001)
   12. SLT-Interactions (Bengaluru)  54.52 ± 0.13  (p<0.001)
13-14. LeisureX (Shanghai)           53.70 ± 0.14  (p=0.239)
13-14. UniMelb (Melbourne)           53.62 ± 0.14  (p<0.001)
   15. KParse (İstanbul)             53.25 ± 0.15  (p<0.001)
   16. Fudan (Shanghai)              52.69 ± 0.15  (p=0.005)
17-18. BASELINE UDPipe 1.2 (Praha)   52.42 ± 0.14  (p=0.066)
17-18. Phoenix (Shanghai)            52.26 ± 0.15  (p<0.001)
19-20. BOUN (İstanbul)               50.40 ± 0.15  (p=0.494)
19-20. CUNI x-ling (Praha)           50.35 ± 0.15  (p<0.001)
   21. ONLP lab (Ra'anana)           46.09 ± 0.15  (p<0.001)
   22. iParse (Pittsburgh)           45.65 ± 0.12  (p<0.001)
   23. HUJI (Yerushalayim)           44.60 ± 0.14  (p<0.001)
   24. IBM NY (Yorktown Heights)     40.61 ± 0.13  (p<0.001)
   25. ArmParser (Yerevan)           36.28 ± 0.12  (p<0.001)
   26. SParse (İstanbul)              1.68 ± 0.00

BLEX Ranking

    1. TurkuNLP (Turku)              66.09 ± 0.13  (p<0.001)
    2. HIT-SCIR (Harbin)             65.33 ± 0.13  (p<0.001)
  3-4. UDPipe Future (Praha)         64.49 ± 0.14  (p=0.301)
  3-4. ICS PAS (Warszawa)            64.44 ± 0.14  (p<0.001)
    5. Stanford (Stanford)           64.04 ± 0.13  (p<0.001)
  6-7. LATTICE (Paris)               62.39 ± 0.14  (p=0.071)
  6-7. CEA LIST (Paris)              62.23 ± 0.15  (p<0.001)
    8. AntNLP (Shanghai)             60.91 ± 0.14  (p=0.017)
    9. ParisNLP (Paris)              60.70 ± 0.14  (p<0.001)
   10. SLT-Interactions (Bengaluru)  59.68 ± 0.14  (p<0.001)
   11. UniMelb (Melbourne)           58.67 ± 0.14  (p=0.009)
   12. LeisureX (Shanghai)           58.42 ± 0.14  (p<0.001)
13-14. BASELINE UDPipe 1.2 (Praha)   55.80 ± 0.15  (p=0.218)
13-14. Phoenix (Shanghai)            55.71 ± 0.15  (p=0.044)
   15. NLP-Cube (București)          55.52 ± 0.14  (p=0.007)
   16. KParse (İstanbul)             55.26 ± 0.15  (p<0.001)
17-18. CUNI x-ling (Praha)           54.07 ± 0.15  (p=0.360)
17-18. Fudan (Shanghai)              54.03 ± 0.15  (p<0.001)
   19. BOUN (İstanbul)               53.45 ± 0.15  (p<0.001)
   20. iParse (Pittsburgh)           48.71 ± 0.11  (p<0.001)
   21. HUJI (Yerushalayim)           48.05 ± 0.15  (p<0.001)
   22. ArmParser (Yerevan)           39.18 ± 0.12  (p<0.001)
   23. IBM NY (Yorktown Heights)     32.55 ± 0.13  (p<0.001)
   24. Uppsala (Uppsala)             32.09 ± 0.13  (p<0.001)
   25. ONLP lab (Ra'anana)           28.29 ± 0.12  (p<0.001)
   26. SParse (İstanbul)              1.71 ± 0.00

Other rankings

All scores were computed by the official evaluation script. The 95% confidence intervals and p-values were computed by Udapi using gold re-segmentation and bootstrap resampling. Each p-value comes from a paired bootstrap test comparing the system on that line with the system on the following line, so the last system in each ranking has no p-value. System pairs with p<0.05 are considered significantly different; other pairs are assigned the same range of ranks.
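
The procedure described above can be illustrated with a minimal Python sketch. It is not the Udapi implementation: the function names, the per-sentence count inputs, the number of resamples, and the convention of counting ties against the higher-ranked system are all assumptions made for illustration. The first function estimates a paired bootstrap p-value for two systems scored on the same sentences; the second turns a column of p-values into rank ranges using the p<0.05 rule.

```python
import random


def paired_bootstrap_p(correct_a, correct_b, totals, n_resamples=1000, seed=0):
    """Fraction of resampled test sets on which system B scores at least
    as high as system A (hypothetical helper, not Udapi's API).

    correct_a[i], correct_b[i]: correct items of each system in sentence i.
    totals[i]: number of scored items in sentence i (shared by both systems,
    which is what a common segmentation makes possible).
    """
    rng = random.Random(seed)
    n = len(totals)
    b_not_worse = 0
    for _ in range(n_resamples):
        # Draw n sentences with replacement; both systems are scored on the
        # same sample, which is what makes the test paired.
        sample = [rng.randrange(n) for _ in range(n)]
        total = sum(totals[i] for i in sample)
        score_a = sum(correct_a[i] for i in sample) / total
        score_b = sum(correct_b[i] for i in sample) / total
        if score_b >= score_a:
            b_not_worse += 1
    return b_not_worse / n_resamples


def rank_ranges(p_values, alpha=0.05):
    """Assign rank labels to systems sorted by score.

    p_values[i] compares system i with system i+1, so a ranking of
    n systems has n-1 p-values. Neighbours with p >= alpha share a range.
    """
    n = len(p_values) + 1
    labels = []
    start = 0  # index of the first system in the current group
    for i in range(n):
        closes_group = i == n - 1 or p_values[i] < alpha
        if closes_group:
            lo, hi = start + 1, i + 1
            label = str(lo) if lo == hi else f"{lo}-{hi}"
            labels.extend([label] * (hi - lo + 1))
            start = i + 1
    return labels


# First five p-values of the LAS ranking; 0.0005 is a stand-in for "p<0.001".
las_top = rank_ranges([0.0005, 0.039, 0.221, 0.461, 0.0005])
```

With the first five p-values of the LAS ranking, `rank_ranges` gives the labels 1, 2, 3-5, 3-5, 3-5, 6, matching the first six lines of the table. Note that in this sketch a range is a chain of adjacent non-significant pairs, so the first and last system in a range are not necessarily compared with each other directly.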

Outputs of system runs are available from http://hdl.handle.net/11234/1-2885.