
This page pertains to UD version 2.

UD for Icelandic

Tokenization and Word Segmentation

Morphology

The PoS tags follow the universal tag set; no language-specific PoS tags are added. The morphological features follow the Icelandic tagging scheme described in (Helgadóttir et al., 2012). PoS tags and morphological features were converted automatically to the UD scheme; see (Jónsdóttir, 2020) for details.

Sigrún Helgadóttir, Ásta Svavarsdóttir, Eiríkur Rögnvaldsson, Kristín Bjarnadóttir and Hrafn Loftsson. 2012. The Tagged Icelandic Corpus (MÍM). Proceedings of the Workshop on Language Technology for Normalisation of Less-Resourced Languages, SaLTMiL 8 – AfLaT2012, pp. 67-72. Istanbul, Turkey. Available online at malheildir.arnastofnun.is/mim

Hildur Jónsdóttir. 2020. A Parallel Icelandic Dependency Treebank: Creation, Annotation and Evaluation. MA thesis, University of Iceland. https://skemman.is/handle/1946/34784.

Tags

Features

Syntax

Subjects have the following characteristics:

Objects have the following characteristics:

The following subtypes are used:

acl:relcl for relative clauses
compound:prt for verb particles
flat:name for exocentric complex names
flat:foreign for foreign names
nmod:poss for possessive/genitive modifiers
obl:arg for oblique arguments that are not adjuncts
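As an illustration, the subtypes listed above can be checked against the DEPREL values in a CoNLL-U file. The following is a minimal sketch and not part of any official UD tooling; the names ICELANDIC_SUBTYPES, split_deprel and is_documented are hypothetical and chosen only for this example.

```python
# Relation subtypes listed on this page for Icelandic UD.
ICELANDIC_SUBTYPES = {
    "acl:relcl",
    "compound:prt",
    "flat:name",
    "flat:foreign",
    "nmod:poss",
    "obl:arg",
}


def split_deprel(deprel):
    """Split a DEPREL value into (universal relation, subtype or None)."""
    base, _, subtype = deprel.partition(":")
    return base, (subtype or None)


def is_documented(deprel):
    """Return True if deprel has no subtype or uses a subtype listed above.

    The universal relation itself is not validated here (assumption:
    that check is left to the standard UD validation tools).
    """
    return ":" not in deprel or deprel in ICELANDIC_SUBTYPES
```

For example, split_deprel("acl:relcl") separates the universal relation acl from the subtype relcl, while is_documented flags any subtyped relation that does not appear in the list above.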

Treebanks

There are two Icelandic UD treebanks in preparation: