
This page pertains to UD version 2.

Treebank Statistics: UD_Icelandic-Modern: POS Tags: NUM

There are 245 NUM lemmas (4%), 274 NUM types (3%) and 1048 NUM tokens (1%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 14 in number of tokens.
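Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: it assumes the standard 10-column CoNLL-U layout (FORM, LEMMA and UPOS in columns 2 to 4) and assumes that types are compared case-sensitively. The function name `count_upos` and the sample rows are hypothetical.

```python
def count_upos(conllu_text, upos="NUM"):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID.
        # This skips multiword token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)    # assumption: types are case-sensitive forms
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


# Hypothetical two-sentence sample, not taken from the treebank.
ROWS = [
    ["1", "tveir", "tveir", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "menn", "maður", "NOUN", "_", "_", "0", "root", "_", "_"],
    [],  # an empty row becomes the blank line between sentences
    ["1", "tvö", "tveir", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "börn", "barn", "NOUN", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

print(count_upos(SAMPLE))  # (1, 2, 2)
```

In the sample, the two forms "tveir" and "tvö" share one lemma, so the result is one lemma, two types and two tokens.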

The 10 most frequent NUM lemmas: tveir, þrír, hundrað, fjórir, 200, núll, fimm, fimmtíu, sextán, tíu

The 10 most frequent NUM types: 100, 2, 200, 0, 50, tveimur, tvö, 2012, 3, 16

The 10 most frequent ambiguous lemmas: hundrað (NUM 51, NOUN 1), þúsund (NUM 18, NOUN 2), átján (NUM 17, NOUN 1), átta (NUM 17, VERB 12), 2016 (NUM 13, X 2), einn (DET 170, NUM 11, ADV 1), 22 (NUM 3, X 1), 110 (NUM 2, NOUN 1), 23. (NUM 2, ADJ 1), fjórði (ADJ 10, NUM 2)

The 10 most frequent ambiguous types: 3 (NUM 18, X 1), 18 (NUM 16, NOUN 1), 2016 (NUM 13, X 2), 5 (NUM 10, ADJ 1), átta (NUM 9, VERB 7), 8 (NUM 8, X 1), níu (NUM 8, NOUN 1), 1 (DET 17, NUM 6), þúsund (NUM 6, NOUN 1), 22 (NUM 3, X 1)

Morphology

The form / lemma ratio of NUM is 1.118367 (the average of all parts of speech is 1.738114).
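The reported ratio is consistent with dividing the number of NUM types (274) by the number of NUM lemmas (245) given above. A short check, with hypothetical variable names:

```python
# Counts taken from the summary at the top of this page.
num_types = 274   # distinct NUM word forms
num_lemmas = 245  # distinct NUM lemmas

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.118367
```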

The 1st highest number of forms (9) was observed with the lemma “tveir”: 2, 2, tveggja, tveim, tveimur, tveir, tvo, tvær, tvö.

The 2nd highest number of forms (8) was observed with the lemma “þrír”: 3, 3, þremur, þriggja, þrjá, þrjár, þrjú, þrír.

The 3rd highest number of forms (6) was observed with the lemma “fjórir”: 4, fjóra, fjórir, fjórum, fjögur, fjögurra.

NUM occurs with 6 features: NumType (911; 87% instances), Case (353; 34% instances), Gender (224; 21% instances), Number (224; 21% instances), Definite (1; 0% instances), Degree (1; 0% instances)

NUM occurs with 13 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Ind, Degree=Pos, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, NumType=Ord, Number=Plur, Number=Sing

NUM occurs with 24 feature combinations. The most frequent feature combination is NumType=Card (685 tokens). Examples: 100, 2, 200, 50, 2012, 3, 2016, 2010, 4, 10
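The three feature counts above (features, feature-value pairs, feature combinations) can be derived from the FEATS column of CoNLL-U, where pairs are written as `Name=Value` and joined with `|`. The sketch below is illustrative only: the function name `feature_stats` and the input list are hypothetical, and whether the page counts the empty feature set (`_`) as a combination is an assumption made here.

```python
from collections import Counter


def feature_stats(feats_values):
    """Count features, feature-value pairs and whole FEATS combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1  # assumption: "_" counts as its own combination
        if feats == "_":
            continue  # no features on this token
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Hypothetical FEATS values for four NUM tokens.
FEATS = [
    "NumType=Card",
    "Case=Nom|Gender=Masc|NumType=Card|Number=Plur",
    "_",
    "NumType=Card",
]

features, pairs, combos = feature_stats(FEATS)
print(features["NumType"])     # 3
print(combos.most_common(1))   # [('NumType=Card', 2)]
```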

Relations

NUM nodes are attached to their parents using 16 different relations: nummod (578; 55% instances), obl (184; 18% instances), conj (94; 9% instances), appos (59; 6% instances), xcomp (36; 3% instances), nmod:poss (34; 3% instances), dep (13; 1% instances), advcl (10; 1% instances), acl:relcl (9; 1% instances), amod (9; 1% instances), obj (7; 1% instances), root (7; 1% instances), nsubj (4; 0% instances), compound (2; 0% instances), acl (1; 0% instances), iobj (1; 0% instances)

Parents of NUM nodes belong to 12 different parts of speech: NOUN (677; 65% instances), NUM (151; 14% instances), PROPN (92; 9% instances), VERB (80; 8% instances), ADV (10; 1% instances), ADJ (8; 1% instances), DET (7; 1% instances), [no tag; the artificial root] (7; 1% instances), AUX (5; 0% instances), ADP (4; 0% instances), PART (4; 0% instances), PRON (3; 0% instances)

677 (65%) NUM nodes are leaves.

216 (21%) NUM nodes have one child.

105 (10%) NUM nodes have two children.

50 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 14.
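The child-degree figures above (leaves, one child, two children, three or more) amount to counting, for each NUM node, how many other nodes in the same sentence name it in their HEAD column. A minimal sketch, assuming standard CoNLL-U with sentences separated by blank lines; the function name `child_degree_histogram` and the sample are hypothetical:

```python
from collections import Counter


def child_degree_histogram(conllu_text, upos="NUM"):
    """Map child count (0, 1, 2, or 3 meaning 'three or more') to node count."""
    histogram = Counter()
    for block in conllu_text.strip().split("\n\n"):
        upos_by_id = {}       # node ID -> UPOS tag
        children = Counter()  # head ID -> number of dependents
        for line in block.splitlines():
            if line.startswith("#"):
                continue
            cols = line.split("\t")
            if len(cols) != 10 or not cols[0].isdigit():
                continue  # skip multiword token ranges and empty nodes
            upos_by_id[cols[0]] = cols[3]
            children[cols[6]] += 1  # column 7 is HEAD
        for node_id, tag in upos_by_id.items():
            if tag == upos:
                histogram[min(children[node_id], 3)] += 1
    return histogram


# Hypothetical sample: "tveir og þrír menn" and a one-word sentence "fimm".
ROWS = [
    ["1", "tveir", "tveir", "NUM", "_", "_", "4", "nummod", "_", "_"],
    ["2", "og", "og", "CCONJ", "_", "_", "3", "cc", "_", "_"],
    ["3", "þrír", "þrír", "NUM", "_", "_", "1", "conj", "_", "_"],
    ["4", "menn", "maður", "NOUN", "_", "_", "0", "root", "_", "_"],
    [],  # blank line between sentences
    ["1", "fimm", "fimm", "NUM", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

print(dict(child_degree_histogram(SAMPLE)))  # {1: 2, 0: 1}
```

In the sample, "tveir" governs "þrír" and "þrír" governs "og", so both have one child, while "fimm" is a leaf.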

Children of NUM nodes are attached using 21 different relations: punct (198; 32% instances), conj (111; 18% instances), cc (62; 10% instances), nummod (62; 10% instances), obl (50; 8% instances), case (33; 5% instances), amod (27; 4% instances), advmod (20; 3% instances), mark (17; 3% instances), cop (11; 2% instances), det (6; 1% instances), nmod:poss (6; 1% instances), nsubj (5; 1% instances), advcl (3; 0% instances), acl:relcl (2; 0% instances), dep (2; 0% instances), acl (1; 0% instances), appos (1; 0% instances), compound (1; 0% instances), compound:prt (1; 0% instances), obj (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: PUNCT (198; 32% instances), NUM (151; 24% instances), NOUN (76; 12% instances), CCONJ (62; 10% instances), ADP (34; 5% instances), ADV (21; 3% instances), SCONJ (17; 3% instances), ADJ (15; 2% instances), DET (13; 2% instances), AUX (11; 2% instances), VERB (11; 2% instances), PRON (7; 1% instances), PROPN (3; 0% instances), X (1; 0% instances)