
This page pertains to UD version 2.

Treebank Statistics: UD_Finnish-OOD: POS Tags: NUM

There are 186 NUM lemmas (3%), 209 NUM types (3%) and 381 NUM tokens (2%). Out of 15 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 10 in number of tokens.
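As a rough illustration of how lemma, type and token counts like these can be derived, here is a minimal sketch that tallies them for one tag from CoNLL-U text. The sample sentences are hand-made and not taken from UD_Finnish-OOD, the helper name `count_upos` is hypothetical, and treating word forms case-sensitively is an assumption; the page does not say how its generator handles case.

```python
from collections import Counter

# Hand-made CoNLL-U fragment (hypothetical, not from UD_Finnish-OOD).
SAMPLE = (
    "1\tkaksi\tkaksi\tNUM\t_\tCase=Nom|NumType=Card|Number=Sing\t2\tnummod\t_\t_\n"
    "2\ttaloa\ttalo\tNOUN\t_\tCase=Par|Number=Sing\t0\troot\t_\t_\n"
    "\n"
    "1\tKahden\tkaksi\tNUM\t_\tCase=Gen|NumType=Card|Number=Sing\t2\tnummod\t_\t_\n"
    "2\ttalon\ttalo\tNOUN\t_\tCase=Gen|Number=Sing\t0\troot\t_\t_\n"
)


def count_upos(conllu_text, upos):
    """Return (lemma counts, form counts) for words tagged with `upos`."""
    lemmas, types = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            types[cols[1]] += 1   # FORM column (case-sensitive: an assumption)
            lemmas[cols[2]] += 1  # LEMMA column
    return lemmas, types


lemmas, types = count_upos(SAMPLE, "NUM")
n_lemmas = len(lemmas)
n_types = len(types)
n_tokens = sum(types.values())
print(n_lemmas, n_types, n_tokens)
```

On the sample, both numerals share the lemma "kaksi" but differ in form, so there are fewer lemmas than types.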

The 10 most frequent NUM lemmas: yksi, 2, 40, 20, kaksi, kolme, 100, 5, 10, 60

The 10 most frequent NUM types: 2, 40, yksi, 20, 5, 10, 100, 60, 90, kaksi

The most frequent ambiguous lemmas: yksi (NUM 18, PRON 1), pari (NUM 8, NOUN 1), puoli (NOUN 14, NUM 2), toinen (ADJ 12, PRON 12, NUM 1)

The most frequent ambiguous types: yksi (NUM 10, PRON 1), 8 (NUM 3, PUNCT 1)

Morphology

The form / lemma ratio of NUM is 1.123656 (the average of all parts of speech is 1.566190).
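The reported ratio matches the NUM type count divided by the NUM lemma count given earlier on this page. The short check below reproduces it; reading the ratio as types over lemmas is an inference from the numbers, not a documented definition.

```python
# Counts quoted on this page for NUM.
num_types = 209
num_lemmas = 186

# Assumed definition: distinct word forms per distinct lemma.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")
```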

The highest number of forms (5) was observed with the lemma “sata”: sadan, sadasta, sata, satoja, satojen.

The second highest number of forms (also 5) was observed with the lemma “yksi”: yhdellä, yhden, yhdestä, yhtenä, yksi.

The third highest number of forms (4) was observed with the lemma “kolme”: kolme, kolmeen, kolmen, kolmessa.

NUM occurs with 4 features: NumType (339; 89% instances), Case (83; 22% instances), Number (82; 22% instances), Typo (1; 0% instances)

NUM occurs with 13 feature-value pairs: Case=Abl, Case=Ade, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, NumType=Card, Number=Plur, Number=Sing, Typo=Yes

NUM occurs with 15 feature combinations. The most frequent feature combination is NumType=Card (259 tokens). Examples: 2, 40, 20, 5, 10, 100, 60, 90, 2014, yksi
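The three feature statistics above (features, feature-value pairs, feature combinations) can all be tallied from the FEATS column. The sketch below shows one way to do so on hand-made values; the function name `summarize_feats` is hypothetical, and counting the empty value `_` as a combination of its own is an assumption about how the page's generator behaves.

```python
from collections import Counter

# Hand-made FEATS values for five hypothetical NUM tokens.
FEATS = [
    "NumType=Card",
    "Case=Nom|NumType=Card|Number=Sing",
    "Case=Gen|NumType=Card|Number=Sing",
    "NumType=Card",
    "_",
]


def summarize_feats(feats_column):
    """Count feature names, feature=value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the full FEATS string is one combination
        if feats == "_":
            continue        # no features on this token
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


features, pairs, combos = summarize_feats(FEATS)
print(len(features), len(pairs), len(combos))
print(combos.most_common(1))
```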

Relations

NUM nodes are attached to their parents using 17 different relations: nummod (216; 57% instances), obl (53; 14% instances), root (27; 7% instances), flat (15; 4% instances), nmod (15; 4% instances), parataxis (13; 3% instances), nmod:poss (10; 3% instances), conj (8; 2% instances), obj (4; 1% instances), orphan (4; 1% instances), advcl (3; 1% instances), appos (3; 1% instances), flat:foreign (3; 1% instances), flat:name (2; 1% instances), nsubj (2; 1% instances), nsubj:cop (2; 1% instances), ccomp (1; 0% instances)

Parents of NUM nodes belong to 11 different parts of speech (counting the artificial root node, which has no tag): NOUN (223; 59% instances), VERB (57; 15% instances), root (27; 7% instances), SYM (26; 7% instances), PROPN (16; 4% instances), ADJ (11; 3% instances), ADV (6; 2% instances), NUM (6; 2% instances), X (6; 2% instances), PRON (2; 1% instances), AUX (1; 0% instances)

270 (71%) NUM nodes are leaves.

53 (14%) NUM nodes have one child.

32 (8%) NUM nodes have two children.

26 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
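The child-degree figures above can be illustrated with a small sketch: it counts children per node from HEAD values in a hand-made sentence, then recomputes the rounded percentages from the counts quoted on this page. The sentence and the helper name `child_degrees` are hypothetical.

```python
from collections import Counter

# (ID, UPOS, HEAD) triples for one hypothetical sentence; HEAD 0 is the root.
SENTENCE = [
    (1, "ADV", 2),
    (2, "NUM", 3),
    (3, "NOUN", 0),
    (4, "NUM", 3),
    (5, "PUNCT", 3),
]


def child_degrees(sentence, upos):
    """Return the number of children of each node tagged with `upos`."""
    n_children = Counter(head for _, _, head in sentence)
    return [n_children[node_id] for node_id, tag, _ in sentence if tag == upos]


degrees = child_degrees(SENTENCE, "NUM")
print(degrees)  # one NUM node has one child, the other is a leaf

# Counts quoted on this page, keyed by number of children.
TOTAL_NUM_TOKENS = 381
reported = {"0": 270, "1": 53, "2": 32, "3+": 26}
shares = {k: round(100 * v / TOTAL_NUM_TOKENS) for k, v in reported.items()}
print(shares)
```

The four reported counts add up to the 381 NUM tokens, so every NUM node falls into exactly one bucket.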

Children of NUM nodes are attached using 19 different relations: punct (56; 26% instances), advmod (43; 20% instances), nsubj:cop (35; 16% instances), case (26; 12% instances), nmod (8; 4% instances), conj (7; 3% instances), discourse (7; 3% instances), cop (6; 3% instances), obl (6; 3% instances), advcl (4; 2% instances), appos (4; 2% instances), mark (4; 2% instances), parataxis (4; 2% instances), cc (3; 1% instances), orphan (2; 1% instances), aux (1; 0% instances), det (1; 0% instances), nmod:poss (1; 0% instances), nummod (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: PUNCT (56; 26% instances), NOUN (47; 21% instances), ADV (39; 18% instances), ADP (25; 11% instances), SYM (10; 5% instances), AUX (7; 3% instances), ADJ (6; 3% instances), NUM (6; 3% instances), PRON (6; 3% instances), VERB (5; 2% instances), SCONJ (4; 2% instances), CCONJ (3; 1% instances), PROPN (3; 1% instances), X (2; 1% instances)