Treebank Statistics: UD_Finnish-OOD: POS Tags: NUM
There are 186 NUM lemmas (3%), 209 NUM types (3%) and 381 NUM tokens (2%).
Out of 15 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 10 in number of tokens.
The 10 most frequent NUM lemmas: yksi, 2, 40, 20, kaksi, kolme, 100, 5, 10, 60
The 10 most frequent NUM types: 2, 40, yksi, 20, 5, 10, 100, 60, 90, kaksi
The 10 most frequent ambiguous lemmas: yksi (NUM 18, PRON 1), pari (NUM 8, NOUN 1), puoli (NOUN 14, NUM 2), toinen (ADJ 12, PRON 12, NUM 1)
The 10 most frequent ambiguous types: yksi (NUM 10, PRON 1), 8 (NUM 3, PUNCT 1)
Morphology
The form / lemma ratio of NUM is 1.123656 (the average of all parts of speech is 1.566190).
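The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page. The following is a minimal sketch of that arithmetic, assuming this is how the figure is derived; the variable names are illustrative only.

```python
# Counts taken from the summary line of this page.
num_lemmas = 186  # distinct NUM lemmas
num_types = 209   # distinct NUM word forms (types)

# Assumption: form / lemma ratio = types / lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # prints 1.123656
```

A ratio close to 1 means that most NUM lemmas occur in a single form, which fits the many digit-only numerals among the most frequent types.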
The 1st highest number of forms (5) was observed with the lemma “sata”: sadan, sadasta, sata, satoja, satojen.
The 2nd highest number of forms (5) was observed with the lemma “yksi”: yhdellä, yhden, yhdestä, yhtenä, yksi.
The 3rd highest number of forms (4) was observed with the lemma “kolme”: kolme, kolmeen, kolmen, kolmessa.
NUM occurs with 4 features: NumType (339; 89% instances), Case (83; 22% instances), Number (82; 22% instances), Typo (1; 0% instances)
NUM occurs with 13 feature-value pairs: Case=Abl, Case=Ade, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, NumType=Card, Number=Plur, Number=Sing, Typo=Yes
NUM occurs with 15 feature combinations.
The most frequent feature combination is NumType=Card (259 tokens).
Examples: 2, 40, 20, 5, 10, 100, 60, 90, 2014, yksi
Relations
NUM nodes are attached to their parents using 17 different relations: nummod (216; 57% instances), obl (53; 14% instances), root (27; 7% instances), flat (15; 4% instances), nmod (15; 4% instances), parataxis (13; 3% instances), nmod:poss (10; 3% instances), conj (8; 2% instances), obj (4; 1% instances), orphan (4; 1% instances), advcl (3; 1% instances), appos (3; 1% instances), flat:foreign (3; 1% instances), flat:name (2; 1% instances), nsubj (2; 1% instances), nsubj:cop (2; 1% instances), ccomp (1; 0% instances)
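The percentages above can be reproduced from the raw counts. The sketch below assumes each share is the count divided by the total number of NUM tokens and rounded to the nearest whole percent; the dictionary simply restates the counts listed on this page.

```python
# Relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 216, "obl": 53, "root": 27, "flat": 15, "nmod": 15,
    "parataxis": 13, "nmod:poss": 10, "conj": 8, "obj": 4, "orphan": 4,
    "advcl": 3, "appos": 3, "flat:foreign": 3, "flat:name": 2,
    "nsubj": 2, "nsubj:cop": 2, "ccomp": 1,
}

# Every NUM token has exactly one incoming relation, so the counts
# should add up to the number of NUM tokens.
total = sum(relation_counts.values())

# Assumption: shares are rounded to whole percents.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, n in relation_counts.items():
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```

The counts sum to 381, the number of NUM tokens, and a relation seen only once (ccomp) rounds down to 0%.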
Parents of NUM nodes belong to 11 different parts of speech: NOUN (223; 59% instances), VERB (57; 15% instances), no parent, i.e. root (27; 7% instances), SYM (26; 7% instances), PROPN (16; 4% instances), ADJ (11; 3% instances), ADV (6; 2% instances), NUM (6; 2% instances), X (6; 2% instances), PRON (2; 1% instances), AUX (1; 0% instances)
270 (71%) NUM nodes are leaves.
53 (14%) NUM nodes have one child.
32 (8%) NUM nodes have two children.
26 (7%) NUM nodes have three or more children.
The highest child degree of a NUM node is 7.
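As an illustration of how such a child-degree distribution could be computed, the sketch below counts the children of each NUM node from (ID, UPOS, HEAD) triples as found in a CoNLL-U file. The function name `child_degrees` and the toy sentence are hypothetical and are not taken from the treebank; this is not necessarily how the statistics on this page were generated.

```python
from collections import Counter

def child_degrees(sentence, upos="NUM"):
    """Return the number of children of every node tagged `upos`.

    `sentence` is a list of (id, upos, head) tuples; head 0 marks the root.
    """
    # Count how many tokens point to each head ID.
    children_per_head = Counter(head for _, _, head in sentence)
    return [children_per_head[tok_id] for tok_id, tag, _ in sentence if tag == upos]

# Hypothetical toy sentence: two NUM nodes, one leaf and one with a single child.
toy_sentence = [
    (1, "NUM", 2), (2, "NOUN", 3), (3, "VERB", 0),
    (4, "NUM", 3), (5, "ADV", 4), (6, "PUNCT", 3),
]
degrees = child_degrees(toy_sentence)
print(degrees)  # prints [0, 1]

# Sanity check on the figures reported above: the four buckets
# should cover all 381 NUM tokens.
reported = {"0": 270, "1": 53, "2": 32, "3+": 26}
print(sum(reported.values()))  # prints 381
```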
Children of NUM nodes are attached using 19 different relations: punct (56; 26% instances), advmod (43; 20% instances), nsubj:cop (35; 16% instances), case (26; 12% instances), nmod (8; 4% instances), conj (7; 3% instances), discourse (7; 3% instances), cop (6; 3% instances), obl (6; 3% instances), advcl (4; 2% instances), appos (4; 2% instances), mark (4; 2% instances), parataxis (4; 2% instances), cc (3; 1% instances), orphan (2; 1% instances), aux (1; 0% instances), det (1; 0% instances), nmod:poss (1; 0% instances), nummod (1; 0% instances)
Children of NUM nodes belong to 14 different parts of speech: PUNCT (56; 26% instances), NOUN (47; 21% instances), ADV (39; 18% instances), ADP (25; 11% instances), SYM (10; 5% instances), AUX (7; 3% instances), ADJ (6; 3% instances), NUM (6; 3% instances), PRON (6; 3% instances), VERB (5; 2% instances), SCONJ (4; 2% instances), CCONJ (3; 1% instances), PROPN (3; 1% instances), X (2; 1% instances)