
This page pertains to UD version 2.

Treebank Statistics: UD_North_Sami-Giella: POS Tags: NUM

There are 111 NUM lemmas (2%), 142 NUM types (2%) and 348 NUM tokens (1%). Out of 14 observed tags, the rank of NUM is: 6 in number of lemmas, 8 in number of types and 12 in number of tokens.
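Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that a "type" is a distinct word form and a "lemma" is a distinct LEMMA value among tokens tagged with the given UPOS. The function name `pos_counts` and the toy rows in `SAMPLE` are hypothetical and are not taken from the treebank; real CoNLL-U lines have ten columns, of which only the first four are used here.

```python
# Toy rows with only the first four CoNLL-U columns (ID, FORM, LEMMA, UPOS).
# Illustrative data only, not sentences from UD_North_Sami-Giella.
SAMPLE = (
    "1\tguokte\tguokte\tNUM\n"
    "2\tguovtti\tguokte\tNUM\n"
    "3\tokta\tokta\tNUM\n"
    "4\tguokte\tguokte\tNUM\n"
    "5\tbeaivi\tbeaivi\tNOUN\n"
)

def pos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens

print(pos_counts(SAMPLE, "NUM"))  # (2, 3, 4)
```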

The 10 most frequent NUM lemmas: guokte, okta, golbma, máŋga, vihtta, moadde, njeallje, galle, guhtta, 12

The 10 most frequent NUM types: guokte, golbma, ovtta, okta, moadde, máŋga, golmma, vihtta, guovtti, máŋgga

The most frequent ambiguous lemmas: vihtta (NUM 19, NOUN 1), 10 (ADJ 2, NUM 2), 9 (ADJ 1, NUM 1)

The most frequent ambiguous types: oktan (ADV 12, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.279279 (the average of all parts of speech is 1.749612).
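The ratio for NUM matches the number of NUM types divided by the number of NUM lemmas reported above (142 and 111). A short check, using only those two figures:

```python
num_types = 142   # distinct NUM word forms reported on this page
num_lemmas = 111  # distinct NUM lemmas reported on this page

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.279279
```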

The highest number of forms (7) was observed with the lemma “guokte”: guokte, guoktin, guovtte, guovtti, guovttit, guvttiid, guvttiin.

The 2nd highest number of forms (5) was observed with the lemma “golbma”: golbma, golmma, golmmaide, golmmaiguin, golmmain.

The 3rd highest number of forms (5) was observed with the lemma “okta”: okta, oktan, ovtta, ovttaid, ovttain.

NUM occurs with 3 features: NumType (348; 100% instances), Number (346; 99% instances), Case (344; 99% instances)

NUM occurs with 10 feature-value pairs: Case=Acc, Case=Com, Case=Ess, Case=Gen, Case=Ill, Case=Loc, Case=Nom, NumType=Card, Number=Plur, Number=Sing

NUM occurs with 13 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing|NumType=Card (113 tokens). Examples: okta, guokte, golbma, máŋga, njeallje, vihtta, moadde, 1971, 2005, 50
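A feature combination such as the one above is the FEATS column of a CoNLL-U line: pairs of the form Name=Value joined by "|", or "_" when the token has no features. The sketch below splits such a string into a dictionary and tallies combinations; the helper name `parse_feats` and the short input list are hypothetical.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a {feature: value} dictionary."""
    if feats == "_":
        return {}  # "_" marks a token without morphological features
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Illustrative FEATS values only; the real counts come from the treebank.
feats_column = [
    "Case=Nom|Number=Sing|NumType=Card",
    "Case=Gen|Number=Sing|NumType=Card",
    "Case=Nom|Number=Sing|NumType=Card",
]

combinations = Counter(feats_column)
print(parse_feats(feats_column[0]))
print(combinations.most_common(1))
```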

Relations

NUM nodes are attached to their parents using 13 different relations: nummod (239; 69% instances), obl (48; 14% instances), amod (37; 11% instances), nsubj (6; 2% instances), root (4; 1% instances), conj (3; 1% instances), obj (3; 1% instances), nmod:poss (2; 1% instances), xcomp (2; 1% instances), advcl (1; 0% instances), flat (1; 0% instances), nmod (1; 0% instances), parataxis (1; 0% instances)
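The percentages in this list are consistent with each relation count divided by the 348 NUM tokens and rounded to the nearest whole percent, which is why a relation seen once shows as 0%. A small check under that assumption, using the counts listed above:

```python
# Relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 239, "obl": 48, "amod": 37, "nsubj": 6, "root": 4,
    "conj": 3, "obj": 3, "nmod:poss": 2, "xcomp": 2,
    "advcl": 1, "flat": 1, "nmod": 1, "parataxis": 1,
}

total = sum(relation_counts.values())
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                  # 348
print(percentages["nummod"])  # 69
print(percentages["advcl"])   # 0
```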

Parents of NUM nodes belong to 9 different parts of speech: NOUN (271; 78% instances), VERB (55; 16% instances), ADJ (5; 1% instances), PROPN (5; 1% instances), NUM (4; 1% instances), no parent, i.e. the NUM node is the root (4; 1% instances), ADV (2; 1% instances), ADP (1; 0% instances), PRON (1; 0% instances)

262 (75%) NUM nodes are leaves.

78 (22%) NUM nodes have one child.

3 (1%) NUM nodes have two children.

5 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 4.
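The child degree of a node is the number of tokens whose HEAD column points to it. The sketch below builds a histogram of child degrees for one tag in a single sentence; the function name `child_degree_histogram` and the four-token sentence are hypothetical, and each row keeps only the ID, HEAD and UPOS values.

```python
from collections import Counter

def child_degree_histogram(rows, upos):
    """rows: (id, head, upos) tuples for one sentence; returns {degree: node count}."""
    children_per_node = Counter(head for _, head, _ in rows)
    histogram = Counter()
    for node_id, _, tag in rows:
        if tag == upos:
            histogram[children_per_node[node_id]] += 1
    return histogram

# Toy sentence: token 2 (NUM) has one child, token 4 (NUM) is a leaf.
sentence = [(1, 2, "ADV"), (2, 3, "NUM"), (3, 0, "NOUN"), (4, 3, "NUM")]

hist = child_degree_histogram(sentence, "NUM")
print(dict(hist))  # {1: 1, 0: 1}
print(max(hist))   # 1
```

Summing such histograms over all sentences would give the leaf, one-child and two-children counts reported above; on this page they add up to the 348 NUM tokens (262 + 78 + 3 + 5).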

Children of NUM nodes are attached using 13 different relations: advmod (38; 38% instances), case (13; 13% instances), det (11; 11% instances), amod (10; 10% instances), punct (8; 8% instances), cop (4; 4% instances), nmod:poss (4; 4% instances), nsubj (4; 4% instances), cc (3; 3% instances), conj (3; 3% instances), acl:relcl (1; 1% instances), mark (1; 1% instances), obl (1; 1% instances)

Children of NUM nodes belong to 10 different parts of speech: ADV (38; 38% instances), PRON (15; 15% instances), ADP (13; 13% instances), NOUN (12; 12% instances), PUNCT (8; 8% instances), AUX (4; 4% instances), NUM (4; 4% instances), ADJ (3; 3% instances), CCONJ (3; 3% instances), SCONJ (1; 1% instances)