
This page pertains to UD version 2.

Treebank Statistics: UD_Slovenian-SSJ: POS Tags: NUM

There are 483 NUM lemmas (3%), 522 NUM types (2%) and 1921 NUM tokens (1%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 14 in number of tokens.
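The lemma, type and token counts above can be reproduced from a CoNLL-U file with a few lines of code. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the five-token sample are hypothetical, and it assumes that types are counted case-sensitively from the FORM column.

```python
# Hypothetical CoNLL-U rows (not taken from UD_Slovenian-SSJ).
ROWS = [
    ["1", "Dva", "dva", "NUM", "_", "NumForm=Word|NumType=Card", "2", "nummod", "_", "_"],
    ["2", "fanta", "fant", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "bereta", "brati", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", "dve", "dva", "NUM", "_", "NumForm=Word|NumType=Card", "5", "nummod", "_", "_"],
    ["5", "knjigi", "knjiga", "NOUN", "_", "_", "3", "obj", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct types, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column, case preserved (assumption)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "NUM"))
```

On the sample, the two NUM tokens share one lemma but have two distinct forms.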

The 10 most frequent NUM lemmas: en, dva, trije, štirje, pet, eden, deset, tisoč, šest, sto

The 10 most frequent NUM types: eno, tri, dva, dveh, eden, ena, tisoč, štiri, dve, štirih

The 10 most frequent ambiguous lemmas: V (NOUN 2, NUM 1), V. (X 3, NUM 1), X (NOUN 1, NUM 1)

The 10 most frequent ambiguous types: V (ADP 423, NOUN 2, NUM 1), V. (X 3, NUM 1), X (NOUN 1, NUM 1), dvajsetih (ADJ 1, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.080745 (the average of all parts of speech is 1.892155).
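The reported ratio is consistent with dividing the number of NUM types (522) by the number of NUM lemmas (483). A quick check, assuming that this is how the ratio is defined:

```python
num_types = 522   # distinct NUM word forms reported on this page
num_lemmas = 483  # distinct NUM lemmas reported on this page

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # prints 1.080745
```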

The 1st highest number of forms (10) was observed with the lemma “en”: en, ena, ene, enega, enem, enemu, eni, enih, enim, eno.

The 2nd highest number of forms (5) was observed with the lemma “trije”: treh, trem, tremi, tri, trije.

The 3rd highest number of forms (5) was observed with the lemma “štirje”: štiri, štirih, štirim, štirimi, štirje.

NUM occurs with 5 features: NumForm (1921; 100% instances), NumType (1921; 100% instances), Case (737; 38% instances), Number (737; 38% instances), Gender (484; 25% instances)

NUM occurs with 18 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Ord, NumType=Sets, Number=Dual, Number=Plur, Number=Sing

NUM occurs with 61 feature combinations. The most frequent feature combination is NumForm=Digit|NumType=Card (921 tokens). Examples: 10, 15, 2000, 50, 3, 30, 20, 6, 40, 2
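The three kinds of counts above (features, feature-value pairs, feature combinations) all derive from the FEATS column. The sketch below shows one way to compute them; the helper names and the three sample values are hypothetical, and a combination is assumed to be the full FEATS string of a token.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, distinct feature-value pairs and whole combinations."""
    feature_counts = Counter()  # tokens carrying each feature
    pairs = set()               # distinct Feature=Value pairs
    combos = Counter()          # tokens per full FEATS string
    for feats in feats_column:
        combos[feats] += 1
        for name, value in parse_feats(feats).items():
            feature_counts[name] += 1
            pairs.add(f"{name}={value}")
    return feature_counts, pairs, combos


# Hypothetical FEATS values for three NUM tokens.
sample = [
    "NumForm=Digit|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "Case=Nom|Gender=Masc|Number=Dual|NumForm=Word|NumType=Card",
]
feature_counts, pairs, combos = feature_stats(sample)
print(feature_counts["NumForm"], len(pairs), combos.most_common(1))
```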

Relations

NUM nodes are attached to their parents using 15 different relations: nummod (1610; 84% instances), conj (103; 5% instances), obl (84; 4% instances), flat (32; 2% instances), nsubj (27; 1% instances), parataxis (24; 1% instances), root (13; 1% instances), obj (9; 0% instances), dep (8; 0% instances), appos (3; 0% instances), nmod (3; 0% instances), acl (2; 0% instances), ccomp (1; 0% instances), iobj (1; 0% instances), xcomp (1; 0% instances)
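The percentages in this list appear to be shares of all 1921 NUM tokens, rounded to the nearest integer, which is why relations with fewer than about 10 instances show as 0%. A small sketch under that assumption (the function name `share` is hypothetical):

```python
TOTAL_NUM_TOKENS = 1921  # NUM tokens reported on this page


def share(count, total=TOTAL_NUM_TOKENS):
    """Percentage of `total`, rounded to the nearest integer."""
    return round(100 * count / total)


# A few of the counts listed above.
relations = {"nummod": 1610, "conj": 103, "obl": 84, "root": 13, "obj": 9}
for rel, count in relations.items():
    print(f"{rel} ({count}; {share(count)}% instances)")
```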

Parents of NUM nodes belong to 10 different parts of speech: NOUN (1516; 79% instances), VERB (133; 7% instances), NUM (127; 7% instances), ADJ (72; 4% instances), PROPN (50; 3% instances), no parent, i.e. the artificial root (13; 1% instances), ADV (5; 0% instances), DET (2; 0% instances), X (2; 0% instances), PRON (1; 0% instances)

1388 (72%) NUM nodes are leaves.

431 (22%) NUM nodes have one child.

71 (4%) NUM nodes have two children.

31 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
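The child-degree figures above count, for each NUM node, how many tokens name it as their HEAD. The sketch below illustrates that computation on a hypothetical five-token sentence and then checks that the four reported groups add up to the 1921 NUM tokens. The function name and the sample are assumptions, not part of the treebank tooling.

```python
from collections import Counter


def child_degree_histogram(tokens, upos):
    """tokens: list of (id, upos, head) for one sentence.

    Returns a Counter mapping number of children -> number of `upos` nodes.
    """
    n_children = Counter(head for _, _, head in tokens)
    hist = Counter()
    for tok_id, tok_upos, _ in tokens:
        if tok_upos == upos:
            hist[n_children[tok_id]] += 1  # missing key counts as 0 children
    return hist


# Hypothetical sentence: token 2 (NUM) is the root with three children,
# token 4 (NUM) is a leaf.
sentence = [
    (1, "ADP", 2),
    (2, "NUM", 0),
    (3, "PUNCT", 2),
    (4, "NUM", 5),
    (5, "NOUN", 2),
]
hist = child_degree_histogram(sentence, "NUM")
print(dict(hist))

# Sanity check on the figures reported on this page.
reported = {"leaves": 1388, "one child": 431, "two children": 71, "three or more": 31}
print(sum(reported.values()))  # prints 1921
```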

Children of NUM nodes are attached using 16 different relations: punct (145; 21% instances), case (130; 19% instances), advmod (119; 17% instances), conj (108; 15% instances), nmod (74; 11% instances), cc (37; 5% instances), flat (32; 5% instances), cop (16; 2% instances), nsubj (11; 2% instances), acl (9; 1% instances), amod (6; 1% instances), aux (3; 0% instances), mark (3; 0% instances), advcl (2; 0% instances), csubj (1; 0% instances), obl (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: PUNCT (145; 21% instances), ADP (130; 19% instances), NUM (127; 18% instances), ADV (79; 11% instances), NOUN (67; 10% instances), CCONJ (35; 5% instances), PART (30; 4% instances), DET (25; 4% instances), AUX (19; 3% instances), ADJ (13; 2% instances), VERB (13; 2% instances), PRON (5; 1% instances), SCONJ (4; 1% instances), PROPN (3; 0% instances), X (2; 0% instances)