
This page pertains to UD version 2.

Treebank Statistics: UD_Slovenian-SSJ: POS Tags: NUM

There are 1121 NUM lemmas (4%), 1164 NUM types (2%) and 5585 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 14 in number of tokens.

The 10 most frequent NUM lemmas: en, dva, trije, 2, štirje, 1, pet, eden, 10, 3

The 10 most frequent NUM types: 2, eno, 1, dve, dva, dveh, tri, ena, eden, 10

The 10 most frequent ambiguous lemmas: dva (NUM 278, X 1), 1 (NUM 89, X 1), pet (NUM 82, ADJ 1, X 1), 50 (NUM 40, X 1), 16 (NUM 33, X 1), I. (NUM 18, X 3), 500 (NUM 13, X 1), 250 (NUM 10, X 1), II (NUM 8, X 2), I (NUM 3, NOUN 1, X 1)

The 10 most frequent ambiguous types: 1 (NUM 89, X 1), pet (NUM 43, X 1), 50 (NUM 40, X 1), 16 (NUM 33, X 1), sedem (NUM 24, VERB 1), I. (NUM 18, X 3), 500 (NUM 13, X 1), 250 (NUM 10, X 1), II (NUM 8, X 2), tridesetih (NUM 5, ADJ 2)

Morphology

The form / lemma ratio of NUM is 1.038359 (the average of all parts of speech is 1.932008).
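The ratio above appears to be the number of distinct NUM types divided by the number of distinct NUM lemmas. That reading is an assumption, not something the page states, but it reproduces the reported value from the counts given earlier. A minimal sketch (the function name `form_lemma_ratio` is hypothetical):

```python
# Counts copied from this page; the formula (types / lemmas) is an assumption.
NUM_TYPES = 1164
NUM_LEMMAS = 1121


def form_lemma_ratio(types: int, lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return types / lemmas


ratio = form_lemma_ratio(NUM_TYPES, NUM_LEMMAS)
print(f"{ratio:.6f}")
```

A value this close to 1 reflects that most NUM lemmas, digit strings in particular, occur in a single form.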

The highest number of forms (11) was observed with the lemma “en”: en, ena, ene, enega, enem, enemu, enga, eni, enih, enim, eno.

The second highest number of forms (5) was observed with the lemma “trije”: treh, trem, tremi, tri, trije.

The same number of forms (5) was observed with the lemma “štirje”: štiri, štirih, štirim, štirimi, štirje.

NUM occurs with 5 features: NumForm (5584; 100% instances), NumType (5584; 100% instances), Case (1468; 26% instances), Number (1468; 26% instances), Gender (1013; 18% instances)

NUM occurs with 18 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Ord, NumType=Sets, Number=Dual, Number=Plur, Number=Sing

NUM occurs with 64 feature combinations. The most frequent feature combination is NumForm=Digit|NumType=Card (3405 tokens). Examples: 2, 1, 10, 3, 6, 30, 20, 4, 2000, 15
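A feature combination such as the one above is written as `Feature=Value` pairs joined by `|`, which is how the FEATS column of a CoNLL-U file stores them. The sketch below splits such a string into a dictionary. The helper name `parse_feats` is hypothetical, and the sketch assumes each feature carries a single value.

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split a CoNLL-U FEATS string into a feature -> value mapping."""
    # An underscore marks a token without any features.
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


combo = parse_feats("NumForm=Digit|NumType=Card")
print(combo)
```

Counting such dictionaries (or the raw strings) over all NUM tokens would give the per-combination totals reported above.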

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (4298; 77% instances), conj (445; 8% instances), obl (166; 3% instances), flat (165; 3% instances), nmod (95; 2% instances), appos (77; 1% instances), list (74; 1% instances), nsubj (63; 1% instances), dep (55; 1% instances), root (44; 1% instances), parataxis (42; 1% instances), orphan (24; 0% instances), obj (19; 0% instances), acl (6; 0% instances), ccomp (6; 0% instances), xcomp (3; 0% instances), iobj (2; 0% instances), advcl (1; 0% instances)
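Relation counts like these can be obtained by tallying the DEPREL column over all tokens whose UPOS is NUM. The following is a minimal sketch, not the script that generated this page. The function name `count_relations` is hypothetical, and the two-token fragment is invented for illustration rather than taken from UD_Slovenian-SSJ.

```python
from collections import Counter

# Toy CoNLL-U fragment, invented for illustration (not from the treebank).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
TOY_CONLLU = "\n".join([
    "# text = dva otroka",
    "\t".join(["1", "dva", "dva", "NUM", "_",
               "NumForm=Word|NumType=Card", "2", "nummod", "_", "_"]),
    "\t".join(["2", "otroka", "otrok", "NOUN", "_", "_", "0", "root", "_", "_"]),
    "",
])


def count_relations(conllu_text: str, upos: str = "NUM") -> Counter:
    """Count the DEPREL values of all tokens with the given UPOS tag."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts


relation_counts = count_relations(TOY_CONLLU)
print(relation_counts.most_common())
```

Run over the full treebank files, the same loop should yield one count per relation, and the counts should sum to the number of NUM tokens.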

Parents of NUM nodes belong to 11 different parts of speech: NOUN (3604; 65% instances), NUM (650; 12% instances), PROPN (575; 10% instances), VERB (272; 5% instances), ADJ (173; 3% instances), X (148; 3% instances), SYM (93; 2% instances), no parent, i.e. the NUM node is the root (44; 1% instances), ADV (16; 0% instances), DET (8; 0% instances), PRON (2; 0% instances)

3690 (66%) NUM nodes are leaves.

1345 (24%) NUM nodes have one child.

320 (6%) NUM nodes have two children.

230 (4%) NUM nodes have three or more children.

The highest child degree of a NUM node is 9.
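The four degree counts above partition the NUM tokens, so they should add up to the token total, and the percentages follow from simple division. A short check using only the figures from this page (the variable names are illustrative):

```python
# Child-degree counts for NUM nodes, copied from this page.
degree_counts = {"0": 3690, "1": 1345, "2": 320, "3+": 230}

total = sum(degree_counts.values())

# Percentage share of each degree class, rounded to whole percent.
shares = {degree: round(100 * count / total)
          for degree, count in degree_counts.items()}

print(total)
print(shares)
```

The total matches the 5585 NUM tokens reported at the top of the page.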

Children of NUM nodes are attached using 25 different relations: punct (981; 34% instances), conj (462; 16% instances), advmod (392; 14% instances), case (272; 9% instances), nmod (174; 6% instances), flat (167; 6% instances), cc (114; 4% instances), cop (52; 2% instances), list (37; 1% instances), nsubj (36; 1% instances), appos (35; 1% instances), orphan (23; 1% instances), det (19; 1% instances), amod (17; 1% instances), parataxis (14; 0% instances), dep (13; 0% instances), nummod (12; 0% instances), mark (11; 0% instances), aux (10; 0% instances), acl (9; 0% instances), obl (6; 0% instances), csubj (5; 0% instances), advcl (2; 0% instances), cc:preconj (1; 0% instances), vocative (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: PUNCT (981; 34% instances), NUM (650; 23% instances), ADP (265; 9% instances), NOUN (217; 8% instances), ADV (193; 7% instances), DET (128; 4% instances), PART (126; 4% instances), CCONJ (109; 4% instances), AUX (62; 2% instances), ADJ (34; 1% instances), VERB (26; 1% instances), X (21; 1% instances), PROPN (15; 1% instances), SYM (15; 1% instances), SCONJ (14; 0% instances), PRON (9; 0% instances)