This page pertains to UD version 2.

Treebank Statistics: UD_Upper_Sorbian-UFAL: POS Tags: NUM

There are 187 NUM lemmas (6%), 197 NUM types (4%) and 382 NUM tokens (3%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 9 in number of tokens.

The 10 most frequent NUM lemmas: jedyn, 2, 1, 6, dwaj, 4, 3, 5, 7, I

The 10 most frequent NUM types: 2, 1, 6, 4, 3, jedyn, 5, 7, I, 000

The most frequent ambiguous lemmas (only 3 observed): tři (NUM 6, NOUN 1), milion (NOUN 6, NUM 3), wobaj (DET 1, NUM 1)

The most frequent ambiguous types (only 1 observed): sta (NUM 2, VERB 1)

Morphology

The form / lemma ratio of NUM is 1.053476 (the average of all parts of speech is 1.418889).
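The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and it assumes the ratio is computed as types over lemmas.

```python
# Counts taken from the statistics at the top of this page.
num_types = 197   # distinct NUM word forms (types)
num_lemmas = 187  # distinct NUM lemmas

# Assumption: the form / lemma ratio is types divided by lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # prints 1.053476
```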

The 1st highest number of forms (6) was observed with the lemma “jedyn”: jedna, jedneho, jednu, jedny, jednym, jedyn.

The 2nd highest number of forms (4) was observed with the lemma “dwaj”: dwaj, dweju, dwě, dwěmaj.

The 3rd highest number of forms (2) was observed with the lemma “tři”: traje, tři.

NUM occurs with 7 features: NumType (381; 100% instances), Gender (36; 9% instances), Case (34; 9% instances), Number (32; 8% instances), Animacy (18; 5% instances), Abbr (7; 2% instances), PronType (1; 0% instances)

NUM occurs with 16 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, Number=Dual, Number=Plur, Number=Sing, PronType=Tot

NUM occurs with 25 feature combinations. The most frequent feature combination is NumType=Card (341 tokens). Examples: 2, 1, 6, 4, 3, 5, 7, I, 000, 10
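A feature combination is the full content of the FEATS column of a CoNLL-U token, with feature-value pairs separated by `|`. The sketch below shows one way to split such a string into pairs. The helper name `parse_feats` is hypothetical, and the example string is assembled from values listed above purely for illustration; it is not claimed to be one of the 25 attested combinations.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a feature -> value dict."""
    if feats == "_":  # "_" marks an empty FEATS column
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# The most frequent combination on this page consists of a single pair.
print(parse_feats("NumType=Card"))

# Illustrative string only (hypothetical, not taken from the treebank).
combo = parse_feats("Case=Nom|Gender=Masc|NumType=Card|Number=Dual")
print(len(combo))
```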

Relations

NUM nodes are attached to their parents using 12 different relations: nummod (193; 51% instances), nummod:gov (64; 17% instances), conj (41; 11% instances), compound (33; 9% instances), obl (18; 5% instances), parataxis (14; 4% instances), appos (7; 2% instances), nmod (4; 1% instances), root (3; 1% instances), list (2; 1% instances), obj (2; 1% instances), amod (1; 0% instances)
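Counts like these can be reproduced by reading the UPOS and DEPREL columns of the treebank's CoNLL-U files. The following is a rough sketch, not the script that generated this page: the function name `count_num_relations` is hypothetical, and the sample rows are invented toy data rather than sentences from the treebank.

```python
from collections import Counter

def count_num_relations(conllu_text):
    """Count the DEPREL values of all tokens whose UPOS is NUM."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, upos, deprel = cols[0], cols[3], cols[7]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword token ranges and empty nodes
        if upos == "NUM":
            counts[deprel] += 1
    return counts

# Invented toy rows (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
SAMPLE = "\n".join([
    "# toy sentence, invented for illustration",
    "\t".join(["1", "2", "2", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"]),
    "\t".join(["2", "A", "A", "NOUN", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["3", "a", "a", "CCONJ", "_", "_", "4", "cc", "_", "_"]),
    "\t".join(["4", "3", "3", "NUM", "_", "NumType=Card", "1", "conj", "_", "_"]),
])

relations = count_num_relations(SAMPLE)
for deprel, n in relations.most_common():
    print(deprel, n)
```

To apply the same function to real data, pass in the text of a CoNLL-U file read with UTF-8 encoding.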

Parents of NUM nodes belong to 8 different parts of speech: NOUN (208; 54% instances), NUM (75; 20% instances), VERB (35; 9% instances), PROPN (32; 8% instances), SYM (17; 4% instances), ADJ (7; 2% instances), X (5; 1% instances), no part of speech, i.e. the NUM node is attached to the artificial root (3; 1% instances)

162 (42%) NUM nodes are leaves.

141 (37%) NUM nodes have one child.

45 (12%) NUM nodes have two children.

34 (9%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
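As a quick consistency check, the four child-degree counts above sum to the 382 NUM tokens, and the percentages follow from rounding each share to the nearest whole number. A small sketch of that check, with illustrative variable names:

```python
# Child-degree counts for NUM nodes, copied from the lines above.
degree_counts = {"0": 162, "1": 141, "2": 45, "3+": 34}

total = sum(degree_counts.values())
print(total)  # prints 382

# Share of each degree class, rounded to whole percent.
shares = {k: round(100 * n / total) for k, n in degree_counts.items()}
for degree, pct in shares.items():
    print(f"{degree} children: {pct}%")
```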

Children of NUM nodes are attached using 13 different relations: punct (183; 52% instances), conj (43; 12% instances), compound (33; 9% instances), nmod (31; 9% instances), case (19; 5% instances), cc (14; 4% instances), advmod:emph (10; 3% instances), advmod (5; 1% instances), cop (4; 1% instances), nsubj (3; 1% instances), nummod (3; 1% instances), appos (1; 0% instances), orphan (1; 0% instances)

Children of NUM nodes belong to 12 different parts of speech: PUNCT (183; 52% instances), NUM (75; 21% instances), NOUN (35; 10% instances), ADP (18; 5% instances), ADV (16; 5% instances), CCONJ (12; 3% instances), AUX (4; 1% instances), PROPN (2; 1% instances), SYM (2; 1% instances), ADJ (1; 0% instances), PRON (1; 0% instances), VERB (1; 0% instances)