Treebank Statistics: UD_Czech-PUD: POS Tags: NUM
There are 222 NUM lemmas (4%), 243 NUM types (3%) and 459 NUM tokens (2%).
Out of 15 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 12 in number of tokens.
The 10 most frequent NUM lemmas: jeden, dva, oba, tři, čtyři, deset, 1, 3, šest, 20
The 10 most frequent NUM types: dva, čtyři, dvou, dvě, jedné, 1, 3, jeden, 20, dvěma
The 10 most frequent ambiguous lemmas: jeden (NUM 37, ADJ 1), I (NUM 5, PRON 1), pár (NOUN 4, NUM 1), tisíc (NOUN 3, NUM 1)
The 10 most frequent ambiguous types: I (NUM 5, CCONJ 2), V (ADP 88, NUM 1), jednou (ADV 2, NUM 1), pár (NOUN 4, NUM 1)
Morphology
The form / lemma ratio of NUM is 1.094595 (the average of all parts of speech is 1.426331).
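The reported ratio can be reproduced from the counts given at the top of this page, assuming it is defined as the number of distinct NUM word forms (types) divided by the number of distinct NUM lemmas. That definition is inferred from the numbers and is not stated on the page. A minimal Python sketch:

```python
# Counts for NUM as reported in the summary at the top of the page.
num_types = 243   # distinct word forms tagged NUM
num_lemmas = 222  # distinct lemmas tagged NUM

# Assumption: form / lemma ratio = types / lemmas.
form_lemma_ratio = num_types / num_lemmas
print(f"{form_lemma_ratio:.6f}")  # 1.094595
```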
The 1st highest number of forms (9) was observed with the lemma “jeden”: jeden, jedna, jedno, jednoho, jednom, jednou, jednu, jedné, jedním.
The 2nd highest number of forms (4) was observed with the lemma “dva”: dva, dvou, dvě, dvěma.
The 3rd highest number of forms (4) was observed with the lemma “tři”: třech, třemi, tři, tří.
NUM occurs with 5 features: NumForm (459; 100% instances), NumType (459; 100% instances), Case (140; 31% instances), Number (140; 31% instances), Gender (62; 14% instances)
NUM occurs with 17 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Fem,Neut, Gender=Masc, Gender=Masc,Neut, Gender=Neut, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, Number=Plur, Number=Sing
NUM occurs with 26 feature combinations.
The most frequent feature combination is NumForm=Digit|NumType=Card (303 tokens).
Examples: 1, 3, 20, 2014, 2015, 5, 10, 100, 1492, 2010
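Feature and feature-value counts like those above could be gathered from the FEATS column of a CoNLL-U file. The following is only a sketch: the function name `feature_counts` and the three-token sample are hypothetical, and this is not the script that generated this page.

```python
from collections import Counter

def feature_counts(conllu_text, upos="NUM"):
    """Count feature names and feature=value pairs for tokens with the given UPOS."""
    names, pairs = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] != upos or cols[5] == "_":
            continue
        for pair in cols[5].split("|"):
            # Split on the first "=" only, so values like "Fem,Neut" stay intact.
            name, _, _value = pair.partition("=")
            names[name] += 1
            pairs[pair] += 1
    return names, pairs

# Hypothetical sample for illustration; not taken from the treebank.
sample = "\n".join([
    "1\tdvou\tdva\tNUM\t_\tCase=Gen|NumForm=Word|NumType=Card|Number=Plur\t2\tnummod\t_\t_",
    "2\tlet\trok\tNOUN\t_\tCase=Gen|Gender=Masc|Number=Plur\t0\troot\t_\t_",
    "3\t20\t20\tNUM\t_\tNumForm=Digit|NumType=Card\t2\tnummod\t_\t_",
])
names, pairs = feature_counts(sample)
print(names["NumForm"], names["Case"])  # 2 1
```

The number of distinct keys in `names` and `pairs` corresponds to the "features" and "feature-value pairs" totals, respectively.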
Relations
NUM nodes are attached to their parents using 16 different relations: nummod (319; 69% instances), nummod:gov (74; 16% instances), conj (18; 4% instances), nsubj (11; 2% instances), nmod (7; 2% instances), appos (6; 1% instances), compound (5; 1% instances), obl (5; 1% instances), root (5; 1% instances), advcl (2; 0% instances), obl:arg (2; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), flat (1; 0% instances), obj (1; 0% instances), parataxis (1; 0% instances)
Parents of NUM nodes belong to 9 different parts of speech: NOUN (371; 81% instances), VERB (25; 5% instances), PROPN (22; 5% instances), NUM (19; 4% instances), SYM (13; 3% instances), no parent, i.e. root nodes (5; 1% instances), ADJ (2; 0% instances), DET (1; 0% instances), PRON (1; 0% instances)
321 (70%) NUM nodes are leaves.
106 (23%) NUM nodes have one child.
14 (3%) NUM nodes have two children.
18 (4%) NUM nodes have three or more children.
The highest child degree of a NUM node is 7.
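The child-degree figures above count, for each NUM node, how many tokens in the same sentence name it as their head. A minimal sketch of that tally over CoNLL-U text follows; the function name `child_degree_histogram` and the two sample sentences are hypothetical and serve only as illustration.

```python
from collections import Counter

def child_degree_histogram(conllu_text, upos="NUM"):
    """Map child count -> number of nodes tagged `upos` with that many children."""
    histogram = Counter()
    # Sentences in CoNLL-U are separated by a blank line.
    for sentence in conllu_text.strip().split("\n\n"):
        tags, children = {}, Counter()
        for line in sentence.splitlines():
            cols = line.split("\t")
            # Skip comments, multiword-token ranges and empty nodes.
            if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
                continue
            tags[cols[0]] = cols[3]   # token ID -> UPOS
            children[cols[6]] += 1    # HEAD column: one more child for that head
        for node_id, tag in tags.items():
            if tag == upos:
                histogram[children[node_id]] += 1
    return histogram

# Hypothetical sample for illustration; not taken from the treebank.
sample = "\n".join([
    "1\ttři\ttři\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\troky\trok\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\tBylo\tbýt\tAUX\t_\t_\t2\tcop\t_\t_",
    "2\tpět\tpět\tNUM\t_\t_\t0\troot\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])
histogram = child_degree_histogram(sample)
print(histogram[0], histogram[2], max(histogram))  # 1 1 2
```

Here `histogram[0]` is the number of leaves and `max(histogram)` is the highest child degree.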
Children of NUM nodes are attached using 21 different relations: punct (86; 41% instances), nmod (29; 14% instances), conj (16; 8% instances), case (13; 6% instances), cc (13; 6% instances), cop (9; 4% instances), advmod:emph (8; 4% instances), nsubj (7; 3% instances), obl (5; 2% instances), compound (4; 2% instances), advmod (3; 1% instances), amod (3; 1% instances), mark (3; 1% instances), det (2; 1% instances), obl:arg (2; 1% instances), acl (1; 0% instances), advcl (1; 0% instances), appos (1; 0% instances), aux (1; 0% instances), orphan (1; 0% instances), parataxis (1; 0% instances)
Children of NUM nodes belong to 15 different parts of speech: PUNCT (86; 41% instances), NOUN (35; 17% instances), NUM (19; 9% instances), CCONJ (14; 7% instances), ADP (13; 6% instances), ADV (10; 5% instances), AUX (10; 5% instances), ADJ (5; 2% instances), PROPN (4; 2% instances), DET (3; 1% instances), PRON (3; 1% instances), SCONJ (3; 1% instances), VERB (2; 1% instances), PART (1; 0% instances), SYM (1; 0% instances)