Treebank Statistics: UD_Polish-PDB: POS Tags: NUM
There are 407 NUM lemmas (1%), 460 NUM types (1%) and 2633 NUM tokens (1%).
Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 14 in number of tokens.
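The lemma, type and token counts above can be recomputed from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts` and the two-sentence sample are hypothetical, and it assumes that word forms are compared case-sensitively.

```python
from collections import defaultdict

def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for CoNLL-U text."""
    lemmas = defaultdict(set)
    forms = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line between sentences, or a comment line
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges ("1-2") and empty nodes ("1.1")
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        forms[upos].add(form)  # assumption: types are case-sensitive
        tokens[upos] += 1
    return {tag: (len(lemmas[tag]), len(forms[tag]), tokens[tag])
            for tag in tokens}

# Hypothetical sample, not taken from UD_Polish-PDB.
rows = [
    ("1", "dwie", "dwa", "NUM", "_", "_", "2", "nummod", "_", "_"),
    ("2", "książki", "książka", "NOUN", "_", "_", "0", "root", "_", "_"),
    None,  # sentence boundary
    ("1", "dwa", "dwa", "NUM", "_", "_", "2", "nummod", "_", "_"),
    ("2", "koty", "kot", "NOUN", "_", "_", "0", "root", "_", "_"),
]
sample = "\n".join("\t".join(r) if r else "" for r in rows)

counts = upos_counts(sample)
print(counts["NUM"])  # (1, 2, 2): one lemma, two types, two tokens
```

Sorting the tags by each of the three counts would then give the ranks reported above.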
The 10 most frequent NUM lemmas: dwa, trzy, cztery, pięć, 10, 3, 2, dziesięć, 30, 5
The 10 most frequent NUM types: dwie, dwa, dwóch, trzy, trzech, 10, cztery, 3, pięć, 2
The 10 most frequent ambiguous lemmas: 10 (NUM 84, X 10, ADJ 9), 3 (NUM 54, X 53, ADJ 9), 2 (X 98, NUM 50, ADJ 19), 30 (NUM 43, ADJ 13, X 5), 5 (NUM 42, X 27, ADJ 9, NOUN 1), 15 (NUM 33, ADJ 16, X 6), 20 (NUM 31, ADJ 16, X 4), 4 (X 46, NUM 30, ADJ 10), 50 (NUM 30, X 2), 12 (NUM 29, ADJ 8, X 8)
The 10 most frequent ambiguous types: 10 (NUM 84, X 10, ADJ 9), 3 (NUM 54, X 53, ADJ 9), 2 (X 98, NUM 50, ADJ 19), 30 (NUM 43, ADJ 13, X 6), 5 (NUM 42, X 27, ADJ 9), 15 (NUM 33, ADJ 16, X 6, NOUN 1), 20 (NUM 31, ADJ 16, X 4), 4 (X 46, NUM 30, ADJ 10), 50 (NUM 30, X 2), 12 (NUM 29, X 8, ADJ 7)
Morphology
The form / lemma ratio of NUM is 1.130221 (the average of all parts of speech is 1.966055).
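The ratio appears to be the number of NUM types divided by the number of NUM lemmas. That reading is an assumption, but it reproduces the reported figure from the counts given at the top of this page:

```python
# Counts reported for NUM on this page.
num_types = 460
num_lemmas = 407

# Assumed definition: distinct word forms per distinct lemma.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.130221
```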
The 1st highest number of forms (10) was observed with the lemma “dwa”: dwa, dwaj, dwie, dwiema, dwoje, dwojgiem, dwoma, dwu, dwóch, dwóm.
The 2nd highest number of forms (8) was observed with the lemma “cztery”: Czerech, Czworo, czterech, czterej, czterem, czterema, cztery, czworga.
The 3rd highest number of forms (8) was observed with the lemma “trzy”: Troje, trojga, trojgiem, trzech, trzej, trzem, trzema, trzy.
NUM occurs with 6 features: Case (2633; 100% instances), Gender (2633; 100% instances), NumForm (2633; 100% instances), Number (2633; 100% instances), Animacy (1853; 70% instances), NumType (1333; 51% instances)
NUM occurs with 19 feature-value pairs: Animacy=Hum, Animacy=Inan, Animacy=Nhum, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Sets, Number=Plur, Number=Sing
NUM occurs with 71 feature combinations.
The most frequent feature combination is Animacy=Inan|Case=Acc|Gender=Masc|Number=Plur|NumForm=Digit|NumType=Card (395 tokens).
Examples: 10, 3, 2, 30, 5, 15, 4, 100, 20, 50
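A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, as in the FEATS column of CoNLL-U. The sketch below splits such a string into a dictionary; `parse_feats` is a hypothetical helper, not part of any UD tool.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":  # "_" marks an empty feature set in CoNLL-U
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

combo = "Animacy=Inan|Case=Acc|Gender=Masc|Number=Plur|NumForm=Digit|NumType=Card"
feats = parse_feats(combo)
print(feats["Case"], feats["NumForm"])  # Acc Digit
```

Counting the distinct FEATS strings of all NUM tokens would give the number of feature combinations, and counting the distinct pairs would give the number of feature-value pairs.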
Relations
NUM nodes are attached to their parents using 18 different relations: nummod (1317; 50% instances), nummod:gov (1067; 41% instances), conj (86; 3% instances), nmod (48; 2% instances), nsubj (28; 1% instances), obl (26; 1% instances), flat (18; 1% instances), iobj (15; 1% instances), obj (11; 0% instances), nummod:flat (3; 0% instances), obl:arg (3; 0% instances), appos (2; 0% instances), parataxis:insert (2; 0% instances), parataxis:obj (2; 0% instances), root (2; 0% instances), nsubj:pass (1; 0% instances), obl:cmpr (1; 0% instances), xcomp:pred (1; 0% instances)
Parents of NUM nodes belong to 10 different parts of speech: NOUN (2389; 91% instances), NUM (95; 4% instances), VERB (72; 3% instances), ADJ (33; 1% instances), PROPN (19; 1% instances), PRON (11; 0% instances), DET (7; 0% instances), ADV (4; 0% instances), (2; 0% instances), ADP (1; 0% instances)
1915 (73%) NUM nodes are leaves.
575 (22%) NUM nodes have one child.
113 (4%) NUM nodes have two children.
30 (1%) NUM nodes have three or more children.
The highest child degree of a NUM node is 5.
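The child-degree figures can be derived by counting, for each NUM node, how many nodes in the same sentence name it as their head. This is a minimal sketch under simplifying assumptions: each sentence is given as a list of hypothetical `(id, upos, head)` tuples rather than parsed from CoNLL-U, and the sample is invented.

```python
from collections import Counter

def child_degrees(sentences, upos="NUM"):
    """Histogram {number_of_children: node_count} for nodes tagged `upos`."""
    hist = Counter()
    for sent in sentences:
        # Each occurrence of an id in the head column is one child of that id.
        n_children = Counter(head for _, _, head in sent)
        for node_id, tag, _ in sent:
            if tag == upos:
                hist[n_children[node_id]] += 1
    return hist

# Hypothetical sample: (id, upos, head), with head 0 for the root.
sentences = [
    [(1, "PART", 2), (2, "NUM", 3), (3, "NOUN", 0)],  # NUM has one child
    [(1, "NUM", 2), (2, "NOUN", 0)],                  # NUM is a leaf
]
hist = child_degrees(sentences)
print(hist[0], hist[1])  # 1 1
print(max(hist))         # 1, the highest child degree in the sample
```

As a consistency check on the reported figures, the four buckets sum to the NUM token count: 1915 + 575 + 113 + 30 = 2633.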
Children of NUM nodes are attached using 19 different relations: advmod:emph (272; 30% instances), flat (259; 29% instances), punct (105; 12% instances), conj (91; 10% instances), case (51; 6% instances), nmod (41; 5% instances), det (32; 4% instances), cc (23; 3% instances), amod (8; 1% instances), acl (6; 1% instances), advmod (3; 0% instances), orphan (3; 0% instances), acl:relcl (2; 0% instances), advmod:neg (2; 0% instances), obl:cmpr (2; 0% instances), appos (1; 0% instances), mark (1; 0% instances), nsubj (1; 0% instances), nummod:gov (1; 0% instances)
Children of NUM nodes belong to 14 different parts of speech: NOUN (282; 31% instances), PART (274; 30% instances), PUNCT (105; 12% instances), NUM (95; 11% instances), ADP (51; 6% instances), DET (38; 4% instances), CCONJ (23; 3% instances), ADJ (18; 2% instances), PRON (10; 1% instances), ADV (3; 0% instances), VERB (2; 0% instances), PROPN (1; 0% instances), SCONJ (1; 0% instances), SYM (1; 0% instances)