
This page pertains to UD version 2.

Treebank Statistics: UD_Polish-PDB: POS Tags: NUM

There are 407 NUM lemmas (1%), 460 NUM types (1%) and 2633 NUM tokens (1%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 14 in number of tokens.

The 10 most frequent NUM lemmas: dwa, trzy, cztery, pięć, 10, 3, 2, dziesięć, 30, 5

The 10 most frequent NUM types: dwie, dwa, dwóch, trzy, trzech, 10, cztery, 3, pięć, 2

The 10 most frequent ambiguous lemmas: 10 (NUM 84, X 10, ADJ 9), 3 (NUM 54, X 53, ADJ 9), 2 (X 98, NUM 50, ADJ 19), 30 (NUM 43, ADJ 13, X 5), 5 (NUM 42, X 27, ADJ 9, NOUN 1), 15 (NUM 33, ADJ 16, X 6), 20 (NUM 31, ADJ 16, X 4), 4 (X 46, NUM 30, ADJ 10), 50 (NUM 30, X 2), 12 (NUM 29, ADJ 8, X 8)

The 10 most frequent ambiguous types: 10 (NUM 84, X 10, ADJ 9), 3 (NUM 54, X 53, ADJ 9), 2 (X 98, NUM 50, ADJ 19), 30 (NUM 43, ADJ 13, X 6), 5 (NUM 42, X 27, ADJ 9), 15 (NUM 33, ADJ 16, X 6, NOUN 1), 20 (NUM 31, ADJ 16, X 4), 4 (X 46, NUM 30, ADJ 10), 50 (NUM 30, X 2), 12 (NUM 29, X 8, ADJ 7)

Morphology

The form / lemma ratio of NUM is 1.130221 (the average of all parts of speech is 1.966055).
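The ratio above appears to be the number of distinct NUM types divided by the number of distinct NUM lemmas. That reading is an assumption, but it reproduces the reported figure from the counts given on this page. A minimal check (the function name is illustrative):

```python
# Counts reported on this page for NUM in UD_Polish-PDB.
num_types = 460   # distinct word forms tagged NUM
num_lemmas = 407  # distinct lemmas tagged NUM


def form_lemma_ratio(types: int, lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return types / lemmas


ratio = form_lemma_ratio(num_types, num_lemmas)
print(f"{ratio:.6f}")  # prints 1.130221
```

A value close to 1 means most numeral lemmas occur in a single form, which fits the many digit-only lemmas such as 10 or 30.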

The highest number of forms (10) was observed with the lemma “dwa”: dwa, dwaj, dwie, dwiema, dwoje, dwojgiem, dwoma, dwu, dwóch, dwóm.

The 2nd highest number of forms (8) was observed with the lemma “cztery”: Czerech, Czworo, czterech, czterej, czterem, czterema, cztery, czworga.

The 3rd highest number of forms (8) was observed with the lemma “trzy”: Troje, trojga, trojgiem, trzech, trzej, trzem, trzema, trzy.

NUM occurs with 6 features: Case (2633; 100% instances), Gender (2633; 100% instances), NumForm (2633; 100% instances), Number (2633; 100% instances), Animacy (1853; 70% instances), NumType (1333; 51% instances)

NUM occurs with 19 feature-value pairs: Animacy=Hum, Animacy=Inan, Animacy=Nhum, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Sets, Number=Plur, Number=Sing

NUM occurs with 71 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Acc|Gender=Masc|Number=Plur|NumForm=Digit|NumType=Card (395 tokens). Examples: 10, 3, 2, 30, 5, 15, 4, 100, 20, 50
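A feature combination is written as `Name=Value` pairs joined by `|`, as in the FEATS column of a CoNLL-U file. The sketch below splits such a string into a dictionary; `parse_feats` is a hypothetical helper written for illustration, not part of any UD tool.

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split a FEATS string such as 'Case=Acc|Number=Plur' into a dict.

    An underscore means "no features" in CoNLL-U and yields an empty dict.
    """
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# The most frequent NUM feature combination reported above.
most_frequent = "Animacy=Inan|Case=Acc|Gender=Masc|Number=Plur|NumForm=Digit|NumType=Card"
feats = parse_feats(most_frequent)
print(feats["NumForm"], feats["Case"])  # prints: Digit Acc
```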

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (1317; 50% instances), nummod:gov (1067; 41% instances), conj (86; 3% instances), nmod (48; 2% instances), nsubj (28; 1% instances), obl (26; 1% instances), flat (18; 1% instances), iobj (15; 1% instances), obj (11; 0% instances), nummod:flat (3; 0% instances), obl:arg (3; 0% instances), appos (2; 0% instances), parataxis:insert (2; 0% instances), parataxis:obj (2; 0% instances), root (2; 0% instances), nsubj:pass (1; 0% instances), obl:cmpr (1; 0% instances), xcomp:pred (1; 0% instances)

Parents of NUM nodes belong to 10 different parts of speech: NOUN (2389; 91% instances), NUM (95; 4% instances), VERB (72; 3% instances), ADJ (33; 1% instances), PROPN (19; 1% instances), PRON (11; 0% instances), DET (7; 0% instances), ADV (4; 0% instances), no part of speech, i.e. the artificial root (2; 0% instances), ADP (1; 0% instances)
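Counts like the relation and parent statistics above can be derived from the HEAD and DEPREL columns of a CoNLL-U file. The following is a rough sketch using only the standard library. The sample sentence is an invented toy example with simplified annotation, not taken from the treebank, and the function names are illustrative.

```python
from collections import Counter

# Toy CoNLL-U sentence (invented for illustration; features omitted).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = "\n".join([
    "# text = Mam dwa psy.",
    "1\tMam\tmieć\tVERB\t_\t_\t0\troot\t_\t_",
    "2\tdwa\tdwa\tNUM\t_\t_\t3\tnummod\t_\t_",
    "3\tpsy\tpies\tNOUN\t_\t_\t1\tobj\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
])


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():          # blank line ends a sentence
            if rows:
                yield rows
            rows = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():     # skip multiword ranges (1-2) and empty nodes (1.1)
                rows.append(cols)
    if rows:
        yield rows


def num_attachment_stats(text):
    """Count incoming relations and parent UPOS tags of NUM nodes."""
    relations, parent_pos = Counter(), Counter()
    for rows in read_sentences(text):
        upos_by_id = {cols[0]: cols[3] for cols in rows}
        for cols in rows:
            if cols[3] == "NUM":
                relations[cols[7]] += 1
                # HEAD 0 is the artificial root, which has no UPOS tag.
                parent_pos[upos_by_id.get(cols[6], "")] += 1
    return relations, parent_pos


relations, parent_pos = num_attachment_stats(SAMPLE)
print(relations.most_common())   # [('nummod', 1)]
print(parent_pos.most_common())  # [('NOUN', 1)]
```

A NUM node attached as `root` has HEAD 0, so its parent is counted under an empty tag in this sketch.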

1915 (73%) NUM nodes are leaves.

575 (22%) NUM nodes have one child.

113 (4%) NUM nodes have two children.

30 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
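The child degree of a node is the number of tokens whose HEAD points to it; a leaf has degree 0. As a minimal sketch, assuming tokens are given as `(id, upos, head)` tuples, the degree histogram for NUM nodes can be computed as follows. The toy sentence ("Mam tylko dwa psy.") and its annotation are invented for illustration.

```python
from collections import Counter

# (id, upos, head) for an invented sentence: "Mam tylko dwa psy."
# "tylko" is assumed to attach to the numeral "dwa".
tokens = [
    (1, "VERB", 0),
    (2, "PART", 3),
    (3, "NUM", 4),
    (4, "NOUN", 1),
    (5, "PUNCT", 1),
]


def num_child_degrees(tokens):
    """Return a histogram {degree: number of NUM nodes with that degree}."""
    children = Counter(head for _, _, head in tokens)
    return Counter(children[tid] for tid, upos, _ in tokens if upos == "NUM")


histogram = num_child_degrees(tokens)
print(dict(histogram))  # {1: 1}
```

Applied to the whole treebank, the same histogram would be expected to give the leaf, one-child and two-children counts listed above, which sum to the 2633 NUM tokens.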

Children of NUM nodes are attached using 19 different relations: advmod:emph (272; 30% instances), flat (259; 29% instances), punct (105; 12% instances), conj (91; 10% instances), case (51; 6% instances), nmod (41; 5% instances), det (32; 4% instances), cc (23; 3% instances), amod (8; 1% instances), acl (6; 1% instances), advmod (3; 0% instances), orphan (3; 0% instances), acl:relcl (2; 0% instances), advmod:neg (2; 0% instances), obl:cmpr (2; 0% instances), appos (1; 0% instances), mark (1; 0% instances), nsubj (1; 0% instances), nummod:gov (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: NOUN (282; 31% instances), PART (274; 30% instances), PUNCT (105; 12% instances), NUM (95; 11% instances), ADP (51; 6% instances), DET (38; 4% instances), CCONJ (23; 3% instances), ADJ (18; 2% instances), PRON (10; 1% instances), ADV (3; 0% instances), VERB (2; 0% instances), PROPN (1; 0% instances), SCONJ (1; 0% instances), SYM (1; 0% instances)