
This page pertains to UD version 2.

Treebank Statistics: UD_Polish-LFG: POS Tags: NUM

There are 178 NUM lemmas (1%), 213 NUM types (1%) and 833 NUM tokens (1%). Out of 15 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 14 in number of tokens.
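As a rough illustration of how the three counts differ, the sketch below tallies lemmas, types and tokens for one UPOS tag from CoNLL-U text. It assumes that a type is a distinct, case-sensitive word form. The function name `upos_counts` and the sample sentences are invented for illustration and are not taken from UD_Polish-LFG.

```python
# Toy CoNLL-U fragment, invented for illustration (not from UD_Polish-LFG).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = (
    "# text = dwa psy i dwie kozy\n"
    "1\tdwa\tdwa\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    "2\tpsy\tpies\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "3\ti\ti\tCCONJ\t_\t_\t5\tcc\t_\t_\n"
    "4\tdwie\tdwa\tNUM\t_\t_\t5\tnummod\t_\t_\n"
    "5\tkozy\tkoza\tNOUN\t_\t_\t2\tconj\t_\t_\n"
    "\n"
    "# text = dwa koty\n"
    "1\tdwa\tdwa\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    "2\tkoty\tkot\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            lemmas.add(cols[2])   # LEMMA column
            types.add(cols[1])    # FORM column, kept case-sensitive (assumption)
            tokens += 1
    return len(lemmas), len(types), tokens


print(upos_counts(SAMPLE, "NUM"))  # (1, 2, 3)
```

In the toy data the single lemma "dwa" appears as two types ("dwa", "dwie") across three tokens.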

The 10 most frequent NUM lemmas: dwa, trzy, cztery, oba, pięć, sto, dwadzieścia, sześć, 15, osiem

The 10 most frequent NUM types: dwa, dwóch, dwie, trzy, cztery, trzech, obu, pięć, czterech, dwadzieścia

The 10 most frequent ambiguous lemmas: 15 (NUM 11, ADJ 3), 20 (NUM 9, ADJ 1), 40 (NUM 9, ADJ 1), 10 (NUM 8, ADJ 1), 6 (NUM 8, ADJ 2), 18 (NUM 6, ADJ 2), 3 (NUM 5, ADJ 3, NOUN 3), 80 (NUM 5, ADJ 2), pół (NUM 5, NOUN 1), 2 (NUM 4, ADJ 1, NOUN 1)

The 10 most frequent ambiguous types: 15 (NUM 11, ADJ 3), 20 (NUM 9, ADJ 1), 40 (NUM 9, ADJ 1), 10 (NUM 8, ADJ 1), 6 (NUM 8, ADJ 1), 3 (NUM 5, ADJ 3, NOUN 3), 80 (NUM 5, ADJ 2), pół (NUM 5, NOUN 1), 2 (NUM 4, ADJ 1, NOUN 1), 24 (NUM 4, ADJ 2)

Morphology

The form / lemma ratio of NUM is 1.196629 (the average of all parts of speech is 1.795702).
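The reported ratio appears to be the number of NUM types divided by the number of NUM lemmas. A quick check with the counts given above (the variable names are only illustrative):

```python
num_types = 213    # NUM types reported on this page
num_lemmas = 178   # NUM lemmas reported on this page

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.196629
```

A ratio close to 1 means that most NUM lemmas occur in a single form only; here many of them are numerals written in digits.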

The highest number of forms (7) was observed with the lemma “dwa”: dwa, dwaj, dwie, dwiema, dwoma, dwu, dwóch.

The second-highest number of forms (5) was observed with the lemma “oba”: oba, obaj, obie, oboma, obu.

The third-highest number of forms (4) was observed with the lemma “obydwa”: Obydwie, Obydwu, obydwa, obydwaj.
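A ranking like the one above can be sketched by grouping word forms under their lemma. The example assumes forms are compared case-sensitively, which the capitalised forms listed for “obydwa” suggest. The token list and the helper `forms_by_lemma` are invented for illustration.

```python
from collections import defaultdict

# Invented (form, lemma, upos) triples, for illustration only.
TOKENS = [
    ("dwa", "dwa", "NUM"),
    ("dwie", "dwa", "NUM"),
    ("dwóch", "dwa", "NUM"),
    ("dwa", "dwa", "NUM"),
    ("obu", "oba", "NUM"),
    ("Obu", "oba", "NUM"),
    ("psy", "pies", "NOUN"),
]


def forms_by_lemma(tokens, upos):
    """Return [(lemma, sorted_forms), ...], lemmas with most forms first."""
    forms = defaultdict(set)
    for form, lemma, tag in tokens:
        if tag == upos:
            forms[lemma].add(form)  # case-sensitive: "obu" and "Obu" differ
    return sorted(
        ((lemma, sorted(fs)) for lemma, fs in forms.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


ranking = forms_by_lemma(TOKENS, "NUM")
print(ranking[0])  # ('dwa', ['dwa', 'dwie', 'dwóch'])
```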

NUM occurs with 5 features: Case (833; 100% instances), Gender (833; 100% instances), NumType (833; 100% instances), Number (833; 100% instances), SubGender (565; 68% instances)

NUM occurs with 15 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, NumType=Frac, Number=Plur, SubGender=Masc1, SubGender=Masc2, SubGender=Masc3

NUM occurs with 27 feature combinations. The most frequent feature combination is Case=Acc|Gender=Masc|Number=Plur|NumType=Card|SubGender=Masc3 (227 tokens). Examples: dwa, trzy, cztery, dwadzieścia, sto, 10, osiem, pięć, trzydzieści, 4

Relations

NUM nodes are attached to their parents using 9 different relations: nummod (742; 89% instances), nsubj (22; 3% instances), obl (20; 2% instances), appos (19; 2% instances), conj (15; 2% instances), obj (11; 1% instances), nmod (2; 0% instances), nsubj:pass (1; 0% instances), root (1; 0% instances)

Parents of NUM nodes belong to 8 different parts of speech: NOUN (725; 87% instances), VERB (51; 6% instances), NUM (27; 3% instances), ADJ (12; 1% instances), PRON (8; 1% instances), PROPN (7; 1% instances), DET (2; 0% instances), no part of speech, i.e. the artificial root (1; 0% instances)
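The entry without a tag in the list above most likely corresponds to the one NUM node attached as root: its head is the artificial root node (HEAD = 0 in CoNLL-U), which has no part of speech. A minimal sketch of such a parent-tag count, using invented sentences and a hypothetical helper `parent_upos_counts`:

```python
from collections import Counter

# Invented sentences as (id, form, upos, head) tuples; head 0 is the artificial root.
SENTENCES = [
    [(1, "dwa", "NUM", 2), (2, "psy", "NOUN", 3), (3, "śpią", "VERB", 0)],
    [(1, "Trzy", "NUM", 0), (2, ".", "PUNCT", 1)],
]


def parent_upos_counts(sentences, upos):
    """Count the UPOS tags of the parents of all nodes tagged `upos`."""
    counts = Counter()
    for sent in sentences:
        upos_by_id = {tok_id: tag for tok_id, _form, tag, _head in sent}
        for _tok_id, _form, tag, head in sent:
            if tag == upos:
                # The artificial root (id 0) is not a token, so it gets an empty tag.
                counts[upos_by_id.get(head, "")] += 1
    return counts


print(dict(parent_upos_counts(SENTENCES, "NUM")))  # {'NOUN': 1, '': 1}
```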

772 (93%) NUM nodes are leaves.

44 (5%) NUM nodes have one child.

13 (2%) NUM nodes have two children.

4 (0%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
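The leaf and child-degree figures above can be reproduced by counting, for each node, how many other nodes in the same sentence name it as their head. The sketch below uses invented sentences with simplified attachments; `child_degrees` is a hypothetical helper.

```python
from collections import Counter

# Invented sentences as (id, form, upos, head) tuples; head 0 is the artificial root.
SENTENCES = [
    [
        (1, "około", "PART", 2),
        (2, "dwóch", "NUM", 5),
        (3, "lub", "CCONJ", 4),
        (4, "trzech", "NUM", 2),
        (5, "psów", "NOUN", 0),
    ],
    [(1, "dwa", "NUM", 2), (2, "psy", "NOUN", 0)],
]


def child_degrees(sentences, upos):
    """Map number of children -> number of `upos` nodes with that many children."""
    degrees = Counter()
    for sent in sentences:
        # How often each node id is used as a head within this sentence.
        n_children = Counter(head for _tok_id, _form, _tag, head in sent)
        for tok_id, _form, tag, _head in sent:
            if tag == upos:
                degrees[n_children[tok_id]] += 1  # missing ids count as 0 (leaf)
    return degrees


degrees = child_degrees(SENTENCES, "NUM")
print(sorted(degrees.items()))  # [(0, 1), (1, 1), (2, 1)]
print(max(degrees))             # 2
```

Here one NUM node is a leaf, one has a single child and one has two children, so the highest child degree in the toy data is 2.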

Children of NUM nodes are attached using 13 different relations: appos (26; 31% instances), conj (13; 15% instances), case (12; 14% instances), punct (9; 11% instances), cc (7; 8% instances), advmod (5; 6% instances), nmod (5; 6% instances), det (2; 2% instances), amod (1; 1% instances), cop (1; 1% instances), flat (1; 1% instances), nsubj (1; 1% instances), nummod (1; 1% instances)

Children of NUM nodes belong to 11 different parts of speech: NUM (27; 32% instances), NOUN (16; 19% instances), ADP (12; 14% instances), PUNCT (9; 11% instances), CCONJ (7; 8% instances), PART (5; 6% instances), PRON (3; 4% instances), DET (2; 2% instances), ADJ (1; 1% instances), AUX (1; 1% instances), PROPN (1; 1% instances)