This page pertains to UD version 2.

Treebank Statistics: UD_Belarusian-HSE: POS Tags: NUM

There are 957 NUM lemmas (3%), 1008 NUM types (2%) and 5845 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 12 in number of tokens.
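As a rough illustration of how counts like these can be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag in CoNLL-U data. The names `pos_counts` and `to_conllu` and the five-token sample are hypothetical, and the original statistics may count types differently (for example with respect to case sensitivity).

```python
# Toy rows (FORM, LEMMA, UPOS); an invented sample, not treebank data.
ROWS = [
    ("два", "два", "NUM"),
    ("дзве", "два", "NUM"),
    ("10", "10", "NUM"),
    ("10", "10", "NUM"),
    ("адзін", "адзін", "DET"),
]


def to_conllu(rows):
    """Build a one-sentence CoNLL-U string with 10 tab-separated columns."""
    lines = ["# sent_id = toy-1"]
    for i, (form, lemma, upos) in enumerate(rows, start=1):
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        lines.append("\t".join([str(i), form, lemma, upos] + ["_"] * 6))
    return "\n".join(lines) + "\n"


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            types.add(cols[1])
            lemmas.add(cols[2])
            tokens += 1
    return len(lemmas), len(types), tokens


print(pos_counts(to_conllu(ROWS), "NUM"))  # (2, 3, 4)
```

In the toy sample the lemma "два" appears with two forms, so there are more types than lemmas, which is the same effect that separates 1008 types from 957 lemmas above.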

The 10 most frequent NUM lemmas: два, адзін, тры, 10, 2, некалькі, 5, 1, 20, 3

The 10 most frequent NUM types: 10, 2, 5, некалькі, два, 1, тры, 20, 3, адзін

The 10 most frequent ambiguous lemmas: адзін (DET 352, NUM 206), 10 (NUM 175, ADJ 49), 2 (NUM 168, ADJ 37, PROPN 1), 5 (NUM 143, ADJ 24, PROPN 1), 1 (NUM 133, ADJ 66), 20 (NUM 125, ADJ 54), 3 (NUM 122, ADJ 62, X 1), 100 (NUM 100, ADJ 1), колькі (NUM 100, CCONJ 1), 15 (NUM 97, ADJ 36)

The 10 most frequent ambiguous types: 10 (NUM 175, ADJ 49), 2 (NUM 165, ADJ 37, PROPN 1), 5 (NUM 140, ADJ 24, PROPN 1), 1 (NUM 133, ADJ 66), 20 (NUM 125, ADJ 52), 3 (NUM 121, ADJ 62, ADP 4, X 1), адзін (NUM 92, DET 72), 100 (NUM 100, ADJ 1), колькі (NUM 63, CCONJ 1), 15 (NUM 96, ADJ 36)
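A lemma or type counts as ambiguous here when it occurs with more than one tag. The following sketch shows one way to find such items for a given tag; the function name `ambiguous_items` and the sample pairs are hypothetical, and sorting by the count under the target tag is an assumption that matches the lemma list above but not necessarily the ordering of the type list.

```python
from collections import Counter, defaultdict

# Toy (lemma, UPOS) pairs; an invented sample, not treebank data.
PAIRS = [
    ("адзін", "DET"), ("адзін", "DET"), ("адзін", "NUM"),
    ("10", "NUM"), ("10", "NUM"), ("10", "ADJ"),
    ("два", "NUM"),
]


def ambiguous_items(pairs, upos):
    """Return [(item, tag_counts)] for items seen with `upos` and another tag."""
    tags_by_item = defaultdict(Counter)
    for item, tag in pairs:
        tags_by_item[item][tag] += 1
    ambiguous = [
        (item, dict(tags))
        for item, tags in tags_by_item.items()
        if upos in tags and len(tags) > 1
    ]
    # Assumed ordering: most frequent under the target tag first.
    ambiguous.sort(key=lambda entry: -entry[1][upos])
    return ambiguous


for item, tags in ambiguous_items(PAIRS, "NUM"):
    print(item, tags)
```

Passing (form, UPOS) pairs instead of (lemma, UPOS) pairs gives the corresponding list of ambiguous types.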

Morphology

The form / lemma ratio of NUM is 1.053292 (the average of all parts of speech is 1.754875).
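The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page. The short check below reproduces that arithmetic; treating the ratio as types divided by lemmas is an inference from the numbers, not a documented definition.

```python
num_types = 1008   # NUM types reported on this page
num_lemmas = 957   # NUM lemmas reported on this page

# Assumed definition: distinct forms divided by distinct lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.053292
```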

The highest number of forms (11) was observed with the lemma “два”: два, две, двум, двума, двух, дзве, дзвюмя, дзвюх, дзьве, дзьвюма, дзьвюх.

The 2nd highest number of forms (8) was observed with the lemma “адзін”: адзін, адна, аднаго, адно, адной, адну, адны, адным.

The 3rd highest number of forms (4) was observed with the lemma “абодва”: абедзвюх, абедзьве, абодва, абодвух.

NUM occurs with 5 features: NumType (4790; 82% instances), Case (1322; 23% instances), Animacy (534; 9% instances), Gender (516; 9% instances), Number (217; 4% instances)

NUM occurs with 15 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, NumType=Sets, Number=Plur, Number=Sing

NUM occurs with 68 feature combinations. The most frequent feature combination is NumType=Card (3738 tokens). Examples: 10, 2, 5, 1, 20, 3, 100, 15, 18, 7
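Feature counts, feature-value pairs and feature combinations can all be read off the FEATS column of CoNLL-U data. The sketch below shows one possible way to tally them; the function name `feature_stats` and the sample values are hypothetical and do not come from the treebank.

```python
from collections import Counter

# Toy FEATS values for four NUM tokens; "_" means no features.
FEATS = [
    "NumType=Card",
    "NumType=Card",
    "Case=Nom|NumType=Card",
    "_",
]


def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        if feats == "_":
            continue
        combos[feats] += 1
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


features, pairs, combos = feature_stats(FEATS)
total = len(FEATS)
for name, count in features.most_common():
    # Same "count; percent of instances" layout as the list above.
    print(f"{name} ({count}; {round(100 * count / total)}% instances)")
print(combos.most_common(1))
```

A token with no features is excluded from every counter in this sketch, which is why a feature's share can stay below 100% even when it is the only feature observed.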

Relations

NUM nodes are attached to their parents using 24 different relations: nummod:gov (1926; 33% instances), nummod (1446; 25% instances), list (689; 12% instances), appos (446; 8% instances), nmod (370; 6% instances), root (283; 5% instances), parataxis (256; 4% instances), obl (152; 3% instances), conj (98; 2% instances), nsubj (70; 1% instances), obj (32; 1% instances), compound (26; 0% instances), orphan (11; 0% instances), amod (8; 0% instances), flat (6; 0% instances), nsubj:pass (6; 0% instances), fixed (5; 0% instances), ccomp (4; 0% instances), dep (3; 0% instances), advcl (2; 0% instances), iobj (2; 0% instances), xcomp (2; 0% instances), acl (1; 0% instances), acl:relcl (1; 0% instances)
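As a consistency check, the relation counts above can be summed and converted back to percentages. The sketch below copies the 24 counts from the list; it assumes the percentages are plain rounded shares of the NUM token total.

```python
# Relation counts copied from the list above.
RELATIONS = {
    "nummod:gov": 1926, "nummod": 1446, "list": 689, "appos": 446,
    "nmod": 370, "root": 283, "parataxis": 256, "obl": 152,
    "conj": 98, "nsubj": 70, "obj": 32, "compound": 26,
    "orphan": 11, "amod": 8, "flat": 6, "nsubj:pass": 6,
    "fixed": 5, "ccomp": 4, "dep": 3, "advcl": 2,
    "iobj": 2, "xcomp": 2, "acl": 1, "acl:relcl": 1,
}

total = sum(RELATIONS.values())
# Assumed: each share is the count divided by the total, rounded to a whole percent.
shares = {rel: round(100 * count / total) for rel, count in RELATIONS.items()}

print(len(RELATIONS), total)
print(shares["nummod:gov"], shares["nummod"], shares["list"])
```

The counts add up to 5845, the number of NUM tokens, so every NUM node is accounted for exactly once.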

Parents of NUM nodes fall into 14 different categories (13 parts of speech, plus NUM nodes that are the root and therefore have no parent): NOUN (3547; 61% instances), NUM (539; 9% instances), VERB (407; 7% instances), ADJ (288; 5% instances), PROPN (287; 5% instances), no parent, i.e. root (283; 5% instances), SYM (282; 5% instances), X (129; 2% instances), ADV (47; 1% instances), PRON (16; 0% instances), DET (9; 0% instances), PART (6; 0% instances), INTJ (3; 0% instances), ADP (2; 0% instances)

3746 (64%) NUM nodes are leaves.

1446 (25%) NUM nodes have one child.

310 (5%) NUM nodes have two children.

343 (6%) NUM nodes have three or more children.

The highest child degree of a NUM node is 12.
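The child-degree figures above follow from counting, for each NUM node, how many tokens name it as their head. The sketch below illustrates this on a single toy sentence; the function name `child_degree_buckets` and the sample are hypothetical, and real data would need the counting repeated per sentence.

```python
from collections import Counter

# Toy sentence: UPOS tags and 1-based HEAD indices (0 = root); invented data.
UPOS = ["NUM", "NOUN", "NUM", "ADV", "PUNCT"]
HEADS = [2, 0, 2, 3, 2]


def child_degree_buckets(upos_tags, heads, upos):
    """Bucket nodes with the given tag by number of children: 0, 1, 2, 3+."""
    children = Counter(heads)  # head index -> number of dependents
    buckets = Counter()
    for index, tag in enumerate(upos_tags, start=1):
        if tag != upos:
            continue
        degree = children[index]
        buckets["3+" if degree >= 3 else str(degree)] += 1
    return buckets


print(child_degree_buckets(UPOS, HEADS, "NUM"))

# The four buckets reported above cover all 5845 NUM tokens.
assert 3746 + 1446 + 310 + 343 == 5845
```

In the toy sentence the first NUM token is a leaf and the second has one child (the ADV token), so the buckets are one leaf and one single-child node.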

Children of NUM nodes are attached using 29 different relations: punct (1212; 35% instances), case (413; 12% instances), list (386; 11% instances), advmod (382; 11% instances), nmod (320; 9% instances), nsubj (111; 3% instances), parataxis (105; 3% instances), conj (100; 3% instances), dep (99; 3% instances), compound (97; 3% instances), cc (44; 1% instances), amod (30; 1% instances), flat (24; 1% instances), obl (22; 1% instances), det (17; 0% instances), cop (16; 0% instances), appos (9; 0% instances), nummod (9; 0% instances), orphan (7; 0% instances), advcl (6; 0% instances), discourse (5; 0% instances), mark (5; 0% instances), nummod:gov (4; 0% instances), iobj (3; 0% instances), acl (2; 0% instances), fixed (2; 0% instances), acl:relcl (1; 0% instances), dislocated (1; 0% instances), nsubj:pass (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: PUNCT (1212; 35% instances), NUM (539; 16% instances), ADP (389; 11% instances), ADV (333; 10% instances), NOUN (324; 9% instances), X (139; 4% instances), SYM (134; 4% instances), ADJ (92; 3% instances), PART (59; 2% instances), PROPN (59; 2% instances), CCONJ (43; 1% instances), VERB (30; 1% instances), PRON (25; 1% instances), DET (21; 1% instances), AUX (17; 0% instances), SCONJ (17; 0% instances)