
This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-RNC: POS Tags: NUM

There are 182 NUM lemmas (3%), 282 NUM types (2%) and 1598 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 10 in number of tokens.

The 10 most frequent NUM lemmas: два, 3, одинъ, 2, 4, 5, 10, 6, оба, полтора

The 10 most frequent NUM types: 3, два, 2, 4, один, две, 5, 10, 6, три

The 10 most frequent ambiguous lemmas: 3 (NUM 164, ADJ 7), одинъ (NUM 99, DET 3), 2 (NUM 76, ADJ 6), 4 (NUM 75, ADJ 6), 5 (NUM 48, ADJ 4), 10 (NUM 47, ADJ 7), 6 (NUM 43, ADJ 4), 8 (NUM 28, ADJ 5), 12 (NUM 25, ADJ 9), 20 (NUM 25, ADJ 9)

The 10 most frequent ambiguous types: 3 (NUM 157, ADJ 5, ADV 1, X 1), 2 (NUM 76, ADJ 6), 4 (NUM 75, ADJ 6), 5 (NUM 47, ADJ 4), 10 (NUM 45, ADJ 7, X 1), 6 (NUM 43, ADJ 4, X 1), 8 (NUM 28, ADJ 5, X 1), 12 (NUM 25, ADJ 9, X 1), 20 (NUM 24, ADJ 9), 9 (NUM 22, X 1)

Morphology

The form / lemma ratio of NUM is 1.549451 (the average of all parts of speech is 1.988362).
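
The ratio above is consistent with dividing the number of distinct NUM word forms (282 types) by the number of distinct NUM lemmas (182). The following is a minimal sketch of that calculation, not the script that generated this page: the function name `form_lemma_ratio` and the toy token list are hypothetical, and the sketch assumes forms are compared exactly as written (no case folding).

```python
def form_lemma_ratio(tokens, upos):
    """Distinct word forms divided by distinct lemmas for one UPOS tag.

    `tokens` is an iterable of (form, lemma, upos) triples.
    """
    forms = {form for form, lemma, tag in tokens if tag == upos}
    lemmas = {lemma for form, lemma, tag in tokens if tag == upos}
    return len(forms) / len(lemmas)


# Toy data for illustration only (not taken from the treebank).
toy_tokens = [
    ("два", "два", "NUM"),
    ("две", "два", "NUM"),
    ("один", "одинъ", "NUM"),
    ("рубли", "рубль", "NOUN"),
]
toy_ratio = form_lemma_ratio(toy_tokens, "NUM")  # 3 forms / 2 lemmas

# Cross-check against the counts reported on this page.
reported_ratio = round(282 / 182, 6)
print(toy_ratio, reported_ratio)
```

A ratio below the all-tags average (1.988362) means that a NUM lemma is attested in fewer distinct forms, on average, than a lemma of a typical part of speech in this treebank.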

The highest number of forms (17) was observed with the lemma “одинъ”: адин, адна, аднꙋ, один, одиного, одна, однем, одно, однова, одново, однои, одном, одною, одну, отнех, ъднои, ѡдинъ.

The second-highest number of forms (11) was observed with the lemma “два”: [д]ве, д[ва], дв[а], дв[е], два, две, двема, двома, дву, двух, двѣ.

The third-highest number of forms (8) was observed with the lemma “оба”: [о]беих, оба, обе, обеих, обоеꙗ, обу, обѣ, обѣих.

NUM occurs with 7 features: Case (1595; 100% instances), NumForm (1342; 84% instances), NumType (1206; 75% instances), Gender (558; 35% instances), Number (140; 9% instances), Degree (5; 0% instances), Animacy (2; 0% instances)

NUM occurs with 22 feature-value pairs: Animacy=Anim, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Degree=Pos, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Cyril, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Sets, Number=Dual, Number=Plur, Number=Sing

NUM occurs with 105 feature combinations. The most frequent feature combination is Case=Nom|NumForm=Digit|NumType=Card (641 tokens). Examples: 3, 4, 2, 10, 6, 5, 12, 35, 9, 20
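
The feature statistics above can be reproduced in outline by splitting the FEATS column of each NUM token on `|` and `=`. The sketch below is illustrative only: `parse_feats`, `percent` and the toy FEATS strings are hypothetical names and data, and the rounding to whole percentages is an assumption that happens to match the figures reported here.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; "_" means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def percent(count, total):
    """Share of `count` in `total`, rounded to a whole percentage."""
    return round(100 * count / total)


# Toy FEATS values for illustration only (not a sample of the treebank).
toy_feats = [
    "Case=Nom|NumForm=Digit|NumType=Card",
    "Case=Gen|NumForm=Word|NumType=Card",
    "Case=Acc|Gender=Fem",
]

feature_counts = Counter()      # how many tokens carry each feature
combination_counts = Counter()  # how many tokens carry each full combination
for feats in toy_feats:
    feature_counts.update(parse_feats(feats).keys())
    combination_counts[feats] += 1

# Cross-check two of the reported shares: NumForm and NumType out of 1598 tokens.
numform_share = percent(1342, 1598)
numtype_share = percent(1206, 1598)
print(feature_counts, numform_share, numtype_share)
```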

Relations

NUM nodes are attached to their parents using 16 different relations: nummod:gov (1255; 79% instances), nummod (216; 14% instances), compound (40; 3% instances), appos (15; 1% instances), obl (13; 1% instances), root (12; 1% instances), conj (10; 1% instances), obj (9; 1% instances), nsubj:pass (7; 0% instances), nsubj (6; 0% instances), amod (5; 0% instances), nmod (4; 0% instances), acl (3; 0% instances), dep (1; 0% instances), flat (1; 0% instances), parataxis (1; 0% instances)

Parents of NUM nodes belong to 9 different parts of speech: NOUN (1483; 93% instances), NUM (38; 2% instances), VERB (31; 2% instances), ADJ (20; 1% instances), [no tag: artificial root node] (12; 1% instances), PRON (11; 1% instances), ADV (1; 0% instances), DET (1; 0% instances), PROPN (1; 0% instances)

1477 (92%) NUM nodes are leaves.

91 (6%) NUM nodes have one child.

19 (1%) NUM nodes have two children.

11 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
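
The four child-count figures above partition all 1598 NUM tokens (1477 + 91 + 19 + 11). As a rough sketch of how such a distribution could be computed from CoNLL-U data, the code below counts, for every NUM node, how many other nodes name it as HEAD. The function `child_degree_histogram` and the toy sentences are hypothetical; the sketch assumes the standard ten-column CoNLL-U layout and skips multiword-token and empty-node lines.

```python
from collections import Counter


def child_degree_histogram(conllu_text, upos="NUM"):
    """Map number of children -> number of nodes with the given UPOS tag."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep ordinary tokens only (IDs like "1-2" or "1.1" are skipped).
        rows = [r for r in rows if r[0].isdigit()]
        children_of = Counter(r[6] for r in rows)  # column 7 is HEAD
        for r in rows:
            if r[3] == upos:                       # column 4 is UPOS
                hist[children_of[r[0]]] += 1
    return hist


# Two toy sentences for illustration only (not taken from the treebank).
toy_sentences = [
    [("1", "два", "два", "NUM", "_", "_", "2", "nummod:gov", "_", "_"),
     ("2", "рубля", "рубль", "NOUN", "_", "_", "0", "root", "_", "_")],
    [("1", "три", "три", "NUM", "_", "_", "0", "root", "_", "_"),
     ("2", "и", "и", "CCONJ", "_", "_", "3", "cc", "_", "_"),
     ("3", "четыре", "четыре", "NUM", "_", "_", "1", "conj", "_", "_")],
]
toy_conllu = "\n\n".join(
    "\n".join("\t".join(row) for row in sent) for sent in toy_sentences
)
toy_hist = child_degree_histogram(toy_conllu)

# The reported degree classes should add up to the NUM token count.
reported_total = 1477 + 91 + 19 + 11
print(dict(toy_hist), reported_total)
```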

Children of NUM nodes are attached using 20 different relations: punct (39; 23% instances), compound (35; 20% instances), case (29; 17% instances), nmod (14; 8% instances), advmod (10; 6% instances), nsubj (9; 5% instances), cc (6; 3% instances), conj (6; 3% instances), obl (4; 2% instances), parataxis (3; 2% instances), amod (2; 1% instances), appos (2; 1% instances), cop (2; 1% instances), det (2; 1% instances), iobj (2; 1% instances), list (2; 1% instances), nummod:gov (2; 1% instances), acl:relcl (1; 1% instances), dep (1; 1% instances), flat (1; 1% instances)

Children of NUM nodes belong to 14 different parts of speech: PUNCT (39; 23% instances), NUM (38; 22% instances), ADP (29; 17% instances), NOUN (29; 17% instances), PART (8; 5% instances), ADJ (6; 3% instances), CCONJ (6; 3% instances), PRON (4; 2% instances), DET (3; 2% instances), PROPN (3; 2% instances), ADV (2; 1% instances), AUX (2; 1% instances), VERB (2; 1% instances), X (1; 1% instances)