
This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-RNC: POS Tags: NUM

There are 390 NUM lemmas (3%), 678 NUM types (2%) and 3813 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: два, 3, 2, 4, одинъ, трие, 5, 10, 6, четыре

The 10 most frequent NUM types: 3, 2, 4, два, 5, три, 10, 6, две, один

The 10 most frequent ambiguous lemmas: 3 (NUM 254, ADJ 22, ADV 5), 2 (NUM 205, ADJ 22), 4 (NUM 159, ADJ 21), одинъ (NUM 149, DET 7, ADJ 3), 5 (NUM 120, ADJ 19), 10 (NUM 98, ADJ 17), 6 (NUM 94, ADJ 12), 7 (NUM 74, ADJ 14), 8 (NUM 70, ADJ 19), 9 (NUM 61, ADJ 8)

The 10 most frequent ambiguous types: 3 (NUM 240, ADJ 22, ADV 6), 2 (NUM 201, ADJ 22), 4 (NUM 158, ADJ 21), 5 (NUM 115, ADJ 18), 10 (NUM 92, ADJ 16), 6 (NUM 91, ADJ 11), 7 (NUM 71, ADJ 14), 8 (NUM 69, ADJ 18), 9 (NUM 59, ADJ 8), 12 (NUM 55, ADJ 15)

Morphology

The form / lemma ratio of NUM is 1.738462 (the average of all parts of speech is 2.481645).
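The reported ratio agrees with the NUM type count divided by the NUM lemma count given above (678 / 390). The sketch below reproduces that arithmetic and shows one way such a ratio could be computed from token rows. The function name `form_lemma_ratio` and the `(form, lemma, upos)` tuple layout are assumptions made for illustration, and it is not confirmed that the script behind this page computes the value in exactly this way.

```python
def form_lemma_ratio(rows, upos):
    """Distinct forms divided by distinct lemmas for one UPOS tag.

    rows: iterable of (form, lemma, upos) tuples (hypothetical layout).
    """
    rows = list(rows)
    forms = {form for form, _lemma, tag in rows if tag == upos}
    lemmas = {lemma for _form, lemma, tag in rows if tag == upos}
    return len(forms) / len(lemmas)


# Toy rows, not taken from the treebank files.
toy_rows = [
    ("два", "два", "NUM"),
    ("две", "два", "NUM"),
    ("один", "одинъ", "NUM"),
    ("домъ", "домъ", "NOUN"),
]
toy_ratio = form_lemma_ratio(toy_rows, "NUM")  # 3 forms / 2 lemmas

# Check against the counts reported on this page.
page_ratio = round(678 / 390, 6)
print(page_ratio)  # 1.738462
```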

The 1st highest number of forms (27) was observed with the lemma “одинъ”: адин, адинъ, адна, один, одиного, одиною, одинъ, одна, однем, однеми, одно, однова, одново, одного, одное, однои, одном, одномъ, одною, одну, одъну, отнех, ъднои, ѡдин, ѡдиног[о], ѡдинъ, ѡдинѡг[о].

The 2nd highest number of forms (21) was observed with the lemma “два”: [д]ве, д[ва], дв[а], дв[е], два, две, двема, дви, двома, двою, дву, двум, двумъ, двумя, двух, двухъ, двѣ, двѣма, двѣмъ, двѣмя, двꙋ.

The 3rd highest number of forms (17) was observed with the lemma “оба”: [о]беих, Обоево, оба, обе, обеим, обеих, обоеꙗ, обоим, обоимъ, обоих, обоихъ, обою, обу, обѣ, обѣих, обѣихъ, ѡбѣ.

NUM occurs with 8 features: NumForm (3813; 100% instances), NumType (3813; 100% instances), Case (3808; 100% instances), Gender (1512; 40% instances), Number (267; 7% instances), Animacy (21; 1% instances), Degree (5; 0% instances), Variant (1; 0% instances)

NUM occurs with 22 feature-value pairs: Animacy=Anim, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Cyril, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Sets, Number=Plur, Number=Sing, Variant=Short

NUM occurs with 115 feature combinations. The most frequent feature combination is Case=Nom|NumForm=Digit|NumType=Card (1392 tokens). Examples: 3, 5, 10, 6, 8, 4, 12, 9, 7, 2
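Feature statistics like those above can be tallied from the FEATS column of a CoNLL-U file, where features are written as `Name=Value` pairs joined by `|` and an empty feature set is written as `_`. The following is a minimal sketch under that assumption; the helper names `parse_feats` and `feature_counts` are hypothetical, and the sample strings are illustrative rather than copied from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_counts(feats_column):
    """Count how many tokens carry each feature name."""
    counts = Counter()
    for feats in feats_column:
        counts.update(parse_feats(feats).keys())
    return counts


# Illustrative FEATS values for three NUM tokens.
sample = [
    "Case=Nom|NumForm=Digit|NumType=Card",
    "Case=Gen|Gender=Masc|NumForm=Word|NumType=Card",
    "_",
]
per_feature = feature_counts(sample)
per_combination = Counter(sample)  # counts whole feature combinations
```

Counting the unsplit strings, as in `per_combination`, gives the feature-combination tally, while `per_feature` gives the per-feature tally.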

Relations

NUM nodes are attached to their parents using 26 different relations: nummod:gov (2553; 67% instances), nummod (580; 15% instances), conj (227; 6% instances), root (125; 3% instances), compound (85; 2% instances), nsubj (60; 2% instances), appos (44; 1% instances), obl (29; 1% instances), obj (22; 1% instances), nmod (17; 0% instances), parataxis (16; 0% instances), nsubj:pass (8; 0% instances), obl:float (7; 0% instances), advcl (6; 0% instances), flat (6; 0% instances), list (6; 0% instances), orphan (5; 0% instances), acl (3; 0% instances), amod (3; 0% instances), iobj (3; 0% instances), dep (2; 0% instances), xcomp (2; 0% instances), csubj (1; 0% instances), dislocated (1; 0% instances), fixed (1; 0% instances), obl:depict (1; 0% instances)
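The percentages in this list appear to be rounded to whole numbers, which is why relations with non-zero counts can show "0% instances". The sketch below checks a few of the figures above against the total of 3813 NUM tokens. The helper `percent` is a hypothetical name, and rounding to the nearest integer is an assumption that happens to match the values shown.

```python
NUM_TOKENS = 3813  # total NUM tokens reported on this page

# A few relation counts copied from the list above.
relation_counts = {
    "nummod:gov": 2553,
    "nummod": 580,
    "conj": 227,
    "root": 125,
    "nmod": 17,
}


def percent(count, total):
    """Share of total as a whole-number percentage."""
    return round(100 * count / total)


shares = {rel: percent(n, NUM_TOKENS) for rel, n in relation_counts.items()}
for rel, share in shares.items():
    print(f"{rel}: {share}%")
```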

Parents of NUM nodes belong to 12 different categories (11 parts of speech plus the artificial root, which has no part of speech): NOUN (3192; 84% instances), NUM (302; 8% instances), root (125; 3% instances), VERB (98; 3% instances), ADJ (50; 1% instances), PRON (18; 0% instances), ADV (9; 0% instances), PROPN (8; 0% instances), X (5; 0% instances), DET (4; 0% instances), ADP (1; 0% instances), PART (1; 0% instances)

3334 (87%) NUM nodes are leaves.

274 (7%) NUM nodes have one child.

92 (2%) NUM nodes have two children.

113 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 12.
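The leaf and child-count figures above describe how many dependents each NUM node has. A minimal sketch of how such a distribution could be derived from the ID and HEAD columns of one sentence follows. The function name `child_degree_histogram` and the `(id, head, upos)` tuple layout are assumptions for illustration, and the toy sentence is not taken from the treebank.

```python
from collections import Counter


def child_degree_histogram(sentence, upos="NUM"):
    """Map number of children -> number of nodes with the given UPOS.

    sentence: list of (id, head, upos) tuples with integer ids,
    where head 0 marks the root (hypothetical layout).
    """
    children = Counter(head for _id, head, _tag in sentence)
    return Counter(children[node_id] for node_id, _head, tag in sentence if tag == upos)


# Toy sentence: nodes 1 and 5 are NUM leaves, node 4 is a NUM with two children.
toy_sentence = [
    (1, 2, "NUM"),
    (2, 0, "NOUN"),
    (3, 4, "ADP"),
    (4, 2, "NUM"),
    (5, 4, "NUM"),
]
histogram = child_degree_histogram(toy_sentence)

# The four groups reported above add up to the NUM token total.
groups_total = 3334 + 274 + 92 + 113
```

Summing such histograms over all sentences would give treebank-wide counts comparable to the leaf, one-child, two-children and three-or-more groups listed above.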

Children of NUM nodes are attached using 28 different relations: conj (208; 23% instances), punct (155; 17% instances), case (115; 13% instances), cc (83; 9% instances), compound (66; 7% instances), advmod (63; 7% instances), nsubj (54; 6% instances), nmod (45; 5% instances), list (23; 3% instances), obl (15; 2% instances), nummod:gov (13; 1% instances), cop (9; 1% instances), mark (8; 1% instances), appos (7; 1% instances), flat (6; 1% instances), nummod (5; 1% instances), parataxis (5; 1% instances), orphan (4; 0% instances), vocative (4; 0% instances), advcl (3; 0% instances), iobj (3; 0% instances), obl:pronmod (3; 0% instances), amod (2; 0% instances), det (2; 0% instances), parataxis:discourse (2; 0% instances), acl:relcl (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: NUM (302; 33% instances), PUNCT (155; 17% instances), ADP (115; 13% instances), NOUN (103; 11% instances), CCONJ (81; 9% instances), ADV (53; 6% instances), PART (22; 2% instances), ADJ (15; 2% instances), VERB (14; 2% instances), X (11; 1% instances), AUX (10; 1% instances), PRON (9; 1% instances), SCONJ (7; 1% instances), DET (6; 1% instances), PROPN (3; 0% instances)