Treebank Statistics: UD_Old_East_Slavic-RNC: POS Tags: NUM
There are 286 NUM lemmas (3%), 460 NUM types (2%) and 2537 NUM tokens (3%).
Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 10 in number of tokens.
The 10 most frequent NUM lemmas: два, 3, 2, одинъ, 4, трие, 5, 10, 6, 8
The 10 most frequent NUM types: 3, 2, два, 4, две, 5, один, 10, 6, три
The 10 most frequent ambiguous lemmas: 3 (NUM 201, ADJ 10, ADV 5), 2 (NUM 126, ADJ 9), одинъ (NUM 118, DET 6, ADJ 2), 4 (NUM 103, ADJ 13), 5 (NUM 71, ADJ 11), 10 (NUM 70, ADJ 9), 6 (NUM 66, ADJ 4), 8 (NUM 54, ADJ 8), 7 (NUM 51, ADJ 6), 9 (NUM 48, ADJ 3)
The 10 most frequent ambiguous types: 3 (NUM 194, ADJ 10, ADV 6), 2 (NUM 124, ADJ 9), 4 (NUM 102, ADJ 13), 5 (NUM 70, ADJ 11), 10 (NUM 68, ADJ 9), 6 (NUM 66, ADJ 4), 8 (NUM 54, ADJ 8), 7 (NUM 50, ADJ 6), 9 (NUM 48, ADJ 3), 12 (NUM 37, ADJ 15)
Morphology
The form / lemma ratio of NUM is 1.608392 (the average of all parts of speech is 2.250521).
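The reported ratio is consistent with the counts given earlier: 460 types / 286 lemmas ≈ 1.608392. The following is a minimal sketch of how such a ratio could be computed from CoNLL-U text. It assumes that a "type" is a distinct, case-sensitive word form carrying the tag, which may not be exactly how the statistics were generated. The function name `form_lemma_ratio` and the toy rows are hypothetical and are not taken from the treebank.

```python
def form_lemma_ratio(conllu_text, upos="NUM"):
    """Return (n_types, n_lemmas, ratio) for one UPOS tag in CoNLL-U text."""
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines: 10 columns and an integer ID.
        # This skips comments, blank lines, multiword ranges (1-2)
        # and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            forms.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    n_types, n_lemmas = len(forms), len(lemmas)
    ratio = n_types / n_lemmas if n_lemmas else 0.0
    return n_types, n_lemmas, ratio


# Invented toy rows (ID, FORM, LEMMA, UPOS); the other six columns are "_".
ROWS = [
    ("1", "два", "два", "NUM"),
    ("2", "рубля", "рубль", "NOUN"),
    ("3", "две", "два", "NUM"),
    ("4", "гривны", "гривна", "NOUN"),
    ("5", "3", "3", "NUM"),
]
SAMPLE = "\n".join(
    "\t".join([i, form, lemma, upos] + ["_"] * 6)
    for i, form, lemma, upos in ROWS
)

print(form_lemma_ratio(SAMPLE))  # (3, 2, 1.5)
```

In the toy sample, the lemma "два" has two forms and the lemma "3" has one, giving 3 types over 2 lemmas.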
The 1st highest number of forms (22) was observed with the lemma “одинъ”: адин, адинъ, адна, один, одиного, одинъ, одна, однем, одно, однова, одново, одного, однои, одном, одномъ, одною, одну, одъну, отнех, ъднои, ѡдинъ, ѡдинѡг[о].
The 2nd highest number of forms (15) was observed with the lemma “два”: [д]ве, д[ва], дв[а], дв[е], два, две, двема, дви, двома, дву, двумъ, двух, двухъ, двѣ, двѣмя.
The 3rd highest number of forms (15) was observed with the lemma “оба”: [о]беих, Обоево, оба, обе, обеим, обеих, обоеꙗ, обоим, обоимъ, обоих, обоихъ, обу, обѣ, обѣих, обѣихъ.
NUM occurs with 7 features: NumForm (2537; 100% instances), NumType (2537; 100% instances), Case (2533; 100% instances), Gender (906; 36% instances), Number (189; 7% instances), Animacy (11; 0% instances), Degree (4; 0% instances)
NUM occurs with 21 feature-value pairs: Animacy=Anim, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Cyril, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Sets, Number=Plur, Number=Sing
NUM occurs with 91 feature combinations.
The most frequent feature combination is Case=Nom|NumForm=Digit|NumType=Card (1091 tokens).
Examples: 3, 10, 6, 5, 4, 8, 2, 9, 12, 7
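The three counts in this section (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of a CoNLL-U file. The sketch below illustrates one way to do so. The function name `feature_stats` and the toy rows are hypothetical, and counting an empty FEATS value (`_`) as its own combination is an assumption.

```python
from collections import Counter


def feature_stats(conllu_text, upos="NUM"):
    """Count feature names, feature=value pairs and full FEATS strings."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Regular token lines only (10 columns, integer ID) with the wanted tag.
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        feats = cols[5]          # FEATS column, e.g. "Case=Nom|NumType=Card"
        combos[feats] += 1       # one full combination per token
        if feats == "_":
            continue             # token without any features
        for pair in feats.split("|"):
            name = pair.partition("=")[0]
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Invented toy rows (ID, FORM, LEMMA, UPOS, FEATS); not treebank data.
ROWS = [
    ("1", "3", "3", "NUM", "Case=Nom|NumForm=Digit|NumType=Card"),
    ("2", "10", "10", "NUM", "Case=Nom|NumForm=Digit|NumType=Card"),
    ("3", "два", "два", "NUM", "Case=Acc|Gender=Masc|NumForm=Word|NumType=Card"),
    ("4", "рубля", "рубль", "NOUN", "Case=Gen|Gender=Masc|Number=Sing"),
]
SAMPLE = "\n".join(
    "\t".join([i, form, lemma, upos, "_", feats, "_", "_", "_", "_"])
    for i, form, lemma, upos, feats in ROWS
)

features, pairs, combos = feature_stats(SAMPLE)
print(len(features), len(pairs), len(combos))  # 4 6 2
print(combos.most_common(1))
```

For the toy sample, the most frequent combination is `Case=Nom|NumForm=Digit|NumType=Card` with 2 tokens.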
Relations
NUM nodes are attached to their parents using 23 different relations: nummod:gov (1675; 66% instances), nummod (333; 13% instances), conj (166; 7% instances), root (124; 5% instances), compound (66; 3% instances), nsubj (52; 2% instances), appos (22; 1% instances), obl (21; 1% instances), obj (16; 1% instances), parataxis (15; 1% instances), nmod (9; 0% instances), list (6; 0% instances), nsubj:pass (6; 0% instances), flat (5; 0% instances), orphan (4; 0% instances), acl (3; 0% instances), advcl (3; 0% instances), amod (3; 0% instances), obl:float (3; 0% instances), xcomp (2; 0% instances), dep (1; 0% instances), dislocated (1; 0% instances), iobj (1; 0% instances)
Parents of NUM nodes belong to 12 different parts of speech: NOUN (2034; 80% instances), NUM (243; 10% instances), [no parent: root node] (124; 5% instances), VERB (67; 3% instances), ADJ (32; 1% instances), PRON (15; 1% instances), ADV (9; 0% instances), X (5; 0% instances), DET (3; 0% instances), PROPN (3; 0% instances), ADP (1; 0% instances), PART (1; 0% instances)
2206 (87%) NUM nodes are leaves.
197 (8%) NUM nodes have one child.
60 (2%) NUM nodes have two children.
74 (3%) NUM nodes have three or more children.
The highest child degree of a NUM node is 12.
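The four counts above sum to the 2537 NUM tokens (2206 + 197 + 60 + 74). A child-degree distribution of this kind can be sketched by counting, within each sentence, how many tokens name a given NUM node in their HEAD column. The function name `child_degree_histogram` and the two toy sentences below are hypothetical; the trees are invented for illustration and are not taken from the treebank.

```python
from collections import Counter


def child_degree_histogram(conllu_text, upos="NUM"):
    """Map number of children -> number of nodes with that UPOS tag."""
    hist = Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()]
        # Regular token lines only: 10 columns and an integer ID.
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # HEAD is column 7; count how often each ID is used as a head.
        n_children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[n_children[r[0]]] += 1
    return hist


def make_sentence(tokens):
    """Build CoNLL-U lines from (ID, FORM, LEMMA, UPOS, HEAD, DEPREL) tuples."""
    return "\n".join(
        "\t".join([i, form, lemma, upos, "_", "_", head, deprel, "_", "_"])
        for i, form, lemma, upos, head, deprel in tokens
    )


# Two invented sentences: "два рубля" and "3 и 4".
SAMPLE = "\n\n".join([
    make_sentence([
        ("1", "два", "два", "NUM", "2", "nummod:gov"),
        ("2", "рубля", "рубль", "NOUN", "0", "root"),
    ]),
    make_sentence([
        ("1", "3", "3", "NUM", "0", "root"),
        ("2", "и", "и", "CCONJ", "3", "cc"),
        ("3", "4", "4", "NUM", "1", "conj"),
    ]),
])

print(dict(child_degree_histogram(SAMPLE)))  # {0: 1, 1: 2}
```

In the toy data, "два" is a leaf, while "3" and "4" each have one child (the conjunct and the coordinating conjunction, respectively).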
Children of NUM nodes are attached using 28 different relations: conj (168; 27% instances), punct (112; 18% instances), compound (58; 9% instances), case (54; 9% instances), nsubj (49; 8% instances), advmod (42; 7% instances), nmod (29; 5% instances), cc (23; 4% instances), list (22; 4% instances), obl (11; 2% instances), cop (6; 1% instances), flat (5; 1% instances), mark (5; 1% instances), orphan (4; 1% instances), parataxis (4; 1% instances), advcl (3; 0% instances), iobj (3; 0% instances), vocative (3; 0% instances), amod (2; 0% instances), appos (2; 0% instances), det (2; 0% instances), nummod:gov (2; 0% instances), obl:pronmod (2; 0% instances), parataxis:discourse (2; 0% instances), acl:relcl (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances), nummod (1; 0% instances)
Children of NUM nodes belong to 15 different parts of speech: NUM (243; 39% instances), PUNCT (112; 18% instances), NOUN (72; 12% instances), ADP (54; 9% instances), ADV (37; 6% instances), CCONJ (23; 4% instances), ADJ (14; 2% instances), VERB (12; 2% instances), PART (11; 2% instances), X (11; 2% instances), PRON (9; 1% instances), AUX (7; 1% instances), DET (5; 1% instances), SCONJ (5; 1% instances), PROPN (2; 0% instances)