Treebank Statistics: UD_Russian-SynTagRus: POS Tags: NUM
There are 1018 NUM lemmas (2%), 1114 NUM types (1%) and 17909 NUM tokens (1%).
Out of 17 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 13 in number of tokens.
The 10 most frequent NUM lemmas: один, два, несколько, три, 1, 10, четыре, 20, 2, пять
The 10 most frequent NUM types: один, несколько, два, три, одной, 1, 10, двух, две, 20
The 10 most frequent ambiguous lemmas: один (NUM 2706, DET 984, NOUN 3), несколько (NUM 1038, ADV 111), 1 (NUM 417, ADJ 26), 10 (NUM 407, ADJ 20), 20 (NUM 323, ADJ 15), 2 (NUM 309, ADJ 14), много (ADV 724, NUM 302), 15 (NUM 280, ADJ 17), 5 (NUM 257, ADJ 12), 3 (NUM 242, ADJ 11)
The 10 most frequent ambiguous types: один (NUM 697, DET 179), несколько (NUM 744, ADV 103), одной (NUM 409, DET 139), 1 (NUM 417, ADJ 26), 10 (NUM 407, ADJ 21), 20 (NUM 322, ADJ 15), 2 (NUM 309, ADJ 14), одного (NUM 278, DET 91), одна (NUM 242, DET 94), 15 (NUM 276, ADJ 17)
Morphology
The form / lemma ratio of NUM is 1.094303 (the average of all parts of speech is 2.668075).
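The reported ratio is consistent with dividing the number of NUM types (1114) by the number of NUM lemmas (1018), both taken from the counts at the top of this page. The sketch below reproduces that arithmetic; the function name `form_lemma_ratio` is an assumption made for illustration, not part of any UD tool.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms (types) per lemma."""
    return n_types / n_lemmas


# Counts for NUM as reported on this page.
ratio = form_lemma_ratio(n_types=1114, n_lemmas=1018)
ratio_text = f"{ratio:.6f}"
print(ratio_text)
```

A value close to 1 means most NUM lemmas occur in a single form, which fits the large share of digit-written numerals.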
The 1st highest number of forms (12) was observed with the lemma “один”: один, одна, одни, одним, одними, одно, одного, одной, одном, одному, одною, одну.
The 2nd highest number of forms (9) was observed with the lemma “оба”: оба, обе, обеим, обеими, обеих, обоего, обоим, обоими, обоих.
The 3rd highest number of forms (6) was observed with the lemma “три”: трем, тремя, трех, три, трём, трёх.
NUM occurs with 8 features: NumForm (17909; 100% instances), NumType (17909; 100% instances), Case (9372; 52% instances), Gender (4591; 26% instances), Number (2706; 15% instances), Animacy (2025; 11% instances), Degree (103; 1% instances), ExtPos (22; 0% instances)
NUM occurs with 24 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, ExtPos=ADV, ExtPos=NUM, ExtPos=PRON, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Sets, Number=Plur, Number=Sing
NUM occurs with 90 feature combinations.
The most frequent feature combination is NumForm=Digit|NumType=Card (8427 tokens).
Examples: 1, 10, 20, 2, 15, 5, 3, 30, 4, 100
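Feature combinations such as NumForm=Digit|NumType=Card follow the CoNLL-U FEATS convention: Name=Value pairs joined by "|", with "_" standing for an empty feature set. The sketch below shows one way the per-feature and per-combination counts above could be tallied. The helper `parse_feats` and the list `sample_feats` are hypothetical; the sample is hand-made for illustration and is not drawn from the treebank.

```python
from collections import Counter


def parse_feats(feats: str) -> dict[str, str]:
    """Split a CoNLL-U FEATS string into a {feature: value} mapping."""
    if feats == "_":  # "_" marks a token without features
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS strings for a handful of NUM tokens (illustrative only).
sample_feats = [
    "NumForm=Digit|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "Case=Nom|NumForm=Word|NumType=Card",
    "Case=Gen|NumForm=Word|NumType=Card",
]

# How often each complete combination occurs.
combo_counts = Counter(sample_feats)

# How many tokens carry each feature, regardless of its value.
feature_counts = Counter(
    name for feats in sample_feats for name in parse_feats(feats)
)

print(combo_counts.most_common(1))
print(feature_counts)
```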
Relations
NUM nodes are attached to their parents using 29 different relations: nummod (8974; 50% instances), nummod:gov (3457; 19% instances), nmod (1278; 7% instances), obl (1199; 7% instances), appos (708; 4% instances), nsubj (569; 3% instances), conj (402; 2% instances), root (339; 2% instances), parataxis (228; 1% instances), compound (216; 1% instances), obj (101; 1% instances), xcomp (80; 0% instances), nsubj:pass (70; 0% instances), obl:pronmod (65; 0% instances), advcl (39; 0% instances), ccomp (31; 0% instances), orphan (31; 0% instances), list (22; 0% instances), nummod:entity (22; 0% instances), iobj (18; 0% instances), amod (16; 0% instances), acl (12; 0% instances), acl:relcl (11; 0% instances), fixed (9; 0% instances), flat (5; 0% instances), advmod (3; 0% instances), csubj (2; 0% instances), flat:name (1; 0% instances), obl:tmod (1; 0% instances)
Parents of NUM nodes belong to 13 different parts of speech: NOUN (12367; 69% instances), VERB (2012; 11% instances), NUM (1232; 7% instances), SYM (1064; 6% instances), PROPN (363; 2% instances), [no parent; root attachment] (339; 2% instances), ADJ (277; 2% instances), PRON (129; 1% instances), ADV (69; 0% instances), X (30; 0% instances), DET (23; 0% instances), PART (3; 0% instances), ADP (1; 0% instances)
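The relation and parent-POS counts above can be derived from the HEAD and DEPREL columns of a CoNLL-U file. The sketch below handles a single sentence only and uses a hand-made sample. The function name `num_attachment_stats`, the `SAMPLE` sentence, and the "ROOT" label for tokens whose head is 0 are all assumptions made for illustration; this is not the script that generated these statistics.

```python
from collections import Counter

# Hypothetical one-sentence CoNLL-U fragment (10 tab-separated columns:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
SAMPLE = "\n".join([
    "# text = Он купил три книги .",
    "1\tОн\tон\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tкупил\tкупить\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tтри\tтри\tNUM\t_\t_\t4\tnummod:gov\t_\t_",
    "4\tкниги\tкнига\tNOUN\t_\t_\t2\tobj\t_\t_",
    "5\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])


def num_attachment_stats(sentence: str) -> tuple[Counter, Counter]:
    """Count DEPREL and parent UPOS for NUM tokens in one sentence."""
    rows = [
        line.split("\t")
        for line in sentence.splitlines()
        if line and not line.startswith("#")  # skip comment lines
    ]
    # Keep plain word lines; drop multiword ranges (1-2) and empty nodes (1.1).
    rows = [row for row in rows if row[0].isdigit()]
    upos_by_id = {row[0]: row[3] for row in rows}

    relations: Counter = Counter()
    parent_pos: Counter = Counter()
    for row in rows:
        if row[3] != "NUM":
            continue
        relations[row[7]] += 1
        # HEAD "0" has no token entry, so it falls back to the "ROOT" label.
        parent_pos[upos_by_id.get(row[6], "ROOT")] += 1
    return relations, parent_pos


relations, parent_pos = num_attachment_stats(SAMPLE)
print(relations, parent_pos)
```

For a whole treebank, the same function could be applied to each blank-line-separated sentence and the counters summed.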
10868 (61%) NUM nodes are leaves.
4715 (26%) NUM nodes have one child.
1408 (8%) NUM nodes have two children.
918 (5%) NUM nodes have three or more children.
The highest child degree of a NUM node is 13.
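As a quick consistency check, the four child-degree counts above should add up to the 17909 NUM tokens, and the rounded shares should match the reported percentages. The sketch below does that arithmetic; the dictionary keys are labels chosen for illustration.

```python
# Child-degree counts for NUM nodes, as reported above.
degree_counts = {
    "leaf": 10868,
    "one child": 4715,
    "two children": 1408,
    "three or more children": 918,
}

total = sum(degree_counts.values())

# Share of each degree class, rounded to whole percent.
shares = {
    label: round(100 * count / total)
    for label, count in degree_counts.items()
}

print(total)
print(shares)
```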
Children of NUM nodes are attached using 32 different relations: punct (2623; 24% instances), nmod (2568; 23% instances), advmod (1599; 14% instances), case (959; 9% instances), flat (737; 7% instances), nsubj (387; 4% instances), conj (380; 3% instances), cc (266; 2% instances), obl (231; 2% instances), amod (226; 2% instances), parataxis (200; 2% instances), compound (178; 2% instances), det (150; 1% instances), cop (113; 1% instances), mark (100; 1% instances), appos (71; 1% instances), fixed (44; 0% instances), orphan (44; 0% instances), acl (43; 0% instances), advcl (43; 0% instances), acl:relcl (24; 0% instances), iobj (16; 0% instances), flat:foreign (11; 0% instances), list (11; 0% instances), csubj (6; 0% instances), obl:pronmod (6; 0% instances), expl (3; 0% instances), nummod (3; 0% instances), flat:name (2; 0% instances), parataxis:discourse (2; 0% instances), discourse (1; 0% instances), obj (1; 0% instances)
Children of NUM nodes belong to 17 different parts of speech: NOUN (2795; 25% instances), PUNCT (2623; 24% instances), NUM (1232; 11% instances), ADP (972; 9% instances), ADV (842; 8% instances), PART (820; 7% instances), ADJ (411; 4% instances), CCONJ (262; 2% instances), PRON (254; 2% instances), VERB (231; 2% instances), DET (207; 2% instances), SCONJ (123; 1% instances), AUX (113; 1% instances), PROPN (81; 1% instances), SYM (63; 1% instances), X (18; 0% instances), INTJ (1; 0% instances)