Treebank Statistics: UD_Russian-Taiga: POS Tags: NUM
There are 376 NUM lemmas (2%), 440 NUM types (1%) and 3085 NUM tokens (2%).
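The three counts above can be reproduced from a CoNLL-U file with a short script. The following is a minimal sketch, not the tool that generated this page: the function name `count_upos` and the inline sample are hypothetical, and it assumes that a "type" is a distinct word form exactly as written (no case folding), which may differ from the convention used for the figures above.

```python
def count_upos(conllu_text, upos):
    """Return (lemmas, types, tokens) counts for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)    # assumption: types are case-sensitive forms
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


# Hypothetical two-numeral sample sentence, for illustration only.
SAMPLE = "\n".join([
    "# text = два дома и две книги",
    "1\tдва\tдва\tNUM\tCase=Nom|Gender=Masc|NumType=Card\t_\t2\tnummod:gov\t_\t_",
    "2\tдома\tдом\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\tи\tи\tCCONJ\t_\t_\t5\tcc\t_\t_",
    "4\tдве\tдва\tNUM\tCase=Nom|Gender=Fem|NumType=Card\t_\t5\tnummod:gov\t_\t_",
    "5\tкниги\tкнига\tNOUN\t_\t_\t2\tconj\t_\t_",
])

num_lemmas, num_types, num_tokens = count_upos(SAMPLE, "NUM")
print(num_lemmas, num_types, num_tokens)
```

In the sample, the two forms два and две share one lemma, so the lemma count is lower than the type count, which is the same effect that separates 376 lemmas from 440 types in the treebank.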
Out of 17 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 13 in number of tokens.
The 10 most frequent NUM lemmas: много, 2, один, 3, два, 1, несколько, 5, 4, сколько
The 10 most frequent NUM types: много, 2, 3, 1, 5, несколько, 4, два, сколько, один
The 10 most frequent ambiguous lemmas: много (NUM 285, ADV 17), 2 (NUM 221, ADJ 15), один (DET 236, NUM 194), 3 (NUM 161, ADJ 10), 1 (NUM 145, ADJ 14), несколько (NUM 121, ADV 5), 5 (NUM 119, ADJ 7), 4 (NUM 104, ADJ 7), сколько (NUM 80, ADV 3, CCONJ 2), мало (NUM 65, ADV 20)
The 10 most frequent ambiguous types: много (NUM 181, ADV 14, X 2), 2 (NUM 217, ADJ 16), 3 (NUM 155, ADJ 10), 1 (NUM 145, ADJ 14), 5 (NUM 118, ADJ 7), несколько (NUM 89, ADV 4), 4 (NUM 104, ADJ 7), сколько (NUM 45, CCONJ 2, ADV 1), один (NUM 64, DET 57), 10 (NUM 64, ADJ 4)
Examples for the ambiguous type сколько:
- NUM 45: Хоть понимаешь , сколько раз он нам отвечал ?
- CCONJ 2: Мне даже не столько нужна вся эта поддержка , сколько необходимо знать о том , что люди , которые меня окружают готовы на всё ради моего счастья .
- ADV 1: Отправили письмо почтой России , может кто знает , сколько примерно письма идут из соседних городов ?
Morphology
The form / lemma ratio of NUM is 1.170213 (the average of all parts of speech is 1.879397).
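The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page. A quick arithmetic check, assuming that this is how the ratio is defined (the page does not state the formula):

```python
num_types = 440   # distinct NUM word forms reported on this page
num_lemmas = 376  # distinct NUM lemmas reported on this page

# Assumption: form / lemma ratio = types / lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # prints 1.170213
```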
The 1st highest number of forms (10) was observed with the lemma “один”: оден, один, одна, одним, одно, одного, одной, одном, одному, одну.
The 2nd highest number of forms (6) was observed with the lemma “несколько”: неск, неск., несколькими, нескольких, несколько, нескольку.
The 3rd highest number of forms (6) was observed with the lemma “оба”: оба, обе, обеим, обеих, обоим, обоих.
NUM occurs with 9 features: NumForm (3082; 100% instances), NumType (2990; 97% instances), Case (1151; 37% instances), Gender (358; 12% instances), Number (194; 6% instances), Animacy (177; 6% instances), Degree (57; 2% instances), Typo (4; 0% instances), Abbr (3; 0% instances)
NUM occurs with 22 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Sets, Number=Sing, Typo=Yes
NUM occurs with 88 feature combinations.
The most frequent feature combination is NumForm=Digit|NumType=Card (1682 tokens).
Examples: 2, 3, 1, 5, 4, 10, 7, 30, 6, 20
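Feature statistics of this kind can be derived from the FEATS column of a CoNLL-U file. The sketch below is illustrative only: the function names and the four sample rows are hypothetical, and it assumes that a "feature combination" is the full FEATS string of a token while a "feature" is the part of each pair before the `=` sign.

```python
from collections import Counter


def feature_combinations(rows, upos):
    """Count whole FEATS strings for tokens with the given UPOS.

    rows: iterable of (upos, feats) pairs, where feats is the raw
    CoNLL-U FEATS column ("_" when a token has no features).
    """
    return Counter(feats for tag, feats in rows if tag == upos)


def feature_counts(rows, upos):
    """Count how many tokens with the given UPOS carry each feature name."""
    counts = Counter()
    for tag, feats in rows:
        if tag != upos or feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            counts[name] += 1
    return counts


# Hypothetical rows, for illustration only.
rows = [
    ("NUM", "NumForm=Digit|NumType=Card"),
    ("NUM", "NumForm=Digit|NumType=Card"),
    ("NUM", "Case=Nom|NumForm=Word|NumType=Card"),
    ("NOUN", "Case=Nom|Number=Sing"),
]

combos = feature_combinations(rows, "NUM")
top_combo, top_count = combos.most_common(1)[0]
per_feature = feature_counts(rows, "NUM")
print(top_combo, top_count)
print(dict(per_feature))
```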
Relations
NUM nodes are attached to their parents using 25 different relations: nummod:gov (1370; 44% instances), nummod (546; 18% instances), root (279; 9% instances), nmod (221; 7% instances), conj (147; 5% instances), parataxis (144; 5% instances), appos (109; 4% instances), obl (77; 2% instances), obj (48; 2% instances), nsubj (42; 1% instances), list (16; 1% instances), compound (15; 0% instances), flat (15; 0% instances), advcl (13; 0% instances), xcomp (11; 0% instances), acl (7; 0% instances), ccomp (7; 0% instances), acl:relcl (6; 0% instances), csubj (3; 0% instances), dep (2; 0% instances), nsubj:pass (2; 0% instances), orphan (2; 0% instances), amod (1; 0% instances), iobj (1; 0% instances), mark (1; 0% instances)
Parents of NUM nodes belong to 15 different parts of speech (counting the empty tag of the artificial root, which is the parent of the 279 NUM nodes attached as root): NOUN (2031; 66% instances), no tag, i.e. root (279; 9% instances), VERB (275; 9% instances), NUM (260; 8% instances), ADJ (69; 2% instances), SYM (47; 2% instances), X (47; 2% instances), PROPN (34; 1% instances), PRON (24; 1% instances), ADV (6; 0% instances), AUX (4; 0% instances), CCONJ (3; 0% instances), INTJ (3; 0% instances), DET (2; 0% instances), PART (1; 0% instances)
1797 (58%) NUM nodes are leaves.
812 (26%) NUM nodes have one child.
216 (7%) NUM nodes have two children.
260 (8%) NUM nodes have three or more children.
The highest child degree of a NUM node is 19.
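The four buckets above add up to the 3085 NUM tokens (1797 + 812 + 216 + 260). A child-degree histogram of this kind can be computed from the ID and HEAD columns of each sentence. The following is a minimal sketch under stated assumptions: the function name and both sample trees are hypothetical, and each sentence is represented as a list of `(id, upos, head)` tuples rather than parsed from a real file.

```python
from collections import Counter


def child_degree_histogram(sentence, upos):
    """Map number of children -> number of nodes with the given UPOS.

    sentence: list of (id, upos, head) tuples for one dependency tree,
    where head is 0 for the root node.
    """
    # Count how many nodes point at each head ID.
    children = Counter(head for _id, _tag, head in sentence)
    # A Counter returns 0 for IDs that never occur as a head (leaves).
    return Counter(children[tok_id] for tok_id, tag, _head in sentence if tag == upos)


# Hypothetical trees, for illustration only.
# "два дома и две книги": both numerals are leaves.
sent1 = [(1, "NUM", 2), (2, "NOUN", 0), (3, "CCONJ", 5), (4, "NUM", 5), (5, "NOUN", 2)]
# "Их было много .": the numeral is the root with three children.
sent2 = [(1, "PRON", 3), (2, "AUX", 3), (3, "NUM", 0), (4, "PUNCT", 3)]

hist = child_degree_histogram(sent1, "NUM") + child_degree_histogram(sent2, "NUM")
leaves = hist[0]
max_degree = max(hist)
print(dict(hist), leaves, max_degree)
```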
Children of NUM nodes are attached using 32 different relations: punct (755; 34% instances), advmod (280; 12% instances), nsubj (241; 11% instances), nmod (201; 9% instances), case (164; 7% instances), conj (162; 7% instances), parataxis (95; 4% instances), obl (69; 3% instances), cc (62; 3% instances), iobj (27; 1% instances), mark (26; 1% instances), amod (21; 1% instances), compound (16; 1% instances), cop (16; 1% instances), det (16; 1% instances), flat (14; 1% instances), orphan (12; 1% instances), advcl (11; 0% instances), aux (11; 0% instances), fixed (8; 0% instances), list (8; 0% instances), appos (7; 0% instances), discourse (6; 0% instances), acl:relcl (5; 0% instances), expl (4; 0% instances), flat:foreign (4; 0% instances), nummod (3; 0% instances), nummod:gov (3; 0% instances), acl (2; 0% instances), dep (1; 0% instances), flat:name (1; 0% instances), goeswith (1; 0% instances)
Children of NUM nodes belong to 17 different parts of speech: PUNCT (755; 34% instances), NOUN (383; 17% instances), NUM (260; 12% instances), ADV (219; 10% instances), ADP (153; 7% instances), PART (81; 4% instances), VERB (71; 3% instances), ADJ (63; 3% instances), CCONJ (61; 3% instances), PRON (60; 3% instances), SYM (43; 2% instances), AUX (27; 1% instances), DET (26; 1% instances), SCONJ (23; 1% instances), X (15; 1% instances), PROPN (9; 0% instances), INTJ (3; 0% instances)