
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-Taiga: POS Tags: VERB

There are 10825 VERB lemmas (19%), 46279 VERB types (30%) and 211632 VERB tokens (12%). Out of 17 observed tags, the rank of VERB is: 3 in number of lemmas, 2 in number of types and 3 in number of tokens.
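The three counts above (lemmas, types, tokens) can be reproduced in principle from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `count_upos_stats` and the sample sentence are hypothetical, and it assumes that types are distinct word forms compared case-sensitively, which this page does not state.

```python
def count_upos_stats(conllu_text, upos="VERB"):
    """Return (lemma count, type count, token count) for one UPOS tag.

    Assumption: a "type" is a distinct FORM string, compared case-sensitively.
    """
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        tok_id, form, lemma, tag = cols[0], cols[1], cols[2], cols[3]
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in tok_id or "." in tok_id:
            continue
        if tag == upos:
            tokens += 1
            lemmas.add(lemma)
            types.add(form)
    return len(lemmas), len(types), tokens


# Hypothetical three-verb sample, for illustration only.
sample = "\n".join([
    "# sent_id = demo",
    "1\tОн\tон\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tсказал\tсказать\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tона\tона\tPRON\t_\t_\t4\tnsubj\t_\t_",
    "4\tсказала\tсказать\tVERB\t_\t_\t2\tconj\t_\t_",
    "5\tможет\tмочь\tVERB\t_\t_\t2\tparataxis\t_\t_",
])

lemma_count, type_count, token_count = count_upos_stats(sample)
```

In the sample, two of the three VERB tokens share the lemma сказать, so the lemma count is lower than the type and token counts.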

The 10 most frequent VERB lemmas: мочь, быть, сказать, говорить, стать, знать, можно, хотеть, смотреть, иметь

The 10 most frequent VERB types: есть, может, можно, сказал, надо, нет, сказала, значит, было, сказать

The 10 most frequent ambiguous lemmas: мочь (VERB 3995, NOUN 8), быть (AUX 10794, VERB 3946, X 1), стать (VERB 2525, NOUN 4), знать (VERB 1812, NOUN 23), можно (VERB 1670, PART 2), надо (VERB 1128, ADP 34), нет (VERB 946, PART 795), есть (VERB 391, INTJ 1), пропасть (VERB 101, NOUN 21), подать (VERB 75, NOUN 8)

The 10 most frequent ambiguous types: есть (VERB 1510, AUX 384, INTJ 1), можно (VERB 1416, PART 1), надо (VERB 876, ADP 31), нет (VERB 860, PART 365), было (AUX 2459, VERB 626, PART 85), стали (VERB 548, NOUN 6), быть (AUX 947, VERB 543, X 1), был (AUX 1999, VERB 158), начал (VERB 180, NOUN 14), знать (VERB 158, NOUN 9)

Morphology

The form / lemma ratio of VERB is 4.275196 (the average of all parts of speech is 2.706111).
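The reported ratio is consistent with dividing the number of VERB types by the number of VERB lemmas given at the top of this page. The short check below assumes that this is how the figure was derived; the variable names are illustrative.

```python
# Figures taken from this page.
verb_types = 46279
verb_lemmas = 10825

# Assumed definition: distinct forms (types) divided by distinct lemmas.
form_lemma_ratio = verb_types / verb_lemmas
rounded_ratio = round(form_lemma_ratio, 6)
```

Rounded to six decimal places, the quotient matches the value 4.275196 stated above.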

The 1st highest number of forms (44) was observed with the lemma “писать”: Напишите, пи-шу-у-у, писа́ть, писавшего, писавшей, писавший, писает, писал, писала, писалась, писали, писались, писалось, писался, писана, писанную, писанные, писанных, писано, писаны, писать, писаться, писиет, пиш-ет, пиш-ешь, пиш-у, пишем, пишет, пишете, пишется, пишешь, пиши, пишите, пишу, пишут, пишутся, пишущая, пишущего, пишущей, пишущему, пишущий, пишущим, пишущими, пишущих.

The 2nd highest number of forms (43) was observed with the lemma “идти”: И́-дет, и́дет, и́дут, ид-у, ид-ёт, ид-ёшь, идем, идемте, идет, идеть, идешь, иди, идите, идти, иду, иду́т, идут, идучи, идущая, идущего, идущее, идущей, идущему, идущие, идущий, идущим, идущими, идущих, идущую, идя, идём, идёт, идёшь, шедшая, шедшего, шедшие, шедший, шедших, шел, шла, шли, шло, шёл.

The 3rd highest number of forms (42) was observed with the lemma “читать”: чита-ть, чита́ем, чита́ет, чита́ете, чита́ешь, чита́ть, чита́ю, чита́ют, читавшая, читавшемся, читаем, читаемого, читаемом, читаемые, читаемых, читает, читаете, читается, читаешь, читай, читайте, читал, читала, читалась, читали, читались, читало, читать, читаться, читаю, читают, читаются, читающая, читающего, читающей, читающему, читающие, читающий, читающим, читающих, читающую, читая.

VERB occurs with 17 features: VerbForm (206002; 97% instances), Voice (206002; 97% instances), Aspect (203136; 96% instances), Tense (168981; 80% instances), Number (166378; 79% instances), Mood (143629; 68% instances), Gender (76144; 36% instances), Person (63278; 30% instances), Case (17274; 8% instances), Variant (5623; 3% instances), Animacy (1860; 1% instances), Reflex (1858; 1% instances), Polarity (1682; 1% instances), Abbr (750; 0% instances), Typo (451; 0% instances), ExtPos (144; 0% instances), Foreign (1; 0% instances)

VERB occurs with 41 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Aspect=Imp, Aspect=Perf, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, ExtPos=ADP, ExtPos=ADV, ExtPos=CCONJ, ExtPos=NOUN, ExtPos=VERB, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Mood=Imp, Mood=Ind, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Polarity=Neg, Reflex=Yes, Tense=Fut, Tense=Past, Tense=Pres, Typo=Yes, Variant=Short, VerbForm=Conv, VerbForm=Fin, VerbForm=Inf, VerbForm=Part, Voice=Act, Voice=Mid, Voice=Pass

VERB occurs with 484 feature combinations. The most frequent feature combination is Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act (17847 tokens). Examples: может, значит, говорит, есть, стоит, имеет, следует, означает, знает, идет
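A feature combination is written in the CoNLL-U FEATS notation: `Name=Value` pairs joined by `|`. A minimal sketch of splitting such a string into a dictionary follows; the helper name `parse_feats` is hypothetical.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dictionary."""
    # "_" marks an empty feature set in CoNLL-U.
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


most_frequent = parse_feats(
    "Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act"
)
```

Applied to the most frequent combination above, this yields seven features, for example `most_frequent["Tense"]` is `"Pres"`.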

Relations

VERB nodes are attached to their parents using 37 different relations: root (82870; 39% instances), conj (37935; 18% instances), parataxis (17343; 8% instances), advcl (16002; 8% instances), xcomp (14647; 7% instances), acl (14519; 7% instances), csubj (7190; 3% instances), acl:relcl (7002; 3% instances), ccomp (5557; 3% instances), amod (3465; 2% instances), parataxis:discourse (2299; 1% instances), fixed (925; 0% instances), appos (698; 0% instances), nmod (284; 0% instances), obl (191; 0% instances), list (149; 0% instances), nsubj (135; 0% instances), obj (124; 0% instances), obl:pronmod (57; 0% instances), csubj:pass (47; 0% instances), iobj (47; 0% instances), orphan (41; 0% instances), obl:depict (27; 0% instances), dislocated (12; 0% instances), flat (12; 0% instances), nsubj:pass (11; 0% instances), case (9; 0% instances), vocative (8; 0% instances), dep (4; 0% instances), obl:agent (4; 0% instances), advmod (3; 0% instances), cc (3; 0% instances), csubj:outer (3; 0% instances), discourse (3; 0% instances), flat:name (2; 0% instances), obl:tmod (2; 0% instances), reparandum (2; 0% instances)

Parents of VERB nodes belong to 17 different parts of speech: no parent, i.e. the VERB is the root (82870; 39% instances), VERB (81394; 38% instances), NOUN (28004; 13% instances), ADJ (9028; 4% instances), PRON (4105; 2% instances), ADV (2127; 1% instances), PROPN (1639; 1% instances), DET (910; 0% instances), PART (575; 0% instances), X (331; 0% instances), INTJ (255; 0% instances), NUM (236; 0% instances), AUX (68; 0% instances), ADP (42; 0% instances), SCONJ (33; 0% instances), CCONJ (12; 0% instances), SYM (3; 0% instances)
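The two tallies above (incoming relation and parent part of speech) can be sketched for a single sentence as follows. This is an illustration under assumptions, not the tool behind this page: the function `verb_attachments` and the row format `(id, upos, head, deprel)` are hypothetical. A root VERB has HEAD 0 and therefore no parent word, which would account for the parent entry without a part-of-speech tag, whose count equals that of the root relation.

```python
from collections import Counter


def verb_attachments(sentence_rows, upos="VERB"):
    """Tally incoming relations and parent UPOS tags for nodes of one UPOS.

    sentence_rows: list of (id, upos, head, deprel) tuples for one sentence.
    A head of 0 has no entry in the lookup, so its parent tag is "".
    """
    pos_by_id = {tok_id: tag for tok_id, tag, _, _ in sentence_rows}
    relations, parent_pos = Counter(), Counter()
    for tok_id, tag, head, deprel in sentence_rows:
        if tag != upos:
            continue
        relations[deprel] += 1
        parent_pos[pos_by_id.get(head, "")] += 1
    return relations, parent_pos


# Hypothetical sentence "Он хочет сказать" reduced to the needed columns.
rows = [
    (1, "PRON", 2, "nsubj"),
    (2, "VERB", 0, "root"),
    (3, "VERB", 2, "xcomp"),
]
relations, parent_pos = verb_attachments(rows)
```

Summing such counters over all sentences would give treebank-wide distributions of the kind listed above.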

8268 (4%) VERB nodes are leaves.

20000 (9%) VERB nodes have one child.

34821 (16%) VERB nodes have two children.

148543 (70%) VERB nodes have three or more children.

The highest child degree of a VERB node is 40.
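The four child-degree counts above partition the VERB tokens, so they should add up to the token total of 211632, and the percentages follow from that total. A small check, using only the figures on this page (the names `degree_counts` and `degree_shares` are illustrative):

```python
# Child-degree counts as reported on this page.
degree_counts = {
    "leaf": 8268,
    "one child": 20000,
    "two children": 34821,
    "three or more children": 148543,
}


def degree_shares(counts):
    """Return the total and each bucket's share as a whole-number percentage."""
    total = sum(counts.values())
    shares = {label: round(100 * n / total) for label, n in counts.items()}
    return total, shares


total_verbs, shares = degree_shares(degree_counts)
```

Because each share is rounded separately, the percentages sum to 99 rather than 100.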

Children of VERB nodes are attached using 49 different relations: punct (194682; 27% instances), nsubj (97184; 14% instances), obl (86974; 12% instances), advmod (71338; 10% instances), obj (63039; 9% instances), conj (37640; 5% instances), cc (32917; 5% instances), xcomp (20153; 3% instances), iobj (19980; 3% instances), mark (16424; 2% instances), advcl (14843; 2% instances), parataxis (13085; 2% instances), ccomp (7169; 1% instances), nsubj:pass (6773; 1% instances), obl:tmod (6631; 1% instances), parataxis:discourse (6159; 1% instances), csubj (4650; 1% instances), obl:agent (2961; 0% instances), aux (2753; 0% instances), vocative (2170; 0% instances), aux:pass (1973; 0% instances), discourse (1582; 0% instances), obl:float (988; 0% instances), cop (522; 0% instances), expl (335; 0% instances), case (292; 0% instances), acl (266; 0% instances), fixed (145; 0% instances), nmod (103; 0% instances), obl:depict (92; 0% instances), list (86; 0% instances), det (85; 0% instances), csubj:pass (47; 0% instances), orphan (46; 0% instances), amod (42; 0% instances), dislocated (19; 0% instances), dep (16; 0% instances), acl:relcl (13; 0% instances), appos (13; 0% instances), flat (13; 0% instances), nsubj:outer (12; 0% instances), nummod (12; 0% instances), nummod:gov (12; 0% instances), goeswith (7; 0% instances), compound (5; 0% instances), reparandum (4; 0% instances), flat:name (3; 0% instances), csubj:outer (1; 0% instances), obl:pronmod (1; 0% instances)

Children of VERB nodes belong to 17 different parts of speech: NOUN (196313; 27% instances), PUNCT (194682; 27% instances), VERB (81394; 11% instances), PRON (67057; 9% instances), ADV (53999; 8% instances), CCONJ (32640; 5% instances), PART (23370; 3% instances), PROPN (21982; 3% instances), SCONJ (15385; 2% instances), ADJ (9318; 1% instances), DET (9070; 1% instances), AUX (5296; 1% instances), ADP (994; 0% instances), NUM (942; 0% instances), X (840; 0% instances), INTJ (576; 0% instances), SYM (402; 0% instances)