
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-SynTagRus: POS Tags: NOUN

There are 19075 NOUN lemmas (35%), 49993 NOUN types (35%) and 362044 NOUN tokens (24%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
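As an illustration of how lemma, type and token counts of this kind can be derived, here is a minimal sketch that reads the standard 10-column CoNLL-U format. The function name pos_counts and the sample sentence are hypothetical, and the sketch assumes that word forms are compared exactly as written (no case folding), which may differ from how the figures on this page were produced.

```python
def pos_counts(conllu_text, upos="NOUN"):
    """Count distinct lemmas, distinct word forms (types) and tokens for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)    # assumption: forms are kept case-sensitive
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


# Hypothetical three-token sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = годы и год",
    "1\tгоды\tгод\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tи\tи\tCCONJ\t_\t_\t3\tcc\t_\t_",
    "3\tгод\tгод\tNOUN\t_\t_\t1\tconj\t_\t_",
])
print(pos_counts(SAMPLE))  # (1, 2, 2): one lemma, two types, two tokens
```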

The 10 most frequent NOUN lemmas: год, человек, время, страна, дело, жизнь, работа, власть, система, день

The 10 most frequent NOUN types: года, время, лет, году, человек, раз, жизни, людей, люди, власти

The 10 most frequent ambiguous lemmas: страна (NOUN 2189, X 1), раз (NOUN 1242, SCONJ 60, ADV 10), случай (NOUN 1077, SCONJ 1), ученый (NOUN 689, ADJ 34), право (NOUN 619, ADV 8), друг (PRON 519, NOUN 297, ADJ 1), главное (NOUN 184, ADV 18), правда (ADV 261, NOUN 171), пол (NOUN 142, NUM 127), больной (NOUN 110, ADJ 78)

The 10 most frequent ambiguous types: раз (NOUN 987, SCONJ 46, ADV 10), случае (NOUN 626, SCONJ 1), ученые (NOUN 227, ADJ 3), начала (NOUN 222, VERB 68), дома (NOUN 190, ADV 81), право (NOUN 196, ADV 5, ADJ 2), ученых (NOUN 196, ADJ 6), права (NOUN 184, ADJ 5), страна (NOUN 163, X 1), целом (NOUN 177, ADJ 10)
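The ambiguity lists above can be approximated by grouping tag counts per lemma (or per word form) and keeping the entries that occur with NOUN and at least one other tag. The following is a sketch only: the helper name ambiguous_items and the toy input are hypothetical, and no claim is made about how the page ranks its top 10.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos="NOUN"):
    """pairs: iterable of (key, tag), where key is a lemma or a word form.

    Returns {key: {tag: count}} for keys seen with `upos` and some other tag.
    """
    by_key = defaultdict(Counter)
    for key, tag in pairs:
        by_key[key][tag] += 1
    return {
        key: dict(tags)
        for key, tags in by_key.items()
        if upos in tags and len(tags) > 1
    }


# Hypothetical toy input; the counts are not treebank figures.
toy = [("право", "NOUN"), ("право", "ADV"), ("год", "NOUN"), ("право", "NOUN")]
print(ambiguous_items(toy))  # {'право': {'NOUN': 2, 'ADV': 1}}
```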

Morphology

The form / lemma ratio of NOUN is 2.620865 (the average of all parts of speech is 2.668831).
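The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given earlier on this page. A quick check, assuming that this is indeed how the ratio is defined:

```python
noun_types = 49993   # distinct NOUN word forms reported on this page
noun_lemmas = 19075  # distinct NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.620865
```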

The 1st highest number of forms (15) was observed with the lemma “век”: в, в., вв, вв., век, века, векам, веками, веках, веке, веки, веков, веком, веку, полвека.

The 2nd highest number of forms (15) was observed with the lemma “год”: г, г., гг, гг., год, года, годам, годами, годах, годе, годов, годом, году, годы, лет.

The 3rd highest number of forms (15) was observed with the lemma “тоннель”: тоннеле, тоннелей, тоннели, тоннель, тоннелю, тоннеля, тоннелям, тоннелями, тоннелях, туннеле, туннелем, туннель, туннелю, туннеля, туннелями.

NOUN occurs with 9 features: Animacy (360312; 100% instances), Number (360277; 100% instances), Case (360276; 100% instances), Gender (359763; 99% instances), Abbr (1680; 0% instances), InflClass (1460; 0% instances), ExtPos (22; 0% instances), Typo (8; 0% instances), Foreign (2; 0% instances)

NOUN occurs with 21 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Par, Case=Voc, ExtPos=ADV, ExtPos=NOUN, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, InflClass=Ind, Number=Plur, Number=Sing, Typo=Yes

NOUN occurs with 162 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Fem|Number=Sing (27770 tokens). Examples: жизни, страны, стороны, экономики, власти, работы, войны, системы, науки, воды
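Feature combinations such as the one above are stored in the FEATS column of CoNLL-U as pipe-separated Name=Value pairs, with an underscore for an empty set. A minimal sketch of parsing such a string and finding the most frequent combination follows; the function names are hypothetical and the short input list is made up for illustration.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Gen|Number=Sing' into a dict."""
    if feats == "_":
        return {}  # underscore marks an empty feature set
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def most_common_combination(feats_strings):
    """Return (combination, count) for the most frequent full FEATS string."""
    return Counter(feats_strings).most_common(1)[0]


EXAMPLE = "Animacy=Inan|Case=Gen|Gender=Fem|Number=Sing"
print(parse_feats(EXAMPLE)["Case"])  # Gen

# Hypothetical mini-sample of FEATS values.
print(most_common_combination([EXAMPLE, "Abbr=Yes", EXAMPLE]))
```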

Relations

NOUN nodes are attached to their parents using 37 different relations: nmod (101601; 28% instances), obl (62038; 17% instances), nsubj (58092; 16% instances), obj (42354; 12% instances), conj (29744; 8% instances), obl:tmod (12265; 3% instances), root (10042; 3% instances), iobj (9257; 3% instances), nsubj:pass (7444; 2% instances), parataxis (5651; 2% instances), fixed (4568; 1% instances), appos (4232; 1% instances), xcomp (3668; 1% instances), parataxis:discourse (2401; 1% instances), obl:agent (1937; 1% instances), nummod (1219; 0% instances), advcl (1172; 0% instances), flat (1133; 0% instances), ccomp (645; 0% instances), orphan (609; 0% instances), acl (602; 0% instances), compound (512; 0% instances), vocative (305; 0% instances), acl:relcl (200; 0% instances), nummod:gov (162; 0% instances), csubj (59; 0% instances), obl:depict (51; 0% instances), flat:name (31; 0% instances), list (16; 0% instances), dislocated (10; 0% instances), amod (7; 0% instances), advmod (6; 0% instances), nsubj:outer (6; 0% instances), obl:pronmod (2; 0% instances), dep (1; 0% instances), flat:foreign (1; 0% instances), obl:float (1; 0% instances)

Parents of NOUN nodes belong to 16 different parts of speech, counting root nodes (which have no parent) as a category of their own: VERB (180779; 50% instances), NOUN (138503; 38% instances), ADJ (14015; 4% instances), no parent, i.e. root (10042; 3% instances), ADP (4252; 1% instances), ADV (3979; 1% instances), NUM (2791; 1% instances), PRON (2561; 1% instances), PROPN (2555; 1% instances), DET (1203; 0% instances), SYM (642; 0% instances), PART (325; 0% instances), SCONJ (262; 0% instances), X (125; 0% instances), AUX (5; 0% instances), INTJ (5; 0% instances)

64929 (18%) NOUN nodes are leaves.

126797 (35%) NOUN nodes have one child.

102249 (28%) NOUN nodes have two children.

68069 (19%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 20.
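The child-degree figures above can be reproduced per sentence from the HEAD column by counting how many tokens point to each node. The sketch below assumes one sentence given as parallel lists of 1-based head indices (0 for the root) and UPOS tags; the function name and the five-token example are hypothetical.

```python
from collections import Counter


def child_degree_buckets(heads, upos, target="NOUN"):
    """Bucket nodes tagged `target` by their number of children: '0', '1', '2' or '3+'.

    heads[i] is the 1-based head of token i+1 (0 = root); upos[i] is its tag.
    """
    degree = Counter(h for h in heads if h != 0)  # children per head id
    buckets = Counter()
    for idx, tag in enumerate(upos, start=1):
        if tag == target:
            n = degree[idx]  # a Counter returns 0 for nodes without children
            buckets["3+" if n >= 3 else str(n)] += 1
    return dict(buckets)


# Hypothetical sentence: token 2 is the root with two children,
# tokens 3 and 5 have one child each.
heads = [2, 0, 2, 5, 3]
tags = ["ADJ", "NOUN", "NOUN", "ADP", "NOUN"]
print(child_degree_buckets(heads, tags))  # {'2': 1, '1': 2}
```

As a sanity check on the reported figures, the four counts above (64929, 126797, 102249 and 68069) sum to 362044, the total number of NOUN tokens.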

Children of NOUN nodes are attached using 42 different relations: nmod (116931; 20% instances), amod (112136; 19% instances), case (109054; 19% instances), punct (68995; 12% instances), det (37724; 6% instances), conj (28983; 5% instances), cc (19844; 3% instances), acl (13974; 2% instances), advmod (13700; 2% instances), appos (10865; 2% instances), nummod (8977; 2% instances), nsubj (7953; 1% instances), parataxis (7601; 1% instances), acl:relcl (7482; 1% instances), mark (3602; 1% instances), nummod:gov (3517; 1% instances), cop (2388; 0% instances), parataxis:discourse (2078; 0% instances), iobj (782; 0% instances), orphan (770; 0% instances), expl (706; 0% instances), compound (591; 0% instances), advcl (433; 0% instances), obl (363; 0% instances), csubj (334; 0% instances), flat:name (110; 0% instances), ccomp (106; 0% instances), discourse (86; 0% instances), obl:tmod (77; 0% instances), aux (74; 0% instances), vocative (62; 0% instances), xcomp (42; 0% instances), obl:float (31; 0% instances), fixed (25; 0% instances), list (25; 0% instances), flat (19; 0% instances), obj (8; 0% instances), flat:foreign (7; 0% instances), dep (3; 0% instances), dislocated (1; 0% instances), nsubj:pass (1; 0% instances), obl:depict (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: NOUN (138503; 24% instances), ADJ (111615; 19% instances), ADP (108408; 19% instances), PUNCT (68995; 12% instances), DET (39342; 7% instances), VERB (28989; 5% instances), PROPN (20218; 3% instances), CCONJ (19078; 3% instances), NUM (12364; 2% instances), PART (9166; 2% instances), ADV (7484; 1% instances), PRON (7395; 1% instances), SCONJ (4658; 1% instances), AUX (2467; 0% instances), X (1419; 0% instances), SYM (305; 0% instances), INTJ (55; 0% instances)