
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-SynTagRus: POS Tags: NOUN

There are 19128 NOUN lemmas (35%), 50020 NOUN types (35%) and 360027 NOUN tokens (24%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
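Counts of this kind can be reproduced from the treebank's CoNLL-U files. The sketch below is a minimal illustration, not the script that generated this page: the function name `pos_counts` and the three-token sample are hypothetical, and it assumes that types are distinct word forms compared case-sensitively, which the page does not state.

```python
def pos_counts(conllu_text, upos="NOUN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only ordinary word lines: 10 columns and an integer ID.
        # This skips multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: case-sensitive comparison
            tokens += 1
    return len(lemmas), len(types), tokens

# Toy sample (not taken from the treebank): two NOUN tokens sharing one lemma.
SAMPLE = "\n".join([
    "# text = годы и год",
    "1\tгоды\tгод\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tи\tи\tCCONJ\t_\t_\t3\tcc\t_\t_",
    "3\tгод\tгод\tNOUN\t_\t_\t1\tconj\t_\t_",
])

print(pos_counts(SAMPLE))  # (1, 2, 2)
```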

The 10 most frequent NOUN lemmas: год, человек, время, страна, дело, жизнь, работа, власть, система, день

The 10 most frequent NOUN types: года, время, лет, году, человек, раз, жизни, людей, люди, власти

The 10 most frequent ambiguous lemmas: страна (NOUN 2189, X 1), раз (NOUN 1242, SCONJ 60, ADV 10), случай (NOUN 1076, SCONJ 1), друг (NOUN 816, ADJ 1), ученый (NOUN 687, ADJ 35), право (NOUN 613, ADV 8), главное (NOUN 184, ADV 18), правда (ADV 261, NOUN 171), пол (NOUN 142, NUM 127), больной (NOUN 110, ADJ 78)

The 10 most frequent ambiguous types: раз (NOUN 987, SCONJ 46, ADV 10), случае (NOUN 626, SCONJ 1), ученые (NOUN 227, ADJ 3), начала (NOUN 222, VERB 68), право (NOUN 196, ADV 5, ADJ 2), дома (NOUN 191, ADV 80), ученых (NOUN 195, ADJ 7), права (NOUN 184, ADJ 5), страна (NOUN 163, X 1), целом (NOUN 177, ADJ 10)
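A lemma or type is ambiguous here when it occurs with more than one POS tag. The following sketch shows one way to collect such items from (item, tag) pairs. The helper name `ambiguous_items` is hypothetical, the counts for “пол” are taken from the list above, and the three “год” pairs are toy filler.

```python
from collections import Counter, defaultdict

def ambiguous_items(pairs):
    """Map each item seen with more than one tag to its (tag, count) list."""
    by_item = defaultdict(Counter)
    for item, upos in pairs:
        by_item[item][upos] += 1
    # most_common() orders the tags by descending frequency.
    return {
        item: tags.most_common()
        for item, tags in by_item.items()
        if len(tags) > 1
    }

pairs = (
    [("пол", "NOUN")] * 142   # counts from the ambiguous-lemma list
    + [("пол", "NUM")] * 127
    + [("год", "NOUN")] * 3   # toy filler: one tag only, so not ambiguous
)

print(ambiguous_items(pairs))  # {'пол': [('NOUN', 142), ('NUM', 127)]}
```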

Morphology

The form / lemma ratio of NOUN is 2.615015 (the average of all parts of speech is 2.654430).
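The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of the page. The check below assumes that this is how the figure was derived.

```python
noun_types = 50020   # NOUN types reported above
noun_lemmas = 19128  # NOUN lemmas reported above

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.615015
```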

The highest number of forms (15) was observed with the lemma “век”: в, в., вв, вв., век, века, векам, веками, веках, веке, веки, веков, веком, веку, полвека.

The same number of forms (15) was observed with the lemma “год”: г, г., гг, гг., год, года, годам, годами, годах, годе, годов, годом, году, годы, лет.

The same number of forms (15) was also observed with the lemma “тоннель”: тоннеле, тоннелей, тоннели, тоннель, тоннелю, тоннеля, тоннелям, тоннелями, тоннелях, туннеле, туннелем, туннель, туннелю, туннеля, туннелями.

NOUN occurs with 7 features: Animacy (359481; 100% instances), Case (359446; 100% instances), Number (359446; 100% instances), Gender (358921; 100% instances), Abbr (472; 0% instances), Typo (5; 0% instances), Foreign (2; 0% instances)

NOUN occurs with 18 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Par, Case=Voc, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing, Typo=Yes

NOUN occurs with 98 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Fem|Number=Sing (27440 tokens). Examples: жизни, страны, стороны, экономики, власти, работы, войны, системы, науки, воды
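A feature combination is the full content of the CoNLL-U FEATS column: `Name=Value` pairs joined by `|`, or `_` when the token has no features. A minimal parsing sketch follows; the function name `parse_feats` is hypothetical.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dictionary."""
    if feats == "_":  # "_" marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

combo = "Animacy=Inan|Case=Gen|Gender=Fem|Number=Sing"
print(parse_feats(combo))
# {'Animacy': 'Inan', 'Case': 'Gen', 'Gender': 'Fem', 'Number': 'Sing'}
```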

Relations

NOUN nodes are attached to their parents using 34 different relations: nmod (101237; 28% instances), obl (83599; 23% instances), nsubj (57789; 16% instances), obj (38131; 11% instances), conj (29594; 8% instances), root (10041; 3% instances), parataxis (8118; 2% instances), nsubj:pass (7400; 2% instances), iobj (6668; 2% instances), fixed (4830; 1% instances), appos (4192; 1% instances), advcl (1182; 0% instances), flat (1116; 0% instances), nummod (958; 0% instances), xcomp (951; 0% instances), obl:tmod (822; 0% instances), ccomp (616; 0% instances), acl (599; 0% instances), orphan (582; 0% instances), compound (500; 0% instances), obl:agent (465; 0% instances), acl:relcl (200; 0% instances), nummod:gov (163; 0% instances), vocative (127; 0% instances), csubj (53; 0% instances), flat:foreign (42; 0% instances), nummod:entity (18; 0% instances), flat:name (13; 0% instances), amod (7; 0% instances), nsubj:outer (5; 0% instances), dislocated (4; 0% instances), advmod (3; 0% instances), dep (1; 0% instances), list (1; 0% instances)
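The percentages in this list appear to be shares of all 360027 NOUN tokens, rounded to whole numbers, which is why rare relations show as 0%. The check below reproduces a few of the listed values under that assumption.

```python
TOTAL_NOUN_TOKENS = 360027  # NOUN tokens reported at the top of the page

# A few relation counts copied from the list above.
counts = {"nmod": 101237, "obl": 83599, "nsubj": 57789, "obj": 38131, "advcl": 1182}

shares = {rel: round(100 * n / TOTAL_NOUN_TOKENS) for rel, n in counts.items()}
print(shares)  # {'nmod': 28, 'obl': 23, 'nsubj': 16, 'obj': 11, 'advcl': 0}
```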

Parents of NOUN nodes belong to 17 different parts of speech (counting parentless root nodes as a category of their own): VERB (179595; 50% instances), NOUN (135688; 38% instances), ADJ (14246; 4% instances), [root, no parent] (10041; 3% instances), ADV (4787; 1% instances), ADP (4250; 1% instances), PROPN (3953; 1% instances), NUM (2835; 1% instances), PRON (2569; 1% instances), DET (795; 0% instances), SYM (607; 0% instances), PART (321; 0% instances), SCONJ (262; 0% instances), X (63; 0% instances), INTJ (7; 0% instances), AUX (5; 0% instances), CCONJ (3; 0% instances)

64843 (18%) NOUN nodes are leaves.

125603 (35%) NOUN nodes have one child.

101670 (28%) NOUN nodes have two children.

67911 (19%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 17.
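The child degree of a node is the number of tokens whose HEAD column points to it. The sketch below buckets NOUN nodes of a single sentence by child degree, then checks that the four buckets reported above add up to the total number of NOUN tokens. The function name `degree_buckets` and the five-token example are hypothetical.

```python
from collections import Counter

def degree_buckets(heads, upos, tag="NOUN"):
    """Bucket nodes with the given tag by their number of children.

    heads[i] is the 1-based HEAD of token i+1 (0 means root);
    upos[i] is the UPOS tag of token i+1.
    """
    children = Counter(h for h in heads if h != 0)
    buckets = Counter()
    for node, node_tag in enumerate(upos, start=1):
        if node_tag != tag:
            continue
        degree = children[node]  # Counter returns 0 for leaves
        buckets["3+" if degree >= 3 else str(degree)] += 1
    return dict(buckets)

# Toy sentence: token 2 (NOUN, root) has three children, token 5 (NOUN) has one.
heads = [2, 0, 2, 5, 2]
upos = ["ADJ", "NOUN", "PUNCT", "ADP", "NOUN"]
print(degree_buckets(heads, upos))  # {'3+': 1, '1': 1}

# Consistency check on the figures reported above.
reported = {"0": 64843, "1": 125603, "2": 101670, "3+": 67911}
print(sum(reported.values()))  # 360027
```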

Children of NOUN nodes are attached using 40 different relations: nmod (114944; 20% instances), amod (114744; 20% instances), case (108626; 19% instances), punct (68921; 12% instances), det (34534; 6% instances), conj (28718; 5% instances), cc (19744; 3% instances), acl (12143; 2% instances), advmod (11238; 2% instances), appos (9783; 2% instances), parataxis (9593; 2% instances), nummod (8829; 2% instances), nsubj (7920; 1% instances), acl:relcl (7428; 1% instances), obl (4905; 1% instances), mark (3613; 1% instances), nummod:gov (3427; 1% instances), cop (2378; 0% instances), flat:foreign (1573; 0% instances), iobj (781; 0% instances), orphan (757; 0% instances), expl (705; 0% instances), compound (545; 0% instances), advcl (432; 0% instances), fixed (383; 0% instances), csubj (328; 0% instances), discourse (258; 0% instances), nummod:entity (253; 0% instances), ccomp (107; 0% instances), aux (73; 0% instances), flat:name (45; 0% instances), vocative (20; 0% instances), obl:tmod (17; 0% instances), flat (14; 0% instances), obj (8; 0% instances), list (4; 0% instances), xcomp (4; 0% instances), dep (3; 0% instances), dislocated (1; 0% instances), nsubj:pass (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: NOUN (135688; 23% instances), ADJ (112892; 20% instances), ADP (108039; 19% instances), PUNCT (68921; 12% instances), DET (35147; 6% instances), VERB (28512; 5% instances), PROPN (23424; 4% instances), CCONJ (19017; 3% instances), NUM (13179; 2% instances), PRON (8607; 1% instances), PART (8603; 1% instances), ADV (7826; 1% instances), SCONJ (4648; 1% instances), AUX (2458; 0% instances), X (469; 0% instances), SYM (315; 0% instances), INTJ (55; 0% instances)