
This page pertains to UD version 2.

Treebank Statistics: UD_Old_Russian-RNC: POS Tags: NOUN

There are 1274 NOUN lemmas (31%), 2665 NOUN types (34%) and 6646 NOUN tokens (22%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: государь, князь, человѣкъ, годъ, земля, холопъ, день, указъ, чело, Богъ

The 10 most frequent NOUN types: г., государю, государя, государь, году, день, людей, указу, денги, земли

The 10 most frequent ambiguous lemmas: градъ (NOUN 14, PROPN 1), гора (NOUN 11, PROPN 1), дѣловой (NOUN 7, ADJ 5), низъ (NOUN 4, ADP 1), выборный (ADJ 9, NOUN 2), сорокъ (NUM 3, NOUN 2), черезъ (ADP 3, NOUN 2), мелкий (ADJ 6, NOUN 1), мертвый (ADJ 2, NOUN 1), святой (ADJ 60, NOUN 1)

The 10 most frequent ambiguous types: де (PART 129, NOUN 10), посла (NOUN 4, VERB 1), с. (NOUN 4, DET 3), честь (NOUN 3, VERB 1), выборныхъ (ADJ 4, NOUN 2), деловых (ADJ 3, NOUN 2), подати (NOUN 2, VERB 1), Литву (PROPN 2, NOUN 1), вести (VERB 2, NOUN 1), добрѣ (ADV 1, NOUN 1)

Morphology

The form / lemma ratio of NOUN is 2.091837 (the average of all parts of speech is 1.947446).
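The reported ratio is consistent with dividing the number of distinct NOUN types by the number of distinct NOUN lemmas quoted earlier on this page. A quick check, assuming that is how the figure is derived (the variable names are illustrative only):

```python
# Counts quoted on this page for NOUN in UD_Old_Russian-RNC.
noun_types = 2665   # distinct word forms tagged NOUN
noun_lemmas = 1274  # distinct lemmas tagged NOUN

# Assumption: the "form / lemma ratio" is types divided by lemmas.
form_lemma_ratio = noun_types / noun_lemmas

print(f"{form_lemma_ratio:.6f}")  # 2.091837
```

Rounded to six decimal places, this reproduces the value given above.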

The highest number of forms (24) was observed with the lemma “человѣкъ”: людей, людем, людемъ, людехъ, люди, людие, людми, людьи, людьми, ч[е]л[овѣ]къ, человек, человека, человеком, человеку, человѣка, человѣкомъ, человѣку, человѣкъ, человѣкѣ, члавкꙋ, члвка, члвкъ, члкомъ, члкꙋ.

The 2nd highest number of forms (21) was observed with the lemma “князь”: КНЯЗЕ, кн[ѧ]зе, кн[ѧ]зем, кн[ѧ]зи, кн[ѧ]зю, кн[ѧ]зѣ, кн[ѧ]зѧ, кнзь, кнзю, княже, князеи, князей, княземъ, князи, князми, князь, князю, князя, князѣ, кнѧз[ь], кнꙗз[ь].

The 3rd highest number of forms (17) was observed with the lemma “отецъ”: Отьцу, о[т]цѣ, отец, отецъ, отець, отца, отци, отцов, отцовъ, отцом, отцомъ, отцъ, отцы, отче, отьца, отьцемъ, ѡ[те]ць.

NOUN occurs with 5 features: Case (6431; 97% instances), Number (6431; 97% instances), Gender (6430; 97% instances), Abbr (287; 4% instances), Animacy (208; 3% instances)

NOUN occurs with 16 feature-value pairs: Abbr=Yes, Animacy=Anim, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Gender=Neut, Number=Adnum, Number=Dual, Number=Plur, Number=Sing

NOUN occurs with 72 feature combinations. The most frequent feature combination is Case=Gen|Gender=Masc|Number=Sing (600 tokens). Examples: государя, князя, году, отца, царя, маия, указу, февраля, августа, уѣзду
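Counts like these can be recomputed from the FEATS column of a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name is hypothetical, the three-token fragment is made up for illustration (it is not treebank data), and tokens without features are skipped here, which may differ from how the page counts them.

```python
from collections import Counter


def noun_feature_stats(conllu_text):
    """Count feature names and full feature combinations on NOUN tokens."""
    feature_names = Counter()
    combinations = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines (10 columns, integer ID) tagged NOUN.
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != "NOUN":
            continue
        feats = cols[5]
        if feats == "_":
            continue  # assumption: tokens with no features are not counted
        combinations[feats] += 1
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            feature_names[name] += 1
    return feature_names, combinations


# Made-up toy fragment in CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS,
# FEATS, HEAD, DEPREL, DEPS, MISC); not taken from the treebank.
toy_rows = [
    ["1", "отъ", "отъ", "ADP", "_", "_", "2", "case", "_", "_"],
    ["2", "государя", "государь", "NOUN", "_",
     "Case=Gen|Gender=Masc|Number=Sing", "0", "root", "_", "_"],
    ["3", "князя", "князь", "NOUN", "_",
     "Case=Gen|Gender=Masc|Number=Sing", "2", "appos", "_", "_"],
]
toy_conllu = "\n".join("\t".join(row) for row in toy_rows)

feature_names, combinations = noun_feature_stats(toy_conllu)
print(feature_names.most_common())
print(combinations.most_common(1))
```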

Relations

NOUN nodes are attached to their parents using 26 different relations: obl (1572; 24% instances), conj (1064; 16% instances), nmod (945; 14% instances), obj (806; 12% instances), appos (590; 9% instances), nsubj (586; 9% instances), iobj (345; 5% instances), parataxis (200; 3% instances), root (128; 2% instances), vocative (127; 2% instances), nsubj:pass (90; 1% instances), orphan (58; 1% instances), advcl (26; 0% instances), xcomp (20; 0% instances), acl:relcl (19; 0% instances), flat (19; 0% instances), flat:name (14; 0% instances), compound (9; 0% instances), acl (8; 0% instances), obl:agent (8; 0% instances), ccomp (3; 0% instances), dislocated (3; 0% instances), nummod:gov (3; 0% instances), csubj (1; 0% instances), fixed (1; 0% instances), nummod (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech (the entry without a tag is the artificial root, matching the 128 root attachments above): VERB (3398; 51% instances), NOUN (2239; 34% instances), PRON (363; 5% instances), PROPN (232; 3% instances), ADJ (179; 3% instances), [no tag: root] (128; 2% instances), ADV (54; 1% instances), DET (31; 0% instances), NUM (14; 0% instances), ADP (3; 0% instances), PART (3; 0% instances), AUX (2; 0% instances)
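The two tallies above (incoming relation and parent part of speech) can be sketched from the HEAD and DEPREL columns of a CoNLL-U file. This is an illustrative sketch only: the function name and the "ROOT" placeholder for tokens whose head is 0 are assumptions, and the toy fragment is made up rather than taken from the treebank.

```python
from collections import Counter


def noun_attachment_stats(conllu_text):
    """Count incoming relations and parent UPOS tags for NOUN tokens."""
    relations = Counter()
    parent_tags = Counter()
    for sentence in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in sentence.splitlines()
                if not line.startswith("#")]
        # Keep regular tokens only; skip multiword ranges and empty nodes.
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        upos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] != "NOUN":
                continue
            relations[r[7]] += 1
            # HEAD 0 has no token; "ROOT" is a placeholder chosen here.
            parent_tags[upos_by_id.get(r[6], "ROOT")] += 1
    return relations, parent_tags


# Made-up toy fragment, not treebank data.
toy_rows = [
    ["1", "отъ", "отъ", "ADP", "_", "_", "2", "case", "_", "_"],
    ["2", "государя", "государь", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["3", "князя", "князь", "NOUN", "_", "_", "2", "appos", "_", "_"],
]
toy_conllu = "\n".join("\t".join(row) for row in toy_rows)

relations, parent_tags = noun_attachment_stats(toy_conllu)
print(relations)
print(parent_tags)
```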

785 (12%) NOUN nodes are leaves.

2046 (31%) NOUN nodes have one child.

2087 (31%) NOUN nodes have two children.

1728 (26%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 15.
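The four bucket counts above add up to the NOUN token total: 785 + 2046 + 2087 + 1728 = 6646. A distribution of this kind could be recomputed from a CoNLL-U file roughly as follows. This is a sketch under assumptions: the function name is hypothetical, only regular tokens with integer IDs are considered, and the toy fragment is made up for illustration.

```python
from collections import Counter


def noun_child_degrees(conllu_text):
    """Map number of children -> number of NOUN nodes with that many children."""
    degrees = Counter()
    for sentence in conllu_text.strip().split("\n\n"):
        upos_by_id = {}
        child_count = Counter()
        for line in sentence.splitlines():
            if line.startswith("#"):
                continue
            cols = line.split("\t")
            # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            if len(cols) != 10 or not cols[0].isdigit():
                continue
            upos_by_id[cols[0]] = cols[3]
            child_count[cols[6]] += 1  # one more child for this token's head
        for token_id, tag in upos_by_id.items():
            if tag == "NOUN":
                degrees[child_count[token_id]] += 1
    return degrees


# Made-up toy fragment, not treebank data.
toy_rows = [
    ["1", "отъ", "отъ", "ADP", "_", "_", "2", "case", "_", "_"],
    ["2", "государя", "государь", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["3", "князя", "князь", "NOUN", "_", "_", "2", "appos", "_", "_"],
]
toy_conllu = "\n".join("\t".join(row) for row in toy_rows)

degrees = noun_child_degrees(toy_conllu)
print(degrees)

# Sanity check on the figures quoted on this page.
assert 785 + 2046 + 2087 + 1728 == 6646
```

In the toy fragment, the first noun has two children and the second is a leaf.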

Children of NOUN nodes are attached using 34 different relations: case (2405; 19% instances), amod (2223; 17% instances), punct (1975; 15% instances), det (1361; 11% instances), nmod (1070; 8% instances), conj (1069; 8% instances), cc (978; 8% instances), appos (796; 6% instances), nummod:gov (182; 1% instances), advmod (119; 1% instances), nummod (101; 1% instances), nsubj (99; 1% instances), parataxis (68; 1% instances), acl:relcl (64; 0% instances), obl (55; 0% instances), acl (51; 0% instances), cop (49; 0% instances), orphan (47; 0% instances), mark (26; 0% instances), iobj (25; 0% instances), flat (18; 0% instances), compound (14; 0% instances), aux (13; 0% instances), advcl (12; 0% instances), vocative (12; 0% instances), dep (9; 0% instances), expl (5; 0% instances), ccomp (4; 0% instances), flat:name (4; 0% instances), obj (4; 0% instances), csubj (2; 0% instances), discourse (1; 0% instances), dislocated (1; 0% instances), nsubj:pass (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: ADP (2402; 19% instances), ADJ (2272; 18% instances), NOUN (2239; 17% instances), PUNCT (1975; 15% instances), DET (1235; 10% instances), CCONJ (979; 8% instances), PROPN (721; 6% instances), PRON (310; 2% instances), NUM (296; 2% instances), VERB (194; 2% instances), PART (92; 1% instances), AUX (63; 0% instances), ADV (45; 0% instances), SCONJ (29; 0% instances), X (9; 0% instances), INTJ (1; 0% instances), SYM (1; 0% instances)