
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-SynTagRus: POS Tags: PROPN

There are 8664 PROPN lemmas (19%), 12062 PROPN types (10%) and 41593 PROPN tokens (4%). Among the 17 observed tags, PROPN ranks 3rd by number of lemmas, 4th by number of types and 8th by number of tokens.

The 10 most frequent PROPN lemmas: Россия, Москва, США, Путин, Владимир, СССР, Европа, Сергей, Александр, Земля

The 10 most frequent PROPN types: России, США, СССР, Россия, В., Путин, А., Москве, Владимир, Сергей

The 10 most frequent ambiguous lemmas:

The 10 most frequent ambiguous types: США (PROPN 487, NOUN 7), СССР (PROPN 293, NOUN 4), Земли (PROPN 123, NOUN 3), РАН (PROPN 115, NOUN 1), Института (PROPN 107, NOUN 1), НДС (PROPN 101, NOUN 2), ВВП (PROPN 91, NOUN 2), МГУ (PROPN 87, NOUN 2), ЕС (PROPN 71, NOUN 1), ООН (PROPN 70, NOUN 1)
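The counts above can be turned into a rough measure of how dominant the PROPN reading is for each ambiguous type. The following is a minimal sketch using the first three entries of the list; the dictionary layout and variable names are illustrative assumptions, not part of the treebank tooling.

```python
# (PROPN count, NOUN count) for the first three ambiguous types listed above.
ambiguous_types = {
    "США": (487, 7),
    "СССР": (293, 4),
    "Земли": (123, 3),
}

# Share of occurrences in which each type is tagged PROPN.
propn_share = {
    form: propn / (propn + noun)
    for form, (propn, noun) in ambiguous_types.items()
}

for form, share in propn_share.items():
    print(f"{form}: {share:.1%} PROPN")
```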

Morphology

The form / lemma ratio of PROPN is 1.392198 (the average of all parts of speech is 2.589298).
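The ratio reported above is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. A minimal sketch of that arithmetic, assuming this is how the figure is derived (the variable names are illustrative):

```python
# Counts reported on this page for PROPN in UD_Russian-SynTagRus.
propn_types = 12062   # distinct word forms tagged PROPN
propn_lemmas = 8664   # distinct lemmas tagged PROPN

# Assumption: the form / lemma ratio is types divided by lemmas.
form_lemma_ratio = propn_types / propn_lemmas
print(f"{form_lemma_ratio:.6f}")
```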

The highest number of forms (10) was observed with the lemma “Южный”: Южная, Южно, Южного, Южной, Южном, Южному, Южную, Южные, Южный, Южных.

The second highest number of forms (9) was observed with the lemma “Европейский”: Европейская, Европейский, Европейским, Европейского, Европейское, Европейской, Европейском, Европейскому, Европейскую.

The third highest number of forms (9) was observed with the lemma “Северный”: Северная, Северно, Северного, Северной, Северном, Северному, Северный, Северным, Северо.

PROPN occurs with 6 features: Number (37434; 90% instances), Animacy (37430; 90% instances), Case (37426; 90% instances), Gender (36937; 89% instances), Foreign (2297; 6% instances), Degree (6; 0% instances)

PROPN occurs with 16 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Degree=Pos, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing

PROPN occurs with 84 feature combinations. The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing (9697 tokens). Examples: Путин, Владимир, Сергей, Александр, Галилей, В., Медведев, А., Илья, Монахов
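A feature combination such as the one above follows the CoNLL-U FEATS convention: `Feature=Value` pairs joined by `|`, with `_` standing for an empty set. A minimal sketch of splitting such a string into a mapping; `parse_feats` is a hypothetical helper name, not a function from any UD tool.

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split a CoNLL-U FEATS string into a feature -> value mapping."""
    if feats == "_":  # CoNLL-U uses "_" for an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# The most frequent PROPN feature combination reported on this page.
combo = "Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing"
print(parse_feats(combo))
```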

Relations

PROPN nodes are attached to their parents using 27 different relations: nmod (11527; 28% instances), nsubj (5895; 14% instances), flat:name (5478; 13% instances), appos (5172; 12% instances), obl (4575; 11% instances), conj (2804; 7% instances), flat:foreign (1789; 4% instances), amod (914; 2% instances), obj (898; 2% instances), parataxis (750; 2% instances), iobj (643; 2% instances), root (523; 1% instances), nsubj:pass (215; 1% instances), flat (94; 0% instances), advcl (76; 0% instances), compound (74; 0% instances), orphan (73; 0% instances), acl (36; 0% instances), ccomp (21; 0% instances), vocative (9; 0% instances), acl:relcl (6; 0% instances), csubj (5; 0% instances), fixed (5; 0% instances), nummod:entity (4; 0% instances), xcomp (3; 0% instances), advmod (2; 0% instances), mark (2; 0% instances)

Parents of PROPN nodes belong to 14 different categories, counting root nodes (which have no parent) as a category of their own: NOUN (18156; 44% instances), VERB (11270; 27% instances), PROPN (10086; 24% instances), ADJ (765; 2% instances), [no parent] (523; 1% instances), ADV (273; 1% instances), PRON (148; 0% instances), NUM (106; 0% instances), PART (94; 0% instances), DET (89; 0% instances), SYM (48; 0% instances), ADP (22; 0% instances), X (11; 0% instances), CCONJ (2; 0% instances)
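Counts like the relation and parent-tag distributions above can be gathered directly from the ten tab-separated columns of a CoNLL-U file (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The sketch below is an illustration only: the sample sentence is hand-made rather than taken from the treebank, and `propn_attachment_stats` is a hypothetical helper, not the script that generated this page.

```python
from collections import Counter

# Hand-made toy sentence in CoNLL-U format (not taken from the treebank).
SAMPLE = "\n".join([
    "1\tПутин\tПутин\tPROPN\t_\t_\t2\tnsubj\t_\t_",
    "2\tпосетил\tпосетить\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tМоскву\tМосква\tPROPN\t_\t_\t2\tobj\t_\t_",
])

def propn_attachment_stats(conllu: str):
    """Count deprels and parent UPOS tags of PROPN nodes, sentence by sentence."""
    deprels, parent_pos = Counter(), Counter()
    for block in conllu.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep regular tokens only (skip ranges like 1-2 and empty nodes like 1.1).
        rows = [r for r in rows if r[0].isdigit()]
        upos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] != "PROPN":
                continue
            deprels[r[7]] += 1
            # HEAD 0 marks the root, which has no parent; it is counted under "".
            parent_pos[upos_by_id.get(r[6], "")] += 1
    return deprels, parent_pos

deprels, parent_pos = propn_attachment_stats(SAMPLE)
print(deprels, parent_pos)
```

Root nodes are counted under an empty label here because HEAD 0 has no part-of-speech tag of its own.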

19900 (48%) PROPN nodes are leaves.

13548 (33%) PROPN nodes have one child.

5009 (12%) PROPN nodes have two children.

3136 (8%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 19.
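The four child-degree buckets above should partition all PROPN tokens, and the rounded percentages can be recomputed from the raw counts. A short sanity check using only figures from this page (the dictionary keys are illustrative labels):

```python
# Child-degree counts reported on this page for PROPN nodes.
degree_counts = {"0": 19900, "1": 13548, "2": 5009, "3+": 3136}
total_tokens = 41593  # PROPN tokens reported at the top of the page

# The four buckets should add up to the total number of PROPN tokens.
assert sum(degree_counts.values()) == total_tokens

# Recompute the percentage shares, rounded to whole percent.
shares = {k: round(100 * v / total_tokens) for k, v in degree_counts.items()}
print(shares)
```

Because each share is rounded independently, the rounded percentages need not sum to exactly 100.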

Children of PROPN nodes are attached using 34 different relations: case (7676; 22% instances), punct (7591; 22% instances), flat:name (5372; 15% instances), conj (3132; 9% instances), nmod (2254; 6% instances), amod (1990; 6% instances), cc (1614; 5% instances), appos (1292; 4% instances), parataxis (619; 2% instances), advmod (586; 2% instances), nummod (461; 1% instances), acl:relcl (410; 1% instances), acl (381; 1% instances), flat:foreign (380; 1% instances), nsubj (310; 1% instances), det (276; 1% instances), mark (262; 1% instances), obl (180; 1% instances), orphan (122; 0% instances), flat (88; 0% instances), cop (87; 0% instances), discourse (30; 0% instances), compound (25; 0% instances), nummod:gov (20; 0% instances), advcl (19; 0% instances), iobj (8; 0% instances), obj (7; 0% instances), nummod:entity (5; 0% instances), nsubj:pass (4; 0% instances), aux:pass (3; 0% instances), fixed (3; 0% instances), aux (2; 0% instances), csubj:pass (1; 0% instances), xcomp (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: PROPN (10086; 29% instances), ADP (7710; 22% instances), PUNCT (7591; 22% instances), NOUN (3064; 9% instances), ADJ (1797; 5% instances), CCONJ (1597; 5% instances), VERB (1025; 3% instances), PART (555; 2% instances), NUM (533; 2% instances), ADV (392; 1% instances), DET (298; 1% instances), SCONJ (261; 1% instances), PRON (144; 0% instances), AUX (79; 0% instances), X (55; 0% instances), SYM (23; 0% instances), INTJ (1; 0% instances)