
This page pertains to UD version 2.

Treebank Statistics: UD_Ukrainian-IU: POS Tags: PROPN

There are 1578 PROPN lemmas (8%), 1996 PROPN types (6%) and 3523 PROPN tokens (3%). Out of 17 observed tags, the rank of PROPN is: 4 in number of lemmas, 4 in number of types and 11 in number of tokens.
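As an illustration of how the three counts above relate, here is a minimal sketch that tallies distinct lemmas, distinct word forms (types) and tokens for one tag from CoNLL-U text. The function name `pos_counts` and the sample sentence are hypothetical, and the sketch assumes types are compared case-sensitively; the script that produced this page may treat case differently.

```python
from collections import namedtuple

PosCounts = namedtuple("PosCounts", "lemmas types tokens")

def pos_counts(conllu_text, upos):
    """Count distinct lemmas, distinct word forms (types) and tokens for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)
            lemmas.add(lemma)
    return PosCounts(len(lemmas), len(types), tokens)

# Tiny hand-made sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = Київ і Києва",
    "1\tКиїв\tКиїв\tPROPN\t_\t_\t0\troot\t_\t_",
    "2\tі\tі\tCCONJ\t_\t_\t3\tcc\t_\t_",
    "3\tКиєва\tКиїв\tPROPN\t_\t_\t1\tconj\t_\t_",
])

print(pos_counts(SAMPLE, "PROPN"))  # PosCounts(lemmas=1, types=2, tokens=2)
```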

The 10 most frequent PROPN lemmas: Україна, Київ, США, Львів, Марія, Європа, Росія, Міра, Андрій, Вінстон

The 10 most frequent PROPN types: україни, Україні, США, Україна, київ, Міра, в, Києва, Росії, Вінстон

The 10 most frequent ambiguous lemmas: А (PROPN 8, X 1), В. (PROPN 7, NOUN 1), К (PROPN 4, X 1), Гнатів (PROPN 3, ADJ 1), С (PROPN 2, ADJ 1, NOUN 1), од (ADP 16, PROPN 2), Е (ADJ 1, PROPN 1), З. (NOUN 1, PROPN 1), Мальборо (NOUN 1, PROPN 1), карий (ADJ 1, PROPN 1)

The 10 most frequent ambiguous types: в (ADP 1381, NOUN 1, PROPN 1, X 1), о (ADP 7, INTJ 2, NOUN 1, PART 1, PROPN 1), і (CCONJ 1833, PART 211, DET 2, PROPN 2, NOUN 1), А (CCONJ 168, PART 30, PROPN 10, INTJ 1, X 1), Ради (NOUN 13, PROPN 10), б (AUX 68, PART 19, PROPN 2, X 1), К (PROPN 8, X 1), с (NOUN 10, PROPN 1), д (ADV 8, PROPN 1), Заповіт (PROPN 4, NOUN 3)
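The ambiguity lists above pair each form with per-tag counts. A minimal sketch of that grouping, assuming the input is a sequence of (form, UPOS) pairs, could look as follows; `ambiguous_types` is a hypothetical helper and the pairs are made up for illustration, not drawn from the treebank.

```python
from collections import Counter, defaultdict

def ambiguous_types(pairs):
    """Map each word form seen with more than one tag to its per-tag counts."""
    by_form = defaultdict(Counter)
    for form, tag in pairs:
        by_form[form][tag] += 1
    # Keep only forms that occur with at least two different tags.
    return {form: tags for form, tags in by_form.items() if len(tags) > 1}

# Hand-made (form, UPOS) pairs for illustration only.
pairs = [("Ради", "NOUN"), ("Ради", "PROPN"), ("Ради", "NOUN"), ("Київ", "PROPN")]
result = ambiguous_types(pairs)
print(result["Ради"].most_common())  # [('NOUN', 2), ('PROPN', 1)]
```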

Morphology

The form / lemma ratio of PROPN is 1.264892 (the average of all parts of speech is 1.738999).
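The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given earlier on this page. The short check below reproduces it; that this is exactly how the generating script defines the ratio is an assumption.

```python
propn_types = 1996   # PROPN types reported on this page
propn_lemmas = 1578  # PROPN lemmas reported on this page

# Assumed definition: distinct forms divided by distinct lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # 1.264892
```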

The 1st highest number of forms (5) was observed with the lemma “Джон”: Джона, Джонові, Джоном, Джону, джон.

The 2nd highest number of forms (5) was observed with the lemma “Дніпро”: Дніпра, Дніпро, Дніпром, Дніпру, Дніпрі.

The 3rd highest number of forms (5) was observed with the lemma “Німеччина”: Німеччина, Німеччини, Німеччиною, Німеччину, Німеччині.

PROPN occurs with 8 features: Animacy (3523; 100% instances), Case (3523; 100% instances), Number (3482; 99% instances), Gender (3409; 97% instances), NameType (1999; 57% instances), Uninflect (553; 16% instances), Abbr (214; 6% instances), Orth (7; 0% instances)

PROPN occurs with 21 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Gender=Neut, NameType=Giv, NameType=Pat, NameType=Sur, Number=Plur, Number=Ptan, Number=Sing, Orth=Alt, Uninflect=Yes

PROPN occurs with 156 feature combinations. The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc|NameType=Sur|Number=Sing (362 tokens). Examples: Щербачов, Морріс, Сосницький, Кетлінг, Курбас, Сковорода, Зеров, Коцький, Купецький, Порошенко
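A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, as in the CoNLL-U FEATS column. A minimal sketch of splitting it into a dictionary follows; `parse_feats` is a hypothetical helper name.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a feature -> value dictionary."""
    if feats == "_":
        return {}  # "_" marks an empty feature set in CoNLL-U
    return dict(pair.split("=", 1) for pair in feats.split("|"))

combo = "Animacy=Anim|Case=Nom|Gender=Masc|NameType=Sur|Number=Sing"
feats = parse_feats(combo)
print(feats["NameType"])  # Sur
print(len(feats))         # 5
```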

Relations

PROPN nodes are attached to their parents using 27 different relations: nmod (908; 26% instances), nsubj (538; 15% instances), flat:name (530; 15% instances), flat:title (488; 14% instances), obl (381; 11% instances), conj (260; 7% instances), obj (127; 4% instances), appos (80; 2% instances), root (55; 2% instances), parataxis (29; 1% instances), vocative (25; 1% instances), iobj (19; 1% instances), advcl (16; 0% instances), compound (13; 0% instances), flat:range (13; 0% instances), orphan (9; 0% instances), xcomp:pred (9; 0% instances), acl:relcl (3; 0% instances), discourse (3; 0% instances), flat (3; 0% instances), flat:repeat (3; 0% instances), list (3; 0% instances), acl (2; 0% instances), csubj (2; 0% instances), parataxis:discourse (2; 0% instances), dislocated (1; 0% instances), xcomp (1; 0% instances)

Parents of PROPN nodes belong to 11 different parts of speech (counting the empty category of PROPN nodes that have no parent because they are attached as root): NOUN (1522; 43% instances), VERB (957; 27% instances), PROPN (817; 23% instances), ADJ (100; 3% instances), no parent, i.e. root (55; 2% instances), PRON (28; 1% instances), ADV (21; 1% instances), DET (9; 0% instances), NUM (9; 0% instances), X (4; 0% instances), SYM (1; 0% instances)
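The parent counts above add up to the 3523 PROPN tokens, and the percentages are each count's share of that total, rounded to a whole number. The sketch below checks this. The entry with an empty label has no part of speech on the page; reading it as root attachment is an assumption, based on its count matching the 55 `root` relations.

```python
parent_pos = {
    "NOUN": 1522, "VERB": 957, "PROPN": 817, "ADJ": 100,
    "": 55,  # no part of speech given on the page; assumed to be root attachments
    "PRON": 28, "ADV": 21, "DET": 9, "NUM": 9, "X": 4, "SYM": 1,
}

total = sum(parent_pos.values())
# Share of each parent category, rounded to whole percent.
shares = {pos: round(100 * n / total) for pos, n in parent_pos.items()}

print(total)           # 3523
print(shares["NOUN"])  # 43
print(shares["VERB"])  # 27
```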

1646 (47%) PROPN nodes are leaves.

1138 (32%) PROPN nodes have one child.

439 (12%) PROPN nodes have two children.

300 (9%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 19.
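The four child-degree counts above (1646, 1138, 439 and 300) sum to the 3523 PROPN tokens. A minimal sketch of how such a histogram can be derived from the HEAD column is shown below; `child_degree_histogram` and the four-token example are hypothetical and only illustrate the counting.

```python
from collections import Counter

def child_degree_histogram(heads, is_target):
    """Count how many target nodes have 0, 1, 2, ... children.

    heads[i] is the 1-based head of token i+1 (0 means root);
    is_target[i] says whether token i+1 carries the tag of interest.
    """
    # Number of dependents per head position; the artificial root (0) is ignored.
    children = Counter(h for h in heads if h > 0)
    return Counter(children[i + 1] for i, flag in enumerate(is_target) if flag)

# Hand-made sentence: token 1 is the root and heads tokens 2 and 3;
# token 4 depends on token 3.
heads = [0, 1, 1, 3]
is_target = [True, False, True, True]
hist = child_degree_histogram(heads, is_target)
print(sorted(hist.items()))  # [(0, 1), (1, 1), (2, 1)]
```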

Children of PROPN nodes are attached using 29 different relations: punct (947; 30% instances), case (653; 21% instances), flat:name (530; 17% instances), conj (292; 9% instances), amod (159; 5% instances), cc (121; 4% instances), appos (83; 3% instances), parataxis (54; 2% instances), discourse (50; 2% instances), acl:relcl (42; 1% instances), det (39; 1% instances), nmod (31; 1% instances), flat:title (27; 1% instances), mark (27; 1% instances), advmod (24; 1% instances), orphan (15; 0% instances), nsubj (12; 0% instances), flat:range (11; 0% instances), compound (6; 0% instances), cop (6; 0% instances), nummod (5; 0% instances), nummod:gov (5; 0% instances), list (4; 0% instances), flat:repeat (3; 0% instances), obl (3; 0% instances), acl:adv (1; 0% instances), advcl (1; 0% instances), flat (1; 0% instances), parataxis:newsent (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: PUNCT (947; 30% instances), PROPN (817; 26% instances), ADP (653; 21% instances), ADJ (178; 6% instances), NOUN (165; 5% instances), CCONJ (121; 4% instances), PART (53; 2% instances), VERB (49; 2% instances), DET (44; 1% instances), NUM (39; 1% instances), SCONJ (26; 1% instances), ADV (21; 1% instances), X (18; 1% instances), PRON (12; 0% instances), AUX (6; 0% instances), INTJ (2; 0% instances), SYM (2; 0% instances)