
This page pertains to UD version 2.

Treebank Statistics: UD_Kazakh-KTB: POS Tags: PROPN

There are 219 PROPN lemmas (8%), 285 PROPN types (6%) and 559 PROPN tokens (5%). Out of 17 observed tags, the rank of PROPN is: 4 in number of lemmas, 4 in number of types and 5 in number of tokens.
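The three counts above can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming the standard 10-column CoNLL-U layout (FORM in column 2, LEMMA in column 3, UPOS in column 4) and assuming that "types" means distinct, case-sensitive word forms. The function name `count_upos` and the toy rows are hypothetical; the rows reuse forms listed on this page but are not a real sentence from the treebank.

```python
def count_upos(conllu_text, upos="PROPN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword token ranges ("1-2")
        # and empty nodes ("1.1"): only plain integer IDs are words.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            types.add(cols[1])
            lemmas.add(cols[2])
            tokens += 1
    return len(lemmas), len(types), tokens


# Toy input (hypothetical, for illustration only).
TOY_ROWS = [
    ("1", "Қазақстан", "Қазақстан", "PROPN", "_", "Case=Nom", "0", "root", "_", "_"),
    ("2", "Қазақстанның", "Қазақстан", "PROPN", "_", "Case=Gen", "1", "dep", "_", "_"),
    ("3", "Қазақстан", "Қазақстан", "PROPN", "_", "Case=Nom", "1", "dep", "_", "_"),
]
TOY_TEXT = "\n".join("\t".join(row) for row in TOY_ROWS) + "\n"

lemma_count, type_count, token_count = count_upos(TOY_TEXT)
print(lemma_count, type_count, token_count)  # 1 2 3
```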

The 10 most frequent PROPN lemmas: Иран, Қазақстан, Астана, Азамат, Бекболат, Айгүл, Назарбаев, Нұрсұлтан, Шолпан, Алматы

The 10 most frequent PROPN types: Иран, Қазақстан, Бекболат, Азамат, Нұрсұлтан, Айгүл, Назарбаев, Қазақстанның, АҚШ, Шолпан

The 10 most frequent ambiguous lemmas: Елизавета (PROPN 5, NOUN 1), Ислам (NOUN 6, PROPN 1), Эдуард (NOUN 2, PROPN 1), салжұқ (NOUN 2, PROPN 1)

The 10 most frequent ambiguous types: Елизавета (PROPN 5, NOUN 1), Темір (PROPN 3, NOUN 1), Ислам (NOUN 5, PROPN 1), Отан (NOUN 1, PROPN 1), Эдуард (NOUN 2, PROPN 1), салжұқтар (NOUN 1, PROPN 1)

Morphology

The form / lemma ratio of PROPN is 1.301370 (the average of all parts of speech is 1.743774).
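The PROPN ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas reported at the top of this page (285 and 219). A short check, assuming that this is how the ratio is defined (the helper name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# 285 PROPN types and 219 PROPN lemmas, as reported on this page.
propn_ratio = form_lemma_ratio(285, 219)
print(f"{propn_ratio:.6f}")  # 1.301370
```

A ratio below the all-POS average of 1.743774 means that proper nouns in this treebank show fewer distinct inflected forms per lemma than the average part of speech.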

The 1st highest number of forms (7) was observed with the lemma “Қазақстан”: Қазақстан, Қазақстанда, Қазақстандағы, Қазақстанды, Қазақстанмен, Қазақстанның, Қазақстанға.

The 2nd highest number of forms (6) was observed with the lemma “Астана”: Астана, Астанада, Астанадан, Астанадағы, Астананың, Астанаға.

The 3rd highest number of forms (5) was observed with the lemma “Азия”: Азия, Азияда, Азиядан, Азиядағы, Азияның.

PROPN occurs with 5 features: Case (559; 100% instances), Gender (190; 34% instances), Number (3; 1% instances), Number[psor] (2; 0% instances), Person[psor] (2; 0% instances)

PROPN occurs with 12 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Number=Plur, Number[psor]=Plur,Sing, Person[psor]=3

PROPN occurs with 19 feature combinations. The most frequent feature combination is Case=Nom (251 tokens). Examples: Иран, Қазақстан, АҚШ, Алматы, Астана, Бекболат, Азамат, Нұрсұлтан, Ұлыбритания, Айгүл
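Feature combinations such as the ones counted above are stored in the FEATS column of CoNLL-U as `Name=Value` pairs joined by `|`, with `_` standing for "no features" and multiple values of one feature joined by commas. A minimal parsing sketch follows; the helper name `parse_feats` is hypothetical, and the example string is assembled from feature-value pairs listed on this page rather than copied from a specific token.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {feature: [values]}."""
    if feats == "_":
        return {}
    parsed = {}
    for pair in feats.split("|"):
        # Split on the first "=" only; names may contain brackets, e.g. Number[psor].
        name, value = pair.split("=", 1)
        parsed[name] = value.split(",")
    return parsed


example = parse_feats("Case=Gen|Number[psor]=Plur,Sing|Person[psor]=3")
print(example["Number[psor]"])  # ['Plur', 'Sing']
```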

Relations

PROPN nodes are attached to their parents using 16 different relations: nmod:poss (181; 32% instances), nsubj (121; 22% instances), flat:name (71; 13% instances), obl (55; 10% instances), conj (47; 8% instances), appos (27; 5% instances), obj (14; 3% instances), nmod (12; 2% instances), root (9; 2% instances), amod (7; 1% instances), vocative (6; 1% instances), compound (5; 1% instances), advcl (1; 0% instances), advmod (1; 0% instances), csubj (1; 0% instances), parataxis (1; 0% instances)
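As a sanity check, the relation counts above should add up to the 559 PROPN tokens, and each percentage should follow from its count. The sketch below assumes that percentages are rounded to the nearest integer, which matches the figures on this page; the variable names are hypothetical.

```python
# Incoming relation counts for PROPN nodes, copied from the list above.
PARENT_RELATIONS = {
    "nmod:poss": 181, "nsubj": 121, "flat:name": 71, "obl": 55,
    "conj": 47, "appos": 27, "obj": 14, "nmod": 12, "root": 9,
    "amod": 7, "vocative": 6, "compound": 5, "advcl": 1,
    "advmod": 1, "csubj": 1, "parataxis": 1,
}

total = sum(PARENT_RELATIONS.values())

# Share of each relation, rounded to a whole percent.
percentages = {rel: round(100 * n / total) for rel, n in PARENT_RELATIONS.items()}

print(total)                     # 559
print(percentages["nmod:poss"])  # 32
```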

Parents of PROPN nodes belong to 8 different parts of speech: NOUN (266; 48% instances), VERB (168; 30% instances), PROPN (94; 17% instances), ADJ (12; 2% instances), [no parent, root] (9; 2% instances), ADV (5; 1% instances), NUM (4; 1% instances), PRON (1; 0% instances)

363 (65%) PROPN nodes are leaves.

126 (23%) PROPN nodes have one child.

39 (7%) PROPN nodes have two children.

31 (6%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 7.
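The child-degree figures above can be derived from the HEAD column of a CoNLL-U sentence: a node's child degree is the number of words whose HEAD equals its ID. The sketch below buckets nodes into 0, 1, 2 and "3 or more" children, mirroring the breakdown on this page. The function name `child_degree_buckets` and the three-word toy sentence are hypothetical.

```python
from collections import Counter


def child_degree_buckets(sentence, upos="PROPN"):
    """sentence: list of (id, upos, head) tuples for one sentence.

    Returns a Counter mapping 0, 1, 2 or 3 (meaning "3 or more")
    to the number of nodes of the given UPOS with that many children.
    """
    # Count how many words point to each ID as their head.
    children = Counter(head for _, _, head in sentence)
    buckets = Counter()
    for node_id, tag, _ in sentence:
        if tag == upos:
            buckets[min(children[node_id], 3)] += 1
    return buckets


# Toy sentence (hypothetical): word 2 is the root and governs words 1 and 3.
toy_sentence = [(1, "PROPN", 2), (2, "PROPN", 0), (3, "PUNCT", 2)]
toy_buckets = child_degree_buckets(toy_sentence)
print(toy_buckets[0], toy_buckets[2])  # 1 1
```

Applied to the whole treebank, the four buckets reported on this page (363, 126, 39 and 31) sum to the 559 PROPN tokens.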

Children of PROPN nodes are attached using 21 different relations: punct (97; 31% instances), conj (62; 20% instances), flat:name (53; 17% instances), amod (22; 7% instances), cc (15; 5% instances), appos (11; 3% instances), nsubj (7; 2% instances), case (6; 2% instances), compound (6; 2% instances), dep (6; 2% instances), acl (4; 1% instances), advmod (4; 1% instances), cop (4; 1% instances), acl:relcl (3; 1% instances), det (3; 1% instances), discourse (3; 1% instances), nmod (3; 1% instances), obl (3; 1% instances), nummod (2; 1% instances), orphan (2; 1% instances), parataxis (1; 0% instances)

Children of PROPN nodes belong to 14 different parts of speech: PUNCT (97; 31% instances), PROPN (94; 30% instances), NOUN (45; 14% instances), ADJ (22; 7% instances), CCONJ (15; 5% instances), VERB (9; 3% instances), X (7; 2% instances), ADP (6; 2% instances), NUM (6; 2% instances), AUX (4; 1% instances), ADV (3; 1% instances), DET (3; 1% instances), PART (3; 1% instances), PRON (3; 1% instances)