
This page pertains to UD version 2.

Treebank Statistics: UD_Kazakh-KTB: POS Tags: PROPN

There are 219 PROPN lemmas (8%), 285 PROPN types (6%) and 562 PROPN tokens (5%). Out of 17 observed tags, the rank of PROPN is: 4 in number of lemmas, 4 in number of types and 5 in number of tokens.

The 10 most frequent PROPN lemmas: Иран, Қазақстан, Астана, Азамат, Бекболат, Айгүл, Назарбаев, Нұрсұлтан, Шолпан, Алматы

The 10 most frequent PROPN types: Иран, Қазақстан, Бекболат, Азамат, Нұрсұлтан, Айгүл, Назарбаев, Қазақстанның, АҚШ, Шолпан

The most frequent ambiguous lemmas (only 2 observed): Ислам (NOUN 6, PROPN 1), салжұқ (NOUN 2, PROPN 1)

The most frequent ambiguous types (only 4 observed): Темір (PROPN 3, NOUN 1), Ислам (NOUN 5, PROPN 1), Отан (NOUN 1, PROPN 1), салжұқтар (NOUN 1, PROPN 1)

Morphology

The form / lemma ratio of PROPN is 1.301370 (the average of all parts of speech is 1.747153).
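The ratio above agrees with the counts reported earlier on this page: 285 PROPN types divided by 219 PROPN lemmas gives about 1.301370. The following is a minimal sketch of how such a ratio could be computed from CoNLL-U text, assuming it is defined as distinct word forms over distinct lemmas for the given tag. The function name `form_lemma_ratio` and the sample rows are hypothetical illustrations, not the script that generated this page.

```python
def form_lemma_ratio(conllu_text, upos="PROPN"):
    """Return (n_forms, n_lemmas, ratio) for one UPOS tag in CoNLL-U text.

    Assumption: the ratio is distinct forms / distinct lemmas.
    """
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:  # column 4 is UPOS
            forms.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    ratio = len(forms) / len(lemmas) if lemmas else 0.0
    return len(forms), len(lemmas), ratio


def make_row(idx, form, lemma, upos):
    """Build one CoNLL-U token line with unused columns left as '_'."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "0", "root", "_", "_"])


# Hypothetical four-token sample, for illustration only.
SAMPLE = "\n".join([
    make_row(1, "Қазақстан", "Қазақстан", "PROPN"),
    make_row(2, "Қазақстанның", "Қазақстан", "PROPN"),
    make_row(3, "Астана", "Астана", "PROPN"),
    make_row(4, "қала", "қала", "NOUN"),
])

n_forms, n_lemmas, ratio = form_lemma_ratio(SAMPLE)
```

On the sample, three distinct PROPN forms map to two lemmas, so the ratio is 1.5.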

The 1st highest number of forms (7) was observed with the lemma “Қазақстан”: Қазақстан, Қазақстанда, Қазақстандағы, Қазақстанды, Қазақстанмен, Қазақстанның, Қазақстанға.

The 2nd highest number of forms (6) was observed with the lemma “Астана”: Астана, Астанада, Астанадан, Астанадағы, Астананың, Астанаға.

The 3rd highest number of forms (5) was observed with the lemma “Азия”: Азия, Азияда, Азиядан, Азиядағы, Азияның.

PROPN occurs with 5 features: Case (562; 100% instances), Gender (193; 34% instances), Number (3; 1% instances), Number[psor] (2; 0% instances), Person[psor] (2; 0% instances)

PROPN occurs with 12 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Number=Plur, Number[psor]=Plur,Sing, Person[psor]=3
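Feature-value pairs such as these come from the FEATS column of CoNLL-U, where pairs are separated by `|` and a multi-valued feature such as `Number[psor]=Plur,Sing` stays a single pair. A minimal sketch of parsing one FEATS string is shown below; the helper name `parse_feats` and the example string are hypothetical and are not taken from the treebank.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into a {feature: value} dict.

    An underscore means the token has no features.
    """
    if feats == "_":
        return {}
    # Split only on the first '=' so the value is kept verbatim,
    # including comma-joined values such as 'Plur,Sing'.
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS string combining values listed on this page.
example = parse_feats("Case=Gen|Number[psor]=Plur,Sing|Person[psor]=3")
```

Counting the distinct `(feature, value)` items of such dictionaries over all PROPN tokens would yield the kind of pair inventory listed above.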

PROPN occurs with 19 feature combinations. The most frequent feature combination is Case=Nom (251 tokens). Examples: Иран, Қазақстан, АҚШ, Алматы, Астана, Бекболат, Азамат, Нұрсұлтан, Ұлыбритания, Айгүл

Relations

PROPN nodes are attached to their parents using 15 different relations: nmod:poss (181; 32% instances), nsubj (123; 22% instances), flat:name (72; 13% instances), obl (55; 10% instances), conj (47; 8% instances), appos (28; 5% instances), obj (14; 2% instances), nmod (12; 2% instances), root (9; 2% instances), amod (7; 1% instances), vocative (6; 1% instances), compound (5; 1% instances), advcl (1; 0% instances), csubj (1; 0% instances), parataxis (1; 0% instances)

Parents of PROPN nodes belong to 8 different parts of speech: NOUN (266; 47% instances), VERB (169; 30% instances), PROPN (94; 17% instances), ADJ (13; 2% instances), no part of speech, i.e. the artificial root (9; 2% instances), ADV (5; 1% instances), NUM (5; 1% instances), PRON (1; 0% instances)

365 (65%) PROPN nodes are leaves.

126 (22%) PROPN nodes have one child.

40 (7%) PROPN nodes have two children.

31 (6%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 7.
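The four child-degree counts above (365, 126, 40 and 31) sum to the 562 PROPN tokens. A minimal sketch of how such a distribution could be derived from dependency heads follows. It assumes each sentence is given as a list of `(id, upos, head)` tuples; that representation and the function name `child_degree_histogram` are hypothetical simplifications.

```python
from collections import Counter


def child_degree_histogram(sentences, upos="PROPN"):
    """Count nodes of one UPOS tag by number of children (0, 1, 2, 3+).

    sentences: list of sentences, each a list of (id, upos, head) tuples,
    where head is the id of the parent and 0 denotes the artificial root.
    """
    hist = Counter()
    for sent in sentences:
        # Number of children of each node = how often its id occurs as a head.
        n_children = Counter(head for _, _, head in sent)
        for node_id, tag, _ in sent:
            if tag == upos:
                # Bucket 3 stands for "three or more children".
                hist[min(n_children[node_id], 3)] += 1
    return hist


# Hypothetical three-token sentence: node 2 is the root and has two children.
toy = [[(1, "PROPN", 2), (2, "PROPN", 0), (3, "PUNCT", 2)]]
hist = child_degree_histogram(toy)
```

In the toy sentence, node 1 is a leaf and node 2 has two children, so the histogram has one entry in bucket 0 and one in bucket 2.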

Children of PROPN nodes are attached using 21 different relations: punct (97; 30% instances), conj (62; 19% instances), flat:name (54; 17% instances), amod (22; 7% instances), cc (15; 5% instances), appos (12; 4% instances), nsubj (7; 2% instances), case (6; 2% instances), compound (6; 2% instances), dep (6; 2% instances), acl (4; 1% instances), cop (4; 1% instances), obl (4; 1% instances), acl:relcl (3; 1% instances), advmod (3; 1% instances), det (3; 1% instances), discourse (3; 1% instances), nmod (3; 1% instances), nummod (2; 1% instances), orphan (2; 1% instances), parataxis (1; 0% instances)

Children of PROPN nodes belong to 14 different parts of speech: PUNCT (97; 30% instances), PROPN (94; 29% instances), NOUN (47; 15% instances), ADJ (22; 7% instances), CCONJ (15; 5% instances), VERB (9; 3% instances), X (7; 2% instances), ADP (6; 2% instances), NUM (6; 2% instances), AUX (4; 1% instances), ADV (3; 1% instances), DET (3; 1% instances), PART (3; 1% instances), PRON (3; 1% instances)