Treebank Statistics: UD_Kazakh-KTB: POS Tags: PROPN
There are 219 PROPN lemmas (8%), 285 PROPN types (6%) and 562 PROPN tokens (5%).
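Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts` and the toy sample are hypothetical, and it assumes that types are counted case-sensitively, which this page does not confirm.

```python
def upos_counts(conllu_text, upos="PROPN"):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Toy input for illustration only; it is not a sentence from the treebank.
SAMPLE = "\n".join([
    "# toy sentence, not from the treebank",
    "1\tҚазақстан\tҚазақстан\tPROPN\t_\tCase=Nom\t0\troot\t_\t_",
    "2\tҚазақстанның\tҚазақстан\tPROPN\t_\tCase=Gen\t1\tconj\t_\t_",
    "3\tАстана\tАстана\tPROPN\t_\tCase=Nom\t1\tconj\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
])

print(upos_counts(SAMPLE))  # (2, 3, 3)
```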
Out of 17 observed tags, the rank of PROPN is: 4 in number of lemmas, 4 in number of types and 5 in number of tokens.
The 10 most frequent PROPN lemmas: Иран, Қазақстан, Астана, Азамат, Бекболат, Айгүл, Назарбаев, Нұрсұлтан, Шолпан, Алматы
The 10 most frequent PROPN types: Иран, Қазақстан, Бекболат, Азамат, Нұрсұлтан, Айгүл, Назарбаев, Қазақстанның, АҚШ, Шолпан
The 10 most frequent ambiguous lemmas: Ислам (NOUN 6, PROPN 1), салжұқ (NOUN 2, PROPN 1)
The 10 most frequent ambiguous types: Темір (PROPN 3, NOUN 1), Ислам (NOUN 5, PROPN 1), Отан (NOUN 1, PROPN 1), салжұқтар (NOUN 1, PROPN 1)
Examples for the ambiguous type салжұқтар:
- NOUN 1: 12 ғасырда салжұқтар бірнеше сұлтандыққа бөлініп кетті де , осы ғасырдың аяғына таман бүкіл Иранды түркі қыпшақ тайпасынан шыққан Хорезм шаһы Текеш басып алды .
- PROPN 1: Көп ұзамай олар Ғазнауи әулеті әскерін талқандап , бүкіл Иранды және көрші елдерді басып алып салжұқтар мемлекетін құрды .
Morphology
The form / lemma ratio of PROPN is 1.301370 (the average of all parts of speech is 1.747153).
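The reported ratio is consistent with dividing the number of PROPN types (285) by the number of PROPN lemmas (219). A short check, assuming that this is how the figure was derived (the helper name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# 285 PROPN types and 219 PROPN lemmas, as reported on this page.
print(f"{form_lemma_ratio(285, 219):.6f}")  # 1.301370
```

A value close to 1 means that most lemmas appear in a single form. PROPN is below the treebank-wide average of 1.747153, so proper nouns here show less inflectional variety per lemma than the average part of speech.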
The 1st highest number of forms (7) was observed with the lemma “Қазақстан”: Қазақстан, Қазақстанда, Қазақстандағы, Қазақстанды, Қазақстанмен, Қазақстанның, Қазақстанға.
The 2nd highest number of forms (6) was observed with the lemma “Астана”: Астана, Астанада, Астанадан, Астанадағы, Астананың, Астанаға.
The 3rd highest number of forms (5) was observed with the lemma “Азия”: Азия, Азияда, Азиядан, Азиядағы, Азияның.
PROPN occurs with 5 features: Case (562; 100% instances), Gender (193; 34% instances), Number (3; 1% instances), Number[psor] (2; 0% instances), Person[psor] (2; 0% instances)
PROPN occurs with 12 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Number=Plur, Number[psor]=Plur,Sing, Person[psor]=3
PROPN occurs with 19 feature combinations.
The most frequent feature combination is Case=Nom (251 tokens).
Examples: Иран, Қазақстан, АҚШ, Алматы, Астана, Бекболат, Азамат, Нұрсұлтан, Ұлыбритания, Айгүл
Relations
PROPN nodes are attached to their parents using 15 different relations: nmod:poss (181; 32% instances), nsubj (123; 22% instances), flat:name (72; 13% instances), obl (55; 10% instances), conj (47; 8% instances), appos (28; 5% instances), obj (14; 2% instances), nmod (12; 2% instances), root (9; 2% instances), amod (7; 1% instances), vocative (6; 1% instances), compound (5; 1% instances), advcl (1; 0% instances), csubj (1; 0% instances), parataxis (1; 0% instances)
Parents of PROPN nodes belong to 8 different parts of speech: NOUN (266; 47% instances), VERB (169; 30% instances), PROPN (94; 17% instances), ADJ (13; 2% instances), (9; 2% instances), ADV (5; 1% instances), NUM (5; 1% instances), PRON (1; 0% instances)
365 (65%) PROPN nodes are leaves.
126 (22%) PROPN nodes have one child.
40 (7%) PROPN nodes have two children.
31 (6%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 7.
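The four groups above add up to the 562 PROPN tokens (365 + 126 + 40 + 31). A distribution like this could be computed roughly as follows. This is a sketch under assumptions: the function `child_degree_buckets` and the `(id, upos, head)` tuple layout are hypothetical simplifications of CoNLL-U rows, and the toy sentence is not treebank data.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="PROPN"):
    """Bucket nodes of one UPOS tag by number of children (0, 1, 2, 3 or more).

    `sentences` is a list of sentences, each a list of (id, upos, head) tuples.
    Returns the bucket counts and the highest child degree seen.
    """
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # How many tokens name each id as their head.
        degree = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag != upos:
                continue
            d = degree[tok_id]
            max_degree = max(max_degree, d)
            buckets[min(d, 3)] += 1  # key 3 stands for "three or more"
    return buckets, max_degree


# Toy sentence: token 2 is the root and has two children (tokens 1 and 3).
TOY = [(1, "PROPN", 2), (2, "PROPN", 0), (3, "PUNCT", 2)]
buckets, max_degree = child_degree_buckets([TOY])
print(dict(buckets), max_degree)  # {0: 1, 2: 1} 2
```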
Children of PROPN nodes are attached using 21 different relations: punct (97; 30% instances), conj (62; 19% instances), flat:name (54; 17% instances), amod (22; 7% instances), cc (15; 5% instances), appos (12; 4% instances), nsubj (7; 2% instances), case (6; 2% instances), compound (6; 2% instances), dep (6; 2% instances), acl (4; 1% instances), cop (4; 1% instances), obl (4; 1% instances), acl:relcl (3; 1% instances), advmod (3; 1% instances), det (3; 1% instances), discourse (3; 1% instances), nmod (3; 1% instances), nummod (2; 1% instances), orphan (2; 1% instances), parataxis (1; 0% instances)
Children of PROPN nodes belong to 14 different parts of speech: PUNCT (97; 30% instances), PROPN (94; 29% instances), NOUN (47; 15% instances), ADJ (22; 7% instances), CCONJ (15; 5% instances), VERB (9; 3% instances), X (7; 2% instances), ADP (6; 2% instances), NUM (6; 2% instances), AUX (4; 1% instances), ADV (3; 1% instances), DET (3; 1% instances), PART (3; 1% instances), PRON (3; 1% instances)