
This page pertains to UD version 2.

Treebank Statistics: UD_Kyrgyz-KTMU: POS Tags: PROPN

There are 273 PROPN lemmas (12%), 334 PROPN types (10%) and 670 PROPN tokens (9%). Out of 13 observed tags, the rank of PROPN is: 3 in number of lemmas, 3 in number of types and 4 in number of tokens.

The 10 most frequent PROPN lemmas: кыргызстан, Бишкек, Ош, Ысык-Көл, Жалал-Абад, Россия, Казакстан, кыргыз, Токтогул, Баткен

The 10 most frequent PROPN types: Кыргызстанда, Бишкекте, Ош, кыргызстан, Ысык-Көлдө, Ошто, Жалал-Абад, Бишкек, Кыргызстандын, Бишкектеги

The 10 most frequent ambiguous lemmas: Бишкек (PROPN 68, NOUN 1), Россия (PROPN 16, NOUN 1), Цейлон (PROPN 5, NOUN 2), кыргызстандык (NOUN 7, PROPN 3), АКШ (PROPN 3, ADJ 1), Кумтөр (PROPN 3, NOUN 1), ата (NOUN 7, PROPN 1, VERB 1), Иңкамал (PROPN 2, NOUN 1), Кытай (PROPN 2, ADJ 1, NOUN 1), Султанмурат (NOUN 2, PROPN 2)

The 10 most frequent ambiguous types: Кыргызстандын (PROPN 10, NOUN 2), Бишкектеги (PROPN 9, NOUN 1), Россиядагы (PROPN 5, NOUN 1), ГЭС (PROPN 3, NOUN 1), Цейлондо (PROPN 3, NOUN 2), Ата (PROPN 2, NOUN 1), Иңкамал (PROPN 2, NOUN 1), Кыргызстандык (PROPN 2, NOUN 1), Султанмураттын (PROPN 2, NOUN 1), Чал (NOUN 3, PROPN 2)

Morphology

The form / lemma ratio of PROPN is 1.223443 (the average of all parts of speech is 1.500863).
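The reported ratio is consistent with dividing the number of distinct PROPN types by the number of distinct PROPN lemmas given at the top of the page (334 and 273). A minimal sketch of that arithmetic follows; the function name `form_lemma_ratio` is hypothetical and not part of any UD tool.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the average number of distinct word forms per distinct lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts reported on this page for PROPN: 334 types and 273 lemmas.
propn_ratio = form_lemma_ratio(334, 273)
print(f"{propn_ratio:.6f}")  # 1.223443
```

A value below the all-tag average (1.500863) suggests that proper nouns in this treebank appear in fewer distinct inflected forms per lemma than words of other categories do on average.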

The highest number of forms (7) was observed with the lemma “Кыргызстан”: Кыргызстан, Кыргызстанга, Кыргызстанда, Кыргызстандагы, Кыргызстандан, Кыргызстандык, Кыргызстандын.

The second-highest number of forms (6) was observed with the lemma “Бишкек”: Бишкек, Бишкекте, Бишкектеги, Бишкектен, Бишкекти, Бишкектин.

The third-highest number of forms (5) was observed with the lemma “Ош”: Ош, Ошко, Ошто, Оштогу, Оштун.

PROPN occurs with 7 features: Case (667; 100% instances), Number (667; 100% instances), Person (613; 91% instances), Person[psor] (74; 11% instances), Abbr (42; 6% instances), Number[psor] (42; 6% instances), PronType (2; 0% instances)

PROPN occurs with 17 feature-value pairs: Abbr=Yes, Case=Abl, Case=Abl,Gen, Case=Acc, Case=Dat, Case=Equ, Case=Gen, Case=Loc, Case=Nom, Number=Plur, Number=Sing, Number[psor]=Sing, Person=2, Person=3, Person[psor]=2, Person[psor]=3, PronType=Prs

PROPN occurs with 32 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing|Person=3 (295 tokens). Examples: Ош, кыргызстан, Жалал-Абад, Бишкек, Токтогул, кыргыз, Ысык-Көл, Бишкекте, Алымкадыр, Баткен
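The three kinds of counts above (features, feature-value pairs, and full feature combinations) can all be derived from the FEATS column of a CoNLL-U file. The sketch below is only an illustration: the helper names `parse_feats` and `feature_stats` are hypothetical, and the four sample FEATS strings are made up for the example rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats: str) -> dict[str, str]:
    """Split a CoNLL-U FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs, and whole feature combinations."""
    feature_counts = Counter()
    pair_counts = Counter()
    combo_counts = Counter()
    for feats in feats_column:
        combo_counts[feats] += 1
        for name, value in parse_feats(feats).items():
            feature_counts[name] += 1
            # A multi-valued feature such as Case=Abl,Gen stays one pair.
            pair_counts[f"{name}={value}"] += 1
    return feature_counts, pair_counts, combo_counts


# Hypothetical FEATS values for four PROPN tokens (illustration only).
sample = [
    "Case=Nom|Number=Sing|Person=3",
    "Case=Loc|Number=Sing|Person=3",
    "Case=Nom|Number=Sing|Person=3",
    "Abbr=Yes|Case=Nom|Number=Sing",
]
features, pairs, combos = feature_stats(sample)
print(combos.most_common(1))
```

Keeping the raw FEATS string as the key for combinations works because CoNLL-U requires features to be listed in a fixed sorted order, so identical combinations yield identical strings.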

Relations

PROPN nodes are attached to their parents using 12 different relations: nmod (248; 37% instances), obl (216; 32% instances), nsubj (83; 12% instances), conj (27; 4% instances), nmod:poss (25; 4% instances), fixed (23; 3% instances), flat (21; 3% instances), compound (11; 2% instances), root (9; 1% instances), amod (3; 0% instances), obj (3; 0% instances), csubj (1; 0% instances)
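The percentages above can be reproduced from the raw counts, which sum to the 670 PROPN tokens. The following sketch assumes that each percentage is the count divided by that total, rounded to the nearest integer; the helper name `percentages` is hypothetical.

```python
def percentages(counts: dict[str, int]) -> dict[str, int]:
    """Convert raw counts to whole-number percentages of their total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Relation counts for PROPN nodes as reported on this page.
relation_counts = {
    "nmod": 248, "obl": 216, "nsubj": 83, "conj": 27, "nmod:poss": 25,
    "fixed": 23, "flat": 21, "compound": 11, "root": 9, "amod": 3,
    "obj": 3, "csubj": 1,
}
relation_pct = percentages(relation_counts)
print(sum(relation_counts.values()), relation_pct["nmod"], relation_pct["obl"])
```

Relations shown as 0% (amod, obj, csubj) are therefore attested but account for less than half a percent of PROPN tokens.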

Parents of PROPN nodes belong to 11 different parts of speech: VERB (301; 45% instances), NOUN (244; 36% instances), PROPN (94; 14% instances), ADJ (13; 2% instances), no part of speech, i.e. the artificial root (9; 1% instances), CCONJ (2; 0% instances), NUM (2; 0% instances), PRON (2; 0% instances), ADP (1; 0% instances), ADV (1; 0% instances), PUNCT (1; 0% instances)

488 (73%) PROPN nodes are leaves.

143 (21%) PROPN nodes have one child.

31 (5%) PROPN nodes have two children.

8 (1%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 4.
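The child-degree figures above (leaves, one child, two children, and so on) can be computed by counting, for each PROPN node, how many other nodes name it in the HEAD column. The sketch below is a minimal illustration: the helpers `child_degrees` and `make_row` are hypothetical, and the two toy sentences carry made-up annotation rather than data from the treebank.

```python
from collections import Counter


def child_degrees(conllu_text: str, upos: str = "PROPN") -> Counter:
    """Map number of children -> number of nodes with the given UPOS tag."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = []
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip comment lines
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
            rows.append(cols)
        n_children = Counter(cols[6] for cols in rows)  # column 7 is HEAD
        for cols in rows:
            if cols[3] == upos:  # column 4 is UPOS
                degrees[n_children[cols[0]]] += 1
    return degrees


def make_row(idx, form, upos, head, deprel):
    """Build a 10-column CoNLL-U line with unused columns left as '_'."""
    return "\t".join([str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"])


# Toy input with hypothetical annotation (illustration only).
toy = "\n".join([
    "# toy sentence 1",
    make_row(1, "Ош", "PROPN", 0, "root"),
    make_row(2, "жана", "CCONJ", 3, "cc"),
    make_row(3, "Бишкек", "PROPN", 1, "conj"),
    "",
    "# toy sentence 2",
    make_row(1, "Баткен", "PROPN", 0, "root"),
])
degrees = child_degrees(toy)
print(sorted(degrees.items()))
```

On this page the four reported buckets (488, 143, 31 and 8) sum to the 670 PROPN tokens, so every PROPN node falls into exactly one of them.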

Children of PROPN nodes are attached using 19 different relations: punct (46; 20% instances), nmod (35; 15% instances), conj (34; 15% instances), fixed (25; 11% instances), flat (22; 9% instances), cc (20; 9% instances), compound (11; 5% instances), nmod:poss (7; 3% instances), obl (7; 3% instances), advmod (6; 3% instances), case (5; 2% instances), amod (4; 2% instances), acl (3; 1% instances), advmod:emph (2; 1% instances), compound:svc (1; 0% instances), det (1; 0% instances), mark (1; 0% instances), nsubj (1; 0% instances), nummod (1; 0% instances)

Children of PROPN nodes belong to 9 different parts of speech: PROPN (94; 41% instances), PUNCT (46; 20% instances), NOUN (45; 19% instances), CCONJ (27; 12% instances), ADV (9; 4% instances), VERB (6; 3% instances), ADJ (2; 1% instances), NUM (2; 1% instances), DET (1; 0% instances)