Treebank Statistics: UD_Kyrgyz-KTMU: POS Tags: PROPN
There are 273 PROPN lemmas (12%), 334 PROPN types (10%) and 670 PROPN tokens (9%).
Out of 13 observed tags, the rank of PROPN is: 3 in number of lemmas, 3 in number of types and 4 in number of tokens.
The 10 most frequent PROPN lemmas: кыргызстан, Бишкек, Ош, Ысык-Көл, Жалал-Абад, Россия, Казакстан, кыргыз, Токтогул, Баткен
The 10 most frequent PROPN types: Кыргызстанда, Бишкекте, Ош, кыргызстан, Ысык-Көлдө, Ошто, Жалал-Абад, Бишкек, Кыргызстандын, Бишкектеги
The 10 most frequent ambiguous lemmas: Бишкек (PROPN 68, NOUN 1), Россия (PROPN 16, NOUN 1), Цейлон (PROPN 5, NOUN 2), кыргызстандык (NOUN 7, PROPN 3), АКШ (PROPN 3, ADJ 1), Кумтөр (PROPN 3, NOUN 1), ата (NOUN 7, PROPN 1, VERB 1), Иңкамал (PROPN 2, NOUN 1), Кытай (PROPN 2, ADJ 1, NOUN 1), Султанмурат (NOUN 2, PROPN 2)
The 10 most frequent ambiguous types: Кыргызстандын (PROPN 10, NOUN 2), Бишкектеги (PROPN 9, NOUN 1), Россиядагы (PROPN 5, NOUN 1), ГЭС (PROPN 3, NOUN 1), Цейлондо (PROPN 3, NOUN 2), Ата (PROPN 2, NOUN 1), Иңкамал (PROPN 2, NOUN 1), Кыргызстандык (PROPN 2, NOUN 1), Султанмураттын (PROPN 2, NOUN 1), Чал (NOUN 3, PROPN 2)
Morphology
The form / lemma ratio of PROPN is 1.223443 (the average of all parts of speech is 1.500863).
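The reported ratio matches the number of PROPN types divided by the number of PROPN lemmas (334 / 273). A minimal Python sketch of that calculation follows, assuming tokens are available as (form, lemma, upos) triples. The function name and input shape are illustrative and not part of any treebank tooling, and case-sensitive comparison of forms is an assumption.

```python
def form_lemma_ratio(tokens, upos="PROPN"):
    """Return distinct forms / distinct lemmas for one UPOS tag.

    `tokens` is an iterable of (form, lemma, upos) triples (assumed shape).
    Forms and lemmas are compared case-sensitively (an assumption).
    """
    forms = set()
    lemmas = set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    if not lemmas:
        raise ValueError(f"no tokens tagged {upos}")
    return len(forms) / len(lemmas)


# Toy input: three PROPN forms over two lemmas, plus one NOUN that is ignored.
toy = [
    ("Бишкек", "Бишкек", "PROPN"),
    ("Бишкекте", "Бишкек", "PROPN"),
    ("Ош", "Ош", "PROPN"),
    ("шаар", "шаар", "NOUN"),
]
ratio = form_lemma_ratio(toy)
```

With the counts reported above, 334 / 273 rounds to 1.223443 at six decimal places.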
The 1st highest number of forms (7) was observed with the lemma “Кыргызстан”: Кыргызстан, Кыргызстанга, Кыргызстанда, Кыргызстандагы, Кыргызстандан, Кыргызстандык, Кыргызстандын.
The 2nd highest number of forms (6) was observed with the lemma “Бишкек”: Бишкек, Бишкекте, Бишкектеги, Бишкектен, Бишкекти, Бишкектин.
The 3rd highest number of forms (5) was observed with the lemma “Ош”: Ош, Ошко, Ошто, Оштогу, Оштун.
PROPN occurs with 7 features: Case (667; 100% instances), Number (667; 100% instances), Person (613; 91% instances), Person[psor] (74; 11% instances), Abbr (42; 6% instances), Number[psor] (42; 6% instances), PronType (2; 0% instances)
PROPN occurs with 17 feature-value pairs: Abbr=Yes; Case=Abl; Case=Abl,Gen; Case=Acc; Case=Dat; Case=Equ; Case=Gen; Case=Loc; Case=Nom; Number=Plur; Number=Sing; Number[psor]=Sing; Person=2; Person=3; Person[psor]=2; Person[psor]=3; PronType=Prs
PROPN occurs with 32 feature combinations.
The most frequent feature combination is Case=Nom|Number=Sing|Person=3 (295 tokens).
Examples: Ош, кыргызстан, Жалал-Абад, Бишкек, Токтогул, кыргыз, Ысык-Көл, Бишкекте, Алымкадыр, Баткен
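Feature combinations such as Case=Nom|Number=Sing|Person=3 use the CoNLL-U FEATS notation, where pairs are separated by "|" and an underscore marks an empty feature set. The following Python sketch shows one way to split such a string into feature-value pairs and tally combinations. The helper names are hypothetical, and the sample strings are built from values listed above rather than taken from the treebank files.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict.

    "_" denotes an empty feature set. A multi-valued feature such as
    "Case=Abl,Gen" is kept as a single value string.
    """
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def count_combinations(feats_strings):
    """Count how often each full feature combination occurs."""
    return Counter(feats_strings)


# Illustrative input only, not real treebank rows.
sample = [
    "Case=Nom|Number=Sing|Person=3",
    "Case=Loc|Number=Sing|Person=3",
    "Case=Nom|Number=Sing|Person=3",
    "Case=Abl,Gen|Number=Sing|Number[psor]=Sing",
]
parsed = parse_feats(sample[0])
top_combo, top_count = count_combinations(sample).most_common(1)[0]
```

Splitting on the first "=" only keeps layered feature names such as Number[psor] intact as dictionary keys.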
Relations
PROPN nodes are attached to their parents using 11 different relations: nmod (249; 37% instances), obl (216; 32% instances), nsubj (83; 12% instances), flat (39; 6% instances), conj (27; 4% instances), nmod:poss (25; 4% instances), compound (15; 2% instances), root (9; 1% instances), amod (3; 0% instances), obj (3; 0% instances), csubj (1; 0% instances)
Parents of PROPN nodes belong to 11 different parts of speech: VERB (301; 45% instances), NOUN (245; 37% instances), PROPN (93; 14% instances), ADJ (13; 2% instances), (9; 1% instances), CCONJ (2; 0% instances), NUM (2; 0% instances), PRON (2; 0% instances), ADP (1; 0% instances), ADV (1; 0% instances), PUNCT (1; 0% instances)
488 (73%) PROPN nodes are leaves.
143 (21%) PROPN nodes have one child.
32 (5%) PROPN nodes have two children.
7 (1%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 4.
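The leaf and child-degree figures can be derived from the HEAD column of each sentence. Below is a minimal Python sketch, assuming each sentence is given as a list of (id, head, upos) tuples with integer ids and head 0 for the root. This input shape and the function name are assumptions for illustration, not the script that produced these statistics.

```python
from collections import Counter


def child_degree_histogram(sentence, upos="PROPN"):
    """Map child count -> number of `upos` nodes with that many children.

    `sentence` is a list of (id, head, upos) tuples (assumed shape);
    head == 0 marks the root attachment.
    """
    # Count how many tokens point at each head id.
    children = Counter(head for _, head, _ in sentence)
    # Look up the child count of every node carrying the requested tag.
    return Counter(children[node_id] for node_id, _, tag in sentence if tag == upos)


# Toy sentence: node 1 (PROPN) has two children, node 3 (PROPN) is a leaf.
toy_sentence = [
    (1, 2, "PROPN"),
    (2, 0, "VERB"),
    (3, 1, "PROPN"),
    (4, 1, "PUNCT"),
]
hist = child_degree_histogram(toy_sentence)
```

Summing such histograms over all sentences would give the distribution; the four buckets reported above (488, 143, 32, 7) add up to the 670 PROPN tokens.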
Children of PROPN nodes are attached using 18 different relations: punct (46; 20% instances), flat (42; 18% instances), nmod (35; 15% instances), conj (34; 15% instances), cc (20; 9% instances), compound (15; 6% instances), nmod:poss (7; 3% instances), obl (7; 3% instances), advmod (6; 3% instances), case (5; 2% instances), amod (4; 2% instances), acl (3; 1% instances), advmod:emph (2; 1% instances), compound:svc (1; 0% instances), det (1; 0% instances), mark (1; 0% instances), nsubj (1; 0% instances), nummod (1; 0% instances)
Children of PROPN nodes belong to 9 different parts of speech: PROPN (93; 40% instances), PUNCT (46; 20% instances), NOUN (45; 19% instances), CCONJ (27; 12% instances), ADV (9; 4% instances), VERB (6; 3% instances), ADJ (2; 1% instances), NUM (2; 1% instances), DET (1; 0% instances)