Treebank Statistics: UD_Kazakh-KTB: POS Tags: NOUN
There are 1106 NOUN lemmas (42%), 2068 NOUN types (45%) and 3100 NOUN tokens (29%).
Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
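The lemma, type and token counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the `pos_counts` helper and the `SAMPLE` fragment are hypothetical, the sample sentences are invented for illustration, and word forms are assumed to be compared case-sensitively.

```python
from collections import defaultdict

# Toy CoNLL-U fragment invented for illustration (not taken from the treebank).
SAMPLE = """\
1\tел\tел\tNOUN\t_\tCase=Nom\t2\tnsubj\t_\t_
2\tкелді\tкел\tVERB\t_\t_\t0\troot\t_\t_

1\tелде\tел\tNOUN\t_\tCase=Loc\t3\tobl\t_\t_
2\tбала\tбала\tNOUN\t_\tCase=Nom\t3\tnsubj\t_\t_
3\tтұр\tтұр\tVERB\t_\t_\t0\troot\t_\t_
"""

def pos_counts(conllu_text):
    """Return {UPOS: (distinct lemmas, distinct forms, tokens)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences, '#' lines are comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

counts = pos_counts(SAMPLE)
# Rank of NOUN by token count (1 = most frequent tag).
noun_rank = sorted(counts, key=lambda u: -counts[u][2]).index("NOUN") + 1
```

On the toy fragment, `counts["NOUN"]` is `(2, 3, 3)`: two lemmas, three distinct forms, three tokens.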
The 10 most frequent NOUN lemmas: ел, жыл, мемлекет, бала, жер, ж., ғасыр, орыс, қала, адам
The 10 most frequent NOUN types: ж., мемлекет, ел, орыс, басшысы, бала, елде, жылдың, жылы, қазақ
The 10 most frequent ambiguous lemmas: жыл (NOUN 62, X 2), бала (NOUN 35, X 1), қала (NOUN 30, VERB 3), адам (NOUN 29, X 1), қазақ (NOUN 25, ADJ 2), жұмыс (NOUN 23, X 6), орын (NOUN 18, X 1), бас (NOUN 17, VERB 10, ADJ 1, X 1), мал (NOUN 17, VERB 1, X 1), қыз (NOUN 15, VERB 1)
The 10 most frequent ambiguous types: бала (NOUN 13, X 1), жылы (NOUN 15, ADJ 1), қазақ (NOUN 12, ADJ 2), адам (NOUN 10, X 1), мал (NOUN 8, X 1), орын (NOUN 9, X 1), жұмыс (NOUN 7, X 6), жыл (NOUN 6, X 1), бас (NOUN 3, X 1), ағылшын (NOUN 4, ADJ 1)
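A lemma or type counts as ambiguous when it is observed with more than one tag. The sketch below shows one way to derive entries in the format used above; the `observations` list is hypothetical data invented for illustration, and `describe` is an illustrative helper rather than part of any UD tool.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) observations, invented for illustration.
observations = [("бас", "NOUN"), ("бас", "NOUN"), ("бас", "VERB"), ("ел", "NOUN")]

# Count how often each lemma occurs with each tag.
by_lemma = defaultdict(Counter)
for lemma, upos in observations:
    by_lemma[lemma][upos] += 1

# Keep lemmas that occur as NOUN and with at least one other tag.
ambiguous = {l: c for l, c in by_lemma.items() if len(c) > 1 and "NOUN" in c}

def describe(lemma):
    """Format a lemma as 'lemma (TAG n, TAG n)', most frequent tag first."""
    parts = ", ".join(f"{u} {n}" for u, n in by_lemma[lemma].most_common())
    return f"{lemma} ({parts})"
```

Here `describe("бас")` yields `бас (NOUN 2, VERB 1)`, and `ел` is excluded because it occurs with a single tag.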
Morphology
The form / lemma ratio of NOUN is 1.869801 (the average of all parts of speech is 1.747153).
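The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. A quick check, assuming that is how the figure is defined:

```python
noun_lemmas = 1106  # distinct NOUN lemmas, from this page
noun_types = 2068   # distinct NOUN word forms, from this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.869801
```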
The 1st highest number of forms (25) was observed with the lemma “ел”: Еліміздегі, Еліне, ел, елге, елде, елдегі, елден, елдер, елдерден, елдерді, елдері, елдерімен, елдеріміз, елдерінде, елдеріңіз, елді, елдің, елі, еліміз, елімізге, елімізде, еліміздің, елінің, еліңе, еліңіз.
The 2nd highest number of forms (17) was observed with the lemma “бала”: Балаларды, Балалардың, бала, балалар, балалардан, балалармен, балалары, балаларына, балаларынан, балама, баламды, баласы, баласын, баласына, балаға, балаң, балаңа.
The 3rd highest number of forms (12) was observed with the lemma “жыл”: жыл, жылда, жылдан, жылдар, жылдардағы, жылдардың, жылдары, жылдарына, жылдың, жылмен, жылы, жылға.
NOUN occurs with 5 features: Case (2999; 97% instances), Number[psor] (1132; 37% instances), Person[psor] (1132; 37% instances), Number (398; 13% instances), Polite (11; 0% instances)
NOUN occurs with 15 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Number=Plur, Number[psor]=Plur, Number[psor]=Plur,Sing, Number[psor]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Polite=Form
NOUN occurs with 55 feature combinations.
The most frequent feature combination is Case=Nom (964 tokens).
Examples: мемлекет, ел, орыс, қазақ, Президент, адам, бала, кісі, мал, орын
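The three levels counted above (features, feature-value pairs, and whole feature combinations) can be told apart by how the CoNLL-U FEATS column is split. The following sketch uses hypothetical FEATS strings invented for illustration. It treats a multi-valued entry such as `Number[psor]=Plur,Sing` as a single pair, as the list above does, and it assumes an empty FEATS value (`_`) is counted as a combination of its own, which this page does not confirm.

```python
from collections import Counter

# Hypothetical FEATS strings for NOUN tokens, invented for illustration.
feats_column = [
    "Case=Nom",
    "Case=Gen|Number=Plur",
    "Case=Nom",
    "Case=Loc|Number[psor]=Sing|Person[psor]=3",
    "_",
]

feature_counts = Counter()  # e.g. Case
pair_counts = Counter()     # e.g. Case=Nom
combo_counts = Counter()    # e.g. Case=Gen|Number=Plur

for feats in feats_column:
    combo_counts[feats] += 1  # the whole string is one combination
    if feats == "_":
        continue  # token has no features
    for pair in feats.split("|"):
        name = pair.split("=", 1)[0]
        feature_counts[name] += 1
        pair_counts[pair] += 1

most_frequent_combo = combo_counts.most_common(1)[0]
```

On this toy input, `Case` occurs on four tokens, there are six distinct feature-value pairs, and the most frequent combination is `Case=Nom` with two tokens.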
Relations
NOUN nodes are attached to their parents using 25 different relations: nsubj (679; 22% instances), obl (652; 21% instances), nmod:poss (548; 18% instances), obj (432; 14% instances), conj (188; 6% instances), nmod (165; 5% instances), root (144; 5% instances), compound (73; 2% instances), amod (52; 2% instances), appos (35; 1% instances), advcl (24; 1% instances), flat:name (17; 1% instances), parataxis (17; 1% instances), ccomp (12; 0% instances), nummod (11; 0% instances), xcomp (10; 0% instances), compound:lvc (7; 0% instances), orphan (7; 0% instances), acl (5; 0% instances), acl:relcl (5; 0% instances), iobj (5; 0% instances), vocative (4; 0% instances), clf (3; 0% instances), csubj (3; 0% instances), obl:own (2; 0% instances)
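Each percentage in this list is the relation count divided by the 3100 NOUN tokens. The sketch below recomputes the first four entries from the counts on this page; rounding to the nearest integer is an assumption that happens to reproduce these four figures, and `format_share` is an illustrative helper name.

```python
total_noun_tokens = 3100  # NOUN tokens, from this page
relation_counts = {"nsubj": 679, "obl": 652, "nmod:poss": 548, "obj": 432}

def format_share(label, count, total):
    """Format one entry as 'label (count; NN% instances)'."""
    return f"{label} ({count}; {100 * count / total:.0f}% instances)"

lines = [format_share(r, n, total_noun_tokens) for r, n in relation_counts.items()]
print(", ".join(lines))
```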
Parents of NOUN nodes belong to 9 different parts of speech: VERB (1644; 53% instances), NOUN (1023; 33% instances), ADJ (177; 6% instances), [no parent: root node] (144; 5% instances), PROPN (47; 2% instances), NUM (35; 1% instances), PRON (19; 1% instances), ADV (7; 0% instances), AUX (4; 0% instances)
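Parent parts of speech can be tallied by looking up the tag of each NOUN node's head. A node whose head is 0 is the sentence root and has no parent tag, which would account for an entry without a label. A minimal sketch on a toy sentence invented for illustration:

```python
from collections import Counter

# Toy sentence as (id, UPOS, head) triples, invented for illustration.
sentence = [(1, "NOUN", 2), (2, "NOUN", 0), (3, "PUNCT", 2)]

upos_by_id = {tok_id: upos for tok_id, upos, _ in sentence}

# Head 0 is the artificial root, so it maps to an empty tag.
parent_pos = Counter(
    upos_by_id.get(head, "")
    for _, upos, head in sentence
    if upos == "NOUN"
)
```

Here one NOUN has a NOUN parent and the other is the root, so `parent_pos` holds one count under `"NOUN"` and one under the empty string.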
982 (32%) NOUN nodes are leaves.
1336 (43%) NOUN nodes have one child.
477 (15%) NOUN nodes have two children.
305 (10%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 9.
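The child degree of a node is the number of tokens that name it as their head. The sketch below buckets NOUN nodes in the same way as the figures above; the toy sentence is invented for illustration, and `bucket` is a hypothetical helper.

```python
from collections import Counter

# Toy sentence as (id, UPOS, head) triples, invented for illustration.
sentence = [(1, "ADJ", 2), (2, "NOUN", 4), (3, "NOUN", 4), (4, "VERB", 0), (5, "PUNCT", 4)]

# Number of children per head id (a missing key counts as 0).
children = Counter(head for _, _, head in sentence)
degrees = [children[tok_id] for tok_id, upos, _ in sentence if upos == "NOUN"]

def bucket(degree):
    """Map a child count to the categories used on this page."""
    if degree == 0:
        return "leaf"
    if degree == 1:
        return "one child"
    if degree == 2:
        return "two children"
    return "three or more children"

distribution = Counter(bucket(d) for d in degrees)
highest_degree = max(degrees)
```

In the toy sentence, token 2 has one child (the adjective) and token 3 is a leaf, so the highest child degree of a NOUN node is 1.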
Children of NOUN nodes are attached using 28 different relations: nmod:poss (773; 23% instances), amod (672; 20% instances), punct (436; 13% instances), det (211; 6% instances), conj (180; 5% instances), nsubj (129; 4% instances), case (113; 3% instances), nummod (111; 3% instances), acl (101; 3% instances), cop (88; 3% instances), acl:relcl (87; 3% instances), cc (86; 3% instances), compound (84; 2% instances), appos (67; 2% instances), nmod (64; 2% instances), obl (56; 2% instances), advmod (47; 1% instances), flat:name (30; 1% instances), advcl (24; 1% instances), parataxis (18; 1% instances), dep (16; 0% instances), csubj (12; 0% instances), orphan (10; 0% instances), aux (9; 0% instances), discourse (4; 0% instances), iobj (1; 0% instances), obj (1; 0% instances), vocative (1; 0% instances)
Children of NOUN nodes belong to 17 different parts of speech: NOUN (1023; 30% instances), ADJ (506; 15% instances), PUNCT (436; 13% instances), NUM (280; 8% instances), PROPN (266; 8% instances), VERB (218; 6% instances), DET (212; 6% instances), PRON (113; 3% instances), ADP (112; 3% instances), AUX (97; 3% instances), CCONJ (83; 2% instances), ADV (62; 2% instances), X (16; 0% instances), SCONJ (3; 0% instances), INTJ (2; 0% instances), PART (1; 0% instances), SYM (1; 0% instances)