This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

NOUN: noun

Nouns take nominal morphology, inflecting for case, number and possession. Words of other parts of speech, such as adjectives, may be derived into nouns.

Proper nouns are annotated as PROPN rather than NOUN.

Examples


Treebank Statistics (UD_Kazakh)

There are 778 NOUN lemmas (42%), 1262 NOUN types (44%) and 1859 NOUN tokens (30%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
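
As a rough illustration of how lemma, type and token counts like these can be obtained, the sketch below tallies them from CoNLL-U text, where each token line has ten tab-separated columns (FORM is the second, LEMMA the third, UPOSTAG the fourth). The function name `pos_counts` and the toy rows are assumptions made for this example; the rows are not taken from UD_Kazakh, and this is not necessarily how the figures on this page were produced.

```python
def pos_counts(conllu_text, upos="NOUN"):
    """Return (distinct lemmas, distinct forms, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip blank lines, comments, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM, compared case-sensitively
            lemmas.add(cols[2])  # LEMMA
    return len(lemmas), len(types), tokens


# Toy rows for illustration only; they are NOT taken from UD_Kazakh.
rows = [
    ["1", "елде", "ел", "NOUN", "_", "Case=Loc", "0", "root", "_", "_"],
    ["2", "балалар", "бала", "NOUN", "_", "Number=Plur", "1", "nsubj", "_", "_"],
    ["3", "ел", "ел", "NOUN", "_", "_", "1", "conj", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
]
sample = "\n".join("\t".join(r) for r in rows)

print(pos_counts(sample))  # (2, 3, 3): two lemmas, three types, three tokens
```

Forms are compared case-sensitively here, so a capitalised and a lower-case spelling of the same word count as two types.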

The 10 most frequent NOUN lemmas: ел, орыс, жыл, жер, ғасыр, ж., адам, бала, мемлекет, қазақ

The 10 most frequent NOUN types: _, ж., орыс, қазақ, ел, әулеті, ғасырдың, елде, мал, кісі

The 10 most frequent ambiguous lemmas: қала (NOUN 17, VERB 1), мал (NOUN 16, VERB 1), бас (NOUN 14, VERB 9), ат (NOUN 9, VERB 4), бай (NOUN 8, ADJ 2), жақ (NOUN 7, VERB 1), ана (NOUN 6, DET 1), арт (NOUN 6, VERB 1), іш (NOUN 6, VERB 3), ет (NOUN 5, VERB 3)

The 10 most frequent ambiguous types: _ (AUX 154, PART 76, NOUN 75, ADJ 72, VERB 29, PRON 23, CONJ 13, ADV 7, ADP 7, PROPN 5, NUM 4, PUNCT 1), жылы (NOUN 5, ADJ 1), бай (NOUN 2, ADJ 2), млн. (NOUN 3, NUM 2), Батыс (NOUN 2, ADJ 2), КСРО (NOUN 2, PROPN 2), сайлау (NOUN 2, VERB 1), ұлы (NOUN 2, ADJ 1), Темір (NOUN 1, PROPN 1), ар (ADJ 1, NOUN 1)

Morphology

The form / lemma ratio of NOUN is 1.622108 (the average of all parts of speech is 1.549647).
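
The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given in the treebank statistics above. The short check below only reproduces that arithmetic; the variable names are chosen for this example, and it is an assumption that the figure was computed in exactly this way.

```python
noun_types = 1262   # NOUN types reported for UD_Kazakh
noun_lemmas = 778   # NOUN lemmas reported for UD_Kazakh

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.622108
```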

The highest number of forms (17) was observed with the lemma “бала”: _, Балаларды, Балалардың, бала, балалар, балалардан, балалармен, балаларына, балаларынан, балама, баламды, баласы, баласын, баласына, балаға, балаң, балаңа.

The 2nd highest number of forms (16) was observed with the lemma “ел”: _, ел, елге, елде, елдегі, елден, елдер, елдерден, елдерді, елдері, елдерімен, елді, елдің, елі, елінің, еліңе.

The 3rd highest number of forms (11) was observed with the lemma “жер”: жер, жерге, жерде, жерді, жері, жерін, жерінде, жеріне, жерінен, жеріңді, жеріңе.

NOUN occurs with 4 features: kk-feat/Case (314; 17% instances), kk-feat/Number[psor] (98; 5% instances), kk-feat/Person[psor] (98; 5% instances), kk-feat/Number (29; 2% instances)

NOUN occurs with 14 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Number=Plur, Number[psor]=Plur, Number[psor]=Plur,Sing, Number[psor]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3

NOUN occurs with 30 feature combinations. The most frequent feature combination is _ (1545 tokens). Examples: _, ж., орыс, әулеті, ел, ғасырдың, елде, парсы, тілдерін, ғасырда
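
Feature-value pairs such as those listed above are stored in the FEATS column of a CoNLL-U file as `Name=Value` pairs separated by `|`, with `_` standing for an empty feature set. A minimal sketch of splitting such a string follows; `parse_feats` is a hypothetical helper name, and the example string is illustrative rather than a combination confirmed to occur in the treebank.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":  # "_" marks a token with no features
        return {}
    # Split on the first "=" only; a value such as "Plur,Sing" is kept
    # as a single string, matching how the page counts it as one pair.
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Illustrative string, not necessarily attested in UD_Kazakh.
example = "Case=Gen|Number[psor]=Plur,Sing|Person[psor]=3"
print(parse_feats(example))
```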

Relations

NOUN nodes are attached to their parents using 24 different relations: kk-dep/nmod (497; 27% instances), kk-dep/nsubj (395; 21% instances), kk-dep/nmod:poss (321; 17% instances), kk-dep/dobj (280; 15% instances), kk-dep/conj (128; 7% instances), kk-dep/root (67; 4% instances), kk-dep/compound (54; 3% instances), kk-dep/appos (24; 1% instances), kk-dep/remnant (17; 1% instances), kk-dep/amod (14; 1% instances), kk-dep/advcl (13; 1% instances), kk-dep/name (9; 0% instances), kk-dep/parataxis (8; 0% instances), kk-dep/ccomp (7; 0% instances), kk-dep/nummod (6; 0% instances), kk-dep/iobj (5; 0% instances), kk-dep/xcomp (3; 0% instances), kk-dep/acl:relcl (2; 0% instances), kk-dep/advmod (2; 0% instances), kk-dep/nmod:own (2; 0% instances), kk-dep/vocative (2; 0% instances), kk-dep/acl (1; 0% instances), kk-dep/csubj (1; 0% instances), kk-dep/dobj:caus (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: VERB (1017; 55% instances), NOUN (623; 34% instances), ADJ (88; 5% instances), ROOT (67; 4% instances), PROPN (27; 1% instances), PRON (13; 1% instances), NUM (12; 1% instances), ADV (8; 0% instances), AUX (1; 0% instances), CONJ (1; 0% instances), DET (1; 0% instances), PUNCT (1; 0% instances)

769 (41%) NOUN nodes are leaves.

669 (36%) NOUN nodes have one child.

234 (13%) NOUN nodes have two children.

187 (10%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 19.
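
The child-degree figures above can be derived from the HEAD column: a node's degree is the number of tokens in the same sentence whose HEAD equals its ID. The sketch below shows one way to bucket NOUN nodes by degree. The function name `child_degree_buckets`, the `(id, upos, head)` tuple layout, and the two toy sentences are all assumptions made for this example.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="NOUN"):
    """Bucket nodes of one UPOS by number of children (0, 1, 2, 3 = three or more).

    `sentences` is a list of sentences, each a list of (id, upos, head) tuples.
    Returns (bucket counts, highest child degree seen).
    """
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # Count how many tokens point at each head ID within this sentence.
        n_children = Counter(head for _id, _pos, head in sent)
        for tok_id, pos, _head in sent:
            if pos != upos:
                continue
            degree = n_children[tok_id]
            max_degree = max(max_degree, degree)
            buckets[min(degree, 3)] += 1
    return dict(buckets), max_degree


# Toy sentences for illustration only; they are NOT taken from UD_Kazakh.
toy = [
    [(1, "ADJ", 2), (2, "NOUN", 4), (3, "NOUN", 4), (4, "VERB", 0), (5, "PUNCT", 4)],
    [(1, "NOUN", 0), (2, "DET", 1), (3, "ADJ", 1), (4, "NUM", 1), (5, "PUNCT", 1)],
]
buckets, highest = child_degree_buckets(toy)
print(buckets, highest)
```

As a consistency check on the page itself, the four reported buckets (769, 669, 234 and 187) sum to 1859, the total number of NOUN tokens.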

Children of NOUN nodes are attached using 25 different relations: kk-dep/nmod:poss (431; 22% instances), kk-dep/amod (326; 17% instances), kk-dep/punct (251; 13% instances), kk-dep/conj (131; 7% instances), kk-dep/det (114; 6% instances), kk-dep/cop (93; 5% instances), kk-dep/acl:relcl (87; 5% instances), kk-dep/nmod (68; 4% instances), kk-dep/nsubj (67; 3% instances), kk-dep/cc (61; 3% instances), kk-dep/compound (59; 3% instances), kk-dep/case (55; 3% instances), kk-dep/nummod (49; 3% instances), kk-dep/appos (31; 2% instances), kk-dep/advmod (24; 1% instances), kk-dep/acl (22; 1% instances), kk-dep/remnant (17; 1% instances), kk-dep/advcl (14; 1% instances), kk-dep/parataxis (10; 1% instances), kk-dep/aux (5; 0% instances), kk-dep/name (5; 0% instances), kk-dep/discourse (4; 0% instances), kk-dep/csubj (3; 0% instances), kk-dep/ccomp (1; 0% instances), kk-dep/dobj (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (623; 32% instances), PUNCT (243; 13% instances), ADJ (232; 12% instances), VERB (152; 8% instances), NUM (151; 8% instances), PROPN (148; 8% instances), DET (114; 6% instances), AUX (64; 3% instances), CONJ (60; 3% instances), ADP (55; 3% instances), PRON (42; 2% instances), PART (24; 1% instances), ADV (18; 1% instances), INTJ (2; 0% instances), SCONJ (1; 0% instances)


NOUN in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]