Treebank Statistics: UD_Turkish-Penn: POS Tags: NOUN
There are 5152 NOUN lemmas (33%), 16025 NOUN types (44%) and 61834 NOUN tokens (34%).
NOUN ranks first among the 15 observed tags in number of lemmas, number of types, and number of tokens.
The 10 most frequent NOUN lemmas: dolar, bay, hisse, yıl, şirket, piyasa, gün, fiyat, satış, ay
The 10 most frequent NOUN types: bay, hisse, dolar, yıl, şirket, şekilde, satın, devam, gelir, günü
The 10 most frequent ambiguous lemmas: dolar (NOUN 1105, PROPN 16), şirket (NOUN 868, PROPN 1), piyasa (NOUN 527, PROPN 1), gün (NOUN 493, PROPN 1), ara (NOUN 318, VERB 57, ADJ 8), borsa (NOUN 268, PROPN 10), başkan (NOUN 261, PROPN 14), alım (NOUN 231, PROPN 2), banka (NOUN 208, PROPN 1), değer (NOUN 208, PROPN 3)
The 10 most frequent ambiguous types: satın (NOUN 223, VERB 2), gelir (NOUN 73, VERB 14), Amerikan (NOUN 181, ADJ 114), yıllık (NOUN 156, ADJ 1), menkul (NOUN 57, ADJ 36), satış (NOUN 100, VERB 3), artışla (NOUN 113, VERB 9), satışlar (NOUN 22, VERB 2), düşüş (NOUN 77, VERB 6), güçlü (NOUN 69, ADJ 9)
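A lemma or type counts as ambiguous here when it occurs with more than one part-of-speech tag. The following is a minimal sketch of how such counts could be gathered from a CoNLL-U file, not the tool that produced this page. The two-sentence sample is made up for illustration, and the helper names `tokens` and `ambiguous` are hypothetical. Forms are compared case-sensitively, as the listings in this section appear to do.

```python
from collections import Counter, defaultdict

# Made-up two-sentence CoNLL-U fragment for illustration only
# (columns are whitespace-separated here and converted to tabs below).
RAW = """
1 gelir gelir NOUN _ Case=Nom|Number=Sing 2 nsubj _ _
2 arttı art VERB _ _ 0 root _ _

1 o o PRON _ _ 2 nsubj _ _
2 gelir gel VERB _ _ 0 root _ _
"""
SAMPLE = "\n".join("\t".join(line.split()) for line in RAW.splitlines())


def tokens(text):
    """Yield the 10 CoNLL-U columns of each regular token line."""
    for line in text.splitlines():
        cols = line.split("\t")
        # Skips comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def ambiguous(text, column):
    """Return values of `column` (1 = FORM, 2 = LEMMA) seen with more than one UPOS."""
    tags = defaultdict(Counter)
    for cols in tokens(text):
        tags[cols[column]][cols[3]] += 1
    return {key: dict(counts) for key, counts in tags.items() if len(counts) > 1}


print(ambiguous(SAMPLE, 1))  # ambiguous types
print(ambiguous(SAMPLE, 2))  # ambiguous lemmas
```

In the sample, the type "gelir" is ambiguous between NOUN and VERB, while no lemma is, because the two readings have different lemmas ("gelir" and "gel").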
Morphology
The form / lemma ratio of NOUN is 3.110443 (the average of all parts of speech is 2.343544).
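The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this section. The sketch below checks that arithmetic and shows one way the forms of each lemma could be grouped. The helper name `forms_by_lemma` is hypothetical, and the five (form, lemma) pairs are a small subset of the forms listed in this section, not the full data.

```python
from collections import defaultdict

# Counts reported for NOUN at the top of this section.
NOUN_TYPES = 16025
NOUN_LEMMAS = 5152
print(f"{NOUN_TYPES / NOUN_LEMMAS:.6f}")  # 3.110443


def forms_by_lemma(pairs):
    """Group distinct word forms (case-sensitive) under their lemma."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return forms


# A few (form, lemma) pairs taken from the lists in this section.
PAIRS = [("iş", "iş"), ("işe", "iş"), ("İş", "iş"),
         ("fiyat", "fiyat"), ("fiyatı", "fiyat")]
forms = forms_by_lemma(PAIRS)

# Lemma with the largest number of distinct forms in this subset.
richest = max(forms, key=lambda lemma: len(forms[lemma]))
print(richest, len(forms[richest]))
```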
The highest number of forms (45) was observed with the lemma “iş”: iş, işe, işi, işidir, işim, işimde, işimiz, işimize, işimizi, işin, işinde, işindeki, işinden, işine, işini, işinin, işle, işler, işlerden, işlere, işleri, işlerin, işlerinde, işlerine, işlerini, işlerinin, işmiş, işsizlik, işte, işteki, işten, İŞ, İŞE, İş, İşe, İşi, İşim, İşimiz, İşin, İşinin, İşler, İşleri, İşlerinin, İşte, İşten.
The second highest number of forms (36) was observed with the lemma “fiyat”: FİYATLAR, fiyat, fiyata, fiyatla, fiyatlandı, fiyatlar, fiyatlara, fiyatlarda, fiyatlardaki, fiyatlardan, fiyatlardı, fiyatlarla, fiyatları, fiyatlarım, fiyatlarımı, fiyatların, fiyatlarına, fiyatlarında, fiyatlarındaki, fiyatlarından, fiyatlarını, fiyatlarının, fiyatlarıyla, fiyatlı, fiyatta, fiyattan, fiyatı, fiyatıdır, fiyatın, fiyatına, fiyatında, fiyatından, fiyatını, fiyatının, fiyatıyla, işlem.
The lemma “şirket” ties for second place, also with 36 forms: Şirketimiz, ŞİRKET, ŞİRKETLER, ŞİRKETİ, şirket, şirkete, şirketi, şirketidir, şirketin, şirketinde, şirketindeki, şirketinden, şirketine, şirketini, şirketinin, şirketiydi, şirketle, şirketler, şirketlerde, şirketlerdeki, şirketlerden, şirketlere, şirketleri, şirketlerin, şirketlerinde, şirketlerindeki, şirketlerinden, şirketlerine, şirketlerini, şirketlerinin, şirketlerle, şirketmiş, şirkette, şirketteki, şirketten, şirkettir.
NOUN occurs with 10 features: Number (59486; 96% instances), Case (58514; 95% instances), Number[psor] (18570; 30% instances), Person[psor] (18570; 30% instances), Person (965; 2% instances), Aspect (369; 1% instances), Polarity (52; 0% instances), ExtPos (13; 0% instances), Typo (12; 0% instances), Polite (1; 0% instances)
NOUN occurs with 24 feature-value pairs: Aspect=Perf, Aspect=Prog, Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, ExtPos=ADV, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Person[psor]=1, Person[psor]=2, Person[psor]=3, Polarity=Neg, Polarity=Pos, Polite=Form, Typo=Yes
NOUN occurs with 139 feature combinations.
The most frequent feature combination is Case=Nom|Number=Sing (25933 tokens).
Examples: bay, hisse, dolar, yıl, şirket, devam, satın, gelir, Amerikan, dün
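The feature statistics above distinguish three levels: feature names (such as Case), feature-value pairs (such as Case=Nom), and full combinations (such as Case=Nom|Number=Sing). The sketch below shows how these could be tallied from the FEATS column of a CoNLL-U file. It is an illustration under assumptions: the sample sentences are made up, the helper names `noun_feats` and `percent` are hypothetical, and the percentages on this page are assumed to be counts divided by the 61834 NOUN tokens, rounded to whole numbers.

```python
from collections import Counter

# Made-up CoNLL-U fragment for illustration only
# (columns are whitespace-separated here and converted to tabs below).
RAW = """
1 şirket şirket NOUN _ Case=Nom|Number=Sing 3 nsubj _ _
2 hisseleri hisse NOUN _ Case=Acc|Number=Plur 3 obj _ _
3 sattı sat VERB _ _ 0 root _ _

1 fiyat fiyat NOUN _ Case=Nom|Number=Sing 2 nsubj _ _
2 düştü düş VERB _ _ 0 root _ _
"""
SAMPLE = "\n".join("\t".join(line.split()) for line in RAW.splitlines())


def noun_feats(text):
    """Yield the FEATS column of every NOUN token."""
    for line in text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines (10 columns, integer ID).
        if len(cols) == 10 and cols[0].isdigit() and cols[3] == "NOUN":
            yield cols[5]


features, pairs, combos = Counter(), Counter(), Counter()
total = 0
for feats in noun_feats(SAMPLE):
    total += 1
    combos[feats] += 1               # the whole FEATS string is one combination
    if feats == "_":                 # "_" means the token has no features
        continue
    for pair in feats.split("|"):
        pairs[pair] += 1                          # e.g. Case=Nom
        features[pair.split("=", 1)[0]] += 1      # e.g. Case


def percent(count, total):
    """Share of NOUN tokens, rounded to a whole percent."""
    return round(100 * count / total)


print(features)
print(combos.most_common(1))
print(percent(59486, 61834))  # Number count and NOUN token total from this section
```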
Relations
NOUN nodes are attached to their parents using 29 different relations: nmod (20835; 34% instances), nsubj (10927; 18% instances), obl (7461; 12% instances), obj (6978; 11% instances), compound (5926; 10% instances), amod (3546; 6% instances), root (2263; 4% instances), conj (1609; 3% instances), flat (427; 1% instances), advcl (259; 0% instances), ccomp (253; 0% instances), case (228; 0% instances), appos (187; 0% instances), xcomp (171; 0% instances), nummod (141; 0% instances), parataxis (128; 0% instances), list (108; 0% instances), acl (98; 0% instances), csubj (94; 0% instances), discourse (93; 0% instances), fixed (55; 0% instances), iobj (17; 0% instances), clf (13; 0% instances), dep (7; 0% instances), dislocated (4; 0% instances), vocative (3; 0% instances), nmod:tmod (1; 0% instances), nsubj:outer (1; 0% instances), orphan (1; 0% instances)
Parents of NOUN nodes fall into 15 categories (14 parts of speech plus the artificial root, which has no part of speech): VERB (30823; 50% instances), NOUN (21990; 36% instances), ADJ (2859; 5% instances), [root] (2263; 4% instances), PROPN (2092; 3% instances), ADV (799; 1% instances), NUM (511; 1% instances), PRON (228; 0% instances), X (113; 0% instances), ADP (82; 0% instances), DET (34; 0% instances), AUX (19; 0% instances), CCONJ (13; 0% instances), INTJ (7; 0% instances), SCONJ (1; 0% instances)
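The two distributions above can be read directly from the HEAD and DEPREL columns: the relation is the token's own DEPREL, and the parent's part of speech is the UPOS of the token whose ID equals HEAD. A minimal sketch, assuming a made-up sample and hypothetical helper names; a HEAD of 0 has no part of speech, so it is counted under a placeholder label `ROOT` chosen for this example.

```python
from collections import Counter

# Made-up CoNLL-U fragment for illustration only
# (columns are whitespace-separated here and converted to tabs below).
RAW = """
1 şirket şirket NOUN _ Case=Nom|Number=Sing 3 nsubj _ _
2 hisseleri hisse NOUN _ Case=Acc|Number=Plur 3 obj _ _
3 sattı sat VERB _ _ 0 root _ _

1 hisse hisse NOUN _ Case=Nom|Number=Sing 2 nmod _ _
2 fiyatı fiyat NOUN _ Case=Nom|Number=Sing 0 root _ _
"""
SAMPLE = "\n".join("\t".join(line.split()) for line in RAW.splitlines())


def sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    sent = []
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
        elif not line.strip() and sent:   # a blank line closes the sentence
            yield sent
            sent = []
    if sent:
        yield sent


relations, parent_pos = Counter(), Counter()
for sent in sentences(SAMPLE):
    upos = {cols[0]: cols[3] for cols in sent}   # token ID -> UPOS
    for cols in sent:
        if cols[3] != "NOUN":
            continue
        relations[cols[7]] += 1
        # HEAD 0 is not a token, so it falls back to the placeholder label.
        parent_pos[upos.get(cols[6], "ROOT")] += 1

print(relations)
print(parent_pos)
```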
22189 (36%) NOUN nodes are leaves.
25066 (41%) NOUN nodes have one child.
9431 (15%) NOUN nodes have two children.
5148 (8%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 11.
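The child degree of a node is the number of tokens in the same sentence whose HEAD points to it. The four buckets reported above (leaves, one child, two children, three or more) add up to the 61834 NOUN tokens. The sketch below checks that sum and shows one way to compute the distribution. The sample is made up and the helper name `sentences` is hypothetical.

```python
from collections import Counter

# Bucket counts reported in this section sum to the NOUN token total.
assert 22189 + 25066 + 9431 + 5148 == 61834

# Made-up CoNLL-U fragment for illustration only
# (columns are whitespace-separated here and converted to tabs below).
RAW = """
1 şirket şirket NOUN _ Case=Nom|Number=Sing 3 nsubj _ _
2 hisseleri hisse NOUN _ Case=Acc|Number=Plur 3 obj _ _
3 sattı sat VERB _ _ 0 root _ _

1 hisse hisse NOUN _ Case=Nom|Number=Sing 2 nmod _ _
2 fiyatı fiyat NOUN _ Case=Nom|Number=Sing 0 root _ _
"""
SAMPLE = "\n".join("\t".join(line.split()) for line in RAW.splitlines())


def sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    sent = []
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
        elif not line.strip() and sent:   # a blank line closes the sentence
            yield sent
            sent = []
    if sent:
        yield sent


buckets = Counter()   # key 3 stands for "three or more children"
max_degree = 0
for sent in sentences(SAMPLE):
    # Number of children per head ID within this sentence.
    children = Counter(cols[6] for cols in sent)
    for cols in sent:
        if cols[3] != "NOUN":
            continue
        degree = children[cols[0]]        # 0 for leaves
        buckets[min(degree, 3)] += 1
        max_degree = max(max_degree, degree)

print(dict(buckets), max_degree)
```

In the sample, three NOUN nodes are leaves and one ("fiyatı") has a single child.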
Children of NOUN nodes are attached using 31 different relations: nmod (19529; 31% instances), amod (12761; 20% instances), det (5740; 9% instances), punct (4428; 7% instances), nummod (4196; 7% instances), case (2876; 5% instances), acl (2781; 4% instances), nsubj (1712; 3% instances), cc (1606; 3% instances), conj (1566; 2% instances), compound (1522; 2% instances), advmod (1254; 2% instances), obl (779; 1% instances), obj (580; 1% instances), flat (370; 1% instances), appos (343; 1% instances), advcl (208; 0% instances), aux (207; 0% instances), discourse (182; 0% instances), list (158; 0% instances), parataxis (113; 0% instances), mark (87; 0% instances), ccomp (70; 0% instances), csubj (58; 0% instances), xcomp (37; 0% instances), dep (34; 0% instances), fixed (29; 0% instances), goeswith (12; 0% instances), clf (5; 0% instances), dislocated (4; 0% instances), orphan (1; 0% instances)
Children of NOUN nodes belong to 15 different parts of speech: NOUN (21990; 35% instances), ADJ (9744; 15% instances), DET (6015; 10% instances), PROPN (5078; 8% instances), NUM (4776; 8% instances), PUNCT (4428; 7% instances), VERB (4188; 7% instances), CCONJ (2356; 4% instances), ADP (1915; 3% instances), ADV (1622; 3% instances), PRON (716; 1% instances), AUX (210; 0% instances), X (179; 0% instances), INTJ (18; 0% instances), SCONJ (13; 0% instances)