Treebank Statistics: UD_Karelian-KKPP: POS Tags: NOUN
There are 359 NOUN lemmas (38%), 551 NOUN types (39%) and 839 NOUN tokens (27%).
Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
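The lemma, type and token counts above can be reproduced from a CoNLL-U file by counting distinct lemmas, distinct word forms and all occurrences for one UPOS tag. The following is a minimal sketch, not the script that generated this page. The sample rows are invented toy data, the function names `read_tokens` and `pos_counts` are hypothetical, and case-sensitive counting of forms is an assumption.

```python
from collections import Counter

# Toy CoNLL-U fragment, invented for illustration (not real treebank sentences).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = "\n".join([
    "1\tihmini\tihmini\tNOUN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_",
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
    "1\tihmisen\tihmini\tNOUN\t_\tCase=Gen|Number=Sing\t2\tnmod:poss\t_\t_",
    "2\tpoika\tpoika\tNOUN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_",
    "",
    "1\tpoika\tpoika\tNOUN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_",
])


def read_tokens(text):
    """Yield (form, lemma, upos) for each regular token line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        yield cols[1], cols[2], cols[3]


def pos_counts(text, upos):
    """Count distinct lemmas, distinct forms (types) and tokens for one tag."""
    forms, lemmas = Counter(), Counter()
    for form, lemma, tag in read_tokens(text):
        if tag == upos:
            forms[form] += 1
            lemmas[lemma] += 1
    return {
        "lemmas": len(lemmas),
        "types": len(forms),
        "tokens": sum(forms.values()),
    }


print(pos_counts(SAMPLE, "NOUN"))
```

On the toy sample, the two occurrences of "poika" count as one type but two tokens, and "ihmini" and "ihmisen" share one lemma.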
The 10 most frequent NOUN lemmas: ihmini, lapši, mua, poika, kulttuuri, muamo, peli, aktijo, roveh, tunti
The 10 most frequent NOUN types: muamo, muan, poika, tunti, ihmisie, vuotena, kulttuurien, lapšien, ropehet, aktijo
The 10 most frequent ambiguous lemmas: ruado (NOUN 4, VERB 1), työ (NOUN 2, PRON 2), Kalevala-seikkailu#peli (NOUN 1, PROPN 1), juuri (ADV 1, NOUN 1), šilta (ADV 1, NOUN 1)
The 10 most frequent ambiguous types: šeikkailupelie (NOUN 1, X 1)
Morphology
The form / lemma ratio of NOUN is 1.534819 (the average of all parts of speech is 1.495298).
The 1st highest number of forms (8) was observed with the lemma “ihmini”: ihmini, ihmiseh, ihmisen, ihmiset, ihmisie, ihmisien, ihmisillä, ihmisissä.
The 2nd highest number of forms (7) was observed with the lemma “pereh”: Pereh, perehellä, perehen, perehenä, perehie, perehillä, perehtä.
The 3rd highest number of forms (7) was observed with the lemma “seikkailu#peli”: Šeikkailupelit, šeikkailupeli, šeikkailupelie, šeikkailupelih, šeikkailupelijä, šeikkailupelilöistä, šeikkailupelissä.
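The reported form / lemma ratio coincides with the NOUN type count divided by the NOUN lemma count (551 / 359). The sketch below shows that arithmetic and one plausible way to group forms by lemma. The helper names are hypothetical, and treating the ratio as "distinct forms per lemma" is an assumption about how the page was generated.

```python
from collections import defaultdict


def forms_per_lemma(pairs):
    """Map each lemma to the set of distinct word forms seen with it."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return forms


def form_lemma_ratio(pairs):
    """Average number of distinct forms per lemma."""
    forms = forms_per_lemma(pairs)
    return sum(len(f) for f in forms.values()) / len(forms)


# The eight forms listed above for the lemma "ihmini".
ihmini_forms = [
    "ihmini", "ihmiseh", "ihmisen", "ihmiset",
    "ihmisie", "ihmisien", "ihmisillä", "ihmisissä",
]
pairs = [(form, "ihmini") for form in ihmini_forms]
print(len(forms_per_lemma(pairs)["ihmini"]))  # 8

# Counts reported for NOUN at the top of the page.
noun_types, noun_lemmas = 551, 359
print(f"{noun_types / noun_lemmas:.6f}")  # 1.534819
```

If a form were shared by two lemmas, the number of (lemma, form) pairs would exceed the number of types, so the two definitions need not agree in general.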
NOUN occurs with 5 features: Case (837; 100% instances), Number (837; 100% instances), Person[psor] (4; 0% instances), Abbr (2; 0% instances), Number[psor] (2; 0% instances)
NOUN occurs with 19 feature-value pairs: Abbr=Yes, Case=Abe, Case=Abl, Case=Ade, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Ins, Case=Nom, Case=Par, Case=Tra, Number=Plur, Number=Sing, Number[psor]=Sing, Person[psor]=2, Person[psor]=3
NOUN occurs with 25 feature combinations.
The most frequent feature combination is Case=Gen|Number=Sing (146 tokens).
Examples: muan, karjalan, muajilman, pelin, pojan, -projektin, ihmisen, järještön, keškukšen, luonnon
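Feature statistics of this kind come from the FEATS column of CoNLL-U, where pairs are joined with `|` and an underscore marks an empty feature set. A minimal sketch of parsing and counting follows. The function name `parse_feats` is hypothetical and the four FEATS strings are invented toy data, not treebank counts.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; "_" means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Invented FEATS values for a handful of NOUN tokens (toy data).
noun_feats = [
    "Case=Gen|Number=Sing",
    "Case=Nom|Number=Sing",
    "Case=Gen|Number=Sing",
    "Case=Par|Number=Plur",
]

# Whole combinations, counted as they appear in the column.
combos = Counter(noun_feats)
# Individual feature names, counted once per token that carries them.
feature_counts = Counter(name for f in noun_feats for name in parse_feats(f))

print(parse_feats("Case=Gen|Number=Sing"))  # {'Case': 'Gen', 'Number': 'Sing'}
print(combos.most_common(1))                # [('Case=Gen|Number=Sing', 2)]
```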
Relations
NOUN nodes are attached to their parents using 21 different relations: obl (238; 28% instances), obj (173; 21% instances), nmod:poss (117; 14% instances), conj (93; 11% instances), nsubj (85; 10% instances), root (26; 3% instances), nmod (20; 2% instances), nsubj:cop (20; 2% instances), compound (19; 2% instances), flat:name (13; 2% instances), parataxis (10; 1% instances), appos (8; 1% instances), case (5; 1% instances), advcl (4; 0% instances), orphan (2; 0% instances), acl:relcl (1; 0% instances), amod (1; 0% instances), discourse (1; 0% instances), flat (1; 0% instances), nmod:gsubj (1; 0% instances), xcomp (1; 0% instances)
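As a consistency check, the relation counts listed above sum to the 839 NOUN tokens, and the percentages follow from rounding each share to a whole number. The sketch below uses only the counts reported on this page. The variable names are illustrative, and rounding with Python's `round` is an assumption about how the page computes its percentages.

```python
# Incoming relation counts for NOUN nodes, as reported above.
NOUN_DEPRELS = {
    "obl": 238, "obj": 173, "nmod:poss": 117, "conj": 93, "nsubj": 85,
    "root": 26, "nmod": 20, "nsubj:cop": 20, "compound": 19,
    "flat:name": 13, "parataxis": 10, "appos": 8, "case": 5, "advcl": 4,
    "orphan": 2, "acl:relcl": 1, "amod": 1, "discourse": 1, "flat": 1,
    "nmod:gsubj": 1, "xcomp": 1,
}

total = sum(NOUN_DEPRELS.values())
# Share of each relation, rounded to a whole percent.
shares = {rel: round(100 * n / total) for rel, n in NOUN_DEPRELS.items()}

print(len(NOUN_DEPRELS), total)      # 21 839
print(shares["obl"], shares["obj"])  # 28 21
```

Shares below half a percent round to 0, which is why relations such as advcl (4 tokens) are shown with 0% instances.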
Parents of NOUN nodes belong to 10 different parts of speech: VERB (460; 55% instances), NOUN (260; 31% instances), [no parent: root node] (26; 3% instances), PROPN (23; 3% instances), ADJ (21; 3% instances), AUX (18; 2% instances), PRON (16; 2% instances), ADV (7; 1% instances), ADP (4; 0% instances), NUM (4; 0% instances)
302 (36%) NOUN nodes are leaves.
340 (41%) NOUN nodes have one child.
112 (13%) NOUN nodes have two children.
85 (10%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 9.
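The child-degree figures above can be derived from the HEAD column: a node's degree is the number of tokens in the same sentence whose HEAD points to it. A minimal sketch follows, assuming each sentence is given as a list of `(id, upos, head)` tuples. The function name `child_degree_buckets` and the toy sentence are hypothetical.

```python
from collections import Counter


def child_degree_buckets(sentences, upos):
    """Count nodes of one tag with 0, 1, 2 and 3-or-more children.

    Each sentence is a list of (id, upos, head) tuples; head 0 is the root.
    Degrees of three or more are pooled under key 3.
    """
    buckets = Counter()
    for sent in sentences:
        # Number of children per node id, read off the HEAD values.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                buckets[min(n_children[tok_id], 3)] += 1
    return buckets


# Invented toy sentence: ADJ -> NOUN -> VERB (root), plus a second NOUN and PUNCT.
toy = [[
    (1, "ADJ", 2),
    (2, "NOUN", 3),
    (3, "VERB", 0),
    (4, "NOUN", 3),
    (5, "PUNCT", 3),
]]

buckets = child_degree_buckets(toy, "NOUN")
print(buckets[0], buckets[1], buckets[2], buckets[3])  # 1 1 0 0

# The four buckets reported above cover every NOUN token.
print(302 + 340 + 112 + 85)  # 839
```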
Children of NOUN nodes are attached using 28 different relations: nmod:poss (203; 22% instances), amod (153; 17% instances), punct (114; 13% instances), conj (106; 12% instances), cc (54; 6% instances), nummod (33; 4% instances), det (32; 4% instances), case (29; 3% instances), nmod (28; 3% instances), cop (27; 3% instances), compound (19; 2% instances), acl:relcl (17; 2% instances), advmod (16; 2% instances), nsubj:cop (16; 2% instances), appos (13; 1% instances), flat:name (13; 1% instances), obl (13; 1% instances), obj (5; 1% instances), mark (4; 0% instances), acl (3; 0% instances), xcomp (3; 0% instances), aux (2; 0% instances), ccomp (2; 0% instances), parataxis (2; 0% instances), advcl (1; 0% instances), cop:own (1; 0% instances), nsubj (1; 0% instances), vocative (1; 0% instances)
Children of NOUN nodes belong to 13 different parts of speech: NOUN (260; 29% instances), ADJ (163; 18% instances), PUNCT (114; 13% instances), PRON (86; 9% instances), PROPN (86; 9% instances), CCONJ (54; 6% instances), NUM (36; 4% instances), AUX (32; 4% instances), ADP (29; 3% instances), VERB (29; 3% instances), ADV (16; 2% instances), SCONJ (3; 0% instances), X (3; 0% instances)