
This page pertains to UD version 2.

Treebank Statistics: UD_Skolt_Sami-Giellagas: POS Tags: NOUN

There are 148 NOUN lemmas (28%), 208 NOUN types (26%) and 402 NOUN tokens (14%). Out of 16 observed tags, the rank of NOUN is: 2 in number of lemmas, 2 in number of types and 3 in number of tokens.
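Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts`, the helper `make_conllu` and the toy sample are hypothetical, and it assumes that types are compared exactly as written (no case folding).

```python
def upos_counts(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


def make_conllu(rows):
    """Build CoNLL-U lines from (id, form, lemma, upos, head, deprel) tuples."""
    return "\n".join(
        "\t".join([i, form, lemma, tag, "_", "_", head, rel, "_", "_"])
        for i, form, lemma, tag, head, rel in rows
    )


# Toy English sample for illustration only; it is not taken from the treebank.
SENT1 = [("1", "cats", "cat", "NOUN", "2", "nsubj"),
         ("2", "sleep", "sleep", "VERB", "0", "root")]
SENT2 = [("1", "cat", "cat", "NOUN", "2", "nsubj"),
         ("2", "sees", "see", "VERB", "0", "root"),
         ("3", "cats", "cat", "NOUN", "2", "obj")]
SAMPLE = "\n\n".join([make_conllu(SENT1), make_conllu(SENT2)])

print(upos_counts(SAMPLE, "NOUN"))  # (1, 2, 3)
```

In the toy sample, the single lemma "cat" appears as two types ("cat", "cats") across three tokens.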

The 10 most frequent NOUN lemmas: ooumaž, heävaš, nijdd, tueʹllj, ääkkaž, eʹčč, triâŋgg, kueʹtt, meäʹcc, päʹrnn

The 10 most frequent NOUN types: ooumaž, tueʹllj, mieʹccest, heävaš, nijdd, stäʹlmmstääll, vuâra, ääkka, eččad, niõđ

The most frequent ambiguous lemmas (up to 10 are listed; 4 were found): nuʹbb (ADJ 4, NOUN 2, PRON 2), Peter (PROPN 4, NOUN 1), _ (NOUN 1, X 1), šurr (ADJ 1, NOUN 1)

The most frequent ambiguous types (up to 10 are listed; 6 were found): jieʹllem (NOUN 3, VERB 1), kooʹddid (NOUN 3, VERB 1), Peter (PROPN 4, NOUN 1), jânnam (NOUN 1, X 1), väldd (VERB 4, NOUN 1), årra (ADP 3, NOUN 1)

Morphology

The form / lemma ratio of NOUN is 1.405405 (the average of all parts of speech is 1.472015).
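The ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas reported at the top of this page. A short check, using only those two figures (the variable names are illustrative):

```python
noun_types = 208   # NOUN types reported on this page
noun_lemmas = 148  # NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.405405
```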

The highest number of forms (5) was observed with the lemma “päʹrnn”: paaʹrnines, pärnna, pärnnses, pääʹrn, päʹrnn.

The same number of forms (5) was observed with the lemma “villj”: viillj, villj, villjâs, viʹllje, viʹlljes.

The next highest number of forms (4) was observed with the lemma “eʹčč”: eeʹjjed, eččad, eččan, eʹčč.

NOUN occurs with 8 features: Case (396; 99% instances), Number (377; 94% instances), Animacy (105; 26% instances), Number[psor] (42; 10% instances), Person[psor] (42; 10% instances), Typo (3; 1% instances), Clitic (2; 0% instances), Degree (1; 0% instances)

NOUN occurs with 21 feature-value pairs: Animacy=Hum, Case=Abe, Case=Acc, Case=Com, Case=Ess, Case=Gen, Case=Ill, Case=Loc, Case=Nom, Case=Par, Clitic=Han, Clitic=QstA, Degree=Dim, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Typo=Yes

NOUN occurs with 44 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing (86 tokens). Examples: heävaš, stäʹlmmstääll, niõđâž, Peʹll, källsaž, tieʹrmes, triâŋgg, jieʹllem, Bieʹss, Nuʹbb
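The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of the NOUN tokens. The sketch below is an assumption about how such counts could be computed, not the generator's actual code; `feature_stats` is a hypothetical name, the input list is a toy example, and each feature is assumed to carry a single value.

```python
from collections import Counter


def feature_stats(feats_values):
    """Count features, feature-value pairs and full combinations.

    feats_values: iterable of FEATS strings such as "Case=Nom|Number=Sing",
    with "_" standing for a token that has no features.
    """
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1  # the whole FEATS string is one combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Toy input for illustration only; these are not counts from the treebank.
TOY_FEATS = [
    "Case=Nom|Number=Sing",
    "Case=Nom|Number=Sing",
    "Case=Gen|Number=Plur",
    "_",
]
features, pairs, combos = feature_stats(TOY_FEATS)
print(len(features), len(pairs), len(combos))  # 2 4 3
print(combos.most_common(1))  # [('Case=Nom|Number=Sing', 2)]
```

Whether the featureless value "_" counts as a combination of its own is a design choice; this sketch counts it.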

Relations

NOUN nodes are attached to their parents using 20 different relations: nsubj (128; 32% instances), obj (90; 22% instances), obl:lmod (43; 11% instances), conj (23; 6% instances), obl (20; 5% instances), xcomp (17; 4% instances), obl:tmod (15; 4% instances), root (13; 3% instances), nsubj:cop (10; 2% instances), vocative (8; 2% instances), nmod (7; 2% instances), nmod:poss (7; 2% instances), orphan (5; 1% instances), reparandum (4; 1% instances), appos (3; 1% instances), nsubj:pass (3; 1% instances), dislocated (2; 0% instances), obl:agent (2; 0% instances), discourse (1; 0% instances), parataxis (1; 0% instances)
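Each percentage in these lists is the count divided by the 402 NOUN tokens, rounded to a whole number, which is why rare relations show as 0%. A small sketch of that arithmetic (the helper `share` is hypothetical, and the exact rounding rule used by the page generator is an assumption):

```python
NOUN_TOKENS = 402  # total NOUN tokens reported on this page


def share(count, total=NOUN_TOKENS):
    """Format a count in the 'count; N% instances' style used on this page."""
    return f"{count}; {round(100 * count / total)}% instances"


print(share(128))  # 128; 32% instances  (nsubj)
print(share(2))    # 2; 0% instances     (dislocated, obl:agent)
```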

Parents of NOUN nodes belong to 8 different parts of speech (counting the artificial root, which has no tag and is shown here as ROOT): VERB (315; 78% instances), NOUN (40; 10% instances), AUX (16; 4% instances), ROOT (13; 3% instances), PRON (6; 1% instances), ADJ (4; 1% instances), ADV (4; 1% instances), PROPN (4; 1% instances)

238 (59%) NOUN nodes are leaves.

109 (27%) NOUN nodes have one child.

27 (7%) NOUN nodes have two children.

28 (7%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 9.
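The child-degree figures above (leaves, one child, two children, three or more, maximum) amount to a histogram of how many dependents each NOUN node has. A minimal sketch of that computation follows; `child_degree_histogram` is a hypothetical name, and the input is a toy sentence given as (id, head, UPOS) triples rather than real treebank data.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map number-of-children -> number of nodes with the given UPOS tag.

    sentences: list of sentences, each a list of (id, head, upos) triples
    with integer ids and heads (head 0 marks the root).
    """
    hist = Counter()
    for sent in sentences:
        # How many nodes name each id as their head.
        n_children = Counter(head for _id, head, _tag in sent)
        for node_id, _head, tag in sent:
            if tag == upos:
                hist[n_children[node_id]] += 1
    return hist


# Toy sentence for illustration only: DET -> NOUN -> VERB <- NOUN
TOY = [[(1, 2, "DET"), (2, 3, "NOUN"), (3, 0, "VERB"), (4, 3, "NOUN")]]
hist = child_degree_histogram(TOY, "NOUN")
print(hist[0], hist[1], max(hist))  # 1 1 1
```

In the toy sentence one NOUN is a leaf, one has a single child, and the highest child degree is 1.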

Children of NOUN nodes are attached using 31 different relations: det (58; 19% instances), punct (50; 17% instances), conj (24; 8% instances), amod (23; 8% instances), case (20; 7% instances), cop (16; 5% instances), nummod (14; 5% instances), cc (13; 4% instances), nmod:poss (9; 3% instances), orphan (9; 3% instances), nsubj:cop (8; 3% instances), advmod (7; 2% instances), nmod (7; 2% instances), reparandum (6; 2% instances), nsubj (5; 2% instances), advmod:lmod (4; 1% instances), advcl (3; 1% instances), advmod:tmod (3; 1% instances), discourse (3; 1% instances), obl (3; 1% instances), dep (2; 1% instances), vocative (2; 1% instances), appos (1; 0% instances), aux (1; 0% instances), dislocated (1; 0% instances), expl (1; 0% instances), goeswith (1; 0% instances), mark (1; 0% instances), obj (1; 0% instances), obl:tmod (1; 0% instances), xcomp (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: PRON (72; 24% instances), PUNCT (50; 17% instances), NOUN (40; 13% instances), ADJ (23; 8% instances), ADP (20; 7% instances), AUX (20; 7% instances), ADV (18; 6% instances), NUM (14; 5% instances), VERB (14; 5% instances), CCONJ (13; 4% instances), DET (8; 3% instances), INTJ (4; 1% instances), PROPN (1; 0% instances), X (1; 0% instances)