
This page pertains to UD version 2.

Treebank Statistics: UD_Slovenian-SSJ: POS Tags: NOUN

There are 8846 NOUN lemmas (34%), 18464 NOUN types (37%) and 56914 NOUN tokens (21%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: leto, čas, človek, dan, delo, država, mesto, svet, otrok, primer

The 10 most frequent NOUN types: leta, let, strani, del, dela, delo, primer, dan, leto, ljudi

The 10 most frequent ambiguous lemmas: dan (NOUN 332, ADJ 5), svet (NOUN 250, ADJ 33), del (NOUN 224, X 2), stran (NOUN 192, ADV 13), konec (NOUN 156, ADP 35), moč (NOUN 80, ADV 5), sila (NOUN 68, ADV 4), moški (NOUN 60, ADJ 24), red (NOUN 58, X 1), raven (NOUN 49, ADJ 3)

The 10 most frequent ambiguous types: del (NOUN 123, X 2), dela (NOUN 112, VERB 16), ljudi (NOUN 113, X 1), sveta (NOUN 67, ADJ 1), leti (NOUN 68, VERB 1), ženske (NOUN 40, ADJ 3), vojne (NOUN 41, ADJ 9), vojni (NOUN 41, ADJ 1), vlada (NOUN 28, VERB 6), konec (ADP 31, NOUN 30)

Morphology

The form / lemma ratio of NOUN is 2.087271 (the average of all parts of speech is 1.935546).
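The ratio above can be reproduced from the counts reported at the top of this page. The sketch below assumes the ratio is the number of NOUN types divided by the number of NOUN lemmas; the function name `form_lemma_ratio` is illustrative and is not part of any UD tool.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the average number of distinct word forms per lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# NOUN counts as reported on this page: 18464 types, 8846 lemmas.
noun_ratio = form_lemma_ratio(18464, 8846)
print(f"{noun_ratio:.6f}")
```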

The highest number of forms (12) was observed with the lemma “dan”: dan, dne, dneh, dnem, dneva, dneve, dnevi, dnevih, dnevom, dnevov, dnevu, dni.

The 2nd highest number of forms (10) was observed with the lemma “gost”: gost, gosta, goste, gosteh, gosti, gostih, gostje, gostom, gostov, gostu.

The 3rd highest number of forms (9) was observed with the lemma “del”: del, dela, dele, deli, delih, delom, deloma, delov, delu.

NOUN occurs with 4 features: Case (56914; 100% instances), Gender (56914; 100% instances), Number (56914; 100% instances), Animacy (3465; 6% instances)

NOUN occurs with 14 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing

NOUN occurs with 55 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing (5014 tokens). Examples: predsednik, del, človek, čas, zakon, direktor, svet, otrok, oče, sistem
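Counts of this kind can be derived from the FEATS column of a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: `parse_feats` and `feature_stats` are hypothetical helper names, and the sample values are made up for illustration rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats: str) -> dict:
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|Gender=Masc' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs and full feature combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        parsed = parse_feats(feats)
        features.update(parsed.keys())
        pairs.update(f"{name}={value}" for name, value in parsed.items())
        combos[feats] += 1
    return features, pairs, combos


# Hypothetical FEATS values for three NOUN tokens (illustration only).
sample = [
    "Case=Nom|Gender=Masc|Number=Sing",
    "Case=Nom|Gender=Masc|Number=Sing",
    "Animacy=Inan|Case=Acc|Gender=Masc|Number=Sing",
]
features, pairs, combos = feature_stats(sample)
print(len(features), "features,", len(pairs), "feature-value pairs,", len(combos), "combinations")
print(combos.most_common(1))
```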

Relations

NOUN nodes are attached to their parents using 26 different relations: nmod (15552; 27% instances), obl (13804; 24% instances), nsubj (9429; 17% instances), obj (8131; 14% instances), conj (4983; 9% instances), root (1447; 3% instances), appos (1138; 2% instances), parataxis (576; 1% instances), iobj (567; 1% instances), list (242; 0% instances), orphan (241; 0% instances), xcomp (221; 0% instances), acl (195; 0% instances), ccomp (110; 0% instances), fixed (91; 0% instances), advcl (87; 0% instances), vocative (37; 0% instances), dep (16; 0% instances), csubj (13; 0% instances), discourse (12; 0% instances), dislocated (6; 0% instances), amod (4; 0% instances), flat (4; 0% instances), flat:foreign (3; 0% instances), flat:name (3; 0% instances), nummod (2; 0% instances)

Parents of NOUN nodes belong to 14 different parts of speech: VERB (28178; 50% instances), NOUN (21474; 38% instances), ADJ (4152; 7% instances), the artificial root, which has no part of speech (1447; 3% instances), PROPN (607; 1% instances), ADV (249; 0% instances), DET (239; 0% instances), NUM (218; 0% instances), ADP (89; 0% instances), PRON (82; 0% instances), X (75; 0% instances), SYM (56; 0% instances), PART (33; 0% instances), AUX (15; 0% instances)
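The relation and parent statistics above can be approximated by reading the HEAD and DEPREL columns of a CoNLL-U file. The sketch below uses only the standard library and relies on the standard 10-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The function names and the sample sentence are hypothetical and are not taken from the treebank or from the script that produced this page.

```python
from collections import Counter


def read_sentences(lines):
    """Yield sentences as lists of 10-column CoNLL-U token rows."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep only basic tokens; skip multiword ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                sentence.append(cols)
    if sentence:
        yield sentence


def attachment_stats(lines, upos="NOUN"):
    """Count incoming relations and parent UPOS tags for nodes with the given UPOS."""
    relations, parent_pos = Counter(), Counter()
    for sent in read_sentences(lines):
        pos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            if cols[3] != upos:
                continue
            relations[cols[7]] += 1
            # HEAD 0 is the artificial root; it has no UPOS, so it is counted as "".
            parent_pos[pos_by_id.get(cols[6], "")] += 1
    return relations, parent_pos


# Made-up example sentence ("Otrok bere knjigo."), for illustration only.
sample = [
    "\t".join(["1", "Otrok", "otrok", "NOUN", "_", "_", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "bere", "brati", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["3", "knjigo", "knjiga", "NOUN", "_", "_", "2", "obj", "_", "_"]),
    "\t".join(["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
    "",
]
relations, parent_pos = attachment_stats(sample)
print(relations)
print(parent_pos)
```

To run this on a real file, pass an open file handle instead of the sample list, for example `attachment_stats(open(path, encoding="utf-8"))`, where `path` points to a local CoNLL-U file.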

8769 (15%) NOUN nodes are leaves.

19768 (35%) NOUN nodes have one child.

17075 (30%) NOUN nodes have two children.

11302 (20%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 28.
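As a consistency check, the four child-count buckets above should add up to the total number of NOUN tokens (56914), and the percentages follow from the counts. The sketch below assumes the page rounds each share to the nearest whole percent; the variable names are illustrative.

```python
# Child-count buckets for NOUN nodes, as reported on this page.
bucket_counts = {
    "leaves": 8769,
    "one child": 19768,
    "two children": 17075,
    "three or more children": 11302,
}

total = sum(bucket_counts.values())

# Share of each bucket, rounded to a whole percent (assumed rounding rule).
shares = {name: round(100 * count / total) for name, count in bucket_counts.items()}

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
```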

Children of NOUN nodes are attached using 31 different relations: amod (21452; 22% instances), case (19785; 20% instances), nmod (18263; 19% instances), punct (7307; 8% instances), det (5696; 6% instances), conj (4982; 5% instances), cc (3644; 4% instances), acl (3461; 4% instances), nummod (3405; 4% instances), advmod (2176; 2% instances), cop (1625; 2% instances), appos (1213; 1% instances), nsubj (1152; 1% instances), orphan (532; 1% instances), mark (375; 0% instances), parataxis (367; 0% instances), aux (361; 0% instances), obl (324; 0% instances), list (264; 0% instances), csubj (86; 0% instances), advcl (82; 0% instances), cc:preconj (77; 0% instances), dep (62; 0% instances), discourse (33; 0% instances), obj (18; 0% instances), vocative (6; 0% instances), flat:foreign (5; 0% instances), fixed (4; 0% instances), ccomp (1; 0% instances), dislocated (1; 0% instances), flat:name (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: ADJ (22116; 23% instances), NOUN (21474; 22% instances), ADP (19239; 20% instances), PUNCT (7307; 8% instances), DET (6177; 6% instances), CCONJ (3828; 4% instances), PROPN (3686; 4% instances), NUM (3607; 4% instances), VERB (3371; 3% instances), AUX (1987; 2% instances), PART (1386; 1% instances), SCONJ (956; 1% instances), ADV (947; 1% instances), X (342; 0% instances), PRON (253; 0% instances), SYM (73; 0% instances), INTJ (11; 0% instances)