
This page pertains to UD version 2.

Treebank Statistics: UD_Slovenian-SSJ: POS Tags: PROPN

There are 5244 PROPN lemmas (20%), 6458 PROPN types (13%) and 10578 PROPN tokens (4%). Out of 17 observed tags, the rank of PROPN is: 3 in number of lemmas, 4 in number of types and 9 in number of tokens.
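Counts of this kind can be reproduced from a treebank file in CoNLL-U format. The following is a minimal sketch, not the script that generated this page: it assumes that a "type" is a distinct word form (FORM column, compared case-sensitively) and that a "lemma" is a distinct value of the LEMMA column. The function name `upos_counts` and the toy sample are invented for illustration.

```python
def upos_counts(conllu_text, upos="PROPN"):
    """Return (distinct lemmas, distinct word forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only ordinary word lines: 10 columns and a plain integer ID.
        # This skips comments, blank lines, multiword ranges ("1-2")
        # and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:          # column 4 is UPOS
            lemmas.add(cols[2])      # column 3 is LEMMA
            types.add(cols[1])       # column 2 is FORM
            tokens += 1
    return len(lemmas), len(types), tokens


# Toy sample invented for illustration; it is not taken from the treebank.
ROWS = [
    ("1", "V", "v", "ADP", "_", "_", "2", "case", "_", "_"),
    ("2", "Sloveniji", "Slovenija", "PROPN", "_", "_", "0", "root", "_", "_"),
    (),  # an empty row becomes the blank line that separates sentences
    ("1", "Ljubljana", "Ljubljana", "PROPN", "_", "_", "0", "root", "_", "_"),
    ("2", "in", "in", "CCONJ", "_", "_", "3", "cc", "_", "_"),
    ("3", "Slovenija", "Slovenija", "PROPN", "_", "_", "1", "conj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

print(upos_counts(SAMPLE))  # (2, 3, 3): 2 lemmas, 3 types, 3 tokens
```

Running the same function once per tag and sorting the results would give the ranks quoted above.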

The 10 most frequent PROPN lemmas: Slovenija, Ljubljana, Evropa, EU, Maribor, ZDA, Amerika, Nemčija, Slovenec, Italija

The 10 most frequent PROPN types: Slovenije, Sloveniji, EU, Slovenija, ZDA, Evropi, Ljubljana, Ljubljani, New, Evrope

The 10 most frequent ambiguous lemmas: New (PROPN 31, X 1), Sonce (PROPN 13, NOUN 1), Al (PROPN 7, X 4), Real (PROPN 3, X 1), Allgemeine (PROPN 2, X 1), Grand (PROPN 2, X 1), MB (NOUN 5, PROPN 2), Miss (PROPN 2, X 1), School (PROPN 2, X 1), VPR (PROPN 2, NOUN 1)

The 10 most frequent ambiguous types: New (PROPN 29, X 3), Windows (PROPN 17, X 1), Ali (ADV 44, CCONJ 16, PROPN 10, X 1), Zemlja (PROPN 8, NOUN 1), Al (PROPN 7, X 4), Hrvaške (PROPN 7, ADJ 1), Union (PROPN 7, X 2), Dolenjske (PROPN 6, ADJ 3), Nato (ADV 18, PROPN 6), Slovenskem (PROPN 6, ADJ 2)
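An item is ambiguous here when it occurs with PROPN and with at least one other tag. The sketch below shows one way to derive such a list from (form or lemma, tag) pairs. The ordering by descending PROPN count with alphabetical tie-breaking is an assumption that merely looks consistent with the lists above; `ambiguous_items` is a hypothetical helper and the observations are toy data.

```python
from collections import Counter, defaultdict


def ambiguous_items(observations, target="PROPN", n=10):
    """observations: iterable of (form_or_lemma, upos) pairs, one per token."""
    tags = defaultdict(Counter)
    for key, upos in observations:
        tags[key][upos] += 1
    # Ambiguous = seen with the target tag and with at least one other tag.
    hits = [(key, dict(c)) for key, c in tags.items()
            if target in c and len(c) > 1]
    # Assumed ordering: descending target-tag count, then alphabetical.
    hits.sort(key=lambda item: (-item[1][target], item[0]))
    return hits[:n]


# Toy observations invented for illustration; not the treebank counts.
obs = (
    [("New", "PROPN")] * 3 + [("New", "X")]
    + [("Nato", "ADV")] * 2 + [("Nato", "PROPN")]
    + [("Maribor", "PROPN")]
)
for key, counts in ambiguous_items(obs):
    print(key, counts)
```

"Maribor" is left out of the result because it is only ever tagged PROPN in the toy data.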

Morphology

The form / lemma ratio of PROPN is 1.231503 (the average of all parts of speech is 1.935546).
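The reported ratio is numerically consistent with dividing the number of PROPN types by the number of PROPN lemmas given earlier on this page. Whether the page generator defines the ratio exactly this way is an assumption; the check below only shows that the figures agree.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# 6458 PROPN types and 5244 PROPN lemmas, as stated on this page.
ratio = form_lemma_ratio(6458, 5244)
print(f"{ratio:.6f}")  # 1.231503
```

A ratio close to 1 means that most lemmas appear in a single form; PROPN lies well below the average over all parts of speech.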

The highest number of forms observed for a PROPN lemma is 6. The three top-ranked lemmas all reach it:

“Francoz”: Francoz, Francoza, Francoze, Francozi, Francozom, Francozov.

“Hrvat”: Hrvat, Hrvate, Hrvati, Hrvatom, Hrvatov, Hrvatu.

“Ljubljančan”: Ljubljančan, Ljubljančana, Ljubljančane, Ljubljančani, Ljubljančanom, Ljubljančanov.
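Ranking lemmas by their number of distinct forms can be sketched as follows. The input uses a subset of the forms listed on this page; the helper name `top_lemmas` and the alphabetical tie-breaking rule are assumptions made for illustration.

```python
from collections import defaultdict


def top_lemmas(pairs, n=3):
    """pairs: iterable of (form, lemma). Return the n lemmas with most forms."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    ranked = [(lemma, sorted(fs)) for lemma, fs in forms.items()]
    # Descending number of forms; ties broken alphabetically (assumed).
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked[:n]


# Subset of the forms quoted on this page, paired with their lemmas.
pairs = (
    [(f, "Francoz") for f in
     ["Francoz", "Francoza", "Francoze", "Francozi", "Francozom", "Francozov"]]
    + [(f, "Hrvat") for f in
       ["Hrvat", "Hrvate", "Hrvati", "Hrvatom", "Hrvatov", "Hrvatu"]]
    + [(f, "Ljubljana") for f in ["Ljubljana", "Ljubljani"]]
)
for lemma, forms in top_lemmas(pairs, n=2):
    print(lemma, len(forms), forms)
```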

PROPN occurs with 4 features: Case (10578; 100% instances), Gender (10578; 100% instances), Number (10578; 100% instances), Animacy (440; 4% instances)

PROPN occurs with 14 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing

PROPN occurs with 35 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing (4365 tokens). Examples: New, Maribor, Janez, Bojan, Jože, Boris, John, Peter, Windows, Dušan
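The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of a CoNLL-U file. This is a minimal sketch under the assumption that a combination is the full FEATS string of a token; `feature_stats` is a hypothetical helper and the input strings are toy data.

```python
from collections import Counter


def feature_stats(feats_values):
    """feats_values: iterable of FEATS strings (CoNLL-U column 6), one per token."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1
        if feats == "_":  # "_" marks a token without any features
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            pairs[pair] += 1      # e.g. "Case=Nom"
            features[name] += 1   # e.g. "Case"
    return features, pairs, combos


# Toy FEATS strings invented for illustration; not the treebank counts.
feats = (
    ["Case=Nom|Gender=Masc|Number=Sing"] * 3
    + ["Case=Gen|Gender=Fem|Number=Sing"]
    + ["Animacy=Anim|Case=Acc|Gender=Masc|Number=Sing"]
)
features, pairs, combos = feature_stats(feats)
print(len(features), len(pairs), len(combos))  # 4 7 3
print(combos.most_common(1))
```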

Relations

PROPN nodes are attached to their parents using 21 different relations: nmod (3667; 35% instances), nsubj (1784; 17% instances), flat:name (1608; 15% instances), obl (1009; 10% instances), conj (962; 9% instances), appos (385; 4% instances), obj (334; 3% instances), list (280; 3% instances), parataxis (201; 2% instances), root (152; 1% instances), iobj (60; 1% instances), orphan (41; 0% instances), flat:foreign (31; 0% instances), vocative (25; 0% instances), acl (17; 0% instances), xcomp (9; 0% instances), advcl (6; 0% instances), amod (3; 0% instances), flat (2; 0% instances), csubj (1; 0% instances), dep (1; 0% instances)

Parents of PROPN nodes belong to 13 different parts of speech: NOUN (3685; 35% instances), PROPN (3294; 31% instances), VERB (2926; 28% instances), ADJ (329; 3% instances), no part of speech, i.e. the artificial root (152; 1% instances), X (122; 1% instances), DET (21; 0% instances), ADV (17; 0% instances), NUM (15; 0% instances), PRON (8; 0% instances), PART (4; 0% instances), SYM (3; 0% instances), AUX (2; 0% instances)
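Both lists above describe the incoming edge of each PROPN node: its relation label (DEPREL) and the tag of its parent (found through HEAD). A HEAD value of 0 points to the artificial root, which has no UPOS tag; the sketch below records it under an empty string. The function name `parent_stats` and the toy sample are invented for illustration.

```python
from collections import Counter


def parent_stats(conllu_text, upos="PROPN"):
    """Count DEPREL labels and parent UPOS tags for all nodes with the given tag."""
    relations, parent_tags = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        tag_of = {r[0]: r[3] for r in rows}  # node ID -> UPOS
        for r in rows:
            if r[3] == upos:
                relations[r[7]] += 1
                # HEAD "0" is the artificial root: no tag, stored as "".
                parent_tags[tag_of.get(r[6], "")] += 1
    return relations, parent_tags


# Toy sample invented for illustration; it is not taken from the treebank.
ROWS = [
    ("1", "V", "v", "ADP", "_", "_", "2", "case", "_", "_"),
    ("2", "Sloveniji", "Slovenija", "PROPN", "_", "_", "0", "root", "_", "_"),
    (),  # sentence boundary
    ("1", "Ljubljana", "Ljubljana", "PROPN", "_", "_", "0", "root", "_", "_"),
    ("2", "in", "in", "CCONJ", "_", "_", "3", "cc", "_", "_"),
    ("3", "Slovenija", "Slovenija", "PROPN", "_", "_", "1", "conj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

relations, parent_tags = parent_stats(SAMPLE)
print(relations)
print(parent_tags)
```

In the toy data the count for the empty tag equals the count for the root relation, the same agreement that holds between the two lists on this page (152 in both).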

4670 (44%) PROPN nodes are leaves.

3457 (33%) PROPN nodes have one child.

1478 (14%) PROPN nodes have two children.

973 (9%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 32.
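The child degree of a node is the number of nodes in the same sentence whose HEAD points to it. A minimal sketch of how the distribution above could be computed from CoNLL-U text follows; `child_degrees` is a hypothetical helper and the sentence is a toy example.

```python
from collections import Counter


def child_degrees(conllu_text, upos="PROPN"):
    """Return a Counter mapping child degree -> number of nodes with that degree."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [line.split("\t") for line in block.splitlines()]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        n_children = Counter(r[6] for r in rows)  # HEAD id -> number of children
        for r in rows:
            if r[3] == upos:
                degrees[n_children[r[0]]] += 1  # missing key counts as 0
    return degrees


# Toy sentence invented for illustration; it is not taken from the treebank.
ROWS = [
    ("1", "Janez", "Janez", "PROPN", "_", "_", "0", "root", "_", "_"),
    ("2", "Novak", "Novak", "PROPN", "_", "_", "1", "flat:name", "_", "_"),
    ("3", "iz", "iz", "ADP", "_", "_", "4", "case", "_", "_"),
    ("4", "Ljubljane", "Ljubljana", "PROPN", "_", "_", "1", "nmod", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

degrees = child_degrees(SAMPLE)
print("leaves:", degrees[0])
print("one child:", degrees[1])
print("two children:", degrees[2])
print("three or more:", sum(c for d, c in degrees.items() if d >= 3))
print("highest degree:", max(degrees))
```

The four counts reported on this page add up to the 10578 PROPN tokens (4670 + 3457 + 1478 + 973), which is the check one would expect such a breakdown to pass.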

Children of PROPN nodes are attached using 26 different relations: case (2018; 19% instances), punct (2000; 19% instances), flat:name (1644; 16% instances), conj (1014; 10% instances), nmod (746; 7% instances), nummod (547; 5% instances), amod (481; 5% instances), cc (451; 4% instances), appos (379; 4% instances), list (326; 3% instances), acl (219; 2% instances), advmod (163; 2% instances), flat:foreign (88; 1% instances), orphan (84; 1% instances), cop (56; 1% instances), det (52; 0% instances), parataxis (45; 0% instances), nsubj (40; 0% instances), dep (23; 0% instances), mark (21; 0% instances), obl (20; 0% instances), aux (14; 0% instances), cc:preconj (9; 0% instances), advcl (3; 0% instances), discourse (2; 0% instances), flat (2; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: PROPN (3294; 32% instances), PUNCT (2000; 19% instances), ADP (1993; 19% instances), NOUN (608; 6% instances), NUM (592; 6% instances), ADJ (519; 5% instances), CCONJ (465; 4% instances), X (360; 3% instances), VERB (211; 2% instances), PART (91; 1% instances), ADV (86; 1% instances), AUX (70; 1% instances), DET (69; 1% instances), SCONJ (54; 1% instances), PRON (17; 0% instances), SYM (14; 0% instances), INTJ (4; 0% instances)