
This page pertains to UD version 2.

Treebank Statistics: UD_Karelian-KKPP: POS Tags: PROPN

There are 96 PROPN lemmas (10%), 118 PROPN types (8%) and 181 PROPN tokens (6%). Out of 14 observed tags, the rank of PROPN is: 5 in number of lemmas, 4 in number of types and 7 in number of tokens.
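As a rough illustration of how lemma, type and token counts like these can be derived, the sketch below counts them for one UPOS tag in CoNLL-U text. The function name `upos_counts` and the sample fragment are hypothetical and not taken from the treebank. The sketch assumes that types are distinct, case-sensitive word forms, which may differ from how the official statistics were generated.

```python
from collections import Counter

# Made-up CoNLL-U fragment for illustration only (not from UD_Karelian-KKPP).
SAMPLE = """\
1\tKarjalan\tKarjala\tPROPN\t_\tCase=Gen|Number=Sing\t0\troot\t_\t_

1\tKarjalan\tKarjala\tPROPN\t_\tCase=Gen|Number=Sing\t0\troot\t_\t_

1\tKarjalašša\tKarjala\tPROPN\t_\tCase=Ine|Number=Sing\t0\troot\t_\t_
"""


def upos_counts(conllu_text, upos):
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag."""
    lemmas, forms = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            forms[cols[1]] += 1   # FORM column
            lemmas[cols[2]] += 1  # LEMMA column
    return {
        "lemmas": len(lemmas),
        "types": len(forms),
        "tokens": sum(forms.values()),
    }


print(upos_counts(SAMPLE, "PROPN"))
```

On the sample, the three PROPN tokens share one lemma and two distinct forms.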

The 10 most frequent PROPN lemmas: Kalevala, Karjala, Venäjä, Petroskoi, Anna, Kiestinki, WWF, Art-teltta, Irina, Manala

The 10 most frequent PROPN types: Karjalan, Kalevala, Kalevalan, Venäjän, Kiestinkin, Anna, Irina, Petroskoin, Venäjällä, Manalah

The 10 most frequent ambiguous lemmas: Antilaš (PROPN 1, X 1), Kalevala-seikkailu#peli (NOUN 1, PROPN 1)

The 10 most frequent ambiguous types: Anna (PROPN 4, VERB 1), Antilahien (PROPN 1, X 1)

Morphology

The form / lemma ratio of PROPN is 1.229167 (the average of all parts of speech is 1.495298).
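The reported ratio agrees with dividing the number of PROPN types (118) by the number of PROPN lemmas (96) given earlier on this page. The check below is only a sanity calculation under that assumption. It is not a statement of how the statistics script computes the figure.

```python
# Counts as reported on this page for PROPN.
propn_types = 118
propn_lemmas = 96

# Assumed definition: distinct forms divided by distinct lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # 1.229167
```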

The 1st highest number of forms (4) was observed with the lemma “Kalevala”: Kalevala, Kalevalan, Kalevalašta, Kalevalašša.

The 2nd highest number of forms (4) was observed with the lemma “Karjala”: Karjalah, Karjalan, Karjalašta, Karjalašša.

The 3rd highest number of forms (4) was observed with the lemma “Petroskoi”: Petroskoi, Petroskoin, Petroskoissa, Petroskoissaki.

PROPN occurs with 5 features: Case (181; 100% instances), Number (181; 100% instances), Abbr (6; 3% instances), Clitic (2; 1% instances), Typo (1; 1% instances)

PROPN occurs with 12 feature-value pairs: Abbr=Yes, Case=Ade, Case=Ela, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Clitic=Ki, Number=Plur, Number=Sing, Typo=Yes

PROPN occurs with 12 feature combinations. The most frequent feature combination is Case=Gen|Number=Sing (71 tokens). Examples: Karjalan, Kalevalan, Venäjän, Kiestinkin, Petroskoin, Art-teltan, Kižin, Pohjolan, Ainon, Annan

Relations

PROPN nodes are attached to their parents using 14 different relations: nmod:poss (61; 34% instances), obl (28; 15% instances), flat:name (22; 12% instances), conj (20; 11% instances), nsubj (15; 8% instances), appos (8; 4% instances), nmod (8; 4% instances), obj (7; 4% instances), nsubj:cop (4; 2% instances), compound (3; 2% instances), parataxis (2; 1% instances), flat (1; 1% instances), root (1; 1% instances), vocative (1; 1% instances)

Parents of PROPN nodes belong to 8 different parts of speech: NOUN (86; 48% instances), VERB (46; 25% instances), PROPN (39; 22% instances), ADJ (5; 3% instances), NUM (2; 1% instances), ADP (1; 1% instances), PRON (1; 1% instances), no part of speech, i.e. the artificial root (1; 1% instances)

102 (56%) PROPN nodes are leaves.

45 (25%) PROPN nodes have one child.

24 (13%) PROPN nodes have two children.

10 (6%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 5.
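The four child-count figures above sum to the 181 PROPN tokens (102 + 45 + 24 + 10). A minimal sketch of how such a child-degree distribution could be computed from the HEAD column of CoNLL-U data follows. The function name `child_degree_histogram` and the sample sentences are hypothetical, and the official statistics script may work differently.

```python
from collections import Counter

# Made-up CoNLL-U fragment for illustration only (not from UD_Karelian-KKPP).
SAMPLE = """\
1\tAnna\tAnna\tPROPN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_
2\tja\tja\tCCONJ\t_\t_\t3\tcc\t_\t_
3\tIrina\tIrina\tPROPN\t_\tCase=Nom|Number=Sing\t1\tconj\t_\t_

1\tPetroskoi\tPetroskoi\tPROPN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_
"""


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> number of nodes with the given UPOS tag."""
    hist = Counter()
    # Sentences are separated by blank lines in CoNLL-U.
    for block in conllu_text.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in block.splitlines()
            if line and not line.startswith("#")
        ]
        # Keep only regular tokens (integer IDs, ten columns).
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # Count how often each token ID appears as a HEAD within the sentence.
        n_children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[n_children[r[0]]] += 1
    return hist


hist = child_degree_histogram(SAMPLE, "PROPN")
print(dict(hist))
print("leaves:", hist[0], "highest child degree:", max(hist))
```

In the sample, "Anna" and "Irina" each have one child and "Petroskoi" is a leaf.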

Children of PROPN nodes are attached using 15 different relations: punct (51; 39% instances), flat:name (26; 20% instances), conj (15; 11% instances), cc (8; 6% instances), parataxis (7; 5% instances), case (6; 5% instances), compound (4; 3% instances), obl (4; 3% instances), amod (2; 2% instances), appos (2; 2% instances), nmod:poss (2; 2% instances), nsubj:cop (2; 2% instances), cop:own (1; 1% instances), goeswith (1; 1% instances), nmod (1; 1% instances)

Children of PROPN nodes belong to 9 different parts of speech: PUNCT (51; 39% instances), PROPN (39; 30% instances), NOUN (23; 17% instances), CCONJ (8; 6% instances), ADP (5; 4% instances), X (3; 2% instances), ADJ (1; 1% instances), ADV (1; 1% instances), AUX (1; 1% instances)