
This page pertains to UD version 2.

Treebank Statistics: UD_Finnish: POS Tags: PROPN

There are 4606 PROPN lemmas (17%), 5954 PROPN types (11%) and 12062 PROPN tokens (6%). Out of 15 observed tags, the rank of PROPN is: 2 in number of lemmas, 4 in number of types and 7 in number of tokens.
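As a rough illustration of how lemma, type and token counts like these can be derived, here is a minimal sketch that reads CoNLL-U text and counts distinct lemmas, distinct word forms and occurrences for one tag. The sample sentences, the function name `tag_counts` and the choice to treat types as case-sensitive word forms are assumptions made for this example; this is not the script that generated the page.

```python
# Hypothetical two-sentence CoNLL-U fragment (not taken from UD_Finnish).
SAMPLE = "\n".join([
    "1\tSuomen\tSuomi\tPROPN\t_\tCase=Gen|Number=Sing\t2\tnmod:poss\t_\t_",
    "2\tpääkaupunki\tpääkaupunki\tNOUN\t_\tCase=Nom|Number=Sing\t4\tnsubj:cop\t_\t_",
    "3\ton\tolla\tAUX\t_\t_\t4\tcop\t_\t_",
    "4\tHelsinki\tHelsinki\tPROPN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_",
    "",
    "1\tSuomi\tSuomi\tPROPN\t_\tCase=Nom|Number=Sing\t2\tnsubj\t_\t_",
    "2\tvoitti\tvoittaa\tVERB\t_\t_\t0\troot\t_\t_",
])


def tag_counts(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag.

    Assumption: a "type" is a distinct, case-sensitive word form.
    """
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        # Keep only regular word lines; this skips multiword-token
        # ranges such as "1-2" and empty nodes such as "1.1".
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
            tokens += 1
    return len(lemmas), len(types), tokens


print(tag_counts(SAMPLE, "PROPN"))
```

On the sample, the forms Suomen and Suomi share the lemma Suomi, so the tag has fewer lemmas than types.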

The 10 most frequent PROPN lemmas: Eurooppa, Suomi, Turku, EU, Helsinki, Yhdysvallat, the, Venäjä, Mithridates, Kiina

The 10 most frequent PROPN types: euroopan, Turun, suomen, EU:n, Suomessa, the, Helsingin, Yhdysvaltain, Mithridates, Venäjän

The 10 most frequent ambiguous lemmas: KTM (PROPN 17, NOUN 1), A. (PROPN 13, NOUN 2), and (PROPN 10, X 3), kokoomus (NOUN 8, PROPN 1), on (PROPN 4, ADV 1, X 1), a (NOUN 33, X 4, PROPN 1), in (PROPN 6, X 2), i (NOUN 1, PROPN 1), n (PROPN 7, NOUN 6, NUM 1), you (X 2, PROPN 1)

The 10 most frequent ambiguous types: suomen (NOUN 14, PROPN 2), Ranskan (PROPN 22, NOUN 1), A. (PROPN 13, NOUN 2), KTM (PROPN 12, NOUN 1), Te (PROPN 10, PRON 1), and (PROPN 10, X 3), on (AUX 3843, VERB 88, PROPN 4, ADV 1), van (PROPN 4, CCONJ 1), a (NOUN 33, X 4, PROPN 1), in (PROPN 5, X 2)

Morphology

The form / lemma ratio of PROPN is 1.292662 (the average of all parts of speech is 2.060960).
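The PROPN ratio appears to be the number of PROPN types divided by the number of PROPN lemmas, using the counts reported at the top of this page. That reading is an inference from the numbers rather than something the page states, and the sketch below only checks the arithmetic under that assumption.

```python
# Counts reported on this page for PROPN in UD_Finnish.
propn_types = 5954
propn_lemmas = 4606

# Assumption: "form / lemma ratio" means types divided by lemmas.
ratio = propn_types / propn_lemmas
formatted = f"{ratio:.6f}"
print(formatted)
```

Under this assumption the result agrees with the reported value to six decimal places.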

The highest number of forms (12) was observed with the lemma “Suomi”: SUOMI, Suomassa, Suomeen, Suomelle, Suomenkaan, Suomessa, Suomessakin, Suomesta, Suomi, Suomikin, suomea, suomen.

The 2nd highest number of forms (9) was observed with the lemma “Eurooppa”: Euroopalle, Euroopan-, Euroopassa, Euroopasta, Eurooppa, Eurooppaa, Eurooppaan, Eurooppamme, euroopan.

The 3rd highest number of forms (8) was observed with the lemma “EU”: EU, EU:, EU:N, EU:lle, EU:n, EU:ssa, EU:sta, EU:ta.

PROPN occurs with 10 features: Number (12053; 100% instances), Case (12051; 100% instances), Abbr (337; 3% instances), Typo (34; 0% instances), Clitic (25; 0% instances), Style (10; 0% instances), Person[psor] (8; 0% instances), Number[psor] (5; 0% instances), Derivation (3; 0% instances), Degree (1; 0% instances)

PROPN occurs with 29 feature-value pairs: Abbr=Yes, Case=Abl, Case=Ade, Case=All, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Ins, Case=Nom, Case=Par, Case=Tra, Clitic=Han, Clitic=Kaan, Clitic=Kin, Clitic=Ko, Degree=Pos, Derivation=Lainen, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Style=Coll, Typo=Yes

PROPN occurs with 71 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing (6033 tokens). Examples: the, Mithridates, Pekka, New, of, Mårten, Suomi, Eurooppa, Kaijasilta, Simonides

Relations

PROPN nodes are attached to their parents using 25 different relations: nmod:poss (2839; 24% instances), flat:name (2510; 21% instances), nsubj (2026; 17% instances), obl (1255; 10% instances), conj (1009; 8% instances), appos (474; 4% instances), nmod (454; 4% instances), nsubj:cop (407; 3% instances), obj (347; 3% instances), root (261; 2% instances), compound:nn (206; 2% instances), nmod:gsubj (91; 1% instances), nmod:gobj (54; 0% instances), advcl (25; 0% instances), vocative (25; 0% instances), orphan (19; 0% instances), acl:relcl (13; 0% instances), ccomp (10; 0% instances), parataxis (10; 0% instances), xcomp:ds (10; 0% instances), goeswith (9; 0% instances), amod (4; 0% instances), csubj (2; 0% instances), nummod (1; 0% instances), xcomp (1; 0% instances)

Parents of PROPN nodes belong to 15 different parts of speech, counting the artificial root (which has no tag) as one of them: NOUN (4124; 34% instances), PROPN (3674; 30% instances), VERB (3600; 30% instances), no tag, i.e. the root (261; 2% instances), ADJ (237; 2% instances), ADV (58; 0% instances), PRON (50; 0% instances), NUM (30; 0% instances), PUNCT (10; 0% instances), X (9; 0% instances), SYM (3; 0% instances), ADP (2; 0% instances), CCONJ (2; 0% instances), AUX (1; 0% instances), INTJ (1; 0% instances)
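The parent statistics can be approximated by looking up the HEAD of each PROPN word and recording the tag of that head. The sketch below is an illustration under assumptions: the sample sentences and the function name `parent_upos_counts` are hypothetical, and a head of 0 is recorded under an empty tag. The count of 261 for the parent entry without a tag name equals the count of root relations reported above, which is consistent with that treatment.

```python
from collections import Counter

# Hypothetical two-sentence CoNLL-U fragment (not taken from UD_Finnish).
SAMPLE = "\n".join([
    "1\tSuomen\tSuomi\tPROPN\t_\tCase=Gen|Number=Sing\t2\tnmod:poss\t_\t_",
    "2\tpääkaupunki\tpääkaupunki\tNOUN\t_\tCase=Nom|Number=Sing\t4\tnsubj:cop\t_\t_",
    "3\ton\tolla\tAUX\t_\t_\t4\tcop\t_\t_",
    "4\tHelsinki\tHelsinki\tPROPN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_",
    "",
    "1\tSuomi\tSuomi\tPROPN\t_\tCase=Nom|Number=Sing\t2\tnsubj\t_\t_",
    "2\tvoitti\tvoittaa\tVERB\t_\t_\t0\troot\t_\t_",
])


def parent_upos_counts(conllu_text, upos):
    """Count the UPOS tags of the parents of all words tagged `upos`."""
    counts = Counter()
    sentence = {}  # word id -> (upos, head id) for the current sentence

    def flush():
        # Resolve heads once the whole sentence has been read.
        for tag, head in sentence.values():
            if tag == upos:
                # Head 0 is the artificial root, which has no tag.
                parent_tag = sentence[head][0] if head in sentence else ""
                counts[parent_tag] += 1
        sentence.clear()

    # The extra empty line makes sure the last sentence is flushed.
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():
            flush()
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges, empty nodes and unannotated heads.
        if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
            continue
        sentence[int(cols[0])] = (cols[3], int(cols[6]))
    return counts


print(parent_upos_counts(SAMPLE, "PROPN"))
```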

7699 (64%) PROPN nodes are leaves.

2176 (18%) PROPN nodes have one child.

1222 (10%) PROPN nodes have two children.

965 (8%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 33.
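The four child-degree counts above should partition all PROPN tokens. The short check below, using only numbers reported on this page, sums them and recomputes the rounded percentages. The dictionary keys are labels chosen for this example.

```python
# Child-degree counts for PROPN nodes as reported on this page.
child_degree_counts = {"0": 7699, "1": 2176, "2": 1222, "3+": 965}

# The counts should add up to the number of PROPN tokens.
total = sum(child_degree_counts.values())

# Share of each bucket, rounded to a whole percentage.
shares = {k: round(100 * v / total) for k, v in child_degree_counts.items()}

print(total)
print(shares)
```

The total matches the PROPN token count given at the top of the page.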

Children of PROPN nodes are attached using 34 different relations: flat:name (2643; 31% instances), punct (1217; 14% instances), conj (1043; 12% instances), compound:nn (732; 9% instances), cc (593; 7% instances), case (310; 4% instances), advmod (277; 3% instances), nmod (245; 3% instances), appos (216; 3% instances), nsubj:cop (196; 2% instances), acl (187; 2% instances), amod (176; 2% instances), nmod:poss (165; 2% instances), cop (155; 2% instances), acl:relcl (108; 1% instances), obl (62; 1% instances), parataxis (45; 1% instances), nummod (43; 0% instances), det (32; 0% instances), mark (31; 0% instances), cop:own (30; 0% instances), cc:preconj (19; 0% instances), _ (18; 0% instances), aux (16; 0% instances), orphan (12; 0% instances), advcl (11; 0% instances), goeswith (9; 0% instances), flat (8; 0% instances), discourse (3; 0% instances), vocative (3; 0% instances), root (2; 0% instances), nmod:gsubj (1; 0% instances), obj (1; 0% instances), xcomp:ds (1; 0% instances)

Children of PROPN nodes belong to 15 different parts of speech: PROPN (3657; 42% instances), NOUN (1337; 16% instances), PUNCT (1251; 15% instances), CCONJ (603; 7% instances), VERB (353; 4% instances), ADP (305; 4% instances), ADV (290; 3% instances), ADJ (284; 3% instances), AUX (201; 2% instances), NUM (169; 2% instances), PRON (80; 1% instances), SCONJ (30; 0% instances), SYM (26; 0% instances), X (23; 0% instances), INTJ (1; 0% instances)