
This page pertains to UD version 2.

Treebank Statistics: UD_Finnish-PUD: POS Tags: PROPN

There are 1051 PROPN lemmas (21%), 1215 PROPN types (16%) and 1504 PROPN tokens (10%). Out of 15 observed tags, the rank of PROPN is: 2 in number of lemmas, 3 in number of types and 4 in number of tokens.
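The three counts above can be reproduced from a treebank along the following lines. This is a minimal sketch, not the script that generated this page: the function name `tag_counts` and the five-token sample are hypothetical, and it assumes that a "type" is a distinct word form, compared case-sensitively.

```python
def tag_counts(tokens, tag):
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag.

    tokens: iterable of (form, lemma, upos) triples.
    """
    selected = [(form, lemma) for form, lemma, upos in tokens if upos == tag]
    return {
        "lemmas": len({lemma for _, lemma in selected}),
        "types": len({form for form, _ in selected}),
        "tokens": len(selected),
    }


# Hypothetical toy data, not taken from UD_Finnish-PUD.
sample = [
    ("Kiinan", "Kiina", "PROPN"),
    ("Kiinassa", "Kiina", "PROPN"),
    ("Kiinan", "Kiina", "PROPN"),
    ("Ranskan", "Ranska", "PROPN"),
    ("talo", "talo", "NOUN"),
]
counts = tag_counts(sample, "PROPN")
print(counts)
```

On the toy sample, the repeated form Kiinan counts once as a type but twice as a token, which is why the token count is the largest of the three.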

The 10 most frequent PROPN lemmas: Kiina, Yhdysvallat, Trump, Ranska, Venäjä, of, Australia, Britannia, Eurooppa, Hong

The 10 most frequent PROPN types: of, Ranskan, Australian, Kiinan, Hong, Venäjän, Yhdysvaltain, de, the, Euroopan

The 10 most frequent ambiguous lemmas: väli#meri (PROPN 7, NOUN 2), a (PROPN 4, X 1), Alpit (NOUN 2, PROPN 2), BBC (NOUN 2, PROPN 2), Karibianmeri (PROPN 2, NOUN 1), Winstone (PROPN 2, NOUN 1), Wright (PROPN 2, NOUN 1), Filippos (NOUN 1, PROPN 1), I (ADJ 4, PROPN 1), III (ADJ 2, PROPN 1)

The 10 most frequent ambiguous types: a (PROPN 2, X 1), Winstone (PROPN 2, NOUN 1), Alpit (NOUN 1, PROPN 1), BBC:lle (NOUN 1, PROPN 1), Filippos (NOUN 1, PROPN 1), International (NOUN 1, PROPN 1), Itä-Afrikan (NOUN 1, PROPN 1), Karibianmeri (NOUN 1, PROPN 1), Lordi (NOUN 1, PROPN 1), On (AUX 7, PROPN 1, VERB 1)

Morphology

The form / lemma ratio of PROPN is 1.156042 (the average of all parts of speech is 1.520990).
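The PROPN ratio appears to be the number of PROPN types divided by the number of PROPN lemmas reported at the top of the page. The check below is a sketch under that assumption; the variable names are illustrative only.

```python
# Counts reported earlier on this page for PROPN.
propn_types = 1215
propn_lemmas = 1051

# Assumption: "forms" in the ratio are the distinct word forms (types).
ratio = propn_types / propn_lemmas
formatted = f"{ratio:.6f}"
print(formatted)
```

A ratio close to 1 means that most proper-noun lemmas occur in only one inflected form in this treebank.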

The highest number of forms (6) was observed with the lemma “Italia”: Italiaa, Italiaan, Italialla, Italian, Italiassa, Italiasta.

The 2nd highest number of forms (5) was observed with the lemma “Britannia”: Britannia, Britanniaan, Britannialle, Britannian, Britanniassa.

The 3rd highest number of forms (5) was observed with the lemma “Kiina”: Kiina, Kiinaa, Kiinan, Kiinassa, Kiinasta.

PROPN occurs with 6 features: Case (1504; 100% instances), Number (1501; 100% instances), Abbr (15; 1% instances), Degree (2; 0% instances), Derivation (1; 0% instances), NumType (1; 0% instances)

PROPN occurs with 16 feature-value pairs: Abbr=Yes, Case=Abl, Case=Ade, Case=All, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Degree=Pos, Derivation=Lainen, NumType=Ord, Number=Plur, Number=Sing

PROPN occurs with 26 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing (774 tokens). Examples: of, Hong, de, the, Donald, Joseph, Kiina, Qing, North, Trump
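A feature combination such as Case=Nom|Number=Sing is the content of the FEATS column in CoNLL-U: feature-value pairs joined by `|`, with `_` standing for no features. A minimal sketch of splitting such a string into pairs follows; `parse_feats` is a hypothetical helper name, not part of any UD tool.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict of feature -> value."""
    if feats == "_":  # underscore marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


combo = parse_feats("Case=Nom|Number=Sing")
print(combo)
```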

Relations

PROPN nodes are attached to their parents using 19 different relations: flat:name (372; 25% instances), nmod:poss (318; 21% instances), nsubj (241; 16% instances), obl (161; 11% instances), conj (91; 6% instances), obj (67; 4% instances), appos (66; 4% instances), compound:nn (63; 4% instances), nsubj:cop (44; 3% instances), nmod (38; 3% instances), root (12; 1% instances), nmod:gsubj (9; 1% instances), nmod:gobj (7; 0% instances), advcl (4; 0% instances), ccomp (4; 0% instances), parataxis (3; 0% instances), acl:relcl (2; 0% instances), orphan (1; 0% instances), xcomp (1; 0% instances)
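As a consistency check, the relation counts listed above should add up to the 1504 PROPN tokens, and each percentage should be the count divided by that total, rounded to a whole number. The sketch below copies the counts from the list; the rounding rule is an assumption inferred from the printed figures.

```python
# Counts copied from the list of incoming relations above.
relation_counts = {
    "flat:name": 372, "nmod:poss": 318, "nsubj": 241, "obl": 161,
    "conj": 91, "obj": 67, "appos": 66, "compound:nn": 63,
    "nsubj:cop": 44, "nmod": 38, "root": 12, "nmod:gsubj": 9,
    "nmod:gobj": 7, "advcl": 4, "ccomp": 4, "parataxis": 3,
    "acl:relcl": 2, "orphan": 1, "xcomp": 1,
}

total = sum(relation_counts.values())
# Assumption: percentages are rounded to the nearest integer.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
print(percentages["flat:name"], percentages["nmod:gobj"])
```

This also explains the entries shown as 0%: a relation with 7 or fewer instances is below 0.5% of 1504 tokens.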

Parents of PROPN nodes belong to 10 different parts of speech: NOUN (518; 34% instances), PROPN (469; 31% instances), VERB (466; 31% instances), ADJ (21; 1% instances), no part of speech, i.e. the parent is the artificial root (12; 1% instances), PRON (8; 1% instances), ADV (4; 0% instances), NUM (4; 0% instances), ADP (1; 0% instances), AUX (1; 0% instances)

962 (64%) PROPN nodes are leaves.

299 (20%) PROPN nodes have one child.

128 (9%) PROPN nodes have two children.

115 (8%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 8.
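The child-degree figures above (leaves, one child, two children, and so on) can be derived from the HEAD column of a CoNLL-U file. The following is a minimal sketch under stated assumptions, not the script behind this page: `child_degree_histogram` is a hypothetical name, the sample sentence and its annotation are invented for illustration, and multiword-token ranges and empty nodes are simply skipped.

```python
from collections import Counter


def child_degree_histogram(conllu_text, tag):
    """Map number of children -> number of nodes with the given UPOS tag."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = []
        for line in block.splitlines():
            if not line or line.startswith("#"):  # skip comment lines
                continue
            cols = line.split("\t")
            if not cols[0].isdigit():  # skip ranges (1-2) and empty nodes (1.1)
                continue
            rows.append(cols)
        # Column 7 (index 6) is HEAD: count how often each ID is a head.
        n_children = Counter(cols[6] for cols in rows)
        for cols in rows:
            if cols[3] == tag:  # column 4 (index 3) is UPOS
                hist[n_children[cols[0]]] += 1
    return hist


# Invented example sentence: (ID, FORM, LEMMA, UPOS, HEAD, DEPREL).
words = [
    ("1", "Donald", "Donald", "PROPN", "3", "nsubj"),
    ("2", "Trump", "Trump", "PROPN", "1", "flat:name"),
    ("3", "vieraili", "vierailla", "VERB", "0", "root"),
    ("4", "Kiinassa", "Kiina", "PROPN", "3", "obl"),
    ("5", ".", ".", "PUNCT", "3", "punct"),
]
sample = "\n".join(
    "\t".join([i, form, lemma, upos, "_", "_", head, deprel, "_", "_"])
    for i, form, lemma, upos, head, deprel in words
)

hist = child_degree_histogram(sample, "PROPN")
print(dict(hist))
```

In the toy sentence, Donald has one child (Trump, attached as flat:name) while Trump and Kiinassa are leaves. The four figures reported above sum to the PROPN token count: 962 + 299 + 128 + 115 = 1504.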

Children of PROPN nodes are attached using 23 different relations: flat:name (371; 37% instances), punct (137; 14% instances), conj (99; 10% instances), cc (72; 7% instances), compound:nn (70; 7% instances), case (39; 4% instances), advmod (34; 3% instances), amod (27; 3% instances), acl:relcl (23; 2% instances), nmod (23; 2% instances), nsubj:cop (20; 2% instances), acl (19; 2% instances), nmod:poss (17; 2% instances), appos (15; 1% instances), cop (15; 1% instances), mark (6; 1% instances), cop:own (5; 0% instances), obl (4; 0% instances), cc:preconj (2; 0% instances), orphan (2; 0% instances), aux (1; 0% instances), det (1; 0% instances), parataxis (1; 0% instances)

Children of PROPN nodes belong to 13 different parts of speech: PROPN (469; 47% instances), NOUN (138; 14% instances), PUNCT (137; 14% instances), CCONJ (74; 7% instances), ADP (40; 4% instances), VERB (38; 4% instances), ADV (35; 3% instances), ADJ (34; 3% instances), AUX (21; 2% instances), SCONJ (6; 1% instances), PRON (5; 0% instances), X (4; 0% instances), NUM (2; 0% instances)