
This page pertains to UD version 2.

Treebank Statistics: UD_Finnish-PUD: POS Tags: PROPN

There are 1042 PROPN lemmas (21%), 1211 PROPN types (16%) and 1505 PROPN tokens (10%). Out of 15 observed tags, PROPN ranks 2nd in number of lemmas, 3rd in number of types and 4th in number of tokens.
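Counts of this kind could be derived from a CoNLL-U file roughly as follows. This is a minimal sketch, not the script that generated this page: the function name `upos_counts`, the helper `make_row` and the toy sample are assumptions made for illustration, and forms are compared case-sensitively, which may differ from how the page counts types.

```python
def upos_counts(conllu_text, upos="PROPN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        # Keep only ordinary word lines: 10 columns and an integer ID.
        # Multiword-token ranges (1-2) and empty nodes (1.1) are skipped.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:          # column 4 is UPOS
            tokens += 1
            types.add(cols[1])       # column 2 is FORM
            lemmas.add(cols[2])      # column 3 is LEMMA
    return len(lemmas), len(types), tokens


def make_row(idx, form, lemma, tag):
    # Hypothetical helper: fills the columns this sketch ignores with "_".
    return "\t".join([str(idx), form, lemma, tag, "_", "_", "_", "_", "_", "_"])


# Toy input, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = toy example",
    make_row(1, "Kiinan", "Kiina", "PROPN"),
    make_row(2, "Kiina", "Kiina", "PROPN"),
    make_row(3, "Trump", "Trump", "PROPN"),
    make_row(4, "on", "olla", "AUX"),
])

print(upos_counts(SAMPLE))  # (2, 3, 3)
```

Here two forms of the lemma "Kiina" count as two types but one lemma, which is why the type count on this page exceeds the lemma count.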

The 10 most frequent PROPN lemmas: Kiina, Yhdysvallat, Trump, Ranska, Venäjä, of, Australia, Britannia, Eurooppa, Hong

The 10 most frequent PROPN types: of, Ranskan, Australian, Kiinan, Hong, Venäjän, Yhdysvaltain, de, the, Euroopan

The 10 most frequent ambiguous lemmas: a (PROPN 4, X 1), III (PROPN 3, ADJ 2), Alpit (NOUN 2, PROPN 2), BBC (NOUN 2, PROPN 2), Winstone (PROPN 2, NOUN 1), Wright (PROPN 2, NOUN 1), Filippos (NOUN 1, PROPN 1), I (ADJ 4, PROPN 1), IV (ADJ 1, PROPN 1), Papua-Uusi-Guinea (NOUN 2, PROPN 1)

The 10 most frequent ambiguous types: a (PROPN 2, X 1), Winstone (PROPN 2, NOUN 1), Alpit (NOUN 1, PROPN 1), BBC:lle (NOUN 1, PROPN 1), Filippos (NOUN 1, PROPN 1), On (AUX 7, PROPN 1, VERB 1), Papua-Uudessa-Guineassa (), Pyhän (ADJ 1, PROPN 1), USA:n (NOUN 1, PROPN 1)

Morphology

The form / lemma ratio of PROPN is 1.162188 (the average of all parts of speech is 1.526379).
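The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. The short check below only reproduces that arithmetic; the variable names are illustrative.

```python
# Figures quoted earlier on this page.
propn_types = 1211
propn_lemmas = 1042

# Average number of distinct word forms per lemma.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # 1.162188
```

A ratio close to 1 means that most proper-noun lemmas occur in only one inflected form in this treebank, compared with about 1.5 forms per lemma across all parts of speech.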

The 1st highest number of forms (6) was observed with the lemma “Italia”: Italiaa, Italiaan, Italialla, Italian, Italiassa, Italiasta.

The 2nd highest number of forms (5) was observed with the lemma “Britannia”: Britannia, Britanniaan, Britannialle, Britannian, Britanniassa.

The 3rd highest number of forms (5) was observed with the lemma “Kiina”: Kiina, Kiinaa, Kiinan, Kiinassa, Kiinasta.

PROPN occurs with 6 features: Case (1505; 100% instances), Number (1502; 100% instances), Abbr (16; 1% instances), Degree (2; 0% instances), Derivation (1; 0% instances), NumType (1; 0% instances)

PROPN occurs with 16 feature-value pairs: Abbr=Yes, Case=Abl, Case=Ade, Case=All, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Degree=Pos, Derivation=Lainen, NumType=Ord, Number=Plur, Number=Sing

PROPN occurs with 26 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing (772 tokens). Examples: of, Hong, de, the, Donald, Joseph, Kiina, Qing, North, Trump
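The three feature counts above (features, feature-value pairs, feature combinations) can be obtained from the FEATS column of a CoNLL-U file, where pairs are written as `Name=Value` and joined with `|`. The sketch below assumes that format; `parse_feats`, `feature_stats` and the toy input are hypothetical names and data, not part of any UD tool.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if feats == "_":  # "_" marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        for name, value in parse_feats(feats).items():
            features[name] += 1
            pairs[f"{name}={value}"] += 1
    return features, pairs, combos


# Toy FEATS values for four PROPN tokens, not taken from the treebank.
toy = [
    "Case=Nom|Number=Sing",
    "Case=Gen|Number=Sing",
    "Case=Nom|Number=Sing",
    "Abbr=Yes|Case=Nom|Number=Sing",
]
features, pairs, combos = feature_stats(toy)
print(len(features), len(pairs), len(combos))  # 3 4 3
print(combos.most_common(1))  # [('Case=Nom|Number=Sing', 2)]
```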

Relations

PROPN nodes are attached to their parents using 19 different relations: flat:name (372; 25% instances), nmod:poss (322; 21% instances), nsubj (239; 16% instances), obl (162; 11% instances), conj (91; 6% instances), appos (66; 4% instances), obj (65; 4% instances), compound:nn (63; 4% instances), nsubj:cop (44; 3% instances), nmod (38; 3% instances), root (12; 1% instances), nmod:gsubj (9; 1% instances), nmod:gobj (7; 0% instances), advcl (4; 0% instances), ccomp (4; 0% instances), parataxis (3; 0% instances), acl:relcl (2; 0% instances), orphan (1; 0% instances), xcomp (1; 0% instances)

Parents of PROPN nodes belong to 10 different parts of speech: NOUN (514; 34% instances), PROPN (475; 32% instances), VERB (464; 31% instances), ADJ (22; 1% instances), no parent, i.e. the PROPN node is the sentence root (12; 1% instances), PRON (8; 1% instances), ADV (4; 0% instances), NUM (4; 0% instances), ADP (1; 0% instances), AUX (1; 0% instances)

961 (64%) PROPN nodes are leaves.

299 (20%) PROPN nodes have one child.

129 (9%) PROPN nodes have two children.

116 (8%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 8.
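The child-degree figures above can be checked and reproduced along the following lines. This is a sketch under assumptions: each sentence is represented as a list of `(id, head, upos)` tuples, and `child_degree_histogram` and the toy sentence are hypothetical. The first part only re-derives the rounded percentages from the counts quoted on this page.

```python
from collections import Counter

# Counts quoted above: leaves, one child, two children, three or more.
reported = {"0": 961, "1": 299, "2": 129, "3+": 116}
total = sum(reported.values())  # equals the 1505 PROPN tokens
shares = {k: round(100 * v / total) for k, v in reported.items()}
print(total, shares)  # 1505 {'0': 64, '1': 20, '2': 9, '3+': 8}


def child_degree_histogram(sentences, upos="PROPN"):
    """Map number of children -> number of nodes with that tag."""
    hist = Counter()
    for sent in sentences:
        # How many tokens name each ID as their head.
        n_children = Counter(head for _, head, _ in sent)
        for tok_id, _, tag in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1
    return hist


# Toy sentence "Donald Trump vieraili Kiinassa" with assumed attachments:
# Trump -> Donald (flat:name), Donald and Kiinassa -> vieraili, vieraili -> root.
toy = [[(1, 3, "PROPN"), (2, 1, "PROPN"), (3, 0, "VERB"), (4, 3, "PROPN")]]
hist = child_degree_histogram(toy)
print(hist[0], hist[1])  # 2 1
```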

Children of PROPN nodes are attached using 22 different relations: flat:name (379; 38% instances), punct (137; 14% instances), conj (99; 10% instances), cc (72; 7% instances), compound:nn (70; 7% instances), case (39; 4% instances), advmod (34; 3% instances), amod (27; 3% instances), acl:relcl (23; 2% instances), nmod (22; 2% instances), nsubj:cop (20; 2% instances), acl (19; 2% instances), nmod:poss (17; 2% instances), appos (16; 2% instances), cop (15; 1% instances), mark (6; 1% instances), cop:own (5; 0% instances), obl (4; 0% instances), cc:preconj (2; 0% instances), orphan (2; 0% instances), aux (1; 0% instances), parataxis (1; 0% instances)

Children of PROPN nodes belong to 13 different parts of speech: PROPN (475; 47% instances), NOUN (140; 14% instances), PUNCT (137; 14% instances), CCONJ (74; 7% instances), ADP (39; 4% instances), VERB (38; 4% instances), ADV (36; 4% instances), ADJ (34; 3% instances), AUX (21; 2% instances), SCONJ (6; 1% instances), PRON (4; 0% instances), X (4; 0% instances), NUM (2; 0% instances)