
This page pertains to UD version 2.

Treebank Statistics: UD_Latvian-LVTB: POS Tags: PROPN

There are 4189 PROPN lemmas (18%), 5520 PROPN types (11%) and 12506 PROPN tokens (4%). Out of 17 observed tags, the rank of PROPN is: 3 in number of lemmas, 4 in number of types and 7 in number of tokens.
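As a rough illustration of how the three counts differ, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one tag in a CoNLL-U fragment. The sample rows, the helper name `tag_counts`, and the choice to compare forms case-sensitively are assumptions made for this example; this is not the script that produced the page.

```python
# Toy CoNLL-U rows, invented for illustration (not real treebank sentences).
SAMPLE = (
    "1\tLatvijas\tLatvija\tPROPN\t_\t_\t0\troot\t_\t_\n"
    "2\tun\tun\tCCONJ\t_\t_\t3\tcc\t_\t_\n"
    "3\tRīgas\tRīga\tPROPN\t_\t_\t1\tconj\t_\t_\n"
    "\n"
    "1\tLatvijā\tLatvija\tPROPN\t_\t_\t0\troot\t_\t_\n"
    "2\tLatvijas\tLatvija\tPROPN\t_\t_\t1\tnmod\t_\t_\n"
)


def tag_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            lemmas.add(cols[2])  # LEMMA column
            types.add(cols[1])   # FORM column, compared case-sensitively
            tokens += 1
    return len(lemmas), len(types), tokens


n_lemmas, n_types, n_tokens = tag_counts(SAMPLE, "PROPN")
print(n_lemmas, n_types, n_tokens)
```

In the toy fragment the form Latvijas occurs twice, so it adds two tokens but only one type, and both Latvijas and Latvijā map to the single lemma Latvija.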

The 10 most frequent PROPN lemmas: Latvija, Rīga, Eiropa, Krievija, Sofija, ES, Saeima, LETA, Andris, Baltija

The 10 most frequent PROPN types: Latvijas, Latvijā, Eiropas, Rīgas, ES, Krievijas, LETA, Sofija, Rīgā, Baltijas

The 10 most frequent ambiguous lemmas: m. (PROPN 12, X 1), d. (PROPN 10, X 1), EK (PROPN 15, NOUN 1), g. (NOUN 11, PROPN 4), FM (PROPN 9, X 1), Positivus (PROPN 6, X 5), V (PROPN 6, ADJ 1, NUM 1, SYM 1), BAS (PROPN 4, X 1), Google (PROPN 4, X 1), SEB (PROPN 4, SYM 1)

The 10 most frequent ambiguous types: Liepājas (PROPN 36, NOUN 2), Saules (PROPN 26, NOUN 2), M. (PROPN 23, X 1), Satversmes (PROPN 21, NOUN 1), D. (PROPN 19, X 1), EK (PROPN 15, NOUN 1), Jāņu (PROPN 15, NOUN 1), Mēness (PROPN 14, NOUN 2), airBaltic (PROPN 10, X 1), FM (PROPN 9, X 1)
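An ambiguous lemma (or type) here is one that is observed with more than one tag. The sketch below shows one way such a per-tag breakdown could be collected. The EK and g. counts are copied from the list above; the unambiguous control entry and the helper name `ambiguous_lemmas` are hypothetical, and the ranking used by the page is not reproduced.

```python
from collections import Counter, defaultdict

# (lemma, UPOS) observations. The EK and g. counts mirror the figures quoted
# above; "Latvija" with 3 PROPN tokens is a made-up unambiguous control.
OBSERVATIONS = (
    [("EK", "PROPN")] * 15 + [("EK", "NOUN")] * 1
    + [("g.", "NOUN")] * 11 + [("g.", "PROPN")] * 4
    + [("Latvija", "PROPN")] * 3
)


def ambiguous_lemmas(observations, upos):
    """Map each lemma seen with `upos` and at least one other tag to its tag counts."""
    by_lemma = defaultdict(Counter)
    for lemma, tag in observations:
        by_lemma[lemma][tag] += 1
    return {
        lemma: tags.most_common()  # most frequent tag first
        for lemma, tags in by_lemma.items()
        if upos in tags and len(tags) > 1
    }


result = ambiguous_lemmas(OBSERVATIONS, "PROPN")
for lemma, tags in result.items():
    print(lemma, tags)
```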

Morphology

The form / lemma ratio of PROPN is 1.317737 (the average of all parts of speech is 2.244795).
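The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of the page. The check below assumes that this is how the figure was derived; the variable names are illustrative.

```python
# Counts quoted at the top of this page.
propn_lemmas = 4189
propn_types = 5520

# Average number of distinct word forms per lemma.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # 1.317737
```

A value this close to 1 means most proper-noun lemmas appear in only one or two forms, well below the treebank-wide average of about 2.24.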

The highest number of forms (7) was observed with the lemma “Jānis”: Jāni, Jānim, Jānis, Jāņa, Jāņi, Jāņiem, Jāņu.

The 2nd highest number of forms (6) was observed with the lemma “Eiropa”: EIROPAS, Eiropa, Eiropai, Eiropas, Eiropu, Eiropā.

The 3rd highest number of forms (6) was observed with the lemma “Rīga”: RĪGAS, Rīga, Rīgai, Rīgas, Rīgu, Rīgā.

PROPN occurs with 5 features: Gender (10808; 86% instances), Case (10790; 86% instances), Number (10790; 86% instances), Abbr (1190; 10% instances), Typo (14; 0% instances)

PROPN occurs with 13 feature-value pairs: Abbr=Yes, Case=Acc, Case=Dat, Case=Gen, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Number=Plur, Number=Ptan, Number=Sing, Typo=Yes

PROPN occurs with 42 feature combinations. The most frequent feature combination is Case=Gen|Gender=Fem|Number=Sing (2806 tokens). Examples: Latvijas, Eiropas, Rīgas, Krievijas, Baltijas, Jelgavas, Saeimas, Bauskas, Liepājas, Lietuvas
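The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column. The following is a minimal sketch under assumptions: the list of FEATS strings is toy data, the helper name `feature_stats` is hypothetical, and counting the empty value `_` as a combination of its own is a choice made here that may differ from the page's generator.

```python
from collections import Counter

# FEATS strings of a few PROPN tokens in CoNLL-U "Name=Value|Name=Value" form.
# Toy data, invented for illustration; "_" marks a token without features.
PROPN_FEATS = [
    "Case=Gen|Gender=Fem|Number=Sing",
    "Case=Gen|Gender=Fem|Number=Sing",
    "Case=Loc|Gender=Fem|Number=Sing",
    "Abbr=Yes",
    "_",
]


def feature_stats(feats_column):
    """Count feature names, feature=value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the full string identifies the combination
        if feats == "_":
            continue        # no individual features to split
        for pair in feats.split("|"):
            name, _value = pair.split("=", 1)
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


features, pairs, combos = feature_stats(PROPN_FEATS)
print(features["Case"], len(pairs), combos.most_common(1))
```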

Relations

PROPN nodes are attached to their parents using 21 different relations: nmod (4191; 34% instances), nsubj (2740; 22% instances), flat:name (1949; 16% instances), obl (1281; 10% instances), conj (1008; 8% instances), iobj (367; 3% instances), obj (274; 2% instances), parataxis (200; 2% instances), root (152; 1% instances), appos (98; 1% instances), nsubj:pass (82; 1% instances), discourse (36; 0% instances), vocative (32; 0% instances), dep (25; 0% instances), orphan (25; 0% instances), acl (22; 0% instances), xcomp (20; 0% instances), advcl (1; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), flat:foreign (1; 0% instances)
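The percentages in this list are each relation's share of all PROPN tokens, rounded to whole numbers. The sketch below recomputes them from the counts quoted above; the dictionary name is illustrative and the rounding rule is an assumption that happens to reproduce the reported values.

```python
# Counts copied from the relation list above.
PROPN_RELATIONS = {
    "nmod": 4191, "nsubj": 2740, "flat:name": 1949, "obl": 1281,
    "conj": 1008, "iobj": 367, "obj": 274, "parataxis": 200,
    "root": 152, "appos": 98, "nsubj:pass": 82, "discourse": 36,
    "vocative": 32, "dep": 25, "orphan": 25, "acl": 22, "xcomp": 20,
    "advcl": 1, "ccomp": 1, "csubj": 1, "flat:foreign": 1,
}

# Every PROPN token has exactly one incoming relation, so the counts
# should add up to the PROPN token total.
total = sum(PROPN_RELATIONS.values())
percent = {rel: round(100 * n / total) for rel, n in PROPN_RELATIONS.items()}

print(total)  # 12506
print(percent["nmod"], percent["nsubj"], percent["flat:name"])  # 34 22 16
```

The 21 counts sum to 12506, the number of PROPN tokens reported at the top of the page.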

Parents of PROPN nodes belong to 15 different categories (14 parts of speech plus the artificial root, which has no tag): NOUN (4550; 36% instances), VERB (4539; 36% instances), PROPN (2954; 24% instances), no parent tag, i.e. attached to the root (152; 1% instances), ADJ (107; 1% instances), ADV (59; 0% instances), X (53; 0% instances), PRON (37; 0% instances), NUM (35; 0% instances), AUX (7; 0% instances), INTJ (5; 0% instances), SYM (4; 0% instances), DET (2; 0% instances), CCONJ (1; 0% instances), PUNCT (1; 0% instances)

7750 (62%) PROPN nodes are leaves.

2248 (18%) PROPN nodes have one child.

1464 (12%) PROPN nodes have two children.

1044 (8%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 34.
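The four child-count figures above partition the PROPN tokens (7750 + 2248 + 1464 + 1044 = 12506). As a sketch of how such a distribution could be obtained from the HEAD column, the example below counts children per node in one sentence; the sentence is toy data and `child_degree_histogram` is a hypothetical helper name.

```python
from collections import Counter

# One toy sentence as (id, upos, head) triples; invented for illustration.
SENTENCE = [
    (1, "PROPN", 0),
    (2, "PUNCT", 1),
    (3, "PROPN", 1),
    (4, "CCONJ", 5),
    (5, "PROPN", 1),
]


def child_degree_histogram(sentence, upos):
    """Map number of children -> number of nodes with that tag and degree."""
    # Each token contributes one child to the node named in its HEAD column.
    n_children = Counter(head for _id, _upos, head in sentence)
    degrees = Counter()
    for tok_id, tok_upos, _head in sentence:
        if tok_upos == upos:
            degrees[n_children[tok_id]] += 1  # Counter yields 0 for leaves
    return degrees


histogram = child_degree_histogram(SENTENCE, "PROPN")
print(sorted(histogram.items()))
```

Here token 1 has three children, token 5 has one, and token 3 is a leaf, so each of the degrees 0, 1 and 3 is observed once.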

Children of PROPN nodes are attached using 25 different relations: punct (2190; 24% instances), flat:name (1984; 22% instances), nmod (1574; 17% instances), conj (1056; 12% instances), case (706; 8% instances), cc (519; 6% instances), acl (210; 2% instances), parataxis (196; 2% instances), discourse (179; 2% instances), amod (136; 1% instances), appos (66; 1% instances), det (57; 1% instances), orphan (47; 1% instances), dep (34; 0% instances), nsubj (34; 0% instances), advmod (33; 0% instances), cop (32; 0% instances), advcl (16; 0% instances), obl (16; 0% instances), mark (8; 0% instances), nummod (6; 0% instances), flat:foreign (4; 0% instances), aux (1; 0% instances), flat (1; 0% instances), iobj (1; 0% instances)

Children of PROPN nodes belong to 16 different parts of speech: PROPN (2954; 32% instances), PUNCT (2190; 24% instances), NOUN (1805; 20% instances), ADP (669; 7% instances), CCONJ (512; 6% instances), VERB (257; 3% instances), PART (151; 2% instances), ADJ (140; 2% instances), X (90; 1% instances), ADV (79; 1% instances), NUM (74; 1% instances), DET (52; 1% instances), SYM (39; 0% instances), SCONJ (35; 0% instances), AUX (33; 0% instances), PRON (26; 0% instances)