
This page pertains to UD version 2.

Treebank Statistics: UD_Lithuanian-HSE: POS Tags: PROPN

There are 151 PROPN lemmas (9%), 183 PROPN types (8%) and 323 PROPN tokens (6%). Out of 16 observed tags, the rank of PROPN is: 5 in number of lemmas, 4 in number of types and 5 in number of tokens.
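The three counts above can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that "types" means distinct case-sensitive word forms (column 2), "lemmas" means distinct values of column 3, and "tokens" means all words tagged with the given UPOS (column 4). The sample sentence and the function name `count_upos` are hypothetical and are not taken from UD_Lithuanian-HSE.

```python
# Hypothetical CoNLL-U fragment for illustration only (10 tab-separated columns).
SAMPLE = (
    "# text = hypothetical example\n"
    "1\tLietuvos\tLietuva\tPROPN\t_\t_\t0\troot\t_\t_\n"
    "2\tLietuva\tLietuva\tPROPN\t_\t_\t1\tconj\t_\t_\n"
    "3\tVilnius\tVilnius\tPROPN\t_\t_\t1\tconj\t_\t_\n"
    "4\tmiestas\tmiestas\tNOUN\t_\t_\t1\tconj\t_\t_\n"
)


def count_upos(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types = set(), set()
    tokens = 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


print(count_upos(SAMPLE, "PROPN"))
```

In the sample, the lemma "Lietuva" appears in two forms, so the number of types exceeds the number of lemmas, as it does in the treebank figures.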

The 10 most frequent PROPN lemmas: Lietuva, Sokratas, Rusija, Strepsiadas, Europa, Vilma, tu-154, BM, MARS, Vilnius

The 10 most frequent PROPN types: Lietuvos, Strepsiado, Sokratas, Sokrato, Europos, Strepsiadas, Rusijos, Tu-154, Aristofano, BM

The most frequent ambiguous lemmas (only 3 are listed): sąjunga (PROPN 4, NOUN 3), sovietas (PROPN 2, NOUN 1), holokaustas (NOUN 1, PROPN 1)

The 10 most frequent ambiguous types: (none listed)

Morphology

The form / lemma ratio of PROPN is 1.211921 (the average of all parts of speech is 1.442977).
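The PROPN ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas reported earlier on this page. The check below is only a sketch of that arithmetic; the variable names are illustrative, and the assumption that the ratio is computed exactly this way is inferred from the numbers, not stated by the page.

```python
# Figures reported on this page for PROPN.
propn_types = 183   # distinct word forms
propn_lemmas = 151  # distinct lemmas

# Assumed definition: forms per lemma.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")
```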

The 1st highest number of forms (5) was observed with the lemma “Lietuva”: Lietuva, Lietuvai, Lietuvoje, Lietuvos, Lietuvą.

The 2nd highest number of forms (4) was observed with the lemma “Rusija”: Rusija, Rusijai, Rusijos, Rusiją.

The 3rd highest number of forms (3) was observed with the lemma “Aristofanas”: Aristofanas, Aristofano, Aristofaną.

PROPN occurs with 3 features: Case (300; 93% instances), Gender (300; 93% instances), Number (300; 93% instances)

PROPN occurs with 11 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing

PROPN occurs with 21 feature combinations. The most frequent feature combination is Case=Gen|Gender=Masc|Number=Sing (71 tokens). Examples: Strepsiado, Sokrato, Aristofano, Stalino, Sąjūdžio, Tu-154, Beniušio, Kemalio, Schmidto, Vilniaus
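The feature statistics above come from the FEATS column of CoNLL-U, where a combination is written as `Name=Value` pairs joined by `|` and an empty set is written as `_`. The following is a minimal sketch of how features, feature-value pairs, and combinations could be tallied. The input list and the helper `parse_feats` are hypothetical, and excluding `_` from the combination count is an assumption.

```python
from collections import Counter

# Hypothetical FEATS values for four PROPN tokens (not treebank data).
feats_column = [
    "Case=Gen|Gender=Masc|Number=Sing",
    "Case=Gen|Gender=Masc|Number=Sing",
    "Case=Nom|Gender=Fem|Number=Sing",
    "_",  # token without morphological features
]


def parse_feats(feats):
    """Turn a FEATS string into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Assumption: the empty value '_' is not counted as a combination.
combos = Counter(f for f in feats_column if f != "_")
feature_counts = Counter()  # number of tokens carrying each feature
pairs = set()               # distinct feature-value pairs

for feats in feats_column:
    for name, value in parse_feats(feats).items():
        feature_counts[name] += 1
        pairs.add(f"{name}={value}")

print(combos.most_common(1))
print(dict(feature_counts))
print(sorted(pairs))
```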

Relations

PROPN nodes are attached to their parents using 15 different relations: nmod (103; 32% instances), nsubj (61; 19% instances), flat (46; 14% instances), conj (37; 11% instances), obl (31; 10% instances), appos (13; 4% instances), obj (10; 3% instances), parataxis (6; 2% instances), iobj (5; 2% instances), compound (3; 1% instances), list (2; 1% instances), root (2; 1% instances), vocative (2; 1% instances), amod (1; 0% instances), xcomp (1; 0% instances)
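As a sanity check, the relation counts above should add up to the 323 PROPN tokens, and each percentage should be the count divided by that total, rounded to a whole number. The sketch below verifies this with the figures from this page; the rounding rule is an assumption inferred from the reported values.

```python
# Relation counts for PROPN nodes as reported on this page.
relation_counts = {
    "nmod": 103, "nsubj": 61, "flat": 46, "conj": 37, "obl": 31,
    "appos": 13, "obj": 10, "parataxis": 6, "iobj": 5, "compound": 3,
    "list": 2, "root": 2, "vocative": 2, "amod": 1, "xcomp": 1,
}

total = sum(relation_counts.values())

# Assumed rule: share of all PROPN tokens, rounded to the nearest integer.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
print(percentages)
```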

Parents of PROPN nodes belong to 7 different parts of speech (counting the artificial root, which has no tag): NOUN (136; 42% instances), VERB (93; 29% instances), PROPN (80; 25% instances), ADJ (8; 2% instances), ADV (3; 1% instances), root with no POS tag (2; 1% instances), PART (1; 0% instances)

174 (54%) PROPN nodes are leaves.

65 (20%) PROPN nodes have one child.

55 (17%) PROPN nodes have two children.

29 (9%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 7.
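The child-degree figures above can be derived from the ID and HEAD columns of a CoNLL-U sentence. The following is a minimal sketch, assuming each token is given as an `(id, head, upos)` triple within a single sentence; the sample tokens and the function name `child_degree_buckets` are hypothetical.

```python
from collections import Counter

# Hypothetical single sentence: (ID, HEAD, UPOS); HEAD 0 marks the root.
tokens = [
    (1, 2, "PROPN"),
    (2, 0, "VERB"),
    (3, 2, "PROPN"),
    (4, 3, "PROPN"),
    (5, 3, "PUNCT"),
]


def child_degree_buckets(tokens, upos):
    """Count nodes of one UPOS with 0, 1, 2, or 3+ children, plus the max degree."""
    n_children = Counter(head for _, head, _ in tokens)
    buckets = Counter()
    max_degree = 0
    for tok_id, _, tag in tokens:
        if tag != upos:
            continue
        degree = n_children[tok_id]  # 0 for leaves
        max_degree = max(max_degree, degree)
        buckets[min(degree, 3)] += 1  # key 3 stands for "three or more"
    return dict(buckets), max_degree


print(child_degree_buckets(tokens, "PROPN"))
```

For a whole treebank, the same counting would be repeated per sentence, since token IDs restart at 1 in each sentence.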

Children of PROPN nodes are attached using 22 different relations: punct (69; 24% instances), flat (45; 16% instances), nmod (42; 15% instances), conj (27; 10% instances), case (25; 9% instances), cc (22; 8% instances), amod (11; 4% instances), advmod:emph (7; 2% instances), det (5; 2% instances), acl (4; 1% instances), appos (4; 1% instances), nsubj (4; 1% instances), acl:relcl (3; 1% instances), advmod (3; 1% instances), cop (2; 1% instances), list (2; 1% instances), parataxis (2; 1% instances), aux (1; 0% instances), mark (1; 0% instances), obj (1; 0% instances), obl (1; 0% instances), orphan (1; 0% instances)

Children of PROPN nodes belong to 13 different parts of speech: PROPN (80; 28% instances), PUNCT (69; 24% instances), NOUN (46; 16% instances), ADP (23; 8% instances), CCONJ (22; 8% instances), ADJ (11; 4% instances), PART (10; 4% instances), VERB (8; 3% instances), DET (5; 2% instances), AUX (3; 1% instances), SCONJ (3; 1% instances), ADV (1; 0% instances), PRON (1; 0% instances)