This page pertains to UD version 2.

Treebank Statistics: UD_Sinhala-STB: POS Tags: PROPN

There are 25 PROPN lemmas (5%), 27 PROPN types (5%) and 38 PROPN tokens (4%). Out of 13 observed tags, the rank of PROPN is: 5 in number of lemmas, 4 in number of types and 8 in number of tokens.
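
The three counts above can be reproduced from a treebank file in CoNLL-U format. The following is a minimal sketch, not the script that generated this page. The function name `tag_counts` and the toy sample are hypothetical: the sample is not taken from UD_Sinhala-STB, and `XXX` and `YYY` are placeholder verbs. It assumes that types are distinct word forms compared case-sensitively.

```python
# Hypothetical toy sample in CoNLL-U column order:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
# It is NOT taken from UD_Sinhala-STB; XXX and YYY are placeholder verbs.
SAMPLE = (
    "# sent_id = toy-1\n"
    "1\tමහින්ද\tමහින්ද\tPROPN\t_\t_\t3\tnsubj\t_\t_\n"
    "2\tලංකාවට\tලංකා\tPROPN\t_\t_\t3\tobl\t_\t_\n"
    "3\tXXX\tXXX\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# sent_id = toy-2\n"
    "1\tලංකාව\tලංකා\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tYYY\tYYY\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tලංකාව\tලංකා\tPROPN\t_\t_\t2\tobj\t_\t_\n"
)


def tag_counts(conllu_text, tag):
    """Return (lemmas, types, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        if upos == tag:
            tokens += 1
            types.add(form)
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


n_lemmas, n_types, n_tokens = tag_counts(SAMPLE, "PROPN")
print(n_lemmas, n_types, n_tokens)
```

The percentages on this page would then follow by dividing each count by the corresponding total over all tags.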

The 10 most frequent PROPN lemmas: ලංකා, ශ්‍රී, මහින්ද, රනිල්, රාජපක්ෂ, වික්‍රමසිංහ, ෆොන්සේකා, අමෙරිකා, ඉන්දියා, ඉරාන

The 10 most frequent PROPN types: ශ්‍රී, ලංකාව, මහින්ද, රනිල්, රාජපක්ෂ, වික්‍රමසිංහ, ෆොන්සේකා, අමෙරිකාවේ, ඉන්දියාව, ඉරානය

The 10 most frequent ambiguous lemmas: ලංකා (PROPN 5, NOUN 1), ශ්‍රී (PROPN 5, ADJ 1), ඉන්දියා (NOUN 1, PROPN 1), ඉරාන (NOUN 1, PROPN 1)

The 10 most frequent ambiguous types: ශ්‍රී (PROPN 5, ADJ 1), ලංකාවට (NOUN 1, PROPN 1)

Morphology

The form / lemma ratio of PROPN is 1.080000 (the average of all parts of speech is 1.145336).

The 1st highest number of forms (3) was observed with the lemma “ලංකා”: ලංකාව, ලංකාවක්, ලංකාවට.

The 2nd highest number of forms (1) was observed with the lemma “අමෙරිකා”: අමෙරිකාවේ.

The 3rd highest number of forms (1) was observed with the lemma “ඉන්දියා”: ඉන්දියාව.
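
The ratio and the ranking above can be illustrated with a short sketch. It assumes that the form / lemma ratio is the number of distinct forms divided by the number of distinct lemmas, which agrees with the figures on this page (27 types / 25 lemmas = 1.08). The helper names `forms_by_lemma` and `form_lemma_ratio` are hypothetical, and the pairs cover only the three lemmas quoted above, not all 25 PROPN lemmas.

```python
from collections import defaultdict

# (form, lemma) pairs quoted from the three lines above; only these lemmas are covered.
PAIRS = [
    ("ලංකාව", "ලංකා"),
    ("ලංකාවක්", "ලංකා"),
    ("ලංකාවට", "ලංකා"),
    ("අමෙරිකාවේ", "අමෙරිකා"),
    ("ඉන්දියාව", "ඉන්දියා"),
]


def forms_by_lemma(pairs):
    """Group the distinct forms observed for each lemma."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return forms


def form_lemma_ratio(n_forms, n_lemmas):
    """Assumed definition: distinct forms divided by distinct lemmas."""
    return n_forms / n_lemmas


forms = forms_by_lemma(PAIRS)
# Rank lemmas by number of forms, breaking ties by the lemma string.
ranked = sorted(forms.items(), key=lambda kv: (-len(kv[1]), kv[0]))
for lemma, lemma_forms in ranked:
    print(lemma, len(lemma_forms), sorted(lemma_forms))

# 27 PROPN types and 25 PROPN lemmas are the figures reported on this page.
print(f"{form_lemma_ratio(27, 25):.6f}")
```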

PROPN occurs with 7 features: Case (31; 82% instances), Number (28; 74% instances), Gender (20; 53% instances), Person (14; 37% instances), Definite (12; 32% instances), Animacy (8; 21% instances), Foreign (4; 11% instances)

PROPN occurs with 13 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Foreign=Yes, Gender=Masc, Gender=Neut, Number=Sing, Person=3

PROPN occurs with 14 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing|Person=3 (8 tokens). Examples: මහින්ද, රනිල්, වික්‍රමසිංහ, ෆොන්සේකා
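
The three feature statistics above (features, feature-value pairs, feature combinations) all derive from the FEATS column of CoNLL-U, where pairs are joined by `|` and `_` means no features. The sketch below shows one way to tally them. The function name `feature_stats` and the list of FEATS strings are hypothetical: the values are ones listed on this page, but the counts are invented for illustration.

```python
from collections import Counter

# Hypothetical FEATS values for five PROPN tokens; counts are illustrative only.
FEATS = [
    "Case=Nom|Gender=Masc|Number=Sing|Person=3",
    "Case=Nom|Gender=Masc|Number=Sing|Person=3",
    "Case=Dat|Definite=Def|Number=Sing",
    "Foreign=Yes",
    "_",
]


def feature_stats(feats_column):
    """Count feature names, feature-value pairs and whole combinations."""
    names, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":
            continue  # token without any features
        for pair in feats.split("|"):
            name, _sep, _value = pair.partition("=")
            names[name] += 1
            pairs[pair] += 1
    return names, pairs, combos


names, pairs, combos = feature_stats(FEATS)
total = len(FEATS)
for name, count in names.most_common():
    print(f"{name} ({count}; {round(100 * count / total)}% instances)")
print(len(pairs), "feature-value pairs")
print(combos.most_common(1))
```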

Relations

PROPN nodes are attached to their parents using 10 different relations: nsubj (14; 37% instances), flat (9; 24% instances), nmod (4; 11% instances), compound (2; 5% instances), dep (2; 5% instances), obl (2; 5% instances), root (2; 5% instances), conj (1; 3% instances), nmod:poss (1; 3% instances), obl:lmod (1; 3% instances)

Parents of PROPN nodes belong to 5 different parts of speech, counting the empty tag of the artificial root: PROPN (12; 32% instances), VERB (12; 32% instances), NOUN (11; 29% instances), [root, no tag] (2; 5% instances), PART (1; 3% instances)
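
Both distributions above come from the HEAD and DEPREL columns. In CoNLL-U a HEAD of 0 marks the sentence root, which has no part of speech of its own, so a tally of parent tags naturally contains one empty category. The sketch below illustrates this. The function name `parent_stats` and the toy sentence are hypothetical and not taken from the treebank.

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD, DEPREL) tuples; HEAD 0 marks the root.
SENTENCE = [
    (1, "PROPN", 0, "root"),
    (2, "PROPN", 1, "flat"),
    (3, "NOUN", 1, "nmod"),
    (4, "PROPN", 3, "nmod"),
]


def parent_stats(sentence, tag):
    """Count incoming relations and parent UPOS tags for nodes with the given tag."""
    upos_of = {tid: upos for tid, upos, _head, _rel in sentence}
    relations, parent_pos = Counter(), Counter()
    for _tid, upos, head, deprel in sentence:
        if upos != tag:
            continue
        relations[deprel] += 1
        # HEAD 0 is not a real token, so the parent tag is recorded as empty.
        parent_pos[upos_of.get(head, "")] += 1
    return relations, parent_pos


relations, parent_pos = parent_stats(SENTENCE, "PROPN")
print(dict(relations))
print(dict(parent_pos))
```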

16 (42%) PROPN nodes are leaves.

14 (37%) PROPN nodes have one child.

6 (16%) PROPN nodes have two children.

2 (5%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 5.
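
The child-degree figures above count, for each PROPN node, how many tokens name it as their HEAD. A minimal sketch follows, using a hypothetical toy sentence and a hypothetical function name `child_degrees`. Degrees of three or more are pooled into one bucket to mirror the breakdown on this page.

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD) tuples; HEAD 0 marks the root.
SENTENCE = [
    (1, "PROPN", 4),
    (2, "PROPN", 1),
    (3, "ADP", 1),
    (4, "VERB", 0),
    (5, "PROPN", 4),
]


def child_degrees(sentence, tag):
    """Map each node with the given tag to its number of children."""
    n_children = Counter(head for _tid, _upos, head in sentence)
    return {tid: n_children[tid] for tid, upos, _head in sentence if upos == tag}


degrees = child_degrees(SENTENCE, "PROPN")
# Bucket 0 = leaves, 1 = one child, 2 = two children, 3 = three or more.
buckets = Counter(min(d, 3) for d in degrees.values())
print(degrees)
print(dict(buckets))
print("highest child degree:", max(degrees.values()))
```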

Children of PROPN nodes are attached using 9 different relations: flat (17; 49% instances), case (4; 11% instances), compound (4; 11% instances), dep (3; 9% instances), csubj (2; 6% instances), punct (2; 6% instances), amod (1; 3% instances), cc (1; 3% instances), conj (1; 3% instances)

Children of PROPN nodes belong to 8 different parts of speech: PROPN (12; 34% instances), NOUN (10; 29% instances), PART (6; 17% instances), ADP (2; 6% instances), PUNCT (2; 6% instances), ADJ (1; 3% instances), CCONJ (1; 3% instances), VERB (1; 3% instances)