Treebank Statistics: UD_Sinhala-STB: Features: Gender
This feature is universal.
It occurs with 2 different values: Masc, Neut.
264 tokens (30%) have a non-empty value of Gender.
217 types (43%) occur at least once with a non-empty value of Gender.
187 lemmas (45%) occur at least once with a non-empty value of Gender.
The feature is used with 3 part-of-speech tags: NOUN (223; 25% of all tokens), PRON (21; 2% of all tokens), PROPN (20; 2% of all tokens).
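Counts like these can be reproduced from a treebank's CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` rows are invented toy data, and `parse_feats` and `feature_counts_by_upos` are hypothetical helper names. It reports, per part-of-speech tag, how many tokens carry the feature and what share of that tag's tokens this is (the per-tag percentage used in the sections below).

```python
from collections import Counter

# Hypothetical toy data in CoNLL-U format (tab-separated, 10 columns).
# These rows are invented for illustration and are NOT from UD_Sinhala-STB.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\tw1\tl1\tNOUN\t_\tGender=Neut|Number=Sing\t2\tnsubj\t_\t_",
    "2\tw2\tl2\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tw3\tl3\tPRON\t_\tGender=Masc|Number=Sing\t2\tobj\t_\t_",
    "4\tw4\tl4\tNOUN\t_\tNumber=Plur\t2\tobl\t_\t_",
])


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_counts_by_upos(conllu_text, feature="Gender"):
    """Return (tokens carrying the feature per UPOS, all tokens per UPOS)."""
    with_feat, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        upos = cols[3]
        total[upos] += 1
        if feature in parse_feats(cols[5]):
            with_feat[upos] += 1
    return with_feat, total


with_feat, total = feature_counts_by_upos(SAMPLE)
for upos in sorted(with_feat):
    share = 100 * with_feat[upos] / total[upos]
    print(f"{upos}: {with_feat[upos]} tokens ({share:.0f}% of all {upos} tokens)")
```

To apply the sketch to a real treebank, read the `.conllu` file as UTF-8 text and pass its contents in place of `SAMPLE`.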
NOUN
223 NOUN tokens (72% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Animacy=EMPTY (182; 82%), Number=Sing (141; 63%).
NOUN tokens may have the following values of Gender:
Masc (36; 16% of non-empty Gender): මහතා, ජනතාව, ප්රධානයකු, අධිපතිවරයාට, අස්සන්, ආරක්ෂක, කෙනාම, ත්රස්තවාදීන්, තැනැත්තන්, දෙන්නකු
Neut (187; 84% of non-empty Gender): අයවැය, ආණ්ඩුව, ආර්ථික, තත්ත්වය, දේශපාලන, යුද, අවස්ථාව, ආර්ථිකය, උද්ධමනය, ක්රමය
EMPTY (85): කිරීම, ජනතාවට, සිදු, අද, අහෝසි, ආර්ථික, කොටි, බොහෝ, හැඟීමක්, අංශ
Gender seems to be a lexical feature of NOUN. 100% of lemmas (173) occur with only one value of Gender.
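The "lexical feature" test can be sketched as follows: collect the set of Gender values seen with each lemma of a given tag, then measure how many lemmas have exactly one value. This is an illustration under assumptions, not the page's actual generator; the `OBSERVATIONS` triples are invented and `single_value_share` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (lemma, UPOS, Gender) observations, invented for illustration.
OBSERVATIONS = [
    ("l1", "NOUN", "Neut"),
    ("l1", "NOUN", "Neut"),
    ("l2", "NOUN", "Masc"),
    ("l3", "NOUN", "Masc"),
    ("l3", "NOUN", "Neut"),  # l3 varies, so it counts against lexicality
    ("l4", "PRON", "Masc"),
]


def single_value_share(observations, upos):
    """Return (lemmas with one value, lemmas with any value, percentage)."""
    values = defaultdict(set)
    for lemma, tag, gender in observations:
        if tag == upos and gender:
            values[lemma].add(gender)
    if not values:
        return 0, 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values), 100 * single / len(values)


single, lemmas, share = single_value_share(OBSERVATIONS, "NOUN")
print(f"{share:.0f}% of lemmas ({single} of {lemmas}) occur with only one value")
```

A share of 100% means no lemma was observed with two different values, which is what suggests the feature is a property of the lexeme and not of the inflected form.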
PRON
21 PRON tokens (48% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Poss=EMPTY (21; 100%), Animacy=EMPTY (19; 90%), Person=EMPTY (18; 86%), Number=Sing (16; 76%), Case=Nom (13; 62%), PronType=Dem (11; 52%).
PRON tokens may have the following values of Gender:
Masc (10; 48% of non-empty Gender): ඔහු, ඔහුට
Neut (11; 52% of non-empty Gender): ඒ, ඊට, එය, ඉන්, මෙය
EMPTY (23): එය, එහි, ඒ, සිය, අප, අපට, අපේ, එකිනෙකා, එම, ඔව්හු
PROPN
20 PROPN tokens (53% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Animacy=EMPTY (20; 100%), Number=Sing (20; 100%), Foreign=EMPTY (19; 95%), Case=Nom (13; 65%), Person=EMPTY (12; 60%).
PROPN tokens may have the following values of Gender:
Masc (8; 40% of non-empty Gender): මහින්ද, රනිල්, වික්රමසිංහ, ෆොන්සේකා
Neut (12; 60% of non-empty Gender): ලංකාව, ඉන්දියාව, ඉරානය, චීනය, ටැන්සානියාව, පලස්තීනය, පාකිස්ථානය, ලංකාවක්, ලංකාවට, සිංගප්පූරුව
EMPTY (18): ශ්රී, රාජපක්ෂ, අමෙරිකාවේ, කොසෝවෝ, ජුලියස්, නියරේරේ, මාඕ, යුනෙස්කෝ, ලිප්ටන්, ෂැවොලින්
Gender seems to be a lexical feature of PROPN. 100% of lemmas (12) occur with only one value of Gender.
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[compound]–> NOUN (9; 53%),
PROPN –[flat]–> NOUN (6; 86%),
NOUN –[dep]–> NOUN (5; 83%),
NOUN –[nmod:poss]–> NOUN (4; 80%),
NOUN –[nsubj]–> PROPN (3; 60%),
NOUN –[case]–> NOUN (1; 100%),
NOUN –[conj]–> NOUN (1; 100%),
PROPN –[conj]–> PROPN (1; 100%).
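One way to compute agreement figures of this kind is sketched below. It assumes, as an interpretation that the page itself does not confirm, that the percentage is the share of agreeing pairs among all parent-child pairs of that type in which both nodes have a non-empty Gender. The `TOKENS` tuples are invented toy data and `agreement_by_relation` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical toy sentence, invented for illustration:
# (token id, UPOS, Gender or None, head id, dependency relation).
TOKENS = [
    (1, "NOUN", "Neut", 2, "compound"),
    (2, "NOUN", "Neut", 4, "nsubj"),
    (3, "NOUN", "Masc", 2, "compound"),
    (4, "VERB", None, 0, "root"),
]


def agreement_by_relation(tokens):
    """Count parent-child pairs per (parent UPOS, deprel, child UPOS).

    Returns (agree, both): pairs with equal Gender, and pairs in which
    both nodes have any Gender value at all.
    """
    by_id = {tok[0]: tok for tok in tokens}
    agree, both = Counter(), Counter()
    for _tok_id, upos, gender, head, deprel in tokens:
        parent = by_id.get(head)  # None for the root (head id 0)
        if parent is None or gender is None or parent[2] is None:
            continue
        key = (parent[1], deprel, upos)
        both[key] += 1
        if gender == parent[2]:
            agree[key] += 1
    return agree, both


agree, both = agreement_by_relation(TOKENS)
for (parent_upos, deprel, child_upos), n in agree.most_common(10):
    pct = 100 * n / both[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {pct:.0f}%)")
```

In a real treebank the lookup table `by_id` must be rebuilt for every sentence, since token ids restart at 1 in each one.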