Treebank Statistics: UD_Albanian-STAF: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
1255 tokens (35%) have a non-empty value of Gender.
683 types (56%) occur at least once with a non-empty value of Gender.
550 lemmas (56%) occur at least once with a non-empty value of Gender.
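
The three counts above can be approximated from a CoNLL-U file with a few lines of Python. This is a minimal sketch, not the script that generated this page: the function names `parse_feats` and `gender_coverage` are hypothetical, the rows in `ROWS` are a toy sample rather than treebank data, and it assumes that types are word forms compared exactly as written.

```python
from collections import defaultdict

# Toy rows in CoNLL-U column order (illustrative only, not taken from the treebank).
ROWS = [
    ("1", "shtëpia", "shtëpi", "NOUN", "_", "Definite=Def|Gender=Fem|Number=Sing", "0", "root", "_", "_"),
    ("2", "e", "e", "DET", "_", "Gender=Fem|Number=Sing|PronType=Art", "3", "det:adj", "_", "_"),
    ("3", "bardhë", "bardhë", "ADJ", "_", "Degree=Pos|Gender=Fem|Number=Sing", "1", "amod", "_", "_"),
    ("4", "me", "me", "ADP", "_", "_", "5", "case", "_", "_"),
    ("5", "shi", "shi", "NOUN", "_", "Gender=Masc|Number=Sing", "1", "nmod", "_", "_"),
    ("6", "e", "e", "DET", "_", "Gender=Fem|Number=Sing|PronType=Art", "3", "det:adj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_coverage(conllu_text):
    """Return (with Gender, total) counts for tokens, types and lemmas."""
    token_total = token_hits = 0
    type_has = defaultdict(bool)   # word form -> seen with Gender at least once
    lemma_has = defaultdict(bool)  # lemma -> seen with Gender at least once
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        has_gender = "Gender" in parse_feats(cols[5])
        token_total += 1
        token_hits += has_gender
        type_has[cols[1]] |= has_gender
        lemma_has[cols[2]] |= has_gender
    return {
        "tokens": (token_hits, token_total),
        "types": (sum(type_has.values()), len(type_has)),
        "lemmas": (sum(lemma_has.values()), len(lemma_has)),
    }


for unit, (hits, total) in gender_coverage(SAMPLE).items():
    print(f"{hits} {unit} ({100 * hits / total:.0f}%) have a non-empty value of Gender")
```

The per-type and per-lemma flags use "at least once" semantics, matching the wording of the statements above: a form counts as soon as any of its occurrences carries Gender.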
The feature is used with 5 part-of-speech tags: NOUN (595; 17% instances), DET (235; 7% instances), PRON (232; 7% instances), ADJ (162; 5% instances), PROPN (31; 1% instances).
NOUN
595 NOUN tokens (95% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (488; 82%), Definite=Def (349; 59%).
NOUN tokens may have the following values of Gender:
Fem (354; 59% of non-empty Gender): ditë, sytë, Nëna, gjendjes, shtëpia, dorën, gjë, grua, herë, kohën
Masc (241; 41% of non-empty Gender): gjenerali, shi, fillim, prifti, babai, fund, krahasim, njeri, njerëzit, njerëzve
EMPTY (30): gjak, Mysafiri, babait, brejtja, djalë, dëshirës, errur, fillin, fundin, gjendje
Paradigm sytë | Masc | Fem |
---|---|---|
Case=Acc\|Number=Sing | sytë | |
Case=Acc\|Number=Plur | sytë | |
Case=Nom\|Number=Plur | sytë | |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (378) occur with only one value of Gender.
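
The "lexical feature" observation rests on counting how many lemmas are seen with exactly one Gender value. A minimal sketch of that count follows; the helper `single_value_share` is a hypothetical name, the pairs in `toy` are made up for illustration, and the exact criterion used to produce the figure above is an assumption.

```python
from collections import defaultdict


def single_value_share(observations):
    """Count lemmas seen with exactly one Gender value.

    observations: iterable of (lemma, gender) pairs, one per token
    that has a non-empty Gender.
    Returns (lemmas with a single value, all lemmas seen).
    """
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Made-up observations, not treebank data.
toy = [
    ("ditë", "Fem"),
    ("ditë", "Fem"),
    ("shi", "Masc"),
    ("sy", "Masc"),
    ("sy", "Fem"),
]
single, total = single_value_share(toy)
print(f"{100 * single / total:.0f}% of lemmas ({single}) occur with only one value of Gender")
```

A share close to 100% suggests that gender is a property of the lemma rather than of the inflected form; lemmas seen with both values are worth inspecting as possible annotation inconsistencies.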
DET
235 DET tokens (78% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Definite=EMPTY (235; 100%), Number=Sing (185; 79%), PronType=Art (157; 67%).
DET tokens may have the following values of Gender:
Fem (147; 63% of non-empty Gender): e, të, së, i
Masc (88; 37% of non-empty Gender): të, i, e, së
EMPTY (65): një, e, të, nja, i, pak
Paradigm e | Masc | Fem |
---|---|---|
Case=Acc\|Number=Sing\|PronType=Art | e | e |
Case=Acc\|Number=Plur\|PronType=Art | e | e |
Case=Gen\|Number=Plur\|PronType=Art | së | |
Case=Nom\|Number=Sing | e | |
Case=Nom\|Number=Sing\|PronType=Art | e | |
Case=Nom\|Number=Plur\|PronType=Art | e | e |
Number=Sing | e | e |
Number=Plur | e | e |
Number=Plur\|PronType=Art | e | |
PRON
232 PRON tokens (54% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (190; 82%), PronType=Prs (152; 66%).
PRON tokens may have the following values of Gender:
Fem (77; 33% of non-empty Gender): e, kjo, i, ajo, ime, saj, kësaj, këto, sime, asaj
Masc (155; 67% of non-empty Gender): i, e, ai, ky, tij, atë, cilët, im, ata, këtë
EMPTY (198): më, e, që, unë, i, na, ç’, asgjë, diçka, mua
Paradigm ai | Masc | Fem |
---|---|---|
Case=Acc\|Number=Sing\|Person=3\|PronType=Prs | e, atë, i, të | e |
Case=Acc\|Number=Sing\|PronType=Dem | atë | |
Case=Acc\|Number=Sing\|PronType=Prs | i | |
Case=Acc\|Number=Plur\|Person=3\|PronType=Prs | i | i |
Case=Dat\|Number=Sing\|Person=3\|PronType=Prs | atij, i | i |
Case=Nom\|Number=Sing\|Person=3\|PronType=Dem | ai | |
Case=Nom\|Number=Sing\|Person=3\|PronType=Prs | ai | |
Case=Nom\|Number=Sing\|PronType=Dem | atë | |
ADJ
162 ADJ tokens (91% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (152; 94%), Number=Sing (130; 80%).
ADJ tokens may have the following values of Gender:
Fem (103; 64% of non-empty Gender): bardhë, bukur, fundit, djathtë, parë, re, huaj, jashtëzakonshme, lodhun, majtë
Masc (59; 36% of non-empty Gender): sigurt, bardhë, bukur, huaj, parë, ri, çmendur, Madh, ardhshëm, arsyeshëm
EMPTY (16): dytë, fundit, hijerëndë, imperiale, keqja, kureshtar, mira, relative, rrallë, saktë
Paradigm bardhë | Masc | Fem |
---|---|---|
Case=Acc\|Number=Sing | bardhë | |
Case=Gen\|Number=Sing | bardhë | bardhë |
Case=Nom\|Number=Sing | bardhë | bardhë |
Case=Nom\|Number=Plur | bardha | |
PROPN
31 PROPN tokens (79% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (30; 97%), Definite=Def (24; 77%).
PROPN tokens may have the following values of Gender:
Fem (11; 35% of non-empty Gender): Shqipëri, Ervehenë, Linda, Marga, Margën, Margës, Shqipërisë, Vedat, shtunë
Masc (20; 65% of non-empty Gender): Ernesti, Ernestit, Vedati, Berti, Dizit, Ernest, Hadi, Hadin, Linda, Lorin
EMPTY (8): Bamit, Dizi, Dizin, Ernesti, Lindën, Nerminja, Odise, Varrit
Paradigm Vedat | Masc | Fem |
---|---|---|
Case=Gen\|Number=Sing | Vedatit | |
Case=Nom\|Number=Sing | Vedati | |
Case=Nom\|Number=Plur | Vedat | |
Gender seems to be a lexical feature of PROPN: 94% of lemmas (16) occur with only one value of Gender.
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
ADJ –[det:adj]–> DET (109; 91%),
NOUN –[amod]–> ADJ (107; 95%),
NOUN –[det]–> PRON (26; 51%),
NOUN –[det:poss]–> PRON (24; 71%),
NOUN –[nmod:poss]–> NOUN (24; 51%),
NOUN –[conj]–> NOUN (16; 55%),
ADJ –[det]–> DET (14; 88%),
PRON –[det]–> DET (11; 52%),
ADJ –[obl]–> NOUN (7; 64%),
PRON –[det:pron]–> DET (7; 78%).
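
Agreement figures of this kind can be sketched as a pass over dependency edges. The code below is an illustration under stated assumptions, not the generator of this page: `gender_agreement` is a hypothetical helper, the sentence is a toy example, and the percentage is assumed to be taken over edges where both parent and child have a non-empty Gender.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count Gender agreement per (parent UPOS, relation, child UPOS).

    sentence: list of dicts with keys id, head, upos, deprel, feats,
    where feats is a dict of feature names to values.
    Returns two Counters: edges that agree, and edges where both ends
    carry Gender (assumed here to be the denominator).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root has no parent node
        g_child = child["feats"].get("Gender")
        g_parent = parent["feats"].get("Gender")
        if g_child is None or g_parent is None:
            continue  # only edges with Gender on both ends are compared
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if g_child == g_parent:
            agree[key] += 1
    return agree, both


# Toy sentence (illustrative only, not taken from the treebank).
toy_sentence = [
    {"id": 1, "head": 0, "upos": "NOUN", "deprel": "root", "feats": {"Gender": "Fem"}},
    {"id": 2, "head": 3, "upos": "DET", "deprel": "det:adj", "feats": {"Gender": "Fem"}},
    {"id": 3, "head": 1, "upos": "ADJ", "deprel": "amod", "feats": {"Gender": "Fem"}},
    {"id": 4, "head": 1, "upos": "NOUN", "deprel": "nmod:poss", "feats": {"Gender": "Masc"}},
]
agree, both = gender_agreement(toy_sentence)
for (parent, rel, child), n in agree.most_common(10):
    share = 100 * n / both[(parent, rel, child)]
    print(f"{parent} -[{rel}]-> {child} ({n}; {share:.0f}%)")
```

Relations such as nmod:poss link two nouns that each carry their own lexical gender, so a share near 50% there is expected and does not by itself indicate an annotation problem.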