Treebank Statistics: UD_Albanian-STAF: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
1255 tokens (35%) have a non-empty value of Gender.
683 types (56%) occur at least once with a non-empty value of Gender.
550 lemmas (56%) occur at least once with a non-empty value of Gender.
The feature is used with 5 part-of-speech tags: NOUN (595; 17% of all tokens), DET (234; 7% of all tokens), PRON (233; 7% of all tokens), ADJ (162; 5% of all tokens), PROPN (31; 1% of all tokens).
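The counts above can be reproduced from a CoNLL-U file with a few lines of code. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` fragment and its annotations are hypothetical toy data, and `parse_feats` and `feature_coverage` are illustrative helper names. It assumes the percentages are taken relative to all tokens in the treebank.

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical annotations, not taken from the treebank).
SAMPLE = """\
1\tshtëpia\tshtëpi\tNOUN\t_\tCase=Nom|Definite=Def|Gender=Fem|Number=Sing\t0\troot\t_\t_
2\te\ti\tDET\t_\tGender=Fem|Number=Sing|PronType=Art\t3\tdet:adj\t_\t_
3\tbardhë\tbardhë\tADJ\t_\tDegree=Pos|Gender=Fem|Number=Sing\t1\tamod\t_\t_
4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_coverage(conllu_text, feature):
    """Return (token count, Counter of UPOS tags for tokens carrying `feature`)."""
    total = 0
    with_feature = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        total += 1
        if feature in parse_feats(cols[5]):
            with_feature[cols[3]] += 1
    return total, with_feature


total, by_upos = feature_coverage(SAMPLE, "Gender")
covered = sum(by_upos.values())
print(f"{covered} tokens ({round(100 * covered / total)}%) have a non-empty value of Gender.")
for upos, n in by_upos.most_common():
    print(f"{upos} ({n}; {round(100 * n / total)}% of all tokens)")
```

Applied to the full treebank file instead of `SAMPLE`, the same loop should yield figures of the kind listed above, provided the assumption about the denominator holds.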
NOUN
595 NOUN tokens (95% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (488; 82%), Definite=Def (349; 59%).
NOUN tokens may have the following values of Gender:
Fem (354; 59% of non-empty Gender): ditë, sytë, Nëna, gjendjes, shtëpia, dorën, gjë, grua, herë, kohën
Masc (241; 41% of non-empty Gender): gjenerali, shi, fillim, prifti, babai, fund, krahasim, njeri, njerëzit, njerëzve
EMPTY (30): gjak, Mysafiri, babait, brejtja, djalë, dëshirës, errur, fillin, fundin, gjendje
| Paradigm sytë | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | sytë | |
| Case=Acc\|Number=Plur | sytë | |
| Case=Nom\|Number=Plur | sytë | |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (378) occur with only one value of Gender.
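The "lexical feature" judgment rests on a simple count: group tokens by lemma and check how many lemmas are seen with exactly one value of Gender. A minimal sketch of that check follows. The `observations` list uses placeholder lemmas and is hypothetical toy data, and `single_value_share` is an illustrative name rather than part of any UD tool.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender) pairs for tokens of one part of speech; toy data only.
observations = [
    ("lemma_a", "Fem"), ("lemma_a", "Fem"),
    ("lemma_b", "Fem"),
    ("lemma_c", "Masc"), ("lemma_c", "Masc"),
    ("lemma_d", "Masc"), ("lemma_d", "Fem"),  # a lemma annotated with both values
]


def single_value_share(pairs):
    """Return (lemmas seen with exactly one value, lemmas seen with any value)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, total_lemmas = single_value_share(observations)
share = round(100 * single / total_lemmas)
print(f"{share}% of lemmas ({single}) occur with only one value of Gender.")
```

A share close to 100% suggests the value is fixed per lemma (lexical), while a markedly lower share would point toward an inflectional feature.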
DET
234 DET tokens (78% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Definite=EMPTY (234; 100%), Number=Sing (184; 79%), PronType=Art (157; 67%).
DET tokens may have the following values of Gender:
Fem (147; 63% of non-empty Gender): e, të, së, i
Masc (87; 37% of non-empty Gender): të, i, e, së
EMPTY (65): një, e, të, nja, i, pak
| Paradigm e | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing\|PronType=Art | e | e |
| Case=Acc\|Number=Plur\|PronType=Art | e | e |
| Case=Gen\|Number=Plur\|PronType=Art | së | |
| Case=Nom\|Number=Sing | e | |
| Case=Nom\|Number=Sing\|PronType=Art | e | |
| Case=Nom\|Number=Plur\|PronType=Art | e | e |
| Number=Sing | e | e |
| Number=Sing\|PronType=Art | e | |
| Number=Plur | e | e |
| Number=Plur\|PronType=Art | e | |
PRON
233 PRON tokens (54% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (191; 82%), PronType=Prs (152; 65%).
PRON tokens may have the following values of Gender:
Fem (77; 33% of non-empty Gender): e, kjo, i, ajo, ime, saj, kësaj, këto, sime, asaj
Masc (156; 67% of non-empty Gender): i, e, ai, ky, tij, atë, cilët, im, ata, këtë
EMPTY (198): më, e, që, unë, i, na, ç’, asgjë, diçka, mua
| Paradigm ai | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing\|Person=3 | e | |
| Case=Acc\|Number=Sing\|Person=3\|PronType=Prs | e, atë, i, të | e |
| Case=Acc\|Number=Sing\|PronType=Dem | atë | |
| Case=Acc\|Number=Sing\|PronType=Prs | i | |
| Case=Acc\|Number=Plur\|Person=3\|PronType=Prs | i | i |
| Case=Dat\|Number=Sing\|Person=3\|PronType=Prs | atij, i | i |
| Case=Nom\|Number=Sing\|Person=3\|PronType=Dem | ai | |
| Case=Nom\|Number=Sing\|Person=3\|PronType=Prs | ai | |
| Case=Nom\|Number=Sing\|PronType=Dem | atë | |
ADJ
162 ADJ tokens (91% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (152; 94%), Number=Sing (130; 80%).
ADJ tokens may have the following values of Gender:
Fem (103; 64% of non-empty Gender): bardhë, bukur, fundit, djathtë, parë, re, huaj, jashtëzakonshme, lodhun, majtë
Masc (59; 36% of non-empty Gender): sigurt, bardhë, bukur, huaj, parë, ri, çmendur, Madh, ardhshëm, arsyeshëm
EMPTY (16): dytë, fundit, hijerëndë, imperiale, keqja, kureshtar, mira, relative, rrallë, saktë
| Paradigm bardhë | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | bardhë | |
| Case=Gen\|Number=Sing | bardhë | bardhë |
| Case=Nom\|Number=Sing | bardhë | bardhë |
| Case=Nom\|Number=Plur | bardha | |
PROPN
31 PROPN tokens (79% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (30; 97%), Definite=Def (24; 77%).
PROPN tokens may have the following values of Gender:
Fem (11; 35% of non-empty Gender): Shqipëri, Ervehenë, Linda, Marga, Margën, Margës, Shqipërisë, Vedat, shtunë
Masc (20; 65% of non-empty Gender): Ernesti, Ernestit, Vedati, Berti, Dizit, Ernest, Hadi, Hadin, Linda, Lorin
EMPTY (8): Bamit, Dizi, Dizin, Ernesti, Lindën, Nerminja, Odise, Varrit
| Paradigm Vedat | Masc | Fem |
|---|---|---|
| Case=Gen\|Number=Sing | Vedatit | |
| Case=Nom\|Number=Sing | Vedati | |
| Case=Nom\|Number=Plur | Vedat | |
Gender seems to be a lexical feature of PROPN. 94% of lemmas (16) occur with only one value of Gender.
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
ADJ –[det:adj]–> DET (109; 91%),
NOUN –[amod]–> ADJ (107; 95%),
NOUN –[det]–> PRON (26; 51%),
NOUN –[det:poss]–> PRON (24; 71%),
NOUN –[nmod:poss]–> NOUN (24; 51%),
NOUN –[conj]–> NOUN (16; 53%),
ADJ –[det]–> DET (14; 88%),
PRON –[det]–> DET (11; 52%),
ADJ –[obl]–> NOUN (7; 64%),
PRON –[det:pron]–> DET (7; 78%).
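Agreement counts of this kind can be computed by walking each dependency edge and comparing the Gender values of parent and child. The sketch below is illustrative only: the `tokens` list is hypothetical toy data, `agreement_counts` is an assumed helper name, and it assumes that the percentage is taken over edges where both nodes carry a non-empty Gender value, which the page itself does not state.

```python
from collections import Counter

# Toy sentence: (id, upos, gender or None, head id, deprel). Hypothetical annotations.
tokens = [
    (1, "NOUN", "Fem", 0, "root"),
    (2, "DET", "Fem", 3, "det:adj"),
    (3, "ADJ", "Fem", 1, "amod"),
    (4, "NOUN", "Masc", 1, "conj"),
]


def agreement_counts(sentence):
    """Count, per (parent UPOS, deprel, child UPOS), agreeing edges and comparable edges."""
    by_id = {tok[0]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for _tid, upos, gender, head, deprel in sentence:
        parent = by_id.get(head)  # None for the root, whose head is 0
        if parent is None or gender is None or parent[2] is None:
            continue  # only edges where both nodes have Gender are comparable
        key = (parent[1], deprel, upos)
        both[key] += 1
        if gender == parent[2]:
            agree[key] += 1
    return agree, both


agree, both = agreement_counts(tokens)
for (parent_upos, deprel, child_upos), n in both.items():
    hits = agree[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} --[{deprel}]--> {child_upos} ({hits}; {round(100 * hits / n)}%)")
```

Summing the two counters over all sentences of a treebank and sorting by the number of agreeing edges would give a ranking in the same shape as the list above.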