This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

Gender: gender

Gender is usually a lexical feature of nouns and an inflectional feature of other parts of speech (adjectives, verbs) that mark agreement with nouns. In Bulgarian, gender is grammatical.

There are three genders: masculine (m), feminine (f) and neuter (n).

Masc: masculine gender

Nouns denoting male persons are masculine. Other nouns may also be grammatically masculine, without any relation to sex.

Example: [bg] замък / zamak “castle”

Fem: feminine gender

Nouns denoting female persons are feminine. Other nouns may also be grammatically feminine, without any relation to sex.

Example: [bg] маса / masa “table”

Neut: neuter gender

Nouns that are grammatically neither masculine nor feminine are neuter.

Example: [bg] дете / dete “child”
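
The three values above are stored in the FEATS column of a CoNLL-U file as part of a `Name=Value|Name=Value` string. The following is a minimal sketch of reading Gender from such a string. The helper names `parse_feats` and `gender_of` are hypothetical, and the feature strings attached to the example words are illustrative assumptions rather than lines copied from the treebank.

```python
from typing import Dict, Optional

GENDER_VALUES = {"Masc", "Fem", "Neut"}


def parse_feats(feats: str) -> Dict[str, str]:
    """Split a CoNLL-U FEATS string into a feature -> value mapping."""
    if feats in ("", "_"):  # "_" marks an empty FEATS column
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_of(feats: str) -> Optional[str]:
    """Return the Gender value, or None when the token has no Gender."""
    return parse_feats(feats).get("Gender")


# Illustrative feature strings (assumed, not taken from UD_Bulgarian).
examples = {
    "замък": "Definite=Ind|Gender=Masc|Number=Sing",
    "маса": "Definite=Ind|Gender=Fem|Number=Sing",
    "дете": "Definite=Ind|Gender=Neut|Number=Sing",
}

for form, feats in examples.items():
    value = gender_of(feats)
    assert value in GENDER_VALUES
    print(form, value)
```

Returning `None` for a missing feature keeps "no Gender" distinct from the three valid values.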


Treebank Statistics (UD_Bulgarian)

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

58268 tokens (37%) have a non-empty value of Gender. 19359 types (73%) occur at least once with a non-empty value of Gender. 11215 lemmas (75%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: bg-pos/NOUN (33290; 21% instances), bg-pos/ADJ (9449; 6% instances), bg-pos/PROPN (8146; 5% instances), bg-pos/PRON (3210; 2% instances), bg-pos/VERB (1909; 1% instances), bg-pos/DET (1692; 1% instances), bg-pos/NUM (494; 0% instances), bg-pos/AUX (78; 0% instances).
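
Counts of this kind can be approximated with a short script. The sketch below is not the tool that generated this page. It assumes tokens are available as `(form, upos, feats)` tuples, the function name `gender_coverage` is hypothetical, and the four sample tokens are made up for illustration.

```python
from collections import Counter


def has_gender(feats: str) -> bool:
    """True when a CoNLL-U FEATS string contains a Gender feature."""
    return any(pair.startswith("Gender=") for pair in feats.split("|"))


def gender_coverage(tokens):
    """Return (tokens with Gender, their share of all tokens, counts per POS tag)."""
    total = 0
    with_gender = 0
    per_pos = Counter()
    for _form, upos, feats in tokens:
        total += 1
        if has_gender(feats):
            with_gender += 1
            per_pos[upos] += 1
    share = with_gender / total if total else 0.0
    return with_gender, share, per_pos


# Toy input (assumed); a real run would read every token of the treebank.
sample = [
    ("новата", "ADJ", "Definite=Def|Degree=Pos|Gender=Fem|Number=Sing"),
    ("маса", "NOUN", "Definite=Ind|Gender=Fem|Number=Sing"),
    ("и", "CONJ", "_"),
    ("маси", "NOUN", "Definite=Ind|Number=Plur"),
]

count, share, per_pos = gender_coverage(sample)
print(count, f"{share:.0%}", dict(per_pos))
```

As a consistency check on the figures above, the eight per-tag counts add up to the 58268 tokens reported for the feature as a whole.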

NOUN

33290 bg-pos/NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (23776; 71%), Definite=Ind (20478; 62%).

NOUN tokens may have the following values of Gender:

Paradigm глава (Masc, Fem)
Definite=Def|Number=Sing: главата
Definite=Def|Number=Plur: главите
Definite=Ind|Number=Sing: Masc глава; Fem глава
Definite=Ind|Number=Plur: глави

Gender seems to be a lexical feature of NOUN: 100% of lemmas (5476) occur with only one value of Gender.
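
The "lexical feature" test amounts to asking what share of lemmas is ever seen with only one Gender value. A minimal sketch of that check follows. The function name `single_gender_lemma_share` and the toy `(lemma, gender)` pairs are assumptions for illustration, with глава listed under two values as in the paradigm above.

```python
from collections import defaultdict


def single_gender_lemma_share(pairs):
    """Share of lemmas that occur with exactly one Gender value.

    pairs: iterable of (lemma, gender) for tokens of one POS tag
    that carry a non-empty Gender.
    """
    values_by_lemma = defaultdict(set)
    for lemma, gender in pairs:
        values_by_lemma[lemma].add(gender)
    if not values_by_lemma:
        return 0.0
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single / len(values_by_lemma)


# Toy data (assumed): глава appears under both Masc and Fem.
noun_pairs = [
    ("глава", "Fem"),
    ("глава", "Masc"),
    ("маса", "Fem"),
    ("замък", "Masc"),
    ("дете", "Neut"),
]

lexical_share = single_gender_lemma_share(noun_pairs)
print(f"{lexical_share:.0%} of lemmas occur with one Gender value")
```

A share close to 1 suggests the value is a property of the lemma, while a lower share points to inflection, as with adjectives that change form to agree with the noun.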

ADJ

9449 bg-pos/ADJ tokens (70% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (9449; 100%), Aspect=EMPTY (8610; 91%), Voice=EMPTY (8610; 91%), VerbForm=EMPTY (8610; 91%), Degree=Pos (7855; 83%), Definite=Ind (5206; 55%).

ADJ tokens may have the following values of Gender:

Paradigm нов (Masc, Fem, Neut)
Case=Voc|Degree=Pos: Нови
Definite=Def|Degree=Pos: Masc новия, новият; Fem новата; Neut новото
Definite=Def|Degree=Sup: Masc най-новият; Fem най-новата; Neut Най-новото
Definite=Ind|Degree=Pos: Masc нов; Fem нова; Neut ново

PROPN

8146 bg-pos/PROPN tokens (97% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (8018; 98%), Definite=Ind (7868; 97%).

PROPN tokens may have the following values of Gender:

Paradigm а (Masc, Fem, Neut)
Masc А; Fem А; Neut а

Gender seems to be a lexical feature of PROPN: 99% of lemmas (2820) occur with only one value of Gender.

PRON

3210 bg-pos/PRON tokens (32% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (3210; 100%), Reflex=EMPTY (3210; 100%), Poss=EMPTY (3210; 100%), Case=Nom (2206; 69%), Person=3 (1765; 55%), PronType=Prs (1765; 55%).

PRON tokens may have the following values of Gender:

Paradigm аз (Masc, Fem, Neut)
Case=Acc: Masc го, него; Fem я, нея; Neut го, него
Case=Dat: Masc му, нему; Fem й; Neut му
Case=Nom: Masc той; Fem тя; Neut то
й

VERB

1909 bg-pos/VERB tokens (10% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (1909; 100%), VerbForm=Part (1909; 100%), Number=Sing (1909; 100%), Definite=Ind (1909; 100%), Mood=EMPTY (1808; 95%), Aspect=Perf (1366; 72%), Voice=Act (1206; 63%).

VERB tokens may have the following values of Gender:

Paradigm съм (Masc, Fem, Neut)
Masc бил; Fem била; Neut било

DET

1692 bg-pos/DET tokens (70% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (1692; 100%), Person=EMPTY (1444; 85%), Poss=EMPTY (1366; 81%), Definite=EMPTY (1105; 65%).

DET tokens may have the following values of Gender:

Paradigm този (Masc, Fem, Neut)
Masc този, тоя, оня, онзи; Fem тази, тая, онази, тeзи; Neut това, онова, туй

NUM

494 bg-pos/NUM tokens (23% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (494; 100%), Definite=Ind (433; 88%), Number=Plur (266; 54%).

NUM tokens may have the following values of Gender:

Paradigm два (Masc, Fem, Neut)
Definite=Def: Masc двата; Fem двете; Neut двете
Definite=Ind: Masc два, 2; Fem две, 2; Neut две

AUX

78 bg-pos/AUX tokens (4% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Tense=Past (78; 100%), Number=Sing (78; 100%), VerbForm=Part (78; 100%), Person=EMPTY (78; 100%), Aspect=Imp (78; 100%), Voice=Act (78; 100%), Mood=EMPTY (78; 100%).

AUX tokens may have the following values of Gender:

Paradigm съм (Masc, Fem, Neut)
Masc бил; Fem била; Neut било

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (7991; 71%), NOUN –[nmod]–> PROPN (1778; 56%), NOUN –[det]–> DET (1360; 70%), PROPN –[name]–> PROPN (1181; 93%), PROPN –[nmod]–> PROPN (512; 84%), PROPN –[conj]–> PROPN (403; 71%), PROPN –[amod]–> ADJ (256; 82%), ADJ –[nsubj]–> NOUN (249; 72%), ADJ –[conj]–> ADJ (248; 98%), PROPN –[nmod]–> NOUN (227; 67%).
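
A rough way to reproduce this kind of agreement count is sketched below. It assumes each token is a dict with `id`, `head`, `upos`, `deprel` and `gender` keys. The function name `agreement_by_relation` and the four-token sample are hypothetical. The choice of denominator (all parent-child pairs of a given pattern, including pairs where one side has no Gender) is also an assumption, since the page does not state how the percentages were computed.

```python
from collections import Counter


def agreement_by_relation(tokens):
    """Map (parent POS, relation, child POS) -> (agreeing pairs, share of pairs).

    A pair agrees when the child has a Gender value and the parent has the same one.
    """
    by_id = {tok["id"]: tok for tok in tokens}
    total = Counter()
    agree = Counter()
    for child in tokens:
        parent = by_id.get(child["head"])  # head 0 is the root, which has no parent
        if parent is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if child["gender"] is not None and child["gender"] == parent["gender"]:
            agree[key] += 1
    return {key: (agree[key], agree[key] / total[key]) for key in total}


# Toy sentence (assumed): "новата маса, нови маси"; the plural adjective has no Gender.
sentence = [
    {"id": 1, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 4, "upos": "ADJ", "deprel": "amod", "gender": None},
    {"id": 4, "head": 2, "upos": "NOUN", "deprel": "conj", "gender": "Fem"},
]

stats = agreement_by_relation(sentence)
for (parent_pos, rel, child_pos), (n, pct) in stats.items():
    print(f"{parent_pos} -[{rel}]-> {child_pos} ({n}; {pct:.0%})")
```

Under this assumption, adjectives that carry no Gender lower the agreement share without counting as disagreement.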


Gender in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]