
This page pertains to UD version 2.

Treebank Statistics: UD_Bulgarian-BTB: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

58980 tokens (38%) have a non-empty value of Gender. 19497 types (74%) occur at least once with a non-empty value of Gender. 11282 lemmas (76%) occur at least once with a non-empty value of Gender. The feature is used with 9 part-of-speech tags: NOUN (33602; 22% of instances), ADJ (9557; 6% of instances), PROPN (8342; 5% of instances), PRON (3244; 2% of instances), VERB (1822; 1% of instances), DET (1718; 1% of instances), NUM (515; 0% of instances), AUX (179; 0% of instances), ADP (1; 0% of instances).
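Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: it assumes the standard 10-column CoNLL-U layout (UPOS in column 4, FEATS in column 6), and the function name `gender_counts_by_upos` and the two-token sample are made up for illustration.

```python
from collections import Counter


def gender_counts_by_upos(conllu_text):
    """Count tokens that carry a Gender feature, grouped by UPOS tag."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank sentence separator or comment line
        cols = line.split("\t")
        # Keep only regular word lines; IDs like "1-2" or "1.1" are skipped.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos, feats = cols[3], cols[5]
        names = {f.split("=", 1)[0] for f in feats.split("|")}
        if "Gender" in names:
            counts[upos] += 1
    return counts


# Illustrative two-token sentence (made up, not copied from the treebank).
SAMPLE = "\n".join(
    "\t".join(cols)
    for cols in [
        ["1", "новата", "нов", "ADJ", "_",
         "Definite=Def|Degree=Pos|Gender=Fem|Number=Sing", "2", "amod", "_", "_"],
        ["2", "глава", "глава", "NOUN", "_",
         "Definite=Ind|Gender=Fem|Number=Sing", "0", "root", "_", "_"],
    ]
)

print(gender_counts_by_upos(SAMPLE))
```

Dividing each count by the total number of tokens with that tag would give per-tag percentages like those in the sections below.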

NOUN

33602 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (23922; 71%), Definite=Ind (20591; 61%).
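A co-occurrence count like the one above could be computed roughly as follows. This is a hedged sketch under stated assumptions: the input is a list of FEATS strings for tokens of one part-of-speech tag, the helper name `cooccurring_features` is hypothetical, and the `EMPTY` pseudo-values reported for some tags (a feature being absent) are not modelled.

```python
from collections import Counter


def cooccurring_features(feature_strings):
    """Count other Feature=Value pairs on tokens that have a Gender value.

    Returns (counts, total), where total is the number of tokens with Gender.
    """
    counts = Counter()
    total = 0
    for feats in feature_strings:
        pairs = [] if feats == "_" else feats.split("|")
        if not any(p.startswith("Gender=") for p in pairs):
            continue  # token has no Gender value, so it is not counted
        total += 1
        counts.update(p for p in pairs if not p.startswith("Gender="))
    return counts, total


# Illustrative FEATS strings (made up for this example).
feats_of_nouns = [
    "Definite=Ind|Gender=Fem|Number=Sing",
    "Definite=Def|Gender=Fem|Number=Sing",
    "Number=Plur",
    "_",
]
counts, total = cooccurring_features(feats_of_nouns)
for pair, n in counts.most_common():
    print(f"{pair} ({n}; {100 * n // total}%)")
```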

NOUN tokens may have the following values of Gender:

Paradigm глава (Masc, Fem)
Definite=Def|Number=Sing: главата
Definite=Def|Number=Plur: главите
Definite=Ind|Number=Sing: Masc глава; Fem глава
Definite=Ind|Number=Plur: глави

Gender seems to be a lexical feature of NOUN. 100% of lemmas (5501) occur with only one value of Gender.
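The "lexical feature" test amounts to checking how many lemmas are ever seen with more than one Gender value. A minimal sketch, assuming the tokens are available as `(lemma, upos, feats)` tuples; the function name `single_gender_share` and the sample rows are hypothetical and not taken from the treebank.

```python
from collections import defaultdict


def single_gender_share(rows, upos="NOUN"):
    """Share of lemmas of the given UPOS seen with exactly one Gender value."""
    values = defaultdict(set)
    for lemma, tag, feats in rows:
        if tag != upos:
            continue
        for pair in feats.split("|"):
            if pair.startswith("Gender="):
                values[lemma].add(pair[len("Gender="):])
    if not values:
        return 0.0  # no lemma of this UPOS carries Gender
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single / len(values)


# Illustrative rows (made up): one lemma with two genders, one with a single gender.
rows = [
    ("глава", "NOUN", "Definite=Ind|Gender=Fem|Number=Sing"),
    ("глава", "NOUN", "Definite=Ind|Gender=Masc|Number=Sing"),
    ("книга", "NOUN", "Definite=Ind|Gender=Fem|Number=Sing"),
    ("нов", "ADJ", "Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing"),
]
print(single_gender_share(rows))
```

A share close to 1.0 suggests that Gender is fixed per lemma for that tag, which is the sense in which the feature is called lexical here.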

ADJ

9557 ADJ tokens (70% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (9557; 100%), Degree=Pos (9044; 95%), Aspect=EMPTY (8705; 91%), VerbForm=EMPTY (8705; 91%), Voice=EMPTY (8705; 91%), Definite=Ind (5220; 55%).

ADJ tokens may have the following values of Gender:

Paradigm нов (Masc, Fem, Neut)
Case=Voc|Degree=Pos: Нови
Definite=Def|Degree=Pos: Masc новия, новият; Fem новата; Neut новото
Definite=Def|Degree=Sup: Masc най-новият; Fem най-новата; Neut Най-новото
Definite=Ind|Degree=Pos: Masc нов; Fem нова; Neut ново

PROPN

8342 PROPN tokens (99% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (8213; 98%), Definite=Ind (8057; 97%).

PROPN tokens may have the following values of Gender:

Paradigm а (Masc, Fem, Neut)
ААа

Gender seems to be a lexical feature of PROPN. 99% of lemmas (2863) occur with only one value of Gender.

PRON

3244 PRON tokens (32% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (3244; 100%), Poss=EMPTY (3244; 100%), Reflex=EMPTY (3244; 100%), Case=Nom (2234; 69%), Person=3 (1781; 55%), PronType=Prs (1781; 55%).

PRON tokens may have the following values of Gender:

Paradigm аз (Masc, Fem, Neut)
Case=Acc: Masc го, него; Fem я, нея; Neut го, него
Case=Dat: Masc му, нему; Fem й; Neut му
Case=Nom: Masc той; Fem тя; Neut то
й

VERB

1822 VERB tokens (11% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (1822; 100%), Number=Sing (1822; 100%), Person=EMPTY (1822; 100%), VerbForm=Part (1822; 100%), Definite=Ind (1821; 100%), Aspect=Perf (1375; 75%), Voice=Act (1112; 61%), Tense=Past (950; 52%).

VERB tokens may have the following values of Gender:

Paradigm мога (Masc, Fem, Neut)
Tense=Imp: Masc можел; Fem можела; Neut можело
Tense=Past: Masc могъл; Fem могла; Neut могло

DET

1718 DET tokens (71% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (1718; 100%), Person=EMPTY (1461; 85%), Poss=EMPTY (1382; 80%), Definite=EMPTY (1119; 65%), Case=EMPTY (1029; 60%).

DET tokens may have the following values of Gender:

Paradigm този (Masc, Fem, Neut)
Case=Nom: Fem тази, тая, онази, тeзи; Neut това, онова, туй
Masc този, тоя, оня, онзи

NUM

515 NUM tokens (24% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (515; 100%), Definite=Ind (448; 87%), Number=Plur (282; 55%).

NUM tokens may have the following values of Gender:

Paradigm два (Masc, Fem, Neut)
Definite=Def: Masc двата; Fem двете; Neut двете
Definite=Ind: Masc два, 2; Fem две, 2; Neut две

AUX

179 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Aspect=Imp (179; 100%), Mood=Ind (179; 100%), Number=Sing (179; 100%), Person=EMPTY (179; 100%), Tense=EMPTY (179; 100%), VerbForm=Part (179; 100%), Voice=Act (179; 100%).

AUX tokens may have the following values of Gender:

Paradigm съм (Masc, Fem, Neut)
Masc бил; Fem била; Neut било

ADP

1 ADP token (0% of all ADP tokens) has a non-empty value of Gender.

ADP tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (7988; 70%), NOUN –[nmod]–> PROPN (1793; 55%), PROPN –[flat]–> PROPN (1522; 95%), NOUN –[det]–> DET (1360; 69%), PROPN –[conj]–> PROPN (416; 71%), ADJ –[nsubj]–> NOUN (285; 73%), ADJ –[conj]–> ADJ (249; 97%), PROPN –[nmod]–> PROPN (247; 72%), PROPN –[amod]–> ADJ (239; 82%), PROPN –[nmod]–> NOUN (225; 67%).
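Agreement counts of this shape can be sketched as follows. This is an illustration under stated assumptions, not the generator's own code: each sentence is a list of dicts with the hypothetical keys `id`, `upos`, `feats`, `head` and `deprel`, and only the raw number of agreeing parent and child pairs is counted (the base of the percentages above is not reproduced).

```python
from collections import Counter


def gender_of(feats):
    """Return the Gender value from a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None


def agreeing_relations(sentence):
    """Count (parent UPOS, deprel, child UPOS) triples that agree in Gender."""
    by_id = {tok["id"]: tok for tok in sentence}
    counts = Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # head 0 is the artificial root, which has no features
        g_child = gender_of(child["feats"])
        if g_child is not None and g_child == gender_of(parent["feats"]):
            counts[(parent["upos"], child["deprel"], child["upos"])] += 1
    return counts


# Illustrative sentence (made up): an adjective modifying a feminine noun.
sentence = [
    {"id": 1, "upos": "ADJ", "head": 2, "deprel": "amod",
     "feats": "Definite=Def|Degree=Pos|Gender=Fem|Number=Sing"},
    {"id": 2, "upos": "NOUN", "head": 0, "deprel": "root",
     "feats": "Definite=Ind|Gender=Fem|Number=Sing"},
]
for (parent, rel, child), n in agreeing_relations(sentence).items():
    print(f"{parent} -[{rel}]-> {child} ({n})")
```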