This page pertains to UD version 2.

Treebank Statistics: UD_Russian-Taiga: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

76248 tokens (39%) have a non-empty value of Gender. 28179 types (74%) occur at least once with a non-empty value of Gender. 14248 lemmas (70%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (43554; 22% instances), ADJ (12181; 6% instances), PRON (6148; 3% instances), VERB (5884; 3% instances), PROPN (3792; 2% instances), DET (3745; 2% instances), AUX (586; 0% instances), NUM (358; 0% instances).
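Counts like these can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the sample sentences, `SAMPLE`, `parse_feats`, `iter_tokens`, and `feature_by_upos` are all hypothetical names and data introduced for illustration. It assumes the per-tag percentages are shares of all tokens, which is consistent with the figures above.

```python
from collections import Counter

# Hypothetical two-sentence sample in CoNLL-U format (10 tab-separated columns).
SAMPLE = (
    "# text = хорошая цена\n"
    "1\tхорошая\tхороший\tADJ\t_\tCase=Nom|Degree=Pos|Gender=Fem|Number=Sing\t2\tamod\t_\t_\n"
    "2\tцена\tцена\tNOUN\t_\tAnimacy=Inan|Case=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_\n"
    "\n"
    "# text = он был там\n"
    "1\tон\tон\tPRON\t_\tCase=Nom|Gender=Masc|Number=Sing|Person=3\t3\tnsubj\t_\t_\n"
    "2\tбыл\tбыть\tAUX\t_\tAspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin|Voice=Act\t3\tcop\t_\t_\n"
    "3\tтам\tтам\tADV\t_\tDegree=Pos\t0\troot\t_\t_\n"
    "\n"
)


def parse_feats(column):
    """Turn a FEATS column such as 'Case=Nom|Gender=Fem' into a dict."""
    if column == "_":
        return {}
    return dict(pair.split("=", 1) for pair in column.split("|"))


def iter_tokens(conllu_text):
    """Yield (UPOS, feature dict) for every syntactic word."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        yield cols[3], parse_feats(cols[5])


def feature_by_upos(conllu_text, feature="Gender"):
    """Return the total token count and, per UPOS, the tokens carrying the feature."""
    total = 0
    per_upos = Counter()
    for upos, feats in iter_tokens(conllu_text):
        total += 1
        if feature in feats:
            per_upos[upos] += 1
    return total, per_upos


total, per_upos = feature_by_upos(SAMPLE)
with_feature = sum(per_upos.values())
print(f"{with_feature} of {total} tokens have a non-empty value of Gender")
for upos, count in per_upos.most_common():
    print(f"{upos}: {count} ({100 * count / total:.0f}% of all tokens)")
```

To run it on real data, pass the contents of a treebank file instead of `SAMPLE`.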

NOUN

43554 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Animacy=Inan (37106; 85%), Number=Sing (31845; 73%).

NOUN tokens may have the following values of Gender:

Paradigm цена (values: Fem, Neut)
Case=Acc|Number=Sing: Fem цену; Neut цена
Case=Acc|Number=Plur: Fem цены
Case=Dat|Number=Sing: Fem цене
Case=Dat|Number=Plur: Fem ценам
Case=Gen|Number=Sing: Fem цены
Case=Gen|Number=Plur: Fem цен
Case=Ins|Number=Sing: Fem ценой
Case=Ins|Number=Plur: Fem ценами
Case=Loc|Number=Sing: Fem цене
Case=Loc|Number=Plur: Fem ценах
Case=Nom|Number=Sing: Fem цена
Case=Nom|Number=Plur: Fem цены

Gender seems to be a lexical feature of NOUN: 99% of lemmas (7814) occur with only one value of Gender.
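The lexical-feature check can be sketched as follows: collect the set of Gender values seen for each lemma of one part of speech, then count the lemmas whose set has exactly one member. This is an illustrative sketch, not the page's own generator. The sample tokens and the name `values_per_lemma` are hypothetical, and the sample deliberately includes one цена token tagged Neut to mirror the paradigm above.

```python
from collections import defaultdict

# Hypothetical one-token "sentences" in CoNLL-U format (10 tab-separated columns).
SAMPLE = (
    "1\tцена\tцена\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_\n\n"
    "1\tцену\tцена\tNOUN\t_\tCase=Acc|Gender=Fem|Number=Sing\t0\troot\t_\t_\n\n"
    "1\tцена\tцена\tNOUN\t_\tCase=Acc|Gender=Neut|Number=Sing\t0\troot\t_\t_\n\n"
    "1\tдом\tдом\tNOUN\t_\tCase=Nom|Gender=Masc|Number=Sing\t0\troot\t_\t_\n\n"
    "1\tокно\tокно\tNOUN\t_\tCase=Nom|Gender=Neut|Number=Sing\t0\troot\t_\t_\n\n"
)


def values_per_lemma(conllu_text, upos, feature="Gender"):
    """Map each lemma of the given UPOS to the set of feature values it occurs with."""
    values = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        # Keep only plain word lines of the requested UPOS that have features.
        if not cols[0].isdigit() or cols[3] != upos or cols[5] == "_":
            continue
        feats = dict(pair.split("=", 1) for pair in cols[5].split("|"))
        if feature in feats:
            values[cols[2]].add(feats[feature])
    return values


values = values_per_lemma(SAMPLE, "NOUN")
single = [lemma for lemma, seen in values.items() if len(seen) == 1]
print(f"{len(single)} of {len(values)} lemmas occur with only one value of Gender")
```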

ADJ

12181 ADJ tokens (72% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (12179; 100%), Degree=Pos (11975; 98%), Variant=EMPTY (10073; 83%).

ADJ tokens may have the following values of Gender:

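Paradigm хороший (values: Masc, Fem, Neut)
Animacy=Inan|Case=Acc: Masc хороший
Case=Acc: Fem хорошую; Neut хорошее
Case=Dat: Fem хорошей
Case=Gen: Masc хорошего; Fem хорошей; Neut хорошего
Case=Ins: Masc хорошим; Fem хорошей; Neut хорошим
Case=Loc: Masc хорошем; Neut хорошем
Case=Nom: Masc хороший, хоро, хорошии; Fem хорошая, Шорошая; Neut хорошее
Case=Nom|Typo=Yes: Masc Хорлший; Neut Хорошое
Variant=Short: Masc хорош; Fem хороша; Neut хорошо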

PRON

6148 PRON tokens (55% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (6146; 100%), Person=EMPTY (4078; 66%), Animacy=Inan (3394; 55%), Case=Nom (3126; 51%).

PRON tokens may have the following values of Gender:

Paradigm который (values: Masc, Fem, Neut)
Animacy=Anim|Case=Acc|Number=Sing: Masc которого
Animacy=Anim|Case=Ins|Number=Sing: которым
Animacy=Inan|Case=Acc|Number=Sing: Masc который; Neut которое
Animacy=Inan|Case=Ins|Number=Sing: которым
Animacy=Inan|Case=Nom|Number=Sing: Neut которое
Case=Acc|Number=Sing: Masc который; Fem которую; Neut которое
Case=Dat|Number=Sing: Masc которому; Fem которой
Case=Gen|Number=Sing: Masc которого; Fem которой; Neut которого
Case=Ins|Number=Sing: Masc которым; Fem которой
Case=Loc|Number=Sing: Masc котором; Fem которой
Case=Nom|Number=Sing: Masc который; Fem которая; Neut которое
Case=Nom|Number=Plur|Typo=Yes: клторые

VERB

5884 VERB tokens (24% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Number=Sing (5884; 100%), Person=EMPTY (5884; 100%), Tense=Past (5697; 97%), Mood=Ind (4827; 82%), VerbForm=Fin (4826; 82%), Aspect=Perf (3998; 68%), Voice=Act (3895; 66%).

VERB tokens may have the following values of Gender:

Paradigm быть (values: Masc, Fem, Neut)
Case=Nom|VerbForm=Part: Masc бывший
Mood=Ind|Typo=Yes|VerbForm=Fin: был
Mood=Ind|VerbForm=Fin: Masc был; Fem была; Neut было

PROPN

3792 PROPN tokens (85% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (3726; 98%), Animacy=Anim (1986; 52%).

PROPN tokens may have the following values of Gender:

Paradigm Франция (values: Masc, Fem)
Case=Acc: Fem Францию
Case=Gen: Masc франции; Fem Франции
Case=Loc: Fem Франции
Case=Nom: Fem Франция

Gender seems to be a lexical feature of PROPN: 99% of lemmas (1813) occur with only one value of Gender.

DET

3745 DET tokens (66% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (3743; 100%), Animacy=EMPTY (3236; 86%), Poss=EMPTY (2886; 77%).

DET tokens may have the following values of Gender:

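Paradigm этот (values: Masc, Fem, Neut)
Animacy=Anim|Case=Acc: Masc этого
Animacy=Inan|Case=Acc: Masc этот
Case=Acc: Fem эту; Neut это
Case=Acc|Typo=Yes: это
Case=Dat: Masc этому; Fem этой; Neut этому
Case=Gen: Masc этого; Fem этой; Neut этого
Case=Ins: Masc этим; Fem этой; Neut этим
Case=Loc: Masc этом; Fem этой; Neut этом
Case=Nom: Masc этот; Fem эта; Neut это
Case=Nom|Typo=Yes: это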

AUX

586 AUX tokens (37% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Aspect=Imp (586; 100%), Number=Sing (586; 100%), Person=EMPTY (586; 100%), Tense=Past (586; 100%), Voice=Act (586; 100%), Mood=Ind (584; 100%), VerbForm=Fin (584; 100%).

AUX tokens may have the following values of Gender:

Paradigm быть (values: Masc, Fem, Neut)
Case=Nom|VerbForm=Part: Masc бывший; Fem бывшая
Mood=Ind|VerbForm=Fin: Masc был; Fem была; Neut было

NUM

358 NUM tokens (12% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (356; 99%), NumType=Card (322; 90%).

NUM tokens may have the following values of Gender:

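Paradigm один (values: Masc, Fem, Neut)
Animacy=Anim|Case=Acc: Masc одного
Animacy=Inan|Case=Acc: Masc один
Animacy=Inan|Case=Acc|Typo=Yes: Masc оден
Case=Acc: Masc один; Fem одну; Neut одно
Case=Dat: Masc одному; Fem одной
Case=Gen: Masc одного; Fem одной; Neut одного
Case=Ins: Masc одним; Fem одной; Neut одним
Case=Loc: Masc одном; Fem одной; Neut одном
Case=Nom: Masc один; Fem одна; Neut одно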

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (7773; 72%), NOUN –[det]–> DET (2747; 64%), ADJ –[nsubj]–> NOUN (833; 58%), ADJ –[conj]–> ADJ (571; 81%), PROPN –[flat:name]–> PROPN (401; 84%), ADJ –[nsubj]–> PRON (362; 74%), NOUN –[appos]–> NOUN (274; 55%), VERB –[nsubj]–> PROPN (272; 54%), NOUN –[appos]–> PROPN (235; 59%), VERB –[nsubj:pass]–> NOUN (232; 51%).
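Agreement counts of this kind can be approximated by walking every dependency edge and comparing the Gender values of parent and child. The sketch below is illustrative only. The sample sentences and the names `read_sentences` and `agreement_counts` are hypothetical, and the second sentence is a deliberately mismatched toy pair. It takes as its denominator the edges where both nodes carry Gender, which is an assumption: the percentages on this page may be computed against a different base.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U format; the second pair disagrees on purpose.
SAMPLE = (
    "1\tхорошая\tхороший\tADJ\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tamod\t_\t_\n"
    "2\tцена\tцена\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_\n"
    "\n"
    "1\tхороший\tхороший\tADJ\t_\tCase=Nom|Gender=Masc|Number=Sing\t2\tamod\t_\t_\n"
    "2\tцена\tцена\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_\n"
    "\n"
)


def parse_feats(column):
    """Turn a FEATS column such as 'Case=Nom|Gender=Fem' into a dict."""
    if column == "_":
        return {}
    return dict(pair.split("=", 1) for pair in column.split("|"))


def read_sentences(conllu_text):
    """Yield one dict per sentence: token id -> (UPOS, feats, head id, deprel)."""
    sentence = {}
    for line in conllu_text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        sentence[int(cols[0])] = (cols[3], parse_feats(cols[5]), int(cols[6]), cols[7])
    if sentence:
        yield sentence


def agreement_counts(conllu_text, feature="Gender"):
    """Count, per (parent UPOS, deprel, child UPOS), edges that agree in the feature."""
    agree, both = Counter(), Counter()
    for sentence in read_sentences(conllu_text):
        for upos, feats, head, deprel in sentence.values():
            if head == 0 or head not in sentence:
                continue  # the root has no parent to agree with
            parent_upos, parent_feats = sentence[head][0], sentence[head][1]
            if feature in feats and feature in parent_feats:
                key = (parent_upos, deprel, upos)
                both[key] += 1
                if feats[feature] == parent_feats[feature]:
                    agree[key] += 1
    return agree, both


agree, both = agreement_counts(SAMPLE)
for (parent, deprel, child), n in agree.most_common(10):
    print(f"{parent} -[{deprel}]-> {child}: {n} of {both[(parent, deprel, child)]}")
```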