This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PUD: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut. Some words have combined values of the feature; 3 combinations have been observed: Fem|Masc, Fem|Neut, Masc|Neut.

This is a layered feature with the following layers: Gender, Gender[psor].

9478 tokens (51%) have a non-empty value of Gender. 6231 types (82%) occur at least once with a non-empty value of Gender. 4159 lemmas (79%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (4337; 23% of instances), ADJ (2229; 12% of instances), PROPN (967; 5% of instances), VERB (879; 5% of instances), DET (641; 3% of instances), AUX (260; 1% of instances), PRON (103; 1% of instances), NUM (62; 0% of instances).
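Counts like the per-tag figures above can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names `parse_feats` and `gender_coverage` are hypothetical, and it assumes standard 10-column CoNLL-U input in which multiword-token ranges and empty nodes are not counted as tokens.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_coverage(conllu_text, feature="Gender"):
    """Count, per UPOS tag, tokens with a non-empty value of `feature`.

    Returns two Counters: tokens carrying the feature, and all tokens.
    """
    with_feature = Counter()
    total = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if feature in parse_feats(cols[5]):
            with_feature[upos] += 1
    return with_feature, total
```

Dividing `with_feature[tag]` by `total[tag]` gives a per-tag share comparable to the "97% of all NOUN tokens" style figures in the sections below.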

NOUN

4337 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3085; 71%), Animacy=EMPTY (2425; 56%).

NOUN tokens may have the following values of Gender:

Paradigm rok (Gender values: Masc; Neut)
Animacy=Inan|Case=Acc|Number=Sing: rok (Masc)
Animacy=Inan|Case=Acc|Number=Plur: roky (Masc)
Animacy=Inan|Case=Gen|Number=Sing: roku, roka (Masc)
Animacy=Inan|Case=Ins|Number=Sing: rokem (Masc)
Animacy=Inan|Case=Loc|Number=Sing: roce, roku (Masc)
Animacy=Inan|Case=Nom|Number=Sing: rok (Masc)
Case=Acc|Number=Plur: léta (Neut)
Case=Gen|Number=Plur: let (Neut)
Case=Ins|Number=Plur: lety (Neut)
Case=Loc|Number=Plur: letech (Neut)

Gender seems to be a lexical feature of NOUN. 100% of lemmas (1856) occur with only one value of Gender.
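The "lexical feature" observation can be checked by collecting, for each lemma of a given part of speech, the set of Gender values it occurs with. The sketch below is an illustration under assumptions: the function name `single_value_lemma_share` is hypothetical, a combined value such as `Fem,Neut` is treated as one distinct value, and the page's own counting method may differ.

```python
from collections import defaultdict


def single_value_lemma_share(conllu_text, upos="NOUN", feature="Gender"):
    """Return (lemmas with exactly one value of `feature`, lemmas with any value)."""
    values_by_lemma = defaultdict(set)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Comment and blank lines do not have 10 tab-separated columns.
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        for pair in cols[5].split("|"):
            name, _, value = pair.partition("=")
            if name == feature:
                # Assumption: 'Fem,Neut' counts as its own single value.
                values_by_lemma[cols[2]].add(value)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)
```

A lemma such as rok, which appears in the paradigm above with both Masc and Neut forms, would be counted in the second number but not the first.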

ADJ

2229 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Polarity=Pos (2060; 92%), Degree=Pos (1969; 88%), VerbForm=EMPTY (1813; 81%), Voice=EMPTY (1813; 81%), Number=Sing (1460; 66%), Animacy=EMPTY (1362; 61%).

ADJ tokens may have the following values of Gender:

Paradigm známý (Gender values: Fem,Neut; Masc; Fem; Neut)
Animacy=Anim|Case=Nom|Degree=Sup|Number=Sing|Polarity=Pos: nejznámější (Masc)
Animacy=Inan|Case=Acc|Degree=Pos|Number=Sing|Polarity=Pos: známý (Masc)
Animacy=Inan|Case=Gen|Degree=Pos|Number=Plur|Polarity=Pos: známých (Masc)
Animacy=Inan|Case=Nom|Degree=Pos|Number=Sing|Polarity=Pos: známý (Masc)
Animacy=Inan|Case=Nom|Degree=Pos|Number=Plur|Polarity=Pos: známé (Masc)
Case=Nom|Degree=Pos|Number=Sing|Polarity=Neg: neznámé (Neut)
Case=Nom|Degree=Pos|Number=Sing|Polarity=Pos: známá (Fem)
Degree=Pos|Number=Sing|Polarity=Pos|Variant=Short: známo (Neut)
Degree=Pos|Number=Plur,Sing|Polarity=Pos|Variant=Short: známa (Fem,Neut)

PROPN

967 PROPN tokens (89% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Foreign=EMPTY (898; 93%), Number=Sing (835; 86%).

PROPN tokens may have the following values of Gender:

Paradigm Andy (Gender values: Masc; Fem)
Animacy=Anim|Case=Dat|NameType=Giv|Number=Sing: Andymu (Masc)
Animacy=Anim|Case=Gen|NameType=Giv|Number=Sing: Andyho (Masc)
Case=Nom|NameType=Geo|Number=Plur: Andy (Fem)

Gender seems to be a lexical feature of PROPN. 99% of lemmas (682) occur with only one value of Gender.

VERB

879 VERB tokens (51% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (879; 100%), Person=EMPTY (879; 100%), Voice=Act (879; 100%), Tense=Past (878; 100%), VerbForm=Part (877; 100%), Polarity=Pos (843; 96%), Animacy=EMPTY (692; 79%), Number=Sing (492; 56%).

VERB tokens may have the following values of Gender:

Paradigm začít (Gender values: Fem,Masc; Fem,Neut; Masc; Fem; Neut)
Animacy=Anim|Number=Plur: začali (Masc)
Animacy=Inan|Number=Plur: začaly (Fem,Masc)
Number=Sing: začal (Masc); začala (Fem); začalo (Neut)
Number=Plur,Sing: začala (Fem,Neut)

DET

641 DET tokens (76% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (596; 93%), Person=EMPTY (596; 93%), Reflex=EMPTY (567; 88%), Animacy=EMPTY (546; 85%), Poss=EMPTY (522; 81%), Number=Sing (516; 80%), Case=Nom (326; 51%).

DET tokens may have the following values of Gender:

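Paradigm který (Gender values: Masc; Masc,Neut; Fem; Neut)
Animacy=Anim|Case=Acc|Number=Sing: kterého (Masc)
Animacy=Anim|Case=Nom|Number=Plur: kteří (Masc)
Animacy=Inan|Case=Acc|Number=Sing: který (Masc)
Animacy=Inan|Case=Acc|Number=Plur: které (Masc)
Animacy=Inan|Case=Nom|Number=Plur: které (Masc)
Case=Acc|Number=Sing: kterou (Fem); které (Neut)
Case=Acc|Number=Plur: které (Masc); které (Fem); která (Neut)
Case=Dat|Number=Sing: kterému (Masc,Neut)
Case=Gen|Number=Sing: kterého (Masc,Neut); které (Fem)
Case=Loc|Number=Sing: kterém (Masc,Neut); které (Fem)
Case=Nom|Number=Sing: který (Masc); která (Fem); které (Neut)
Case=Nom|Number=Plur: které; které, která; která (three cells; their Gender columns are unclear)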

AUX

260 AUX tokens (38% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Aspect=Imp (260; 100%), Mood=EMPTY (260; 100%), Person=EMPTY (260; 100%), Tense=Past (260; 100%), VerbForm=Part (260; 100%), Voice=Act (260; 100%), Polarity=Pos (242; 93%), Number=Sing (156; 60%).

AUX tokens may have the following values of Gender:

Paradigm být (Gender values: Fem,Masc; Fem,Neut; Masc; Fem; Neut)
Animacy=Anim|Number=Plur|Polarity=Pos: byli (Masc)
Animacy=Inan|Number=Plur|Polarity=Neg: nebyly (Fem,Masc)
Animacy=Inan|Number=Plur|Polarity=Pos: byly (Fem,Masc)
Number=Sing|Polarity=Neg: nebyl, byl (Masc); nebylo (Neut)
Number=Sing|Polarity=Pos: byl (Masc); byla (Fem); bylo (Neut)
Number=Plur,Sing|Polarity=Neg: nebyla (Fem,Neut)
Number=Plur,Sing|Polarity=Pos: byla (Fem,Neut)

PRON

103 PRON tokens (18% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (103; 100%), Number=Sing (85; 83%), Variant=EMPTY (83; 81%), Person=3 (71; 69%), PronType=Prs (71; 69%), PrepCase=EMPTY (62; 60%).

PRON tokens may have the following values of Gender:

Paradigm on (Gender values: Masc; Masc,Neut; Fem; Neut)
Animacy=Anim|Case=Nom|Number=Plur: oni (Masc)
Case=Acc|Number=Sing|PrepCase=Pre: něj, ho (Masc,Neut)
Case=Acc|Number=Sing: ji (Fem); je (Neut)
Case=Acc|Number=Sing|Variant=Short: ho (Masc,Neut)
Case=Dat|Number=Sing|PrepCase=Pre: němu (Masc,Neut)
Case=Dat|Number=Sing: (no form listed)
Case=Dat|Number=Sing|Variant=Short: mu (Masc,Neut)
Case=Gen|Number=Sing|PrepCase=Pre: něj (Masc,Neut)
Case=Gen|Number=Sing: (no form listed)
Case=Ins|Number=Sing|PrepCase=Pre: ním (Masc,Neut)
Case=Ins|Number=Sing: jím (Masc,Neut)
Case=Loc|Number=Sing|PrepCase=Pre: něm (Masc,Neut)
Case=Nom|Number=Sing: on (Masc); ona (Fem)

NUM

62 NUM tokens (14% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (62; 100%), NumType=Card (62; 100%), Number=Sing (38; 61%).

NUM tokens may have the following values of Gender:

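Paradigm jeden (Gender values: Masc; Masc,Neut; Fem; Neut)
Case=Acc: jeden (Masc); jednu (Fem); jedno (Neut)
Case=Gen: jednoho (Masc,Neut); jedné (Fem)
Case=Ins: jedním (Masc,Neut); jednou (Fem)
Case=Loc: jednom (Masc,Neut); jedné (Fem)
Case=Nom: jeden (Masc); jedna (Fem); jedno (Neut)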

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (1694; 99%), VERB –[nsubj]–> PROPN (142; 66%), ADJ –[aux:pass]–> AUX (128; 78%), PROPN –[flat]–> PROPN (125; 88%), PROPN –[amod]–> ADJ (87; 99%), VERB –[conj]–> VERB (75; 63%), PROPN –[nmod]–> NOUN (61; 87%), ADJ –[nsubj]–> NOUN (54; 78%), ADJ –[conj]–> ADJ (44; 86%), PROPN –[conj]–> PROPN (33; 57%).
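Agreement counts of this kind can be sketched by comparing the Gender value of each node with that of its head. The code below is a hedged illustration, not the page's generator: the names `gender_of` and `agreement_counts` are hypothetical, agreement is assumed to mean identical Gender strings (so `Fem` and `Fem,Neut` do not agree), and the denominator is assumed to be the relation instances in which both nodes carry Gender.

```python
from collections import Counter


def gender_of(feats):
    """Return the Gender value from a CoNLL-U FEATS string, or None if absent."""
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        if name == "Gender":
            return value
    return None


def agreement_counts(conllu_text):
    """Count (parent UPOS, deprel, child UPOS) triples with matching Gender.

    Returns two Counters: agreeing pairs, and all pairs where both have Gender.
    """
    agree, both = Counter(), Counter()
    sentence = {}  # token id -> (upos, feats, head, deprel)

    def flush():
        for upos, feats, head, deprel in sentence.values():
            parent = sentence.get(head)
            if parent is None:
                continue  # root node, or head not found in this sentence
            g_child, g_parent = gender_of(feats), gender_of(parent[1])
            if g_child is None or g_parent is None:
                continue
            key = (parent[0], deprel, upos)
            both[key] += 1
            if g_child == g_parent:  # assumption: exact string match
                agree[key] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the sentence
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence[cols[0]] = (cols[3], cols[5], cols[6], cols[7])
    flush()  # handle a final sentence without a trailing blank line
    return agree, both
```

Under these assumptions, `agree[key] / both[key]` for the key `("NOUN", "amod", "ADJ")` would correspond to a percentage like the 99% reported for NOUN –[amod]–> ADJ.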