
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PUD: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut. Some words have combined values of the feature; 3 combinations have been observed: Fem,Masc; Fem,Neut; Masc,Neut.

This is a layered feature with the following layers: Gender, Gender[psor].
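In CoNLL-U files the FEATS column stores features as Name=Value pairs joined by |, with a layer written in square brackets after the feature name and multiple values of one feature joined by commas. A minimal Python sketch of parsing such a string follows; the function name parse_feats and the example string are illustrative assumptions, not part of the treebank tooling.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {feature name: list of values}.

    Layered features such as Gender[psor] keep the layer as part of the name,
    so Gender and Gender[psor] end up under separate keys.
    """
    if feats in ("", "_"):
        return {}  # "_" marks an empty FEATS column
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        parsed[name] = value.split(",")  # combined values, e.g. Fem,Neut
    return parsed


# Hypothetical feature string, made up for illustration.
example = parse_feats("Case=Nom|Gender=Fem,Neut|Gender[psor]=Masc|Number=Plur,Sing")
print(example["Gender"])
print(example["Gender[psor]"])
```

Keeping the layer inside the key means that a lookup of Gender never picks up the possessor's gender by accident.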

9481 tokens (51%) have a non-empty value of Gender. 6232 types (82%) occur at least once with a non-empty value of Gender. 4158 lemmas (79%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (4337; 23% instances), ADJ (2230; 12% instances), PROPN (966; 5% instances), VERB (879; 5% instances), DET (644; 3% instances), AUX (260; 1% instances), PRON (103; 1% instances), NUM (62; 0% instances).
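Counts like the ones above can be reproduced from a CoNLL-U file by checking, for each token, whether the FEATS column contains a Gender feature. The sketch below is a minimal illustration under assumptions: the function name and the short sample sentence are made up for the example and are not taken from UD_Czech-PUD, and this is not the script that generated this page.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U format (tab-separated, 10 columns per token).
SAMPLE = (
    "# text = Začala nová éra.\n"
    "1\tZačala\tzačít\tVERB\t_\tGender=Fem,Neut|Number=Plur,Sing|Tense=Past|VerbForm=Part|Voice=Act\t0\troot\t_\t_\n"
    "2\tnová\tnový\tADJ\t_\tCase=Nom|Degree=Pos|Gender=Fem|Number=Sing|Polarity=Pos\t3\tamod\t_\t_\n"
    "3\téra\téra\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing|Polarity=Pos\t1\tnsubj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)


def gender_counts(conllu_text):
    """Return (tokens with Gender per UPOS, all tokens per UPOS)."""
    with_gender, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        upos, feats = cols[3], cols[5]
        total[upos] += 1
        # Only the plain Gender layer counts; "Gender[psor]=..." does not
        # start with "Gender=" and is therefore ignored here.
        if any(f.startswith("Gender=") for f in feats.split("|")):
            with_gender[upos] += 1
    return with_gender, total


with_gender, total = gender_counts(SAMPLE)
for upos in sorted(with_gender):
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]} of {total[upos]} tokens ({share:.0f}%) have Gender")
```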

NOUN

4337 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Polarity=Pos (4329; 100%), Number=Sing (3086; 71%), Animacy=EMPTY (2423; 56%).

NOUN tokens may have the following values of Gender:

Paradigm rok                        Masc          Neut
Animacy=Inan|Case=Acc|Number=Sing   rok           -
Animacy=Inan|Case=Acc|Number=Plur   roky          -
Animacy=Inan|Case=Gen|Number=Sing   roku, roka    -
Animacy=Inan|Case=Ins|Number=Sing   rokem         -
Animacy=Inan|Case=Loc|Number=Sing   roce, roku    -
Animacy=Inan|Case=Nom|Number=Sing   rok           -
Case=Acc|Number=Plur|Style=Arch     -             léta
Case=Gen|Number=Plur                -             let
Case=Ins|Number=Plur                -             lety
Case=Loc|Number=Plur                -             letech

Gender seems to be a lexical feature of NOUN. 100% of lemmas (1856) occur with only one value of Gender.
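The percentage above is the share of lemmas that are seen with exactly one Gender value. A minimal sketch of that check follows, assuming the (lemma, Gender) pairs have already been extracted from the treebank; the function name and the toy pairs are illustrative assumptions.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for genders in values.values() if len(genders) == 1)
    return single, len(values)


# Toy data for illustration only; rok is given two values, as in its paradigm.
pairs = [
    ("rok", "Masc"),
    ("rok", "Neut"),
    ("éra", "Fem"),
    ("éra", "Fem"),
    ("město", "Neut"),
]
single, total = single_gender_share(pairs)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A feature that behaves lexically gives a share close to 100%, while an inflectional feature such as Gender on adjectives would give a much lower one.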

ADJ

2230 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Polarity=Pos (2061; 92%), VerbForm=EMPTY (1816; 81%), Voice=EMPTY (1816; 81%), Degree=Pos (1748; 78%), Number=Sing (1459; 65%), Animacy=EMPTY (1362; 61%).

ADJ tokens may have the following values of Gender:

Paradigm známý                                              Fem,Neut  Masc          Fem     Neut
Animacy=Anim|Case=Nom|Degree=Sup|Number=Sing|Polarity=Pos   -         nejznámější   -       -
Animacy=Inan|Case=Acc|Degree=Pos|Number=Sing|Polarity=Pos   -         známý         -       -
Animacy=Inan|Case=Gen|Degree=Pos|Number=Plur|Polarity=Pos   -         známých       -       -
Animacy=Inan|Case=Nom|Degree=Pos|Number=Sing|Polarity=Pos   -         známý         -       -
Animacy=Inan|Case=Nom|Degree=Pos|Number=Plur|Polarity=Pos   -         známé         -       -
Case=Nom|Degree=Pos|Number=Sing|Polarity=Neg                -         -             -       neznámé
Case=Nom|Degree=Pos|Number=Sing|Polarity=Pos                -         -             známá   -
Number=Sing|Polarity=Pos|Variant=Short                      -         -             -       známo
Number=Plur,Sing|Polarity=Pos|Variant=Short                 známa     -             -       -

PROPN

966 PROPN tokens (89% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Polarity=Pos (966; 100%), Foreign=EMPTY (897; 93%), Number=Sing (834; 86%).

PROPN tokens may have the following values of Gender:

Paradigm Andy                                    Masc      Fem
Animacy=Anim|Case=Dat|NameType=Giv|Number=Sing   Andymu    -
Animacy=Anim|Case=Gen|NameType=Giv|Number=Sing   Andyho    -
Case=Nom|NameType=Geo|Number=Plur                -         Andy

Gender seems to be a lexical feature of PROPN. 99% of lemmas (681) occur with only one value of Gender.

VERB

879 VERB tokens (51% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (879; 100%), Person=EMPTY (879; 100%), Voice=Act (879; 100%), Tense=Past (878; 100%), VerbForm=Part (877; 100%), Polarity=Pos (843; 96%), Animacy=EMPTY (692; 79%), Number=Sing (492; 56%).

VERB tokens may have the following values of Gender:

Paradigm začít             Fem,Masc  Fem,Neut  Masc    Fem     Neut
Animacy=Anim|Number=Plur   -         -         začali  -       -
Animacy=Inan|Number=Plur   začaly    -         -       -       -
Number=Sing                -         -         začal   začala  začalo
Number=Plur,Sing           -         začala    -       -       -

DET

644 DET tokens (76% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (599; 93%), Person=EMPTY (599; 93%), Reflex=EMPTY (570; 89%), Animacy=EMPTY (549; 85%), Poss=EMPTY (525; 82%), Number=Sing (519; 81%), Case=Nom (327; 51%).

DET tokens may have the following values of Gender:

Paradigm ten                        Masc  Masc,Neut  Fem   Neut
Animacy=Anim|Case=Acc|Number=Plur   ty    -          -     -
Animacy=Inan|Case=Acc|Number=Sing   ten   -          -     -
Animacy=Inan|Case=Nom|Number=Plur   ty    -          -     -
Case=Acc|Number=Sing                -     -          -     to
Case=Acc|Number=Plur                -     -          ty    -
Case=Dat|Number=Sing                -     tomu       -     -
Case=Gen|Number=Sing                -     toho       -     -
Case=Ins|Number=Sing                -     tím        tou   -
Case=Loc|Number=Sing                -     tom        -     -
Case=Nom|Number=Sing                ten   -          ta    to
Case=Nom|Number=Plur                -     -          ty    -

AUX

260 AUX tokens (38% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (260; 100%), Person=EMPTY (260; 100%), Tense=Past (260; 100%), VerbForm=Part (260; 100%), Voice=Act (260; 100%), Polarity=Pos (242; 93%), Number=Sing (155; 60%).

AUX tokens may have the following values of Gender:

Paradigm být                            Fem,Masc  Fem,Neut  Masc    Fem     Neut
Animacy=Anim|Number=Plur|Polarity=Pos   -         -         byli    -       -
Animacy=Inan|Number=Plur|Polarity=Neg   nebyly    -         -       -       -
Animacy=Inan|Number=Plur|Polarity=Pos   byly      -         -       -       -
Aspect=Imp|Number=Sing|Polarity=Neg     -         -         byl     -       -
Number=Sing|Polarity=Neg                -         -         nebyl   -       nebylo
Number=Sing|Polarity=Pos                -         -         byl     byla    bylo
Number=Plur,Sing|Polarity=Neg           -         nebyla    -       -       -
Number=Plur,Sing|Polarity=Pos           -         byla      -       -       -

PRON

103 PRON tokens (18% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (103; 100%), Number=Sing (85; 83%), Variant=EMPTY (83; 81%), Person=3 (71; 69%), PronType=Prs (71; 69%), PrepCase=EMPTY (62; 60%).

PRON tokens may have the following values of Gender:

Paradigm on                          Masc  Masc,Neut       Fem   Neut
Animacy=Anim|Case=Nom|Number=Plur    oni   -               -     -
Case=Acc|Number=Sing|PrepCase=Pre    -     něj, ho, něho   -     -
Case=Acc|Number=Sing                 -     -               ji    je
Case=Acc|Number=Sing|Variant=Short   -     ho              -     -
Case=Dat|Number=Sing|PrepCase=Pre    -     němu            -     -
Case=Dat|Number=Sing                 -     -               -     -
Case=Dat|Number=Sing|Variant=Short   -     mu              -     -
Case=Gen|Number=Sing|PrepCase=Pre    -     něj             -     -
Case=Gen|Number=Sing                 -     -               -     -
Case=Ins|Number=Sing|PrepCase=Pre    -     ním             -     -
Case=Ins|Number=Sing                 -     jím             -     -
Case=Loc|Number=Sing|PrepCase=Pre    -     něm             -     -
Case=Nom|Number=Sing                 on    -               ona   -

NUM

62 NUM tokens (14% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (62; 100%), NumType=Card (62; 100%), NumValue=1,2,3 (62; 100%), Number=Sing (38; 61%).

NUM tokens may have the following values of Gender:

Paradigm jeden   Masc    Masc,Neut  Fem     Neut
Case=Acc         jeden   -          jednu   jedno
Case=Gen         -       jednoho    jedné   -
Case=Ins         -       jedním     jednou  -
Case=Loc         -       jednom     jedné   -
Case=Nom         jeden   -          jedna   jedno
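
A paradigm table of this kind can be viewed as a pivot: rows are the remaining feature values, columns are Gender values, and each cell holds the word forms observed with that combination. A minimal sketch follows, with build_paradigm as a hypothetical helper; the forms come from the jeden paradigm, and their Gender labels are supplied here by hand following standard Czech declension.

```python
from collections import defaultdict


def build_paradigm(observations):
    """Group word forms by (other features) x (Gender value).

    observations: iterable of (form, feats), where feats maps a feature
    name to its value string, e.g. {"Case": "Nom", "Gender": "Masc"}.
    Returns {row label: {Gender value: set of forms}}.
    """
    grid = defaultdict(lambda: defaultdict(set))
    for form, feats in observations:
        gender = feats.get("Gender")
        if gender is None:
            continue  # forms without Gender do not appear in the table
        # Row label: all features except Gender, in Name=Value|Name=Value form.
        row = "|".join(f"{k}={v}" for k, v in sorted(feats.items()) if k != "Gender")
        grid[row][gender].add(form)
    return grid


observations = [
    ("jeden", {"Case": "Nom", "Gender": "Masc"}),
    ("jedna", {"Case": "Nom", "Gender": "Fem"}),
    ("jedno", {"Case": "Nom", "Gender": "Neut"}),
    ("jednoho", {"Case": "Gen", "Gender": "Masc,Neut"}),
    ("jedné", {"Case": "Gen", "Gender": "Fem"}),
]
grid = build_paradigm(observations)
for row in sorted(grid):
    cells = ", ".join(f"{g}: {'/'.join(sorted(grid[row][g]))}" for g in sorted(grid[row]))
    print(f"{row}  ->  {cells}")
```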

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (1695; 99%), VERB –[nsubj]–> PROPN (141; 66%), ADJ –[aux:pass]–> AUX (128; 78%), PROPN –[flat]–> PROPN (124; 88%), PROPN –[amod]–> ADJ (87; 99%), VERB –[conj]–> VERB (75; 63%), PROPN –[nmod]–> NOUN (60; 87%), ADJ –[nsubj]–> NOUN (54; 78%), ADJ –[conj]–> ADJ (44; 86%), PROPN –[conj]–> PROPN (33; 57%).
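
One way to obtain figures of this kind is to compare the Gender value of each child node with that of its parent and tally the result per (parent tag, relation, child tag) triple. The sketch below assumes that agreement means identical Gender strings and that only pairs where both nodes carry Gender are counted; both are assumptions, since this page does not define them, and the token dictionaries are a hypothetical input format.

```python
from collections import Counter


def gender_agreement(tokens):
    """Count Gender agreement between each child and its parent in one sentence.

    tokens: list of dicts with keys id, upos, head, deprel, gender
    (gender is None or "" when the token has no Gender feature).
    Returns (agree, both): Counters keyed by (parent UPOS, deprel, child UPOS).
    """
    by_id = {t["id"]: t for t in tokens}
    agree, both = Counter(), Counter()
    for child in tokens:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or not child["gender"] or not parent["gender"]:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        # Assumption: strict string equality, so Fem vs Fem,Neut is a mismatch.
        if child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, both


# Hypothetical three-token sentence, made up for illustration.
tokens = [
    {"id": 1, "upos": "VERB", "head": 0, "deprel": "root", "gender": "Fem,Neut"},
    {"id": 2, "upos": "ADJ", "head": 3, "deprel": "amod", "gender": "Fem"},
    {"id": 3, "upos": "NOUN", "head": 1, "deprel": "nsubj", "gender": "Fem"},
]
agree, both = gender_agreement(tokens)
for key in sorted(both):
    parent, rel, child = key
    share = 100 * agree[key] / both[key]
    print(f"{parent} -[{rel}]-> {child} ({agree[key]}; {share:.0f}%)")
```

Under strict equality, a subject tagged Fem does not agree with a participle tagged Fem,Neut. A looser definition that tests for overlapping values would count such pairs as agreeing.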