
This page pertains to UD version 2.

Treebank Statistics: UD_German-HDT: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut. Some words have combined values of the feature; 1 combination has been observed: Masc|Neut.

This is a layered feature with the following layers: Gender, Gender[psor].
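As a minimal sketch of how combined values and layers appear in the data: in CoNLL-U, the FEATS column holds `Name=Value` pairs separated by `|`, a combined value is written comma-separated (`Gender=Masc,Neut`), and a layer is written in square brackets (`Gender[psor]`). The function names and the example string below are hypothetical and not taken from the treebank.

```python
from typing import Dict, List, Optional, Tuple


def parse_feats(feats: str) -> Dict[str, List[str]]:
    """Parse a CoNLL-U FEATS string into {feature name: list of values}."""
    if feats in ("", "_"):  # "_" marks an empty FEATS column
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        # A combined value such as Masc,Neut is comma-separated in CoNLL-U.
        parsed[name] = value.split(",")
    return parsed


def split_layer(name: str) -> Tuple[str, Optional[str]]:
    """Split a layered name like 'Gender[psor]' into ('Gender', 'psor')."""
    if name.endswith("]") and "[" in name:
        base, layer = name[:-1].split("[", 1)
        return base, layer
    return name, None


# Hypothetical FEATS string, for illustration only.
example = parse_feats("Case=Dat|Gender=Masc,Neut|Gender[psor]=Fem|Number=Sing")
gender_layers = {n: v for n, v in example.items() if split_layer(n)[0] == "Gender"}
print(gender_layers)
```

Keeping the values as a list lets a combined value be handled the same way as a single one.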

1236230 tokens (36%) have a non-empty value of Gender. 116363 types (62%) occur at least once with a non-empty value of Gender. 93344 lemmas (64%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (684409; 20% of instances), DET (395436; 11% of instances), ADJ (84146; 2% of instances), PRON (44113; 1% of instances), PROPN (27734; 1% of instances), ADV (188; 0% of instances), X (178; 0% of instances), NUM (26; 0% of instances).
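Per-tag counts of this kind could be reproduced roughly as follows. This is a sketch, not the script that generated this page: it assumes standard 10-column CoNLL-U input, counts only the base `Gender` feature (not `Gender[psor]`), and skips multiword-token ranges and empty nodes. The three-token sample and the name `gender_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U format (tab-separated, 10 columns);
# illustrative only, not taken from UD_German-HDT.
SAMPLE = "\n".join([
    "# text = Der Hund bellt",
    "1\tDer\tder\tDET\t_\tCase=Nom|Gender=Masc|Number=Sing\t2\tdet\t_\t_",
    "2\tHund\tHund\tNOUN\t_\tCase=Nom|Gender=Masc|Number=Sing\t3\tnsubj\t_\t_",
    "3\tbellt\tbellen\tVERB\t_\tNumber=Sing|Person=3\t0\troot\t_\t_",
    "",
])


def gender_counts(conllu_text):
    """Return (tokens per UPOS, tokens per UPOS with a Gender value)."""
    total = Counter()
    with_gender = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        upos, feats = cols[3], cols[5]
        total[upos] += 1
        names = [] if feats == "_" else [p.split("=")[0] for p in feats.split("|")]
        if "Gender" in names:  # exact match, so Gender[psor] is not counted
            with_gender[upos] += 1
    return total, with_gender


total, with_gender = gender_counts(SAMPLE)
for upos in sorted(total):
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]} of {total[upos]} tokens ({share:.0f}%) have Gender")
```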

NOUN

684409 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Case=EMPTY (608206; 89%), Number=Sing (450013; 66%).

NOUN tokens may have the following values of Gender:

Paradigm Deutsch (Gender values: Masc; Fem; Neut)
Case=Acc: Deutschen
Case=Nom: Deutsche, Deutscher
(no other features): Deutsche, Deutscher; Deutsch, Deutsche, Deutschen

Gender seems to be a lexical feature of NOUN: 100% of lemmas (85728) occur with only one value of Gender.
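The lexical-feature observation rests on counting how many lemmas occur with exactly one Gender value. A minimal sketch of that count follows; the (lemma, Gender) pairs are hypothetical, the name `single_value_share` is an assumption, and the threshold at which the page reports a feature as "lexical" is not stated in the source.

```python
from collections import defaultdict


def single_value_share(observations):
    """Count lemmas seen with exactly one Gender value.

    observations: iterable of (lemma, gender) pairs for one part-of-speech tag.
    Returns (lemmas with a single value, all lemmas seen with Gender).
    """
    values_by_lemma = defaultdict(set)
    for lemma, gender in observations:
        values_by_lemma[lemma].add(gender)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


# Hypothetical observations, for illustration only.
obs = [
    ("Hund", "Masc"), ("Hund", "Masc"),
    ("Katze", "Fem"),
    ("Kind", "Neut"),
    ("See", "Masc"), ("See", "Fem"),  # one lemma seen with two values
]
single, total_lemmas = single_value_share(obs)
print(f"{single} of {total_lemmas} lemmas ({100 * single / total_lemmas:.0f}%) "
      "occur with only one value of Gender")
```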

DET

395436 DET tokens (80% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (394886; 100%), PronType=Art (356611; 90%), NumType=EMPTY (326524; 83%), Definite=Def (287697; 73%).

DET tokens may have the following values of Gender:

Paradigm der (Gender values: Masc; Masc,Neut; Fem; Neut)
Case=Acc: den, der; die; das, 's
Case=Dat: dem, des; dem; der, die; dem, das, des
Case=Gen: des, der; der; des
Case=Nom: der; die, der; das

ADJ

84146 ADJ tokens (32% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Variant=EMPTY (84145; 100%), Number=Sing (75312; 90%), Degree=Pos (74297; 88%), Case=EMPTY (55347; 66%).

ADJ tokens may have the following values of Gender:

Paradigm neu (Gender values: Masc; Fem; Neut)
Case=Acc|Degree=Pos|Number=Sing: neuen
Case=Acc|Degree=Pos|Number=Plur: neuen; neuen; neuen
Case=Acc|Degree=Cmp|Number=Sing: neueren
Case=Acc|Degree=Sup|Number=Sing: neuesten
Case=Acc|Degree=Sup|Number=Plur: neuesten; neuesten, neusten; neuesten, neusten
Case=Dat|Degree=Pos|Number=Sing: neuen; neuen, neuer; neuen
Case=Dat|Degree=Pos|Number=Plur: neuen; neuen; neuen
Case=Dat|Degree=Cmp|Number=Plur: neueren; neueren
Case=Dat|Degree=Sup|Number=Sing: neuesten
Case=Dat|Degree=Sup|Number=Plur: neuesten
Case=Gen|Degree=Pos|Number=Plur: neuen, neuer; neuer, neuen; neuer, neuen
Case=Nom|Degree=Pos|Number=Sing: neue, neuer; neues
Case=Nom|Degree=Pos|Number=Plur: neuen; neuen; neuen
Case=Nom|Degree=Cmp|Number=Sing: neuere, neuerer
Case=Nom|Degree=Cmp|Number=Plur: neueren
Case=Nom|Degree=Sup|Number=Sing: neueste, neuester, neuste
Case=Nom|Degree=Sup|Number=Plur: neuesten
Degree=Pos|Number=Sing: neuen; neue, neuen, neuer; neues, neue, neuen
Degree=Pos|Number=Plur: neue; neue, neuen; neue, neuen
Degree=Cmp|Number=Sing: neueren; neuere, neueren, neuerer; neuere, neueres
Degree=Cmp|Number=Plur: Neuere; Neuere
Degree=Sup|Number=Sing: neuesten; neueste, neuester, neuesten, neusten; neueste, neuestes, neuesten
Degree=Sup|Number=Plur: neueste

PRON

44113 PRON tokens (47% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (44113; 100%), Reflex=EMPTY (44113; 100%), Case=Nom (34207; 78%), Person=3 (22812; 52%), PronType=Prs (22810; 52%).

PRON tokens may have the following values of Gender:

Paradigm der (Gender values: Masc; Fem; Neut)
Abbr=Yes|Case=Nom: d.
Case=Acc: den; die; das
Case=Dat: dem; der; dem
Case=Gen: dessen; derer, Deren; dessen
Case=Nom: der; die; das
Case=Nom|Typo=Yes: da

Gender seems to be a lexical feature of PRON: 93% of lemmas (13) occur with only one value of Gender.

PROPN

27734 PROPN tokens (14% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (27723; 100%), Case=EMPTY (25062; 90%).

PROPN tokens may have the following values of Gender:

Paradigm Nylis (Gender values: Masc; Fem)
(no other features): Nylis; Nylis

Gender seems to be a lexical feature of PROPN: 100% of lemmas (1583) occur with only one value of Gender.

ADV

188 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADV and Gender co-occurred: PronType=Ind (187; 99%).

ADV tokens may have the following values of Gender:

Paradigm meist (Gender values: Masc; Fem; Neut)
Case=Acc: meisten
(no other features): meiste; meiste

X

178 X tokens (0% of all X tokens) have a non-empty value of Gender.

The most frequent other feature values with which X and Gender co-occurred: Foreign=Yes (178; 100%).

X tokens may have the following values of Gender:

NUM

26 NUM tokens (0% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (26; 100%), Number=Sing (26; 100%).

NUM tokens may have the following values of Gender:

Paradigm ein (Gender values: Masc; Fem; Neut)
Case=Acc: einen; eine; ein
Case=Dat: einem; einer; einem
Case=Nom: eine; ein

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (297197; 67%), DET –[nmod]–> NOUN (1262; 65%), ADJ –[conj]–> ADJ (592; 77%), NOUN –[expl]–> PRON (251; 61%), DET –[conj]–> NOUN (50; 52%), DET –[conj]–> DET (44; 54%), DET –[nsubj]–> PRON (44; 54%), DET –[det]–> PRON (35; 100%), PRON –[appos]–> DET (31; 97%), ADJ –[det]–> PRON (29; 97%).
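Agreement counts like these could be gathered per dependency edge, as in the sketch below. Several choices here are assumptions rather than facts about this page: the percentage is taken over every edge of the same parent tag, relation, and child tag; only the base `Gender` feature is compared; and values are compared as plain strings, so `Masc,Neut` does not match `Masc`. The four-token sample is hypothetical and deliberately includes one disagreeing edge.

```python
from collections import Counter


def gender_of(feats):
    """Return the Gender value from a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        if name == "Gender":
            return value
    return None


def agreement_by_relation(sentence):
    """sentence: list of (id, upos, feats, head, deprel) tuples for one tree.

    Returns (agreeing edges, all edges), both keyed by
    'PARENT_UPOS -[deprel]-> CHILD_UPOS'.
    """
    by_id = {tok[0]: tok for tok in sentence}
    agree, edges = Counter(), Counter()
    for tid, upos, feats, head, deprel in sentence:
        parent = by_id.get(head)
        if parent is None:  # head 0 is the artificial root
            continue
        key = f"{parent[1]} -[{deprel}]-> {upos}"
        edges[key] += 1
        child_gender = gender_of(feats)
        if child_gender is not None and child_gender == gender_of(parent[2]):
            agree[key] += 1
    return agree, edges


# Hypothetical sentence, for illustration only.
sentence = [
    (1, "DET", "Case=Nom|Gender=Masc|Number=Sing", 2, "det"),
    (2, "NOUN", "Case=Nom|Gender=Masc|Number=Sing", 3, "nsubj"),
    (3, "VERB", "Number=Sing|Person=3", 0, "root"),
    (4, "DET", "Gender=Fem|Number=Sing", 2, "det"),  # disagrees with its parent
]
agree, edges = agreement_by_relation(sentence)
for key, count in edges.most_common():
    print(f"{key} ({agree[key]}; {100 * agree[key] / count:.0f}%)")
```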