
This page pertains to UD version 2.

Treebank Statistics: UD_German-GSD: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

This is a layered feature with the following layers: Gender, Gender[psor].
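To illustrate what a layered feature looks like in practice, the following minimal sketch parses a CoNLL-U FEATS string and separates the base feature name from its layer. The helper names parse_feats and split_layer, and the sample FEATS string, are invented for illustration and are not taken from the treebank.

```python
from typing import Optional

def parse_feats(feats: str) -> dict:
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|Gender=Masc' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def split_layer(name: str) -> tuple:
    """Split 'Gender[psor]' into ('Gender', 'psor'); a plain 'Gender' has layer None."""
    layer: Optional[str] = None
    if name.endswith("]") and "[" in name:
        name, layer = name[:-1].split("[", 1)
    return name, layer

# Invented example string, not a real treebank token.
feats = parse_feats("Case=Nom|Gender=Fem|Gender[psor]=Masc|Number=Sing")

# Keep every layer of Gender, whatever its layer name is.
gender_layers = {k: v for k, v in feats.items() if split_layer(k)[0] == "Gender"}
print(gender_layers)  # {'Gender': 'Fem', 'Gender[psor]': 'Masc'}
```

The statistics on the rest of this page concern the plain Gender layer; a count that matched on the prefix "Gender" alone would also pick up Gender[psor].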

133586 tokens (46%) have a non-empty value of Gender. 40691 types (80%) occur at least once with a non-empty value of Gender. 34870 lemmas (83%) occur at least once with a non-empty value of Gender. The feature is used with 9 part-of-speech tags: NOUN (50955; 17% of instances), DET (35812; 12% of instances), PROPN (26200; 9% of instances), ADJ (14123; 5% of instances), PRON (6248; 2% of instances), NUM (102; 0% of instances), X (79; 0% of instances), ADV (57; 0% of instances), SYM (10; 0% of instances).
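Counts of this kind can be approximated directly from a CoNLL-U file. The sketch below is not the script that generated this page; it assumes the standard 10-column CoNLL-U layout, skips multiword-token ranges and empty nodes, and runs on a small invented sample (SAMPLE) rather than on the treebank itself.

```python
from collections import Counter

# Tiny invented sample in CoNLL-U format (tab-separated, 10 columns).
SAMPLE = """\
# text = Der Tag beginnt.
1\tDer\tder\tDET\tART\tCase=Nom|Definite=Def|Gender=Masc|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tTag\tTag\tNOUN\tNN\tCase=Nom|Gender=Masc|Number=Sing\t3\tnsubj\t_\t_
3\tbeginnt\tbeginnen\tVERB\tVVFIN\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_
4\t.\t.\tPUNCT\t$.\t_\t3\tpunct\t_\t_
"""

def token_rows(conllu):
    """Yield the column lists of regular tokens, skipping comments and blank lines."""
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # IDs like '1-2' (multiword tokens) or '1.1' (empty nodes) are skipped.
        if len(cols) == 10 and cols[0].isdigit():
            yield cols

def gender_counts(conllu):
    """Return (tokens per UPOS, tokens per UPOS that carry a Gender value)."""
    total, with_gender = Counter(), Counter()
    for cols in token_rows(conllu):
        upos, feats = cols[3], cols[5]
        total[upos] += 1
        names = set() if feats == "_" else {p.split("=", 1)[0] for p in feats.split("|")}
        if "Gender" in names:
            with_gender[upos] += 1
    return total, with_gender

total, with_gender = gender_counts(SAMPLE)
for upos, n in with_gender.most_common():
    print(f"{upos}: {n} of {total[upos]} tokens have a non-empty value of Gender")
```

To run it on a real file, read the file contents into a string and pass that to gender_counts in place of SAMPLE.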

NOUN

50955 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (36849; 72%).

NOUN tokens may have the following values of Gender:

Paradigm Tag: Masc / Fem / Neut
Case=Acc|Number=Sing: Tag
Case=Acc|Number=Plur: Tage
Case=Dat|Number=Sing: Tag, Tage
Case=Dat|Number=Plur: Tagen
Case=Gen|Number=Sing: Tages, Tags
Case=Gen|Number=Plur: Tage / Tages
Case=Nom|Number=Sing: Tag / Tage
Case=Nom|Number=Plur: Tage

Gender seems to be a lexical feature of NOUN: 94% of lemmas (16946) occur with only one value of Gender.
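The "lexical feature" observation rests on a simple check: for each lemma, collect the set of Gender values it occurs with and count the lemmas whose set has exactly one member. A minimal sketch of that check follows; the function name single_gender_share and the observations are invented for illustration, and the real page may compute the figure differently.

```python
from collections import defaultdict

def single_gender_share(observations):
    """Take (lemma, gender) pairs for one part-of-speech tag.

    Returns (number of lemmas seen with exactly one gender, number of lemmas).
    """
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    single = sum(1 for genders in values.values() if len(genders) == 1)
    return single, len(values)

# Invented observations; "Teil" stands in for a lemma seen with two genders.
observations = [
    ("Tag", "Masc"), ("Tag", "Masc"),
    ("Frau", "Fem"),
    ("Kind", "Neut"),
    ("Teil", "Masc"), ("Teil", "Neut"),
]

single, lemmas = single_gender_share(observations)
print(f"{single} of {lemmas} lemmas occur with only one value of Gender")
# prints: 3 of 4 lemmas occur with only one value of Gender
```

A high share suggests the value is a property of the lemma rather than of the individual word form, which is what one expects for nouns and not for determiners or adjectives.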

DET

35812 DET tokens (87% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (33990; 95%), NumType=EMPTY (30232; 84%), PronType=Art (30043; 84%), Definite=Def (24594; 69%).
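Co-occurrence figures like these can be reproduced by counting, among the tokens that carry Gender, the value each token has for every other feature of interest. The sketch below reads EMPTY as "the token has no value for that feature"; this reading is an assumption based on how the value appears on this page. The function name cooccurrence and the sample tokens are invented.

```python
from collections import Counter

def cooccurrence(feature_dicts, target="Gender", others=("Number", "PronType")):
    """Count values of `others` on tokens that have a value for `target`.

    A missing feature is recorded as '<name>=EMPTY' (assumed meaning of EMPTY).
    Returns (number of tokens with `target`, Counter of 'name=value' strings).
    """
    n = 0
    counts = Counter()
    for feats in feature_dicts:
        if target not in feats:
            continue
        n += 1
        for name in others:
            counts[f"{name}={feats.get(name, 'EMPTY')}"] += 1
    return n, counts

# Invented tokens, each given as a dict of its features.
tokens = [
    {"Gender": "Masc", "Number": "Sing", "PronType": "Art"},
    {"Gender": "Fem", "Number": "Sing"},
    {"Number": "Plur", "PronType": "Art"},  # no Gender, so it is ignored
]

n, counts = cooccurrence(tokens)
for value, count in counts.most_common():
    print(f"{value} ({count}; {100 * count // n}%)")
```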

DET tokens may have the following values of Gender:

Paradigm der: Masc / Fem / Neut
Case=Acc|Number=Sing: den / die / das, 's
Case=Acc|Number=Plur: den
Case=Dat|Number=Sing: dem, der, des / der, die / dem, das, des
Case=Gen|Number=Sing: des, der / der / des, der
Case=Gen|Number=Plur: der / der
Case=Nom|Number=Sing: der / die / das

PROPN

26200 PROPN tokens (86% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (25073; 96%).

PROPN tokens may have the following values of Gender:

Paradigm Deutschland: Masc / Fem / Neut
Case=Acc: Deutschland
Case=Dat: Deutschland / Deutschland
Case=Gen: Deutschlands, Deutschland
Case=Nom: Deutschland / Deutschland

Gender seems to be a lexical feature of PROPN: 91% of lemmas (13215) occur with only one value of Gender.

ADJ

14123 ADJ tokens (65% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (13050; 92%), Number=Sing (9922; 70%).

ADJ tokens may have the following values of Gender:

Paradigm erst: Masc / Fem / Neut
Case=Acc|Number=Sing: ersten / erste / erste, erstes
Case=Acc|Number=Plur: ersten, erste / erste, ersten / erste, ersten
Case=Dat|Number=Sing: ersten / ersten, erster / ersten
Case=Dat|Number=Plur: ersten / ersten / ersten
Case=Gen|Number=Sing: ersten / ersten / ersten
Case=Gen|Number=Plur: ersten / ersten / ersten
Case=Nom|Number=Sing: erste, erster / erste / erste, erstes
Case=Nom|Number=Plur: ersten, erste / ersten, erste / ersten, Erste

PRON

6248 PRON tokens (58% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (6248; 100%), Number=Sing (6220; 100%), Case=Nom (4874; 78%), PronType=Prs (4292; 69%), Person=3 (4268; 68%).

PRON tokens may have the following values of Gender:

Paradigm der: Masc / Fem / Neut
Case=Acc: den, der / die / das
Case=Dat: dem, der / der / dem, Das
Case=Gen: dessen / deren, der, derer / dessen
Case=Nom: der, die / die / das, die

NUM

102 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (102; 100%).

NUM tokens may have the following values of Gender:

Paradigm 2: Masc / Fem / Neut
Case=Acc: 2
Case=Dat: 2
Case=Nom: 2

X

79 X tokens (25% of all X tokens) have a non-empty value of Gender.

The most frequent other feature values with which X and Gender co-occurred: Foreign=EMPTY (79; 100%), Number=Sing (60; 76%).

X tokens may have the following values of Gender:

Paradigm B.: Masc / Fem / Neut
Case=Dat: B.
Case=Nom: B. / B.

Gender seems to be a lexical feature of X: 92% of lemmas (46) occur with only one value of Gender.

ADV

57 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.

ADV tokens may have the following values of Gender:

Paradigm ca: Fem / Neut
Case=Acc: ca / ca
Case=Dat: ca

Gender seems to be a lexical feature of ADV: 91% of lemmas (43) occur with only one value of Gender.

SYM

10 SYM tokens (10% of all SYM tokens) have a non-empty value of Gender.

SYM tokens may have the following values of Gender:

Paradigm °: Masc / Fem
° / °

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (26061; 84%), NOUN –[amod]–> ADJ (11914; 91%), PROPN –[flat]–> PROPN (4768; 82%), PROPN –[det]–> DET (4538; 82%), NOUN –[det:poss]–> DET (2173; 95%), NOUN –[appos]–> PROPN (1762; 55%), PROPN –[conj]–> PROPN (1313; 63%), PROPN –[amod]–> PROPN (1060; 75%), NOUN –[compound]–> NOUN (667; 78%), PROPN –[flat]–> NOUN (659; 84%).
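Agreement counts of this kind can be sketched by walking each dependency edge and comparing the Gender of the parent with that of the child. In the sketch below the denominator is the number of edges where both nodes carry a Gender value; the page does not state how its percentages are computed, so that choice is an assumption. Values are compared as plain strings, so multi-valued features such as "Masc,Neut" would need extra handling. The function name and the sample sentence are invented.

```python
from collections import Counter

def agreement_by_relation(sentence):
    """Count gender agreement per (parent UPOS, relation, child UPOS).

    `sentence` is a list of dicts with keys id, head, upos, deprel, gender
    (gender is None when the token has no Gender value).
    Returns (agreeing edges, edges where both nodes have Gender).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # head 0 (root) has no parent token
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, both

# Invented sentence fragment: "der erste Tag".
sentence = [
    {"id": 1, "head": 3, "upos": "DET", "deprel": "det", "gender": "Masc"},
    {"id": 2, "head": 3, "upos": "ADJ", "deprel": "amod", "gender": "Masc"},
    {"id": 3, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Masc"},
]

agree, both = agreement_by_relation(sentence)
for (parent, rel, child), n in agree.most_common():
    print(f"{parent} -[{rel}]-> {child} ({n}; {100 * n // both[(parent, rel, child)]}%)")
```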