Treebank Statistics: UD_German-GSD: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
133583 tokens (46%) have a non-empty value of Gender.
40691 types (80%) occur at least once with a non-empty value of Gender.
34870 lemmas (83%) occur at least once with a non-empty value of Gender.
The feature is used with 9 part-of-speech tags: NOUN (50955; 17% of all tokens), DET (35812; 12%), PROPN (26200; 9%), ADJ (14124; 5%), PRON (6245; 2%), NUM (102; 0%), X (78; 0%), ADV (57; 0%), SYM (10; 0%).
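Counts like the ones above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` fragment is hand-made for illustration, and the helper names `parse_feats` and `gender_counts` are hypothetical. Because layered features are stored under their own key (for example `Gender[psor]`), looking up the plain key `Gender` counts only the base layer.

```python
from collections import Counter

# Tiny hand-made CoNLL-U fragment (hypothetical, not taken from UD_German-GSD).
SAMPLE = """\
1\tDie\tder\tDET\tART\tCase=Nom|Gender=Fem|Number=Sing\t2\tdet\t_\t_
2\tStadt\tStadt\tNOUN\tNN\tCase=Nom|Gender=Fem|Number=Sing\t3\tnsubj\t_\t_
3\twächst\twachsen\tVERB\tVVFIN\tNumber=Sing|Person=3\t0\troot\t_\t_
4\tschnell\tschnell\tADV\tADV\t_\t3\tadvmod\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts(conllu_text):
    """Count, per UPOS tag, all tokens and tokens with a non-empty Gender."""
    total, with_gender = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences or a comment line
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return total, with_gender


total, with_gender = gender_counts(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], "of", total[upos])
```

Dividing `with_gender[upos]` by `total[upos]` gives a per-tag share of the kind reported in the sections below.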
NOUN
50955 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (36849; 72%).
NOUN tokens may have the following values of Gender:
Fem (21199; 42% of non-empty Gender): Zeit, Stadt, Familie, Gemeinde, Saison, Frau, Gruppe, Region, Geschichte, Kirche
Masc (18417; 36% of non-empty Gender): Teil, Ort, Menschen, Platz, Sohn, km, Namen, Anfang, Titel, Meter
Neut (11339; 22% of non-empty Gender): jahr, Jahre, Jahren, Prozent, Ende, %, Unternehmen, Kinder, Leben, Mitglied
EMPTY (1341): mm, Eltern, Jahrhundert, Leute, Kosten, °, m, mal, Deutschen, Beschäftigten
| Paradigm Tag | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | Tag | | |
| Case=Acc\|Number=Plur | Tage | | |
| Case=Dat\|Number=Sing | Tag, Tage | | |
| Case=Dat\|Number=Plur | Tagen | | |
| Case=Gen\|Number=Sing | Tages, Tags | | |
| Case=Gen\|Number=Plur | Tage | Tages | |
| Case=Nom\|Number=Sing | Tag | Tage | |
| Case=Nom\|Number=Plur | Tage | | |
Gender seems to be a lexical feature of NOUN: 94% of lemmas (16945) occur with only one value of Gender.
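The "lexical feature" judgement rests on how many lemmas are seen with a single Gender value. A minimal sketch of that check follows; the `observations` list is invented for illustration and `single_valued_share` is a hypothetical helper, not part of the tooling behind this page.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender) observations for NOUN tokens; not real treebank data.
observations = [
    ("Stadt", "Fem"), ("Stadt", "Fem"),
    ("Tag", "Masc"), ("Tag", "Fem"),  # a lemma seen with two different values
    ("Jahr", "Neut"),
    ("Zeit", "Fem"),
]


def single_valued_share(pairs):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, n_lemmas = single_valued_share(observations)
share = 100 * single / n_lemmas
print(f"{single} of {n_lemmas} lemmas ({share:.0f}%) occur with only one value of Gender")
```

A high share suggests the value is fixed per lemma (lexical); a low share suggests it varies with context (inflectional), as with DET and ADJ.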
DET
35812 DET tokens (87% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (33990; 95%), NumType=EMPTY (30232; 84%), PronType=Art (30043; 84%), Definite=Def (24594; 69%).
DET tokens may have the following values of Gender:
Fem (15474; 43% of non-empty Gender): der, die, eine, einer, seine, diese, seiner, dieser, ihre, keine
Masc (12196; 34% of non-empty Gender): dem, der, den, des, ein, einen, einem, eines, seinen, diesem
Neut (8142; 23% of non-empty Gender): dem, das, ein, des, einem, dies, sein, eines, dieses, allem
EMPTY (5393): die, den, der, the, diese, alle, mehr, viel, viele, beiden
| Paradigm der | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | den | die | das, 's |
| Case=Acc\|Number=Plur | den | | |
| Case=Dat\|Number=Sing | dem, der, des | der, die | dem, das, des |
| Case=Gen\|Number=Sing | des, der | der | des, der |
| Case=Gen\|Number=Plur | der | der | |
| Case=Nom\|Number=Sing | der | die | das |
PROPN
26200 PROPN tokens (86% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (25073; 96%).
PROPN tokens may have the following values of Gender:
Fem (5722; 22% of non-empty Gender): SPD, Mark, Universität, Schweiz, US, Maria, DDR, Deutschen, CDU, Straße
Masc (12773; 49% of non-empty Gender): Oktober, US, August, Mai, November, September, Juli, Peter, Weltkrieg, Johann
Neut (7705; 29% of non-empty Gender): Deutschland, Berlin, Frankreich, München, Wien, London, New, Paris, St., Italien
EMPTY (4216): of, de, la, a, University, II, Wiener, Berliner, 1, B
| Paradigm Deutschland | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | Deutschland | | |
| Case=Dat | Deutschland | Deutschland | |
| Case=Gen | Deutschlands, Deutschland | | |
| Case=Nom | Deutschland | Deutschland | |
Gender seems to be a lexical feature of PROPN: 91% of lemmas (13215) occur with only one value of Gender.
ADJ
14124 ADJ tokens (65% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (13050; 92%), Number=Sing (9923; 70%).
ADJ tokens may have the following values of Gender:
Fem (6404; 45% of non-empty Gender): erste, ersten, neue, weitere, große, gute, deutschen, verschiedenen, deutsche, großen
Masc (4726; 33% of non-empty Gender): ersten, zweiten, neuen, großen, erste, weiteren, weitere, heutigen, amerikanischen, neue
Neut (2994; 21% of non-empty Gender): ersten, erste, letzten, weitere, neuen, gleichen, neues, gutes, neue, folgenden
EMPTY (7618): später, gut, bekannt, kurz, freundlich, schnell, lang, neu, direkt, super
| Paradigm erst | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | ersten | erste | erste, erstes |
| Case=Acc\|Number=Plur | ersten, erste | erste, ersten | erste, ersten |
| Case=Dat\|Number=Sing | ersten | ersten, erster | ersten |
| Case=Dat\|Number=Plur | ersten | ersten | ersten |
| Case=Gen\|Number=Sing | ersten | ersten | ersten |
| Case=Gen\|Number=Plur | ersten | ersten | ersten |
| Case=Nom\|Number=Sing | erste, erster | erste | erste, erstes |
| Case=Nom\|Number=Plur | ersten, erste | ersten, erste | ersten, Erste |
PRON
6245 PRON tokens (58% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (6245; 100%), Number=Sing (6218; 100%), Case=Nom (4872; 78%), PronType=Prs (4293; 69%), Person=3 (4268; 68%).
PRON tokens may have the following values of Gender:
Fem (1278; 20% of non-empty Gender): sie, die, der, ihr, deren, ich, mich, wir, She, derer
Masc (3082; 49% of non-empty Gender): er, der, ihm, ihn, dem, dessen, den, ich, wer, sie
Neut (1885; 30% of non-empty Gender): es, das, was, dem, nichts, etwas, it, dessen, ‘s, nix
EMPTY (4599): sich, ich, die, sie, man, wir, uns, mir, mich, denen
| Paradigm der | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | den, der | die | das |
| Case=Dat | dem, der | der | dem, Das |
| Case=Gen | dessen | deren, der, derer | dessen |
| Case=Nom | der, die | die | das, die |
NUM
102 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (102; 100%).
NUM tokens may have the following values of Gender:
Fem (39; 38% of non-empty Gender): Millionen, zweier, 15, Million, 30, 35, 6, 1.681.469, 132,5-165, 1834-1911
Masc (34; 33% of non-empty Gender): 50, 10, 28, 7, -10, -2288,9, -60, 0:2, 0:3, 1
Neut (29; 28% of non-empty Gender): 10, 3, 1:1, ², +7,6, 100, 1000, 17, 1846-1925, 1882-1953
EMPTY (7234): zwei, drei, vier, 2007, 2006, fünf, 2009, 2010, sechs, 2008
| Paradigm 2 | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | 2 | | |
| Case=Dat | 2 | | |
| Case=Nom | 2 | | |
X
78 X tokens (25% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=EMPTY (78; 100%), Number=Sing (59; 76%).
X tokens may have the following values of Gender:
Fem (24; 31% of non-empty Gender): Chr, B., E, S., €, #, B, C, La, MEZ
Masc (20; 26% of non-empty Gender): :-), B., :), ???a?, ??µ?????, A, Fr, Hauswurde, Hl, Min
Neut (34; 44% of non-empty Gender): %, B., Abs, 4Jahren, ???????, Aufl, Az., C., Chr, Gr
EMPTY (233): ’s, u.a., etc., z.B., z., a, †, u, z, *
| Paradigm B. | Masc | Fem | Neut |
|---|---|---|---|
| Case=Dat | B. | | |
| Case=Nom | B. | B. | |
Gender seems to be a lexical feature of X: 92% of lemmas (45) occur with only one value of Gender.
ADV
57 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
ADV tokens may have the following values of Gender:
Fem (17; 30% of non-empty Gender): lange, super, Allzeit, Kehrt, Nahe, Wenige, Zügig, absolute, aka, ca
Masc (21; 37% of non-empty Gender): Abends, Anfangs, ECHT, EINFACH, Ex, Gottlob, Katzelmacher, Křižanov, NIE, NIEMALS
Neut (19; 33% of non-empty Gender): was, ca, Dort, How, Mal, PMMA, Rääts, SEHR, Weitere, erste
EMPTY (13788): auch, nur, noch, sehr, so, dort, wieder, hier, mehr, heute
| Paradigm ca | Fem | Neut |
|---|---|---|
| Case=Acc | ca | ca |
| Case=Dat | ca | |
Gender seems to be a lexical feature of ADV: 91% of lemmas (43) occur with only one value of Gender.
SYM
10 SYM tokens (10% of all SYM tokens) have a non-empty value of Gender.
SYM tokens may have the following values of Gender:
Fem (1; 10% of non-empty Gender): °
Masc (4; 40% of non-empty Gender): :-), o, °, ·
Neut (5; 50% of non-empty Gender): %, ×
EMPTY (91): &, =, /, ×, +, *, €, “, -, :-)
| Paradigm ° | Masc | Fem |
|---|---|---|
| | ° | ° |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (26166; 84%),
NOUN –[amod]–> ADJ (11916; 91%),
PROPN –[flat]–> PROPN (4768; 82%),
PROPN –[det]–> DET (4556; 82%),
NOUN –[det:poss]–> DET (2173; 95%),
NOUN –[appos]–> PROPN (1762; 55%),
PROPN –[conj]–> PROPN (1313; 63%),
PROPN –[amod]–> PROPN (1059; 75%),
NOUN –[compound]–> NOUN (667; 78%),
PROPN –[flat]–> NOUN (659; 84%).
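
Agreement figures of this kind can be approximated by comparing each node's Gender with that of its parent. The sketch below is only an illustration under stated assumptions: the `tokens` list is invented, `agreement_stats` is a hypothetical helper, and the percentage is assumed to be the share of agreeing pairs among pairs where both parent and child carry a Gender value (the page does not state its denominator).

```python
from collections import Counter

# Hypothetical sentence: (id, upos, Gender or None, head id, deprel).
tokens = [
    (1, "DET", "Fem", 3, "det"),
    (2, "ADJ", "Fem", 3, "amod"),
    (3, "NOUN", "Fem", 0, "root"),
    (4, "DET", "Masc", 5, "det"),
    (5, "NOUN", "Neut", 3, "nmod"),
]


def agreement_stats(sentence):
    """Count parent-child pairs with Gender on both nodes, and those that agree."""
    by_id = {tok[0]: tok for tok in sentence}
    both, agree = Counter(), Counter()
    for _tid, upos, gender, head, deprel in sentence:
        parent = by_id.get(head)  # None for the root, whose head is 0
        if parent is None or gender is None or parent[2] is None:
            continue
        key = (parent[1], deprel, upos)  # (parent UPOS, relation, child UPOS)
        both[key] += 1
        if gender == parent[2]:
            agree[key] += 1
    return both, agree


both, agree = agreement_stats(tokens)
for key, n in both.items():
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({agree[key]}; {100 * agree[key] / n:.0f}%)")
```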