Treebank Statistics: UD_German-HDT: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
Some words have combined values of the feature; 1 combination has been observed: Masc|Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
1236231 tokens (36%) have a non-empty value of Gender.
116362 types (62%) occur at least once with a non-empty value of Gender.
93342 lemmas (64%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (684408; 20% instances), DET (395438; 11% instances), ADJ (84146; 2% instances), PRON (44113; 1% instances), PROPN (27734; 1% instances), ADV (188; 0% instances), X (178; 0% instances), NUM (26; 0% instances).
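The per-tag counts above can be reproduced in spirit with a small script. The following is a minimal sketch, assuming the treebank is available as CoNLL-U text; the function names `parse_feats` and `gender_by_upos` and the three-token sample are illustrative and not part of the treebank tooling.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Masc' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))

def gender_by_upos(conllu_text):
    """Count tokens with a non-empty Gender value, grouped by UPOS tag."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        # Exact key match, so the layer Gender[psor] is not counted as Gender.
        if "Gender" in parse_feats(cols[5]):
            counts[cols[3]] += 1
    return counts

# Hypothetical three-token sample, made up for illustration.
sample = "\n".join([
    "# text = Der Markt wächst",
    "1\tDer\tder\tDET\tART\tCase=Nom|Gender=Masc|Number=Sing\t2\tdet\t_\t_",
    "2\tMarkt\tMarkt\tNOUN\tNN\tGender=Masc|Number=Sing\t3\tnsubj\t_\t_",
    "3\twächst\twachsen\tVERB\tVVFIN\tNumber=Sing|Person=3\t0\troot\t_\t_",
])

upos_counts = gender_by_upos(sample)
```

Dividing each count by the total number of tokens would give percentages comparable to those listed per tag.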
NOUN
684408 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Case=EMPTY (608205; 89%), Number=Sing (450011; 66%).
NOUN tokens may have the following values of Gender:
Fem (270252; 39% of non-empty Gender): Millionen, Mark, Milliarden, Firma, Angaben, Software, Zeit, Firmen, Version, Informationen
Masc (249767; 36% of non-empty Gender): US-Dollar, Euro, Markt, Dollar, Hersteller, Computer, Umsatz, Preis, Anfang, Mitarbeiter
Neut (164389; 24% of non-empty Gender): Prozent, Internet, Unternehmen, Jahr, Ende, Quartal, Jahres, Jahren, Netz, Daten
EMPTY (44692): Kunden, Teil, Pentium, Kunde, Teile, Steuern, Befragten, Beschäftigten, informations-, Angestellten
| Paradigm Deutsch | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | Deutschen | ||
| Case=Nom | Deutsche, Deutscher | ||
| Deutsche, Deutscher | Deutsch, Deutsche, Deutschen |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (85727) occur with only one value of Gender.
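The "lexical feature" statement rests on the share of lemmas that occur with exactly one Gender value. A minimal sketch of that calculation follows, assuming the input is a list of (lemma, Gender) pairs for tokens with a non-empty Gender; the helper name `single_value_share` and the sample pairs are hypothetical, and the page's generator may count differently.

```python
from collections import defaultdict

def single_value_share(pairs):
    """Return the fraction of lemmas seen with exactly one Gender value."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    if not values:
        return 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single / len(values)

# Made-up pairs for illustration; not taken from the treebank.
pairs = [
    ("Markt", "Masc"),
    ("Markt", "Masc"),
    ("Firma", "Fem"),
    ("Teil", "Masc"),
    ("Teil", "Neut"),
]

share = single_value_share(pairs)  # 2 of 3 lemmas have a single value
```

A share close to 1 suggests that Gender is fixed per lemma, as reported for NOUN; a lower share suggests an inflectional feature, as with DET or ADJ.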
DET
395438 DET tokens (80% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (394888; 100%), PronType=Art (356613; 90%), NumType=EMPTY (326526; 83%), Definite=Def (287699; 73%).
DET tokens may have the following values of Gender:
Fem (158053; 40% of non-empty Gender): die, der, eine, einer, diese, seiner, seine, dieser, keine, ihre
Masc (108580; 27% of non-empty Gender): der, den, des, dem, einen, ein, einem, eines, diesem, seinen
Masc,Neut (47860; 12% of non-empty Gender): dem
Neut (80945; 20% of non-empty Gender): das, ein, des, dem, einem, allem, dies, dieses, eines, diesem
EMPTY (98929): die, der, den, alle, ihre, diese, keine, viele, anderen, seine
| Paradigm der | Masc | Masc,Neut | Fem | Neut |
|---|---|---|---|---|
| Case=Acc | den, der | | die | das, 's |
| Case=Dat | dem, des | dem | der, die | dem, das, des |
| Case=Gen | des, der | | der | des |
| Case=Nom | der | | die, der | das |
ADJ
84146 ADJ tokens (32% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Variant=EMPTY (84145; 100%), Number=Sing (75312; 90%), Degree=Pos (74297; 88%), Case=EMPTY (55347; 66%).
ADJ tokens may have the following values of Gender:
Fem (38429; 46% of non-empty Gender): neue, deutsche, erste, weitere, eigene, große, nächste, deutschen, digitale, letzte
Masc (26557; 32% of non-empty Gender): neuen, neue, ersten, neuer, deutsche, deutschen, großen, größte, erste, eigenen
Neut (19160; 23% of non-empty Gender): neue, neues, erste, weiteres, ersten, laufende, neuen, eigenes, erstes, zweite
EMPTY (178465): neuen, ersten, deutschen, neue, vergangenen, eigenen, letzten, nächsten, möglich, gut
| Paradigm neu | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Degree=Pos\|Number=Sing | neuen | | |
| Case=Acc\|Degree=Pos\|Number=Plur | neuen | neuen | neuen |
| Case=Acc\|Degree=Cmp\|Number=Sing | neueren | | |
| Case=Acc\|Degree=Sup\|Number=Sing | neuesten | | |
| Case=Acc\|Degree=Sup\|Number=Plur | neuesten | neuesten, neusten | neuesten, neusten |
| Case=Dat\|Degree=Pos\|Number=Sing | neuen | neuen, neuer | neuen |
| Case=Dat\|Degree=Pos\|Number=Plur | neuen | neuen | neuen |
| Case=Dat\|Degree=Cmp\|Number=Plur | neueren | neueren | |
| Case=Dat\|Degree=Sup\|Number=Sing | neuesten | | |
| Case=Dat\|Degree=Sup\|Number=Plur | neuesten | | |
| Case=Gen\|Degree=Pos\|Number=Plur | neuen, neuer | neuer, neuen | neuer, neuen |
| Case=Nom\|Degree=Pos\|Number=Sing | neue, neuer | neues | |
| Case=Nom\|Degree=Pos\|Number=Plur | neuen | neuen | neuen |
| Case=Nom\|Degree=Cmp\|Number=Sing | neuere, neuerer | | |
| Case=Nom\|Degree=Cmp\|Number=Plur | neueren | | |
| Case=Nom\|Degree=Sup\|Number=Sing | neueste, neuester, neuste | | |
| Case=Nom\|Degree=Sup\|Number=Plur | neuesten | | |
| Degree=Pos\|Number=Sing | neuen | neue, neuen, neuer | neues, neue, neuen |
| Degree=Pos\|Number=Plur | neue | neue, neuen | neue, neuen |
| Degree=Cmp\|Number=Sing | neueren | neuere, neueren, neuerer | neuere, neueres |
| Degree=Cmp\|Number=Plur | Neuere | Neuere | |
| Degree=Sup\|Number=Sing | neuesten | neueste, neuester, neuesten, neusten | neueste, neuestes, neuesten |
| Degree=Sup\|Number=Plur | neueste |
PRON
44113 PRON tokens (47% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (44113; 100%), Reflex=EMPTY (44113; 100%), Case=Nom (34207; 78%), Person=3 (22812; 52%), PronType=Prs (22810; 52%).
PRON tokens may have the following values of Gender:
Fem (8545; 19% of non-empty Gender): die, sie, der, ihr, derer, Deren, er/sie
Masc (11303; 26% of non-empty Gender): er, der, dem, den, ihn, ihm, dessen, die/der
Neut (24265; 55% of non-empty Gender): es, das, was, dem, nichts, etwas, ihm, ‘s, dessen, s
EMPTY (50734): sich, die, man, sie, wir, wer, denen, ich, deren, uns
| Paradigm der | Masc | Fem | Neut |
|---|---|---|---|
| Abbr=Yes\|Case=Nom | d. | | |
| Case=Acc | den | die | das |
| Case=Dat | dem | der | dem |
| Case=Gen | dessen | derer, Deren | dessen |
| Case=Nom | der | die | das |
| Case=Nom\|Typo=Yes | da |
Gender seems to be a lexical feature of PRON. 93% of lemmas (13) occur with only one value of Gender.
PROPN
27734 PROPN tokens (14% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (27723; 100%), Case=EMPTY (25062; 90%).
PROPN tokens may have the following values of Gender:
Fem (15265; 55% of non-empty Gender): Telekom, c’t, Europa, AMD, Sun, Telecom, T-Online, Bertelsmann, dpa, Viag
Masc (12440; 45% of non-empty Gender): Bill, Warner, Michael, Thomas, Steve, Ron, John, Jackson, Gerhard, Peter
Neut (29; 0% of non-empty Gender): AppleStore, PowerBooks, KurzFilmFestival, PowerBook, RealVideo, BusinessCall, Deutschland, FeRAMs, G3-PowerBook, InternetTeam
EMPTY (166205): Microsoft, Deutschland, Intel, USA, AOL, ibm, telepolis, Apple, Linux, Windows
| Paradigm Nylis | Masc | Fem |
|---|---|---|
| | Nylis | Nylis |
Gender seems to be a lexical feature of PROPN. 100% of lemmas (1583) occur with only one value of Gender.
ADV
188 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=Ind (187; 99%).
ADV tokens may have the following values of Gender:
Fem (1; 1% of non-empty Gender): meiste
Masc (2; 1% of non-empty Gender): meisten
Neut (185; 98% of non-empty Gender): mehr, weniger, erstenmal, meiste
EMPTY (196405): auch, noch, nur, so, aber, bereits, mehr, allerdings, damit, schon
| Paradigm meist | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | meisten | | |
| | | meiste | meiste |
X
178 X tokens (0% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=Yes (178; 100%).
X tokens may have the following values of Gender:
Neut (178; 100% of non-empty Gender): Inc., Corp.
EMPTY (53513): of, internet, the, and, digital, mobile, media, for, OS, network
NUM
26 NUM tokens (0% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (26; 100%), Number=Sing (26; 100%).
NUM tokens may have the following values of Gender:
Fem (16; 62% of non-empty Gender): eine, einer
Masc (3; 12% of non-empty Gender): einem, einen
Neut (7; 27% of non-empty Gender): ein, einem
EMPTY (71282): zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30
| Paradigm ein | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | einen | eine | ein |
| Case=Dat | einem | einer | einem |
| Case=Nom | | eine | ein |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (297197; 67%),
DET –[nmod]–> NOUN (1262; 65%),
ADJ –[conj]–> ADJ (592; 77%),
NOUN –[expl]–> PRON (250; 61%),
DET –[conj]–> NOUN (50; 52%),
DET –[conj]–> DET (44; 54%),
DET –[nsubj]–> PRON (44; 54%),
DET –[det]–> PRON (35; 100%),
PRON –[appos]–> DET (31; 97%),
ADJ –[det]–> PRON (29; 97%).
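
Agreement counts of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is one possible reading, not the page generator's actual method: it assumes tokens are given as dicts with hypothetical keys `id`, `head`, `upos`, `deprel` and `gender`, considers only edges where both nodes have a non-empty Gender, and treats values as agreeing only when the strings are identical (so a combined value such as Masc,Neut would not match plain Masc).

```python
from collections import Counter

def agreement_counts(tokens):
    """Count parent -[deprel]-> child edges, and those that agree in Gender.

    Returns (agree, total), both keyed by (parent UPOS, deprel, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in tokens}
    agree, total = Counter(), Counter()
    for child in tokens:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None:
            continue
        if child["gender"] is None or parent["gender"] is None:
            continue  # assumption: edges with an empty Gender are ignored
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, total

# Made-up sentence for illustration: "Der Markt wächst".
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Masc"},
    {"id": 2, "head": 3, "upos": "NOUN", "deprel": "nsubj", "gender": "Masc"},
    {"id": 3, "head": 0, "upos": "VERB", "deprel": "root", "gender": None},
]

agree, total = agreement_counts(sentence)
```

For each key, `agree[key] / total[key]` gives an agreement rate that may correspond to the percentages listed above, depending on how the denominator is defined there.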