Treebank Statistics: UD_German-HDT: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
Some words have combined values of the feature; 1 combination has been observed: Masc|Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
1392292 tokens (40%) have a non-empty value of Gender.
125497 types (67%) occur at least once with a non-empty value of Gender.
99515 lemmas (69%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (687983; 20% instances), DET (456716; 13% instances), ADJ (175354; 5% instances), PRON (44113; 1% instances), PROPN (27734; 1% instances), ADV (188; 0% instances), X (178; 0% instances), NUM (26; 0% instances).
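Counts like the ones above can be reproduced from a CoNLL-U file by reading the FEATS column of each token. The following is a minimal sketch, not the script that generated this page: the sample rows are invented for illustration, and the helper names `parse_feats` and `gender_counts_by_upos` are hypothetical. It assumes the standard CoNLL-U layout (10 tab-separated columns, features separated by `|`, multiple values of one feature separated by `,`).

```python
from collections import Counter

# Invented example rows in CoNLL-U column order (not taken from UD_German-HDT).
ROWS = [
    ("1", "Die", "der", "DET", "ART", "Case=Nom|Gender=Fem|Number=Sing", "2", "det", "_", "_"),
    ("2", "Firma", "Firma", "NOUN", "NN", "Case=Nom|Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"),
    ("3", "wächst", "wachsen", "VERB", "VVFIN", "Number=Sing|Person=3", "0", "root", "_", "_"),
    ("4", "schnell", "schnell", "ADV", "ADV", "_", "3", "advmod", "_", "_"),
]
SAMPLE_LINES = ["\t".join(row) for row in ROWS]


def parse_feats(feats):
    """Turn a FEATS string into {feature name: list of values}.

    Layered names such as Gender[psor] are kept as separate keys, and
    combined values such as Masc,Neut become a list with two entries.
    """
    if feats == "_":
        return {}
    result = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        result[name] = value.split(",")
    return result


def gender_counts_by_upos(lines):
    """Return (tokens with Gender per UPOS, all tokens per UPOS)."""
    with_gender = Counter()
    total = Counter()
    for line in lines:
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


with_gender, total = gender_counts_by_upos(SAMPLE_LINES)
for upos in sorted(total):
    print(upos, with_gender[upos], total[upos])
```

Only the plain `Gender` layer is counted here; counting `Gender[psor]` separately would mean checking for that key as well.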
NOUN
687983 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (453591; 66%).
NOUN tokens may have the following values of Gender:
Fem (271365; 39% of non-empty Gender): Millionen, Mark, Milliarden, Firma, Angaben, Software, Zeit, Firmen, Version, Informationen
Masc (251513; 37% of non-empty Gender): US-Dollar, Euro, Markt, Dollar, Hersteller, Umsatz, Computer, Preis, Anfang, Mitarbeiter
Neut (165105; 24% of non-empty Gender): Prozent, Internet, Unternehmen, Jahr, Ende, Quartal, Jahres, Jahren, Netz, Daten
EMPTY (41117): Kunden, Pentium, Teil, Teile, Steuern, Befragten, Beschäftigten, informations-, Angestellten, Deutschen
| Paradigm unknown | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc|Number=Sing | 256bittigen | Milliardstel | 48bittige, COmputergestütztes |
| Case=Dat|Number=Sing | Wirtschaftswissenschaftlichen | ||
| Case=Dat|Number=Plur | Milliaren | ||
| Case=Gen|Number=Sing | Internationbalen | ||
| Case=Gen|Number=Plur | Rekonfigurierbaren | 128bittigen, Zellularen | |
| Number=Sing | Miliarden, Milliardenn | Amyotrophe | |
| Number=Plur | Kostenpflichtige | Regenerative |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (86791) occur with only one value of Gender.
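One way to read the "lexical feature" statement is as a per-lemma check: collect the set of Gender values seen with each lemma of a given part of speech and count the lemmas that have exactly one. The sketch below illustrates that reading; it is an assumption about how such a figure could be computed, not the page's actual method. The function name `single_value_share` and the token tuples are invented for the example.

```python
from collections import defaultdict


def single_value_share(tokens, upos):
    """Count lemmas of `upos` that occur with exactly one Gender value.

    `tokens` is an iterable of (lemma, upos, gender) tuples, where gender
    is None for tokens without the feature. Returns (count, share).
    """
    values_by_lemma = defaultdict(set)
    for lemma, tag, gender in tokens:
        if tag == upos and gender is not None:
            values_by_lemma[lemma].add(gender)
    if not values_by_lemma:
        return 0, 0.0
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, single / len(values_by_lemma)


# Invented tokens: "Teil" is given two genders to show a non-lexical case.
TOKENS = [
    ("Firma", "NOUN", "Fem"),
    ("Firma", "NOUN", "Fem"),
    ("Markt", "NOUN", "Masc"),
    ("Teil", "NOUN", "Masc"),
    ("Teil", "NOUN", "Neut"),
    ("neu", "ADJ", "Fem"),
    ("Kunden", "NOUN", None),
]

count, share = single_value_share(TOKENS, "NOUN")
print(count, round(share * 100))  # 2 of 3 NOUN lemmas have a single value
```

Lemmas that never carry Gender (like "Kunden" in the sample) are left out of the denominator, which matches the wording "occur with only one value" but is still an assumption.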
DET
456716 DET tokens (92% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (417891; 91%), Number=Sing (394985; 86%), NumType=EMPTY (387796; 85%), Definite=Def (348969; 76%).
DET tokens may have the following values of Gender:
Fem (178082; 39% of non-empty Gender): die, der, eine, einer, den, diese, seiner, seine, dieser, keine
Masc (162453; 36% of non-empty Gender): dem, der, den, die, des, einen, ein, einem, eines, diesem
Masc,Neut (3768; 1% of non-empty Gender): dem
Neut (112413; 25% of non-empty Gender): das, dem, ein, des, die, einem, der, den, allem, dies
EMPTY (37651): die, den, der, alle, ihre, diese, keine, viele, anderen, seine
| Paradigm der | Masc | Masc,Neut | Fem | Neut |
|---|---|---|---|---|
| Case=Acc|Number=Sing | den, der | | die | das, 's |
| Case=Acc|Number=Plur | die, den | | die | die |
| Case=Dat|Number=Sing | dem, des, den | dem | der, die | dem, das, des |
| Case=Dat|Number=Plur | den, die, der | | den, der | den, der, die |
| Case=Gen|Number=Sing | des, der | | der | des |
| Case=Gen|Number=Plur | der | | der | der |
| Case=Nom|Number=Sing | der | | die, der | das |
| Case=Nom|Number=Plur | die, der | | die | die |
ADJ
175354 ADJ tokens (67% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Variant=EMPTY (175352; 100%), Degree=Pos (147844; 84%), Number=Sing (115545; 66%).
ADJ tokens may have the following values of Gender:
Fem (72803; 42% of non-empty Gender): neue, deutsche, neuen, weitere, eigenen, deutschen, erste, ersten, eigene, große
Masc (59147; 34% of non-empty Gender): neuen, neue, ersten, deutschen, heutigen, 1., großen, letzten, neuer, eigenen
Neut (43404; 25% of non-empty Gender): ersten, neue, neuen, vergangenen, letzten, nächsten, erste, zweiten, neues, dritten
EMPTY (87257): möglich, gut, ganz, weltweit, deutlich, beiden, knapp, künftig, bekannt, schnell
| Paradigm neu | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc|Degree=Pos|Number=Sing | neuen | neue | neues, neue |
| Case=Acc|Degree=Pos|Number=Plur | neue, neuen | neue, neuen | neue, neuen |
| Case=Acc|Degree=Cmp|Number=Sing | neueren | neuere | neueres |
| Case=Acc|Degree=Cmp|Number=Plur | neuere, neueren | ||
| Case=Acc|Degree=Sup|Number=Sing | neuesten | neueste | neueste, neuestes |
| Case=Acc|Degree=Sup|Number=Plur | neuesten | neuesten, neueste, neusten | neuesten, neueste, neusten |
| Case=Dat|Degree=Pos|Number=Sing | neuen, neuem | neuen, neuer, neue | neuen, neuem |
| Case=Dat|Degree=Pos|Number=Plur | neuen | neuen, neue | neuen, neue |
| Case=Dat|Degree=Cmp|Number=Sing | neueren | neueren, neuerer | |
| Case=Dat|Degree=Cmp|Number=Plur | neueren | neueren | neueren |
| Case=Dat|Degree=Sup|Number=Sing | neuesten, neuestem, neusten | neuesten, neuester, neusten | neuesten, neuestem, neusten |
| Case=Dat|Degree=Sup|Number=Plur | neuesten | neuesten | neuesten |
| Case=Gen|Degree=Pos|Number=Sing | neuen | neuen, neue | neuen, neues |
| Case=Gen|Degree=Pos|Number=Plur | neuer, neuen | neuer, neuen | neuer, neuen, neue |
| Case=Gen|Degree=Cmp|Number=Sing | neueren | neueren | neueren |
| Case=Gen|Degree=Cmp|Number=Plur | neueren | ||
| Case=Gen|Degree=Sup|Number=Sing | neuesten | neuesten | neuesten |
| Case=Gen|Degree=Sup|Number=Plur | neuesten, neuester | neuesten | |
| Case=Nom|Degree=Pos|Number=Sing | neue, neuer | neue | neue, neues |
| Case=Nom|Degree=Pos|Number=Plur | neuen, neue | neuen, neue | neuen, neue |
| Case=Nom|Degree=Cmp|Number=Sing | neuere, neuerer | neuere | |
| Case=Nom|Degree=Cmp|Number=Plur | neueren | Neuere | Neuere, neueren |
| Case=Nom|Degree=Sup|Number=Sing | neueste, neuester, neuste | neueste | neueste |
| Case=Nom|Degree=Sup|Number=Plur | neuesten | neuesten | neuesten |
| Degree=Pos|Number=Sing | neuen | neue, neuer, neuen | neues, neue, neuen |
| Degree=Pos|Number=Plur | neue, neuen | neue, neuen, Internet/Neue | neue, neuen |
| Degree=Cmp|Number=Sing | neuere, neueres | ||
| Degree=Cmp|Number=Plur | neuere | neuere | |
| Degree=Sup|Number=Sing | neueste, neuester | neuestes, neueste, neuesten | |
| Degree=Sup|Number=Plur | neueste, neuesten | neueste, neuesten |
PRON
44113 PRON tokens (47% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (44113; 100%), Reflex=EMPTY (44113; 100%), Case=Nom (34207; 78%), Person=3 (22812; 52%), PronType=Prs (22810; 52%).
PRON tokens may have the following values of Gender:
Fem (8545; 19% of non-empty Gender): die, sie, der, ihr, derer, Deren, er/sie
Masc (11303; 26% of non-empty Gender): er, der, dem, den, ihn, ihm, dessen, die/der
Neut (24265; 55% of non-empty Gender): es, das, was, dem, nichts, etwas, ihm, ‘s, dessen, s
EMPTY (50734): sich, die, man, sie, wir, wer, denen, ich, deren, uns
| Paradigm der | Masc | Fem | Neut |
|---|---|---|---|
| Abbr=Yes|Case=Nom | d. | ||
| Case=Acc | den | die | das |
| Case=Dat | dem | der | dem |
| Case=Gen | dessen | derer, Deren | dessen |
| Case=Nom | der | die | das |
| Case=Nom|Typo=Yes | da |
Gender seems to be a lexical feature of PRON. 93% of lemmas (13) occur with only one value of Gender.
PROPN
27734 PROPN tokens (14% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (27723; 100%), Case=EMPTY (25062; 90%).
PROPN tokens may have the following values of Gender:
Fem (15265; 55% of non-empty Gender): Telekom, c’t, Europa, AMD, Sun, Telecom, T-Online, Bertelsmann, dpa, Viag
Masc (12440; 45% of non-empty Gender): Bill, Warner, Michael, Thomas, Steve, Ron, John, Jackson, Gerhard, Peter
Neut (29; 0% of non-empty Gender): AppleStore, PowerBooks, KurzFilmFestival, PowerBook, RealVideo, BusinessCall, Deutschland, FeRAMs, G3-PowerBook, InternetTeam
EMPTY (166205): Microsoft, Deutschland, Intel, USA, AOL, ibm, telepolis, Apple, Linux, Windows
| Paradigm Nylis | Masc | Fem |
|---|---|---|
| | Nylis | Nylis |
Gender seems to be a lexical feature of PROPN. 100% of lemmas (1583) occur with only one value of Gender.
ADV
188 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=Ind (187; 99%).
ADV tokens may have the following values of Gender:
Fem (1; 1% of non-empty Gender): meiste
Masc (2; 1% of non-empty Gender): meisten
Neut (185; 98% of non-empty Gender): mehr, weniger, erstenmal, meiste
EMPTY (196405): auch, noch, nur, so, aber, bereits, mehr, allerdings, damit, schon
| Paradigm meist | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | meisten | ||
| | | meiste | meiste |
X
178 X tokens (0% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=Yes (178; 100%).
X tokens may have the following values of Gender:
Neut (178; 100% of non-empty Gender): Inc., Corp.
EMPTY (53513): of, internet, the, and, digital, mobile, media, for, OS, network
NUM
26 NUM tokens (0% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (26; 100%), Number=Sing (26; 100%).
NUM tokens may have the following values of Gender:
Fem (16; 62% of non-empty Gender): eine, einer
Masc (3; 12% of non-empty Gender): einem, einen
Neut (7; 27% of non-empty Gender): ein, einem
EMPTY (71282): zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30
| Paradigm ein | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc | einen | eine | ein |
| Case=Dat | einem | einer | einem |
| Case=Nom | | eine | ein |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (406003; 92%),
NOUN –[amod]–> ADJ (166389; 97%),
ADJ –[conj]–> ADJ (2043; 97%),
DET –[nmod]–> NOUN (1262; 65%),
NOUN –[expl]–> PRON (250; 61%),
NOUN –[nmod]–> ADJ (204; 51%),
NOUN –[appos]–> ADJ (67; 63%),
DET –[conj]–> NOUN (50; 52%),
DET –[conj]–> DET (48; 59%),
DET –[nsubj]–> PRON (44; 54%).
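The agreement figures above pair each dependent with its head and compare their Gender values. A minimal sketch of such a count follows. It assumes that a relation only enters the statistic when both nodes carry Gender and that the percentage is agreeing pairs divided by all such pairs; the page does not state its exact definition, so treat both as assumptions. The function `agreement_stats` and the sample tokens are hypothetical, and token 4 is deliberately ungrammatical to produce a mismatch.

```python
from collections import Counter


def agreement_stats(sentence):
    """Count parent/child pairs that agree in Gender.

    `sentence` is a list of dicts with the keys id, upos, gender (str or
    None), head and deprel. Returns two Counters keyed by
    (parent UPOS, relation, child UPOS): agreeing pairs, and all pairs
    in which both nodes have a Gender value.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree = Counter()
    both = Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, both


# Invented example: "die neue Firma" plus an artificial mismatching determiner.
SENTENCE = [
    {"id": 1, "upos": "DET", "gender": "Fem", "head": 3, "deprel": "det"},
    {"id": 2, "upos": "ADJ", "gender": "Fem", "head": 3, "deprel": "amod"},
    {"id": 3, "upos": "NOUN", "gender": "Fem", "head": 0, "deprel": "root"},
    {"id": 4, "upos": "DET", "gender": "Neut", "head": 3, "deprel": "det"},
]

agree, both = agreement_stats(SENTENCE)
for (parent, rel, child), n in agree.most_common():
    share = round(100 * n / both[(parent, rel, child)])
    print(f"{parent} -[{rel}]-> {child} ({n}; {share}%)")
```

Comparing values with plain equality treats a combined value such as Masc,Neut as different from Masc; a set intersection would be the alternative if partial matches should count as agreement.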