Treebank Statistics: UD_French-ParisStories: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
12668 tokens (30%) have a non-empty value of Gender.
2009 types (62%) occur at least once with a non-empty value of Gender.
1666 lemmas (70%) occur at least once with a non-empty value of Gender.
The feature is used with 10 part-of-speech tags: NOUN (4395; 10% instances), PRON (3190; 7% instances), DET (2644; 6% instances), ADJ (1188; 3% instances), VERB (1145; 3% instances), AUX (42; 0% instances), ADV (33; 0% instances), PROPN (15; 0% instances), X (10; 0% instances), NUM (6; 0% instances).
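The token, type, lemma and per-tag counts above can be approximated from a CoNLL-U file with a short script. The following is a minimal sketch, not the tool that generated this page: the function name `gender_stats` and the four-token sample are made up for illustration, and it assumes the standard ten-column CoNLL-U layout with UPOS in column 4 and FEATS in column 6. It also assumes that types are counted case-sensitively, which the page does not state.

```python
from collections import Counter

# Hand-made CoNLL-U sample for illustration only (not taken from the treebank).
SAMPLE = """\
1\tla\tle\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tmaison\tmaison\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tdes\tun\tDET\t_\tDefinite=Ind|Number=Plur|PronType=Art\t4\tdet\t_\t_
4\tgens\tgens\tNOUN\t_\tNumber=Plur\t2\tnmod\t_\t_
"""


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_stats(conllu_text):
    """Count tokens, types and lemmas that carry a non-empty Gender value."""
    total = 0
    by_upos = Counter()
    types, lemmas = set(), set()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        total += 1
        if "Gender" in parse_feats(cols[5]):
            by_upos[cols[3]] += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return {
        "tokens": total,
        "tokens_with_gender": sum(by_upos.values()),
        "types_with_gender": len(types),
        "lemmas_with_gender": len(lemmas),
        "by_upos": dict(by_upos),
    }


stats = gender_stats(SAMPLE)
print(stats["tokens_with_gender"], "of", stats["tokens"], "tokens have Gender")
```

In the sample, `gens` carries no Gender value, mirroring the EMPTY examples listed for NOUN on this page.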
NOUN
4395 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3501; 80%).
NOUN tokens may have the following values of Gender:
Fem (1676; 38% of non-empty Gender): fois, maison, mère, heures, année, chose, vie, peur, ville, heure
Masc (2719; 62% of non-empty Gender): coup, fait, peu, genre, temps, ans, moment, jour, truc, monde
EMPTY (32): gens, potes, enfants, autres, collègues, élèves, chmeliers, fans, neuneu, patati
| Paradigm truc | Masc | Fem |
|---|---|---|
| Number=Sing | truc | truc |
| Number=Plur | trucs | |
Gender seems to be a lexical feature of NOUN: 98% of lemmas (1075) occur with only one value of Gender.
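The "lexical feature" heuristic can be sketched as follows: collect the set of Gender values seen for each lemma and count how many lemmas have exactly one. This is an illustrative reading of the statistic, not the page generator's code; the function name `single_gender_lemmas` and the sample pairs are hypothetical, and the input is assumed to be (lemma, Gender) pairs for tokens of a single part of speech with a non-empty Gender.

```python
from collections import defaultdict


def single_gender_lemmas(pairs):
    """Return (lemmas with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for genders in values.values() if len(genders) == 1)
    return single, len(values)


# Made-up (lemma, Gender) pairs; "truc" appears with both values,
# as in the paradigm table above.
pairs = [
    ("truc", "Masc"),
    ("truc", "Fem"),
    ("maison", "Fem"),
    ("maison", "Fem"),
    ("coup", "Masc"),
]
single, total = single_gender_lemmas(pairs)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A high share of single-valued lemmas suggests that Gender is a property of the lexeme rather than an inflectional choice, which is the sense in which the page calls it lexical.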
PRON
3190 PRON tokens (50% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (3148; 99%), Number=Sing (3017; 95%), Emph=No (1823; 57%), Case=Nom (1744; 55%).
PRON tokens may have the following values of Gender:
Fem (308; 10% of non-empty Gender): elle, elles, la, une, lesquelles, toutes, elle-même
Masc (2882; 90% of non-empty Gender): on, c’, il, ça, ils, ce, le, lui, -ce, tous
EMPTY (3185): je, j’, y, qui, tu, me, moi, s’, se, nous
| Paradigm lui | Masc | Fem |
|---|---|---|
| Case=Acc\|Emph=No\|Number=Sing\|Person=3 | le | la |
| Case=Acc\|Emph=No\|Number=Sing | le | |
| Case=Dat\|Emph=No\|Number=Sing\|Person=3 | lui | |
| Case=Nom\|Emph=No\|ExtPos=ADP\|Number=Sing\|Person=3 | il | |
| Case=Nom\|Emph=No\|ExtPos=VERB\|Number=Sing\|Person=3 | il | |
| Case=Nom\|Emph=No\|Number=Sing\|Person=3 | il, elle | elle |
| Case=Nom\|Emph=No\|Number=Plur\|Person=3 | ils | elles |
| Emph=No\|Number=Sing\|Person=3 | lui, le | |
| Emph=Yes\|Number=Sing\|Person=3 | lui | elle |
| Emph=Yes\|Number=Plur\|Person=3 | eux | elles |
DET
2644 DET tokens (76% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (2639; 100%), Number[psor]=EMPTY (2357; 89%), Person[psor]=EMPTY (2357; 89%), Poss=EMPTY (2357; 89%), PronType=Art (2247; 85%), Definite=Def (1560; 59%).
DET tokens may have the following values of Gender:
Fem (983; 37% of non-empty Gender): la, une, l’, ma, cette, sa, mon, ta, aucune, quelle
Masc (1661; 63% of non-empty Gender): le, un, mon, l’, ce, son, du, ton, cet, aucun
EMPTY (823): les, des, mes, ses, nos, notre, l’, quelque, chaque, quelques
| Paradigm le | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing | le, l' | la, l' |
| Definite=Def | la | |
| Definite=Ind\|Number=Sing | le | |
ADJ
1188 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (944; 79%).
ADJ tokens may have the following values of Gender:
Fem (437; 37% of non-empty Gender): petite, première, toute, bonne, toutes, même, contente, seule, autre, grande
Masc (751; 63% of non-empty Gender): tout, petit, tous, vrai, même, premier, bizarre, sympa, gros, mignon
EMPTY (1): incapable
| Paradigm tout | Masc | Fem |
|---|---|---|
| Number=Sing | tout | toute |
| Number=Sing\|PronType=Ind | tout | toute |
| Number=Plur | tous, tout | toutes |
| Number=Plur\|PronType=Ind | tous | |
VERB
1145 VERB tokens (27% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (1095; 96%), Tense=EMPTY (1095; 96%), VerbForm=Part (1094; 96%), Person=EMPTY (1091; 95%), Number=Sing (1070; 93%), Voice=Act (623; 54%).
VERB tokens may have the following values of Gender:
Fem (210; 18% of non-empty Gender): allée, rencontrée, vue, arrivée, partie, venue, accompagnée, rentrée, mise, devenue
Masc (935; 82% of non-empty Gender): fait, dit, vu, eu, passé, pris, allé, parlé, commencé, rencontré
EMPTY (3088): avait, a, sais, voilà, faire, dit, va, aller, avais, vois
| Paradigm avoir | Masc | Fem |
|---|---|---|
| | eu | eue |
AUX
42 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (42; 100%), Number=Sing (42; 100%), VerbForm=Part (42; 100%), Person=EMPTY (41; 98%), Tense=Past (37; 88%).
AUX tokens may have the following values of Gender:
Masc (42; 100% of non-empty Gender): été, fait, eu
EMPTY (2234): est, était, a, ai, suis, étais, avait, sont, avais, étaient
ADV
33 ADV tokens (1% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: ExtPos=EMPTY (33; 100%), Polarity=EMPTY (33; 100%).
ADV tokens may have the following values of Gender:
Masc (33; 100% of non-empty Gender): mal, tout, plus, super
EMPTY (3470): pas, donc, parce, enfin, plus, vraiment, là, très, même, après
PROPN
15 PROPN tokens (4% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (10; 67% of non-empty Gender): Flora, Caraïbes, Ecosse, Île, GoPro, Latine, Terres
Masc (5; 33% of non-empty Gender): Anglais, PSG, M
EMPTY (399): Paris, CROUS, Z, Agen, CP, Ecosse, Sanga, Athis, France, Liège
Gender seems to be a lexical feature of PROPN: 100% of lemmas (10) occur with only one value of Gender.
X
10 X tokens (8% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Number=Sing (10; 100%), ExtPos=NOUN (6; 60%).
X tokens may have the following values of Gender:
Fem (2; 20% of non-empty Gender): ju~, quest~
Masc (8; 80% of non-empty Gender): re~, dispro~, fa~, frig~, fr~, hu~, mid~
EMPTY (117): XXX, s~, d~, j~, a~, euh~, i~, m~, pl~, qu~
NUM
6 NUM tokens (2% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: Number=Plur (5; 83%).
NUM tokens may have the following values of Gender:
Fem (1; 17% of non-empty Gender): une
Masc (5; 83% of non-empty Gender): neuf, un
EMPTY (237): deux, trois, six, dix, cinq, mille, quatre, huit, quatorze, sept
| Paradigm un | Masc | Fem |
|---|---|---|
| Number=Sing | un | |
| Number=Plur | un | une |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (2305; 77%),
NOUN –[amod]–> ADJ (578; 99%),
ADJ –[nsubj]–> PRON (255; 78%),
DET –[fixed]–> NOUN (79; 99%),
NOUN –[conj]–> NOUN (74; 62%),
PRON –[reparandum]–> PRON (65; 96%),
NOUN –[reparandum]–> NOUN (57; 83%),
DET –[reparandum]–> DET (51; 78%),
ADJ –[obl:mod]–> NOUN (43; 68%),
PRON –[nsubj]–> PRON (40; 55%).
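
Agreement counts of this kind can be sketched by walking every dependency edge and comparing the Gender of parent and child. The code below is an illustration under stated assumptions, not the page generator's method: the names `read_sentences` and `agreement_counts` and the sample sentences are hypothetical, and the denominator is assumed to be all edges with the same parent tag, relation and child tag. The page does not say what the percentage is relative to.

```python
from collections import Counter

# Hand-made CoNLL-U sample (three short sentences), for illustration only.
SAMPLE = """\
1\tla\tle\tDET\t_\tGender=Fem|Number=Sing\t2\tdet\t_\t_
2\tmaison\tmaison\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tpetite\tpetit\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_

1\tle\tle\tDET\t_\tGender=Masc|Number=Sing\t2\tdet\t_\t_
2\ttruc\ttruc\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_

1\tles\tle\tDET\t_\tNumber=Plur\t2\tdet\t_\t_
2\tgens\tgens\tNOUN\t_\tNumber=Plur\t0\troot\t_\t_
"""


def gender_of(feats):
    """Return the Gender value from a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None


def read_sentences(conllu_text):
    """Yield one dict per sentence: token id -> (upos, gender, head, deprel)."""
    for block in conllu_text.strip().split("\n\n"):
        tokens = {}
        for line in block.splitlines():
            cols = line.split("\t")
            # Skip comments, multiword ranges, empty nodes and unset heads.
            if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
                continue
            tokens[int(cols[0])] = (cols[3], gender_of(cols[5]), int(cols[6]), cols[7])
        yield tokens


def agreement_counts(conllu_text):
    """Count, per (parent UPOS, deprel, child UPOS), agreeing and total edges."""
    agree, total = Counter(), Counter()
    for tokens in read_sentences(conllu_text):
        for upos, gender, head, deprel in tokens.values():
            if head == 0 or head not in tokens:
                continue  # the root has no parent to agree with
            parent_upos, parent_gender = tokens[head][0], tokens[head][1]
            key = (parent_upos, deprel, upos)
            total[key] += 1
            if gender is not None and gender == parent_gender:
                agree[key] += 1
    return agree, total


agree, total = agreement_counts(SAMPLE)
for key, n in agree.most_common():
    parent, rel, child = key
    print(f"{parent} -[{rel}]-> {child} ({n}; {100 * n // total[key]}%)")
```

Edges where either node lacks Gender count toward the total but never toward agreement here; restricting the denominator to edges where both nodes carry Gender is the obvious alternative.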