Treebank Statistics: UD_French-GSD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
168176 tokens (42%) have a non-empty value of Gender.
23120 types (54%) occur at least once with a non-empty value of Gender.
15916 lemmas (48%) occur at least once with a non-empty value of Gender.
The feature is used with 10 part-of-speech tags: NOUN (75175; 19% of instances), DET (44453; 11% of instances), ADJ (23820; 6% of instances), VERB (11161; 3% of instances), PRON (9298; 2% of instances), PROPN (3214; 1% of instances), AUX (904; 0% of instances), X (85; 0% of instances), NUM (61; 0% of instances), SYM (5; 0% of instances).
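Per-tag counts like those above could be recomputed from a CoNLL-U file along the following lines. This is a minimal sketch, not the tool that generated this page: the helper names `parse_feats` and `gender_by_upos` and the four-token sample sentence are hypothetical, and skipping multiword-token ranges and empty nodes is an assumption about how tokens are counted.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Count, per UPOS tag, all tokens and the tokens with a non-empty Gender."""
    with_gender = Counter()
    total = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        # Assumption: skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


# Hypothetical sample sentence, not taken from the treebank.
rows = [
    ["1", "La", "le", "DET", "_", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "ville", "ville", "NOUN", "_", "Gender=Fem|Number=Sing", "4", "nsubj", "_", "_"],
    ["3", "est", "être", "AUX", "_", "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "4", "cop", "_", "_"],
    ["4", "grande", "grand", "ADJ", "_", "Gender=Fem|Number=Sing", "0", "root", "_", "_"],
]
sample = "# text = La ville est grande\n" + "\n".join("\t".join(r) for r in rows) + "\n"

with_gender, total = gender_by_upos(sample)
for upos in sorted(total):
    print(upos, with_gender[upos], total[upos])
```

Running the same function over the concatenated treebank files would give the numerator for each tag; dividing by the overall token count would give the percentages, under the assumptions stated above.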
NOUN
75175 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (56343; 75%).
NOUN tokens may have the following values of Gender:
Fem (33433; 44% of non-empty Gender): ville, partie, fois, région, commune, années, famille, année, fin, place
Masc (41742; 56% of non-empty Gender): ans, pays, nom, monde, temps, groupe, siècle, état, cours, lieu
EMPTY (155): enfants, gens, enfant, Economiste, Hc, IP, NO, ch
| Paradigm partie | Masc | Fem |
|---|---|---|
| Number=Sing | partie | |
| Number=Sing\|Typo=Yes | parti | partie |
| Number=Plur | parties | |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (9330) occur with only one value of Gender.
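The "lexical feature" judgement rests on the share of lemmas that appear with a single Gender value. A minimal sketch of that computation follows, assuming the input is a list of (lemma, Gender) pairs for tokens of one part of speech; the function name `single_value_share` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def single_value_share(pairs):
    """Return (lemmas with exactly one Gender value, all lemmas, their ratio)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values), single / len(values)


# Hypothetical pairs: "partie" appears with both values, the others with one.
pairs = [
    ("ville", "Fem"),
    ("ville", "Fem"),
    ("pays", "Masc"),
    ("partie", "Fem"),
    ("partie", "Masc"),
]
single, total_lemmas, share = single_value_share(pairs)
print(f"{single} of {total_lemmas} lemmas ({share:.0%}) occur with only one value")
```

A share close to 100% suggests that gender is fixed per lemma rather than inflectional, which is the pattern reported for NOUN, PROPN and X on this page.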
DET
44453 DET tokens (73% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (44204; 99%), PronType=Art (39692; 89%), Definite=Def (31882; 72%).
DET tokens may have the following values of Gender:
Fem (19907; 45% of non-empty Gender): la, une, l’, sa, cette, son, ma, aucune, certaines, toute
Masc (24546; 55% of non-empty Gender): le, un, l’, son, ce, cet, du, certains, mon, aucun
EMPTY (16645): les, des, l’, ses, leur, de, ces, plusieurs, leurs, son
| Paradigm le | Masc | Fem |
|---|---|---|
| ExtPos=ADV | le | la |
| ExtPos=NOUN | le | |
| ExtPos=PRON | le | |
| | le, l' | la, l' |
| Typo=Yes | le | la, là |
ADJ
23820 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (17171; 72%).
ADJ tokens may have the following values of Gender:
Fem (11094; 47% of non-empty Gender): première, française, grande, même, nouvelle, toutes, nombreuses, nationale, autres, seule
Masc (12726; 53% of non-empty Gender): premier, français, tous, dernier, grand, autres, nouveau, même, nombreux, petit
| Paradigm premier | Masc | Fem |
|---|---|---|
| Number=Sing | premier | première |
| Number=Sing\|Typo=Yes | premier | |
| Number=Plur | premiers | premières |
VERB
11161 VERB tokens (35% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (11161; 100%), Person=EMPTY (11161; 100%), Tense=EMPTY (11161; 100%), VerbForm=Part (11161; 100%), Number=Sing (8845; 79%), Voice=Pass (7693; 69%).
VERB tokens may have the following values of Gender:
Fem (3252; 29% of non-empty Gender): située, née, créée, appelée, utilisée, connue, construite, mise, publiée, nommée
Masc (7909; 71% of non-empty Gender): né, situé, eu, fait, mort, connu, nommé, réalisé, utilisé, mis
EMPTY (20608): a, peut, fait, faire, partir, trouve, devient, doit, ont, permet
| Paradigm faire | Masc | Fem |
|---|---|---|
| ExtPos=ADJ\|Number=Sing\|Voice=Pass | faite | |
| Number=Sing\|Typo=Yes\|Voice=Act | fais | |
| Number=Sing\|Typo=Yes\|Voice=Pass | fait | |
| Number=Sing\|Voice=Act | fait | faite |
| Number=Sing\|Voice=Pass | fait | faite |
| Number=Plur\|Voice=Act | faits | |
| Number=Plur\|Voice=Pass | faits | faites |
PRON
9298 PRON tokens (53% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (9197; 99%), Person=3 (8959; 96%), Number=Sing (7911; 85%), Emph=No (6342; 68%), PronType=Prs (6091; 66%), Case=Nom (5707; 61%).
PRON tokens may have the following values of Gender:
Fem (1720; 18% of non-empty Gender): elle, elles, une, la, celle, laquelle, celles, celle-ci, lesquelles, elle-même
Masc (7578; 82% of non-empty Gender): il, c’, on, ils, ce, le, un, lui, cela, tout
EMPTY (8389): qui, se, s’, y, nous, je, dont, en, vous, qu’
| Paradigm lui | Masc | Fem |
|---|---|---|
| Case=Acc\|Emph=No\|Number=Sing | le | la |
| Case=Dat\|Emph=No\|Number=Sing | lui | |
| Case=Nom\|Emph=No\|ExtPos=ADP\|Number=Sing | il | |
| Case=Nom\|Emph=No\|Number=Sing | il, -il | elle, -elle |
| Case=Nom\|Emph=No\|Number=Sing\|Typo=Yes | il | elle |
| Case=Nom\|Emph=No\|Number=Plur | ils, -ils | elles |
| Case=Nom\|Emph=No\|Number=Plur\|Typo=Yes | Elles | |
| Emph=No\|Number=Sing | -t-il, -il, le, lui | -t-elle, -elle, la |
| Emph=No\|Number=Sing\|Typo=Yes | t-il, -il, -le, t'il | |
| Emph=No\|Number=Plur | -ils | -elles, ELLES |
| Emph=Yes\|Number=Sing | lui | elle |
| Emph=Yes\|Number=Plur | eux | elles |
| Emph=Yes\|Number=Plur\|Typo=Yes | -eux | |
PROPN
3214 PROPN tokens (12% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (3164; 98%).
PROPN tokens may have the following values of Gender:
Fem (1216; 38% of non-empty Gender): France, Russie, Chine, Loire, Grèce, Amérique, Belgique, Europe, Mauritanie, Renaissance
Masc (1998; 62% of non-empty Gender): Maroc, Sahara, Canada, Québec, Japon, Royaume-Uni, Brésil, Mali, Mans, Mexique
EMPTY (24479): France, Paris, États-Unis, de, Europe, Jean, Espagne, York, New, Pierre
| Paradigm Afrique | Masc | Fem |
|---|---|---|
| | Afrique | Afrique |
Gender seems to be a lexical feature of PROPN: 99% of lemmas (1865) occur with only one value of Gender.
AUX
904 AUX tokens (7% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (904; 100%), Number=Sing (904; 100%), Person=EMPTY (904; 100%), Tense=EMPTY (904; 100%), VerbForm=Part (904; 100%).
AUX tokens may have the following values of Gender:
Fem (1; 0% of non-empty Gender): faite
Masc (903; 100% of non-empty Gender): été, fait, vu
EMPTY (12177): est, a, sont, ont, était, fut, être, avait, avoir, ai
| Paradigm faire | Masc | Fem |
|---|---|---|
| | fait | faite |
X
85 X tokens (3% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=EMPTY (85; 100%), ExtPos=PROPN (43; 51%).
X tokens may have the following values of Gender:
Fem (22; 26% of non-empty Gender): 3D, BoJ, CEDH, CSL, FW17, Lincoln’s, RN113, SFIO, Scouting, TV
Masc (63; 74% of non-empty Gender): DKK, statu, B, CWA, D.III, DA, FDLP, FPLP, G.I., G8
EMPTY (2872): the, of, de, and, etc., in, a, del, for, Company
Gender seems to be a lexical feature of X: 100% of lemmas (83) occur with only one value of Gender.
NUM
61 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: Number=Plur (61; 100%).
NUM tokens may have the following values of Gender:
Fem (61; 100% of non-empty Gender): une
EMPTY (10418): deux, trois, 2, 3, 5, quatre, 2010, 4, 20, 2009
SYM
5 SYM tokens (1% of all SYM tokens) have a non-empty value of Gender.
The most frequent other feature values with which SYM and Gender co-occurred: Number=Sing (4; 80%), ExtPos=NOUN (3; 60%).
SYM tokens may have the following values of Gender:
Masc (5; 100% of non-empty Gender): %, CsBi4Te6, M, X, k
EMPTY (713): %, /, €, °, &, +, n°, $, =, k
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (39084; 73%),
NOUN –[amod]–> ADJ (19323; 100%),
NOUN –[conj]–> NOUN (3298; 63%),
PROPN –[det]–> DET (3000; 98%),
NOUN –[acl]–> VERB (2999; 70%),
VERB –[nsubj:pass]–> NOUN (1832; 81%),
ADJ –[nsubj]–> NOUN (951; 98%),
ADJ –[conj]–> ADJ (908; 97%),
NOUN –[appos]–> NOUN (896; 62%),
VERB –[nsubj:pass]–> PRON (683; 70%).
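
Agreement counts of this kind could be gathered by walking each dependency edge and comparing the Gender of parent and child. The sketch below is an illustration only: the token dictionaries, the function name `gender_agreement` and the sample sentence are hypothetical, and taking all relations of the same type as the denominator for the percentage is an assumption, since the page does not state how the percentage is defined.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count parent/child pairs that share a non-empty Gender value.

    `sentence` is a list of dicts with keys id, head, upos, deprel, gender
    (gender is None when the token has no Gender feature).
    Keys of the returned counters are (parent UPOS, deprel, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree = Counter()
    total = Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root token has head 0 and no parent node
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if child["gender"] is not None and child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, total


# Hypothetical sentence "La ville est grande", not taken from the treebank.
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 4, "upos": "NOUN", "deprel": "nsubj", "gender": "Fem"},
    {"id": 3, "head": 4, "upos": "AUX", "deprel": "cop", "gender": None},
    {"id": 4, "head": 0, "upos": "ADJ", "deprel": "root", "gender": "Fem"},
]
agree, total = gender_agreement(sentence)
for (parent_upos, deprel, child_upos), n in agree.most_common():
    share = n / total[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {share:.0%})")
```

Summing the two counters over all sentences of the treebank and sorting by the agreement count would give a ranking of the same shape as the list above.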