Treebank Statistics: UD_French-GSD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
80300 tokens (20%) have a non-empty value of Gender.
9257 types (22%) occur at least once with a non-empty value of Gender.
6079 lemmas (19%) occur at least once with a non-empty value of Gender.
The feature is used with 9 part-of-speech tags: DET (38619; 10% of all tokens), ADJ (16953; 4%), VERB (11161; 3%), PRON (9298; 2%), PROPN (3214; 1%), AUX (904; 0%), X (85; 0%), NUM (61; 0%), SYM (5; 0%).
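The summary counts above (tokens, types, and lemmas with a non-empty Gender, plus the breakdown by part-of-speech tag) could be recomputed from a CoNLL-U file roughly as follows. This is a minimal sketch, not the script that generated this page: the sample sentence is hand-written and hypothetical, the helper names `read_tokens` and `gender_summary` are invented for illustration, and it assumes that only regular word lines are counted (multiword-token ranges and empty nodes are skipped).

```python
from collections import Counter

# Tiny hand-written CoNLL-U sample (hypothetical, not taken from UD_French-GSD).
SAMPLE = """\
# text = La maison est grande.
1\tLa\tle\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tmaison\tmaison\tNOUN\t_\tNumber=Sing\t4\tnsubj\t_\t_
3\test\têtre\tAUX\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t4\tcop\t_\t_
4\tgrande\tgrand\tADJ\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
5\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_
"""


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats) for each regular word line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # assumption: skip ranges such as 1-2 and empty nodes such as 1.1
        feats = {}
        if cols[5] != "_":
            feats = dict(pair.split("=", 1) for pair in cols[5].split("|"))
        yield cols[1], cols[2], cols[3], feats


def gender_summary(conllu_text):
    """Count tokens, types, and lemmas that carry a non-empty Gender value."""
    total = 0
    with_gender = 0
    types, lemmas = set(), set()
    per_upos = Counter()
    for form, lemma, upos, feats in read_tokens(conllu_text):
        total += 1
        if "Gender" in feats:
            with_gender += 1
            types.add(form)
            lemmas.add(lemma)
            per_upos[upos] += 1
    return {
        "tokens": total,
        "tokens_with_gender": with_gender,
        "types_with_gender": len(types),
        "lemmas_with_gender": len(lemmas),
        "per_upos": dict(per_upos),
    }


summary = gender_summary(SAMPLE)
print(summary)
```

Whether types are compared case-sensitively, and how percentages are rounded, are further assumptions that would need checking against the actual statistics script.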
DET
38619 DET tokens (63% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (38372; 99%), PronType=Art (34196; 89%), Definite=Def (26384; 68%).
DET tokens may have the following values of Gender:
Fem (16679; 43% of non-empty Gender): la, une, sa, cette, son, ma, aucune, certaines, toute, toutes
Masc (21940; 57% of non-empty Gender): le, un, son, ce, cet, du, certains, aucun, tout, mon
EMPTY (22478): les, l’, des, ses, son, leur, de, ces, plusieurs, leurs
| Paradigm le | Masc | Fem |
|---|---|---|
| ExtPos=ADV | le | la |
| ExtPos=NOUN | le | |
| ExtPos=PRON | le | |
| | le, L' | la |
| Typo=Yes | le | la, là |
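The co-occurrence figures reported for DET (for example, Number=Sing on 99% of the DET tokens that have Gender) can be approximated with a small counter. The sketch below is illustrative only: the function name `cooccurring_features` and the toy token list are hypothetical, percentages are rounded to whole numbers as an assumption, and it does not reproduce the `Feature=EMPTY` rows that appear for some other tags on this page.

```python
from collections import Counter


def cooccurring_features(tokens, upos, feature="Gender"):
    """Count other feature=value pairs on tokens of `upos` that carry `feature`.

    `tokens` is an iterable of (upos, feats_dict) pairs.
    Returns (n, rows), where n is the number of matching tokens and each row
    is (feature=value, count, percent of n), most frequent first.
    """
    n = 0
    counts = Counter()
    for tok_upos, feats in tokens:
        if tok_upos != upos or feature not in feats:
            continue
        n += 1
        for name, value in feats.items():
            if name != feature:
                counts[f"{name}={value}"] += 1
    rows = [(fv, c, round(100 * c / n)) for fv, c in counts.most_common()]
    return n, rows


# Hypothetical toy data, not drawn from the treebank.
TOKENS = [
    ("DET", {"Definite": "Def", "Gender": "Fem", "Number": "Sing", "PronType": "Art"}),
    ("DET", {"Definite": "Ind", "Gender": "Masc", "Number": "Sing", "PronType": "Art"}),
    ("DET", {"Gender": "Masc", "Number": "Sing", "PronType": "Dem"}),
    ("DET", {"Definite": "Def", "Number": "Plur", "PronType": "Art"}),  # no Gender
    ("ADJ", {"Gender": "Fem", "Number": "Sing"}),
]

n, rows = cooccurring_features(TOKENS, "DET")
for feature_value, count, percent in rows:
    print(f"{feature_value} ({count}; {percent}%)")
```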
ADJ
16953 ADJ tokens (71% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (11606; 68%).
ADJ tokens may have the following values of Gender:
Fem (7833; 46% of non-empty Gender): première, française, grande, nouvelle, toutes, nombreuses, nationale, seule, dernière, internationale
Masc (9120; 54% of non-empty Gender): français, premier, nombreux, tous, dernier, grand, nouveau, petit, seul, ancien
EMPTY (6868): autres, même, autre, politique, deuxième, troisième, jeune, militaire, propre, proche
| Paradigm premier | Masc | Fem |
|---|---|---|
| Number=Sing | premier | première |
| Number=Sing\|Typo=Yes | premier | |
| Number=Plur | premiers | premières |
VERB
11161 VERB tokens (35% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (11161; 100%), Person=EMPTY (11161; 100%), Tense=EMPTY (11161; 100%), VerbForm=Part (11161; 100%), Number=Sing (8845; 79%), Voice=Pass (7693; 69%).
VERB tokens may have the following values of Gender:
Fem (3252; 29% of non-empty Gender): située, née, créée, appelée, utilisée, connue, construite, mise, publiée, nommée
Masc (7909; 71% of non-empty Gender): né, situé, eu, fait, mort, connu, nommé, réalisé, utilisé, mis
EMPTY (20608): a, peut, fait, faire, partir, trouve, devient, doit, ont, permet
| Paradigm faire | Masc | Fem |
|---|---|---|
| ExtPos=ADJ\|Number=Sing\|Voice=Pass | faite | |
| Number=Sing\|Typo=Yes\|Voice=Act | fais | |
| Number=Sing\|Typo=Yes\|Voice=Pass | fait | |
| Number=Sing\|Voice=Act | fait | faite |
| Number=Sing\|Voice=Pass | fait | faite |
| Number=Plur\|Voice=Act | faits | |
| Number=Plur\|Voice=Pass | faits | faites |
PRON
9298 PRON tokens (51% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (9197; 99%), Person=3 (8959; 96%), Number=Sing (7911; 85%), Emph=No (6342; 68%), PronType=Prs (6091; 66%), Case=Nom (5707; 61%).
PRON tokens may have the following values of Gender:
Fem (1720; 18% of non-empty Gender): elle, elles, une, la, celle, laquelle, celles, celle-ci, lesquelles, elle-même
Masc (7578; 82% of non-empty Gender): il, c’, on, ils, ce, le, un, lui, cela, tout
EMPTY (8791): qui, se, s’, y, où, nous, je, dont, en, vous
| Paradigm lui | Masc | Fem |
|---|---|---|
| Case=Acc\|Emph=No\|Number=Sing | le | la |
| Case=Dat\|Emph=No\|Number=Sing | lui | |
| Case=Nom\|Emph=No\|ExtPos=ADP\|Number=Sing | il | |
| Case=Nom\|Emph=No\|Number=Sing | il, -il | elle, -elle |
| Case=Nom\|Emph=No\|Number=Sing\|Typo=Yes | il | elle |
| Case=Nom\|Emph=No\|Number=Plur | ils | elles |
| Emph=No\|Number=Sing | -t-il, -il, le, lui | -t-elle, -elle, la |
| Emph=No\|Number=Sing\|Typo=Yes | t-il, -il, -le, t'il | |
| Emph=Yes\|Number=Sing | lui | elle |
| Emph=Yes\|Number=Plur | | elles |
PROPN
3214 PROPN tokens (12% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (3164; 98%).
PROPN tokens may have the following values of Gender:
Fem (1216; 38% of non-empty Gender): France, Russie, Chine, Loire, Grèce, Amérique, Belgique, Europe, Mauritanie, Renaissance
Masc (1998; 62% of non-empty Gender): Maroc, Sahara, Canada, Québec, Japon, Royaume-Uni, Brésil, Mali, Mans, Mexique
EMPTY (24479): France, Paris, États-Unis, de, Europe, Jean, Espagne, York, New, Pierre
| Paradigm Afrique | Masc | Fem |
|---|---|---|
| | Afrique | Afrique |
Gender seems to be a lexical feature of PROPN: 99% of lemmas (1865) occur with only one value of Gender.
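One plausible way to arrive at a figure like this is to collect the set of Gender values seen for each lemma of a given tag and count the lemmas with exactly one value. The sketch below assumes that reading; the function name `single_value_lemmas` and the toy tokens are hypothetical, and a lemma such as Afrique, which the paradigm table shows with both values, would count as not single-valued.

```python
from collections import defaultdict


def single_value_lemmas(tokens, upos, feature="Gender"):
    """Return (single, total) over lemmas of `upos` that carry `feature`.

    `tokens` is an iterable of (lemma, upos, feats_dict) triples.
    `single` counts lemmas seen with exactly one value of `feature`;
    `total` counts lemmas seen with any value of it.
    """
    values_by_lemma = defaultdict(set)
    for lemma, tok_upos, feats in tokens:
        if tok_upos == upos and feature in feats:
            values_by_lemma[lemma].add(feats[feature])
    total = len(values_by_lemma)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, total


# Hypothetical toy data, not drawn from the treebank.
TOKENS = [
    ("France", "PROPN", {"Gender": "Fem", "Number": "Sing"}),
    ("France", "PROPN", {"Gender": "Fem", "Number": "Sing"}),
    ("Maroc", "PROPN", {"Gender": "Masc", "Number": "Sing"}),
    ("Afrique", "PROPN", {"Gender": "Masc", "Number": "Sing"}),
    ("Afrique", "PROPN", {"Gender": "Fem", "Number": "Sing"}),
    ("Paris", "PROPN", {}),  # no Gender: ignored
]

single, total = single_value_lemmas(TOKENS, "PROPN")
print(f"{single} of {total} lemmas occur with only one value of Gender")
```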
AUX
904 AUX tokens (7% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (904; 100%), Number=Sing (904; 100%), Person=EMPTY (904; 100%), Tense=EMPTY (904; 100%), VerbForm=Part (904; 100%).
AUX tokens may have the following values of Gender:
Fem (1; 0% of non-empty Gender): faite
Masc (903; 100% of non-empty Gender): été, fait, vu
EMPTY (12177): est, a, sont, ont, était, fut, être, avait, avoir, ai
| Paradigm faire | Masc | Fem |
|---|---|---|
| | fait | faite |
X
85 X tokens (3% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=EMPTY (85; 100%), ExtPos=PROPN (43; 51%).
X tokens may have the following values of Gender:
Fem (22; 26% of non-empty Gender): 3D, BoJ, CEDH, CSL, FW17, Lincoln’s, RN113, SFIO, Scouting, TV
Masc (63; 74% of non-empty Gender): DKK, statu, B, CWA, D.III, DA, FDLP, FPLP, G.I., G8
EMPTY (2872): the, of, de, and, etc., in, a, del, for, Company
Gender seems to be a lexical feature of X: 100% of lemmas (83) occur with only one value of Gender.
NUM
61 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.
NUM tokens may have the following values of Gender:
Fem (61; 100% of non-empty Gender): une
EMPTY (10419): deux, trois, 2, 3, 5, quatre, 2010, 4, 20, 2009
SYM
5 SYM tokens (1% of all SYM tokens) have a non-empty value of Gender.
The most frequent other feature values with which SYM and Gender co-occurred: Number=Sing (4; 80%), ExtPos=NOUN (3; 60%).
SYM tokens may have the following values of Gender:
Masc (5; 100% of non-empty Gender): %, CsBi4Te6, M, X, k
EMPTY (713): %, /, €, °, &, +, n°, $, =, k
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
PROPN –[det]–> DET (2950; 96%),
VERB –[nsubj:pass]–> PRON (684; 70%),
VERB –[conj]–> VERB (637; 51%),
PROPN –[amod]–> ADJ (335; 82%),
ADJ –[det]–> DET (285; 57%),
PRON –[amod]–> ADJ (74; 86%),
PRON –[acl]–> VERB (55; 50%),
ADJ –[nsubj]–> PROPN (40; 65%),
PRON –[nsubj]–> PRON (32; 51%),
PRON –[conj]–> PRON (22; 51%).
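Agreement counts of this kind can be sketched by walking each dependency edge and comparing the Gender of parent and child. The code below is an illustration under stated assumptions, not the procedure behind the numbers above: it assumes the percentage is the share of agreeing pairs among pairs where both nodes carry Gender, and the function name `gender_agreement`, the token dictionaries, and the toy sentence are all hypothetical.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count Gender agreement per (parent UPOS, relation, child UPOS).

    `sentence` is a list of dicts with keys id, head, upos, deprel, gender
    (gender is None when the token has no Gender feature).
    Returns {key: (agreeing_pairs, pairs_where_both_have_gender)}.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return {key: (agree[key], both[key]) for key in both}


# Hypothetical toy fragment "La France et le Maroc", not drawn from the treebank.
SENTENCE = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "PROPN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 5, "upos": "CCONJ", "deprel": "cc", "gender": None},
    {"id": 4, "head": 5, "upos": "DET", "deprel": "det", "gender": "Masc"},
    {"id": 5, "head": 2, "upos": "PROPN", "deprel": "conj", "gender": "Masc"},
]

for (parent, rel, child), (ok, total) in gender_agreement(SENTENCE).items():
    print(f"{parent} -[{rel}]-> {child} ({ok} of {total} agree)")
```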