Treebank Statistics: UD_French-PUD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
12548 tokens (51%) have a non-empty value of Gender.
4397 types (74%) occur at least once with a non-empty value of Gender.
3531 lemmas (77%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags (token count; share of all tokens in the treebank): NOUN (4672; 19%), DET (3870; 16%), ADJ (1619; 7%), PROPN (970; 4%), VERB (836; 3%), PRON (478; 2%), AUX (99; 0%), NUM (4; 0%).
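Counts like these can be recomputed from the treebank's CoNLL-U file. The following is a minimal sketch, assuming the standard 10-column CoNLL-U layout; the helper names (`parse_feats`, `iter_tokens`, `gender_by_upos`) and the four-token sample sentence are illustrative and are not taken from the treebank or its tooling.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def iter_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each syntactic word."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_by_upos(conllu_text):
    """Return (tokens with Gender per UPOS, all tokens per UPOS)."""
    with_gender = Counter()
    total = Counter()
    for _form, _lemma, upos, feats in iter_tokens(conllu_text):
        total[upos] += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return with_gender, total


# Toy sentence for illustration only (not copied from the treebank).
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "La", "le", "DET", "_",
     "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "ville", "ville", "NOUN", "_",
     "Gender=Fem|Number=Sing", "4", "nsubj", "_", "_"],
    ["3", "est", "être", "AUX", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "4", "cop", "_", "_"],
    ["4", "grande", "grand", "ADJ", "_",
     "Gender=Fem|Number=Sing", "0", "root", "_", "_"],
])

with_gender, total = gender_by_upos(SAMPLE)
for upos in sorted(total):
    print(upos, with_gender[upos], total[upos])
```

Dividing each `with_gender` count by the overall token count would give the per-tag shares; dividing by `total[upos]` would give the per-tag coverage reported in the sections that follow.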
NOUN
4672 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3368; 72%).
NOUN tokens may have the following values of Gender:
Fem (2176; 47% of non-empty Gender): années, guerre, partie, ville, année, fois, mer, personnes, région, histoire
Masc (2496; 53% of non-empty Gender): ans, nord, état, gouvernement, siècle, jour, monde, pays, sud, temps
| Paradigm sud | Masc | Fem |
|---|---|---|
| | sud | sud |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (1828) occur with only one value of Gender.
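The "lexical feature" test can be sketched as follows: collect the set of Gender values seen for each lemma and count the lemmas whose set has exactly one member. This is a minimal sketch; the function name `single_value_share` and the toy observations are assumptions made for illustration, not the treebank's data or the generator's actual code.

```python
from collections import defaultdict


def single_value_share(observations):
    """observations: iterable of (lemma, gender_value) pairs for one UPOS tag.

    Returns (lemmas seen with exactly one value, lemmas seen in total).
    """
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical observations, for illustration only.
toy = [
    ("ville", "Fem"),
    ("année", "Fem"),
    ("année", "Fem"),
    ("pays", "Masc"),
    ("sud", "Masc"),
    ("sud", "Fem"),
]

single, total = single_value_share(toy)
print(f"{single} of {total} lemmas occur with only one value")
```

A share close to 100% suggests that the value is a property of the lemma rather than of the individual word form.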
DET
3870 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (3451; 89%), Number=Sing (2861; 74%), Definite=Def (2781; 72%).
DET tokens may have the following values of Gender:
Fem (1851; 48% of non-empty Gender): la, les, une, l’, l’, sa, des, cette, leur, ses
Masc (2019; 52% of non-empty Gender): le, les, un, l’, des, l’, son, ce, ses, ces
EMPTY (5): les, d’, l’, le
| Paradigm le | Masc | Fem |
|---|---|---|
| Definite=Def\|ExtPos=PRON\|Number=Sing | l’ | |
| Definite=Def\|Number=Sing | le, l', l’, les, l‘ | la, l', l’, l‘ |
| Definite=Def\|Number=Plur | les | les |
| Number=Sing | | La, L’ |
ADJ
1619 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1086; 67%).
ADJ tokens may have the following values of Gender:
Fem (795; 49% of non-empty Gender): première, grande, nouvelle, dernière, dernières, plusieurs, nombreuses, nouvelles, autres, deuxième
Masc (824; 51% of non-empty Gender): autres, grand, dernier, derniers, général, nouveaux, plusieurs, certains, nouveau, chaque
EMPTY (5): Associated, New, North, Select, Simple
| Paradigm nouveau | Masc | Fem |
|---|---|---|
| Number=Sing | nouveau, nouvel | nouvelle |
| Number=Plur | nouveaux | nouvelles |
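A paradigm table of this kind groups the observed forms of one lemma by their remaining features (rows) and by Gender (columns). The sketch below illustrates that grouping using the forms of "nouveau" listed above; the function name `paradigm` and the input shape are assumptions, not the code that generated this page.

```python
from collections import defaultdict


def paradigm(observations, feature="Gender"):
    """observations: iterable of (form, feats_dict) for a single lemma.

    Returns {other_features: {feature_value: sorted forms}}.
    """
    table = defaultdict(lambda: defaultdict(set))
    for form, feats in observations:
        value = feats.get(feature)
        if value is None:
            continue  # forms without the feature do not enter the table
        # Row label: all other features in the usual Name=Value|Name=Value style.
        rest = "|".join(f"{k}={v}" for k, v in sorted(feats.items()) if k != feature)
        table[rest][value].add(form)
    return {
        rest: {value: sorted(forms) for value, forms in columns.items()}
        for rest, columns in table.items()
    }


nouveau = [
    ("nouveau", {"Gender": "Masc", "Number": "Sing"}),
    ("nouvel", {"Gender": "Masc", "Number": "Sing"}),
    ("nouvelle", {"Gender": "Fem", "Number": "Sing"}),
    ("nouveaux", {"Gender": "Masc", "Number": "Plur"}),
    ("nouvelles", {"Gender": "Fem", "Number": "Plur"}),
]

for row, columns in paradigm(nouveau).items():
    print(row, columns.get("Masc", []), columns.get("Fem", []))
```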
PROPN
970 PROPN tokens (76% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (933; 96%).
PROPN tokens may have the following values of Gender:
Fem (359; 37% of non-empty Gender): Chine, Amérique, Europe, Australie, France, Italie, Afrique, Albanie, Caraïbes, Grande-Bretagne
Masc (611; 63% of non-empty Gender): Trump, J.-C., États-Unis, Joseph, Donald, Gerry, Cameroun, Edgar, Mexique, Rafferty
EMPTY (302): Hong, Kong, Paris, Pékin, Londres, Brisbane, Qing, Rome, Twitter, Uber
| Paradigm Trump | Masc | Fem |
|---|---|---|
| | Trump | Trump |
Gender seems to be a lexical feature of PROPN: 99% of lemmas (640) occur with only one value of Gender.
VERB
836 VERB tokens (37% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (836; 100%), Person=EMPTY (836; 100%), Tense=Past (836; 100%), VerbForm=Part (836; 100%), Number=Sing (690; 83%).
VERB tokens may have the following values of Gender:
Fem (181; 22% of non-empty Gender): composée, dirigée, apparue, devenue, utilisée, considérée, considérées, déroulée, détruite, faite
Masc (655; 78% of non-empty Gender): eu, déclaré, dit, fait, commencé, indiqué, décidé, joué, utilisé, compris
EMPTY (1417): a, est, peut, faire, avait, pourrait, était, peuvent, sont, avoir
| Paradigm faire | Masc | Fem |
|---|---|---|
| Number=Sing | fait | faite |
| Number=Plur | | faites |
PRON
478 PRON tokens (44% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: PronType=Prs (432; 90%), Person=3 (414; 87%), Number=Sing (381; 80%).
PRON tokens may have the following values of Gender:
Fem (102; 21% of non-empty Gender): elle, elles, s’, l’, laquelle, celle, lesquelles, lui, celles, elle-même
Masc (376; 79% of non-empty Gender): il, ils, lui, le, s’, un, eux, je, -il, ceux
EMPTY (619): qui, se, y, s’, on, ce, nous, je, où, c’
| Paradigm il | Masc | Fem |
|---|---|---|
| ExtPos=ADP\|Number=Sing\|Person=3 | il | |
| Number=Sing\|Person=1 | je, J’, j' | |
| Number=Sing\|Person=3 | il, -il, -t-il | elle, -elle |
| Number=Plur\|Person=1 | nous | |
| Number=Plur\|Person=3 | ils, -ils | elles |
AUX
99 AUX tokens (10% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (99; 100%), Number=Sing (99; 100%), Person=EMPTY (99; 100%), Tense=Past (99; 100%), VerbForm=Part (99; 100%).
AUX tokens may have the following values of Gender:
Masc (99; 100% of non-empty Gender): été, fait
EMPTY (930): a, est, ont, sont, était, avait, fut, être, avaient, étaient
NUM
4 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.
NUM tokens may have the following values of Gender:
Masc (4; 100% of non-empty Gender): 1er, premier
EMPTY (447): deux, trois, quatre, 1, 3, 10, II, III, dix, milliards
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (3471; 100%),
NOUN –[amod]–> ADJ (1353; 100%),
NOUN –[nmod]–> NOUN (687; 53%),
PROPN –[det]–> DET (244; 96%),
PROPN –[flat:name]–> PROPN (189; 96%),
NOUN –[conj]–> NOUN (151; 60%),
VERB –[nsubj:pass]–> NOUN (121; 96%),
NOUN –[acl]–> VERB (105; 70%),
NOUN –[appos]–> PROPN (74; 62%),
VERB –[conj]–> VERB (53; 54%).
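Agreement counts like those listed above can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below assumes that the percentage is taken over edges of the same type where both nodes carry a Gender value; that denominator is an assumption, as are the function name `agreement_stats` and the toy sentence.

```python
from collections import Counter


def agreement_stats(sentence):
    """sentence: list of dicts with keys id, head, upos, deprel, gender.

    gender is None when the token has no Gender feature.
    Returns (agreeing edges, edges where both nodes have Gender),
    each keyed by (parent UPOS, relation, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree = Counter()
    both = Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root has no parent token
        if child["gender"] is None or parent["gender"] is None:
            continue  # agreement is only defined when both values exist
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, both


# Toy noun phrase "la grande ville le sud", for illustration only.
toy_sentence = [
    {"id": 1, "head": 3, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 3, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 3, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 4, "head": 5, "upos": "DET", "deprel": "det", "gender": "Masc"},
    {"id": 5, "head": 3, "upos": "NOUN", "deprel": "nmod", "gender": "Masc"},
]

agree, both = agreement_stats(toy_sentence)
for (parent, rel, child), n in both.items():
    print(f"{parent} -[{rel}]-> {child}: {agree[(parent, rel, child)]} of {n} agree")
```

Relations such as det and amod are expected to agree almost always, while nmod and conj link independent nominals, so agreement there is closer to chance.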