Treebank Statistics: UD_French-FTB: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
293967 tokens (51%) have a non-empty value of Gender.
1370 types (74%) occur at least once with a non-empty value of Gender.
1189 lemmas (75%) occur at least once with a non-empty value of Gender.
The feature is used with 9 part-of-speech tags: NOUN (115471; 20% instances), DET (77216; 13% instances), ADJ (34368; 6% instances), PRON (21812; 4% instances), PROPN (17832; 3% instances), VERB (14796; 3% instances), NUM (11279; 2% instances), AUX (1104; 0% instances), ADP (89; 0% instances).
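Per-tag counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal Python sketch, not the script that generated this page. The helper names (`parse_feats`, `gender_counts_by_upos`, `row`) and the four-token sample sentence are hypothetical. It assumes the standard 10-column CoNLL-U layout, with UPOS in column 4 and FEATS in column 6.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Return (tokens with non-empty Gender, all tokens), both keyed by UPOS."""
    with_gender, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


def row(tok_id, form, lemma, upos, feats, head, deprel):
    """Build one tab-separated CoNLL-U token line (hypothetical helper)."""
    return "\t".join(
        [str(tok_id), form, lemma, upos, "_", feats, str(head), deprel, "_", "_"]
    )


# Hypothetical sample sentence, not taken from UD_French-FTB.
SAMPLE = "\n".join([
    "# text = la petite fille dort",
    row(1, "la", "le", "DET", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", 3, "det"),
    row(2, "petite", "petit", "ADJ", "Gender=Fem|Number=Sing", 3, "amod"),
    row(3, "fille", "fille", "NOUN", "Gender=Fem|Number=Sing", 4, "nsubj"),
    row(4, "dort", "dormir", "VERB", "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", 0, "root"),
])

with_gender, total = gender_counts_by_upos(SAMPLE)
for upos in sorted(total):
    print(upos, with_gender[upos], total[upos])
```

Dividing each `with_gender` count by the matching `total` count gives the share of tokens of that tag that carry Gender. How the percentages on this page are rounded is not stated, so the sketch leaves rounding out.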
NOUN
115471 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (76662; 66%).
NOUN tokens may have the following values of Gender:
Fem (48214; 42% of non-empty Gender): _, face, Fin, Grâce, Mme, Conséquence, Faute, Abstentions, Réunion, Concurrence
Masc (67257; 58% of non-empty Gender): _, M., Mr, DOC, Résultat, Article, Côté, Vendredi, Jeudi, Début
EMPTY (1616): _, Secrétaire, Agents, Cahin, Chefs, MODE, REPRÉSENTANT, Éditions
Gender seems to be a lexical feature of NOUN. 100% of lemmas (454) occur with only one value of Gender.
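The lexical-feature check amounts to asking how many lemmas are ever seen with more than one Gender value. Below is a minimal Python sketch of that idea. The page does not state the exact procedure, so the sketch assumes the share is taken over lemmas that occur at least once with a non-empty Gender. The function name `single_value_share` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def single_value_share(observations):
    """observations: iterable of (lemma, gender) pairs with non-empty gender.

    Returns (number of lemmas seen with exactly one value, their share
    among all observed lemmas).
    """
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)


# Hypothetical observations, not taken from the treebank.
pairs = [
    ("maison", "Fem"),
    ("maison", "Fem"),
    ("chat", "Masc"),
    ("livre", "Masc"),  # "le livre" (book)
    ("livre", "Fem"),   # "la livre" (pound)
]
count, share = single_value_share(pairs)
print(count, round(share * 100), "%")
```

A share close to 100% suggests that the value is a property of the lemma rather than of the individual word form, which is the conclusion drawn here for NOUN.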
DET
77216 DET tokens (90% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (68137; 88%), Number=Sing (61189; 79%), Definite=Def (54610; 71%).
DET tokens may have the following values of Gender:
Fem (34987; 45% of non-empty Gender): _, la, L’, les, Cette, une, ces, des, Sa, Leur
Masc (42229; 55% of non-empty Gender): _, le, les, l’, un, Ce, ces, des, Son, Cet
EMPTY (8278): _, L’, Des, Les, D’, quelqu’, le
ADJ
34368 ADJ tokens (94% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (22349; 65%).
ADJ tokens may have the following values of Gender:
Fem (15469; 45% of non-empty Gender): _, Seule, Toutes, Première, Autre, Quelle, toute, Dernière, Même, Deuxième
Masc (18899; 55% of non-empty Gender): _, Autre, Tout, tous, Seul, Seuls, Difficile, Premier, Dernier, Deuxième
EMPTY (2194): _, Quitte
PRON
21812 PRON tokens (95% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (20342; 93%), Reflex=EMPTY (18133; 83%), Number=Sing (15759; 72%), PronType=EMPTY (12913; 59%).
PRON tokens may have the following values of Gender:
Fem (5690; 26% of non-empty Gender): _, Elle, elles, Celle, Celles, Se, S’, En, Où, Aucune
Masc (16122; 74% of non-empty Gender): _, il, c’, On, ils, ce, nous, Cela, Je, Ceux
EMPTY (1137): _, il, C’, Cela, Ce, 30 000, Ceci, Lui, Quarante, Y
PROPN
17832 PROPN tokens (82% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (17588; 99%).
PROPN tokens may have the following values of Gender:
Fem (5574; 31% of non-empty Gender): _, FO, EDF, Genève, CGT, Jean, Anita, France, BOURSE, BT
Masc (12258; 69% of non-empty Gender): _, Paris, Michel, France, Air, FRANCFORT, Hachette, Jacques, Matra, LONDRES
EMPTY (4028): _, CFE, Thomson, Volkswagen, Elf, Pékin, TF, Washington, ABOU, AMIENS
Gender seems to be a lexical feature of PROPN. 97% of lemmas (336) occur with only one value of Gender.
VERB
14796 VERB tokens (31% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (14796; 100%), Person=EMPTY (14796; 100%), Tense=Past (14796; 100%), VerbForm=Part (14795; 100%), Number=Sing (11141; 75%).
VERB tokens may have the following values of Gender:
Fem (4126; 28% of non-empty Gender): _, Basée, Devenue, Décidée, Emises, Fixée, Lancée, Liée, Née, Partie
Masc (10670; 72% of non-empty Gender): _, Interrogé, Exprimés, Né, Réuni, Réunis, Entré, Mis, Nommé, Passé
EMPTY (32917): _, Reste, Est, Peut, Voilà, faut, Lire, Notons, Ajoutons, Donnant
NUM
11279 NUM tokens (63% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (11261; 100%), Number=Plur (5863; 52%).
NUM tokens may have the following values of Gender:
Fem (4114; 36% of non-empty Gender): _, Deux, 1992, Quatre, Trois, 1993, Huit, 1991, Sept, 1989
Masc (7165; 64% of non-empty Gender): _, Deux, Trois, Cinq, 4, Dix, Quatre, 27, Sept, 12
EMPTY (6518): _, Cent, Quarante, Vingt, Dix, Deux, Sept, Soixante, 1, 24
AUX
1104 AUX tokens (9% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (1104; 100%), Person=EMPTY (1104; 100%), Tense=Past (1104; 100%), VerbForm=Part (1104; 100%), Number=Sing (1103; 100%).
AUX tokens may have the following values of Gender:
Fem (2; 0% of non-empty Gender): _
Masc (1102; 100% of non-empty Gender): _
EMPTY (11765): _, Peut, Ayant, Avez, Avoir, Est, Peuvent, Seront, Sont, A
ADP
89 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Fem (20; 22% of non-empty Gender): _
Masc (69; 78% of non-empty Gender): _, À
EMPTY (92507): _, en, A, Pour, à, dans, de, d’, après, avec
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (65141; 90%),
NOUN –[amod]–> ADJ (22840; 98%),
NOUN –[nmod]–> NOUN (17148; 52%),
NOUN –[nummod]–> NUM (7737; 90%),
PROPN –[det]–> DET (3980; 79%),
NOUN –[acl]–> VERB (3905; 67%),
NOUN –[conj]–> NOUN (3532; 60%),
NOUN –[fixed]–> ADJ (2885; 68%),
NOUN –[flat:name]–> PROPN (2323; 95%),
PROPN –[flat:name]–> PROPN (1370; 93%).
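Agreement counts like those above can be gathered by walking each dependency edge and comparing the Gender of parent and child. The Python sketch below illustrates this under one stated assumption: the percentage is read as the share of agreeing pairs among pairs of that type where both nodes have a Gender value. The page does not spell this out, so treat it as a guess. The function name `gender_agreement`, the token dictionaries and the sample sentence are all hypothetical.

```python
from collections import Counter


def gender_agreement(sentence):
    """sentence: list of dicts with keys id, upos, gender (or None), head, deprel.

    Returns {(parent_upos, deprel, child_upos): (agreeing, comparable)},
    where 'comparable' counts edges whose two ends both have a Gender value.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, comparable = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root has head 0, which is not a token
        if child["gender"] is None or parent["gender"] is None:
            continue  # nothing to compare
        key = (parent["upos"], child["deprel"], child["upos"])
        comparable[key] += 1
        if child["gender"] == parent["gender"]:
            agree[key] += 1
    return {key: (agree[key], comparable[key]) for key in comparable}


# Hypothetical sentence "la petite fille du voisin dort", simplified:
# the contraction "du" is left out and only its noun is kept.
sentence = [
    {"id": 1, "upos": "DET", "gender": "Fem", "head": 3, "deprel": "det"},
    {"id": 2, "upos": "ADJ", "gender": "Fem", "head": 3, "deprel": "amod"},
    {"id": 3, "upos": "NOUN", "gender": "Fem", "head": 5, "deprel": "nsubj"},
    {"id": 4, "upos": "NOUN", "gender": "Masc", "head": 3, "deprel": "nmod"},
    {"id": 5, "upos": "VERB", "gender": None, "head": 0, "deprel": "root"},
]

stats = gender_agreement(sentence)
for (parent, rel, child), (ok, seen) in sorted(stats.items()):
    print(f"{parent} -[{rel}]-> {child}: {ok} of {seen} agree")
```

Under this reading, a low percentage such as the one for NOUN –[nmod]–> NOUN is expected, since a noun modifier has its own lexical gender and is not required to match its head.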