Treebank Statistics: UD_Spanish-PUD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
11489 tokens (49%) have a non-empty value of Gender.
4161 types (70%) occur at least once with a non-empty value of Gender.
3359 lemmas (75%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (4706; 20% of all tokens), DET (3325; 14% of all tokens), ADJ (1288; 6% of all tokens), PROPN (709; 3% of all tokens), PRON (619; 3% of all tokens), NUM (430; 2% of all tokens), VERB (397; 2% of all tokens), AUX (15; 0% of all tokens).
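Counts like the ones above can be reproduced from a CoNLL-U file by tallying, per UPOS tag, the tokens whose FEATS column contains `Gender`. The following is a minimal sketch, not the script that generated this page. The function names and the three-token sample are hypothetical, and skipping multiword-token ranges and empty nodes is an assumption about how tokens are counted.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Return (tokens with Gender per UPOS, all tokens per UPOS)."""
    with_gender, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos, feats = cols[3], parse_feats(cols[5])
        total[upos] += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return with_gender, total


# Hypothetical three-token sample, not copied from the treebank.
ROWS = [
    ("1", "La", "el", "DET", "_",
     "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"),
    ("2", "guerra", "guerra", "NOUN", "_",
     "Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"),
    ("3", "terminó", "terminar", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin", "0", "root", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

with_gender, total = gender_counts_by_upos(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], total[upos])
```

To run it on the real data, read the treebank's `.conllu` file into a string and pass it to `gender_counts_by_upos` in place of `SAMPLE`.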
NOUN
4706 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3338; 71%).
NOUN tokens may have the following values of Gender:
Fem (1997; 42% of non-empty Gender): guerra, parte, ciudad, vez, personas, historia, región, mayoría, vida, veces
Masc (2709; 58% of non-empty Gender): años, año, lugar, gobierno, estado, millones, día, embargo, mar, mundo
EMPTY (101): internet, arte, Bank, GCA, Ground, News, North, Street, cápita, estudiantes
| Paradigm todo | Masc | Fem |
|---|---|---|
| Number=Sing | todo | |
| Number=Plur | todos | todas |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (1837) occur with only one value of Gender.
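The "lexical feature" judgment rests on how many lemmas are seen with exactly one Gender value. A minimal sketch of that check follows. It assumes the input is already reduced to (lemma, Gender) pairs for one part of speech. The function name and the sample pairs are hypothetical, and the threshold the page uses to call a feature lexical is not stated here.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas seen with exactly one Gender value, lemmas seen with any)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical (lemma, Gender) pairs, not counts from the treebank.
PAIRS = [
    ("guerra", "Fem"),
    ("año", "Masc"),
    ("año", "Masc"),
    ("estudiante", "Masc"),
    ("estudiante", "Fem"),
]

single, total_lemmas = single_gender_share(PAIRS)
print(f"{single} of {total_lemmas} lemmas occur with only one value of Gender")
```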
DET
3325 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (2984; 90%), Definite=Def (2528; 76%), Number=Sing (2527; 76%).
DET tokens may have the following values of Gender:
Fem (1342; 40% of non-empty Gender): la, las, una, esta, muchas, todas, otra, cada, varias, estas
Masc (1983; 60% of non-empty Gender): el, los, un, este, esto, ese, cada, muchos, eso, estos
EMPTY (13): The, cualquier, a, That
| Paradigm el | Masc | Fem |
|---|---|---|
| Number=Sing | el | la |
| Number=Plur | los | las |
ADJ
1288 ADJ tokens (87% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (852; 66%).
ADJ tokens may have the following values of Gender:
Fem (564; 44% of non-empty Gender): primera, nueva, británica, mayor, segunda, nuevas, americana, nacional, propia, última
Masc (724; 56% of non-empty Gender): primer, últimos, nuevos, Unidos, grandes, mayor, nacional, nuevo, Unido, mismo
EMPTY (197): gran, posible, siguiente, increíble, estadounidense, importante, probable, agrícola, constante, contundente
| Paradigm nuevo | Masc | Fem |
|---|---|---|
| Number=Sing | nuevo | nueva |
| Number=Plur | nuevos, nuevo | nuevas |
PROPN
709 PROPN tokens (57% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (684; 96%).
PROPN tokens may have the following values of Gender:
Fem (173; 24% of non-empty Gender): Clinton, BBC, Kesha, Rona, luna, Blunt, Guinea, Jasmine, UE, Anaya
Masc (536; 76% of non-empty Gender): C., Trump, mediterráneo, EUA, C, Donald, Caribe, Joseph, Rafferty, Andes
EMPTY (543): China, Europa, Hong, Kong, Australia, Italia, Pekín, Albania, Bretaña, España
| Paradigm Trump | Masc | Fem |
|---|---|---|
| | Trump | Trump |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (512) occur with only one value of Gender.
PRON
619 PRON tokens (59% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (619; 100%), PrepCase=EMPTY (521; 84%), Case=EMPTY (486; 79%), Number=Sing (472; 76%), Poss=EMPTY (393; 63%), PronType=Prs (362; 58%), Person=3 (361; 58%).
PRON tokens may have the following values of Gender:
Fem (238; 38% of non-empty Gender): su, que, sus, ella, la, cual, cuales, una, Her, las
Masc (381; 62% of non-empty Gender): que, lo, su, sus, ellos, él, cual, tanto, los, cuales
EMPTY (427): se, le, me, les, nos, quien, yo, cuál, qué, que
| Paradigm él | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing\|Person=3 | él, ello | ella |
| Case=Acc,Nom\|Number=Plur\|Person=3 | ellos | |
| Case=Acc\|Number=Sing\|Person=3\|PrepCase=Npr | lo | la |
| Case=Acc\|Number=Plur\|Person=3\|PrepCase=Npr | los | las |
| Number=Plur\|Person=3 | los | |
| Number=Plur | los | |
NUM
430 NUM tokens (99% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (430; 100%), NumForm=Digit (313; 73%).
NUM tokens may have the following values of Gender:
Fem (55; 13% of non-empty Gender): dos, tres, 10, 760, cuatro, diez, ocho, 10.000, 12.000, 125
Masc (375; 87% of non-empty Gender): dos, 1, 10, 3, mil, tres, seis, 70, cuatro, 100
EMPTY (5): Cuatro, Dos, Five, Nine
| Paradigm dos | Masc | Fem |
|---|---|---|
| | dos | dos |
Gender seems to be a lexical feature of NUM. 93% of lemmas (211) occur with only one value of Gender.
VERB
397 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (397; 100%), Person=EMPTY (397; 100%), VerbForm=Part (397; 100%), Tense=Past (383; 96%), Number=Sing (312; 79%).
VERB tokens may have the following values of Gender:
Fem (88; 22% of non-empty Gender): dirigida, consideradas, coprotagonizada, derrotada, destruida, dividida, formada, llamada, localizadas, perdidas
Masc (309; 78% of non-empty Gender): debido, hecho, escrito, tenido, dado, dejado, visto, acusado, declarado, desarrollado
EMPTY (1873): dijo, tiene, es, hacer, hay, hace, tener, está, tienen, ver
| Paradigm conocer | Masc | Fem |
|---|---|---|
| Number=Sing | conocido | conocida |
| Number=Plur | | conocidas |
AUX
15 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (15; 100%), Number=Sing (15; 100%), Person=EMPTY (15; 100%), Tense=Past (15; 100%), VerbForm=Part (15; 100%).
AUX tokens may have the following values of Gender:
Masc (15; 100% of non-empty Gender): sido, estado
EMPTY (619): es, fue, ha, había, está, era, puede, son, ser, fueron
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (2954; 99%),
NOUN –[amod]–> ADJ (1104; 89%),
NOUN –[nmod]–> NOUN (639; 50%),
NOUN –[det]–> PRON (236; 100%),
NOUN –[conj]–> NOUN (167; 64%),
NOUN –[nummod]–> NUM (162; 100%),
PROPN –[flat:name]–> PROPN (137; 98%),
PROPN –[det]–> DET (107; 96%),
NOUN –[appos]–> PROPN (91; 64%),
NOUN –[acl]–> VERB (85; 83%).
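An agreement tally of this kind can be sketched by walking every dependency arc and comparing the Gender of parent and child. The sketch below assumes each sentence is a list of dicts with hypothetical keys `id`, `head`, `upos`, `deprel` and `gender`. It also assumes the percentage is taken over arcs where both nodes carry a Gender value, which this page does not state explicitly.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count arcs per (parent UPOS, deprel, child UPOS) where both nodes have Gender.

    Returns (agree, both): arcs whose Gender values match, and all arcs
    where parent and child both have a Gender value.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, both


# Hypothetical sentence fragment "la guerra nueva", not copied from the treebank.
SENTENCE = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
]

agree, both = gender_agreement(SENTENCE)
for (parent_upos, deprel, child_upos), n in agree.items():
    share = 100 * n // both[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {share}%)")
```

Summing the two counters over all sentences of a treebank gives per-relation totals in the same shape as the list above.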