Treebank Statistics: UD_Spanish-PUD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
11660 tokens (50%) have a non-empty value of Gender.
4255 types (72%) occur at least once with a non-empty value of Gender.
3457 lemmas (77%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (4717; 20% of instances), DET (3330; 14% of instances), ADJ (1452; 6% of instances), PROPN (709; 3% of instances), PRON (613; 3% of instances), NUM (430; 2% of instances), VERB (394; 2% of instances), AUX (15; 0% of instances).
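As an illustration of how per-tag counts like these could be derived, here is a minimal sketch that tallies Gender values by part-of-speech tag from CoNLL-U text (the format in which UD treebanks are distributed). The sample sentence is hand-made, and the names `parse_feats` and `gender_by_upos` are hypothetical helpers, not part of any UD tool.

```python
from collections import Counter, defaultdict

# Tiny hand-made CoNLL-U fragment (illustrative only, not taken from the treebank).
SAMPLE = """\
# text = la ciudad nueva dijo
1\tla\tel\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tciudad\tciudad\tNOUN\t_\tGender=Fem|Number=Sing\t4\tnsubj\t_\t_
3\tnueva\tnuevo\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_
4\tdijo\tdecir\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin\t0\troot\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Count Gender values (or 'EMPTY') for each UPOS tag."""
    counts = defaultdict(Counter)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        upos = cols[3]
        gender = parse_feats(cols[5]).get("Gender", "EMPTY")
        counts[upos][gender] += 1
    return counts


counts = gender_by_upos(SAMPLE)
for upos, by_value in sorted(counts.items()):
    print(upos, dict(by_value))
```

Tokens without a Gender feature are grouped under `EMPTY`, mirroring the label used in the value lists on this page.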
NOUN
4717 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3347; 71%).
NOUN tokens may have the following values of Gender:
Fem (1998; 42% of non-empty Gender): guerra, parte, ciudad, vez, personas, historia, región, mayoría, vida, veces
Masc (2719; 58% of non-empty Gender): años, año, lugar, gobierno, estado, millones, día, embargo, mar, mundo
EMPTY (100): internet, arte, Bank, GCA, Ground, News, North, Street, cápita, estudiantes
| Paradigm todo | Masc | Fem |
|---|---|---|
| Number=Sing | todo | |
| Number=Plur | todos | todas |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (1842) occur with only one value of Gender.
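The lexical-feature observation rests on counting how many lemmas are seen with a single Gender value. A minimal sketch of that count follows; the function name `single_gender_share` and the sample tokens are hypothetical, and whether the figure in parentheses above is the number of single-valued lemmas or the total is not stated in the source.

```python
from collections import defaultdict


def single_gender_share(tokens):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen).

    `tokens` is an iterable of (lemma, gender) pairs, restricted to tokens
    whose Gender is non-empty.
    """
    values = defaultdict(set)
    for lemma, gender in tokens:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hand-made sample: 'estudiante' occurs with both values, the others with one.
noun_tokens = [
    ("ciudad", "Fem"),
    ("ciudad", "Fem"),
    ("año", "Masc"),
    ("estudiante", "Masc"),
    ("estudiante", "Fem"),
]

single, total = single_gender_share(noun_tokens)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```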
DET
3330 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (2984; 90%), Number=Sing (2532; 76%), Definite=Def (2528; 76%).
DET tokens may have the following values of Gender:
Fem (1344; 40% of non-empty Gender): la, las, una, esta, muchas, todas, otra, cada, varias, estas
Masc (1986; 60% of non-empty Gender): el, los, un, este, esto, ese, cada, muchos, eso, estos
EMPTY (8): The, a, That
| Paradigm el | Masc | Fem |
|---|---|---|
| Number=Sing | el | la |
| Number=Plur | los | las |
ADJ
1452 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1014; 70%).
ADJ tokens may have the following values of Gender:
Fem (627; 43% of non-empty Gender): primera, nueva, británica, gran, mayor, segunda, nuevas, americana, nacional, propia
Masc (825; 57% of non-empty Gender): gran, primer, últimos, nuevos, Unidos, grandes, mayor, nacional, Unido, mismo
EMPTY (28): Gran, American, Associated, Golden, Metropolitan, Shaky, Stranger, Talking, Wild, austro
| Paradigm nuevo | Masc | Fem |
|---|---|---|
| Number=Sing | nuevo | nueva |
| Number=Plur | nuevos, nuevo | nuevas |
PROPN
709 PROPN tokens (57% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (684; 96%).
PROPN tokens may have the following values of Gender:
Fem (173; 24% of non-empty Gender): Clinton, BBC, Kesha, Rona, luna, Blunt, Guinea, Jasmine, UE, Anaya
Masc (536; 76% of non-empty Gender): C., Trump, mediterráneo, EUA, C, Donald, Caribe, Joseph, Rafferty, Andes
EMPTY (544): China, Europa, Hong, Kong, Australia, Italia, Pekín, Albania, Bretaña, España
| Paradigm Trump | Masc | Fem |
|---|---|---|
| | Trump | Trump |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (512) occur with only one value of Gender.
PRON
613 PRON tokens (59% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (613; 100%), PrepCase=EMPTY (515; 84%), Case=EMPTY (480; 78%), Number=Sing (466; 76%), Poss=EMPTY (387; 63%), PronType=Prs (362; 59%), Person=3 (361; 59%).
PRON tokens may have the following values of Gender:
Fem (238; 39% of non-empty Gender): su, que, sus, ella, la, cual, cuales, una, Her, las
Masc (375; 61% of non-empty Gender): que, lo, su, sus, ellos, él, cual, los, cuales, ello
EMPTY (426): se, le, me, les, nos, quien, yo, cuál, qué, que
| Paradigm él | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing\|Person=3 | él, ello | ella |
| Case=Acc,Nom\|Number=Plur\|Person=3 | ellos | |
| Case=Acc\|Number=Sing\|Person=3\|PrepCase=Npr | lo | la |
| Case=Acc\|Number=Plur\|Person=3\|PrepCase=Npr | los | las |
| Number=Plur\|Person=3 | los | |
| Number=Plur | los |
NUM
430 NUM tokens (99% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (430; 100%), NumForm=Digit (313; 73%).
NUM tokens may have the following values of Gender:
Fem (55; 13% of non-empty Gender): dos, tres, 10, 760, cuatro, diez, ocho, 10.000, 12.000, 125
Masc (375; 87% of non-empty Gender): dos, 1, 10, 3, mil, tres, seis, 70, cuatro, 100
EMPTY (5): Cuatro, Dos, Five, Nine
| Paradigm dos | Masc | Fem |
|---|---|---|
| | dos | dos |
Gender seems to be a lexical feature of NUM. 93% of lemmas (211) occur with only one value of Gender.
VERB
394 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (394; 100%), Person=EMPTY (394; 100%), Tense=Past (383; 97%), VerbForm=Part (383; 97%), Number=Sing (309; 78%).
VERB tokens may have the following values of Gender:
Fem (88; 22% of non-empty Gender): dirigida, consideradas, coprotagonizada, derrotada, destruida, dividida, formada, llamada, localizadas, perdidas
Masc (306; 78% of non-empty Gender): debido, hecho, tenido, dado, dejado, visto, acusado, declarado, desarrollado, dicho
EMPTY (1871): dijo, tiene, es, hacer, hay, hace, tener, está, tienen, ver
| Paradigm conocer | Masc | Fem |
|---|---|---|
| Number=Sing | conocido | conocida |
| Number=Plur | | conocidas |
AUX
15 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (15; 100%), Number=Sing (15; 100%), Person=EMPTY (15; 100%), Tense=Past (15; 100%), VerbForm=Part (15; 100%).
AUX tokens may have the following values of Gender:
Masc (15; 100% of non-empty Gender): sido, estado
EMPTY (619): es, fue, ha, había, está, era, puede, son, ser, fueron
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (2960; 100%),
NOUN –[amod]–> ADJ (1237; 100%),
NOUN –[nmod]–> NOUN (638; 50%),
NOUN –[det]–> PRON (236; 100%),
NOUN –[conj]–> NOUN (167; 64%),
NOUN –[nummod]–> NUM (162; 100%),
PROPN –[flat:name]–> PROPN (137; 98%),
PROPN –[det]–> DET (107; 96%),
NOUN –[appos]–> PROPN (91; 64%),
NOUN –[acl]–> VERB (85; 83%).
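The agreement counts above can be approximated with a sketch like the following, which walks the parent-child pairs of a dependency tree and compares their Gender values. It assumes that the percentage is the share of agreeing pairs among pairs where both nodes carry a Gender value; the source does not state the exact denominator. The function `gender_agreement` and the sample sentence are hypothetical.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count agreeing and total parent-child pairs per relation pattern.

    `sentence` is a list of dicts with keys id, head, upos, deprel and
    gender (None when the token has no Gender feature). Pairs where either
    node lacks Gender are ignored (an assumption, see the text above).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, total


# Hand-made example: "la ciudad nueva del gobierno" (simplified tokenization).
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 4, "head": 2, "upos": "NOUN", "deprel": "nmod", "gender": "Masc"},
]

agree, total = gender_agreement(sentence)
for key in total:
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos}: {agree[key]} of {total[key]}")
```

In this toy sentence the determiner and adjective agree with their head noun, while the `nmod` dependent does not, which matches the general pattern in the list: `det` and `amod` agree almost always, `nmod` only about half the time.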