Treebank Statistics: UD_Spanish-PUD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
11661 tokens (50%) have a non-empty value of Gender.
4255 types (72%) occur at least once with a non-empty value of Gender.
3457 lemmas (77%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (4717; 20% instances), DET (3331; 14% instances), ADJ (1452; 6% instances), PROPN (709; 3% instances), PRON (613; 3% instances), NUM (430; 2% instances), VERB (394; 2% instances), AUX (15; 0% instances).
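Counts like the per-tag figures above can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names (`parse_feats`, `iter_words`, `gender_by_upos`) and the three-word sample are hypothetical, and it assumes that multiword-token ranges and empty nodes are excluded from the counts.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def iter_words(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each syntactic word."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        # Assumption: skip multiword-token ranges ("3-4") and empty nodes ("3.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_by_upos(conllu_text):
    """Return (tokens with Gender per UPOS, all tokens per UPOS)."""
    with_gender, total = Counter(), Counter()
    for _form, _lemma, upos, feats in iter_words(conllu_text):
        total[upos] += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return with_gender, total


# Hypothetical toy sentence ("la guerra terminó"), built with real tab separators.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "la", "el", "DET", "_",
     "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "guerra", "guerra", "NOUN", "_",
     "Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"],
    ["3", "terminó", "terminar", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin", "0", "root", "_", "_"],
])

with_gender, total = gender_by_upos(SAMPLE)
for upos in sorted(total):
    print(upos, with_gender[upos], "of", total[upos])
```

To run it on a real treebank, read the `.conllu` file into a string and pass it to `gender_by_upos` in place of `SAMPLE`.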
NOUN
4717 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3347; 71%).
NOUN tokens may have the following values of Gender:
Fem (1998; 42% of non-empty Gender): guerra, parte, ciudad, vez, personas, historia, región, mayoría, vida, veces
Masc (2719; 58% of non-empty Gender): años, año, lugar, gobierno, estado, millones, día, embargo, mar, mundo
EMPTY (98): internet, arte, Bank, GCA, Ground, News, North, Street, estudiantes, inmigrantes
Paradigm todo | Masc | Fem |
---|---|---|
Number=Sing | todo | |
Number=Plur | todos | todas |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (1842) occur with only one value of Gender.
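The "lexical feature" observation rests on how many lemmas of a given tag ever appear with more than one Gender value. A minimal sketch of one plausible way to compute that share follows. The function name `single_value_lemma_share`, the `(lemma, upos, gender)` tuple layout and the toy data are hypothetical, and counting only lemmas that have at least one non-empty value is an assumption, not a documented property of this page.

```python
from collections import defaultdict


def single_value_lemma_share(words, upos):
    """Count lemmas of `upos` that occur with exactly one Gender value.

    `words` is an iterable of (lemma, upos, gender) tuples, where gender is
    "Fem", "Masc" or None when the feature is empty.
    Returns (number of single-value lemmas, their share of all lemmas seen
    with a non-empty Gender).
    """
    values_by_lemma = defaultdict(set)
    for lemma, tag, gender in words:
        if tag == upos and gender is not None:
            values_by_lemma[lemma].add(gender)
    if not values_by_lemma:
        return 0, 0.0
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, single / len(values_by_lemma)


# Hypothetical toy data: "estudiante" is the only NOUN lemma seen with both values.
WORDS = [
    ("guerra", "NOUN", "Fem"),
    ("año", "NOUN", "Masc"),
    ("año", "NOUN", "Masc"),
    ("estudiante", "NOUN", "Masc"),
    ("estudiante", "NOUN", "Fem"),
    ("internet", "NOUN", None),
    ("el", "DET", "Masc"),
    ("el", "DET", "Fem"),
]

count, share = single_value_lemma_share(WORDS, "NOUN")
print(f"{count} NOUN lemmas ({share:.0%}) occur with only one value of Gender")
```

A share close to 100% suggests the value is fixed per lemma, as for nouns; a tag such as DET, where the same lemma inflects for both genders, would score much lower.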
DET
3331 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (2985; 90%), Number=Sing (2533; 76%), Definite=Def (2529; 76%).
DET tokens may have the following values of Gender:
Fem (1344; 40% of non-empty Gender): la, las, una, esta, muchas, todas, otra, cada, varias, estas
Masc (1987; 60% of non-empty Gender): el, los, un, este, esto, ese, cada, muchos, eso, estos
EMPTY (8): The, a, That
Paradigm el | Masc | Fem |
---|---|---|
Number=Sing | el | la |
Number=Plur | los | las |
ADJ
1452 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1014; 70%).
ADJ tokens may have the following values of Gender:
Fem (627; 43% of non-empty Gender): primera, nueva, británica, gran, mayor, segunda, nuevas, americana, nacional, propia
Masc (825; 57% of non-empty Gender): gran, primer, últimos, nuevos, Unidos, grandes, mayor, nacional, Unido, mismo
EMPTY (28): Gran, American, Associated, Golden, Metropolitan, Shaky, Stranger, Talking, Wild, austro
Paradigm nuevo | Masc | Fem |
---|---|---|
Number=Sing | nuevo | nueva |
Number=Plur | nuevos, nuevo | nuevas |
PROPN
709 PROPN tokens (57% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (684; 96%).
PROPN tokens may have the following values of Gender:
Fem (173; 24% of non-empty Gender): Clinton, BBC, Kesha, Rona, luna, Blunt, Guinea, Jasmine, UE, Anaya
Masc (536; 76% of non-empty Gender): C., Trump, mediterráneo, EUA, C, Donald, Caribe, Joseph, Rafferty, Andes
EMPTY (544): China, Europa, Hong, Kong, Australia, Italia, Pekín, Albania, Bretaña, España
| Paradigm Trump | Masc | Fem |
|---|---|---|
| | Trump | Trump |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (512) occur with only one value of Gender.
PRON
613 PRON tokens (59% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (613; 100%), PrepCase=EMPTY (515; 84%), Case=EMPTY (480; 78%), Number=Sing (466; 76%), Poss=EMPTY (387; 63%), PronType=Prs (362; 59%), Person=3 (361; 59%).
PRON tokens may have the following values of Gender:
Fem (238; 39% of non-empty Gender): su, que, sus, ella, la, cual, cuales, una, Her, las
Masc (375; 61% of non-empty Gender): que, lo, su, sus, ellos, él, cual, los, cuales, ello
EMPTY (426): se, le, me, les, nos, quien, yo, cuál, qué, que
Paradigm él | Masc | Fem |
---|---|---|
Case=Acc,Nom|Number=Sing|Person=3 | él, ello | ella |
Case=Acc,Nom|Number=Plur|Person=3 | ellos | |
Case=Acc|Number=Sing|Person=3|PrepCase=Npr | lo | la |
Case=Acc|Number=Plur|Person=3|PrepCase=Npr | los | las |
Number=Plur|Person=3 | los | |
Number=Plur | los |
NUM
430 NUM tokens (99% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (430; 100%), NumForm=Digit (313; 73%).
NUM tokens may have the following values of Gender:
Fem (55; 13% of non-empty Gender): dos, tres, 10, 760, cuatro, diez, ocho, 10.000, 12.000, 125
Masc (375; 87% of non-empty Gender): dos, 1, 10, 3, mil, tres, seis, 70, cuatro, 100
EMPTY (5): Cuatro, Dos, Five, Nine
| Paradigm dos | Masc | Fem |
|---|---|---|
| | dos | dos |
Gender seems to be a lexical feature of NUM. 93% of lemmas (211) occur with only one value of Gender.
VERB
394 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (394; 100%), Person=EMPTY (394; 100%), Tense=Past (383; 97%), VerbForm=Part (383; 97%), Number=Sing (309; 78%).
VERB tokens may have the following values of Gender:
Fem (88; 22% of non-empty Gender): dirigida, consideradas, coprotagonizada, derrotada, destruida, dividida, formada, llamada, localizadas, perdidas
Masc (306; 78% of non-empty Gender): debido, hecho, tenido, dado, dejado, visto, acusado, declarado, desarrollado, dicho
EMPTY (1871): dijo, tiene, es, hacer, hay, hace, tener, está, tienen, ver
Paradigm conocer | Masc | Fem |
---|---|---|
Number=Sing | conocido | conocida |
Number=Plur | | conocidas |
AUX
15 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (15; 100%), Number=Sing (15; 100%), Person=EMPTY (15; 100%), Tense=Past (15; 100%), VerbForm=Part (15; 100%).
AUX tokens may have the following values of Gender:
Masc (15; 100% of non-empty Gender): sido, estado
EMPTY (619): es, fue, ha, había, está, era, puede, son, ser, fueron
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (2960; 100%),
NOUN –[amod]–> ADJ (1237; 100%),
NOUN –[nmod]–> NOUN (638; 50%),
NOUN –[det]–> PRON (236; 100%),
NOUN –[conj]–> NOUN (167; 64%),
NOUN –[nummod]–> NUM (162; 100%),
PROPN –[flat:name]–> PROPN (137; 98%),
PROPN –[det]–> DET (107; 96%),
NOUN –[appos]–> PROPN (91; 64%),
NOUN –[acl]–> VERB (85; 83%).
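Agreement counts of this kind can be sketched by walking each dependency edge and comparing the Gender of parent and child. The code below is an illustration under stated assumptions rather than the page's own method: the function name `gender_agreement`, the dict-based word layout and the toy sentence are hypothetical, and the denominator (edges where both nodes carry a Gender value) is a guess at how the percentages are defined.

```python
from collections import Counter


def gender_agreement(sentences):
    """Count dependency edges whose parent and child share a Gender value.

    Each sentence is a list of dicts with keys "id", "upos", "gender",
    "head" and "deprel"; "gender" is None when the feature is empty.
    Returns (agree, both): both are Counters keyed by
    (parent UPOS, relation, child UPOS).
    """
    agree, both = Counter(), Counter()
    for sentence in sentences:
        by_id = {word["id"]: word for word in sentence}
        for child in sentence:
            parent = by_id.get(child["head"])
            if parent is None:
                continue  # the root has head 0, which is not a word
            if child["gender"] is None or parent["gender"] is None:
                continue  # assumption: only edges with Gender on both ends count
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if child["gender"] == parent["gender"]:
                agree[key] += 1
    return agree, both


# Hypothetical toy sentence: "la guerra nueva".
SENTENCES = [[
    {"id": 1, "upos": "DET", "gender": "Fem", "head": 2, "deprel": "det"},
    {"id": 2, "upos": "NOUN", "gender": "Fem", "head": 0, "deprel": "root"},
    {"id": 3, "upos": "ADJ", "gender": "Fem", "head": 2, "deprel": "amod"},
]]

agree, both = gender_agreement(SENTENCES)
for (parent, rel, child), n in agree.most_common(10):
    print(f"{parent} -[{rel}]-> {child} ({n}; {n / both[(parent, rel, child)]:.0%})")
```

Relations such as det and amod are expected to score near 100% because Spanish determiners and adjectives inflect to match their head noun, whereas nmod and conj link two independently gendered nouns.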