Treebank Statistics: UD_Spanish-COSER: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
2052 tokens (25%) have a non-empty value of Gender.
743 types (49%) occur at least once with a non-empty value of Gender.
602 lemmas (58%) occur at least once with a non-empty value of Gender.
The feature is used with 9 part-of-speech tags: NOUN (886; 11% instances), DET (692; 9% instances), PRON (292; 4% instances), ADJ (127; 2% instances), VERB (46; 1% instances), AUX (4; 0% instances), NUM (3; 0% instances), PROPN (1; 0% instances), X (1; 0% instances).
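The counts above (tokens, types, and lemmas with a non-empty Gender, plus the breakdown by part-of-speech tag) can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that produced this page: the helper names `parse_feats`, `read_tokens`, and `gender_summary` are hypothetical, the four-token sample is invented for illustration, and the assumption that types are counted as case-sensitive word forms may differ from what the page generator does.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_summary(conllu_text):
    """Count tokens, types, and lemmas that carry a non-empty Gender value."""
    tokens = list(read_tokens(conllu_text))
    with_gender = [t for t in tokens if "Gender" in t[3]]
    return {
        "tokens": len(with_gender),
        "token_share": len(with_gender) / len(tokens) if tokens else 0.0,
        "types": len({form for form, _, _, _ in with_gender}),
        "lemmas": len({lemma for _, lemma, _, _ in with_gender}),
        "by_upos": Counter(upos for _, _, upos, _ in with_gender),
    }


# Invented sample sentence ("la casa es grande"), not taken from the treebank.
rows = [
    ["1", "la", "el", "DET", "_",
     "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "casa", "casa", "NOUN", "_", "Gender=Fem|Number=Sing", "4", "nsubj", "_", "_"],
    ["3", "es", "ser", "AUX", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "4", "cop", "_", "_"],
    ["4", "grande", "grande", "ADJ", "_", "Number=Sing", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in rows)

summary = gender_summary(SAMPLE)
print(summary)
```

In the sample, only `la` and `casa` carry Gender, so half of the tokens are counted; `grande` has no Gender value, in line with its appearance in the EMPTY list for ADJ.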
NOUN
886 NOUN tokens (96% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (648; 73%).
NOUN tokens may have the following values of Gender:
Fem (430; 49% of non-empty Gender): gente, casa, cosas, cosa, leche, madre, misa, vez, agua, tierra
Masc (456; 51% of non-empty Gender): años, días, día, pueblo, hijos, marido, ejemplo, aceite, año, cerdo
EMPTY (41): mejor, fin, vez, mano, pesetas, Avena, Tal, abril, agosto, año
| Paradigm hijo | Masc | Fem |
|---|---|---|
| Number=Sing | hijo | hija |
| Number=Plur | hijos | |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (441) occur with only one value of Gender.
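The "lexical feature" judgement rests on how many lemmas are seen with a single Gender value. A rough sketch of that calculation follows; the function name `single_value_lemma_share` and the observations are hypothetical, and it assumes that only non-empty Gender values are counted, which this page does not state explicitly.

```python
from collections import defaultdict


def single_value_lemma_share(pairs):
    """Return (single, total, share) for lemmas seen with exactly one Gender value.

    pairs: iterable of (lemma, gender_value) for tokens with a non-empty Gender.
    """
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    total = len(values)
    share = single / total if total else 0.0
    return single, total, share


# Invented observations, loosely modelled on the hijo/hija paradigm above.
pairs = [
    ("hijo", "Masc"), ("hijo", "Fem"), ("hijo", "Masc"),
    ("casa", "Fem"), ("pueblo", "Masc"), ("gente", "Fem"),
]
single, total, share = single_value_lemma_share(pairs)
print(f"{single} of {total} lemmas ({share:.0%}) occur with only one value of Gender")
```

A lemma such as hijo, which surfaces as both hijo and hija, is the kind of case that keeps the share below 100%.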
DET
692 DET tokens (94% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (562; 81%), Number=Sing (533; 77%), Definite=Def (425; 61%).
DET tokens may have the following values of Gender:
Fem (350; 51% of non-empty Gender): la, una, las, mucha, otra, esta, unas, esas, estas, esa
Masc (342; 49% of non-empty Gender): el, un, los, todo, unos, otro, mucho, to, todos, muchos
EMPTY (47): mi, su, cada, sus, bastantes, mis, qué, tus
| Paradigm el | Masc | Fem |
|---|---|---|
| Number=Sing | el, l | la |
| Number=Plur | los | las |
PRON
292 PRON tokens (32% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (292; 100%), Number=Sing (198; 68%), PronType=Prs (187; 64%), Person=3 (176; 60%), PrepCase=Npr (157; 54%), Case=Acc (155; 53%).
PRON tokens may have the following values of Gender:
Fem (68; 23% of non-empty Gender): la, las, ellas, una, ella, otra, esa, esas, esta, estas
Masc (224; 77% of non-empty Gender): lo, los, todo, todos, nosotros, otro, uno, él, otros, ese
EMPTY (612): se, yo, que, eso, le, te, me, cómo, qué, nos
| Paradigm él | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing | él, ello | ella |
| Case=Acc,Nom\|Number=Plur | ellos | ellas |
| Case=Acc\|Number=Sing\|PrepCase=Npr | lo | la |
| Case=Acc\|Number=Plur\|PrepCase=Npr | los | las |
ADJ
127 ADJ tokens (73% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (104; 82%), VerbForm=EMPTY (100; 79%).
ADJ tokens may have the following values of Gender:
Fem (41; 32% of non-empty Gender): buena, apretás, pequeña, sola, viejas, ancha, baja, blanca, buenas, cerrás
Masc (86; 68% of non-empty Gender): buen, bueno, criollo, mismo, Enrazao, alto, bonito, duro, enterrao, espesito
EMPTY (46): grande, natural, diferente, joven, igual, mayor, Fenomenal, acotumago, amables, buen
| Paradigm primero | Masc | Fem |
|---|---|---|
| | | primera |
| NumType=Ord | primer, primero | |
Gender seems to be a lexical feature of ADJ. 97% of lemmas (88) occur with only one value of Gender.
VERB
46 VERB tokens (5% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (46; 100%), Number=Sing (46; 100%), Person=EMPTY (46; 100%), Tense=Past (46; 100%), VerbForm=Part (46; 100%).
VERB tokens may have the following values of Gender:
Masc (46; 100% of non-empty Gender): dicho, hecho, trabajao, puesto, visto, aprendido, cambiao, comprao, conocido, criao
EMPTY (932): había, hay, hace, hacer, digo, tenía, está, hacía, tiene, va
Gender seems to be a lexical feature of VERB. 100% of lemmas (36) occur with only one value of Gender.
AUX
4 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (4; 100%), Number=Sing (4; 100%), Person=EMPTY (4; 100%), Tense=Past (4; 100%), VerbForm=Part (4; 100%).
AUX tokens may have the following values of Gender:
Masc (4; 100% of non-empty Gender): sío, estado, sido
EMPTY (259): es, era, ha, he, sea, eran, hemos, son, está, estaba
NUM
3 NUM tokens (3% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (3; 100%), Number=Plur (2; 67%).
NUM tokens may have the following values of Gender:
Fem (1; 33% of non-empty Gender): media
Masc (2; 67% of non-empty Gender): doscientos
EMPTY (88): dos, cinco, cuarenta, cuatro, siete, tres, ocho, cincuenta, diez, catorce
PROPN
1 PROPN token (1% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): Munguí
EMPTY (67): NP, Pamplona, España, Madrid, San, virgen, Alemania, Ana, Arroa, Bolívar
X
1 X token (4% of all X tokens) has a non-empty value of Gender.
X tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): pun
EMPTY (26): anim, cuan, A, Apar, Re, cam, champ, co, cor, cos
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (548; 90%),
NOUN –[amod]–> ADJ (42; 68%),
NOUN –[nmod]–> NOUN (38; 54%),
DET –[det]–> DET (15; 100%),
ADJ –[det]–> DET (14; 88%),
PRON –[det]–> DET (10; 53%),
PRON –[reparandum]–> PRON (9; 100%),
NOUN –[obl]–> NOUN (8; 57%),
NOUN –[reparandum]–> NOUN (7; 100%),
DET –[reparandum]–> DET (6; 55%).
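One way to read each entry above is as a count of parent-child pairs that share a Gender value, with a percentage relative to all pairs of that type. The sketch below illustrates that reading under two assumptions that this page does not confirm: that only pairs where both nodes have a non-empty Gender enter the denominator, and that relations are keyed by parent tag, relation label, and child tag. The function name `gender_agreement` and the sample sentence are hypothetical.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count Gender agreement per (parent UPOS, deprel, child UPOS).

    sentence: list of dicts with keys id, upos, gender (or None), head, deprel.
    Returns two Counters: agreeing pairs and all pairs with Gender on both nodes.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, total


# Invented, simplified sentence: "la casa blanca (del) pueblo".
sentence = [
    {"id": 1, "upos": "DET", "gender": "Fem", "head": 2, "deprel": "det"},
    {"id": 2, "upos": "NOUN", "gender": "Fem", "head": 0, "deprel": "root"},
    {"id": 3, "upos": "ADJ", "gender": "Fem", "head": 2, "deprel": "amod"},
    {"id": 4, "upos": "NOUN", "gender": "Masc", "head": 2, "deprel": "nmod"},
]

agree, total = gender_agreement(sentence)
for (parent, rel, child), n in total.most_common():
    share = agree[(parent, rel, child)] / n
    print(f"{parent} --[{rel}]--> {child} ({agree[(parent, rel, child)]}; {share:.0%})")
```

Under this reading, a low percentage for a relation such as nmod is expected, since a nominal modifier has its own inherent gender and does not agree with its head.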