Treebank Statistics: UD_Spanish-AnCora: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
204131 tokens (36%) have a non-empty value of Gender.
17341 types (45%) occur at least once with a non-empty value of Gender.
11608 lemmas (45%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (89250; 16% instances), DET (78760; 14% instances), ADJ (24709; 4% instances), PRON (5858; 1% instances), VERB (4775; 1% instances), AUX (480; 0% instances), NUM (290; 0% instances), PROPN (9; 0% instances).
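Per-tag counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` sentence, `parse_feats` and `gender_counts` are hypothetical names introduced for illustration, and the code assumes the standard ten-column CoNLL-U layout with UPOS in column 4 and FEATS in column 6.

```python
from collections import Counter

# Tiny hypothetical CoNLL-U sample (not taken from the treebank).
SAMPLE = """\
1\tla\tel\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tgrande\tgrande\tADJ\t_\tNumber=Sing\t2\tamod\t_\t_
"""

def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_counts(conllu_text):
    """Count tokens per UPOS tag, overall and with a non-empty Gender."""
    total, with_gender = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return total, with_gender

total, with_gender = gender_counts(SAMPLE)
for upos in total:
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]} of {total[upos]} ({share:.0f}%)")
```

To run it on real data, read the treebank file into a string and pass it to `gender_counts` in place of `SAMPLE`.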
NOUN
89250 NOUN tokens (88% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (62388; 70%).
NOUN tokens may have the following values of Gender:
Fem (41348; 46% of non-empty Gender): pesetas, personas, parte, vida, situación, vez, forma, elecciones, empresa, decisión
Masc (47902; 54% of non-empty Gender): años, gobierno, presidente, millones, equipo, partido, país, año, ministro, mundo
EMPTY (11778): parte, frente, portavoz, líder, respecto, vez, pese, policía, año, partir
| Paradigm candidato | Masc | Fem |
|---|---|---|
| Number=Sing | candidato | |
| Number=Plur | candidatos | CANDIDATAS |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (7755) occur with only one value of Gender.
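The "lexical feature" test above amounts to checking how many lemmas are ever seen with more than one Gender value. A rough sketch of that check follows; the `observations` list is invented illustration data, and `single_gender_share` is a hypothetical helper, not part of any UD tool.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender) observations for NOUN tokens; not treebank data.
observations = [
    ("gobierno", "Masc"),
    ("gobierno", "Masc"),
    ("vida", "Fem"),
    ("candidato", "Masc"),
    ("candidato", "Fem"),
]

def single_gender_share(pairs):
    """Return (count of lemmas with exactly one Gender value, their share)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)

single, share = single_gender_share(observations)
print(f"{share:.0%} of lemmas ({single}) occur with only one value of Gender")
```

A lemma such as candidato, which appears with both Masc and Fem forms, counts against the share; a high share suggests the feature is fixed per lemma rather than inflectional.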
DET
78760 DET tokens (93% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (71584; 91%), Number=Sing (62068; 79%), Definite=Def (62012; 79%).
DET tokens may have the following values of Gender:
Fem (32385; 41% of non-empty Gender): la, las, una, esta, esa, todas, estas, otras, toda, otra
Masc (46375; 59% of non-empty Gender): el, los, un, este, todo, ese, todos, otros, estos, unos
EMPTY (5674): su, sus, cada, mi, cualquier, qué, tal, mis, diferentes, tu
| Paradigm el | Masc | Fem |
|---|---|---|
| Definite=Def\|ExtPos=ADV\|Number=Sing\|PronType=Art | la | |
| Definite=Def\|ExtPos=SCONJ\|Number=Sing\|PronType=Art | el | |
| Definite=Def\|Foreign=Yes\|Number=Sing\|PronType=Art | la | |
| Definite=Def\|Foreign=Yes\|Number=Plur\|PronType=Art | les | les |
| Definite=Def\|Number=Sing\|PronType=Art | el | la |
| Definite=Def\|Number=Plur\|PronType=Art | los, els | las |
| Number=Sing\|PronType=Dem | el | la |
| Number=Plur\|PronType=Dem | los | las |
ADJ
24709 ADJ tokens (67% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (18129; 73%), Number=Sing (17769; 72%).
ADJ tokens may have the following values of Gender:
Fem (10307; 42% of non-empty Gender): primera, nueva, segunda, política, española, última, nuevas, única, buena, pública
Masc (14402; 58% of non-empty Gender): pasado, primer, nuevo, próximo, últimos, español, segundo, último, San, único
EMPTY (12164): gran, mayor, mejor, general, ex, posible, grandes, social, actual, electoral
| Paradigm primero | Masc | Fem |
|---|---|---|
| Number=Sing | primer, primero | primera |
| Number=Plur | primeros | primeras |
PRON
5858 PRON tokens (23% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (5857; 100%), Number=Sing (4408; 75%), Person=3 (3444; 59%), PronType=Prs (3344; 57%), PrepCase=EMPTY (3222; 55%).
PRON tokens may have the following values of Gender:
Fem (1191; 20% of non-empty Gender): la, una, ella, las, ellas, otra, ésta, unas, otras, algunas
Masc (4667; 80% of non-empty Gender): lo, uno, todo, él, ellos, ello, unos, los, otros, todos
EMPTY (19449): que, se, le, me, nos, quien, les, eso, nada, qué
| Paradigm él | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing\|PronType=Prs | él, ello | ella |
| Case=Acc,Nom\|Number=Plur\|PronType=Prs | ellos | ellas |
| Case=Acc\|Definite=Def\|Number=Sing\|PrepCase=Npr\|PronType=Prs | lo | |
| Case=Acc\|Definite=Ind\|Number=Sing\|PrepCase=Npr\|PronType=Prs | LO | |
| Case=Acc\|ExtPos=ADV\|Number=Sing\|PrepCase=Npr\|PronType=Prs | lo | |
| Case=Acc\|ExtPos=CCONJ\|Number=Sing\|PrepCase=Npr\|PronType=Prs | lo | |
| Case=Acc\|Number=Sing\|PrepCase=Npr\|PronType=Dem | lo | |
| Case=Acc\|Number=Sing\|PrepCase=Npr\|PronType=Prs | lo | la |
| Case=Acc\|Number=Plur\|PrepCase=Npr\|PronType=Prs | los | las |
| Case=Nom\|Number=Sing\|PronType=Prs | | Ella |
VERB
4775 VERB tokens (10% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (4774; 100%), Person=EMPTY (4774; 100%), VerbForm=Part (4774; 100%), Tense=Past (4772; 100%), Number=Sing (4455; 93%).
VERB tokens may have the following values of Gender:
Fem (333; 7% of non-empty Gender): aprobada, considerada, dada, utilizada, comprada, dadas, incluida, rechazada, recibida, violada
Masc (4442; 93% of non-empty Gender): hecho, dado, tenido, visto, conseguido, pasado, ganado, llegado, perdido, logrado
EMPTY (43406): tiene, dijo, hay, hace, hacer, tienen, aseguró, dar, explicó, tener
| Paradigm hacer | Masc | Fem |
|---|---|---|
| Number=Sing | hecho | hecha |
| Number=Plur | hechos | |
AUX
480 AUX tokens (4% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (480; 100%), Number=Sing (480; 100%), Person=EMPTY (480; 100%), Tense=Past (480; 100%), VerbForm=Part (480; 100%).
AUX tokens may have the following values of Gender:
Masc (480; 100% of non-empty Gender): sido, podido, estado, debido
EMPTY (13091): es, ha, han, fue, ser, son, está, puede, había, era
NUM
290 NUM tokens (3% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (290; 100%), NumForm=Word (289; 100%), Number=Plur (176; 61%).
NUM tokens may have the following values of Gender:
Fem (93; 32% of non-empty Gender): ambas, media, una, DECENAS, quinientas
Masc (197; 68% of non-empty Gender): ambos, medio, un, doscientos, uno, miles, quinientos, dois, euros, ochenta
EMPTY (8885): dos, ciento, tres, cinco, cuatro, seis, 20, 30, siete, 10
| Paradigm ambos | Masc | Fem |
|---|---|---|
| | ambos | ambas |
PROPN
9 PROPN tokens (0% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (2; 22% of non-empty Gender): Cuba, Lletres
Masc (7; 78% of non-empty Gender): Santos
EMPTY (41316): España, Madrid, Barcelona, José, Estado, PP, Juan, Nacional, Estados, Aznar
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (58636; 86%),
NOUN –[amod]–> ADJ (17026; 63%),
NOUN –[conj]–> NOUN (2535; 54%),
NOUN –[appos]–> NOUN (928; 51%),
ADJ –[det]–> DET (732; 66%),
ADJ –[nsubj]–> NOUN (613; 57%),
ADJ –[conj]–> ADJ (570; 56%),
PRON –[nmod]–> NOUN (444; 73%),
NOUN –[nmod]–> DET (189; 94%),
ADJ –[det]–> PRON (167; 63%).
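
Agreement counts like those above can be approximated by comparing the Gender of each node with that of its parent. The sketch below is one possible reading, not the page's actual method: it assumes the percentage is taken over all instances of a given parent/relation/child combination (the page does not state the denominator), and the `tokens` list and `gender_agreement` function are hypothetical.

```python
from collections import Counter

# Hypothetical sentence: (id, upos, gender or None, head id, deprel).
tokens = [
    (1, "DET", "Fem", 2, "det"),
    (2, "NOUN", "Fem", 0, "root"),
    (3, "ADJ", "Fem", 2, "amod"),
    (4, "ADJ", "Masc", 2, "amod"),
    (5, "ADJ", None, 2, "amod"),
]

def gender_agreement(sentence):
    """Per (parent UPOS, deprel, child UPOS): count all pairs, and the pairs
    where both nodes carry Gender and the two values are equal."""
    by_id = {tok[0]: tok for tok in sentence}
    total, agree = Counter(), Counter()
    for _tid, upos, gender, head, deprel in sentence:
        parent = by_id.get(head)
        if parent is None:
            continue  # the root has head 0 and therefore no parent node
        key = (parent[1], deprel, upos)
        total[key] += 1
        # Assumed definition of agreement: both values present and identical.
        if gender is not None and gender == parent[2]:
            agree[key] += 1
    return total, agree

total, agree = gender_agreement(tokens)
for key, n in total.items():
    parent_pos, deprel, child_pos = key
    print(f"{parent_pos} -[{deprel}]-> {child_pos} ({agree[key]}; {agree[key] / n:.0%})")
```

If the intended denominator is instead only the pairs where both nodes have a Gender value, move the `total[key] += 1` line under a check that both values are present.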