Treebank Statistics: UD_Spanish-GSD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
158460 tokens (37%) have a non-empty value of Gender.
20671 types (45%) occur at least once with a non-empty value of Gender.
14570 lemmas (42%) occur at least once with a non-empty value of Gender.
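As a rough illustration of how token, type, and lemma counts like these could be derived, the sketch below reads CoNLL-U text and tallies everything that carries a non-empty Gender. The function names `parse_feats` and `gender_coverage` and the five-token sample are made up for this example; whether the published figures case-fold word forms or treat multiword tokens differently is not stated here, so the details are assumptions.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_coverage(conllu_text):
    """Count tokens, word forms (types) and lemmas seen with a non-empty Gender."""
    total = with_gender = 0
    types, lemmas = set(), set()
    values = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        # Keep only regular word lines; skip ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        total += 1
        gender = parse_feats(cols[5]).get("Gender")
        if gender:
            with_gender += 1
            types.add(cols[1])   # FORM, kept case-sensitive (an assumption)
            lemmas.add(cols[2])  # LEMMA
            values[gender] += 1
    return {
        "tokens": total,
        "tokens_with_gender": with_gender,
        "types_with_gender": len(types),
        "lemmas_with_gender": len(lemmas),
        "values": dict(values),
    }

# Hypothetical sample sentence, not taken from the treebank.
rows = [
    ("1", "La", "el", "DET", "_", "Gender=Fem|Number=Sing", "2", "det", "_", "_"),
    ("2", "ciudad", "ciudad", "NOUN", "_", "Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"),
    ("3", "tiene", "tener", "VERB", "_", "Number=Sing", "0", "root", "_", "_"),
    ("4", "un", "uno", "DET", "_", "Gender=Masc|Number=Sing", "5", "det", "_", "_"),
    ("5", "río", "río", "NOUN", "_", "Gender=Masc|Number=Sing", "3", "obj", "_", "_"),
]
sample = "\n".join("\t".join(row) for row in rows)
stats = gender_coverage(sample)
print(stats)
```

Dividing `tokens_with_gender` by `tokens` gives a token-level percentage comparable in spirit to the 37% reported above.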
The feature is used with 10 part-of-speech tags: NOUN (70453; 16% instances), DET (56083; 13% instances), ADJ (15416; 4% instances), VERB (7450; 2% instances), PRON (4453; 1% instances), PROPN (3418; 1% instances), X (518; 0% instances), AUX (338; 0% instances), NUM (209; 0% instances), SYM (122; 0% instances).
NOUN
70453 NOUN tokens (91% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (50632; 72%).
NOUN tokens may have the following values of Gender:
Fem (32936; 47% of non-empty Gender): parte, población, ciudad, personas, familia, vez, forma, vida, agua, región
Masc (37517; 53% of non-empty Gender): años, año, municipio, nombre, lugar, equipo, tiempo, estado, grupo, país
EMPTY (7129): habitantes, km, Estado, base, euros, frente, Gobierno, Oficina, mar, arte
Paradigm parte | Masc | Fem |
---|---|---|
Number=Sing | parte | parte |
Number=Plur | partes |
Gender seems to be a lexical feature of NOUN. 97% of lemmas (8767) occur with only one value of Gender.
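The "lexical feature" judgement rests on how many lemmas are ever seen with more than one Gender value. A minimal sketch of that check follows, assuming the observations are available as (lemma, UPOS, Gender) triples; the function name `single_value_share` and the toy data are hypothetical, and the exact denominator used for the published percentage is an assumption.

```python
from collections import defaultdict

def single_value_share(observations, upos):
    """Return (count, share) of lemmas of the given UPOS seen with exactly one Gender."""
    values_by_lemma = defaultdict(set)
    for lemma, tag, gender in observations:
        if tag == upos and gender:  # ignore tokens with an empty Gender
            values_by_lemma[lemma].add(gender)
    if not values_by_lemma:
        return 0, 0.0
    single = sum(1 for vals in values_by_lemma.values() if len(vals) == 1)
    return single, single / len(values_by_lemma)

# Toy observations; "parte" appears with both values, as in the paradigm table.
observations = [
    ("parte", "NOUN", "Fem"),
    ("parte", "NOUN", "Masc"),
    ("ciudad", "NOUN", "Fem"),
    ("año", "NOUN", "Masc"),
    ("año", "NOUN", "Masc"),
    ("primero", "ADJ", "Masc"),
]
count, share = single_value_share(observations, "NOUN")
print(count, round(share, 2))
```

A share close to 1 suggests that Gender is fixed per lemma for that part of speech, while a low share points to an inflectional feature, as with adjectives or determiners.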
DET
56083 DET tokens (92% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (51186; 91%), Number=Sing (44758; 80%), Definite=Def (43530; 78%).
DET tokens may have the following values of Gender:
Fem (23946; 43% of non-empty Gender): la, las, una, esta, otras, toda, estas, esa, todas, otra
Masc (32137; 57% of non-empty Gender): el, los, un, este, otros, ese, estos, todo, todos, unos
EMPTY (4804): su, sus, cada, cualquier, mi, the, tu, qué, mis, a
Paradigm el | Masc | Fem |
---|---|---|
Definite=Def|Number=Sing | el | la, l' |
Definite=Def|Number=Sing|Typo=Yes | al, en | a, al |
Definite=Def|Number=Plur | los | las |
Number=Sing|Typo=Yes | al, en | a |
ADJ
15416 ADJ tokens (62% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (10981; 71%).
ADJ tokens may have the following values of Gender:
Fem (6731; 44% of non-empty Gender): primera, nueva, segunda, buena, francesa, misma, alta, pequeña, propia, nuevas
Masc (8685; 56% of non-empty Gender): primer, mismo, nuevo, junto, segundo, español, buen, propio, primeros, único
EMPTY (9576): gran, mayor, estadounidense, mejor, total, nacional, grandes, principal, importante, diferentes
Paradigm primero | Masc | Fem |
---|---|---|
Number=Sing | primer, primero | primera |
Number=Sing|NumType=Ord | primer, primero | primera |
Number=Plur | primeros | primeras |
Number=Plur|NumType=Ord | primeros | primeras |
VERB
7450 VERB tokens (20% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (7450; 100%), Person=EMPTY (7447; 100%), VerbForm=Part (6783; 91%), Number=Sing (5988; 80%), Tense=EMPTY (4374; 59%).
VERB tokens may have the following values of Gender:
Fem (2272; 30% of non-empty Gender): situada, conocida, ubicada, llamada, dirigida, fundada, publicada, realizada, construida, creada
Masc (5178; 70% of non-empty Gender): ubicado, conocido, debido, llamado, hecho, nacido, dado, compuesto, tenido, lanzado
EMPTY (28915): tiene, es, encuentra, hay, hacer, hace, tenía, tienen, era, fue
Paradigm tener | Masc | Fem |
---|---|---|
Number=Sing|Tense=Past|VerbForm=Part | tenido | |
Number=Sing|VerbForm=Fin | tengo, tuvo | |
Number=Plur|Tense=Past|VerbForm=Part | tenidos | tenidas |
Number=Plur|VerbForm=Fin | tienes | |
VerbForm=Fin | tenéis |
PRON
4453 PRON tokens (32% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (4448; 100%), Number=Sing (3347; 75%), PronType=Prs (2883; 65%), Person=3 (2820; 63%), PrepCase=EMPTY (2269; 51%).
PRON tokens may have the following values of Gender:
Fem (1173; 26% of non-empty Gender): la, una, ella, las, ellas, esta, otra, otras, ésta, muchas
Masc (3280; 74% of non-empty Gender): lo, uno, los, él, todo, ellos, ello, este, otros, otro
EMPTY (9593): se, que, le, me, cual, nos, quien, esto, les, te
Paradigm él | Masc | Fem |
---|---|---|
Case=Acc,Nom|Number=Sing | él, ello | ella |
Case=Acc,Nom|Number=Plur | ellos | ellas |
Case=Acc|Number=Sing|PrepCase=Npr | lo | la |
Case=Acc|Number=Plur|PrepCase=Npr | los | las |
Case=Dat|Number=Sing|PrepCase=Npr|Typo=Yes | la | |
Case=Nom|Number=Sing | él |
PROPN
3418 PROPN tokens (9% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (2936; 86%).
PROPN tokens may have the following values of Gender:
Fem (946; 28% of non-empty Gender): guerra, Europea, Ruta, Isla, española, TV, Aérea, batalla, universidad, Ciencias
Masc (2472; 72% of non-empty Gender): Unidos, Estados, Partido, censo, José, of, Club, Diego, País, río
EMPTY (35821): san, España, Estados, Unidos, madrid, Juan, septiembre, julio, enero, José
Paradigm the | Masc | Fem |
---|---|---|
the | the |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (2029) occur with only one value of Gender.
X
518 X tokens (28% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Number=Sing (413; 80%).
X tokens may have the following values of Gender:
Fem (110; 21% of non-empty Gender): ’s, C, B, cápita, i, pre, semi, ta, C., high
Masc (408; 79% of non-empty Gender): mm, msnm, ‘s, etc., n., of, co, cis, parking, to
EMPTY (1345): ex, hab, ya, ‘s, C, etc., x, C., and, d
Paradigm 's | Masc | Fem |
---|---|---|
_ | 's | 's |
Number=Sing | 's | 's |
Number=Sing|Person=3 | 's |
Gender seems to be a lexical feature of X. 96% of lemmas (369) occur with only one value of Gender.
AUX
338 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (338; 100%), Person=EMPTY (337; 100%), Number=Sing (332; 98%), VerbForm=Part (269; 80%), Tense=Past (268; 79%).
AUX tokens may have the following values of Gender:
Fem (39; 12% of non-empty Gender): esta, estoy, pudieras, estarías, estas, has
Masc (299; 88% of non-empty Gender): sido, estado, ser, podido, este, poder, estar, deber, debido, haber
EMPTY (10409): es, fue, ha, son, ser, eran, era, han, está, puede
Paradigm haber | Masc | Fem |
---|---|---|
Number=Sing | haber, han | |
Number=Plur|Person=3 | has | |
habéis |
NUM
209 NUM tokens (2% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (209; 100%), Number=Sing (177; 85%), NumForm=Word (169; 81%).
NUM tokens may have the following values of Gender:
Fem (75; 36% of non-empty Gender): una, media, II, pocas, I, IV, XI, ocho, setenta, 2008-09
Masc (134; 64% of non-empty Gender): un, uno, ciento, II, medio, cero, millones, V, VIII, XX
EMPTY (10853): dos, tres, 2010, 0, 3, cuatro, 1, 2, 10, 4
Paradigm uno | Masc | Fem |
---|---|---|
un, uno | una |
SYM
122 SYM tokens (7% of all SYM tokens) have a non-empty value of Gender.
SYM tokens may have the following values of Gender:
Fem (36; 30% of non-empty Gender): h, $, &, m, €, +, http://redsismica.uprm.edu/spanish/informacion/terr1918.php, http://www.rumbo.es/disney/
Masc (86; 70% of non-empty Gender): km, cm, $, &, m, #, º, mundo.com, www.delnuevo, www.dgt.es
EMPTY (1539): %, ², km, º, $, °, €, ª, /, a
Paradigm $ | Masc | Fem |
---|---|---|
Number=Sing | $ | $ |
Number=Sing|VerbForm=Part | $ | |
Number=Plur|VerbForm=Part | $ |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (42537; 84%),
NOUN –[amod]–> ADJ (11103; 58%),
NOUN –[conj]–> NOUN (2928; 53%),
NOUN –[acl]–> VERB (1938; 81%),
VERB –[nsubj:pass]–> NOUN (697; 86%),
PRON –[nmod]–> NOUN (500; 68%),
ADJ –[nsubj]–> NOUN (466; 56%),
ADJ –[conj]–> ADJ (448; 54%),
NOUN –[nsubj]–> NOUN (422; 51%),
NOUN –[det]–> PRON (186; 70%).
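One way such agreement counts could be gathered is sketched below: for every dependency edge, compare the Gender of the child with that of its parent and tally the result per (parent UPOS, relation, child UPOS) pattern. The function name `gender_agreement`, the dictionary-based token layout, and the sample sentence are hypothetical. How the percentages in the list are defined is not stated here, so taking the share over all edges of a pattern is an assumption.

```python
from collections import Counter

def gender_agreement(sentence):
    """Tally edges whose parent and child share a non-empty Gender.

    `sentence` is a list of dicts with keys id, head, upos, deprel, gender.
    Returns (agree, total), both keyed by (parent_upos, deprel, child_upos).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:  # head 0 is the artificial root
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if child["gender"] and child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, total

# Hypothetical sentence: "La ciudad nueva tiene un río"
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 4, "upos": "NOUN", "deprel": "nsubj", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 4, "head": 0, "upos": "VERB", "deprel": "root", "gender": None},
    {"id": 5, "head": 6, "upos": "DET", "deprel": "det", "gender": "Masc"},
    {"id": 6, "head": 4, "upos": "NOUN", "deprel": "obj", "gender": "Masc"},
]
agree, total = gender_agreement(sentence)
for key, n in agree.most_common():
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {100 * n // total[key]}%)")
```

Edges whose parent has no Gender, such as a finite verb governing a noun, never count as agreeing under this definition, though they still appear in `total`.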