Treebank Statistics: UD_Highland_Puebla_Nahuatl-ITML: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
Some words have combined values of the feature; 1 combination has been observed: Fem|Masc.
317 tokens (3%) have a non-empty value of Gender.
173 types (9%) occur at least once with a non-empty value of Gender.
156 lemmas (12%) occur at least once with a non-empty value of Gender.
The feature is used with 4 part-of-speech tags: NOUN (223; 2% instances), PROPN (59; 1% instances), ADJ (32; 0% instances), DET (3; 0% instances).
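Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the sample sentences and the helper names (`parse_feats`, `read_tokens`, `gender_summary`) are invented for illustration, and it assumes percentages are taken over all basic tokens.

```python
from collections import Counter

# Toy CoNLL-U fragment, invented for illustration only (not treebank data).
SAMPLE = """\
1\tin\tin\tDET\t_\t_\t2\tdet\t_\t_
2\tcélulas\tcélula\tNOUN\t_\tGender=Fem|Number=Plur\t0\troot\t_\t_
3\tfresca\tfresco\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_

1\tMiguel\tMiguel\tPROPN\t_\tGender=Masc\t0\troot\t_\t_
2\tkwali\tkwali\tADJ\t_\t_\t1\tamod\t_\t_
"""

def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each basic token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])

def gender_summary(conllu_text):
    """Count tokens, types and lemmas that carry a non-empty Gender value."""
    tokens = list(read_tokens(conllu_text))
    with_gender = [t for t in tokens if "Gender" in t[3]]
    return {
        "tokens": len(with_gender),
        "types": len({t[0] for t in with_gender}),
        "lemmas": len({t[1] for t in with_gender}),
        "by_upos": dict(Counter(t[2] for t in with_gender)),
        "share_of_tokens": len(with_gender) / len(tokens),
    }

summary = gender_summary(SAMPLE)
print(summary)
```

On a real treebank, the same function would be called on the contents of the `.conllu` file instead of `SAMPLE`.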
NOUN
223 NOUN tokens (13% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Case=EMPTY (223; 100%), NounType=EMPTY (223; 100%), Number[psor]=EMPTY (217; 97%), Person[psor]=EMPTY (217; 97%), Number=Sing (170; 76%).
NOUN tokens may have the following values of Gender:
Fem (64; 29% of non-empty Gender): células, microondas, moléculas, nanotecnologías, cimarrón, manila, manteca, radiación, radiografías, Antropología
Masc (159; 71% of non-empty Gender): marzo, abril, compadrito, mayo, agosto, junio, átomos, enero, febrero, rayos
EMPTY (1434): itech, ika, iteyo, taman, itakka, kajfentaj, kuoujtaj, kuouit, pajti, iuan
Gender seems to be a lexical feature of NOUN. 100% of lemmas (106) occur with only one value of Gender.
PROPN
59 PROPN tokens (29% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (21; 36% of non-empty Gender): Maria, niMaria, Amelia, Coral, Cristina, Elisa, Elvira, Emma, Evangelista, Guadalupe
Masc (38; 64% of non-empty Gender): Miguel, Anastacio, Nicolas, Eleuterio, Ruben, Alfredo, Damian, Ernesto, Gustavo, Jesus
EMPTY (142): Kuesalan, San, Tzinacapan, Próxima, centauri, Osollo, Salazar, Tonalix, Vazquez, Xaltipan
Gender seems to be a lexical feature of PROPN. 100% of lemmas (31) occur with only one value of Gender.
ADJ
32 ADJ tokens (8% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=EMPTY (32; 100%), Number=Sing (27; 84%), Number[subj]=EMPTY (25; 78%), Person[subj]=EMPTY (25; 78%).
ADJ tokens may have the following values of Gender:
Fem (6; 19% of non-empty Gender): Autónoma, Láctea, cancerígenas, cimarrón, electromagnética, fresca
Fem,Masc (9; 28% of non-empty Gender): Ambientales, Austral, Espacial, Norte, celulares, corriente, fuerte, nucleares, útiles
Masc (17; 53% of non-empty Gender): morado, moradito, fresco, injertado, mismo, contento, moraditos, poco
EMPTY (392): sesek, istak, kwali, totonik, welik, kwaltsin, uelik, tsikitsitsin, chichiltik, uejueyi
| Paradigm fresco | Masc | Fem |
|---|---|---|
| | fresco | fresca |
Gender seems to be a lexical feature of ADJ. 95% of lemmas (19) occur with only one value of Gender.
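The "lexical feature" judgement rests on how many lemmas are seen with a single Gender value. A rough sketch of that calculation follows. The observation list is hypothetical (only the fresco/fresca alternation mirrors the paradigm reported here), and how the page's generator treats combined values such as Fem,Masc is not stated, so this sketch simply treats each distinct value string as one value.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender value) observations for one part of speech.
observations = [
    ("fresco", "Masc"),
    ("fresco", "Fem"),
    ("morado", "Masc"),
    ("morado", "Masc"),
    ("mismo", "Masc"),
    ("contento", "Masc"),
]

def single_value_share(pairs):
    """Return (lemmas seen with exactly one value, all lemmas seen)."""
    values_by_lemma = defaultdict(set)
    for lemma, value in pairs:
        values_by_lemma[lemma].add(value)
    single = sum(1 for vals in values_by_lemma.values() if len(vals) == 1)
    return single, len(values_by_lemma)

single, total = single_value_share(observations)
print(f"{single} of {total} lemmas ({single / total:.0%}) occur with only one value")
# prints: 3 of 4 lemmas (75%) occur with only one value
```

A share close to 100% suggests the value is fixed per lemma (lexical) rather than chosen by inflection.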
DET
3 DET tokens (1% of all DET tokens) have a non-empty value of Gender.
DET tokens may have the following values of Gender:
Fem (3; 100% of non-empty Gender): la
EMPTY (490): in, n’, se, ne, nejin, nijin, miyak, nochi, yon, okse
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[conj]–> NOUN (22; 65%),
NOUN –[amod]–> NOUN (1; 100%),
PROPN –[nmod]–> NOUN (1; 100%).
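One plausible way to obtain agreement figures like those listed above is to compare the Gender value of each node with that of its head. The sketch below assumes that the percentage is the share of agreeing pairs among pairs where both nodes have a non-empty Gender; the page does not state this, and the toy sentence and the function name `agreement_counts` are invented for illustration.

```python
from collections import Counter

# Toy sentence, invented for illustration.
# Each node: (id, upos, gender or None, head id, deprel)
sentence = [
    (1, "NOUN", "Fem", 0, "root"),
    (2, "NOUN", "Fem", 1, "conj"),
    (3, "NOUN", "Masc", 1, "conj"),
    (4, "ADJ", None, 1, "amod"),
]

def agreement_counts(nodes):
    """Count parent-child pairs with Gender on both sides, and those that agree."""
    by_id = {node[0]: node for node in nodes}
    agree, both = Counter(), Counter()
    for _id, upos, gender, head, deprel in nodes:
        parent = by_id.get(head)  # None for the root (head id 0)
        if parent is None or gender is None or parent[2] is None:
            continue
        key = (parent[1], deprel, upos)
        both[key] += 1
        if gender == parent[2]:
            agree[key] += 1
    return agree, both

agree, both = agreement_counts(sentence)
for (parent_upos, deprel, child_upos), n in both.items():
    hits = agree[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({hits}; {hits / n:.0%})")
# prints: NOUN -[conj]-> NOUN (1; 50%)
```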