Treebank Statistics: UD_Portuguese-PUD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
12281 tokens (52%) have a non-empty value of Gender.
4740 types (80%) occur at least once with a non-empty value of Gender.
3266 lemmas (86%) occur at least once with a non-empty value of Gender.
The feature is used with 9 part-of-speech tags: NOUN (4598; 20% instances), DET (3537; 15% instances), ADJ (1550; 7% instances), PROPN (1393; 6% instances), PRON (550; 2% instances), VERB (357; 2% instances), NUM (274; 1% instances), ADP (11; 0% instances), AUX (11; 0% instances).
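Counts like the ones above can be recomputed from the treebank's CoNLL-U file. The following is a minimal sketch, not the script that produced this page: the function names (`parse_feats`, `read_tokens`, `feature_coverage`) and the four-word sample are illustrative assumptions, and it compares word forms as written, which may differ from how types are counted here.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats) for each syntactic word in CoNLL-U text."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # multiword-token range (1-2) or empty node (1.1)
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])

def feature_coverage(conllu_text, feature="Gender"):
    """Count tokens, types and lemmas that carry a non-empty value of `feature`."""
    tokens = list(read_tokens(conllu_text))
    with_feat = [t for t in tokens if feature in t[3]]
    return {
        "tokens": len(with_feat),
        "token_share": len(with_feat) / len(tokens) if tokens else 0.0,
        "types": len({t[0] for t in with_feat}),
        "lemmas": len({t[1] for t in with_feat}),
        "by_upos": Counter(t[2] for t in with_feat),
    }

# Hypothetical sample sentence ("a cidade é grande"), not taken from the treebank.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "a", "o", "DET", "_", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"),
    ("2", "cidade", "cidade", "NOUN", "_", "Gender=Fem|Number=Sing", "4", "nsubj", "_", "_"),
    ("3", "é", "ser", "AUX", "_", "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "4", "cop", "_", "_"),
    ("4", "grande", "grande", "ADJ", "_", "Gender=Fem|Number=Sing", "0", "root", "_", "_"),
])

stats = feature_coverage(SAMPLE)
```

Here `stats["by_upos"]` plays the role of the per-tag counts listed above; dividing each count by the total number of tokens would give the corresponding share of instances.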
NOUN
4598 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3258; 71%).
NOUN tokens may have the following values of Gender:
Fem (2070; 45% of non-empty Gender): vez, pessoas, guerra, parte, cidade, região, vida, vezes, volta, área
Masc (2528; 55% of non-empty Gender): anos, ano, governo, estado, mundo, acordo, século, tempo, sul, dia
EMPTY (2): porta-voz
Gender seems to be a lexical feature of NOUN. 98% of lemmas (1653) occur with only one value of Gender.
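The lexical-feature check can be sketched as follows: for one part-of-speech tag, collect the set of Gender values seen with each lemma and count the lemmas whose set has exactly one member. This is an illustrative reading of the statistic, not the generator's code; `single_value_share` and the sample tokens are hypothetical, and the sketch assumes that only lemmas seen with a non-empty Gender enter the denominator.

```python
from collections import defaultdict

def single_value_share(tokens, upos, feature="Gender"):
    """Return (single, total, share) for lemmas of `upos` that carry `feature`.

    tokens: iterable of (lemma, upos, feats) where feats is a dict.
    single: lemmas seen with exactly one value of the feature.
    total:  lemmas seen with at least one value of the feature.
    """
    values_by_lemma = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos and feature in feats:
            values_by_lemma[lemma].add(feats[feature])
    total = len(values_by_lemma)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, total, (single / total if total else 0.0)

# Hypothetical tokens: "presidente" appears with both genders, the others with one.
tokens = [
    ("cidade", "NOUN", {"Gender": "Fem"}),
    ("cidade", "NOUN", {"Gender": "Fem"}),
    ("ano", "NOUN", {"Gender": "Masc"}),
    ("presidente", "NOUN", {"Gender": "Masc"}),
    ("presidente", "NOUN", {"Gender": "Fem"}),
    ("grande", "ADJ", {"Gender": "Fem"}),
]

single, total, share = single_value_share(tokens, "NOUN")
```

A share close to 1, as reported for NOUN, suggests that the value is a property of the lemma rather than of the individual word form.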
DET
3537 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (3217; 91%), Definite=EMPTY (3040; 86%), Number=Sing (2771; 78%).
DET tokens may have the following values of Gender:
Fem (1658; 47% of non-empty Gender): a, as, uma, esta, várias, outras, muitas, cada, própria, estas
Masc (1879; 53% of non-empty Gender): o, os, um, este, muitos, cada, isso, outros, vários, alguns
EMPTY (4): os
| Paradigm o | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing | o | a |
| Definite=Def\|Number=Plur | os | as, os |
| Number=Sing | o | a |
| Number=Plur | os | as |
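A paradigm table of this kind can be approximated by grouping the forms of one lemma by their remaining features (rows) and by Gender value (columns). The sketch below is an assumption about how such a table might be built, not the procedure used for this page: `paradigm` and the sample tokens are hypothetical, forms are lowercased for grouping, and every feature other than Gender is kept in the row label, whereas the table above appears to show only some of them.

```python
from collections import defaultdict

def paradigm(tokens, lemma, feature="Gender"):
    """Group forms of `lemma` by other features (rows) and `feature` value (columns).

    tokens: iterable of (form, lemma, feats) where feats is a dict.
    Returns {row_label: {feature_value: sorted list of lowercased forms}}.
    """
    table = defaultdict(lambda: defaultdict(set))
    for form, tok_lemma, feats in tokens:
        if tok_lemma != lemma or feature not in feats:
            continue
        # Row label: all remaining features in alphabetical order.
        row = "|".join(f"{k}={v}" for k, v in sorted(feats.items()) if k != feature)
        table[row][feats[feature]].add(form.lower())
    return {
        row: {value: sorted(forms) for value, forms in cols.items()}
        for row, cols in table.items()
    }

# Hypothetical tokens for the article lemma "o".
tokens = [
    ("o", "o", {"Definite": "Def", "Gender": "Masc", "Number": "Sing"}),
    ("a", "o", {"Definite": "Def", "Gender": "Fem", "Number": "Sing"}),
    ("os", "o", {"Definite": "Def", "Gender": "Masc", "Number": "Plur"}),
    ("as", "o", {"Definite": "Def", "Gender": "Fem", "Number": "Plur"}),
    ("As", "o", {"Definite": "Def", "Gender": "Fem", "Number": "Plur"}),
]

p = paradigm(tokens, "o")
```

A cell holding more than one form, such as "as, os" in the Fem column above, then shows up as a list with several entries and is a natural place to look for annotation inconsistencies.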
ADJ
1550 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1073; 69%).
ADJ tokens may have the following values of Gender:
Fem (707; 46% of non-empty Gender): primeira, nova, grande, mais, maior, grandes, segunda, última, americana, britânica
Masc (843; 54% of non-empty Gender): grande, mais, novos, primeiro, últimos, novo, maior, Unidos, todo, melhor
PROPN
1393 PROPN tokens (100% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (1357; 97%), Foreign=EMPTY (1229; 88%).
PROPN tokens may have the following values of Gender:
Fem (567; 41% of non-empty Gender): China, América, Austrália, Europa, França, Grécia, Itália, Albânia, Clinton, Paris
Masc (826; 59% of non-empty Gender): Trump, Mediterrâneo, the, Caribe, EUA, Hong, Kong, Donald, Joseph, Mar
| Paradigm Trump | Masc | Fem |
|---|---|---|
| | Trump | Trump |
Gender seems to be a lexical feature of PROPN. 97% of lemmas (970) occur with only one value of Gender.
PRON
550 PRON tokens (59% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (444; 81%), Number=Sing (422; 77%), Case=EMPTY (360; 65%), Number[psor]=EMPTY (316; 57%), PronType=EMPTY (305; 55%).
PRON tokens may have the following values of Gender:
Fem (183; 33% of non-empty Gender): sua, ela, suas, a, elas, esta, minha, nossa, aquela, essa
Masc (367; 67% of non-empty Gender): ele, seu, eles, o, seus, isso, isto, lo, tudo, este
EMPTY (384): que, se, eu, nós, qual, quais, lhe, quem, você, vocês
VERB
357 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (357; 100%), Person=EMPTY (357; 100%), Tense=EMPTY (357; 100%), Number=Sing (235; 66%).
VERB tokens may have the following values of Gender:
Fem (127; 36% of non-empty Gender): conhecidas, construída, crescidas, derrotada, destruída, dividida, encontrada, estabelecidas, formada, levantadas
Masc (230; 64% of non-empty Gender): devido, feito, realizado, conhecido, construído, coprotagonizado, dito, usado, utilizado, acusado
EMPTY (1675): disse, há, tem, começou, diz, fazer, é, está, ter, fez
NUM
274 NUM tokens (58% of all NUM tokens) have a non-empty value of Gender.
NUM tokens may have the following values of Gender:
Fem (31; 11% of non-empty Gender): duas, uma, 760, 15,001, 19,999, 330.000, 360, 500, 600.000
Masc (243; 89% of non-empty Gender): dois, um, 1, 1492, 2010, 2012, 2014, 2015, 2017, 1980
EMPTY (195): três, milhões, quatro, 10, 3, seis, dez, 100, 20, 50
ADP
11 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Fem (6; 55% of non-empty Gender): a, nessa, daquela
Masc (5; 45% of non-empty Gender): nesse, Aqueles, consigo, nestes
EMPTY (3806): de, em, a, para, por, com, como, que, durante, entre
AUX
11 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (11; 100%), Person=EMPTY (11; 100%), Tense=EMPTY (11; 100%), Number=Sing (8; 73%).
AUX tokens may have the following values of Gender:
Fem (4; 36% of non-empty Gender): consideradas, deixada, nomeada
Masc (7; 64% of non-empty Gender): declarado, proclamado, chamado, considerados, tornado
EMPTY (796): é, foi, foram, são, ser, sido, está, pode, tinha, ter
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (3039; 100%),
NOUN –[amod]–> ADJ (1282; 100%),
NOUN –[nmod]–> NOUN (629; 51%),
PROPN –[det]–> DET (358; 99%),
NOUN –[det]–> PRON (232; 100%),
NOUN –[nmod]–> PROPN (190; 54%),
NOUN –[conj]–> NOUN (159; 61%),
PROPN –[flat]–> PROPN (158; 100%),
PROPN –[flat:name]–> PROPN (152; 100%),
NOUN –[appos]–> PROPN (140; 92%).
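
The agreement counts above can be approximated by walking every dependency edge and comparing the Gender of parent and child. This is a minimal sketch under stated assumptions: `agreement_by_relation` and the sample sentence are hypothetical, and the percentage is taken to mean the share of agreeing pairs among pairs where both nodes have a non-empty Gender, which this page does not spell out.

```python
from collections import Counter

def agreement_by_relation(sentence, feature="Gender"):
    """Count parent/child agreement in `feature` per (parent UPOS, deprel, child UPOS).

    sentence: list of dicts with keys id, head, upos, deprel, feats (a dict).
    Returns {key: (agreeing_pairs, share_of_pairs_where_both_have_the_feature)}.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # head 0 is the artificial root
        parent_value = parent["feats"].get(feature)
        child_value = child["feats"].get(feature)
        if parent_value is None or child_value is None:
            continue  # agreement is only defined when both nodes carry the feature
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent_value == child_value:
            agree[key] += 1
    return {key: (agree[key], agree[key] / both[key]) for key in both}

# Hypothetical sentence fragment: "a cidade de o sul".
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "feats": {"Gender": "Fem"}},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "feats": {"Gender": "Fem"}},
    {"id": 3, "head": 5, "upos": "ADP", "deprel": "case", "feats": {}},
    {"id": 4, "head": 5, "upos": "DET", "deprel": "det", "feats": {"Gender": "Masc"}},
    {"id": 5, "head": 2, "upos": "NOUN", "deprel": "nmod", "feats": {"Gender": "Masc"}},
]

result = agreement_by_relation(sentence)
```

In the sample, both determiners agree with their nouns while the nmod pair does not, which mirrors the pattern above: det and amod agree almost always, whereas nmod agrees only about half the time, as expected when two independent nouns happen to share a gender.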