Treebank Statistics: UD_Portuguese-PUD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
12281 tokens (52%) have a non-empty value of Gender.
4740 types (80%) occur at least once with a non-empty value of Gender.
3266 lemmas (86%) occur at least once with a non-empty value of Gender.
The feature is used with 9 part-of-speech tags: NOUN (4598; 20% instances), DET (3537; 15% instances), ADJ (1550; 7% instances), PROPN (1393; 6% instances), PRON (550; 2% instances), VERB (357; 2% instances), NUM (274; 1% instances), ADP (11; 0% instances), AUX (11; 0% instances).
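
Counts like the ones above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the sentence in `ROWS` is a toy example made up for illustration (it is not taken from UD_Portuguese-PUD), and the helper names `read_tokens` and `feature_stats` are hypothetical. It assumes the standard 10-column CoNLL-U layout and skips multiword-token ranges and empty nodes.

```python
from collections import Counter

# Toy sentence for illustration only; NOT taken from UD_Portuguese-PUD.
# Columns are written space-separated here and converted to tabs below.
ROWS = """
1 A o DET _ Definite=Def|Gender=Fem|Number=Sing|PronType=Art 2 det _ _
2 casa casa NOUN _ Gender=Fem|Number=Sing 7 nsubj _ _
3-4 do _ _ _ _ _ _ _ _
3 de de ADP _ _ 5 case _ _
4 o o DET _ Definite=Def|Gender=Masc|Number=Sing|PronType=Art 5 det _ _
5 governo governo NOUN _ Gender=Masc|Number=Sing 2 nmod _ _
6 é ser AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 7 cop _ _
7 nova novo ADJ _ Gender=Fem|Number=Sing 0 root _ _
8 . . PUNCT _ _ 7 punct _ _
"""
SAMPLE = "\n".join("\t".join(r.split()) for r in ROWS.strip().splitlines())


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats) for each syntactic word."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 3-4) and empty nodes (e.g. 5.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        feats = {}
        if cols[5] != "_":
            feats = dict(item.split("=", 1) for item in cols[5].split("|"))
        yield cols[1], cols[2], cols[3], feats


def feature_stats(conllu_text, feature="Gender"):
    """Count tokens carrying `feature`, overall, per UPOS tag and per value."""
    total = 0
    with_feature = 0
    per_upos = Counter()
    values = Counter()
    for _form, _lemma, upos, feats in read_tokens(conllu_text):
        total += 1
        if feature in feats:
            with_feature += 1
            per_upos[upos] += 1
            values[feats[feature]] += 1
    return total, with_feature, per_upos, values


total, with_gender, per_upos, values = feature_stats(SAMPLE)
print(f"{with_gender} of {total} tokens have a non-empty value of Gender")
for upos, n in per_upos.most_common():
    print(f"{upos}: {n}")
```

To run it on a real treebank, the text of a `.conllu` file can be passed to `feature_stats` in place of `SAMPLE`.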
NOUN
4598 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (3258; 71%).
NOUN tokens may have the following values of Gender:
Fem (2070; 45% of non-empty Gender): vez, pessoas, guerra, parte, cidade, região, vida, vezes, volta, área
Masc (2528; 55% of non-empty Gender): anos, ano, governo, estado, mundo, acordo, século, tempo, sul, dia
EMPTY (2): porta-voz
Gender seems to be a lexical feature of NOUN. 98% of lemmas (1653) occur with only one value of Gender.
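
The "lexical feature" observation rests on the share of lemmas that appear with a single value. A minimal sketch of that check follows; the function name `single_value_share` and the toy observations are assumptions for illustration and are not drawn from the treebank.

```python
from collections import defaultdict


def single_value_share(observations):
    """Return (lemmas seen with exactly one value, all lemmas seen).

    `observations` is an iterable of (lemma, value) pairs, restricted to
    tokens whose feature value is non-empty.
    """
    seen = defaultdict(set)
    for lemma, value in observations:
        seen[lemma].add(value)
    single = sum(1 for vals in seen.values() if len(vals) == 1)
    return single, len(seen)


# Toy (lemma, Gender) pairs for illustration only.
obs = [
    ("casa", "Fem"),
    ("casa", "Fem"),
    ("governo", "Masc"),
    ("estudante", "Masc"),
    ("estudante", "Fem"),
    ("ano", "Masc"),
]
single, n_lemmas = single_value_share(obs)
print(f"{single} of {n_lemmas} lemmas occur with only one value of Gender")
```

A share close to 100% suggests the value is a property of the lemma rather than of the inflected form.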
DET
3537 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (3217; 91%), Definite=EMPTY (3040; 86%), Number=Sing (2771; 78%).
DET tokens may have the following values of Gender:
Fem (1658; 47% of non-empty Gender): a, as, uma, esta, várias, outras, muitas, cada, própria, estas
Masc (1879; 53% of non-empty Gender): o, os, um, este, muitos, cada, isso, outros, vários, alguns
EMPTY (4): os
| Paradigm o | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing | o | a |
| Definite=Def\|Number=Plur | os | as, os |
| Number=Sing | o | a |
| Number=Plur | os | as |
ADJ
1550 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1073; 69%).
ADJ tokens may have the following values of Gender:
Fem (707; 46% of non-empty Gender): primeira, nova, grande, mais, maior, grandes, segunda, última, americana, britânica
Masc (843; 54% of non-empty Gender): grande, mais, novos, primeiro, últimos, novo, maior, Unidos, todo, melhor
PROPN
1393 PROPN tokens (100% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (1357; 97%), Foreign=EMPTY (1229; 88%).
PROPN tokens may have the following values of Gender:
Fem (567; 41% of non-empty Gender): China, América, Austrália, Europa, França, Grécia, Itália, Albânia, Clinton, Paris
Masc (826; 59% of non-empty Gender): Trump, Mediterrâneo, the, Caribe, EUA, Hong, Kong, Donald, Joseph, Mar
| Paradigm Trump | Masc | Fem |
|---|---|---|
| | Trump | Trump |
Gender seems to be a lexical feature of PROPN. 97% of lemmas (970) occur with only one value of Gender.
PRON
550 PRON tokens (59% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (444; 81%), Number=Sing (422; 77%), Case=EMPTY (360; 65%), Number[psor]=EMPTY (316; 57%), PronType=EMPTY (305; 55%).
PRON tokens may have the following values of Gender:
Fem (183; 33% of non-empty Gender): sua, ela, suas, a, elas, esta, minha, nossa, aquela, essa
Masc (367; 67% of non-empty Gender): ele, seu, eles, o, seus, isso, isto, lo, tudo, este
EMPTY (384): que, se, eu, nós, qual, quais, lhe, quem, você, vocês
VERB
357 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (357; 100%), Person=EMPTY (357; 100%), Tense=EMPTY (357; 100%), Number=Sing (235; 66%).
VERB tokens may have the following values of Gender:
Fem (127; 36% of non-empty Gender): conhecidas, construída, crescidas, derrotada, destruída, dividida, encontrada, estabelecidas, formada, levantadas
Masc (230; 64% of non-empty Gender): devido, feito, realizado, conhecido, construído, coprotagonizado, dito, usado, utilizado, acusado
EMPTY (1675): disse, há, tem, começou, diz, fazer, é, está, ter, fez
NUM
274 NUM tokens (58% of all NUM tokens) have a non-empty value of Gender.
NUM tokens may have the following values of Gender:
Fem (31; 11% of non-empty Gender): duas, uma, 760, 15,001, 19,999, 330.000, 360, 500, 600.000
Masc (243; 89% of non-empty Gender): dois, um, 1, 1492, 2010, 2012, 2014, 2015, 2017, 1980
EMPTY (195): três, milhões, quatro, 10, 3, seis, dez, 100, 20, 50
ADP
11 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Fem (6; 55% of non-empty Gender): a, nessa, daquela
Masc (5; 45% of non-empty Gender): nesse, Aqueles, consigo, nestes
EMPTY (3806): de, em, a, para, por, com, como, que, durante, entre
AUX
11 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (11; 100%), Person=EMPTY (11; 100%), Tense=EMPTY (11; 100%), Number=Sing (8; 73%).
AUX tokens may have the following values of Gender:
Fem (4; 36% of non-empty Gender): consideradas, deixada, nomeada
Masc (7; 64% of non-empty Gender): declarado, proclamado, chamado, considerados, tornado
EMPTY (796): é, foi, foram, são, ser, sido, está, pode, tinha, ter
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (3039; 100%),
NOUN –[amod]–> ADJ (1282; 100%),
NOUN –[nmod]–> NOUN (629; 51%),
PROPN –[det]–> DET (358; 99%),
NOUN –[det]–> PRON (232; 100%),
NOUN –[nmod]–> PROPN (190; 54%),
NOUN –[conj]–> NOUN (159; 61%),
PROPN –[flat]–> PROPN (158; 100%),
PROPN –[flat:name]–> PROPN (152; 100%),
NOUN –[appos]–> PROPN (140; 92%).
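
Agreement figures of this kind can be approximated by walking the dependency tree and comparing feature values on each parent-child pair. The sketch below rests on an assumption about the definition, since the page does not spell it out: the percentage is taken to be the share of pairs that agree among pairs where both nodes carry a Gender value. The toy sentence is made up for illustration and the helper names `read_rows` and `agreement_counts` are hypothetical.

```python
from collections import Counter

# Toy sentence for illustration only; NOT taken from UD_Portuguese-PUD.
# Columns are written space-separated here and converted to tabs below.
ROWS = """
1 A o DET _ Definite=Def|Gender=Fem|Number=Sing|PronType=Art 2 det _ _
2 casa casa NOUN _ Gender=Fem|Number=Sing 7 nsubj _ _
3-4 do _ _ _ _ _ _ _ _
3 de de ADP _ _ 5 case _ _
4 o o DET _ Definite=Def|Gender=Masc|Number=Sing|PronType=Art 5 det _ _
5 governo governo NOUN _ Gender=Masc|Number=Sing 2 nmod _ _
6 é ser AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 7 cop _ _
7 nova novo ADJ _ Gender=Fem|Number=Sing 0 root _ _
8 . . PUNCT _ _ 7 punct _ _
"""
SAMPLE = "\n".join("\t".join(r.split()) for r in ROWS.strip().splitlines())


def read_rows(sentence_text):
    """Parse one CoNLL-U sentence into a dict keyed by token ID."""
    rows = {}
    for line in sentence_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges and empty nodes.
        if "-" in cols[0] or "." in cols[0]:
            continue
        feats = {}
        if cols[5] != "_":
            feats = dict(item.split("=", 1) for item in cols[5].split("|"))
        rows[cols[0]] = {"upos": cols[3], "feats": feats,
                         "head": cols[6], "deprel": cols[7]}
    return rows


def agreement_counts(rows, feature="Gender"):
    """Count agreeing pairs and all pairs where both nodes have `feature`."""
    agree = Counter()
    both = Counter()
    for child in rows.values():
        parent = rows.get(child["head"])
        if parent is None:
            continue  # the root has head 0, which is not a token
        pv = parent["feats"].get(feature)
        cv = child["feats"].get(feature)
        if pv is None or cv is None:
            continue  # assumption: only pairs where both carry the feature
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if pv == cv:
            agree[key] += 1
    return agree, both


agree, both = agreement_counts(read_rows(SAMPLE))
for (p, rel, c), n in both.most_common():
    k = agree[(p, rel, c)]
    print(f"{p} -[{rel}]-> {c} ({k}; {k / n:.0%})")
```

In the toy sentence the determiners agree with their nouns, while the `nmod` pair (casa, governo) does not, which mirrors why relations such as NOUN –[nmod]–> NOUN show agreement rates near chance in the list above.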