Treebank Statistics: UD_Portuguese-GSD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
54658 tokens (17%) have a non-empty value of Gender.
6927 types (22%) occur at least once with a non-empty value of Gender.
5820 lemmas (41%) occur at least once with a non-empty value of Gender.
The feature is used with 11 part-of-speech tags: DET (38829; 12% instances), NOUN (8320; 3% instances), PROPN (3093; 1% instances), ADJ (2066; 1% instances), PRON (1486; 0% instances), VERB (839; 0% instances), ADV (10; 0% instances), X (6; 0% instances), NUM (5; 0% instances), ADP (2; 0% instances), AUX (2; 0% instances).
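Per-tag counts like the ones above can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, assuming the standard 10-column CoNLL-U layout; the helper names (`parse_feats`, `iter_words`, `gender_counts_by_upos`) and the three-word `SAMPLE` are hypothetical and are not taken from the treebank or from the tool that produced this page.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def iter_words(conllu_text):
    """Yield (form, lemma, upos, feats) for every syntactic word."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_counts_by_upos(conllu_text):
    """Count words with a non-empty Gender value, grouped by UPOS tag."""
    counts = Counter()
    for _form, _lemma, upos, feats in iter_words(conllu_text):
        if "Gender" in feats:
            counts[upos] += 1
    return counts


# Hypothetical sample sentence, invented for illustration only.
SAMPLE = "\n".join([
    "# text = a casa nova",
    "1\ta\to\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_",
    "2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_",
    "3\tnova\tnovo\tADJ\t_\t_\t2\tamod\t_\t_",
])

counts = gender_counts_by_upos(SAMPLE)
```

In the sample, the adjective carries no Gender value and is therefore left out of the counts, which mirrors the EMPTY rows reported in the per-tag sections.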
DET
38829 DET tokens (82% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (38013; 98%), Definite=Def (37476; 97%), Number=Sing (33527; 86%).
DET tokens may have the following values of Gender:
Fem (17806; 46% of non-empty Gender): a, as, uma, sua, esta, essa, suas, todas, outras, minha
Masc (21023; 54% of non-empty Gender): o, os, um, seu, a, este, seus, esse, todo, outros
EMPTY (8773): os, um, uma, sua, seu, o, seus, cada, a, suas
| Paradigm o | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing\|PronType=Art | o, a | a |
| Definite=Def\|Number=Plur\|PronType=Art | os | as |
| Number=Sing\|PronType=Art | o | a |
| Number=Sing\|PronType=Dem | o | |
| Number=Plur\|PronType=Art | os | as |
NOUN
8320 NOUN tokens (15% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (6281; 75%).
NOUN tokens may have the following values of Gender:
Fem (3676; 44% of non-empty Gender): feira, pessoas, área, casa, decisão, parte, forma, causa, empresa, equipe
Masc (4644; 56% of non-empty Gender): anos, dia, ano, km, acordo, estado, país, dias, governo, tempo
EMPTY (48270): anos, ano, dia, r, pessoas, presidente, cidade, acordo, governo, parte
| Paradigm presidente | Masc | Fem |
|---|---|---|
| | presidente | presidente |
Gender seems to be a lexical feature of NOUN: 98% of lemmas (2653) occur with only one value of Gender.
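The "lexical feature" observation rests on counting lemmas that are only ever seen with a single Gender value. A minimal sketch of that count follows, assuming the (lemma, Gender) pairs for one part-of-speech tag have already been extracted; the function name `single_gender_lemmas` and the sample pairs are hypothetical, and the exact procedure used to produce this page may differ.

```python
from collections import defaultdict


def single_gender_lemmas(pairs):
    """Return (lemmas seen with exactly one Gender value, lemmas seen with any)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical pairs, invented for illustration only.
SAMPLE_PAIRS = [
    ("casa", "Fem"),
    ("casa", "Fem"),
    ("presidente", "Masc"),
    ("presidente", "Fem"),
    ("ano", "Masc"),
]

single, total = single_gender_lemmas(SAMPLE_PAIRS)
share = 100 * single / total  # percentage of lemmas with one Gender value only
```

A lemma such as presidente, which appears with both values, is the kind of item that keeps the share below 100%.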
PROPN
3093 PROPN tokens (10% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (3018; 98%).
PROPN tokens may have the following values of Gender:
Fem (947; 31% of non-empty Gender): Copa, Nova, Maria, La, Espanha, Polícia, Rua, Alemanha, Brasília, Casa
Masc (2146; 69% of non-empty Gender): Brasil, The, São, R, Rio, Estados, Ministério, O, José, Luiz
EMPTY (29186): feira, Brasil, Paulo, São, rio, Federal, Nacional, Estado, janeiro, quinta
| Paradigm The | Masc | Fem |
|---|---|---|
| | The | The |
Gender seems to be a lexical feature of PROPN: 97% of lemmas (1962) occur with only one value of Gender.
ADJ
2066 ADJ tokens (14% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1589; 77%).
ADJ tokens may have the following values of Gender:
Fem (911; 44% of non-empty Gender): primeira, segunda, última, maior, grande, ª, alta, americana, mundial, novas
Masc (1155; 56% of non-empty Gender): primeiro, ex, último, novo, segundo, maior, mesmo, grande, bom, últimos
EMPTY (12973): maior, grande, primeiro, primeira, novo, segundo, última, segunda, mesmo, nova
| Paradigm primeiro | Masc | Fem |
|---|---|---|
| Number=Sing | primeiro | primeira |
| Number=Plur | primeiros | primeiras |
PRON
1486 PRON tokens (19% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1163; 78%).
PRON tokens may have the following values of Gender:
Fem (383; 26% of non-empty Gender): que, se, ela, a, elas, onde, la, outra, essa, qual
Masc (1103; 74% of non-empty Gender): que, se, o, ele, isso, eles, onde, os, lo, lhe
EMPTY (6234): que, se, ele, isso, o, eu, um, ela, quem, eles
| Paradigm que | Masc | Fem |
|---|---|---|
| Number=Sing\|PronType=Dem | que | |
| Number=Sing\|PronType=Ind | que | |
| Number=Sing\|PronType=Int | que | |
| Number=Sing\|PronType=Rel | que | que |
| Number=Plur\|PronType=Rel | que | que |
VERB
839 VERB tokens (3% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: VerbForm=Part (838; 100%), Number=Sing (586; 70%).
VERB tokens may have the following values of Gender:
Fem (312; 37% of non-empty Gender): realizada, feita, denominada, chamada, publicada, considerada, lançada, divulgada, encontrada, enviada
Masc (527; 63% of non-empty Gender): cobertos, considerado, lançado, chamado, considerados, esperado, realizado, divulgado, feito, preso
EMPTY (27134): é, tem, disse, está, há, foi, fazer, afirmou, estão, teve
| Paradigm fazer | Masc | Fem |
|---|---|---|
| Number=Sing\|Voice=Pass | feito | feita |
| Number=Plur | feitos | |
| Number=Plur\|Voice=Pass | feitos | feitas |
ADV
10 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Polarity=EMPTY (10; 100%).
ADV tokens may have the following values of Gender:
Masc (10; 100% of non-empty Gender): juntos, Mal, Nada, caro, devagarinho, entanto, independente, pouco, quanto
EMPTY (9760): não, mais, também, já, ainda, muito, depois, onde, além, apenas
X
6 X tokens (1% of all X tokens) have a non-empty value of Gender.
X tokens may have the following values of Gender:
Fem (1; 17% of non-empty Gender): on
Masc (5; 83% of non-empty Gender): \epsilon=\epsilon_{0}, \kappa, center, market, spin
EMPTY (397): disso, deles, delas, dele, do, +, etc, @, comigo, nele
NUM
5 NUM tokens (0% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=EMPTY (3; 60%).
NUM tokens may have the following values of Gender:
Fem (1; 20% of non-empty Gender): centenas
Masc (4; 80% of non-empty Gender): cento, cem, sessenta
EMPTY (8507): dois, três, mil, duas, milhões, um, 1, 2012, quatro, 2
ADP
2 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Masc (2; 100% of non-empty Gender): que
EMPTY (51224): de, em, a, para, por, com, como, entre, sobre, até
AUX
2 AUX tokens (0% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (2; 100%), Number=Sing (2; 100%), Person=EMPTY (2; 100%), Tense=EMPTY (2; 100%), VerbForm=Part (2; 100%).
AUX tokens may have the following values of Gender:
Masc (2; 100% of non-empty Gender): sido
EMPTY (6934): é, foi, ser, foram, são, será, vai, pode, era, sendo
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (1671; 99%),
NOUN –[appos]–> PROPN (398; 90%),
NOUN –[acl]–> VERB (355; 71%),
PROPN –[conj]–> PROPN (324; 70%),
NOUN –[conj]–> NOUN (293; 67%),
VERB –[nsubj:pass]–> NOUN (152; 83%),
PROPN –[appos]–> PROPN (150; 75%),
NOUN –[appos]–> NOUN (91; 61%),
NOUN –[nmod]–> PRON (65; 61%),
ADJ –[obl]–> NOUN (63; 55%).
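The agreement figures above can be approximated by walking each dependency edge and comparing the Gender values of parent and child. The sketch below assumes that the percentage is taken over edges where both nodes have a non-empty Gender; that reading is an assumption, not something this page states. The names `gender_of` and `agreement_by_relation` and the sample sentence are hypothetical.

```python
from collections import Counter


def gender_of(feats):
    """Return the Gender value from a CoNLL-U FEATS string, or None."""
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        if name == "Gender":
            return value
    return None


def agreement_by_relation(sentences):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing edges, comparable edges).

    Each sentence is a list of 10-column CoNLL-U rows (lists of strings).
    Only edges where both parent and child carry Gender are counted.
    """
    agree, total = Counter(), Counter()
    for sentence in sentences:
        by_id = {row[0]: row for row in sentence}
        for row in sentence:
            parent = by_id.get(row[6])
            if parent is None:
                continue  # root (HEAD = 0) or head not present in the sentence
            child_gender = gender_of(row[5])
            parent_gender = gender_of(parent[5])
            if child_gender is None or parent_gender is None:
                continue
            key = (parent[3], row[7], row[3])
            total[key] += 1
            if child_gender == parent_gender:
                agree[key] += 1
    return {key: (agree[key], total[key]) for key in total}


# Hypothetical sentence, invented for illustration only.
SAMPLE_SENTENCE = [
    ["1", "a", "o", "DET", "_", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "casa", "casa", "NOUN", "_", "Gender=Fem|Number=Sing", "0", "root", "_", "_"],
    ["3", "nova", "novo", "ADJ", "_", "Gender=Fem|Number=Sing", "2", "amod", "_", "_"],
]

stats = agreement_by_relation([SAMPLE_SENTENCE])
```

Sorting the resulting dictionary by its agreeing-edge count and keeping the first ten entries would give a list in the same shape as the one above.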