Treebank Statistics: UD_Portuguese-GSD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
54658 tokens (17%) have a non-empty value of Gender.
6927 types (22%) occur at least once with a non-empty value of Gender.
5820 lemmas (41%) occur at least once with a non-empty value of Gender.
The feature is used with 11 part-of-speech tags: DET (38829; 12% instances), NOUN (8320; 3% instances), PROPN (3093; 1% instances), ADJ (2066; 1% instances), PRON (1486; 0% instances), VERB (839; 0% instances), ADV (10; 0% instances), X (6; 0% instances), NUM (5; 0% instances), ADP (2; 0% instances), AUX (2; 0% instances).
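Counts like the ones above can be approximated directly from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that produced this page: it assumes the standard ten tab-separated CoNLL-U columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) and that multiword-token ranges and empty nodes are excluded from the totals. The helper names `parse_feats` and `gender_by_upos` are hypothetical.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Return (Counter of tokens with Gender per UPOS, total token count)."""
    with_gender = Counter()
    total = 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        total += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[cols[3]] += 1
    return with_gender, total


# Tiny hand-made sample, not taken from the treebank.
sample = "\n".join("\t".join(row) for row in [
    ["1", "A", "o", "DET", "_",
     "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"],
    ["2", "casa", "casa", "NOUN", "_", "Gender=Fem|Number=Sing",
     "3", "nsubj", "_", "_"],
    ["3", "caiu", "cair", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
     "0", "root", "_", "_"],
])
counts, total = gender_by_upos(sample)
for upos, n in counts.most_common():
    print(f"{upos}: {n} of {total} tokens")
```

Dividing each per-tag count by the total number of tokens gives a share comparable to the percentages listed with each tag.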
DET
38829 DET tokens (82% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (38013; 98%), Definite=Def (37476; 97%), Number=Sing (33527; 86%).
DET tokens may have the following values of Gender:
Fem (17806; 46% of non-empty Gender): a, as, uma, sua, esta, essa, suas, todas, outras, minha
Masc (21023; 54% of non-empty Gender): o, os, um, seu, a, este, seus, esse, todo, outros
EMPTY (8773): os, um, uma, sua, seu, o, seus, cada, a, suas
Paradigm o | Masc | Fem |
---|---|---|
Definite=Def|Number=Sing|PronType=Art | o, a | a |
Definite=Def|Number=Plur|PronType=Art | os | as |
Number=Sing|PronType=Art | o | a |
Number=Sing|PronType=Dem | o | |
Number=Plur|PronType=Art | os | as |
NOUN
8320 NOUN tokens (15% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (6281; 75%).
NOUN tokens may have the following values of Gender:
Fem (3676; 44% of non-empty Gender): feira, pessoas, área, casa, decisão, parte, forma, causa, empresa, equipe
Masc (4644; 56% of non-empty Gender): anos, dia, ano, km, acordo, estado, país, dias, governo, tempo
EMPTY (48270): anos, ano, dia, r, pessoas, presidente, cidade, acordo, governo, parte
Paradigm presidente | Masc | Fem |
---|---|---|
_ | presidente | presidente |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (2653) occur with only one value of Gender.
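The lexical-feature observation rests on counting, per lemma, how many distinct Gender values its tokens carry. A minimal sketch of that check follows; the function name `single_gender_share` and the sample pairs are hypothetical, and the input is assumed to be (lemma, Gender) pairs for tokens of one part of speech that have a non-empty Gender.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hand-made illustration, not treebank data.
pairs = [
    ("casa", "Fem"), ("casa", "Fem"),
    ("ano", "Masc"),
    ("presidente", "Masc"), ("presidente", "Fem"),
]
single, total = single_gender_share(pairs)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A share close to 100% suggests the value is a property of the lemma rather than of the individual word form.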
PROPN
3093 PROPN tokens (10% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (3018; 98%).
PROPN tokens may have the following values of Gender:
Fem (947; 31% of non-empty Gender): Copa, Nova, Maria, La, Espanha, Polícia, Rua, Alemanha, Brasília, Casa
Masc (2146; 69% of non-empty Gender): Brasil, The, São, R, Rio, Estados, Ministério, O, José, Luiz
EMPTY (29186): feira, Brasil, Paulo, São, rio, Federal, Nacional, Estado, janeiro, quinta
Paradigm The | Masc | Fem |
---|---|---|
_ | The | The |
Gender seems to be a lexical feature of PROPN. 97% of lemmas (1962) occur with only one value of Gender.
ADJ
2066 ADJ tokens (14% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1589; 77%).
ADJ tokens may have the following values of Gender:
Fem (911; 44% of non-empty Gender): primeira, segunda, última, maior, grande, ª, alta, americana, mundial, novas
Masc (1155; 56% of non-empty Gender): primeiro, ex, último, novo, segundo, maior, mesmo, grande, bom, últimos
EMPTY (12973): maior, grande, primeiro, primeira, novo, segundo, última, segunda, mesmo, nova
Paradigm primeiro | Masc | Fem |
---|---|---|
Number=Sing | primeiro | primeira |
Number=Plur | primeiros | primeiras |
PRON
1486 PRON tokens (19% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1163; 78%).
PRON tokens may have the following values of Gender:
Fem (383; 26% of non-empty Gender): que, se, ela, a, elas, onde, la, outra, essa, qual
Masc (1103; 74% of non-empty Gender): que, se, o, ele, isso, eles, onde, os, lo, lhe
EMPTY (6234): que, se, ele, isso, o, eu, um, ela, quem, eles
Paradigm que | Masc | Fem |
---|---|---|
Number=Sing|PronType=Dem | que | |
Number=Sing|PronType=Ind | que | |
Number=Sing|PronType=Int | que | |
Number=Sing|PronType=Rel | que | que |
Number=Plur|PronType=Rel | que | que |
VERB
839 VERB tokens (3% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: VerbForm=Part (838; 100%), Number=Sing (586; 70%).
VERB tokens may have the following values of Gender:
Fem (312; 37% of non-empty Gender): realizada, feita, denominada, chamada, publicada, considerada, lançada, divulgada, encontrada, enviada
Masc (527; 63% of non-empty Gender): cobertos, considerado, lançado, chamado, considerados, esperado, realizado, divulgado, feito, preso
EMPTY (27134): é, tem, disse, está, há, foi, fazer, afirmou, estão, teve
Paradigm fazer | Masc | Fem |
---|---|---|
Number=Sing|Voice=Pass | feito | feita |
Number=Plur | feitos | |
Number=Plur|Voice=Pass | feitos | feitas |
ADV
10 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Polarity=EMPTY (10; 100%).
ADV tokens may have the following values of Gender:
Masc (10; 100% of non-empty Gender): juntos, Mal, Nada, caro, devagarinho, entanto, independente, pouco, quanto
EMPTY (9760): não, mais, também, já, ainda, muito, depois, onde, além, apenas
X
6 X tokens (1% of all X tokens) have a non-empty value of Gender.
X tokens may have the following values of Gender:
Fem (1; 17% of non-empty Gender): on
Masc (5; 83% of non-empty Gender): \epsilon=\epsilon_{0}, \kappa, center, market, spin
EMPTY (397): disso, deles, delas, dele, do, +, etc, @, comigo, nele
NUM
5 NUM tokens (0% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=EMPTY (3; 60%).
NUM tokens may have the following values of Gender:
Fem (1; 20% of non-empty Gender): centenas
Masc (4; 80% of non-empty Gender): cento, cem, sessenta
EMPTY (8507): dois, três, mil, duas, milhões, um, 1, 2012, quatro, 2
ADP
2 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Masc (2; 100% of non-empty Gender): que
EMPTY (51224): de, em, a, para, por, com, como, entre, sobre, até
AUX
2 AUX tokens (0% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (2; 100%), Number=Sing (2; 100%), Person=EMPTY (2; 100%), Tense=EMPTY (2; 100%), VerbForm=Part (2; 100%).
AUX tokens may have the following values of Gender:
Masc (2; 100% of non-empty Gender): sido
EMPTY (6934): é, foi, ser, foram, são, será, vai, pode, era, sendo
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (1671; 99%),
NOUN –[appos]–> PROPN (398; 90%),
NOUN –[acl]–> VERB (355; 71%),
PROPN –[conj]–> PROPN (324; 70%),
NOUN –[conj]–> NOUN (293; 67%),
VERB –[nsubj:pass]–> NOUN (152; 83%),
PROPN –[appos]–> PROPN (150; 75%),
NOUN –[appos]–> NOUN (91; 61%),
NOUN –[nmod]–> PRON (65; 61%),
ADJ –[obl]–> NOUN (63; 55%).
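Agreement counts of this kind can be sketched as a pass over the dependency edges of each sentence. The example below is an illustration under stated assumptions, not the method used to generate this page: it assumes the percentage is taken over edges where both parent and child carry a Gender value (the page does not say so), and the token dictionaries with keys `id`, `upos`, `gender`, `head`, `deprel` are a hypothetical simplified representation of CoNLL-U rows.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count edges where parent and child both have Gender, and where they agree.

    Returns (agree, both), two Counters keyed by (parent UPOS, deprel, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root (head 0) has no parent token
        if child["gender"] is None or parent["gender"] is None:
            continue  # assumption: only edges with Gender on both ends count
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if child["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, both


# Hand-made sentence fragment, not treebank data.
sentence = [
    {"id": 1, "upos": "NOUN", "gender": "Fem", "head": 0, "deprel": "root"},
    {"id": 2, "upos": "ADJ", "gender": "Fem", "head": 1, "deprel": "amod"},
    {"id": 3, "upos": "VERB", "gender": "Masc", "head": 1, "deprel": "acl"},
]
agree, both = gender_agreement(sentence)
for (parent, rel, child), n in both.items():
    print(f"{parent} -[{rel}]-> {child}: {agree[(parent, rel, child)]} of {n} agree")
```

Summing both counters over all sentences and dividing `agree` by `both` for each relation gives a ratio of the same shape as the figures in the list above.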