Treebank Statistics: UD_Portuguese-Porttinari: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
64442 tokens (38%) have a non-empty value of Gender.
9652 types (50%) occur at least once with a non-empty value of Gender.
6592 lemmas (51%) occur at least once with a non-empty value of Gender.
The feature is used with 7 part-of-speech tags: NOUN (29321; 17% of instances), DET (24062; 14% of instances), ADJ (5484; 3% of instances), PRON (2722; 2% of instances), VERB (2141; 1% of instances), NUM (650; 0% of instances), AUX (62; 0% of instances).
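Per-tag counts like the ones above can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming the standard ten-column CoNLL-U layout; the function names and the two-word sample are illustrative and are not taken from the treebank or from the tool that generated this page.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each syntactic word."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_by_upos(conllu_text):
    """Count tokens with a non-empty Gender value, grouped by UPOS tag."""
    counts = Counter()
    for _form, _lemma, upos, feats in read_tokens(conllu_text):
        if "Gender" in feats:
            counts[upos] += 1
    return counts


# Toy two-word sample for illustration only (not a treebank sentence).
SAMPLE = "\n".join([
    "# text = a casa",
    "1\ta\to\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_",
    "2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_",
])

print(gender_by_upos(SAMPLE))
```

Running the same function over the full treebank files should give one count per tag, comparable to the figures listed above, provided the files match the release this page was built from.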
NOUN
29321 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (21169; 72%).
NOUN tokens may have the following values of Gender:
Fem (13979; 48% of non-empty Gender): pessoas, vez, parte, empresa, casa, cidade, história, empresas, gente, forma
Masc (15342; 52% of non-empty Gender): anos, ano, dia, país, tempo, governo, mercado, caso, mundo, acordo
EMPTY (1941): presidente, polícia, segurança, capital, final, clientes, ex-presidente, local, cara, modelo
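The percentages reported for NOUN follow directly from the raw counts. A small sketch of that arithmetic, using only the figures quoted on this page (the helper name `shares` is illustrative):

```python
def shares(counts):
    """Return each value's share of the total, as a rounded percentage."""
    total = sum(counts.values())
    return {value: round(100 * n / total) for value, n in counts.items()}


# NOUN counts as reported on this page.
noun_gender = {"Fem": 13979, "Masc": 15342}
noun_empty = 1941

print(shares(noun_gender))  # {'Fem': 48, 'Masc': 52}

# Share of all NOUN tokens that carry a Gender value.
with_gender = sum(noun_gender.values())  # 29321
coverage = round(100 * with_gender / (with_gender + noun_empty))
print(coverage)  # 94
```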
| Paradigm filho | Masc | Fem |
|---|---|---|
| Number=Sing | filho | filha |
| Number=Plur | filhos | filhas |
Gender seems to be a lexical feature of NOUN. 97% of lemmas (4517) occur with only one value of Gender.
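One way to check whether a feature behaves lexically is to count how many lemmas are ever seen with more than one value. The sketch below assumes the input is a sequence of (lemma, Gender) pairs for NOUN tokens and that lemmas never seen with Gender are left out of the denominator; the page does not state how its 97% is computed, so that choice is an assumption, and the toy data is illustrative.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (count, share) of lemmas seen with exactly one Gender value.

    Tokens whose Gender is empty (None or '') are ignored, so lemmas that
    never carry Gender do not enter the denominator (an assumption).
    """
    values = defaultdict(set)
    for lemma, gender in pairs:
        if gender:
            values[lemma].add(gender)
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)


# Toy data: 'filho' occurs with both values (filho / filha), the rest with one.
toy = [
    ("filho", "Masc"),
    ("filho", "Fem"),
    ("casa", "Fem"),
    ("ano", "Masc"),
    ("ano", "Masc"),
    ("presidente", None),
]

single, share = single_gender_share(toy)
print(single, f"{share:.0%}")
```

In the toy data two of the three gendered lemmas have a single value; a lemma such as filho, which appears in the paradigm table above with both Masc and Fem forms, falls in the remainder.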
DET
24062 DET tokens (99% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (21217; 88%), Number=Sing (19647; 82%), Definite=Def (18936; 79%).
DET tokens may have the following values of Gender:
Fem (11125; 46% of non-empty Gender): a, as, uma, sua, essa, esta, suas, essas, minha, outras
Masc (12937; 54% of non-empty Gender): o, os, um, seu, esse, este, seus, outros, mesmo, todos
EMPTY (300): cada, mais, qualquer, que, menos, tal, demais, quais, qual, tais
| Paradigm o | Masc | Fem |
|---|---|---|
| ExtPos=ADV\|Number=Sing | o | |
| Number=Sing | o | a |
| Number=Plur | os | as |
ADJ
5484 ADJ tokens (64% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (4436; 81%), Number=Sing (3865; 70%).
ADJ tokens may have the following values of Gender:
Fem (2495; 45% of non-empty Gender): primeira, nova, brasileira, muitas, última, segunda, política, boa, novas, pública
Masc (2989; 55% of non-empty Gender): novo, últimos, primeiro, muitos, bom, passado, preciso, último, segundo, brasileiro
EMPTY (3107): maior, grande, melhor, possível, importante, sociais, difícil, grandes, principal, atual
| Paradigm novo | Masc | Fem |
|---|---|---|
| Number=Sing | novo | nova |
| Number=Plur | novos | novas |
PRON
2722 PRON tokens (43% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (2139; 79%), Person=3 (1778; 65%), Case=EMPTY (1605; 59%).
PRON tokens may have the following values of Gender:
Fem (613; 23% of non-empty Gender): ela, a, elas, as, essa, la, esta, algumas, outra, outras
Masc (2109; 77% of non-empty Gender): o, ele, isso, eles, os, nada, algo, lo, outro, um
EMPTY (3646): que, se, eu, quem, me, tudo, você, nos, nós, ninguém
| Paradigm ele | Masc | Fem |
|---|---|---|
| Number=Sing | ele | ela |
| Number=Plur | eles | elas |
VERB
2141 VERB tokens (13% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (2141; 100%), Person=EMPTY (2141; 100%), Tense=EMPTY (2141; 100%), VerbForm=Part (2141; 100%), Voice=Pass (1808; 84%), Number=Sing (1530; 71%).
VERB tokens may have the following values of Gender:
Fem (737; 34% of non-empty Gender): feita, feitas, realizada, procurada, chamada, criada, seguida, usadas, considerada, dada
Masc (1404; 66% of non-empty Gender): feito, devido, usado, preso, visto, apresentado, chamado, recebido, apontado, conhecido
EMPTY (14963): diz, tem, há, disse, pode, fazer, ter, afirma, deve, teve
| Paradigm ter | Masc | Fem |
|---|---|---|
| Number=Sing | tido | |
| Number=Plur\|Voice=Pass | | tidas |
NUM
650 NUM tokens (20% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (559; 86%).
NUM tokens may have the following values of Gender:
Fem (258; 40% of non-empty Gender): uma, duas, primeira, segunda, meia, 2ª, terceira, 71ª
Masc (392; 60% of non-empty Gender): um, dois, primeiro, 1º, segundo, meio, terceiro, primeiros, 3º, quinto
EMPTY (2649): três, mil, 20, quatro, 30, 2016, 2018, 12, 15, cinco
| Paradigm um | Masc | Fem |
|---|---|---|
| ExtPos=ADV | um | |
| ExtPos=SCONJ | | uma |
| | um | uma |
AUX
62 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (62; 100%), Number=Sing (62; 100%), Person=EMPTY (62; 100%), Tense=EMPTY (62; 100%), VerbForm=Part (62; 100%).
AUX tokens may have the following values of Gender:
Masc (62; 100% of non-empty Gender): sido
EMPTY (4744): é, foi, ser, está, são, era, foram, será, estão, estava
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (18839; 93%),
NOUN –[amod]–> ADJ (3801; 62%),
NOUN –[conj]–> NOUN (766; 50%),
VERB –[nsubj:pass]–> NOUN (485; 91%),
ADJ –[nsubj]–> NOUN (228; 54%),
NUM –[nmod]–> NOUN (131; 56%),
PRON –[nmod]–> NOUN (94; 57%),
PRON –[det]–> DET (75; 63%),
NUM –[det]–> DET (66; 65%),
PRON –[nsubj]–> NOUN (45; 67%).
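Agreement figures of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is one possible reading, not the page's documented method: it assumes tokens are plain dicts with hypothetical keys `id`, `head`, `upos`, `deprel` and `gender`, and it takes the percentage over edges where both nodes have a non-empty Gender, a denominator the page does not specify.

```python
from collections import Counter


def agreement_stats(sentences):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing edges, agreement rate).

    Only edges whose parent and child both carry Gender are counted
    (an assumption about the denominator).
    """
    agree = Counter()
    total = Counter()
    for sent in sentences:
        by_id = {tok["id"]: tok for tok in sent}
        for child in sent:
            parent = by_id.get(child["head"])
            if parent is None:
                continue  # the root has head 0 and no parent token
            if not parent["gender"] or not child["gender"]:
                continue
            key = (parent["upos"], child["deprel"], child["upos"])
            total[key] += 1
            if parent["gender"] == child["gender"]:
                agree[key] += 1
    return {key: (agree[key], agree[key] / total[key]) for key in total}


# Toy sentence "a casa nova" for illustration only (not from the treebank).
toy = [[
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
]]

stats = agreement_stats(toy)
for (parent_upos, deprel, child_upos), (n, rate) in stats.items():
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {rate:.0%})")
```

Sorting the result by the agreeing-edge count and keeping the first ten entries would give a list shaped like the one above.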