Treebank Statistics: UD_Portuguese-CINTIL: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
200191 tokens (42%) have a non-empty value of Gender.
18595 types (54%) occur at least once with a non-empty value of Gender.
11518 lemmas (46%) occur at least once with a non-empty value of Gender.
The feature is used with 7 part-of-speech tags: NOUN (86397; 18% instances), DET (77510; 16% instances), ADJ (21937; 5% instances), VERB (7554; 2% instances), PRON (4854; 1% instances), NUM (1938; 0% instances), PROPN (1; 0% instances).
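
Counts like the ones above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page. It assumes that a "type" is a distinct word form (case-sensitive) and that multiword-token ranges and empty nodes are excluded; the function names and the `SAMPLE` data are hypothetical.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def read_tokens(lines):
    """Yield (form, lemma, upos, feats) for every syntactic word."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])

def gender_summary(lines, feature="Gender"):
    """Count tokens, types and lemmas that carry a non-empty value of `feature`."""
    tokens = with_feature = 0
    types, types_f = set(), set()
    lemmas, lemmas_f = set(), set()
    per_upos = Counter()
    for form, lemma, upos, feats in read_tokens(lines):
        tokens += 1
        types.add(form)
        lemmas.add(lemma)
        if feature in feats:
            with_feature += 1
            types_f.add(form)
            lemmas_f.add(lemma)
            per_upos[upos] += 1
    # Each pair is (count with the feature, total count).
    return {
        "tokens": (with_feature, tokens),
        "types": (len(types_f), len(types)),
        "lemmas": (len(lemmas_f), len(lemmas)),
        "per_upos": dict(per_upos),
    }

# Hypothetical four-token sample, not taken from the treebank.
SAMPLE = """\
# text = A casa nova há
1\tA\to\tDET\t_\tGender=Fem|Number=Sing\t2\tdet\t_\t_
2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tnova\tnovo\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_
4\thá\thaver\tVERB\t_\t_\t2\tparataxis\t_\t_
""".splitlines()

summary = gender_summary(SAMPLE)
```

Dividing the first element of each pair by the second gives the percentages reported in the overview.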
NOUN
86397 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (61066; 71%).
NOUN tokens may have the following values of Gender:
Fem (40592; 47% of non-empty Gender): pessoas, empresa, vida, situação, parte, vez, questão, casa, acções, história
Masc (45805; 53% of non-empty Gender): anos, ano, dia, presidente, mercado, caso, lado, país, fim, tempo
EMPTY (826): segurança, razão, líder, jovens, capital, razões, capitais, final, analistas, estudantes
Paradigm presidente | Masc | Fem |
---|---|---|
Number=Sing | presidente | presidente |
Number=Plur | presidentes |
Gender seems to be a lexical feature of NOUN: 96% of lemmas (7369) occur with only one value of Gender.
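
The "lexical feature" heuristic can be sketched as follows. This is an illustration under assumptions: the share is taken over lemmas that occur at least once with a non-empty Gender, which may differ from the denominator used to produce the figure above. The function name and the `ROWS` data are hypothetical.

```python
from collections import defaultdict

def single_value_share(rows, upos, feature="Gender"):
    """rows: iterable of (lemma, upos, feats_dict).

    Returns (number of lemmas seen with exactly one value of `feature`,
    share of such lemmas among all lemmas seen with any value).
    """
    values = defaultdict(set)
    for lemma, tag, feats in rows:
        if tag == upos and feature in feats:
            values[lemma].add(feats[feature])
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)

# Hypothetical rows: 'presidente' occurs with both values, 'líder' with none.
ROWS = [
    ("casa", "NOUN", {"Gender": "Fem"}),
    ("ano", "NOUN", {"Gender": "Masc"}),
    ("presidente", "NOUN", {"Gender": "Masc"}),
    ("presidente", "NOUN", {"Gender": "Fem"}),
    ("líder", "NOUN", {}),
]

count, share = single_value_share(ROWS, "NOUN")
```

A high share suggests that the value is a property of the lemma rather than of the individual word form.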
DET
77510 DET tokens (99% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (70597; 91%), Definite=Def (61198; 79%), Number=Sing (60959; 79%).
DET tokens may have the following values of Gender:
Fem (35910; 46% of non-empty Gender): a, as, uma, esta, essa, toda, todas, estas, algumas, muitas
Masc (41600; 54% of non-empty Gender): o, os, um, este, isso, todos, esse, alguns, estes, isto
EMPTY (555): qualquer, cada, tal, isso, quaisquer, isto, tais, aquilo, bastantes
Paradigm qualquer | Masc | Fem |
---|---|---|
| | qualquer | qualquer |
Gender seems to be a lexical feature of DET: 92% of lemmas (69) occur with only one value of Gender.
ADJ
21937 ADJ tokens (88% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (15527; 71%).
ADJ tokens may have the following values of Gender:
Fem (10193; 46% of non-empty Gender): nova, primeira, outra, outras, grande, segunda, portuguesa, mesma, novas, boa
Masc (11744; 54% of non-empty Gender): novo, outro, outros, primeiro, novos, mesmo, grande, últimos, último, português
EMPTY (3132): grande, grandes, diferente, importante, bom, maior, diferentes, difícil, nacional, forte
Paradigm outro | Masc | Fem |
---|---|---|
Number=Sing | outro | outra, outro |
Number=Plur | outros | outras |
VERB
7554 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (7554; 100%), Person=EMPTY (7554; 100%), Tense=EMPTY (7554; 100%), VerbForm=Part (7554; 100%), Number=Sing (4948; 66%).
VERB tokens may have the following values of Gender:
Fem (3295; 44% of non-empty Gender): feita, feitas, marcada, passada, apresentada, sediada, prevista, anunciada, dada, tomada
Masc (4259; 56% of non-empty Gender): passado, feito, divulgados, apresentado, aprovado, preso, divulgado, apresentados, dado, feitos
EMPTY (35067): há, tem, vai, pode, disse, é, ter, diz, quer, têm
Paradigm fazer,feito | Masc | Fem |
---|---|---|
Number=Sing | feito, feitinho | feita |
Number=Plur | feitos | feitas |
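
The co-occurrence figures for VERB (for example VerbForm=Part at 100%) can be approximated with a simple counter. This sketch assumes the caller chooses which other features to track and that a tracked feature missing from a token is reported as EMPTY; how the original statistics select features is not stated here. The function name, the `tracked` parameter and the `ROWS` data are hypothetical.

```python
from collections import Counter

def cooccurrence(rows, upos, feature="Gender",
                 tracked=("Number", "VerbForm", "Mood")):
    """rows: iterable of (upos, feats_dict).

    Returns (number of `upos` tokens with a non-empty `feature`,
    Counter of 'Other=Value' strings seen on those tokens).
    """
    total = 0
    counts = Counter()
    for tag, feats in rows:
        if tag != upos or feature not in feats:
            continue
        total += 1
        for other in tracked:
            # A tracked feature that is absent is counted as EMPTY.
            counts[f"{other}={feats.get(other, 'EMPTY')}"] += 1
    return total, counts

# Hypothetical rows: two participles with Gender, one finite verb without.
ROWS = [
    ("VERB", {"Gender": "Fem", "Number": "Sing", "VerbForm": "Part"}),
    ("VERB", {"Gender": "Masc", "Number": "Plur", "VerbForm": "Part"}),
    ("VERB", {"Mood": "Ind", "Number": "Sing", "VerbForm": "Fin"}),
    ("NOUN", {"Gender": "Fem", "Number": "Sing"}),
]

total, counts = cooccurrence(ROWS, "VERB")
```

Dividing each count by `total` yields the percentage shown next to the feature value.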
PRON
4854 PRON tokens (40% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: PronType=Prs (4130; 85%), Number=Sing (3630; 75%), Person=EMPTY (3367; 69%).
PRON tokens may have the following values of Gender:
Fem (1804; 37% of non-empty Gender): sua, suas, minha, ela, nossa, elas, -a, a, cuja, minhas
Masc (3050; 63% of non-empty Gender): seu, ele, tudo, seus, eles, meu, o, nosso, meus, nada
EMPTY (7407): que, -se, se, quem, onde, eu, nós, -me, me, -lhe
Paradigm -se | Masc | Fem |
---|---|---|
| | -se | -se |
NUM
1938 NUM tokens (28% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (1476; 76%), Number=Plur (1473; 76%).
NUM tokens may have the following values of Gender:
Fem (387; 20% of non-empty Gender): duas, três, cinco, mil, quatro, Ambas, meia, seis, sete, vinte
Masc (1551; 80% of non-empty Gender): dois, cento, duzentos, milhões, três, mil, quatro, cinco, sete, dez
EMPTY (5088): mil, três, quatro, vinte, cinco, seis, dez, sete, 30, 20
Paradigm mil | Masc | Fem |
---|---|---|
| | mil | mil |
PROPN
1 PROPN token (0% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): Presidente
EMPTY (45817): Portugal, Lisboa, Porto, Pedro, José, Governo, João, Câmara, Manuel, PSD
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (59170; 99%),
NOUN –[amod]–> ADJ (17898; 91%),
NOUN –[amod]–> VERB (2954; 56%),
NOUN –[amod]–> PRON (1414; 99%),
NOUN –[det:poss]–> PRON (1124; 100%),
ADJ –[nsubj]–> NOUN (977; 58%),
ADJ –[det]–> DET (887; 94%),
NOUN –[parataxis]–> NOUN (575; 59%),
NOUN –[conj]–> NOUN (544; 55%),
NOUN –[nsubj]–> DET (368; 80%).
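
Agreement counts of this kind can be sketched from the basic dependency tree. The example below is an illustration, not the script behind this page. It assumes the percentage is the share of agreeing pairs among parent-child pairs in which both nodes have a non-empty Gender, that every HEAD is an integer pointing to a token in the same sentence, and that values are compared as plain strings. The function names and the `SAMPLE` data are hypothetical.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict ('_' means no features)."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def sentences(lines):
    """Yield one dict per sentence: token id -> (upos, feats, head, deprel)."""
    sent = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            if sent:
                yield sent
                sent = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges and empty nodes.
        if not cols[0].isdigit():
            continue
        sent[int(cols[0])] = (cols[3], parse_feats(cols[5]), int(cols[6]), cols[7])
    if sent:
        yield sent

def gender_agreement(lines, feature="Gender"):
    """Map (parent UPOS, deprel, child UPOS) -> (agreeing pairs, share agreeing)."""
    agree, both = Counter(), Counter()
    for sent in sentences(lines):
        for upos, feats, head, deprel in sent.values():
            if head == 0 or feature not in feats:
                continue
            parent_upos, parent_feats, _, _ = sent[head]
            if feature not in parent_feats:
                continue
            key = (parent_upos, deprel, upos)
            both[key] += 1
            if parent_feats[feature] == feats[feature]:
                agree[key] += 1
    return {key: (agree[key], agree[key] / both[key]) for key in both}

# Hypothetical sample; the second sentence contains a deliberate mismatch.
SAMPLE = """\
1\tA\to\tDET\t_\tGender=Fem|Number=Sing\t2\tdet\t_\t_
2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tnova\tnovo\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_

1\tO\to\tDET\t_\tGender=Masc|Number=Sing\t2\tdet\t_\t_
2\tdia\tdia\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_
3\tboa\tbom\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_
""".splitlines()

agreement = gender_agreement(SAMPLE)
```

Sorting the result by the first element of each value, in descending order, gives a ranking comparable to the list above.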