This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

Gender: gender

Gender is usually a lexical feature of nouns and an inflectional feature of other parts of speech (pronouns, adjectives, determiners, numerals, verbs) that mark agreement with nouns.
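As a minimal sketch of how this feature is read in practice, the snippet below parses the FEATS column of CoNLL-U annotations and extracts Gender for a noun and the words agreeing with it. The phrase "a casa bonita" and its annotations are illustrative assumptions, not taken from either treebank, and `parse_feats` is a hypothetical helper name.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical annotation of "a casa bonita": (form, UPOS, FEATS).
tokens = [
    ("a", "DET", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art"),
    ("casa", "NOUN", "Gender=Fem|Number=Sing"),
    ("bonita", "ADJ", "Gender=Fem|Number=Sing"),
]

# The noun carries Gender lexically; the determiner and adjective agree with it.
genders = [(form, upos, parse_feats(feats).get("Gender")) for form, upos, feats in tokens]
for form, upos, gender in genders:
    print(form, upos, gender)
```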

Masc: masculine gender

Nouns denoting male persons are masculine. Other nouns may also be grammatically masculine, without any relation to sex.

Examples

Fem: feminine gender

Nouns denoting female persons are feminine. Other nouns may also be grammatically feminine, without any relation to sex.

Examples

Unsp: unspecified

Unsp is used to tag words that can be either masculine or feminine when the context is not enough to determine their gender.

Examples


Treebank Statistics (UD_Portuguese)

This feature is universal. It occurs with 2 different values: Fem, Masc.

104749 tokens (46%) have a non-empty value of Gender. 17718 types (69%) occur at least once with a non-empty value of Gender. 13097 lemmas (73%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (38942; 17% instances), DET (34335; 15% instances), ADJ (10649; 5% instances), PROPN (6096; 3% instances), PRON (5977; 3% instances), VERB (4225; 2% instances), NUM (4109; 2% instances), SYM (416; 0% instances).
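Counts of this kind can be approximated with a short script. The sketch below is not the tool that generated the figures above; it assumes that only syntactic words (integer IDs) are counted and uses a four-token sample sentence invented for illustration. `gender_of` and `gender_stats` are hypothetical helper names.

```python
from collections import Counter

# Invented CoNLL-U sample (10 tab-separated columns per token).
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "O", "o", "DET", "_", "Definite=Def|Gender=Masc|Number=Sing|PronType=Art", "2", "det", "_", "_"),
    ("2", "presidente", "presidente", "NOUN", "_", "Gender=Masc|Number=Sing", "3", "nsubj", "_", "_"),
    ("3", "chegou", "chegar", "VERB", "_", "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin", "0", "root", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"),
])


def gender_of(feats):
    """Return the Gender value from a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None


def gender_stats(conllu_text):
    """Count all tokens, tokens with Gender, and tokens with Gender per UPOS tag."""
    total, with_gender, by_upos = 0, 0, Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # assumption: skip multiword ranges and empty nodes
            continue
        total += 1
        if gender_of(cols[5]) is not None:
            with_gender += 1
            by_upos[cols[3]] += 1
    return total, with_gender, by_upos


total, with_gender, by_upos = gender_stats(SAMPLE)
share = round(100 * with_gender / total)
print(f"{with_gender} tokens ({share}%) have a non-empty value of Gender.")
```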

NOUN

38942 NOUN tokens (93% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (27451; 70%).
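A co-occurrence count like the one above can be sketched as follows. The token list is toy data, and the assumption that percentages are taken over tokens of the given tag that carry Gender is mine, not stated in the source. `cooccurring_features` is a hypothetical helper name.

```python
from collections import Counter


def cooccurring_features(tokens, upos):
    """Count other feature=value pairs on tokens of `upos` that carry Gender.

    `tokens` is an iterable of (UPOS, feature-dict) pairs.
    """
    n = 0
    counts = Counter()
    for tok_upos, feats in tokens:
        if tok_upos != upos or "Gender" not in feats:
            continue
        n += 1
        for name, value in feats.items():
            if name != "Gender":
                counts[f"{name}={value}"] += 1
    return n, counts


# Toy data for illustration only.
tokens = [
    ("NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("NOUN", {"Gender": "Masc", "Number": "Sing"}),
    ("NOUN", {"Gender": "Masc", "Number": "Plur"}),
    ("NOUN", {"Number": "Sing"}),  # no Gender, so not counted
    ("ADJ", {"Gender": "Fem", "Number": "Sing"}),
]

n, counts = cooccurring_features(tokens, "NOUN")
for pair, count in counts.most_common():
    print(f"{pair} ({count}; {round(100 * count / n)}%)")
```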

NOUN tokens may have the following values of Gender:

Paradigm milhão (Masc / Fem)
Number=Sing: milhão
Number=Plur: milhões / milhões
Number=Plur|Typo=Yes: mi

Gender seems to be a lexical feature of NOUN. 98% of lemmas (6317) occur with only one value of Gender.
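The "lexical feature" heuristic can be illustrated by counting how many lemmas appear with exactly one Gender value. This is a sketch on toy data, assuming lemmas that never carry Gender are left out of the denominator; `single_gender_lemma_share` is a hypothetical helper name.

```python
from collections import defaultdict


def single_gender_lemma_share(tokens):
    """Return (lemmas with exactly one Gender value, lemmas with any Gender value).

    `tokens` is an iterable of (lemma, gender-or-None) pairs.
    """
    values = defaultdict(set)
    for lemma, gender in tokens:
        if gender is not None:
            values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Toy data: "milhão" is tagged with both values, as in the paradigm above.
tokens = [
    ("casa", "Fem"),
    ("casa", "Fem"),
    ("livro", "Masc"),
    ("milhão", "Masc"),
    ("milhão", "Fem"),
    ("ver", None),
]

single, total = single_gender_lemma_share(tokens)
print(f"{round(100 * single / total)}% of lemmas ({single}) occur with only one value of Gender.")
```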

DET

34335 DET tokens (96% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (30272; 88%), Number=Sing (27222; 79%), Definite=Def (27214; 79%).

DET tokens may have the following values of Gender:

Paradigm o (Masc / Fem)
Definite=Def: o(s)
Definite=Def|Number=Sing: o / a
Definite=Def|Number=Sing|PronType=Art: o, Os, a / a, as
Definite=Def|Number=Sing|PronType=Art|Typo=Yes: os / a
Definite=Def|Number=Plur: os / as
Definite=Def|Number=Plur|PronType=Art: os, o / as
Definite=Ind|Number=Sing|PronType=Art: o
Number=Sing: o
Number=Sing|NumType=Card|PronType=Ind,Neg,Tot: a
Number=Sing|PronType=Art: o / a
Number=Sing|PronType=Dem: o / a
Number=Plur: os
Number=Plur|PronType=Art: os / as
Number=Plur|PronType=Dem: os / as

ADJ

10649 ADJ tokens (97% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (7554; 71%).

ADJ tokens may have the following values of Gender:

Paradigm grande (Masc / Fem)
Degree=Cmp|Number=Sing: maior / maior
Degree=Cmp|Number=Plur: maiores / maiores
Degree=Sup|Number=Sing: máximo / máxima
Degree=Sup|Number=Plur: máximos
Number=Sing: grande, maior, máximo / grande, maior, máxima
Number=Plur: grandes, máximos / grandes, maiores

PROPN

6096 PROPN tokens (33% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (5871; 96%).

PROPN tokens may have the following values of Gender:

Paradigm EUA (Masc / Fem)
Hyph=Yes: EUA
EUA / EUA

Gender seems to be a lexical feature of PROPN. 97% of lemmas (2691) occur with only one value of Gender.

PRON

5977 PRON tokens (89% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (5313; 89%), Number=Sing (4303; 72%), Case=EMPTY (3839; 64%), Person=EMPTY (3817; 64%).

PRON tokens may have the following values of Gender:

Paradigm que (Masc / Fem)
Number=Sing|PronType=Ind: que / que
Number=Sing|PronType=Int: que
Number=Sing|PronType=Rel: que / que
Number=Sing|PronType=Rel|Typo=Yes: que
Number=Plur|PronType=Ind: que
Number=Plur|PronType=Int: que / que
Number=Plur|PronType=Rel: que / que
PronType=Rel: que

VERB

4225 VERB tokens (16% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Tense=EMPTY (4225; 100%), Person=EMPTY (4224; 100%), Mood=EMPTY (4224; 100%), VerbForm=Part (4223; 100%), Number=Sing (2817; 67%).

VERB tokens may have the following values of Gender:

Paradigm ter (Masc / Fem)
Number=Sing: tido / tida
Number=Plur: tidas

NUM

4109 NUM tokens (96% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (4109; 100%), Number=Plur (3173; 77%).

NUM tokens may have the following values of Gender:

Paradigm um (Masc / Fem)
Number=Sing: um / uma
Number=Plur: um

SYM

416 SYM tokens (92% of all SYM tokens) have a non-empty value of Gender.

The most frequent other feature values with which SYM and Gender co-occurred: Number=Plur (408; 98%).

SYM tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (26415; 94%), NOUN –[amod]–> ADJ (8131; 98%), NOUN –[nummod]–> NUM (2432; 89%), NOUN –[conj]–> NOUN (1265; 58%), VERB –[nsubjpass]–> NOUN (633; 94%), ADJ –[det]–> DET (552; 92%), ADJ –[conj]–> ADJ (362; 96%), ADJ –[nsubj]–> NOUN (354; 95%), NUM –[nmod]–> NOUN (318; 87%), SYM –[nummod]–> NUM (317; 100%).
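Agreement figures of this kind can be approximated by comparing the Gender value of each node with that of its head. The sketch below assumes that only pairs in which both nodes carry Gender are considered; the source does not state the denominator behind the percentages. The sentence is toy data, and `gender_agreement` is a hypothetical helper name.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count agreeing and total parent-child pairs, keyed by (parent UPOS, deprel, child UPOS).

    `sentence` is a list of dicts with keys id, head, upos, deprel, gender.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, seen = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        seen[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, seen


# Toy sentence: determiner and adjective both attach to a feminine noun.
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
]

agree, seen = gender_agreement(sentence)
for (parent, rel, child), count in agree.most_common():
    pct = round(100 * count / seen[(parent, rel, child)])
    print(f"{parent} -[{rel}]-> {child} ({count}; {pct}%)")
```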


Treebank Statistics (UD_Portuguese-Bosque)

This feature is universal, but the value Unsp is language-specific. It occurs with 3 different values: Fem, Masc, Unsp.

108919 tokens (48%) have a non-empty value of Gender. 18839 types (72%) occur at least once with a non-empty value of Gender. 14360 lemmas (80%) occur at least once with a non-empty value of Gender. The feature is used with 10 part-of-speech tags: NOUN (40747; 18% instances), DET (34025; 15% instances), PROPN (11709; 5% instances), ADJ (10975; 5% instances), PRON (7034; 3% instances), VERB (4203; 2% instances), SYM (203; 0% instances), AUX (19; 0% instances), INTJ (3; 0% instances), PART (1; 0% instances).

NOUN

40747 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (28851; 71%).

NOUN tokens may have the following values of Gender:

Paradigm presidente (Masc / Fem / Unsp)
Number=Sing: presidente / presidente / Presidente
Number=Plur: presidentes

Gender seems to be a lexical feature of NOUN. 97% of lemmas (6428) occur with only one value of Gender.

DET

34025 DET tokens (97% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (29678; 87%), Number=Sing (26885; 79%), Definite=Def (26487; 78%).

DET tokens may have the following values of Gender:

Paradigm muito (Masc / Fem / Unsp)
Number=Sing: muito, mais / mais, muita
Number=Plur: muitos, mais / muitas, mais / mais
Number=Unsp: mais

PROPN

11709 PROPN tokens (61% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (11296; 96%).

PROPN tokens may have the following values of Gender:

Paradigm São (Masc / Fem / Unsp)
São / São / São

Gender seems to be a lexical feature of PROPN. 94% of lemmas (4438) occur with only one value of Gender.

ADJ

10975 ADJ tokens (99% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (7836; 71%).

ADJ tokens may have the following values of Gender:

Paradigm grande (Masc / Fem / Unsp)
Number=Sing: maior, grande, máximo / maior, grande, máxima
Number=Plur: grandes, maiores, máximos / grandes, maiores / grandes

PRON

7034 PRON tokens (100% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (4896; 70%), Person=EMPTY (4628; 66%), Case=EMPTY (4498; 64%).

PRON tokens may have the following values of Gender:

Paradigm que (Masc / Fem / Unsp)
Definite=Def|Number=Sing|PronType=Art: que
Number=Sing|PronType=Dem: que
Number=Sing|PronType=Ind: que / que
Number=Sing|PronType=Int: que / que / que
Number=Sing|PronType=Rel: que / que / que
Number=Plur|PronType=Ind: que
Number=Plur|PronType=Int: que / que
Number=Plur|PronType=Rel: que / que / que
Number=Unsp|PronType=Ind: que
Number=Unsp|PronType=Rel: que
PronType=Rel: que

VERB

4203 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Tense=EMPTY (4203; 100%), Person=EMPTY (4202; 100%), Mood=EMPTY (4202; 100%), VerbForm=Part (4201; 100%), Number=Sing (2802; 67%).

VERB tokens may have the following values of Gender:

Paradigm ter (Masc / Fem)
Number=Sing: tido
Number=Sing|Voice=Pass: tido / tida
Number=Plur: tidas

SYM

203 SYM tokens (100% of all SYM tokens) have a non-empty value of Gender.

The most frequent other feature values with which SYM and Gender co-occurred: Number=Plur (203; 100%).

SYM tokens may have the following values of Gender:

AUX

19 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (19; 100%), Tense=EMPTY (19; 100%), VerbForm=Part (19; 100%), Person=EMPTY (19; 100%), Number=Sing (13; 68%).

AUX tokens may have the following values of Gender:

Gender seems to be a lexical feature of AUX. 100% of lemmas (13) occur with only one value of Gender.

INTJ

3 INTJ tokens (7% of all INTJ tokens) have a non-empty value of Gender.

INTJ tokens may have the following values of Gender:

PART

1 PART token (25% of all PART tokens) has a non-empty value of Gender.

The most frequent other feature values with which PART and Gender co-occurred: Number=Sing (1; 100%).

PART tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (26659; 95%), NOUN –[amod]–> ADJ (8360; 98%), PROPN –[det]–> DET (4417; 80%), NOUN –[acl]–> VERB (1803; 68%), NOUN –[appos]–> PROPN (1204; 88%), NOUN –[conj]–> NOUN (1198; 60%), ADJ –[det]–> DET (498; 96%), PROPN –[appos]–> NOUN (425; 77%), NOUN –[appos]–> NOUN (400; 64%), PROPN –[appos]–> PROPN (398; 79%).


Gender in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]