Treebank Statistics: UD_Catalan: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
193535 tokens (36%) have a non-empty value of Gender.
14362 types (44%) occur at least once with a non-empty value of Gender.
9877 lemmas (42%) occur at least once with a non-empty value of Gender.
The feature is used with 11 part-of-speech tags: NOUN (85332; 16% instances), DET (61191; 12% instances), ADJ (20196; 4% instances), ADP (14673; 3% instances), VERB (6669; 1% instances), PRON (3247; 1% instances), NUM (1369; 0% instances), AUX (796; 0% instances), ADV (60; 0% instances), PROPN (1; 0% instances), SYM (1; 0% instances).
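Counts like the ones above could be reproduced from a CoNLL-U file with a short script. The following is a minimal sketch, not the tool that generated this page: the function names `parse_feats` and `feature_usage` are hypothetical, and it assumes that a "type" is a distinct word form and that multiword ranges and empty nodes are not counted as tokens.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def feature_usage(conllu_text, feature="Gender"):
    """Count tokens, types and lemmas that carry a non-empty value of `feature`."""
    total_tokens = 0
    tokens_with_value = 0
    by_upos = Counter()
    types, lemmas = set(), set()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only ordinary word lines: comments and blank lines do not have
        # 10 columns, multiword ranges (1-2) and empty nodes (1.1) have
        # non-numeric IDs.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        total_tokens += 1
        if feature in parse_feats(cols[5]):
            tokens_with_value += 1
            by_upos[cols[3]] += 1   # column 4 is UPOS
            types.add(cols[1])      # assumption: a "type" is a distinct word form
            lemmas.add(cols[2])
    return {
        "total_tokens": total_tokens,
        "tokens_with_value": tokens_with_value,
        "types": len(types),
        "lemmas": len(lemmas),
        "by_upos": dict(by_upos),
    }
```

Dividing `tokens_with_value` by `total_tokens` would give a share comparable to the 36% reported above, provided the page uses the same token definition.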
NOUN
85332 NOUN tokens (86% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (59388; 70%).
NOUN tokens may have the following values of Gender:
Fem (41322; 48% of non-empty Gender): persones, obres, obra, empresa, llei, ciutat, zona, cosa, situació, banda
Masc (44010; 52% of non-empty Gender): anys, milions, any, president, temps, grup, projecte, cas, partit, director
EMPTY (13419): pessetes, any, través, cap, euros, juny, part, partir, dia, terme
Paradigm cas | Masc | Fem |
---|---|---|
Number=Sing | cas | |
Number=Plur | casos | cas |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (6537) occur with only one value of Gender.
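The "lexical feature" judgement rests on how many lemmas are observed with a single value of Gender. A minimal sketch of that check follows; the function name `single_value_lemmas` is hypothetical, and the choice to ignore lemmas that never carry a value is an assumption, since the page does not state its denominator.

```python
from collections import defaultdict


def single_value_lemmas(observations):
    """Return (lemmas with exactly one value, lemmas with at least one value).

    `observations` is an iterable of (lemma, value) pairs for one part of
    speech; `value` is None when the token has no Gender.
    """
    values_by_lemma = defaultdict(set)
    for lemma, value in observations:
        if value is not None:
            values_by_lemma[lemma].add(value)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)
```

A share close to 100% suggests the value is fixed per lemma (lexical), while a low share suggests inflection, as with determiners and adjectives.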
DET
61191 DET tokens (84% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (47167; 77%), PronType=Art (43634; 71%), Definite=Def (39267; 64%).
DET tokens may have the following values of Gender:
Fem (33547; 55% of non-empty Gender): la, les, una, seva, aquesta, seves, aquestes, totes, altra, tota
Masc (27644; 45% of non-empty Gender): el, els, un, aquest, seu, seus, aquests, tots, tot, mateix
EMPTY (11405): l’, altres, cap, cada, diferents, qualsevol, qual, nostres, meva, prou
Paradigm el | Masc | Fem |
---|---|---|
Definite=Def|Number=Sing|PronType=Art | el | la, L' |
Definite=Def|Number=Plur|PronType=Art | els | les |
Number=Sing|PronType=Art | el | la |
Number=Plur|Person=3|Poss=Yes|PronType=Prs | les | |
Number=Plur|PronType=Art | els | les |
ADJ
20196 ADJ tokens (67% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (14607; 72%), Number=Sing (14049; 70%).
ADJ tokens may have the following values of Gender:
Fem (9118; 45% of non-empty Gender): primera, nova, catalana, noves, política, segona, única, pública, bona, espanyola
Masc (11078; 55% of non-empty Gender): passat, primer, nou, espanyol, nous, català, públic, últims, polític, últim
EMPTY (9849): gran, general, grans, actual, important, social, baix, possible, municipal, anterior
Paradigm nou | Masc | Fem |
---|---|---|
Number=Sing | nou | nova |
Number=Plur | nous | noves |
ADP
14673 ADP tokens (17% of all ADP tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADP and Gender co-occurred: AdpType=Preppron (14673; 100%), Number=Sing (10823; 74%).
ADP tokens may have the following values of Gender:
Fem (3; 0% of non-empty Gender): da
Masc (14670; 100% of non-empty Gender): del, al, dels, als, pel, pels, do
EMPTY (73302): de, a, d’, en, per, amb, entre, sobre, segons, des
Gender seems to be a lexical feature of ADP. 100% of lemmas (14) occur with only one value of Gender.
VERB
6669 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (6669; 100%), Person=EMPTY (6669; 100%), Tense=Past (6669; 100%), VerbForm=Part (6669; 100%), Number=Sing (6433; 96%).
VERB tokens may have the following values of Gender:
Fem (332; 5% of non-empty Gender): dictada, aprovada, presentada, considerada, donada, atesa, inclosa, inaugurada, traslladada, coneguda
Masc (6337; 95% of non-empty Gender): fet, explicat, dit, presentat, tingut, assegurat, destacat, demanat, passat, assenyalat
EMPTY (33245): fer, té, ha, tenir, dir, donar, tenen, arribar, fa, considera
Paradigm fer | Masc | Fem |
---|---|---|
Number=Sing | fet | |
Number=Plur | | fetes |
PRON
3247 PRON tokens (14% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (2458; 76%), Person=EMPTY (2206; 68%).
PRON tokens may have the following values of Gender:
Fem (907; 28% of non-empty Gender): la, una, les, aquesta, altra, unes, ella, algunes, totes, elles
Masc (2340; 72% of non-empty Gender): un, tot, el, ell, uns, lo, ells, alguns, aquest, tots
EMPTY (20122): que, es, s’, hi, se, li, on, què, això, ho
Paradigm ell | Masc | Fem |
---|---|---|
Case=Acc|Number=Sing | el, lo, 'l, li, -lo | la, -la |
Case=Acc|Number=Plur | els, 'ls | les |
Number=Sing | ell | ella |
Number=Plur | ells, els | elles |
NUM
1369 NUM tokens (15% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=EMPTY (1369; 100%), NumType=Card (1369; 100%), Number=Plur (891; 65%).
NUM tokens may have the following values of Gender:
Fem (500; 37% of non-empty Gender): dues, una, mitja, ambdues, desena, tres-centes, Desenes, Vuit-centes, cinquena, dues-centes
Masc (869; 63% of non-empty Gender): dos, un, mig, ambdós, tercer, quart, cinc-cents, 2, centenars, desè
EMPTY (7892): tres, quatre, cent, 10, cinc, sis, 15, 30, 20, vuit
Paradigm dos | Masc | Fem |
---|---|---|
Number=Sing | dos | |
Number=Plur | dos | dues |
 | dos | |
AUX
796 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (796; 100%), Person=EMPTY (796; 100%), Tense=Past (796; 100%), VerbForm=Part (796; 100%), Number=Sing (781; 98%).
AUX tokens may have the following values of Gender:
Fem (9; 1% of non-empty Gender): aprovada, controlades, declarades, endeutada, investigada, investigades, presentades, remodelada, sostreta
Masc (787; 99% of non-empty Gender): estat, pogut, hagut, començat, volgut, anat, fet, tornat, deixat, arribat
EMPTY (23230): va, ha, és, van, han, ser, són, està, havia, pot
Gender seems to be a lexical feature of AUX. 100% of lemmas (59) occur with only one value of Gender.
ADV
60 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Polarity=EMPTY (60; 100%).
ADV tokens may have the following values of Gender:
Masc (60; 100% of non-empty Gender): més, fins, enfront, entorn, enllà, quant, prop, enmig
EMPTY (15396): no, més, també, ja, després, ahir, molt, avui, només, ara
PROPN
1 PROPN token (0% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): Justícia
EMPTY (46731): Catalunya, Barcelona, Generalitat, Govern, sant, Ajuntament, Girona, Josep, CiU, PP
SYM
1 SYM token (0% of all SYM tokens) has a non-empty value of Gender.
The most frequent other feature values with which SYM and Gender co-occurred: NumForm=EMPTY (1; 100%), NumType=EMPTY (1; 100%).
SYM tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): 1%
EMPTY (4632): ’, %, 50%, 10%, 30%, 5%, 40%, 2%, 25%, 1%
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (46906; 81%),
NOUN –[amod]–> ADJ (14858; 64%),
NOUN –[conj]–> NOUN (2642; 53%),
DET –[det]–> DET (1177; 81%),
NOUN –[appos]–> NOUN (1032; 51%),
ADJ –[nsubj]–> NOUN (558; 60%),
ADJ –[conj]–> ADJ (427; 52%),
PRON –[nmod]–> NOUN (411; 73%),
ADJ –[det]–> DET (384; 62%),
NOUN –[acl]–> ADJ (148; 60%).
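Agreement counts of this kind could be gathered by comparing each token's Gender with that of its syntactic head. The sketch below is illustrative only: `agreement_counts` is a hypothetical name, and the choice of denominator (all instances of a parent/relation/child pattern, whether or not both nodes carry Gender) is an assumption, since the page does not say how its percentages are computed.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def agreement_counts(conllu_text, feature="Gender"):
    """Count, per (parent UPOS, deprel, child UPOS), how often both nodes
    carry the same value of `feature`, and how often the pattern occurs."""
    agree, total = Counter(), Counter()
    sentence = {}  # token ID -> (upos, feats, head, deprel)

    def flush():
        for upos, feats, head, deprel in sentence.values():
            parent = sentence.get(head)
            if parent is None:  # head is 0 (root) or missing
                continue
            key = (parent[0], deprel, upos)
            total[key] += 1
            value = feats.get(feature)
            if value is not None and value == parent[1].get(feature):
                agree[key] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the sentence
            continue
        cols = line.split("\t")
        # Skip comments, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence[cols[0]] = (cols[3], parse_feats(cols[5]), cols[6], cols[7])
    flush()  # handle a final sentence without a trailing blank line
    return agree, total
```

Sorting `agree` with `agree.most_common(10)` would yield a ranking analogous to the list above, and `agree[key] / total[key]` a ratio that may or may not match the reported percentages, depending on the denominator the page actually uses.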