Treebank Statistics: UD_Catalan: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
193535 tokens (36%) have a non-empty value of Gender.
14362 types (44%) occur at least once with a non-empty value of Gender.
9877 lemmas (42%) occur at least once with a non-empty value of Gender.
The feature is used with 11 part-of-speech tags: NOUN (85332; 16% of all tokens), DET (61191; 12%), ADJ (20196; 4%), ADP (14673; 3%), VERB (6669; 1%), PRON (3247; 1%), NUM (1369; 0%), AUX (796; 0%), ADV (60; 0%), PROPN (1; 0%), SYM (1; 0%).
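The per-tag counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function names and the toy sentence are made up for illustration, and the only thing assumed about the data is the standard 10-column CoNLL-U layout (UPOS in column 4, FEATS in column 6).

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_counts_by_upos(conllu_text):
    """Return (tokens with Gender per UPOS, all tokens per UPOS)."""
    with_gender, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total

# Toy sentence invented for this example; it is not taken from the treebank.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\tla\tel\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_",
    "2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_",
    "3\tnova\tnou\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_",
    "4\tgran\tgran\tADJ\t_\tNumber=Sing\t2\tamod\t_\t_",
])

with_gender, total = gender_counts_by_upos(SAMPLE)
for upos in sorted(total):
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]}/{total[upos]} ({share:.0f}%)")
```

On the toy input, one of the two ADJ tokens carries Gender, which mirrors the kind of per-tag share reported in the sections below (for example, the share of all NOUN tokens with a non-empty Gender).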
NOUN
85332 NOUN tokens (86% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (59388; 70%).
NOUN tokens may have the following values of Gender:
Fem (41322; 48% of non-empty Gender): persones, obres, obra, empresa, llei, ciutat, zona, cosa, situació, banda
Masc (44010; 52% of non-empty Gender): anys, milions, any, president, temps, grup, projecte, cas, partit, director
EMPTY (13419): pessetes, any, través, cap, euros, juny, part, partir, dia, terme
| Paradigm cas | Masc | Fem |
|---|---|---|
| Number=Sing | cas | |
| Number=Plur | casos | cas |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (6537) occur with only one value of Gender.
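The "lexical feature" check can be sketched as counting, for one part of speech, how many lemmas are ever seen with exactly one Gender value. The sketch below assumes that only lemmas carrying Gender at least once enter the denominator; the page does not state this, so treat it as an assumption. The function name and the toy rows are hypothetical.

```python
from collections import defaultdict

def single_gender_lemma_share(rows, upos):
    """Return (lemmas seen with exactly one Gender value, lemmas seen with any Gender).

    rows is an iterable of (lemma, upos, feats) tuples, with feats in
    CoNLL-U notation, e.g. 'Gender=Fem|Number=Sing'.
    """
    values = defaultdict(set)
    for lemma, tag, feats in rows:
        if tag != upos:
            continue
        for pair in feats.split("|"):
            if pair.startswith("Gender="):
                # A feature may list several values separated by commas.
                values[lemma].update(pair.split("=", 1)[1].split(","))
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)

# Toy rows invented for this example; they are not treebank counts.
ROWS = [
    ("casa", "NOUN", "Gender=Fem|Number=Sing"),
    ("casa", "NOUN", "Gender=Fem|Number=Plur"),
    ("any", "NOUN", "Gender=Masc|Number=Sing"),
    ("artista", "NOUN", "Gender=Fem|Number=Sing"),
    ("artista", "NOUN", "Gender=Masc|Number=Sing"),
    ("nou", "ADJ", "Gender=Masc|Number=Sing"),
]

single, seen = single_gender_lemma_share(ROWS, "NOUN")
print(f"{single} of {seen} NOUN lemmas occur with only one value of Gender")
```

A share close to 100% suggests that the value is fixed per lemma (lexical) rather than chosen by agreement (inflectional).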
DET
61191 DET tokens (84% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (47167; 77%), PronType=Art (43634; 71%), Definite=Def (39267; 64%).
DET tokens may have the following values of Gender:
Fem (33547; 55% of non-empty Gender): la, les, una, seva, aquesta, seves, aquestes, totes, altra, tota
Masc (27644; 45% of non-empty Gender): el, els, un, aquest, seu, seus, aquests, tots, tot, mateix
EMPTY (11405): l’, altres, cap, cada, diferents, qualsevol, qual, nostres, meva, prou
| Paradigm el | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing\|PronType=Art | el | la, L' |
| Definite=Def\|Number=Plur\|PronType=Art | els | les |
| Number=Sing\|PronType=Art | el | la |
| Number=Plur\|Person=3\|Poss=Yes\|PronType=Prs | les | |
| Number=Plur\|PronType=Art | els | les |
ADJ
20196 ADJ tokens (67% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (14607; 72%), Number=Sing (14049; 70%).
ADJ tokens may have the following values of Gender:
Fem (9118; 45% of non-empty Gender): primera, nova, catalana, noves, política, segona, única, pública, bona, espanyola
Masc (11078; 55% of non-empty Gender): passat, primer, nou, espanyol, nous, català, públic, últims, polític, últim
EMPTY (9849): gran, general, grans, actual, important, social, baix, possible, municipal, anterior
| Paradigm nou | Masc | Fem |
|---|---|---|
| Number=Sing | nou | nova |
| Number=Plur | nous | noves |
ADP
14673 ADP tokens (17% of all ADP tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADP and Gender co-occurred: AdpType=Preppron (14673; 100%), Number=Sing (10823; 74%).
ADP tokens may have the following values of Gender:
Fem (3; 0% of non-empty Gender): da
Masc (14670; 100% of non-empty Gender): del, al, dels, als, pel, pels, do
EMPTY (73302): de, a, d’, en, per, amb, entre, sobre, segons, des
Gender seems to be a lexical feature of ADP: 100% of lemmas (14) occur with only one value of Gender.
VERB
6669 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (6669; 100%), Person=EMPTY (6669; 100%), Tense=Past (6669; 100%), VerbForm=Part (6669; 100%), Number=Sing (6433; 96%).
VERB tokens may have the following values of Gender:
Fem (332; 5% of non-empty Gender): dictada, aprovada, presentada, considerada, donada, atesa, inclosa, inaugurada, traslladada, coneguda
Masc (6337; 95% of non-empty Gender): fet, explicat, dit, presentat, tingut, assegurat, destacat, demanat, passat, assenyalat
EMPTY (33245): fer, té, ha, tenir, dir, donar, tenen, arribar, fa, considera
| Paradigm fer | Masc | Fem |
|---|---|---|
| Number=Sing | fet | |
| Number=Plur | | fetes |
PRON
3247 PRON tokens (14% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (2458; 76%), Person=EMPTY (2206; 68%).
PRON tokens may have the following values of Gender:
Fem (907; 28% of non-empty Gender): la, una, les, aquesta, altra, unes, ella, algunes, totes, elles
Masc (2340; 72% of non-empty Gender): un, tot, el, ell, uns, lo, ells, alguns, aquest, tots
EMPTY (20122): que, es, s’, hi, se, li, on, què, això, ho
| Paradigm ell | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | el, lo, 'l, li, -lo | la, -la |
| Case=Acc\|Number=Plur | els, 'ls | les |
| Number=Sing | ell | ella |
| Number=Plur | ells, els | elles |
NUM
1369 NUM tokens (15% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=EMPTY (1369; 100%), NumType=Card (1369; 100%), Number=Plur (891; 65%).
NUM tokens may have the following values of Gender:
Fem (500; 37% of non-empty Gender): dues, una, mitja, ambdues, desena, tres-centes, Desenes, Vuit-centes, cinquena, dues-centes
Masc (869; 63% of non-empty Gender): dos, un, mig, ambdós, tercer, quart, cinc-cents, 2, centenars, desè
EMPTY (7892): tres, quatre, cent, 10, cinc, sis, 15, 30, 20, vuit
| Paradigm dos | Masc | Fem |
|---|---|---|
| Number=Sing | dos | |
| Number=Plur | dos | dues |
| | dos | |
AUX
796 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (796; 100%), Person=EMPTY (796; 100%), Tense=Past (796; 100%), VerbForm=Part (796; 100%), Number=Sing (781; 98%).
AUX tokens may have the following values of Gender:
Fem (9; 1% of non-empty Gender): aprovada, controlades, declarades, endeutada, investigada, investigades, presentades, remodelada, sostreta
Masc (787; 99% of non-empty Gender): estat, pogut, hagut, començat, volgut, anat, fet, tornat, deixat, arribat
EMPTY (23230): va, ha, és, van, han, ser, són, està, havia, pot
Gender seems to be a lexical feature of AUX: 100% of lemmas (59) occur with only one value of Gender.
ADV
60 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Polarity=EMPTY (60; 100%).
ADV tokens may have the following values of Gender:
Masc (60; 100% of non-empty Gender): més, fins, enfront, entorn, enllà, quant, prop, enmig
EMPTY (15396): no, més, també, ja, després, ahir, molt, avui, només, ara
PROPN
1 PROPN token (0% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): Justícia
EMPTY (46731): Catalunya, Barcelona, Generalitat, Govern, sant, Ajuntament, Girona, Josep, CiU, PP
SYM
1 SYM token (0% of all SYM tokens) has a non-empty value of Gender.
The most frequent other feature values with which SYM and Gender co-occurred: NumForm=EMPTY (1; 100%), NumType=EMPTY (1; 100%).
SYM tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): 1%
EMPTY (4632): ’, %, 50%, 10%, 30%, 5%, 40%, 2%, 25%, 1%
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (46906; 81%),
NOUN –[amod]–> ADJ (14858; 64%),
NOUN –[conj]–> NOUN (2642; 53%),
DET –[det]–> DET (1177; 81%),
NOUN –[appos]–> NOUN (1032; 51%),
ADJ –[nsubj]–> NOUN (558; 60%),
ADJ –[conj]–> ADJ (427; 52%),
PRON –[nmod]–> NOUN (411; 73%),
ADJ –[det]–> DET (384; 62%),
NOUN –[acl]–> ADJ (148; 60%).
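Agreement counts of this kind can be approximated by walking each dependency edge and comparing the Gender value of the child with that of its head. The sketch below is illustrative only: the tuple layout, the function names and the toy sentence are assumptions, and so is the choice of denominator (all edges of the same parent tag, relation and child tag), which this page does not define.

```python
from collections import Counter

def gender_of(feats):
    """Return the Gender value from a CoNLL-U FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None

def agreement_counts(sentences):
    """Count edges where parent and child share a Gender value.

    Each sentence is a list of (id, upos, feats, head, deprel) tuples.
    Returns (agreeing edges, all edges), both keyed by
    (parent upos, deprel, child upos).
    """
    agree, total = Counter(), Counter()
    for sent in sentences:
        by_id = {tok[0]: tok for tok in sent}
        for _tid, upos, feats, head, deprel in sent:
            parent = by_id.get(head)
            if parent is None:  # head 0 is the artificial root
                continue
            key = (parent[1], deprel, upos)
            total[key] += 1
            child_gender = gender_of(feats)
            if child_gender is not None and child_gender == gender_of(parent[2]):
                agree[key] += 1
    return agree, total

# Toy sentence invented for this example; it is not taken from the treebank.
SENTENCES = [[
    (1, "DET", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", 2, "det"),
    (2, "NOUN", "Gender=Fem|Number=Sing", 0, "root"),
    (3, "ADJ", "Gender=Fem|Number=Sing", 2, "amod"),
    (4, "ADJ", "Number=Sing", 2, "amod"),
]]

agree, total = agreement_counts(SENTENCES)
for (parent, rel, child), n in agree.most_common():
    share = 100 * n / total[(parent, rel, child)]
    print(f"{parent} -[{rel}]-> {child} ({n}; {share:.0f}%)")
```

An adjective without a Gender value, such as the fourth toy token, counts toward the total for its relation but not toward agreement, which is one plausible reason why shares for relations like amod fall well below 100%.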