Treebank Statistics: UD_Catalan-AnCora: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
Some words have combined values of the feature; 1 combination has been observed: Fem,Masc.
194766 tokens (36%) have a non-empty value of Gender.
14350 types (44%) occur at least once with a non-empty value of Gender.
9829 lemmas (42%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (86128; 16% instances), DET (75857; 14% instances), ADJ (20208; 4% instances), VERB (6814; 1% instances), PRON (3742; 1% instances), NUM (1358; 0% instances), AUX (651; 0% instances), PROPN (8; 0% instances).
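Per-tag counts like the ones above can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` sentences are hand-made for illustration (not taken from the treebank), and `read_tokens` and `gender_by_upos` are hypothetical helper names. It assumes the standard ten tab-separated CoNLL-U columns and skips multiword-token ranges and empty nodes.

```python
from collections import Counter

# Tiny hand-made CoNLL-U sample (hypothetical, not taken from the treebank).
SAMPLE = """\
# text = la nova llei
1\tla\tel\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t3\tdet\t_\t_
2\tnova\tnou\tADJ\t_\tGender=Fem|Number=Sing\t3\tamod\t_\t_
3\tllei\tllei\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_

# text = l' ha fet
1\tl'\tell\tPRON\t_\tCase=Acc|Gender=Fem,Masc|Number=Sing\t3\tobj\t_\t_
2\tha\thaver\tAUX\t_\tMood=Ind|Number=Sing|Person=3\t3\taux\t_\t_
3\tfet\tfer\tVERB\t_\tGender=Masc|Number=Sing|Tense=Past|VerbForm=Part\t0\troot\t_\t_
"""


def read_tokens(text):
    """Yield (form, lemma, upos, feats_dict) for each syntactic word."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # multiword-token range (1-2) or empty node (1.1)
        feats = {}
        if cols[5] != "_":
            for pair in cols[5].split("|"):
                name, value = pair.split("=", 1)
                feats[name] = value
        yield cols[1], cols[2], cols[3], feats


def gender_by_upos(text):
    """Count all tokens and tokens with a non-empty Gender, per UPOS tag."""
    total = Counter()
    with_gender = Counter()
    for _form, _lemma, upos, feats in read_tokens(text):
        total[upos] += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return total, with_gender


total, with_gender = gender_by_upos(SAMPLE)
for upos in sorted(with_gender):
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]} of {total[upos]} ({share:.0f}%)")
```

A combined value such as `Fem,Masc` is kept as a single string here, so it counts once as a non-empty value of Gender.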
NOUN
86128 NOUN tokens (87% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (59389; 69%).
NOUN tokens may have the following values of Gender:
Fem (41906; 49% of non-empty Gender): pessetes, persones, obres, obra, empresa, llei, ciutat, zona, cosa, situació
Masc (44222; 51% of non-empty Gender): anys, milions, any, president, temps, grup, projecte, cas, partit, director
EMPTY (12518): any, través, cap, juny, part, partir, dia, terme, fa, tal
| Paradigm cas | Masc | Fem |
|---|---|---|
| Number=Sing | cas | |
| Number=Plur | casos | cas |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (6523) occur with only one value of Gender.
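The "lexical feature" claim rests on a simple per-lemma check: collect the set of Gender values each NOUN lemma occurs with, then count how many lemmas have exactly one. A minimal sketch under stated assumptions follows; `single_gender_share` is a hypothetical name, and `ROWS` is illustrative data, not treebank counts.

```python
from collections import defaultdict


def single_gender_share(rows, upos="NOUN"):
    """rows: iterable of (lemma, upos, gender); gender is None when empty.

    Returns (number of lemmas with exactly one Gender value,
             that number as a fraction of lemmas with any Gender value).
    """
    values = defaultdict(set)
    for lemma, tag, gender in rows:
        if tag == upos and gender is not None:
            values[lemma].add(gender)
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)


# Illustrative rows only: "cas" appears with two values, the others with one.
ROWS = [
    ("cas", "NOUN", "Masc"),
    ("cas", "NOUN", "Masc"),
    ("cas", "NOUN", "Fem"),
    ("llei", "NOUN", "Fem"),
    ("any", "NOUN", "Masc"),
    ("any", "NOUN", None),   # empty Gender does not count as a value
    ("nou", "ADJ", "Masc"),  # other tags are ignored
    ("nou", "ADJ", "Fem"),
]

single, share = single_gender_share(ROWS)
print(f"{single} lemmas ({share:.0%}) occur with only one value of Gender")
```

Tokens with an empty Gender are ignored rather than treated as a separate value, which is an assumption about how the percentage is computed.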
DET
75857 DET tokens (87% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (58148; 77%), Number=Sing (57983; 76%), Definite=Def (53784; 71%).
DET tokens may have the following values of Gender:
Fem (33547; 44% of non-empty Gender): la, les, una, seva, aquesta, seves, aquestes, totes, altra, tota
Masc (42310; 56% of non-empty Gender): el, els, un, aquest, seu, seus, aquests, tots, tot, mateix
EMPTY (11410): l’, altres, cap, cada, diferents, qualsevol, qual, nostres, meva, prou
| Paradigm el | Masc | Fem |
|---|---|---|
| Definite=Def\|Foreign=Yes\|Number=Sing\|PronType=Art | el | |
| Definite=Def\|Number=Sing\|PronType=Art | el | la, L' |
| Definite=Def\|Number=Plur\|PronType=Art | els | les |
| Definite=Ind\|Number=Sing\|PronType=Art | | la |
| Number=Sing\|PronType=Art | el | la |
| Number=Sing\|PronType=Dem | el | la |
| Number=Plur\|Person=3\|Poss=Yes\|PronType=Prs | | les |
| Number=Plur\|PronType=Art | els | les |
| Number=Plur\|PronType=Dem | els | les |
ADJ
20208 ADJ tokens (67% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (14619; 72%), Number=Sing (14061; 70%).
ADJ tokens may have the following values of Gender:
Fem (9120; 45% of non-empty Gender): primera, nova, catalana, noves, política, segona, única, pública, bona, espanyola
Masc (11088; 55% of non-empty Gender): passat, primer, nou, espanyol, nous, català, públic, últims, polític, últim
EMPTY (9874): gran, general, grans, actual, important, social, baix, possible, municipal, anterior
| Paradigm nou | Masc | Fem |
|---|---|---|
| Number=Sing | nou | nova |
| Number=Plur | nous | noves |
VERB
6814 VERB tokens (16% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (6814; 100%), Person=EMPTY (6814; 100%), Tense=Past (6814; 100%), VerbForm=Part (6814; 100%), Number=Sing (6563; 96%).
VERB tokens may have the following values of Gender:
Fem (341; 5% of non-empty Gender): dictada, aprovada, presentada, considerada, donada, atesa, inclosa, inaugurada, traslladada, coneguda
Masc (6473; 95% of non-empty Gender): fet, explicat, dit, presentat, tingut, assegurat, destacat, passat, demanat, assenyalat
EMPTY (35082): fer, té, ha, fa, dir, tenir, donar, arribar, tenen, aconseguir
| Paradigm fer | Masc | Fem |
|---|---|---|
| Number=Sing | fet | |
| Number=Plur | | fetes |
PRON
3742 PRON tokens (16% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: PrepCase=EMPTY (3742; 100%), Reflex=EMPTY (3742; 100%), Number=Sing (2965; 79%), Case=EMPTY (2545; 68%), Person=EMPTY (2205; 59%).
PRON tokens may have the following values of Gender:
Fem (909; 24% of non-empty Gender): la, una, les, aquesta, altra, unes, ella, algunes, totes, elles
Fem,Masc (176; 5% of non-empty Gender): l’
Masc (2324; 62% of non-empty Gender): un, tot, el, ell, uns, lo, ells, alguns, aquest, tots
Neut (333; 9% of non-empty Gender): ho, -ho
EMPTY (19712): que, es, s’, hi, se, li, on, què, això, qual
| Paradigm ell | Fem,Masc | Masc | Fem | Neut |
|---|---|---|---|---|
| Case=Acc\|Number=Sing | l' | el, lo, 'l, -lo, l | la, -la | ho, -ho |
| Case=Acc\|Number=Plur | | | les | |
| Number=Sing | | ell | ella | |
| Number=Plur | | ells | elles | |
NUM
1358 NUM tokens (14% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (1358; 100%), NumForm=Word (1356; 100%), Number=Plur (891; 66%).
NUM tokens may have the following values of Gender:
Fem (499; 37% of non-empty Gender): dues, una, mitja, ambdues, desena, tres-centes, Desenes, Vuit-centes, X, cinquena
Masc (859; 63% of non-empty Gender): dos, un, mig, ambdós, quart, cinc-cents, 2, centenars, desè, quatre-centes
EMPTY (8604): tres, quatre, cent, 10, cinc, sis, 15, 30, 5, 20
| Paradigm dos | Masc | Fem |
|---|---|---|
| Number=Sing | dos | |
| Number=Plur | dos | dues |
| | dos | dues |
AUX
651 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (651; 100%), Number=Sing (651; 100%), Person=EMPTY (651; 100%), Tense=Past (651; 100%), VerbForm=Part (651; 100%).
AUX tokens may have the following values of Gender:
Masc (651; 100% of non-empty Gender): estat, pogut, hagut, anat, sigut, sabut
EMPTY (21403): va, ha, és, van, han, ser, són, està, havia, pot
PROPN
8 PROPN tokens (0% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (4; 50% of non-empty Gender): Seu, Companyia, Font
Masc (4; 50% of non-empty Gender): Cobain, Justícia, Kurt, Pla
EMPTY (46582): Catalunya, Barcelona, Generalitat, Govern, sant, Ajuntament, Girona, Josep, CiU, PP
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (56323; 82%),
NOUN –[amod]–> ADJ (14924; 64%),
NOUN –[conj]–> NOUN (2733; 53%),
NOUN –[appos]–> NOUN (1058; 51%),
ADJ –[nsubj]–> NOUN (528; 60%),
ADJ –[det]–> DET (448; 60%),
ADJ –[conj]–> ADJ (428; 52%),
PRON –[nmod]–> NOUN (411; 72%),
NOUN –[acl]–> ADJ (127; 60%),
ADJ –[obj]–> NOUN (109; 52%).
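Agreement counts of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is an illustration under stated assumptions, not the procedure used to produce the list above: "agree" is taken to mean that both nodes carry the same non-empty Gender value, and the percentage is taken relative to all instances of the same parent-tag, relation, child-tag triple. `gender_of`, `agreement_counts`, and the `SENTENCES` data are hypothetical.

```python
from collections import Counter


def gender_of(feats):
    """Return the set of Gender values in a FEATS string (empty set if none)."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return set(pair[len("Gender="):].split(","))
    return set()


def agreement_counts(sentences):
    """sentences: list of sentences; each token is (id, upos, feats, head, deprel)."""
    agree = Counter()
    seen = Counter()
    for sent in sentences:
        by_id = {tok[0]: tok for tok in sent}
        for _tid, upos, feats, head, deprel in sent:
            parent = by_id.get(head)
            if parent is None:
                continue  # root token: head 0 has no parent node
            key = (parent[1], deprel, upos)
            seen[key] += 1
            child_gender = gender_of(feats)
            # Assumption: agreement means identical, non-empty Gender values.
            if child_gender and child_gender == gender_of(parent[2]):
                agree[key] += 1
    return agree, seen


# Illustrative sentences only (the second one contains a deliberate mismatch).
SENTENCES = [
    [
        (1, "DET", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", 3, "det"),
        (2, "ADJ", "Gender=Fem|Number=Sing", 3, "amod"),
        (3, "NOUN", "Gender=Fem|Number=Sing", 0, "root"),
    ],
    [
        (1, "DET", "Gender=Masc|Number=Sing", 3, "det"),
        (2, "ADJ", "Number=Sing", 3, "amod"),  # no Gender: cannot agree
        (3, "NOUN", "Gender=Fem|Number=Sing", 0, "root"),
    ],
]

agree, seen = agreement_counts(SENTENCES)
for (parent_upos, deprel, child_upos), n in agree.most_common(10):
    pct = 100 * n / seen[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {pct:.0f}%)")
```

With a different denominator, for example only edges where both nodes have a non-empty Gender, the percentages would come out higher, so the choice should be checked against the generating tool before comparing numbers.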