Treebank Statistics: UD_Italian-ISDT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
123837 tokens (42%) have a non-empty value of Gender.
14894 types (54%) occur at least once with a non-empty value of Gender.
10094 lemmas (54%) occur at least once with a non-empty value of Gender.
The feature is used with 10 part-of-speech tags: NOUN (57555; 19% of all tokens), DET (41724; 14%), ADJ (12631; 4%), VERB (8135; 3%), PRON (3035; 1%), AUX (753; 0%), ADP (1; 0%), ADV (1; 0%), PROPN (1; 0%), X (1; 0%).
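Counts like the ones above can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page: the function names `parse_feats` and `gender_by_upos` are hypothetical, and the two-token sample is made up for illustration rather than copied from the treebank.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_by_upos(conllu_text):
    """Count tokens with a non-empty Gender value, grouped by UPOS tag."""
    with_gender = Counter()
    total = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total

# Hypothetical sample sentence in CoNLL-U format (10 tab-separated columns).
SAMPLE = "\n".join([
    "# text = la città",
    "1\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_",
    "2\tcittà\tcittà\tNOUN\tS\tGender=Fem\t0\troot\t_\t_",
])

with_gender, total = gender_by_upos(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], "of", total[upos])
```

To apply this to the real treebank, one would read the `.conllu` file into a string and pass it to `gender_by_upos`.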
NOUN
57555 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (37491; 65%).
NOUN tokens may have the following values of Gender:
Fem (25668; 45% of non-empty Gender): città, parte, persone, legge, società, proprietà, attività, vita, servitù, commissione
Masc (31887; 55% of non-empty Gender): anni, presidente, anno, fondo, diritto, film, stato, proprietario, mondo, caso
EMPTY (1923): presidente, rappresentanti, onorevole, grazie, abitanti, fronte, giovani, enfiteuta, leader, partecipanti
| Paradigm proprietario | Masc | Fem |
|---|---|---|
| Number=Sing | proprietario | proprietaria |
| Number=Plur | proprietari | |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (6657) occur with only one value of Gender.
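The "lexical feature" test can be illustrated with a short sketch: group the observed Gender values by lemma and count the lemmas that were seen with exactly one value. The helper name `single_value_share` and the sample pairs are hypothetical, and the choice of denominator (lemmas seen with at least one non-empty value) is an assumption about how the percentage is computed.

```python
from collections import defaultdict

def single_value_share(pairs):
    """Return (lemmas seen with exactly one Gender value, lemmas seen with any value).

    `pairs` is an iterable of (lemma, gender) tuples; gender is None or ""
    for tokens that carry no Gender feature.
    """
    values = defaultdict(set)
    for lemma, gender in pairs:
        if gender:  # ignore tokens with an empty Gender value
            values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)

# Hypothetical (lemma, Gender) observations for NOUN tokens.
pairs = [
    ("città", "Fem"),
    ("anno", "Masc"),
    ("anno", "Masc"),
    ("proprietario", "Masc"),
    ("proprietario", "Fem"),
    ("grazie", None),
]

single, total = single_value_share(pairs)
print(f"{single} of {total} lemmas occur with only one Gender value")
```

In this toy input, `proprietario` is the only lemma with two values, which mirrors the paradigm table above.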
DET
41724 DET tokens (86% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (37767; 91%), Definite=Def (33151; 79%), Number=Sing (29153; 70%).
DET tokens may have the following values of Gender:
Fem (17350; 42% of non-empty Gender): la, le, una, sua, un’, questa, sue, queste, tutte, molte
Masc (24374; 58% of non-empty Gender): il, i, un, gli, lo, suo, questo, tutti, suoi, alcuni
EMPTY (6918): l’, quale, ogni, loro, l’, che, qualche, tale, qualsiasi, tali
| Paradigm il | Masc | Fem |
|---|---|---|
| Number=Sing | il, lo, l’, i, i1, lu | la, l’, le, L', il |
| Number=Plur | i, gli, il | le, l’ |
ADJ
12631 ADJ tokens (64% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (8375; 66%).
ADJ tokens may have the following values of Gender:
Fem (5640; 45% of non-empty Gender): prima, italiana, altra, altre, stessa, seconda, nuova, nuove, economica, alta
Masc (6991; 55% of non-empty Gender): primo, nuovo, altri, altro, stesso, vero, secondo, terzo, europeo, italiani
EMPTY (7144): grande, presente, comune, mondiale, ex, internazionale, maggiore, nazionale, possibile, sociale
| Paradigm primo | Masc | Fem |
|---|---|---|
| Number=Sing | primo | prima |
| Number=Sing\|NumType=Ord | primo, 1º | prima |
| Number=Plur | prime | |
| Number=Plur\|NumType=Ord | primi | prime |
VERB
8135 VERB tokens (32% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (8135; 100%), Tense=Past (8135; 100%), Mood=EMPTY (8133; 100%), VerbForm=Part (8133; 100%), Number=Sing (6016; 74%).
VERB tokens may have the following values of Gender:
Fem (2334; 29% of non-empty Gender): fatta, stabilite, fatte, vista, dovuta, considerata, costituita, fondata, nata, chiamata
Masc (5801; 71% of non-empty Gender): fatto, visto, vinto, avuto, tenuto, detto, nato, dato, messo, ricevuto
EMPTY (17114): ha, è, hanno, fare, far, trova, sono, fa, chiama, vedere
| Paradigm avere | Masc | Fem |
|---|---|---|
| | avuto | avuta |
PRON
3035 PRON tokens (27% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (2217; 73%), Clitic=EMPTY (2184; 72%), Person=EMPTY (1850; 61%).
PRON tokens may have the following values of Gender:
Fem (758; 25% of non-empty Gender): la, le, quella, quelle, una, questa, essa, esse, altra, lei
Masc (2277; 75% of non-empty Gender): lo, quello, uno, li, questo, gli, lui, tutto, ciò, tutti
EMPTY (8275): si, che, chi, ci, cui, ne, qual, c’, mi, quale
| Paradigm lo | Masc | Fem |
|---|---|---|
| Person=3 | lo, gli | La |
| | lo | |
AUX
753 AUX tokens (6% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (753; 100%), Person=EMPTY (753; 100%), Tense=Past (753; 100%), VerbForm=Part (753; 100%), Number=Sing (573; 76%).
AUX tokens may have the following values of Gender:
Fem (233; 31% of non-empty Gender): stata, state, potuta, andata, fatta
Masc (520; 69% of non-empty Gender): stato, stati, potuto, dovuto, voluto, andato, fatto, potuti
EMPTY (10951): è, sono, ha, può, hanno, essere, era, possono, deve, sia
| Paradigm essere | Masc | Fem |
|---|---|---|
| Number=Sing | stato | stata |
| Number=Plur | stati | state |
ADP
1 ADP token (0% of all ADP tokens) has a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): du
EMPTY (45258): di, a, in, da, per, con, su, come, ad, tra
ADV
1 ADV token (0% of all ADV tokens) has a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=EMPTY (1; 100%).
ADV tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): pochissimo
EMPTY (11440): non, più, anche, dove, come, quando, solo, prima, sempre, molto
PROPN
1 PROPN token (0% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): hye
EMPTY (14775): Italia, Shakespeare, Balzac, Europa, Roma, San, Stati, Uniti, Marco, Unione
X
1 X token (0% of all X tokens) has a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=Yes (1; 100%).
X tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): mixer
EMPTY (277): a, b, Illusions, de, perdues, la, ad, c, f, home
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (34184; 84%),
NOUN –[amod]–> ADJ (10014; 63%),
NOUN –[conj]–> NOUN (2462; 55%),
NOUN –[acl]–> VERB (1638; 62%),
VERB –[nsubj:pass]–> NOUN (1486; 81%),
NOUN –[det:poss]–> DET (1455; 79%),
VERB –[conj]–> VERB (481; 52%),
ADJ –[conj]–> ADJ (392; 53%),
NOUN –[det:predet]–> DET (375; 97%),
ADJ –[nsubj]–> NOUN (362; 56%).
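Agreement counts of this kind can be sketched as follows: for every dependency edge in a sentence, compare the Gender value of the child with that of its parent and tally the result per (parent UPOS, relation, child UPOS) triple. This is an illustrative sketch only. The function names and the sample sentence are hypothetical, and the denominator used here (all occurrences of the triple, including edges where one node has no Gender) is an assumption, not a documented property of the percentages above.

```python
from collections import Counter

def feats_dict(feats):
    """Parse a CoNLL-U FEATS column into a dict ('_' means no features)."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_agreement(sentence_lines):
    """Tally Gender agreement per (parent UPOS, deprel, child UPOS) in one sentence."""
    nodes = {}
    for line in sentence_lines:
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skips multiword-token ranges and empty nodes
        gender = feats_dict(cols[5]).get("Gender")
        nodes[int(cols[0])] = (cols[3], gender, int(cols[6]), cols[7])

    agree, seen = Counter(), Counter()
    for upos, gender, head, deprel in nodes.values():
        if head == 0 or head not in nodes:
            continue  # the root has no parent to agree with
        parent_upos, parent_gender = nodes[head][0], nodes[head][1]
        key = (parent_upos, deprel, upos)
        seen[key] += 1
        if gender is not None and gender == parent_gender:
            agree[key] += 1
    return agree, seen

# Hypothetical sentence "la prima città" in CoNLL-U format.
SENTENCE = [
    "# text = la prima città",
    "1\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t3\tdet\t_\t_",
    "2\tprima\tprimo\tADJ\tNO\tGender=Fem|Number=Sing|NumType=Ord\t3\tamod\t_\t_",
    "3\tcittà\tcittà\tNOUN\tS\tGender=Fem\t0\troot\t_\t_",
]

agree, seen = gender_agreement(SENTENCE)
for key, count in agree.items():
    print(key, count, "of", seen[key])
```

Summing the two counters over all sentences of a treebank would give per-relation totals comparable in shape to the list above.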