Treebank Statistics: UD_Italian-PoSTWITA: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
36401 tokens (29%) have a non-empty value of Gender.
7007 types (40%) occur at least once with a non-empty value of Gender.
4487 lemmas (33%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (16932; 14% instances), DET (12308; 10% instances), ADJ (3540; 3% instances), PRON (1931; 2% instances), VERB (1596; 1% instances), AUX (92; 0% instances), ADP (1; 0% instances), PROPN (1; 0% instances).
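Counts like the ones above can be reproduced from a treebank's CoNLL-U files. The following is a minimal sketch, assuming the standard ten-column CoNLL-U layout (UPOS in column 4, FEATS in column 6). The function names and the three-token sample are hypothetical and are not taken from UD_Italian-PoSTWITA or from the script that generated this page.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Count, per UPOS tag, all tokens and the tokens with a non-empty Gender."""
    with_gender = Counter()
    total = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "3-4") and empty nodes (e.g. "5.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


# Hypothetical sample sentence, used only to exercise the function.
SAMPLE = "\n".join([
    "# text = il governo cade",
    "1\til\til\tDET\tRD\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t2\tdet\t_\t_",
    "2\tgoverno\tgoverno\tNOUN\tS\tGender=Masc|Number=Sing\t3\tnsubj\t_\t_",
    "3\tcade\tcadere\tVERB\tV\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_",
])

with_gender, total = gender_counts_by_upos(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], "of", total[upos])
```

Dividing `with_gender[upos]` by `total[upos]` gives a per-tag share comparable to the "% of all NOUN tokens" figures in the sections below.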
NOUN
16932 NOUN tokens (96% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (11722; 69%).
NOUN tokens may have the following values of Gender:
Fem (6939; 41% of non-empty Gender): crisi, politica, manovra, vita, tv, foto, cosa, fiducia, gente, fine
Masc (9993; 59% of non-empty Gender): governo, video, anni, premier, lavoro, presidente, partiti, paese, italiani, professor
EMPTY (661): TT, spread, tweet, blog, leader, link, news, web, nov., account
| Paradigm partito | Masc | Fem |
|---|---|---|
| Number=Sing | partito, partitino | partita |
| Number=Plur | partiti | |
Gender seems to be a lexical feature of NOUN: 98% of lemmas (3184) occur with only one value of Gender.
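The "lexical feature" test amounts to asking how many lemmas are attested with a single Gender value. Here is a minimal sketch of that calculation, assuming the input is a list of (lemma, Gender) pairs, one per token. The function name and the sample pairs are hypothetical, and the exact procedure used to produce the figure above is not documented here.

```python
from collections import defaultdict


def single_value_lemmas(lemma_value_pairs):
    """Return (lemmas seen with exactly one value, all lemmas seen)."""
    values_by_lemma = defaultdict(set)
    for lemma, value in lemma_value_pairs:
        values_by_lemma[lemma].add(value)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


# Hypothetical token-level observations; "partito" appears with both values,
# mirroring the paradigm table above (partito vs. partita).
pairs = [
    ("governo", "Masc"),
    ("governo", "Masc"),
    ("crisi", "Fem"),
    ("partito", "Masc"),
    ("partito", "Fem"),
]

single, total = single_value_lemmas(pairs)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A share close to 100% suggests the value is fixed per lemma (lexical) rather than varying by inflection.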
DET
12308 DET tokens (85% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (11463; 93%), Definite=Def (9904; 80%), Number=Sing (9594; 78%).
DET tokens may have the following values of Gender:
Fem (4399; 36% of non-empty Gender): la, le, una, mia, questa, un’, sua, tua, nostra, sue
Masc (7909; 64% of non-empty Gender): il, i, un, gli, lo, questo, mio, suo, uno, suoi
EMPTY (2167): l’, che, tutti, tutto, ogni, tutta, loro, qualche, tutte, altro
| Paradigm il | Masc | Fem |
|---|---|---|
| Number=Sing | il, lo | la |
| Number=Plur | i, gli, il | le, e |
ADJ
3540 ADJ tokens (71% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (2689; 76%).
ADJ tokens may have the following values of Gender:
Fem (1395; 39% of non-empty Gender): bella, buona, nuova, prima, politica, italiana, unica, vera, prime, economica
Masc (2145; 61% of non-empty Gender): nuovo, buon, primo, vero, giusto, fisso, tecnico, bello, bravo, italiano
EMPTY (1457): grande, possibile, facile, forte, migliore, sociale, difficile, fiscale, importante, elettorale
| Paradigm nuovo | Masc | Fem |
|---|---|---|
| Number=Sing | nuovo | nuova |
| Number=Plur | nuovi | nuove |
PRON
1931 PRON tokens (30% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1380; 71%), Clitic=EMPTY (1261; 65%), Person=EMPTY (1170; 61%).
PRON tokens may have the following values of Gender:
Fem (372; 19% of non-empty Gender): la, le, lei, quella, questa, tutte, quelle, una, tua, sua
Masc (1559; 81% of non-empty Gender): lo, tutti, tutto, quello, l’, li, gli, questo, uno, nessuno
EMPTY (4557): che, si, mi, ci, ti, io, chi, c’, me, ne
| Paradigm tutto | Masc | Fem |
|---|---|---|
| Number=Sing | tutto | tutta |
| Number=Plur | tutti | tutte |
VERB
1596 VERB tokens (14% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (1596; 100%), VerbForm=Part (1596; 100%), Person=EMPTY (1595; 100%), Tense=Past (1595; 100%), Number=Sing (1472; 92%).
VERB tokens may have the following values of Gender:
Fem (306; 19% of non-empty Gender): fatta, finita, chiamata, fatte, andata, stata, varata, arrivata, dedicata, morta
Masc (1290; 81% of non-empty Gender): fatto, detto, dato, nominato, messo, trovato, letto, perso, iniziato, rotto
EMPTY (9668): fare, fa, è, ha, dice, far, piace, va, ho, dire
| Paradigm fare | Masc | Fem |
|---|---|---|
| Number=Sing | fatto | fatta |
| Number=Plur | | fatte |
AUX
92 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (92; 100%), Person=EMPTY (92; 100%), Tense=Past (92; 100%), VerbForm=Part (92; 100%), Number=Sing (79; 86%).
AUX tokens may have the following values of Gender:
Fem (31; 34% of non-empty Gender): stata, state, dovuta, ststa
Masc (61; 66% of non-empty Gender): stato, stati, potuto, dovuto
EMPTY (4346): è, sono, ha, e’, ho, essere, siamo, sei, può, hanno
| Paradigm essere | Masc | Fem |
|---|---|---|
| Number=Sing | stato | stata, state, ststa |
| Number=Plur | stati | state |
ADP
1 ADP token (0% of all ADP tokens) has a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): rispetto
EMPTY (13015): di, a, in, per, da, su, con, via, come, contro
PROPN
1 PROPN token (0% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): pepe
EMPTY (10480): monti, mario, italia, berlusconi, roma, pd, lega, pdl, twitter, napolitano
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (9108; 84%),
NOUN –[amod]–> ADJ (2029; 70%),
NOUN –[det:poss]–> DET (484; 86%),
NOUN –[conj]–> NOUN (414; 53%),
NOUN –[parataxis]–> NOUN (213; 52%),
ADJ –[nsubj]–> NOUN (138; 58%),
NOUN –[nsubj]–> NOUN (130; 50%),
ADJ –[conj]–> ADJ (105; 52%),
PRON –[det]–> DET (95; 65%),
NOUN –[compound]–> NOUN (86; 53%).
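Agreement counts of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below assumes each sentence is given as a list of (id, upos, feats, head, deprel) tuples and counts only edges where both nodes carry a Gender value. Whether the percentages above use that same denominator is an assumption, and the names and sample sentence are hypothetical.

```python
from collections import Counter


def gender_of(feats):
    """Extract the Gender value from a CoNLL-U FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None


def agreement_by_relation(sentence_rows):
    """Count agreeing edges and all Gender-bearing edges per (parent, deprel, child)."""
    by_id = {row[0]: row for row in sentence_rows}
    agree, both = Counter(), Counter()
    for tok_id, upos, feats, head, deprel in sentence_rows:
        parent = by_id.get(head)
        if parent is None:
            continue  # the root has no parent token
        child_gender = gender_of(feats)
        parent_gender = gender_of(parent[2])
        if child_gender is None or parent_gender is None:
            continue  # assumption: only edges with Gender on both ends are counted
        key = (parent[1], deprel, upos)
        both[key] += 1
        if child_gender == parent_gender:
            agree[key] += 1
    return agree, both


# Hypothetical sentence "la nuova manovra", used only to exercise the function.
rows = [
    ("1", "DET", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "3", "det"),
    ("2", "ADJ", "Gender=Fem|Number=Sing", "3", "amod"),
    ("3", "NOUN", "Gender=Fem|Number=Sing", "0", "root"),
]

agree, both = agreement_by_relation(rows)
for (parent_upos, deprel, child_upos), n in agree.items():
    print(f"{parent_upos} -[{deprel}]-> {child_upos}: {n} of {both[(parent_upos, deprel, child_upos)]}")
```

Summing the two counters over all sentences of a treebank would give per-relation totals and shares in the same shape as the list above.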