Treebank Statistics: UD_Italian-MarkIT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
17767 tokens (44%) have a non-empty value of Gender.
3855 types (64%) occur at least once with a non-empty value of Gender.
2915 lemmas (71%) occur at least once with a non-empty value of Gender.
The feature is used with 11 part-of-speech tags: NOUN (7399; 18% of instances), DET (5737; 14% of instances), ADJ (2454; 6% of instances), PRON (1391; 3% of instances), VERB (718; 2% of instances), AUX (61; 0% of instances), PROPN (3; 0% of instances), ADP (1; 0% of instances), ADV (1; 0% of instances), CCONJ (1; 0% of instances), SCONJ (1; 0% of instances).
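Per-tag counts like those above can be approximated from a CoNLL-U file by reading the UPOS and FEATS columns. The following is a minimal sketch, not the script that produced this page: the sample sentence, `parse_feats`, and `gender_by_upos` are hypothetical, and skipping multiword-token ranges and empty nodes is an assumption about how tokens are counted.

```python
from collections import Counter

# Tiny hand-made CoNLL-U sample (hypothetical, not taken from the treebank).
SAMPLE = """\
1\tLa\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tvita\tvita\tNOUN\tS\tGender=Fem|Number=Sing\t4\tnsubj\t_\t_
3\tè\tessere\tAUX\tV\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t4\tcop\t_\t_
4\tbella\tbello\tADJ\tA\tGender=Fem|Number=Sing\t0\troot\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Return (tokens with non-empty Gender per UPOS, all tokens per UPOS)."""
    with_gender, totals = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Assumption: multiword-token ranges (1-2) and empty nodes (1.1) are not counted.
        if "-" in cols[0] or "." in cols[0]:
            continue
        upos = cols[3]
        totals[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, totals


with_gender, totals = gender_by_upos(SAMPLE)
for upos in sorted(totals):
    print(upos, with_gender[upos], totals[upos])
```

Dividing `with_gender[upos]` by `totals[upos]` gives a per-tag share of the kind reported in the sections below.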
NOUN
7399 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (5598; 76%).
NOUN tokens may have the following values of Gender:
Fem (3509; 47% of non-empty Gender): vita, società, persone, scienza, felicità, parte, amicizia, ricerca, storia, filosofia
Masc (3890; 53% of non-empty Gender): uomo, tempo, esempio, anni, modo, mondo, amico, paese, stato, motivo
EMPTY (54): grazie, riconoscere, vivere, Domani, Estero, Museo, Stato, Uomo, aldilá, avanzare
| Paradigm grazie | Masc | Fem |
|---|---|---|
| | grazie | grazie |
Gender seems to be a lexical feature of NOUN: 98% of lemmas (1728) occur with only one value of Gender.
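The "lexical feature" observation rests on how many lemmas ever appear with more than one Gender value. A minimal sketch of that check follows; the function name `single_gender_share` and the sample tokens are hypothetical, and taking only lemmas with at least one non-empty Gender as the denominator is an assumption.

```python
from collections import defaultdict


def single_gender_share(tokens, upos="NOUN"):
    """Count lemmas of one UPOS that occur with exactly one Gender value.

    tokens: iterable of (lemma, upos, gender) tuples; gender is None when empty.
    Returns (number of single-valued lemmas, their share among lemmas with Gender).
    """
    values = defaultdict(set)
    for lemma, tag, gender in tokens:
        if tag == upos and gender is not None:
            values[lemma].add(gender)
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)


# Hypothetical sample: 'grazie' appears with both values, the others with one.
tokens = [
    ("vita", "NOUN", "Fem"),
    ("vita", "NOUN", "Fem"),
    ("uomo", "NOUN", "Masc"),
    ("grazie", "NOUN", "Masc"),
    ("grazie", "NOUN", "Fem"),
    ("grazie", "NOUN", None),
    ("bello", "ADJ", "Fem"),
]
single, share = single_gender_share(tokens)
print(single, round(share, 2))
```

A share close to 1 suggests the value is fixed per lemma, whereas for an inflectional feature most lemmas would occur with several values.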
DET
5737 DET tokens (87% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (4632; 81%), Number=Sing (4398; 77%), Definite=Def (3780; 66%).
DET tokens may have the following values of Gender:
Fem (2515; 44% of non-empty Gender): la, le, una, questa, sua, l’, nostra, queste, sue, propria
Masc (3222; 56% of non-empty Gender): il, un, i, gli, questo, lo, suo, questi, ogni, l’
EMPTY (825): l’, un’, tale, che, più, tali, un, cui, dei, delle
| Paradigm il | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing | La | |
| Definite=Def\|Number=Sing\|PronType=Art | il, lo, l' | la, l', lo |
| Definite=Def\|Number=Plur\|PronType=Art | i, gli, il | le, la |
| Number=Sing | il | la, L' |
| Number=Plur | i | le |
ADJ
2454 ADJ tokens (96% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1749; 71%).
ADJ tokens may have the following values of Gender:
Fem (787; 32% of non-empty Gender): stessa, diverse, moderna, prima, seconda, unica, italiana, umana, nuova, nuove
Masc (1667; 68% of non-empty Gender): stesso, grande, importante, primo, umano, possibile, difficile, piccolo, grandi, vero
EMPTY (103): altri, altro, maggiore, altre, maggior, superiore, maggiori, migliore, pochi, III
| Paradigm grande | Masc | Fem |
|---|---|---|
| Degree=Abs\|Number=Sing | grandissima | |
| Number=Sing | grande, gran | |
| Number=Plur | grandi | grandi |
PRON
1391 PRON tokens (45% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (1147; 82%), Number=Sing (1032; 74%), PronType=Prs (857; 62%), Clitic=Yes (747; 54%).
PRON tokens may have the following values of Gender:
Fem (392; 28% of non-empty Gender): ci, la, questa, vi, essa, quella, le, mi, qualcosa, sé
Masc (999; 72% of non-empty Gender): si, lo, questo, ci, ciò, quello, tutti, altri, lui, tutto
EMPTY (1728): che, si, c’, noi, cui, ne, quale, chi, quali, ci
| Paradigm lo | Masc | Fem |
|---|---|---|
| Clitic=Yes\|Number=Sing\|Person=3 | la | |
| Clitic=Yes\|Number=Sing\|Person=3\|PronType=Prs | lo, l', gli, li | la, le |
| Clitic=Yes\|Number=Plur\|Person=3\|PronType=Prs | li, gli | le |
| Definite=Def\|Number=Sing | lo | |
| Definite=Def\|Number=Sing\|PronType=Art | lo | la |
| Number=Sing | la |
VERB
718 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (718; 100%), Mood=EMPTY (717; 100%), VerbForm=Part (714; 99%), Tense=Past (710; 99%), Number=Sing (534; 74%).
VERB tokens may have the following values of Gender:
Fem (217; 30% of non-empty Gender): creata, fatta, porta, sviluppata, considerata, vista, avuta, composta, fatte, sentita
Masc (501; 70% of non-empty Gender): avuto, dato, fatto, visto, inteso, stato, cercato, detto, legati, permesso
EMPTY (3223): è, ha, far, sono, fa, fare, essere, trovare, dare, avere
| Paradigm essere | Masc | Fem |
|---|---|---|
| Number=Sing | stato | stata |
| Number=Plur | stati | state |
AUX
61 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (61; 100%), Person=EMPTY (61; 100%), Tense=Past (61; 100%), VerbForm=Part (61; 100%), Number=Sing (47; 77%).
AUX tokens may have the following values of Gender:
Fem (22; 36% of non-empty Gender): stata, state, potuta
Masc (39; 64% of non-empty Gender): stato, stati, potuto, voluto
EMPTY (1979): è, sono, ha, può, essere, hanno, era, possiamo, fu, deve
| Paradigm essere | Masc | Fem |
|---|---|---|
| Number=Sing | stato | stata |
| Number=Plur | stati | state |
PROPN
3 PROPN tokens (0% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (3; 100% of non-empty Gender): Human, brain, project
EMPTY (829): Italia, Europa, Germania, Unione, europea, America, Leopardi, Malpelo, Pascal, Romeo
ADP
1 ADP token (0% of all ADP tokens) has a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): a
EMPTY (5459): di, in, a, da, per, con, su, come, ad, tra
ADV
1 ADV token (0% of all ADV tokens) has a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=EMPTY (1; 100%).
ADV tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): parecchio
EMPTY (2291): non, più, proprio, anche, sempre, solo, infatti, così, quindi, molto
CCONJ
1 CCONJ token (0% of all CCONJ tokens) has a non-empty value of Gender.
CCONJ tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): e
EMPTY (1347): e, ma, ed, o, sia, oppure, quindi, né, ovvero, cioè
SCONJ
1 SCONJ token (0% of all SCONJ tokens) has a non-empty value of Gender.
SCONJ tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): perché
EMPTY (867): che, se, perché, come, quando, poiché, mentre, nonostante, affinché, dopo
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (4737; 85%),
NOUN –[amod]–> ADJ (1537; 82%),
NOUN –[det:poss]–> DET (359; 93%),
NOUN –[conj]–> NOUN (312; 56%),
ADJ –[conj]–> ADJ (95; 84%),
NOUN –[nsubj]–> NOUN (67; 54%),
ADJ –[det]–> DET (65; 86%),
ADJ –[nsubj]–> NOUN (55; 68%),
NOUN –[det:predet]–> DET (53; 100%),
VERB –[nsubj:pass]–> NOUN (51; 94%).
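
Agreement counts of this kind can be sketched by walking every dependency edge and comparing the Gender of parent and child. The example below is illustrative only: the token dictionaries, the sample sentence, and `gender_agreement` are hypothetical, and it assumes the percentage is taken over edges where both nodes carry a Gender value, which this page does not state.

```python
from collections import Counter


def gender_agreement(sentences):
    """Count edges whose parent and child agree in Gender.

    sentences: list of sentences, each a list of token dicts with the keys
    'id', 'head', 'upos', 'deprel', 'gender' (gender is None when empty).
    Returns (agreeing edges, edges where both nodes have Gender),
    both keyed by (parent UPOS, relation, child UPOS).
    """
    agree, both = Counter(), Counter()
    for sent in sentences:
        by_id = {tok["id"]: tok for tok in sent}
        for child in sent:
            parent = by_id.get(child["head"])
            if parent is None:
                continue  # the root has head 0 and therefore no parent token
            if parent["gender"] is None or child["gender"] is None:
                continue  # assumption: only edges with Gender on both ends count
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if parent["gender"] == child["gender"]:
                agree[key] += 1
    return agree, both


# Hypothetical sentence: "la vita moderna"
sentence = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
]
agree, both = gender_agreement([sentence])
for key, total in both.items():
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos}: {agree[key]} of {total}")
```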