Treebank Statistics: UD_Italian-KIParlaForest: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
6263 tokens (34%) have a non-empty value of Gender.
1546 types (53%) occur at least once with a non-empty value of Gender.
1271 lemmas (60%) occur at least once with a non-empty value of Gender.
The feature is used with 13 part-of-speech tags: NOUN (2495; 13% instances), DET (1812; 10% instances), ADJ (665; 4% instances), PRON (650; 3% instances), VERB (389; 2% instances), PROPN (111; 1% instances), AUX (35; 0% instances), INTJ (34; 0% instances), NUM (34; 0% instances), ADV (33; 0% instances), ADP (2; 0% instances), CCONJ (2; 0% instances), X (1; 0% instances).
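Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: it assumes the standard ten-column CoNLL-U layout, and the sample sentence, `parse_feats` and `count_feature_by_upos` are hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical four-token sample in CoNLL-U format (not taken from the treebank).
SAMPLE = (
    "# text = la lingua araba grande\n"
    "1\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_\n"
    "2\tlingua\tlingua\tNOUN\tS\tGender=Fem|Number=Sing\t0\troot\t_\t_\n"
    "3\taraba\tarabo\tADJ\tA\tGender=Fem|Number=Sing\t2\tamod\t_\t_\n"
    "4\tgrande\tgrande\tADJ\tA\tNumber=Sing\t2\tamod\t_\t_\n"
)


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def count_feature_by_upos(conllu_text, feature="Gender"):
    """Return (tokens with a non-empty value, all tokens), both keyed by UPOS."""
    with_feature, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        upos, feats = cols[3], parse_feats(cols[5])
        total[upos] += 1
        if feature in feats:
            with_feature[upos] += 1
    return with_feature, total


with_gender, total = count_feature_by_upos(SAMPLE)
for upos in sorted(total):
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]} of {total[upos]} tokens ({share:.0f}%) have Gender")
```

The per-tag percentage computed here uses all tokens of that tag as the denominator, as in the per-tag sections of this page.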
NOUN
2495 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (1699; 68%).
NOUN tokens may have the following values of Gender:
Fem (1179; 47% of non-empty Gender): città, realtà, casa, lingua, cosa, parte, università, cose, persone, storia
Masc (1316; 53% of non-empty Gender): tipo, arabo, centro, anni, dialetti, alfabeto, sud, sacco, senso, periodo
EMPTY (173): po’, nord, lingue, cazzo, femminile, okay, chewing, grazie, gum, ics
| Paradigm lingua | Masc | Fem |
|---|---|---|
| Number=Sing | lingue | lingua |
| Number=Plur | lingue |
Gender seems to be a lexical feature of NOUN: 97% of lemmas (755) occur with only one value of Gender.
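The "lexical feature" check amounts to asking how many lemmas are ever seen with more than one value. The sketch below is one way to compute it. It assumes the denominator is the set of lemmas with at least one non-empty value, which this page does not state. The token list and the function name `single_value_share` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (lemma, UPOS, features) triples, not taken from the treebank.
TOKENS = [
    ("lingua", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("lingua", "NOUN", {"Gender": "Fem", "Number": "Plur"}),
    ("tipo", "NOUN", {"Gender": "Masc", "Number": "Sing"}),
    ("collega", "NOUN", {"Gender": "Masc", "Number": "Sing"}),
    ("collega", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("nord", "NOUN", {}),  # no Gender value, so it is left out of the ratio
]


def single_value_share(tokens, upos, feature="Gender"):
    """Count lemmas of `upos` that occur with exactly one value of `feature`.

    Returns (number of single-value lemmas, share among lemmas with any value).
    """
    values_by_lemma = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos and feature in feats:
            values_by_lemma[lemma].add(feats[feature])
    if not values_by_lemma:
        return 0, 0.0
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, single / len(values_by_lemma)


single, share = single_value_share(TOKENS, "NOUN")
print(f"{share:.0%} of lemmas ({single}) occur with only one value of Gender")
```

A share close to 100% suggests the value is fixed per lemma (lexical); a low share suggests it varies with agreement (inflectional).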
DET
1812 DET tokens (83% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (1555; 86%), Number=Sing (1335; 74%), Definite=Def (1167; 64%).
DET tokens may have the following values of Gender:
Fem (853; 47% of non-empty Gender): la, le, una, questa, un’, queste, delle, mia, quella, tutte
Masc (959; 53% of non-empty Gender): il, un, i, gli, questo, lo, questi, dei, uno, tutti
EMPTY (382): l’, che, tutto, loro, tutti, il, qualche, alcuni, la, tutta
| Paradigm il | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing\|PronType=Art | il, lo, l | la, le |
| Definite=Def\|Number=Plur\|PronType=Art | i, gli, il | le, lo |
| Number=Sing\|Person=3\|PronType=Prs | lo, l' | la |
| Number=Sing\|PronType=Art | la | |
| Number=Plur\|PronType=Art | i |
ADJ
665 ADJ tokens (70% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (511; 77%).
ADJ tokens may have the following values of Gender:
Fem (295; 44% of non-empty Gender): araba, mia, piccola, prima, semitica, bella, buona, tua, mezza, altra
Masc (370; 56% of non-empty Gender): esatto, arabo, miei, proto, stesso, strano, antico, bel, islamico, perfetto
EMPTY (282): grande, difficile, stessa, comune, standard, udenti, altra, certo, enorme, facile
| Paradigm arabo | Masc | Fem |
|---|---|---|
| Number=Sing | arabo | araba |
| Number=Plur | arabi |
PRON
650 PRON tokens (35% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (510; 78%), Person=EMPTY (383; 59%).
PRON tokens may have the following values of Gender:
Fem (180; 28% of non-empty Gender): questa, le, la, lei, quella, una, altra, queste, alcune, quelle
Masc (470; 72% of non-empty Gender): lo, quello, questo, l’, tutti, li, qualcuno, uno, questi, tutto
EMPTY (1232): c’, io, si, ci, mi, che, me, ti, cui, ne
| Paradigm lo | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing\|PronType=Art | lo | |
| Definite=Def\|Number=Plur\|PronType=Prs | l' | |
| Number=Sing\|Person=3\|PronType=Prs | lo, l', qual |
VERB
389 VERB tokens (16% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (389; 100%), Person=EMPTY (389; 100%), Number=Sing (336; 86%), Tense=Past (334; 86%), VerbForm=Part (334; 86%).
VERB tokens may have the following values of Gender:
Fem (102; 26% of non-empty Gender): fatta, trovata, datata, morte, scritte, andata, andate, basta, chiusa, coperta
Masc (287; 74% of non-empty Gender): detto, fatto, scritto, sentito, visto, imparato, parlato, trovato, usato, vissuto
EMPTY (1995): è, so, sono, abbiamo, fa, fare, era, ha, dire, va
| Paradigm essere | Masc | Fem |
|---|---|---|
| Number=Sing | stato | stata |
| Number=Plur | stati |
PROPN
111 PROPN tokens (26% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (81; 73%).
PROPN tokens may have the following values of Gender:
Fem (59; 53% of non-empty Gender): arabia, siria, giordania, saudita, saba, turchia, arancioni, marina, palestina, palmira
Masc (52; 47% of non-empty Gender): rossi, oman, kitab, nabatei, qays, erodoto, sinai, arab, egitto, fermo
EMPTY (313): [TOWN_NAME], ancona, bologna, pesaro, cristo, [PLACE_NAME], fermo, gialli, imola, marche
Gender seems to be a lexical feature of PROPN: 100% of lemmas (56) occur with only one value of Gender.
AUX
35 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (35; 100%), Person=EMPTY (35; 100%), Tense=Past (27; 77%), VerbForm=Part (27; 77%), Number=Sing (26; 74%).
AUX tokens may have the following values of Gender:
Fem (18; 51% of non-empty Gender): stata, state, son, esser
Masc (17; 49% of non-empty Gender): stato, son, stati, abbiamo, avevo, ero, stavo
EMPTY (1009): è, sono, ho, ha, era, devi, hanno, possiamo, abbiamo, son
| Paradigm essere | Masc | Fem |
|---|---|---|
| _ | son | son |
| Number=Sing | ero | |
| Number=Sing\|Tense=Past\|VerbForm=Part | stato | stata |
| Number=Plur | esser | |
| Number=Plur\|Tense=Past\|VerbForm=Part | stati | state |
INTJ
34 INTJ tokens (4% of all INTJ tokens) have a non-empty value of Gender.
INTJ tokens may have the following values of Gender:
Masc (34; 100% of non-empty Gender): mh, eh
EMPTY (757): eh, mh, okay, ah, no, sì, vabbè, mhmh, beh, ehm
NUM
34 NUM tokens (20% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: Number=Sing (25; 74%), NumType=Ord (19; 56%).
NUM tokens may have the following values of Gender:
Fem (10; 29% of non-empty Gender): prima, seconda, sedicimila, terza
Masc (24; 71% of non-empty Gender): primi, primo, seicento, trecentoventotto, duecento, duemiladiciotto, milleseicento, ottocento, secondo, sedici
EMPTY (138): due, quattro, tre, cinque, quattordici, sette, dieci, mille, undici, cinquanta
| Paradigm primo | Masc | Fem |
|---|---|---|
| _ | primi | |
| Number=Sing\|NumType=Ord | primo | prima |
| Number=Plur\|NumType=Ord | primi |
ADV
33 ADV tokens (1% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=EMPTY (29; 88%).
ADV tokens may have the following values of Gender:
Fem (12; 36% of non-empty Gender): cosa, etcetera, lì, più, invece, molte, quali, tutta, tutte, vicina
Masc (21; 64% of non-empty Gender): quanto, giusto, lì, meno, almeno, bene, esatto, fino, lontano, manco
EMPTY (2179): non, sì, no, anche, più, poi, molto, così, bene, adesso
| Paradigm lì | Masc | Fem |
|---|---|---|
| lì | lì |
ADP
2 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Fem (1; 50% of non-empty Gender): a
Masc (1; 50% of non-empty Gender): in
EMPTY (1901): di, in, a, per, da, con, su, come, secondo, tra
CCONJ
2 CCONJ tokens (0% of all CCONJ tokens) have a non-empty value of Gender.
CCONJ tokens may have the following values of Gender:
Fem (2; 100% of non-empty Gender): oppure
EMPTY (1000): e, cioè, ma, quindi, però, o, comunque, sia, che, infatti
X
1 X token (0% of all X tokens) has a non-empty value of Gender.
X tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): s~
EMPTY (353): x, s~, no~, a~, day, di~, may, n~, p~, ti~
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (1388; 81%),
NOUN –[amod]–> ADJ (341; 69%),
NOUN –[conj]–> NOUN (46; 56%),
PROPN –[det]–> DET (34; 51%),
ADJ –[det]–> DET (24; 59%),
NOUN –[det:poss]–> DET (23; 66%),
ADJ –[nsubj]–> NOUN (22; 76%),
DET –[reparandum]–> DET (16; 57%),
NOUN –[parataxis]–> NOUN (15; 63%),
INTJ –[discourse]–> INTJ (13; 100%).
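
Agreement counts like those listed above can be approximated by walking each dependency edge and comparing the Gender of parent and child. This is a minimal sketch under stated assumptions: the sentence is hypothetical, `agreement_counts` is an illustrative name, and the percentage is taken relative to all edges of the same parent/relation/child type, which may differ from the denominator used on this page.

```python
from collections import Counter

# Hypothetical parsed sentence "la lingua araba grande" (not from the treebank).
SENTENCE = [
    {"id": 1, "upos": "DET", "head": 2, "deprel": "det", "feats": {"Gender": "Fem"}},
    {"id": 2, "upos": "NOUN", "head": 0, "deprel": "root", "feats": {"Gender": "Fem"}},
    {"id": 3, "upos": "ADJ", "head": 2, "deprel": "amod", "feats": {"Gender": "Fem"}},
    {"id": 4, "upos": "ADJ", "head": 2, "deprel": "amod", "feats": {}},
]


def agreement_counts(sentence, feature="Gender"):
    """Count edges (parent UPOS, relation, child UPOS) and those that agree."""
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # head 0 is the artificial root, which has no features
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        parent_value = parent["feats"].get(feature)
        child_value = child["feats"].get(feature)
        # Agreement requires a non-empty value on both nodes and equal values.
        if parent_value is not None and parent_value == child_value:
            agree[key] += 1
    return agree, total


agree, total = agreement_counts(SENTENCE)
for (parent, rel, child), n in agree.most_common():
    pct = 100 * n / total[(parent, rel, child)]
    print(f"{parent} -[{rel}]-> {child} ({n}; {pct:.0f}%)")
```

To cover a whole treebank, the two counters would be accumulated over every sentence before printing.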