Treebank Statistics: UD_Italian-KIParlaForest: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
2755 tokens (29%) have a non-empty value of Gender.
816 types (48%) occur at least once with a non-empty value of Gender.
661 lemmas (53%) occur at least once with a non-empty value of Gender.
The feature is used with 11 part-of-speech tags: NOUN (1155; 12% of instances), DET (746; 8% of instances), ADJ (307; 3% of instances), PRON (290; 3% of instances), VERB (205; 2% of instances), ADV (21; 0% of instances), AUX (14; 0% of instances), NUM (13; 0% of instances), PROPN (2; 0% of instances), CCONJ (1; 0% of instances), X (1; 0% of instances).
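Per-tag counts like these can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the helper names `parse_feats` and `gender_counts_by_upos` and the two-token sample are hypothetical, and it assumes standard 10-column CoNLL-U with UPOS in column 4 and FEATS in column 6.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_" or not feats:
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))

def gender_counts_by_upos(conllu_text):
    """Count, per UPOS tag, all tokens and tokens with a non-empty Gender."""
    total, with_gender = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return total, with_gender

# Hypothetical sample sentence, not taken from the treebank.
sample = (
    "# text = la casa\n"
    "1\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_\n"
    "2\tcasa\tcasa\tNOUN\tS\tGender=Fem|Number=Sing\t0\troot\t_\t_\n"
)
total, with_gender = gender_counts_by_upos(sample)
for upos in total:
    print(upos, with_gender[upos], "of", total[upos])
```

Dividing `with_gender[upos]` by `total[upos]` gives the per-tag share reported in each section below (for example, "95% of all NOUN tokens").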
NOUN
1155 NOUN tokens (95% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (803; 70%).
NOUN tokens may have the following values of Gender:
Fem (602; 52% of non-empty Gender): città, casa, realtà, università, cosa, parte, via, zona, macchina, volta
Masc (553; 48% of non-empty Gender): tipo, centro, anni, senso, minuti, uovo, anno, livello, piedi, sacco
EMPTY (63): po’, cazzo, tipo, ‘mbare, badile, incentivo, inglese, apostrofo, assistente, audio
| Paradigm tipo | Masc | Fem |
|---|---|---|
| _ | tipo | |
| Number=Sing | tipo | tipa |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (406) occur with only one value of Gender.
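The "lexical feature" test amounts to asking how many lemmas are only ever seen with a single Gender value. A minimal sketch of that check follows; the function name `single_value_share` and the toy data are hypothetical, and it assumes that only tokens with a non-empty Gender are counted, which this page does not state explicitly.

```python
from collections import defaultdict

def single_value_share(observations):
    """observations: iterable of (lemma, gender) pairs for one UPOS tag.

    Returns (share, count) of lemmas seen with exactly one Gender value.
    """
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single / len(values), single

# Hypothetical toy data: 'tipo' occurs with both values (tipo, tipa).
obs = [
    ("casa", "Fem"), ("casa", "Fem"),
    ("centro", "Masc"),
    ("tipo", "Masc"), ("tipo", "Fem"),
]
share, count = single_value_share(obs)
print(f"{share:.0%} of lemmas ({count}) occur with only one value of Gender")
```

A share close to 100% suggests that Gender is stored with the lemma rather than chosen by agreement.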
DET
746 DET tokens (86% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (629; 84%), Number=Sing (550; 74%), Definite=Def (425; 57%).
DET tokens may have the following values of Gender:
Fem (363; 49% of non-empty Gender): la, le, una, un’, delle, quella, mia, questa, tutte, altra
Masc (383; 51% of non-empty Gender): il, un, i, gli, dei, lo, tutti, questo, uno, ‘sto
EMPTY (126): l’, che, tutto, ogni, ‘sta, altra, altre, dei, il, quanti
| Paradigm il | Masc | Fem |
|---|---|---|
| Definite=Def\|Number=Sing\|PronType=Art | il, lo | la |
| Definite=Def\|Number=Plur\|PronType=Art | i, gli | le |
| Number=Sing\|Person=3\|PronType=Prs | lo | la |
| Number=Plur\|PronType=Art | i | |
ADJ
307 ADJ tokens (75% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: PronType=EMPTY (261; 85%), Number=Sing (242; 79%).
ADJ tokens may have the following values of Gender:
Fem (151; 49% of non-empty Gender): mia, piccola, bella, tua, mezza, universitaria, altra, lontana, sola, tedesca
Masc (156; 51% of non-empty Gender): esatto, miei, strano, grosso, piccolo, scorso, vero, bel, bellissimo, bello
EMPTY (104): grande, difficile, certo, altra, familiare, intollerante, mezz’, culturale, diverse, inintelligibile
| Paradigm mio | Masc | Fem |
|---|---|---|
| Number=Sing\|Poss=Yes\|PronType=Prs | | mia |
| Number=Plur | miei | |
| Number=Plur\|Poss=Yes\|PronType=Prs | miei | |
PRON
290 PRON tokens (29% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (234; 81%), Person=EMPTY (165; 57%), PronType=Prs (147; 51%).
PRON tokens may have the following values of Gender:
Fem (65; 22% of non-empty Gender): lei, quella, questa, le, altra, la, quelle, tutta, tutte, una
Masc (225; 78% of non-empty Gender): lo, quello, l’, tutti, questo, uno, li, niente, quelli, altro
EMPTY (712): c’, io, che, ci, mi, me, si, ti, te, ne
| Paradigm quello | Masc | Fem |
|---|---|---|
| Number=Sing | quello, quel | quella |
| Number=Plur | quelli | quelle |
VERB
205 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (205; 100%), Person=EMPTY (205; 100%), Number=Sing (179; 87%), Tense=Past (172; 84%), VerbForm=Part (172; 84%).
VERB tokens may have the following values of Gender:
Fem (53; 26% of non-empty Gender): fatta, basta, chiusa, costruita, fo, legata, mangiata, preferita, ristrutturata, sputtanata
Masc (152; 74% of non-empty Gender): detto, fatto, sentito, vissuto, capito, mangiato, parlato, pensato, raccontato, scoperto
EMPTY (993): è, so, fa, fare, ha, diciamo, penso, era, hai, andare
| Paradigm essere | Masc | Fem |
|---|---|---|
| | stato | stata |
ADV
21 ADV tokens (2% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=EMPTY (19; 90%).
ADV tokens may have the following values of Gender:
Fem (7; 33% of non-empty Gender): cosa, lì, molte, tutta, tutte, vicina
Masc (14; 67% of non-empty Gender): quanto, giusto, meno, bene, esatto, lontano, manco, pochino, quanti, vero
EMPTY (1023): non, anche, più, poi, molto, sempre, così, no, fuori, adesso
| Paradigm vicino | Masc | Fem |
|---|---|---|
| | vicino | vicina |
Gender seems to be a lexical feature of ADV. 93% of lemmas (13) occur with only one value of Gender.
AUX
14 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (14; 100%), Person=EMPTY (14; 100%), Number=Sing (11; 79%), Tense=Past (9; 64%), VerbForm=Part (9; 64%).
AUX tokens may have the following values of Gender:
Fem (8; 57% of non-empty Gender): stata, son
Masc (6; 43% of non-empty Gender): son, stato, ero, stavo
EMPTY (566): è, ho, era, ha, devi, sono, son, devo, hai, sei
| Paradigm essere | Masc | Fem |
|---|---|---|
| _ | son | son |
| Number=Sing | ero | |
| Number=Sing\|Tense=Past\|VerbForm=Part | stato | stata |
NUM
13 NUM tokens (14% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Ord (12; 92%), Number=Sing (9; 69%).
NUM tokens may have the following values of Gender:
Fem (6; 46% of non-empty Gender): prima, seconda
Masc (7; 54% of non-empty Gender): primi, primo, secondo
EMPTY (79): due, quattro, tre, undici, cinquanta, dodici, quattordici, venti, cinque, dieci
| Paradigm primo | Masc | Fem |
|---|---|---|
| Number=Sing | primo | prima |
| Number=Plur | primi | |
PROPN
2 PROPN tokens (1% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (2; 100% of non-empty Gender): fermo
EMPTY (255): [TOWN_NAME], ancona, bologna, pesaro, [PLACE_NAME], fermo, gialli, imola, marche, pasqua
CCONJ
1 CCONJ token (0% of all CCONJ tokens) has a non-empty value of Gender.
CCONJ tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): oppure
EMPTY (628): e, cioè, ma, però, quindi, comunque, o, infatti, invece, mentre
X
1 X token (0% of all X tokens) has a non-empty value of Gender.
X tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): s~
EMPTY (208): x, s~, day, may, no~, ti~, a~, bibbidibobbidibu, da~, d~
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (597; 84%),
NOUN –[amod]–> ADJ (134; 76%),
NOUN –[conj]–> NOUN (25; 60%),
ADJ –[nsubj]–> NOUN (11; 85%),
NOUN –[amod]–> DET (11; 69%),
NOUN –[parataxis]–> NOUN (11; 61%),
NOUN –[det:poss]–> DET (9; 100%),
NOUN –[reparandum]–> NOUN (8; 100%),
ADJ –[det]–> DET (7; 58%),
DET –[reparandum]–> DET (7; 54%).
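Agreement counts of this kind can be approximated by comparing the Gender of each token with that of its head. The sketch below is illustrative only: the names `feats_dict` and `gender_agreement` and the sample rows are hypothetical, and it assumes the percentage is the share of agreeing pairs among pairs in which both nodes carry Gender, since this page does not state the denominator.

```python
from collections import Counter

def feats_dict(feats):
    """Parse a CoNLL-U FEATS string into a dict ('_' means no features)."""
    if feats in ("_", ""):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))

def gender_agreement(sentence_rows):
    """sentence_rows: 10-column CoNLL-U token rows (lists) of one sentence.

    Returns two Counters keyed by (parent UPOS, deprel, child UPOS):
    pairs that agree in Gender, and pairs where both nodes have Gender.
    """
    by_id = {row[0]: row for row in sentence_rows}
    agree, both = Counter(), Counter()
    for row in sentence_rows:
        parent = by_id.get(row[6])
        if parent is None:  # root (HEAD = 0) or head missing from the rows
            continue
        g_child = feats_dict(row[5]).get("Gender")
        g_parent = feats_dict(parent[5]).get("Gender")
        if g_child is None or g_parent is None:
            continue
        key = (parent[3], row[7], row[3])
        both[key] += 1
        if g_child == g_parent:
            agree[key] += 1
    return agree, both

# Hypothetical sample sentence ("la casa"), not taken from the treebank.
rows = [
    "1\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_".split("\t"),
    "2\tcasa\tcasa\tNOUN\tS\tGender=Fem|Number=Sing\t0\troot\t_\t_".split("\t"),
]
agree, both = gender_agreement(rows)
for key, n in both.items():
    print(key, agree[key], f"{agree[key] / n:.0%}")
```

Summing these counters over all sentences and sorting by the agreeing count would yield a ranking in the same `parent –[relation]–> child` form as the list above.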