Treebank Statistics: UD_Slovenian-SST: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
9589 tokens (33%) have a non-empty value of Gender.
4433 types (72%) occur at least once with a non-empty value of Gender.
2967 lemmas (75%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (3626; 12% instances), ADJ (1664; 6% instances), DET (1611; 5% instances), VERB (1164; 4% instances), PRON (682; 2% instances), PROPN (444; 2% instances), NUM (270; 1% instances), AUX (128; 0% instances).
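Per-tag counts like those above can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that produced this page: the function names `parse_feats` and `gender_by_upos` are hypothetical, and the four-token sample is made up for illustration rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Count tokens with a non-empty Gender value, grouped by UPOS tag."""
    with_gender = Counter()
    total = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


# Made-up illustrative sentence (an assumption, not treebank data).
SAMPLE_ROWS = [
    ["1", "ta", "ta", "DET", "_",
     "Case=Nom|Gender=Fem|Number=Sing|PronType=Dem", "2", "det", "_", "_"],
    ["2", "stvar", "stvar", "NOUN", "_",
     "Case=Nom|Gender=Fem|Number=Sing", "4", "nsubj", "_", "_"],
    ["3", "je", "biti", "AUX", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "4", "cop", "_", "_"],
    ["4", "dobra", "dober", "ADJ", "_",
     "Case=Nom|Degree=Pos|Gender=Fem|Number=Sing", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

with_gender, total = gender_by_upos(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], "of", total[upos])
```

Run over the full treebank file, `with_gender` would hold the per-tag token counts and `with_gender[tag] / total[tag]` the share of each tag's tokens that carry Gender.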
NOUN
3626 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Animacy=EMPTY (3245; 89%), Number=Sing (2736; 75%).
NOUN tokens may have the following values of Gender:
Fem (1518; 42% of non-empty Gender): strani, stvari, hvala, minut, stopinj, gospa, stran, razmere, stvar, veze
Masc (1627; 45% of non-empty Gender): dan, redu, čas, evrov, koncu, gospod, ljudi, način, dni, del
Neut (481; 13% of non-empty Gender): bistvu, jutro, leto, leta, vprašanje, letih, ime, mestu, let, leti
Paradigm oči | Masc | Fem |
---|---|---|
Case=Gen|Number=Plur | oči | |
Case=Nom|Number=Sing | oči |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (1523) occur with only one value of Gender.
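The "lexical feature" observation can be checked by collecting, for each lemma of a given tag, the set of Gender values it occurs with. The sketch below assumes tokens are already available as `(lemma, upos, feats)` tuples with `feats` as a dict; the function name `single_valued_share` and the toy token list are hypothetical, and the reading of the reported count as the number of lemmas is itself an assumption.

```python
from collections import defaultdict


def single_valued_share(tokens, upos, feature="Gender"):
    """Return (lemma count, share of lemmas seen with exactly one value)."""
    values = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos and feature in feats:
            values[lemma].add(feats[feature])
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return len(values), single / len(values)


# Toy data for illustration only (not drawn from the treebank).
TOKENS = [
    ("stvar", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("stvar", "NOUN", {"Gender": "Fem", "Number": "Plur"}),
    ("dan", "NOUN", {"Gender": "Masc", "Number": "Sing"}),
    ("leto", "NOUN", {"Gender": "Neut", "Number": "Sing"}),
    ("dober", "ADJ", {"Gender": "Fem", "Number": "Sing"}),
    ("dober", "ADJ", {"Gender": "Masc", "Number": "Sing"}),
]

print(single_valued_share(TOKENS, "NOUN"))
print(single_valued_share(TOKENS, "ADJ"))
```

A share close to 1.0 suggests the feature is a property of the lemma, as for nouns here; a low share, as for the toy adjective, suggests it is inflectional.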
ADJ
1664 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (1478; 89%), Degree=Pos (1442; 87%), Definite=EMPTY (1350; 81%), Number=Sing (1266; 76%), Case=Nom (880; 53%).
ADJ tokens may have the following values of Gender:
Fem (650; 39% of non-empty Gender): drugo, druga, lepa, rdeča, same, druge, glavna, sama, dobra, tretjo
Masc (650; 39% of non-empty Gender): dober, prvi, drugi, lep, sam, stari, mali, sami, cel, pravi
Neut (364; 22% of non-empty Gender): dobro, glavnem, zanimivo, drugega, mogoče, drugo, hudega, jasno, podobno, pomembno
Paradigm drug | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Definite=Def|Number=Sing | drugi | ||
Case=Acc|Number=Sing | drugo | drugo | |
Case=Acc|Number=Plur | druge | druge | |
Case=Dat|Number=Sing | drugemu | ||
Case=Gen|Number=Sing | drugega | druge | drugega |
Case=Gen|Number=Plur | drugih | drugih | |
Case=Ins|Number=Sing | drugo | drugim | |
Case=Ins|Number=Plur | drugimi | ||
Case=Loc|Number=Sing | drugi | drugem | |
Case=Loc|Number=Dual | drugih | ||
Case=Nom|Definite=Def|Number=Sing | drugi | ||
Case=Nom|Definite=Ind|Number=Sing | drug | ||
Case=Nom|Number=Sing | druga | drugo | |
Case=Nom|Number=Plur | drugi | druge |
DET
1611 DET tokens (87% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (1332; 83%), PronType=Dem (1055; 65%).
DET tokens may have the following values of Gender:
Fem (363; 23% of non-empty Gender): ta, te, to, tej, take, naša, neke, neko, teh, moja
Masc (415; 26% of non-empty Gender): ta, tisti, tem, vsi, tega, kakšen, ti, oni, vsak, teh
Neut (833; 52% of non-empty Gender): to, vse, nič, tem, tega, nekaj, tisto, tole, temu, tako
EMPTY (233): malo, nekaj, več, koliko, dosti, toliko, veliko, pol, manj, preveč
Paradigm ta | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Number=Sing | ta, tega | to | to |
Case=Acc|Number=Plur | te | te | ta |
Case=Dat|Number=Sing | temu | tej | temu |
Case=Dat|Number=Plur | tem | tem | tem |
Case=Gen|Number=Sing | tega | te | tega |
Case=Gen|Number=Plur | teh | teh | teh |
Case=Ins|Number=Sing | tem | to | tem |
Case=Ins|Number=Plur | temi | temi | |
Case=Loc|Number=Sing | tem | tej | tem |
Case=Loc|Number=Plur | teh | teh | |
Case=Nom|Number=Sing | ta | ta | to |
Case=Nom|Number=Dual | ti | ||
Case=Nom|Number=Plur | ti | te | ta |
VERB
1164 VERB tokens (30% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (1164; 100%), Person=EMPTY (1164; 100%), Polarity=EMPTY (1164; 100%), Tense=EMPTY (1164; 100%), VerbForm=Part (1164; 100%), Number=Sing (781; 67%).
VERB tokens may have the following values of Gender:
Fem (314; 27% of non-empty Gender): imela, šla, bila, rekla, videla, dala, naredila, delala, izdala, mogla
Masc (719; 62% of non-empty Gender): bil, rekel, imeli, imel, rekli, šel, dobil, videl, videli, mogel
Neut (131; 11% of non-empty Gender): bilo, šlo, prišlo, zgodilo, dalo, ostalo, moglo, moralo, potegnilo, ratalo
EMPTY (2769): je, vem, veš, mislim, ni, recimo, ima, so, bo, pravi
Paradigm biti | Masc | Fem | Neut |
---|---|---|---|
Aspect=Imp|Number=Sing | bil | bilo | |
Number=Sing | bil | bila | bilo |
Number=Dual | bila | ||
Number=Plur | bili | bile |
PRON
682 PRON tokens (42% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (682; 100%), Number=Sing (493; 72%), Variant=EMPTY (473; 69%), PronType=Prs (397; 58%).
PRON tokens may have the following values of Gender:
Fem (122; 18% of non-empty Gender): jo, ona, jih, ji, je, njej, njo, midve, me, nje
Masc (320; 47% of non-empty Gender): ga, mi, jih, kdo, vi, on, jim, mu, oni, njega
Neut (240; 35% of non-empty Gender): kaj, kar, česa, čim, jih, karkoli, ga, marsikaj, čem, čemer
EMPTY (959): se, jaz, ti, mi, si, nas, vam, meni, me, mene
Paradigm on | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Number=Sing | njega | njo | |
Case=Acc|Number=Sing|Variant=Short | ga | jo | ga |
Case=Acc|Number=Plur | njih | ||
Case=Acc|Number=Plur|Variant=Short | jih | jih | jih |
Case=Dat|Number=Sing | njemu | njej | |
Case=Dat|Number=Sing|Variant=Short | mu | ji | |
Case=Dat|Number=Plur | njim | ||
Case=Dat|Number=Plur|Variant=Short | jim | jim | |
Case=Gen|Number=Sing | njega | nje | |
Case=Gen|Number=Sing|Variant=Short | ga | je | |
Case=Gen|Number=Plur|Variant=Short | jih | jih | |
Case=Ins|Number=Sing | njim | njo | |
Case=Ins|Number=Dual | njima | ||
Case=Ins|Number=Plur | njimi | njimi | |
Case=Loc|Number=Sing | njem | njej | |
Case=Loc|Number=Plur | njih | njih | |
Case=Nom|Number=Sing | on | ona | |
Case=Nom|Number=Dual | onadva | ||
Case=Nom|Number=Plur | oni | one |
PROPN
444 PROPN tokens (59% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (403; 91%), Case=Nom (233; 52%).
PROPN tokens may have the following values of Gender:
Fem (167; 38% of non-empty Gender): slovenija, sloveniji, slovenije, božjah, karavanke, bistrica, evropi, jugoslaviji, orsa, viktorije
Masc (267; 60% of non-empty Gender): jones, tom, david, healy, iraku, jezus, herman, paranoid, petty, quincy
Neut (10; 2% of non-empty Gender): pohorja, celja, jezerskim, laškega, madžarskem, pohorje, posočju, velenju
EMPTY (314): [name:personal], [name:surname], [name:address], [name:organisation], [name:place]
Gender seems to be a lexical feature of PROPN. 100% of lemmas (306) occur with only one value of Gender.
NUM
270 NUM tokens (54% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (270; 100%), NumType=Card (269; 100%), Number=Sing (153; 57%).
NUM tokens may have the following values of Gender:
Fem (127; 47% of non-empty Gender): eno, ena, dve, ene, tri, štiri, dveh, eni, treh, štirih
Masc (125; 46% of non-empty Gender): dva, en, enega, tri, eden, eni, štirje, enim, štiri, trije
Neut (18; 7% of non-empty Gender): eno, tri, ena, dve, enim, tremi, štirih, štirim
EMPTY (229): tisoč, dvajset, pet, petnajst, sto, šest, deset, petdeset, petsto, sedem
Paradigm en | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Number=Sing | en, enega | eno | eno |
Case=Acc|Number=Plur | ene | ||
Case=Dat|Number=Sing | enemu | ||
Case=Gen|Number=Sing | enega | ene | |
Case=Gen|Number=Plur | enih | ||
Case=Ins|Number=Sing | enim | eno | enim |
Case=Loc|Number=Sing | eni | ||
Case=Nom|Number=Sing | en | ena | eno |
Case=Nom|Number=Plur | eni | ena |
AUX
128 AUX tokens (7% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (128; 100%), Person=EMPTY (128; 100%), Polarity=EMPTY (128; 100%), Tense=EMPTY (128; 100%), VerbForm=Part (128; 100%), Number=Sing (105; 82%).
AUX tokens may have the following values of Gender:
Fem (36; 28% of non-empty Gender): bila, bile
Masc (60; 47% of non-empty Gender): bil, bili, bila
Neut (32; 25% of non-empty Gender): bilo, bila
EMPTY (1808): je, so, sem, bi, bo, smo, ni, si, bomo, ste
Paradigm biti | Masc | Fem | Neut |
---|---|---|---|
Aspect=Imp|Number=Sing | bil | bilo | |
Number=Sing | bil | bila | bilo |
Number=Dual | bila | ||
Number=Plur | bili | bile | bila |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (942; 99%),
NOUN –[det]–> DET (581; 90%),
NOUN –[nummod]–> NUM (141; 54%),
NOUN –[conj]–> NOUN (106; 60%),
ADJ –[nsubj]–> NOUN (75; 96%),
PROPN –[flat:name]–> PROPN (75; 100%),
ADJ –[conj]–> ADJ (50; 93%),
ADJ –[nsubj]–> DET (39; 93%),
NOUN –[appos]–> NOUN (30; 64%),
ADJ –[det]–> DET (27; 93%).
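Agreement counts of this kind can be sketched by walking each dependency edge and comparing the Gender of parent and child. The code below is an illustration under stated assumptions: the token dicts, the function name `agreement_counts`, and the toy sentence are hypothetical, and it assumes the percentage is taken over edges where both nodes carry the feature, which this page does not state explicitly.

```python
from collections import Counter


def agreement_counts(sentence, feature="Gender"):
    """Count edges where parent and child share a feature value.

    `sentence` is a list of dicts with keys id, head, upos, deprel, feats.
    Returns (agree, both): edges that agree, and edges where both nodes
    carry the feature, keyed by (parent UPOS, deprel, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:  # the root has no parent node
            continue
        parent_value = parent["feats"].get(feature)
        child_value = child["feats"].get(feature)
        if parent_value is None or child_value is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent_value == child_value:
            agree[key] += 1
    return agree, both


# Toy sentence for illustration only (not drawn from the treebank).
SENTENCE = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "feats": {"Gender": "Fem"}},
    {"id": 2, "head": 4, "upos": "NOUN", "deprel": "nsubj", "feats": {"Gender": "Fem"}},
    {"id": 3, "head": 4, "upos": "AUX", "deprel": "cop", "feats": {}},
    {"id": 4, "head": 0, "upos": "ADJ", "deprel": "root", "feats": {"Gender": "Fem"}},
]

agree, both = agreement_counts(SENTENCE)
for key, count in agree.most_common(10):
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({count}; {count / both[key]:.0%})")
```

Summing the two counters over all sentences and sorting by the agreeing count would give a ranking in the same shape as the list above.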