Treebank Statistics: UD_Slovenian-SSJ: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
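A layered feature appears in the CoNLL-U FEATS column under its base name plus an optional bracketed layer, for example `Gender[psor]` for the possessor's gender. The following is a minimal sketch of how the layers might be separated; the function name `layers_of` and the sample FEATS string are hypothetical illustrations, not part of the tooling that produced this page.

```python
import re

# A feature name is a base name plus an optional lowercase layer in brackets.
LAYERED = re.compile(r"^(?P<base>[A-Za-z0-9]+)(?:\[(?P<layer>[a-z0-9]+)\])?$")


def layers_of(feats, base="Gender"):
    """Map each layer of `base` found in a FEATS string to its value.

    The unlayered feature is stored under the empty string.
    """
    result = {}
    if feats in ("_", ""):
        return result
    for pair in feats.split("|"):
        name, value = pair.split("=", 1)
        match = LAYERED.match(name)
        if match and match.group("base") == base:
            result[match.group("layer") or ""] = value
    return result


# Hypothetical FEATS value for a possessive determiner.
example = layers_of("Gender=Fem|Gender[psor]=Masc|Number=Sing|Poss=Yes")
```

Here `example` holds one entry for the agreement layer (empty key) and one for the `psor` layer, so the two can be counted separately.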
121349 tokens (45%) have a non-empty value of Gender.
45177 types (93%) occur at least once with a non-empty value of Gender.
21537 lemmas (85%) occur at least once with a non-empty value of Gender.
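Counts of this kind can be derived from a CoNLL-U file. The sketch below assumes the standard ten-column CoNLL-U layout and treats a "type" as the word form exactly as written, which may differ from the definition used for this page; the function name `gender_coverage` and the four-token sample are hypothetical.

```python
def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_coverage(conllu_text, feature="Gender"):
    """Return (with_feature, total) for tokens, types and lemmas."""
    tokens_with = tokens_all = 0
    types_with, types_all = set(), set()
    lemmas_with, lemmas_all = set(), set()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges and empty nodes
            continue
        form, lemma, feats = cols[1], cols[2], parse_feats(cols[5])
        tokens_all += 1
        types_all.add(form)
        lemmas_all.add(lemma)
        if feature in feats:
            tokens_with += 1
            types_with.add(form)
            lemmas_with.add(lemma)
    return {
        "tokens": (tokens_with, tokens_all),
        "types": (len(types_with), len(types_all)),
        "lemmas": (len(lemmas_with), len(lemmas_all)),
    }


# Toy sentence for illustration only (not taken from the treebank).
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "Država", "država", "NOUN", "_", "Case=Nom|Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"],
    ["2", "je", "biti", "AUX", "_", "Number=Sing|Person=3|VerbForm=Fin", "3", "cop", "_", "_"],
    ["3", "velika", "velik", "ADJ", "_", "Case=Nom|Gender=Fem|Number=Sing", "0", "root", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
])

coverage = gender_coverage(SAMPLE)
```

On the toy sentence, two of the four tokens carry Gender, so each of the three pairs in `coverage` is `(2, 4)`.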
The feature is used with 8 part-of-speech tags (token count; share of all tokens in the treebank): NOUN (56865; 21%), ADJ (28426; 11%), VERB (11423; 4%), PROPN (10239; 4%), DET (7978; 3%), PRON (3964; 1%), AUX (1441; 1%), NUM (1013; 0%).
NOUN
56865 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (40401; 71%).
NOUN tokens may have the following values of Gender:
Fem (23065; 41% of non-empty Gender): strani, države, večina, možnosti, vrste, država, oči, podlagi, poti, skupine
Masc (23631; 42% of non-empty Gender): primer, dan, ljudi, čas, del, času, dni, tolarjev, svetu, milijonov
Neut (10169; 18% of non-empty Gender): leta, let, delo, leto, letih, mesto, dela, življenje, vprašanje, mestu
| Paradigm delo | Masc | Neut |
|---|---|---|
| Case=Acc\|Number=Sing | delo | |
| Case=Acc\|Number=Dual | deli | |
| Case=Acc\|Number=Plur | dela | |
| Case=Dat\|Number=Sing | delu | |
| Case=Gen\|Number=Sing | dela | dela |
| Case=Gen\|Number=Plur | del | |
| Case=Ins\|Number=Sing | delom | |
| Case=Ins\|Number=Plur | deli | |
| Case=Loc\|Number=Sing | delu | |
| Case=Loc\|Number=Plur | delih | |
| Case=Nom\|Number=Sing | delo | |
| Case=Nom\|Number=Plur | dela | |
Gender seems to be a lexical feature of NOUN: 100% of lemmas (8840) occur with only one value of Gender.
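The lexicality check amounts to grouping tokens by lemma and asking how many lemmas are seen with a single Gender value. A minimal sketch follows, assuming the (lemma, Gender) pairs have already been extracted; the function name `single_valued_lemmas` and the toy pairs are hypothetical and do not reproduce the treebank counts.

```python
from collections import defaultdict


def single_valued_lemmas(pairs):
    """Return (lemmas with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Toy data: one lemma is deliberately given two Gender values.
TOY_PAIRS = [
    ("delo", "Neut"),
    ("delo", "Neut"),
    ("delo", "Masc"),
    ("država", "Fem"),
    ("čas", "Masc"),
]

single, total = single_valued_lemmas(TOY_PAIRS)
share = 100 * single / total  # percentage of single-valued lemmas
```

A share close to 100% suggests the feature is a property of the lemma rather than of the individual inflected form.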
ADJ
28426 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (25970; 91%), VerbForm=EMPTY (24781; 87%), Definite=EMPTY (24330; 86%), Number=Sing (19341; 68%).
ADJ tokens may have the following values of Gender:
Fem (11860; 42% of non-empty Gender): druge, drugi, prva, nove, velika, sama, novo, druga, evropske, drugih
Masc (11337; 40% of non-empty Gender): prvi, drugi, sam, drugih, slovenski, sami, velik, novi, pravi, veliki
Neut (5229; 18% of non-empty Gender): mogoče, potrebno, pomembno, jasno, novo, drugim, težko, znano, dobro, podobno
| Paradigm drug | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Definite=Def\|Number=Sing | drugi | | |
| Case=Acc\|Definite=Ind\|Number=Sing | drug | | |
| Case=Acc\|Number=Sing | drugega | drugo | drugo |
| Case=Acc\|Number=Plur | druge | druge | druga |
| Case=Dat\|Number=Sing | drugemu | drugi | |
| Case=Dat\|Number=Plur | drugim | | |
| Case=Gen\|Number=Sing | drugega | druge | drugega |
| Case=Gen\|Number=Plur | drugih | drugih | drugih |
| Case=Ins\|Number=Sing | drugim | drugo | drugim |
| Case=Ins\|Number=Plur | drugimi | drugimi | drugimi |
| Case=Loc\|Number=Sing | drugem | drugi | drugem |
| Case=Loc\|Number=Dual | drugih | drugih | |
| Case=Loc\|Number=Plur | drugih | drugih | drugih |
| Case=Nom\|Definite=Def\|Number=Sing | drugi | | |
| Case=Nom\|Definite=Ind\|Number=Sing | drug | | |
| Case=Nom\|Number=Sing | druga | drugo | |
| Case=Nom\|Number=Dual | drugi | | |
| Case=Nom\|Number=Plur | drugi | druge | druga |
VERB
11423 VERB tokens (46% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (11423; 100%), Person=EMPTY (11423; 100%), Tense=EMPTY (11423; 100%), VerbForm=Part (11423; 100%), Number=Sing (7536; 66%), Aspect=Perf (6957; 61%).
VERB tokens may have the following values of Gender:
Fem (2834; 25% of non-empty Gender): bila, imela, postala, morala, začela, rekla, dobila, prišla, pokazala, povedala
Masc (7346; 64% of non-empty Gender): imel, moral, povedal, imeli, morali, bil, rekel, začel, dejal, postal
Neut (1243; 11% of non-empty Gender): bilo, zgodilo, uspelo, prišlo, šlo, zdelo, začelo, ostalo, dalo, imelo
EMPTY (13169): je, ima, ni, gre, so, imajo, bo, mora, pomeni, pravi
| Paradigm biti | Masc | Fem | Neut |
|---|---|---|---|
| Number=Sing | bil | bila | bilo, blo |
| Number=Dual | bila, bla | bili | bili |
| Number=Plur | bili | bile | |
PROPN
10239 PROPN tokens (100% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (9702; 95%), Case=Nom (5770; 56%).
PROPN tokens may have the following values of Gender:
Fem (3302; 32% of non-empty Gender): Slovenije, Sloveniji, EU, Slovenija, ZDA, Evropi, Ljubljana, Ljubljani, Evrope, Slovenijo
Masc (6668; 65% of non-empty Gender): Maribor, Janez, Mariboru, New, Bojan, ESS, Jože, Slovenci, Slovencev, Boris
Neut (269; 3% of non-empty Gender): Celje, Kosova, Hrvaškem, Japonskem, Kitajskem, Koroškem, Kosovu, Laško, Slovenskem, Celju
| Paradigm EU | Masc | Fem |
|---|---|---|
| Case=Acc | EU | |
| Case=Gen | EU | |
| Case=Loc | EU | |
| Case=Nom | EU | EU |
Gender seems to be a lexical feature of PROPN: 98% of lemmas (4962) occur with only one value of Gender.
DET
7978 DET tokens (85% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (6502; 81%), Person=EMPTY (6502; 81%), Number=Sing (5692; 71%), Poss=EMPTY (5684; 71%).
DET tokens may have the following values of Gender:
Fem (2511; 31% of non-empty Gender): svojo, te, svoje, ta, vse, to, svoji, tej, kateri, vseh
Masc (2875; 36% of non-empty Gender): ta, vsi, tem, vsak, svoj, njegov, katerem, vse, tega, tisti
Neut (2592; 32% of non-empty Gender): to, tem, tega, vse, temu, svoje, njegovo, tisto, vsega, vsem
EMPTY (1374): več, nekaj, veliko, manj, dovolj, malo, toliko, pol, preveč, največ
| Paradigm ta | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | ta, tega | to | to |
| Case=Acc\|Number=Dual | ti | | |
| Case=Acc\|Number=Plur | te | te | ta |
| Case=Dat\|Number=Sing | temu | tej | temu |
| Case=Dat\|Number=Plur | tem | tem | tem |
| Case=Gen\|Number=Sing | tega | te | tega |
| Case=Gen\|Number=Dual | teh | | |
| Case=Gen\|Number=Plur | teh | teh | teh |
| Case=Ins\|Number=Sing | tem | to | tem |
| Case=Ins\|Number=Dual | tema | | |
| Case=Ins\|Number=Plur | temi | temi | temi |
| Case=Loc\|Number=Sing | tem | tej | tem |
| Case=Loc\|Number=Dual | teh | | |
| Case=Loc\|Number=Plur | teh | teh | teh |
| Case=Nom\|Number=Sing | ta | ta | to |
| Case=Nom\|Number=Dual | ta | ti | |
| Case=Nom\|Number=Plur | ti | te | ta |
PRON
3964 PRON tokens (44% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (3964; 100%), Number=Sing (2985; 75%), PronType=Prs (2822; 71%), Person=3 (2772; 70%), Variant=Short (2162; 55%), Case=Acc (1988; 50%).
PRON tokens may have the following values of Gender:
Fem (972; 25% of non-empty Gender): jo, jih, ji, njej, njo, je, ona, jim, njih, nje
Masc (1906; 48% of non-empty Gender): ga, jih, mu, jim, kdo, njim, njimi, njih, njem, nihče
Neut (1086; 27% of non-empty Gender): kar, kaj, nekaj, nič, ga, jih, čemer, česar, ničesar, marsikaj
EMPTY (5142): se, si, mi, nas, nam, me, vam, vas, jaz, ti
| Paradigm on | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | njega | njo | |
| Case=Acc\|Number=Sing\|Variant=Short | ga | jo | ga |
| Case=Acc\|Number=Dual | njiju, onadva | njiju | |
| Case=Acc\|Number=Dual\|Variant=Short | ju, jih | ju | ju |
| Case=Acc\|Number=Plur | njih, nje | | |
| Case=Acc\|Number=Plur\|Variant=Short | jih | jih | jih |
| Case=Dat\|Number=Sing | njemu | njej | |
| Case=Dat\|Number=Sing\|Variant=Short | mu | ji | mu |
| Case=Dat\|Number=Dual | njima | | |
| Case=Dat\|Number=Dual\|Variant=Short | jima | jima | |
| Case=Dat\|Number=Plur | njim | njim | |
| Case=Dat\|Number=Plur\|Variant=Short | jim | jim | jim |
| Case=Gen\|Number=Sing | njega | nje | njega |
| Case=Gen\|Number=Sing\|Variant=Short | ga | je | ga |
| Case=Gen\|Number=Dual | njiju | | |
| Case=Gen\|Number=Dual\|Variant=Short | ju | | |
| Case=Gen\|Number=Plur | njih | njih | njih |
| Case=Gen\|Number=Plur\|Variant=Short | jih | jih | jih |
| Case=Ins\|Number=Sing | njim | njo | njim |
| Case=Ins\|Number=Dual | njima | njima | |
| Case=Ins\|Number=Plur | njimi | njimi | njimi |
| Case=Loc\|Number=Sing | njem | njej | njem |
| Case=Loc\|Number=Dual | njiju | njima | |
| Case=Loc\|Number=Plur | njih | njih | njih |
| Case=Nom\|Number=Sing | on | ona | |
| Case=Nom\|Number=Dual | onadva | | |
| Case=Nom\|Number=Plur | oni | | |
AUX
1441 AUX tokens (8% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (1441; 100%), Person=EMPTY (1441; 100%), Polarity=EMPTY (1441; 100%), Tense=EMPTY (1441; 100%), VerbForm=Part (1441; 100%), Number=Sing (1129; 78%).
AUX tokens may have the following values of Gender:
Fem (442; 31% of non-empty Gender): bila, bile, bili, bla
Masc (719; 50% of non-empty Gender): bil, bili, bila, bli
Neut (280; 19% of non-empty Gender): bilo, bila, bili, blo
EMPTY (15886): je, so, bi, bo, ni, sem, bodo, sta, smo, niso
| Paradigm biti | Masc | Fem | Neut |
|---|---|---|---|
| Number=Sing | bil | bila, bla | bilo, blo |
| Number=Dual | bila | bili | bili |
| Number=Plur | bili, bli | bile | bila |
NUM
1013 NUM tokens (18% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (1013; 100%), NumType=Card (1007; 99%).
NUM tokens may have the following values of Gender:
Fem (417; 41% of non-empty Gender): ena, eno, dve, eni, tri, dveh, štiri, ene, treh, dvema
Masc (449; 44% of non-empty Gender): dva, eden, en, enega, dveh, enem, tri, treh, trije, štiri
Neut (147; 15% of non-empty Gender): eno, dve, tri, štirih, dveh, enem, treh, štiri, dvema, tremi
EMPTY (4572): 2, 1, 10, 3, 6, 30, 1., 20, pet, tisoč
| Paradigm en | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | en, enega, Enga | eno | eno |
| Case=Dat\|Number=Sing | enemu | eni | |
| Case=Gen\|Number=Sing | enega, enga | ene | enega |
| Case=Gen\|Number=Plur | enih | | |
| Case=Ins\|Number=Sing | enim | eno | enim |
| Case=Loc\|Number=Sing | enem | eni | enem |
| Case=Loc\|Number=Plur | enih | | |
| Case=Nom\|Number=Sing | en | ena | eno |
| Case=Nom\|Number=Plur | eni | | |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (21257; 99%),
NOUN –[det]–> DET (4982; 88%),
ADJ –[nsubj]–> NOUN (1636; 98%),
NOUN –[nmod]–> PROPN (1633; 54%),
PROPN –[flat:name]–> PROPN (1539; 99%),
ADJ –[conj]–> ADJ (1216; 93%),
VERB –[nsubj]–> PROPN (1065; 73%),
VERB –[conj]–> VERB (1004; 69%),
PROPN –[conj]–> PROPN (625; 77%),
PROPN –[amod]–> ADJ (454; 99%).
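Agreement figures like those above could be computed by walking every dependency edge and comparing the Gender of the parent and the child. The sketch below assumes that the percentage is taken over edges where both nodes have a non-empty Gender, which is an interpretation rather than something this page states; the names `read_sentences` and `gender_agreement` and the toy sentences are hypothetical.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_sentences(conllu_text):
    """Yield each sentence as a dict: token id -> (upos, feats, head, deprel)."""
    sentence = {}
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():  # a blank line closes the current sentence
            if sentence:
                yield sentence
            sentence = {}
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():  # skip multiword ranges and empty nodes
                sentence[cols[0]] = (cols[3], parse_feats(cols[5]), cols[6], cols[7])


def gender_agreement(conllu_text):
    """Map (parent UPOS, relation, child UPOS) to (agreeing edges, edges with Gender on both ends)."""
    agree, total = Counter(), Counter()
    for sentence in read_sentences(conllu_text):
        for upos, feats, head, deprel in sentence.values():
            if head not in sentence:  # the root has head 0
                continue
            parent_upos, parent_feats, _, _ = sentence[head]
            if "Gender" in feats and "Gender" in parent_feats:
                key = (parent_upos, deprel, upos)
                total[key] += 1
                if feats["Gender"] == parent_feats["Gender"]:
                    agree[key] += 1
    return {key: (agree[key], total[key]) for key in total}


# Two toy sentences for illustration only (not taken from the treebank).
SENT_1 = [
    ["1", "Država", "država", "NOUN", "_", "Case=Nom|Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"],
    ["2", "je", "biti", "AUX", "_", "Number=Sing|Person=3|VerbForm=Fin", "3", "cop", "_", "_"],
    ["3", "velika", "velik", "ADJ", "_", "Case=Nom|Gender=Fem|Number=Sing", "0", "root", "_", "_"],
]
SENT_2 = [
    ["1", "Novo", "nov", "ADJ", "_", "Case=Nom|Gender=Neut|Number=Sing", "2", "amod", "_", "_"],
    ["2", "delo", "delo", "NOUN", "_", "Case=Nom|Gender=Neut|Number=Sing", "0", "root", "_", "_"],
]
SAMPLE = "\n\n".join("\n".join("\t".join(row) for row in sent) for sent in (SENT_1, SENT_2))

agreement = gender_agreement(SAMPLE)
```

Each key in `agreement` follows the parent, relation, child order used in the list, so `("NOUN", "amod", "ADJ")` corresponds to the NOUN –[amod]–> ADJ line.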