Treebank Statistics: UD_Slovenian-SSJ: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
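A layered feature appears in the CoNLL-U FEATS column under the same name with an optional layer in square brackets. The following is a minimal sketch of how the two layers could be separated; the helper names `parse_feats` and `gender_layers` and the sample FEATS string are illustrative assumptions, not part of any UD tool or taken from the treebank.

```python
import re

# Feature name with an optional layer in square brackets,
# e.g. "Gender" or "Gender[psor]".
_NAME_RE = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<layer>[^\[\]]+)\])?$")


def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {(name, layer): value}."""
    result = {}
    if feats in ("", "_"):  # "_" marks an empty FEATS column
        return result
    for pair in feats.split("|"):
        key, _, value = pair.partition("=")
        match = _NAME_RE.match(key)
        if match is None:
            continue  # skip anything that is not a well-formed feature name
        result[(match.group("name"), match.group("layer"))] = value
    return result


def gender_layers(feats):
    """Return the Gender value of each layer; the base layer is keyed as None."""
    return {
        layer: value
        for (name, layer), value in parse_feats(feats).items()
        if name == "Gender"
    }


# Hypothetical FEATS string for a possessive form (illustration only).
sample = "Case=Nom|Gender=Fem|Gender[psor]=Masc|Number=Sing|Poss=Yes"
layers = gender_layers(sample)
```

Here `layers[None]` holds the value of the base layer (agreement with the possessed noun) and `layers["psor"]` the value of the possessor layer.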
121349 tokens (45%) have a non-empty value of Gender.
45177 types (93%) occur at least once with a non-empty value of Gender.
21537 lemmas (85%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (56865; 21% instances), ADJ (28426; 11% instances), VERB (11423; 4% instances), PROPN (10239; 4% instances), DET (7978; 3% instances), PRON (3964; 1% instances), AUX (1441; 1% instances), NUM (1013; 0% instances).
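Per-tag counts of this kind can in principle be recomputed from the treebank's CoNLL-U files. The sketch below assumes standard ten-column CoNLL-U input and counts only the base layer (`Gender=`, not `Gender[psor]=`); the function name `gender_counts_by_upos` is hypothetical, and the script that produced the figures above may count differently.

```python
from collections import Counter


def gender_counts_by_upos(conllu_text):
    """Return (total, with_gender): token counts per UPOS tag."""
    total = Counter()
    with_gender = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences, "#" lines are comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a token line
        token_id, upos, feats = cols[0], cols[3], cols[5]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword-token ranges and empty nodes
        total[upos] += 1
        # Only the base layer counts; "Gender[psor]=" does not match this prefix.
        if any(f.startswith("Gender=") for f in feats.split("|")):
            with_gender[upos] += 1
    return total, with_gender
```

Dividing `with_gender[tag]` by `total[tag]` gives a per-tag share comparable to the "% of all tokens" figures reported for each tag in the sections that follow.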
NOUN
56865 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (40401; 71%).
NOUN tokens may have the following values of Gender:
Fem (23065; 41% of non-empty Gender): strani, države, večina, možnosti, vrste, država, oči, podlagi, poti, skupine
Masc (23631; 42% of non-empty Gender): primer, dan, ljudi, čas, del, času, dni, tolarjev, svetu, milijonov
Neut (10169; 18% of non-empty Gender): leta, let, delo, leto, letih, mesto, dela, življenje, vprašanje, mestu
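The percentages in the value list are shares of the tokens with a non-empty Gender, rounded to whole numbers. A small worked check with the NOUN counts given above; the function name `rounded_shares` is illustrative.

```python
def rounded_shares(counts):
    """Return each value's share of the total, as a whole-number percentage."""
    total = sum(counts.values())
    return {value: round(100 * n / total) for value, n in counts.items()}


# NOUN counts as reported on this page.
noun_counts = {"Fem": 23065, "Masc": 23631, "Neut": 10169}
noun_total = sum(noun_counts.values())   # 56865, the number of NOUN tokens
noun_shares = rounded_shares(noun_counts)
```

Because each share is rounded independently, the three percentages add up to 101 rather than 100.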
Paradigm delo | Masc | Neut |
---|---|---|
Case=Acc|Number=Sing | delo | |
Case=Acc|Number=Dual | deli | |
Case=Acc|Number=Plur | dela | |
Case=Dat|Number=Sing | delu | |
Case=Gen|Number=Sing | dela | dela |
Case=Gen|Number=Plur | del | |
Case=Ins|Number=Sing | delom | |
Case=Ins|Number=Plur | deli | |
Case=Loc|Number=Sing | delu | |
Case=Loc|Number=Plur | delih | |
Case=Nom|Number=Sing | delo | |
Case=Nom|Number=Plur | dela | |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (8840) occur with only one value of Gender.
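The "lexical feature" judgment rests on how many lemmas occur with a single Gender value. A minimal sketch of that measurement, assuming the input is a list of (lemma, gender) pairs for one part-of-speech tag; the function name and the sample pairs are hypothetical.

```python
from collections import defaultdict


def single_gender_lemma_share(observations):
    """Return (count, share) of lemmas seen with exactly one Gender value.

    observations: iterable of (lemma, gender) pairs; pairs with an empty
    gender are ignored.
    """
    values = defaultdict(set)
    for lemma, gender in observations:
        if gender:
            values[lemma].add(gender)
    if not values:
        return 0, 0.0
    single = sum(1 for genders in values.values() if len(genders) == 1)
    return single, single / len(values)


# Illustrative pairs only, not drawn from the treebank.
pairs = [
    ("država", "Fem"), ("država", "Fem"),
    ("delo", "Neut"), ("delo", "Masc"),
    ("dan", "Masc"),
]
single, share = single_gender_lemma_share(pairs)
```

The paradigm table for delo lists forms under both Masc and Neut, so the 100% reported for NOUN is presumably a rounded figure rather than an exact one.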
ADJ
28426 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (25970; 91%), VerbForm=EMPTY (24781; 87%), Definite=EMPTY (24330; 86%), Number=Sing (19341; 68%).
ADJ tokens may have the following values of Gender:
Fem (11860; 42% of non-empty Gender): druge, drugi, prva, nove, velika, sama, novo, druga, evropske, drugih
Masc (11337; 40% of non-empty Gender): prvi, drugi, sam, drugih, slovenski, sami, velik, novi, pravi, veliki
Neut (5229; 18% of non-empty Gender): mogoče, potrebno, pomembno, jasno, novo, drugim, težko, znano, dobro, podobno
Paradigm drug | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Definite=Def|Number=Sing | drugi | ||
Case=Acc|Definite=Ind|Number=Sing | drug | ||
Case=Acc|Number=Sing | drugega | drugo | drugo |
Case=Acc|Number=Plur | druge | druge | druga |
Case=Dat|Number=Sing | drugemu | drugi | |
Case=Dat|Number=Plur | drugim | ||
Case=Gen|Number=Sing | drugega | druge | drugega |
Case=Gen|Number=Plur | drugih | drugih | drugih |
Case=Ins|Number=Sing | drugim | drugo | drugim |
Case=Ins|Number=Plur | drugimi | drugimi | drugimi |
Case=Loc|Number=Sing | drugem | drugi | drugem |
Case=Loc|Number=Dual | drugih | drugih | |
Case=Loc|Number=Plur | drugih | drugih | drugih |
Case=Nom|Definite=Def|Number=Sing | drugi | ||
Case=Nom|Definite=Ind|Number=Sing | drug | ||
Case=Nom|Number=Sing | druga | drugo | |
Case=Nom|Number=Dual | drugi | ||
Case=Nom|Number=Plur | drugi | druge | druga |
VERB
11423 VERB tokens (46% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (11423; 100%), Person=EMPTY (11423; 100%), Tense=EMPTY (11423; 100%), VerbForm=Part (11423; 100%), Number=Sing (7536; 66%), Aspect=Perf (6957; 61%).
VERB tokens may have the following values of Gender:
Fem (2834; 25% of non-empty Gender): bila, imela, postala, morala, začela, rekla, dobila, prišla, pokazala, povedala
Masc (7346; 64% of non-empty Gender): imel, moral, povedal, imeli, morali, bil, rekel, začel, dejal, postal
Neut (1243; 11% of non-empty Gender): bilo, zgodilo, uspelo, prišlo, šlo, zdelo, začelo, ostalo, dalo, imelo
EMPTY (13169): je, ima, ni, gre, so, imajo, bo, mora, pomeni, pravi
Paradigm biti | Masc | Fem | Neut |
---|---|---|---|
Number=Sing | bil | bila | bilo, blo |
Number=Dual | bila, bla | bili | bili |
Number=Plur | bili | bile | |
PROPN
10239 PROPN tokens (100% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (9702; 95%), Case=Nom (5770; 56%).
PROPN tokens may have the following values of Gender:
Fem (3302; 32% of non-empty Gender): Slovenije, Sloveniji, EU, Slovenija, ZDA, Evropi, Ljubljana, Ljubljani, Evrope, Slovenijo
Masc (6668; 65% of non-empty Gender): Maribor, Janez, Mariboru, New, Bojan, ESS, Jože, Slovenci, Slovencev, Boris
Neut (269; 3% of non-empty Gender): Celje, Kosova, Hrvaškem, Japonskem, Kitajskem, Koroškem, Kosovu, Laško, Slovenskem, Celju
Paradigm EU | Masc | Fem |
---|---|---|
Case=Acc | EU | |
Case=Gen | EU | |
Case=Loc | EU | |
Case=Nom | EU | EU |
Gender seems to be a lexical feature of PROPN. 98% of lemmas (4962) occur with only one value of Gender.
DET
7978 DET tokens (85% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (6502; 81%), Person=EMPTY (6502; 81%), Number=Sing (5692; 71%), Poss=EMPTY (5684; 71%).
DET tokens may have the following values of Gender:
Fem (2511; 31% of non-empty Gender): svojo, te, svoje, ta, vse, to, svoji, tej, kateri, vseh
Masc (2875; 36% of non-empty Gender): ta, vsi, tem, vsak, svoj, njegov, katerem, vse, tega, tisti
Neut (2592; 32% of non-empty Gender): to, tem, tega, vse, temu, svoje, njegovo, tisto, vsega, vsem
EMPTY (1374): več, nekaj, veliko, manj, dovolj, malo, toliko, pol, preveč, največ
Paradigm ta | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Number=Sing | ta, tega | to | to |
Case=Acc|Number=Dual | ti | ||
Case=Acc|Number=Plur | te | te | ta |
Case=Dat|Number=Sing | temu | tej | temu |
Case=Dat|Number=Plur | tem | tem | tem |
Case=Gen|Number=Sing | tega | te | tega |
Case=Gen|Number=Dual | teh | ||
Case=Gen|Number=Plur | teh | teh | teh |
Case=Ins|Number=Sing | tem | to | tem |
Case=Ins|Number=Dual | tema | ||
Case=Ins|Number=Plur | temi | temi | temi |
Case=Loc|Number=Sing | tem | tej | tem |
Case=Loc|Number=Dual | teh | ||
Case=Loc|Number=Plur | teh | teh | teh |
Case=Nom|Number=Sing | ta | ta | to |
Case=Nom|Number=Dual | ta | ti | |
Case=Nom|Number=Plur | ti | te | ta |
PRON
3964 PRON tokens (44% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (3964; 100%), Number=Sing (2985; 75%), PronType=Prs (2822; 71%), Person=3 (2772; 70%), Variant=Short (2162; 55%), Case=Acc (1988; 50%).
PRON tokens may have the following values of Gender:
Fem (972; 25% of non-empty Gender): jo, jih, ji, njej, njo, je, ona, jim, njih, nje
Masc (1906; 48% of non-empty Gender): ga, jih, mu, jim, kdo, njim, njimi, njih, njem, nihče
Neut (1086; 27% of non-empty Gender): kar, kaj, nekaj, nič, ga, jih, čemer, česar, ničesar, marsikaj
EMPTY (5142): se, si, mi, nas, nam, me, vam, vas, jaz, ti
Paradigm on | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Number=Sing | njega | njo | |
Case=Acc|Number=Sing|Variant=Short | ga | jo | ga |
Case=Acc|Number=Dual | njiju, onadva | njiju | |
Case=Acc|Number=Dual|Variant=Short | ju, jih | ju | ju |
Case=Acc|Number=Plur | njih, nje | ||
Case=Acc|Number=Plur|Variant=Short | jih | jih | jih |
Case=Dat|Number=Sing | njemu | njej | |
Case=Dat|Number=Sing|Variant=Short | mu | ji | mu |
Case=Dat|Number=Dual | njima | ||
Case=Dat|Number=Dual|Variant=Short | jima | jima | |
Case=Dat|Number=Plur | njim | njim | |
Case=Dat|Number=Plur|Variant=Short | jim | jim | jim |
Case=Gen|Number=Sing | njega | nje | njega |
Case=Gen|Number=Sing|Variant=Short | ga | je | ga |
Case=Gen|Number=Dual | njiju | ||
Case=Gen|Number=Dual|Variant=Short | ju | ||
Case=Gen|Number=Plur | njih | njih | njih |
Case=Gen|Number=Plur|Variant=Short | jih | jih | jih |
Case=Ins|Number=Sing | njim | njo | njim |
Case=Ins|Number=Dual | njima | njima | |
Case=Ins|Number=Plur | njimi | njimi | njimi |
Case=Loc|Number=Sing | njem | njej | njem |
Case=Loc|Number=Dual | njiju | njima | |
Case=Loc|Number=Plur | njih | njih | njih |
Case=Nom|Number=Sing | on | ona | |
Case=Nom|Number=Dual | onadva | ||
Case=Nom|Number=Plur | oni | ||
AUX
1441 AUX tokens (8% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (1441; 100%), Person=EMPTY (1441; 100%), Polarity=EMPTY (1441; 100%), Tense=EMPTY (1441; 100%), VerbForm=Part (1441; 100%), Number=Sing (1129; 78%).
AUX tokens may have the following values of Gender:
Fem (442; 31% of non-empty Gender): bila, bile, bili, bla
Masc (719; 50% of non-empty Gender): bil, bili, bila, bli
Neut (280; 19% of non-empty Gender): bilo, bila, bili, blo
EMPTY (15886): je, so, bi, bo, ni, sem, bodo, sta, smo, niso
Paradigm biti | Masc | Fem | Neut |
---|---|---|---|
Number=Sing | bil | bila, bla | bilo, blo |
Number=Dual | bila | bili | bili |
Number=Plur | bili, bli | bile | bila |
NUM
1013 NUM tokens (18% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (1013; 100%), NumType=Card (1007; 99%).
NUM tokens may have the following values of Gender:
Fem (417; 41% of non-empty Gender): ena, eno, dve, eni, tri, dveh, štiri, ene, treh, dvema
Masc (449; 44% of non-empty Gender): dva, eden, en, enega, dveh, enem, tri, treh, trije, štiri
Neut (147; 15% of non-empty Gender): eno, dve, tri, štirih, dveh, enem, treh, štiri, dvema, tremi
EMPTY (4572): 2, 1, 10, 3, 6, 30, 1., 20, pet, tisoč
Paradigm en | Masc | Fem | Neut |
---|---|---|---|
Case=Acc|Number=Sing | en, enega, Enga | eno | eno |
Case=Dat|Number=Sing | enemu | eni | |
Case=Gen|Number=Sing | enega, enga | ene | enega |
Case=Gen|Number=Plur | enih | ||
Case=Ins|Number=Sing | enim | eno | enim |
Case=Loc|Number=Sing | enem | eni | enem |
Case=Loc|Number=Plur | enih | ||
Case=Nom|Number=Sing | en | ena | eno |
Case=Nom|Number=Plur | eni | ||
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (21257; 99%),
NOUN –[det]–> DET (4982; 88%),
ADJ –[nsubj]–> NOUN (1636; 98%),
NOUN –[nmod]–> PROPN (1633; 54%),
PROPN –[flat:name]–> PROPN (1539; 99%),
ADJ –[conj]–> ADJ (1216; 93%),
VERB –[nsubj]–> PROPN (1065; 73%),
VERB –[conj]–> VERB (1004; 69%),
PROPN –[conj]–> PROPN (625; 77%),
PROPN –[amod]–> ADJ (454; 99%).
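Agreement counts like those listed above can be sketched as a pass over dependency edges that compares the Gender of each child with that of its parent. The page does not state how its percentages are defined; the sketch below assumes they are the share of agreeing pairs among pairs in which both nodes carry a Gender value. The token dictionaries and the function names are illustrative assumptions.

```python
from collections import Counter


def gender_of(feats):
    """Return the base-layer Gender value from a FEATS string, or None."""
    for feature in feats.split("|"):
        if feature.startswith("Gender="):
            return feature[len("Gender="):]
    return None


def agreement_by_relation(sentence):
    """Count Gender agreement per (parent UPOS, relation, child UPOS).

    sentence: list of dicts with keys "id", "head", "upos", "deprel", "feats".
    Returns (agree, both): pairs that agree, and pairs where both nodes
    have a Gender value.
    """
    by_id = {token["id"]: token for token in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root has no parent token
        parent_gender = gender_of(parent["feats"])
        child_gender = gender_of(child["feats"])
        if parent_gender is None or child_gender is None:
            continue  # agreement is undefined without both values
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent_gender == child_gender:
            agree[key] += 1
    return agree, both


# Hypothetical two-token fragment (illustration only).
fragment = [
    {"id": 1, "head": 2, "upos": "ADJ", "deprel": "amod",
     "feats": "Case=Nom|Gender=Fem|Number=Sing"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root",
     "feats": "Case=Nom|Gender=Fem|Number=Sing"},
]
agree, both = agreement_by_relation(fragment)
```

Under that assumption, `agree[key] / both[key]` for the key `("NOUN", "amod", "ADJ")` corresponds to the percentage shown for NOUN –[amod]–> ADJ.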