Gender: gender
Gender is a lexical feature of nouns and proper nouns, and an inflectional feature of other parts of speech (adjectives, verbs, auxiliaries, pronouns, determiners and numerals) that mark agreement with nouns.
Masc: masculine gender
Examples
- prijatelj “male friend”
- sluga “manservant”
- grad “castle”
- stroj “machine”
Fem: feminine gender
Examples
- prijateljica “female friend”
- bankirka “female banker”
- univerza “university”
- perut “wing”
Neut: neuter gender
Examples
- leto “year”
- življenje “life”
- podjetje “company”
- sodišče “court”
Conversion from JOS
All tokens with feature Gender=masculine are converted to Gender=Masc, all tokens with feature Gender=feminine are converted to Gender=Fem and all tokens with feature Gender=neuter are converted to Gender=Neut.
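The conversion rule above amounts to a three-entry lookup. The following is a minimal sketch of it, not the actual conversion script: the names `JOS_TO_UD_GENDER` and `convert_gender` are hypothetical, and representing JOS features as a Python dict is an assumption made for illustration.

```python
# Hypothetical lookup table built from the three conversion rules stated above.
JOS_TO_UD_GENDER = {
    "masculine": "Masc",
    "feminine": "Fem",
    "neuter": "Neut",
}


def convert_gender(jos_features):
    """Return the UD Gender feature for a dict of JOS features.

    Only Gender is handled; tokens without a JOS Gender get no UD Gender.
    """
    ud_features = {}
    value = jos_features.get("Gender")
    if value in JOS_TO_UD_GENDER:
        ud_features["Gender"] = JOS_TO_UD_GENDER[value]
    return ud_features


print(convert_gender({"Gender": "feminine"}))
```

Tokens whose JOS annotation carries no Gender attribute come back with an empty dict, which corresponds to the EMPTY value reported in the statistics.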
Treebank Statistics (UD_Slovenian)
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
64731 tokens (46%) have a non-empty value of Gender.
29150 types (92%) occur at least once with a non-empty value of Gender.
14613 lemmas (87%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags:
- sl-pos/NOUN (30139; 21% of instances)
- sl-pos/ADJ (15027; 11% of instances)
- sl-pos/VERB (7644; 5% of instances)
- sl-pos/PROPN (4682; 3% of instances)
- sl-pos/PRON (3855; 3% of instances)
- sl-pos/DET (2877; 2% of instances)
- sl-pos/NUM (486; 0% of instances)
- sl-pos/AUX (21; 0% of instances)
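Counts of this kind can be reproduced from a treebank file in CoNLL-U format. The sketch below is one possible way to do it using only the standard library; the helper names are hypothetical, and the three-token sample is invented for illustration and is not taken from UD_Slovenian.

```python
from collections import Counter


def parse_feats(feats):
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Count tokens per UPOS tag and how many of them carry a Gender value."""
    total, with_gender = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return total, with_gender


# Invented sample sentence ("Univerza je velika"), for illustration only.
ROWS = [
    ("1", "Univerza", "univerza", "NOUN", "_", "Case=Nom|Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"),
    ("2", "je", "biti", "AUX", "_", "Number=Sing|Person=3", "3", "cop", "_", "_"),
    ("3", "velika", "velik", "ADJ", "_", "Case=Nom|Gender=Fem|Number=Sing", "0", "root", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

total, with_gender = gender_counts_by_upos(SAMPLE)
for upos in sorted(total):
    print(upos, with_gender[upos], "of", total[upos])
```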
NOUN
30139 sl-pos/NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (21345; 71%).
NOUN tokens may have the following values of Gender:
- Fem (12182; 40% of non-empty Gender): strani, države, pomoč, oči, možnosti, poti, pot, stvari, skupine, vrste
- Masc (12496; 41% of non-empty Gender): dan, čas, ljudi, del, tolarjev, dni, način, času, časa, otrok
- Neut (5461; 18% of non-empty Gender): leta, let, leto, letih, življenje, dela, delo, mesto, vprašanje, mestu
| Paradigm pot | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | pot | |
| Case=Acc\|Number=Plur | poti | |
| Case=Dat\|Number=Sing | poti | |
| Case=Gen\|Number=Sing | pota | poti |
| Case=Gen\|Number=Plur | poti | |
| Case=Ins\|Number=Sing | potjo | |
| Case=Ins\|Number=Plur | potmi | |
| Case=Loc\|Number=Sing | poti | |
| Case=Loc\|Number=Plur | poteh | |
| Case=Nom\|Number=Sing | pot | |
| Case=Nom\|Number=Plur | poti | |
Gender seems to be a lexical feature of NOUN. 100% of lemmas (6404) occur with only one value of Gender.
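The lexical-feature check can be sketched as follows: collect the set of Gender values seen for each lemma and report the share of lemmas with exactly one value. This is a minimal illustration, assuming tokens are available as (lemma, gender) pairs; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def share_single_gender_lemmas(tokens):
    """Return the fraction of lemmas that occur with exactly one Gender value.

    tokens: iterable of (lemma, gender) pairs; gender is None when empty.
    """
    values = defaultdict(set)
    for lemma, gender in tokens:
        if gender is not None:
            values[lemma].add(gender)
    if not values:
        return 0.0
    single = sum(1 for genders in values.values() if len(genders) == 1)
    return single / len(values)


# Toy data: "pot" is seen with two values, as in the paradigm table above.
toy = [("pot", "Fem"), ("pot", "Masc"), ("leto", "Neut"), ("grad", "Masc"), ("grad", "Masc")]
print(share_single_gender_lemmas(toy))
```

A lemma such as pot, which is listed with both Masc and Fem forms, counts against the share, so a reported 100% is presumably a rounded figure.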
ADJ
15027 sl-pos/ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (13768; 92%), VerbForm=EMPTY (13098; 87%), Definite=EMPTY (12959; 86%), Number=Sing (10131; 67%).
ADJ tokens may have the following values of Gender:
- Fem (6165; 41% of non-empty Gender): druge, drugi, nove, novo, prva, sama, velika, veliko, druga, drugo
- Masc (6013; 40% of non-empty Gender): sam, prvi, drugi, slovenski, drugih, pravi, sami, novi, velik, državni
- Neut (2849; 19% of non-empty Gender): mogoče, pomembno, jasno, potrebno, težko, novo, drugim, novega, prihodnje, drugo
| Paradigm drug | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Definite=Def\|Number=Sing | drugi | | |
| Case=Acc\|Definite=Ind\|Number=Sing | drug | | |
| Case=Acc\|Number=Sing | drugega | drugo | drugo |
| Case=Acc\|Number=Plur | druge | druge | druga |
| Case=Dat\|Number=Sing | drugemu | drugi | |
| Case=Dat\|Number=Plur | drugim | | |
| Case=Gen\|Number=Sing | drugega | druge | drugega |
| Case=Gen\|Number=Plur | drugih | drugih | |
| Case=Ins\|Number=Sing | drugim | drugo | drugim |
| Case=Ins\|Number=Plur | drugimi | drugimi | |
| Case=Loc\|Number=Sing | drugem | drugi | drugem |
| Case=Loc\|Number=Dual | drugih | | |
| Case=Loc\|Number=Plur | drugih | drugih | drugih |
| Case=Nom\|Definite=Def\|Number=Sing | drugi | | |
| Case=Nom\|Definite=Ind\|Number=Sing | drug | | |
| Case=Nom\|Number=Sing | | druga | drugo |
| Case=Nom\|Number=Plur | drugi | druge | druga |
VERB
7644 sl-pos/VERB tokens (44% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Negative=EMPTY (7644; 100%), VerbForm=Part (7644; 100%), Tense=EMPTY (7644; 100%), Person=EMPTY (7644; 100%), Mood=EMPTY (7644; 100%), Number=Sing (5091; 67%), Aspect=Perf (4188; 55%).
VERB tokens may have the following values of Gender:
- Fem (1947; 25% of non-empty Gender): bila, bile, imela, morala, začela, prišla, rekla, pokazala, vedela, dobila
- Masc (4800; 63% of non-empty Gender): bil, bili, imel, moral, povedal, imeli, morali, dejal, začeli, postal
- Neut (897; 12% of non-empty Gender): bilo, zgodilo, uspelo, prišlo, zdelo, šlo, začelo, bila, ostalo, pomenilo
- EMPTY (9666): je, so, ni, bo, ima, gre, imajo, biti, mora, pomeni
| Paradigm biti | Masc | Fem | Neut |
|---|---|---|---|
| Number=Sing | bil | bila | bilo, blo |
| Number=Dual | bila, bla | bili | |
| Number=Plur | bili | bile | bila |
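A paradigm table of this shape can be derived by grouping the attested forms of one lemma by the remaining features (rows) and by Gender (columns). The sketch below shows the idea under stated assumptions: tokens are given as (form, lemma, features) triples, and `build_paradigm` is a hypothetical helper, not the tool that generated this page.

```python
from collections import defaultdict


def build_paradigm(tokens, lemma, keys=("Number",)):
    """Group the forms of one lemma by selected features (rows) and Gender (columns).

    tokens: iterable of (form, lemma, feats) where feats is a dict of features.
    keys:   features that label the rows, in the order they should appear.
    """
    table = defaultdict(lambda: defaultdict(set))
    for form, token_lemma, feats in tokens:
        if token_lemma != lemma or "Gender" not in feats:
            continue
        row = "|".join(f"{key}={feats[key]}" for key in keys if key in feats)
        table[row][feats["Gender"]].add(form.lower())
    return {
        row: {gender: sorted(forms) for gender, forms in columns.items()}
        for row, columns in table.items()
    }


# A few forms of "biti" taken from the singular and plural rows above.
toy = [
    ("bil", "biti", {"Gender": "Masc", "Number": "Sing"}),
    ("bila", "biti", {"Gender": "Fem", "Number": "Sing"}),
    ("bilo", "biti", {"Gender": "Neut", "Number": "Sing"}),
    ("blo", "biti", {"Gender": "Neut", "Number": "Sing"}),
    ("bili", "biti", {"Gender": "Masc", "Number": "Plur"}),
    ("je", "biti", {"Number": "Sing", "Person": "3"}),
]
paradigm = build_paradigm(toy, "biti")
print(paradigm["Number=Sing"])
```

Forms without a Gender value, such as je, are skipped, which matches how the EMPTY group is kept out of the paradigm tables.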
PROPN
4682 sl-pos/PROPN tokens (100% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (4408; 94%), Case=Nom (2427; 52%).
PROPN tokens may have the following values of Gender:
- Fem (1613; 34% of non-empty Gender): Slovenije, Sloveniji, Slovenija, EU, Ljubljani, ZDA, Slovenijo, Evropi, LJUBLJANA, Ljubljana
- Masc (2922; 62% of non-empty Gender): Mariboru, Slovenci, Maribor, New, Drnovšek, Janez, Jože, Gregor, Milan, Slovencev
- Neut (147; 3% of non-empty Gender): Celje, Kosova, Kosovu, Koroškem, Slovenskem, Dolenjskem, Gorenjskem, Hrvaškem, Celju, Japonskem
| Paradigm EU | Masc | Fem |
|---|---|---|
| Case=Acc | EU | |
| Case=Gen | EU | |
| Case=Loc | EU | |
| Case=Nom | EU | EU |
Gender seems to be a lexical feature of PROPN. 99% of lemmas (2571) occur with only one value of Gender.
PRON
3855 sl-pos/PRON tokens (56% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (2987; 77%), Variant=EMPTY (2513; 65%), Person=EMPTY (2167; 56%).
PRON tokens may have the following values of Gender:
- Fem (824; 21% of non-empty Gender): jo, jih, ji, njej, katero, njo, kateri, katerih, ta, katere
- Masc (1538; 40% of non-empty Gender): ga, jih, mu, kdo, jim, katerem, vsi, njim, njih, katerega
- Neut (1493; 39% of non-empty Gender): to, tem, kaj, kar, tega, vse, nekaj, nič, ga, temu
- EMPTY (3074): se, si, mi, nas, nam, vam, me, vas, jaz, sebi
| Paradigm on | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | njega | njo | |
| Case=Acc\|Number=Sing\|Variant=Short | ga | jo | ga |
| Case=Acc\|Number=Dual | njiju | | |
| Case=Acc\|Number=Dual\|Variant=Short | ju | ju | ju |
| Case=Acc\|Number=Plur | njih, nje | | |
| Case=Acc\|Number=Plur\|Variant=Short | jih | jih | jih |
| Case=Dat\|Number=Sing | njemu | njej | |
| Case=Dat\|Number=Sing\|Variant=Short | mu | ji | mu |
| Case=Dat\|Number=Dual | njima | | |
| Case=Dat\|Number=Dual\|Variant=Short | jima | jima | |
| Case=Dat\|Number=Plur | njim | njim | |
| Case=Dat\|Number=Plur\|Variant=Short | jim | jim | jim |
| Case=Gen\|Number=Sing | njega | nje | njega |
| Case=Gen\|Number=Sing\|Variant=Short | ga | je | ga |
| Case=Gen\|Number=Dual | njiju | | |
| Case=Gen\|Number=Dual\|Variant=Short | ju | | |
| Case=Gen\|Number=Plur | njih | njih | njih |
| Case=Gen\|Number=Plur\|Variant=Short | jih | jih | jih |
| Case=Ins\|Number=Sing | njim | njo | njim |
| Case=Ins\|Number=Dual | njima | njima | |
| Case=Ins\|Number=Plur | njimi | njimi | njimi |
| Case=Loc\|Number=Sing | njem | njej | njem |
| Case=Loc\|Number=Dual | njiju | | |
| Case=Loc\|Number=Plur | njih | njih | njih |
| Case=Nom\|Number=Sing | on | ona | |
| Case=Nom\|Number=Dual | onadva | | |
| Case=Nom\|Number=Plur | oni | | |
DET
2877 sl-pos/DET tokens (86% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Degree=EMPTY (2877; 100%), Gender[psor]=EMPTY (2498; 87%), Number[psor]=EMPTY (2073; 72%), Poss=EMPTY (2073; 72%), Person=EMPTY (2073; 72%), Number=Sing (1926; 67%).
DET tokens may have the following values of Gender:
- Fem (1173; 41% of non-empty Gender): svojo, te, svoje, to, svoji, vse, tej, vseh, ta, njegove
- Masc (1230; 43% of non-empty Gender): ta, tem, svoj, vsak, njegov, vse, tega, ves, svojega, teh
- Neut (474; 16% of non-empty Gender): svoje, tem, to, njegovo, tega, svojega, vsako, njeno, svoja, svojim
- EMPTY (455): nekaj, več, veliko, dovolj, manj, malo, največ, pol, toliko, mnogo
| Paradigm ta | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | ta, tega | to | to |
| Case=Acc\|Number=Dual | ti | | |
| Case=Acc\|Number=Plur | te | te | ta |
| Case=Dat\|Number=Sing | temu | tej | temu |
| Case=Dat\|Number=Plur | tem | | |
| Case=Gen\|Number=Sing | tega | te | tega |
| Case=Gen\|Number=Dual | teh | | |
| Case=Gen\|Number=Plur | teh | teh | teh |
| Case=Ins\|Number=Sing | tem | to | tem |
| Case=Ins\|Number=Plur | temi | temi | |
| Case=Loc\|Number=Sing | tem | tej | tem |
| Case=Loc\|Number=Plur | teh | teh | teh |
| Case=Nom\|Number=Sing | ta | ta | to |
| Case=Nom\|Number=Dual | ta | | |
| Case=Nom\|Number=Plur | ti | te | ta |
NUM
486 sl-pos/NUM tokens (25% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (486; 100%), NumType=Card (481; 99%).
NUM tokens may have the following values of Gender:
- Fem (187; 38% of non-empty Gender): ena, eno, tri, dve, štiri, eni, dveh, ene, dvema, treh
- Masc (223; 46% of non-empty Gender): dva, eden, en, dveh, enega, enem, tri, trije, štirih, treh
- Neut (76; 16% of non-empty Gender): eno, štirih, treh, tri, dve, štiri, dveh, tremi, dvema, dvoje
- EMPTY (1441): tisoč, pet, 10, sto, 15, 2000, deset, 1., 50, 30
| Paradigm en | Masc | Fem | Neut |
|---|---|---|---|
| Case=Acc\|Number=Sing | en, enega | eno | eno |
| Case=Dat\|Number=Sing | enemu | eni | |
| Case=Gen\|Number=Sing | enega | ene | enega |
| Case=Ins\|Number=Sing | enim | eno | enim |
| Case=Loc\|Number=Sing | enem | eni | enem |
| Case=Loc\|Number=Plur | enih | | |
| Case=Nom\|Number=Sing | en | ena | eno |
| Case=Nom\|Number=Plur | eni | | |
AUX
21 sl-pos/AUX tokens (0% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: VerbForm=Part (21; 100%), Mood=EMPTY (21; 100%), Negative=EMPTY (21; 100%), Tense=EMPTY (21; 100%), Person=EMPTY (21; 100%), Number=Sing (14; 67%).
AUX tokens may have the following values of Gender:
- Fem (3; 14% of non-empty Gender): bila
- Masc (18; 86% of non-empty Gender): bil, bili, bila
- EMPTY (7126): je, so, bi, bo, sem, ni, bodo, sta, smo, niso
| Paradigm biti | Masc | Fem |
|---|---|---|
| Number=Sing | bil | bila |
| Number=Dual | bila | |
| Number=Plur | bili | |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (11304; 99%),
NOUN –[det]–> DET (2824; 86%),
ADJ –[nsubj]–> NOUN (879; 98%),
NOUN –[nmod]–> PROPN (839; 55%),
PROPN –[name]–> PROPN (683; 100%),
ADJ –[conj]–> ADJ (632; 93%),
VERB –[nsubj]–> PROPN (580; 73%),
VERB –[conj]–> VERB (550; 68%),
PROPN –[amod]–> ADJ (246; 100%),
PROPN –[conj]–> PROPN (224; 71%).
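Agreement counts like those above can be approximated by walking each dependency edge and comparing the Gender values of parent and child. The sketch below assumes one plausible definition: only pairs where both nodes have a Gender value are considered, and the percentage is the agreeing share of those pairs. Whether the page uses exactly this denominator is not stated, and the function name and token layout are hypothetical.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count parent/child pairs that agree in Gender, per relation pattern.

    sentence: list of dicts with keys id, head, upos, deprel, gender
              (gender is None when the token has no Gender value).
    Returns (agree, both): counters keyed by (parent UPOS, deprel, child UPOS).
    """
    by_id = {token["id"]: token for token in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # head 0 (root) has no parent token
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, both


# Invented sample sentence ("Univerza je velika"), for illustration only.
sentence = [
    {"id": 1, "head": 3, "upos": "NOUN", "deprel": "nsubj", "gender": "Fem"},
    {"id": 2, "head": 3, "upos": "AUX", "deprel": "cop", "gender": None},
    {"id": 3, "head": 0, "upos": "ADJ", "deprel": "root", "gender": "Fem"},
]
agree, both = gender_agreement(sentence)
for key in both:
    print(key, agree[key], "of", both[key])
```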