Treebank Statistics: UD_Upper_Sorbian-UFAL: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
This is a layered feature with the following layers: Gender, Gender[psor].
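A layered feature appears in the CoNLL-U FEATS column under its base name plus an optional bracketed layer, so `Gender` and `Gender[psor]` can sit on the same token. The sketch below shows one way to separate the two. The helper names `parse_feats` and `split_layer` and the FEATS string are hypothetical and made up for illustration; they are not taken from the treebank or its tooling.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {feature name: set of values}."""
    if feats in ("", "_"):
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        parsed[name] = set(value.split(","))  # multiple values are comma-separated
    return parsed


def split_layer(name):
    """Split 'Gender[psor]' into ('Gender', 'psor'); plain names get layer None."""
    if name.endswith("]") and "[" in name:
        base, layer = name[:-1].split("[", 1)
        return base, layer
    return name, None


# Hypothetical FEATS value of a possessive determiner (illustration only).
feats = parse_feats("Gender=Fem|Gender[psor]=Masc|Number=Sing|Poss=Yes")
layers = {
    split_layer(name): values
    for name, values in feats.items()
    if split_layer(name)[0] == "Gender"
}
print(layers)
```

Keeping the layer as a separate key component makes it possible to count the default layer and the possessor layer independently.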
4930 tokens (44%) have a non-empty value of Gender.
3298 types (76%) occur at least once with a non-empty value of Gender.
2112 lemmas (69%) occur at least once with a non-empty value of Gender.
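Token, type, and lemma coverage of this kind can be recomputed from a CoNLL-U file. The following is a minimal sketch, not the script that produced this page. It assumes that a type is a distinct, case-sensitive word form, and the sample rows and the names `read_tokens` and `coverage` are hypothetical.

```python
# Toy CoNLL-U rows (made up for illustration, not copied from the treebank).
ROWS = [
    "1\tserbska\tserbski\tADJ\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tamod\t_\t_",
    "2\trěč\trěč\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_",
    "3\tje\tbyć\tAUX\t_\tMood=Ind|Number=Sing|Person=3\t2\tcop\t_\t_",
    "4\ttež\ttež\tADV\t_\t_\t2\tadvmod\t_\t_",
    "5\trěče\trěč\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Plur\t2\tconj\t_\t_",
]
SAMPLE = "\n".join(ROWS)


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats dict) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:  # skip multiword ranges and empty nodes
            continue
        feats = {} if cols[5] == "_" else dict(p.split("=", 1) for p in cols[5].split("|"))
        yield cols[1], cols[2], cols[3], feats


def coverage(tokens, feature):
    """Return (with feature, total) counts for tokens, types, and lemmas."""
    tokens = list(tokens)
    marked = [t for t in tokens if feature in t[3]]

    def distinct(index):
        return len({t[index] for t in marked}), len({t[index] for t in tokens})

    return {
        "tokens": (len(marked), len(tokens)),
        "types": distinct(0),
        "lemmas": distinct(1),
    }


result = coverage(read_tokens(SAMPLE), "Gender")
for unit, (hits, total) in result.items():
    print(f"{hits} {unit} ({100 * hits / total:.0f}%) have a non-empty value of Gender")
```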
The feature is used with 9 part-of-speech tags: NOUN (2527; 23% instances), ADJ (1384; 12% instances), PROPN (539; 5% instances), DET (270; 2% instances), PRON (123; 1% instances), VERB (48; 0% instances), NUM (36; 0% instances), AUX (2; 0% instances), ADV (1; 0% instances).
NOUN
2527 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (1688; 67%), Animacy=EMPTY (1386; 55%).
NOUN tokens may have the following values of Gender:
Fem (934; 37% of non-empty Gender): l, rěč, woda, rěčow, stolica, rostliny, wody, rěče, knihi, biblioteki
Masc (1143; 45% of non-empty Gender): př, kilometrow, nastawki, kraja, lěttysaca, čas, institut, stat, wobraz, časa
Neut (450; 18% of non-empty Gender): město, lěta, lěće, mócnarstwo, pismo, słowo, lět, města, hospodarstwo, knjejstwa
EMPTY (16): km, m, CEST, droždźemi, duri, hodź, jan, thumb
Paradigm dataja | Fem | Neut |
---|---|---|
Case=Acc | dataje, daty | daty |
Case=Gen | datow |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (1012) occur with only one value of Gender.
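The "lexical feature" test amounts to asking how many lemmas of a given part of speech are ever seen with more than one value. A minimal sketch follows. It assumes the share is taken over lemmas that have at least one non-empty Gender value, which this page does not state explicitly. The observations and the name `single_value_lemmas` are hypothetical.

```python
from collections import defaultdict

# Toy (lemma, UPOS, Gender) observations; None stands for an empty value.
OBSERVATIONS = [
    ("rěč", "NOUN", "Fem"), ("rěč", "NOUN", "Fem"),
    ("město", "NOUN", "Neut"),
    ("čas", "NOUN", "Masc"),
    ("dataja", "NOUN", "Fem"), ("dataja", "NOUN", "Neut"),
    ("km", "NOUN", None),
    ("serbski", "ADJ", "Fem"), ("serbski", "ADJ", "Masc"),
]


def single_value_lemmas(observations, upos):
    """Return (lemmas with exactly one value, lemmas with any value) for one UPOS."""
    values = defaultdict(set)
    for lemma, tag, value in observations:
        if tag == upos and value is not None:
            values[lemma].add(value)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, total = single_value_lemmas(OBSERVATIONS, "NOUN")
print(f"{single} of {total} NOUN lemmas ({100 * single / total:.0f}%) have one Gender value")
```

A share close to 100% suggests the value belongs to the lemma (as with nouns), while a low share suggests it is assigned by agreement (as with adjectives).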
ADJ
1384 ADJ tokens (97% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Animacy=EMPTY (1224; 88%), Voice=EMPTY (1203; 87%), VerbForm=EMPTY (1202; 87%), Number=Sing (886; 64%), Degree=EMPTY (873; 63%).
ADJ tokens may have the following values of Gender:
Fem (557; 40% of non-empty Gender): serbskeje, wulku, druhe, serbska, wotpowědne, dalše, druhich, hornjej, kruta, němskej
Masc (579; 42% of non-empty Gender): serbski, prěni, Serbskeho, wulki, Ekscelentny, Serbskim, Třećeho, Zjednoćenych, ablawtowych, cyłym
Neut (248; 18% of non-empty Gender): najwjetše, wulke, klinowe, wuznamne, prěnje, Kaspiske, Kaspiskeho, aktualne, bjezdawkowe, běłe
EMPTY (37): němsko, Awstro, Planowane, Tibeto, d, dołho, druhich, duchowno, hornjo, krótko
Paradigm serbski | Masc | Fem | Neut |
---|---|---|---|
Animacy=Inan\|Case=Acc\|Degree=Pos\|Number=Dual | serbskej | | |
Case=Acc\|Degree=Pos\|Number=Sing | serbski | serbske | |
Case=Acc\|Number=Sing | serbsku | | |
Case=Dat\|Number=Sing | serbskemu | | |
Case=Dat\|Number=Plur | serbskim | | |
Case=Gen\|Degree=Pos\|Number=Sing | serbskeje | | |
Case=Gen\|Number=Sing | Serbskeho | serbskeje | |
Case=Gen\|Number=Plur | serbskich | | |
Case=Ins\|Number=Sing | serbskej, serbsku | | |
Case=Loc\|Degree=Pos\|Number=Sing | Serbskim | | |
Case=Loc\|Number=Sing | Serbskim | serbskej | |
Case=Nom\|Degree=Pos\|Number=Sing | Serbski, SERBSKI | serbska | |
Case=Nom\|Number=Sing | serbska | | |
Case=Nom\|Number=Plur | serbske |
PROPN
539 PROPN tokens (90% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (484; 90%).
PROPN tokens may have the following values of Gender:
Fem (209; 39% of non-empty Gender): Mezopotamiskeje, Mezopotamiska, Mezopotamiskej, Wikimedia, Łužicy, Europje, Assyriska, Němskeje, Wikipedija, Africe
Masc (281; 52% of non-empty Gender): Sumeričanow, Assur, Aššur, Babylon, Budyšinje, Hammurabi, Jakub, Ur, Akkada, Aramejčanow
Neut (49; 9% of non-empty Gender): Commons, Esperanto, Nadu, Slepo, Łobjom, Aleppo, Baku, Bangalore, Bengaluru, Esperanće
EMPTY (57): Aššur, C, Adl, Angeles, Gasche, Los, Tamil, Tlustulimu, Beth, Bilād
Paradigm Institut | Masc | Neut |
---|---|---|
Animacy=Inan|Case=Acc | Institut | |
Case=Nom | Institut |
Gender seems to be a lexical feature of PROPN: 99% of lemmas (319) occur with only one value of Gender.
DET
270 DET tokens (83% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Abbr=EMPTY (235; 87%), Number[psor]=EMPTY (226; 84%), Person=EMPTY (226; 84%), Poss=EMPTY (197; 73%), Animacy=EMPTY (180; 67%), Number=Sing (164; 61%).
DET tokens may have the following values of Gender:
Fem (126; 47% of non-empty Gender): n, kotraž, kotrež, tuta, swoju, tute, tutej, tutu, kotrejž, někotrych
Masc (104; 39% of non-empty Gender): kotrež, kotryž, tutón, n, někotři, swoje, tute, tutym, kotrychž, někotre
Neut (40; 15% of non-empty Gender): kotrež, tute, kóžde, žane, swoje, tajke, twojim, Wobě, kajke, kotrejž
EMPTY (57): jeho, jich, wjele, jeje, mnoho, n, Někotre, Tutón, Wšě, mjenje
Paradigm kotryž | Masc | Fem | Neut |
---|---|---|---|
Animacy=Anim\|Case=Dat\|Number=Plur | kotrymž | | |
Animacy=Anim\|Case=Nom\|Number=Sing | kotryž | | |
Animacy=Anim\|Case=Nom\|Number=Plur | kotřiž | | |
Animacy=Inan\|Case=Acc\|Number=Sing | kotryž | | |
Animacy=Inan\|Case=Gen\|Number=Plur | kotrychž | | |
Animacy=Inan\|Case=Loc\|Number=Plur | kotrychž | | |
Animacy=Inan\|Case=Nom\|Number=Sing | kotryž, kotrež | | |
Animacy=Inan\|Case=Nom\|Number=Plur | kotrež | | |
Case=Gen\|Number=Sing | kotrehož | kotrejež | |
Case=Ins\|Number=Plur | kotrymiž | | |
Case=Loc\|Number=Sing | kotrymž | kotrejž | |
Case=Loc\|Number=Plur | kotrychž | kotrychž | |
Case=Nom\|Number=Sing | kotryž | kotraž | kotrež |
Case=Nom\|Number=Dual | kotrejž | | |
Case=Nom\|Number=Plur | kotrež | kotrež | kotrež |
PRON
123 PRON tokens (36% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (123; 100%), Number=Sing (112; 91%), Person=EMPTY (73; 59%).
PRON tokens may have the following values of Gender:
Fem (18; 15% of non-empty Gender): wona, Jej, je, jeje, ju, njej, njeje, nju, wone
Masc (22; 18% of non-empty Gender): wón, jón, Woni, je, jeho, kiž, nich, nim
Neut (83; 67% of non-empty Gender): to, toho, tym, wono, wone, čimž, t, tomu, něšto, štož
EMPTY (215): so, kiž, je, sej, nam, sobu, ty, Wonej, sebi
Paradigm wón | Masc | Fem | Neut |
---|---|---|---|
Animacy=Anim\|Case=Nom\|Number=Plur | Woni | | |
Animacy=Inan\|Case=Acc\|Number=Plur | je | | |
Animacy=Nhum\|Case=Acc\|Number=Sing | jeho | | |
Case=Acc\|Number=Sing | jón, jeho | ju, nju | |
Case=Acc\|Number=Plur | je | | |
Case=Dat\|Number=Sing | Jej, jeje, njej | | |
Case=Gen\|Number=Sing | njeje | | |
Case=Gen\|Number=Plur | nich | | |
Case=Ins\|Number=Plur | nimi | | |
Case=Loc\|Number=Sing | nim | nim | |
Case=Nom\|Number=Sing | wón | wona | wono, wone |
Case=Nom\|Number=Plur | wone | wone | |
Number=Sing | jón |
VERB
48 VERB tokens (6% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (46; 96%), Person=EMPTY (46; 96%), Tense=Past (46; 96%), VerbForm=Part (46; 96%), Number=Sing (30; 63%).
VERB tokens may have the following values of Gender:
Fem (12; 25% of non-empty Gender): dodźeržała, eksistowali, kontrolowali, móhła, předstaja, přeměniła, přełožili, přistupiła, rostła, stabilizowała
Masc (30; 63% of non-empty Gender): přewzali, wužiwali, započał, ilustrował, mał, mjenował, měł, nastał, poradźił, poznamjenili
Neut (6; 13% of non-empty Gender): móhli, poradźiło, předstajili, stali, stało, wočakowało
EMPTY (774): ma, leži, móže, wobsahuje, móžeš, su, hlej, maja, rěči, běchu
Paradigm předstajić | Masc | Fem | Neut |
---|---|---|---|
Animacy=Inan\|Mood=Ind\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin | předstaja | | |
Mood=Ind\|Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin | předstaja | | |
Number=Plur\|Tense=Past\|VerbForm=Part\|Voice=Act | předstajili |
NUM
36 NUM tokens (9% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (35; 97%).
NUM tokens may have the following values of Gender:
Fem (12; 33% of non-empty Gender): jedna, jednu, štyri, dwaj, dwě, dwěmaj, miliardow, woběmaj, štyrjoch
Masc (20; 56% of non-empty Gender): jedyn, dwaj, Mio, dweju, jedneho, jedny, traje, štyrjoch
Neut (4; 11% of non-empty Gender): dwěmaj, jednym
EMPTY (346): 2, 1, 6, 4, 3, 5, 7, I, 000, 10
Paradigm jedyn | Masc | Fem | Neut |
---|---|---|---|
Animacy=Anim\|Case=Nom | jedny, jedyn | | |
Animacy=Inan\|Case=Acc | jedyn | | |
Animacy=Inan\|Case=Gen | jedneho | | |
Animacy=Inan\|Case=Nom | jedyn | | |
Case=Acc | jedyn | jednu | |
Case=Loc | jednym | | |
Case=Nom | jedyn | jedna |
AUX
2 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (2; 100%), Number=Sing (2; 100%), Person=EMPTY (2; 100%), Tense=Past (2; 100%), VerbForm=Part (2; 100%), Voice=Act (2; 100%).
AUX tokens may have the following values of Gender:
Fem (1; 50% of non-empty Gender): była
Masc (1; 50% of non-empty Gender): był
EMPTY (287): je, su, bu, bě, buchu, by, njeje, njejsu, běchu, buštej
Paradigm być | Masc | Fem |
---|---|---|
 | był | była |
ADV
1 ADV token (0% of all ADV tokens) has a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Degree=Pos (1; 100%), PronType=EMPTY (1; 100%).
ADV tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): wuchodne
EMPTY (534): tež, tak, hišće, zwjetša, hač, něhdźe, hižo, tu, wjace, najprjedy
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (1052; 96%),
NOUN –[det]–> DET (170; 79%),
NOUN –[conj]–> NOUN (161; 68%),
ADJ –[nsubj]–> NOUN (75; 89%),
ADJ –[conj]–> ADJ (62; 97%),
PROPN –[conj]–> PROPN (52; 59%),
PROPN –[flat]–> PROPN (52; 73%),
PROPN –[amod]–> ADJ (41; 95%),
PROPN –[nmod]–> NOUN (22; 67%),
ADJ –[nsubj]–> DET (21; 95%).
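Agreement counts like those in the list above can be gathered by walking each dependency edge and comparing the Gender values at both ends. The sketch below is one possible way to do it, not the script behind this page. It assumes the percentage is taken over edges where both parent and child carry a Gender value, which this page does not state explicitly. The sentence and the name `agreement_counts` are hypothetical.

```python
from collections import Counter

# Toy sentence as (id, form, UPOS, Gender, head, deprel); None means no Gender.
SENTENCE = [
    (1, "serbska", "ADJ", "Fem", 2, "amod"),
    (2, "rěč", "NOUN", "Fem", 0, "root"),
    (3, "a", "CCONJ", None, 4, "cc"),
    (4, "pismo", "NOUN", "Neut", 2, "conj"),
]


def agreement_counts(sentence):
    """Count edges whose parent and child both have Gender, and those that agree."""
    by_id = {tok[0]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for _id, _form, upos, gender, head, deprel in sentence:
        parent = by_id.get(head)  # None for the root (head 0)
        if parent is None or gender is None or parent[3] is None:
            continue
        key = f"{parent[2]} -[{deprel}]-> {upos}"
        total[key] += 1
        if gender == parent[3]:
            agree[key] += 1
    return agree, total


agree, total = agreement_counts(SENTENCE)
for key, count in total.most_common():
    print(f"{key} ({agree[key]}; {100 * agree[key] / count:.0f}%)")
```

For a whole treebank, the two counters would be accumulated over all sentences before the relations are ranked by their agreement count.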