
This page pertains to UD version 2.

Treebank Statistics: UD_Croatian-SET: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

This is a layered feature with the following layers: Gender, Gender[psor].
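As a minimal sketch of what a layered feature looks like in practice, the snippet below parses a CoNLL-U FEATS string and separates a feature name such as `Gender[psor]` into its base name and layer. The helper names `parse_feats` and `split_layer` are hypothetical, and the example FEATS string is made up for illustration rather than taken from the treebank.

```python
from typing import Dict, Optional, Tuple


def parse_feats(feats: str) -> Dict[str, str]:
    """Parse a FEATS string such as 'Case=Nom|Gender=Masc' into a dict."""
    if feats in ("_", ""):
        return {}  # '_' marks an empty FEATS column in CoNLL-U
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def split_layer(name: str) -> Tuple[str, Optional[str]]:
    """Split 'Gender[psor]' into ('Gender', 'psor'); plain names get layer None."""
    if name.endswith("]") and "[" in name:
        base, layer = name[:-1].split("[", 1)
        return base, layer
    return name, None


# Made-up FEATS value carrying both layers of Gender.
example = parse_feats("Gender=Masc|Gender[psor]=Fem|Number=Sing")
for name, value in example.items():
    print(split_layer(name), value)
```

Keeping the layer separate from the base name makes it possible to count `Gender` and `Gender[psor]` independently, which is how the statistics on this page treat them.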

100308 tokens (50%) have a non-empty value of Gender. 32347 types (91%) occur at least once with a non-empty value of Gender. 15628 lemmas (84%) occur at least once with a non-empty value of Gender. The feature is used with 8 part-of-speech tags: NOUN (48386; 24% instances), ADJ (22932; 11% instances), PROPN (12825; 6% instances), DET (7332; 4% instances), VERB (6090; 3% instances), PRON (1517; 1% instances), AUX (615; 0% instances), NUM (611; 0% instances).
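Counts of this kind can be reproduced from a CoNLL-U file along the following lines. This is only a sketch under stated assumptions: the function name `gender_counts` is hypothetical, the three-token sample is invented rather than taken from UD_Croatian-SET, and only the plain `Gender` layer is counted (not `Gender[psor]`).

```python
from collections import Counter

# Invented three-token sentence in CoNLL-U column order
# (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
ROWS = [
    ("1", "Velika", "velik", "ADJ", "_", "Case=Nom|Gender=Fem|Number=Sing", "2", "amod", "_", "_"),
    ("2", "kuna", "kuna", "NOUN", "_", "Case=Nom|Gender=Fem|Number=Sing", "3", "nsubj", "_", "_"),
    ("3", "pada", "padati", "VERB", "_", "Number=Sing|Person=3", "0", "root", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def gender_counts(conllu: str):
    """Return (total tokens, Counter of tokens with Gender per UPOS tag)."""
    total = 0
    with_gender = Counter()
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        total += 1
        # 'Gender=' does not match 'Gender[psor]=', so only the plain layer counts.
        if any(feat.startswith("Gender=") for feat in cols[5].split("|")):
            with_gender[cols[3]] += 1
    return total, with_gender


total, per_upos = gender_counts(SAMPLE)
print(total, dict(per_upos))
```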

NOUN

48386 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (34528; 71%).

NOUN tokens may have the following values of Gender:

Paradigm kuna (columns: Masc, Fem; forms are listed in column order, empty cells are not marked)
Case=Acc|Number=Sing: kunu
Case=Acc|Number=Plur: kune
Case=Gen|Number=Sing: kune
Case=Gen|Number=Plur: kuna / kuna
Case=Nom|Number=Plur: kune

Gender seems to be a lexical feature of NOUN: 99% of lemmas (6346) occur with only one value of Gender.
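The lexical-feature test can be sketched as follows: collect the set of Gender values seen with each lemma and count the lemmas whose set has exactly one member. The function name `single_gender_share` is hypothetical, and the (lemma, Gender) pairs are illustrative; only the fact that kuna occurs with both Masc and Fem comes from the paradigm above.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas with exactly one Gender value, lemmas seen with Gender)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Illustrative (lemma, Gender) observations, one per token.
observations = [
    ("kuna", "Fem"),
    ("kuna", "Masc"),
    ("grad", "Masc"),
    ("grad", "Masc"),
    ("more", "Neut"),
]
single, total = single_gender_share(observations)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```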

ADJ

22932 ADJ tokens (95% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (21824; 95%), Definite=Def (20684; 90%), Number=Sing (15101; 66%).

ADJ tokens may have the following values of Gender:

Paradigm velik (columns: Masc, Fem, Neut; forms are listed in column order, empty cells are not marked)
Animacy=Inan|Case=Acc|Definite=Def|Degree=Pos|Number=Sing: veliki
Animacy=Inan|Case=Acc|Definite=Def|Degree=Cmp|Number=Sing: veći
Animacy=Inan|Case=Acc|Definite=Def|Degree=Sup|Number=Sing: najveći
Animacy=Inan|Case=Acc|Definite=Ind|Degree=Pos|Number=Sing: velik, veći
Case=Acc|Definite=Def|Degree=Pos|Number=Sing: veliku
Case=Acc|Definite=Def|Degree=Pos|Number=Plur: velike / velike / veća
Case=Acc|Definite=Def|Degree=Cmp|Number=Sing: veću / veće
Case=Acc|Definite=Def|Degree=Cmp|Number=Plur: veće / veće
Case=Acc|Definite=Def|Degree=Sup|Number=Sing: najveću / najveće
Case=Acc|Definite=Def|Degree=Sup|Number=Plur: najveće / najveća
Case=Dat|Definite=Def|Degree=Pos|Number=Sing: velikom / velikoj
Case=Dat|Definite=Def|Degree=Pos|Number=Plur: velikim
Case=Dat|Definite=Def|Degree=Cmp|Number=Sing: većoj
Case=Dat|Definite=Def|Degree=Sup|Number=Sing: najvećem / najvećim
Case=Dat|Definite=Def|Degree=Sup|Number=Plur: najvećim
Case=Gen|Definite=Def|Degree=Pos|Number=Sing: velikog, velika, velikoga / velike / velikog, najvećeg
Case=Gen|Definite=Def|Degree=Pos|Number=Plur: velikih / velikih / velikih
Case=Gen|Definite=Def|Degree=Cmp|Number=Sing: većeg / veće / većeg
Case=Gen|Definite=Def|Degree=Cmp|Number=Plur: većih / većih
Case=Gen|Definite=Def|Degree=Sup|Number=Sing: najvećeg, najveća / najveće
Case=Gen|Definite=Def|Degree=Sup|Number=Plur: najvećih / najvećih / najvećih
Case=Ins|Definite=Def|Degree=Pos|Number=Sing: velikim / velikom / najvećim
Case=Ins|Definite=Def|Degree=Pos|Number=Plur: velikim / velikim
Case=Ins|Definite=Def|Degree=Cmp|Number=Sing: većim / većom
Case=Ins|Definite=Def|Degree=Cmp|Number=Plur: većim
Case=Ins|Definite=Def|Degree=Sup|Number=Sing: najvećim / najvećom
Case=Ins|Definite=Def|Degree=Sup|Number=Plur: najvećim / najvećima
Case=Loc|Definite=Def|Degree=Pos|Number=Sing: velikom / velikoj / velikom
Case=Loc|Definite=Def|Degree=Pos|Number=Plur: velikim / velikim / velikim
Case=Loc|Definite=Def|Degree=Cmp|Number=Sing: većem / većoj
Case=Loc|Definite=Def|Degree=Sup|Number=Sing: najvećem / najvećoj
Case=Loc|Definite=Def|Degree=Sup|Number=Plur: najvećim
Case=Nom|Definite=Def|Degree=Pos|Number=Sing: veliki / velika / veliko
Case=Nom|Definite=Def|Degree=Pos|Number=Plur: veliki / velike / velika
Case=Nom|Definite=Def|Degree=Cmp|Number=Sing: veći / veća / veće
Case=Nom|Definite=Def|Degree=Cmp|Number=Plur: veća
Case=Nom|Definite=Def|Degree=Sup|Number=Sing: najveći / najveća / najveće
Case=Nom|Definite=Def|Degree=Sup|Number=Plur: najveći / najveće / najveća
Case=Nom|Definite=Ind|Degree=Pos|Number=Sing: velik

PROPN

12825 PROPN tokens (100% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (12511; 98%), Case=Nom (6511; 51%).

PROPN tokens may have the following values of Gender:

Paradigm BiH (columns: Masc, Fem; forms are listed in column order, empty cells are not marked)
Case=Acc: BiH
Case=Dat: BiH
Case=Gen: BiH / BiH, BIH
Case=Loc: BiH / BiH
Case=Nom: BiH

Gender seems to be a lexical feature of PROPN: 98% of lemmas (4245) occur with only one value of Gender.

DET

7332 DET tokens (95% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (6503; 89%), Person=EMPTY (6503; 89%), Poss=EMPTY (5750; 78%), Number=Sing (5100; 70%).

DET tokens may have the following values of Gender:

Paradigm koji                      Masc                Fem            Neut
Animacy=Anim|Case=Acc|Number=Sing  kojeg, kojega       -              -
Animacy=Inan|Case=Acc|Number=Sing  koji                -              -
Case=Acc|Number=Sing               -                   koju           koje
Case=Acc|Number=Plur               koje                koje           koja
Case=Dat|Number=Sing               kojemu, kojem       kojoj          kojem, kojemu
Case=Dat|Number=Plur               kojima              kojima         kojima
Case=Gen|Number=Sing               kojeg, kojega       koje           kojeg, kojega
Case=Gen|Number=Plur               kojih               kojih          kojih
Case=Ins|Number=Sing               kojim               kojom          kojim
Case=Ins|Number=Plur               kojima              kojima         kojima
Case=Loc|Number=Sing               kojem, kojemu, kom  kojoj          kojem, kojemu
Case=Loc|Number=Plur               kojima, kojim       kojima, kojim  kojima
Case=Nom|Number=Sing               koji                koja           koje
Case=Nom|Number=Plur               koji                koje           koja

VERB

6090 VERB tokens (35% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (6090; 100%), Person=EMPTY (6090; 100%), Tense=Past (6090; 100%), VerbForm=Part (6090; 100%), Voice=Act (6090; 100%), Number=Sing (4412; 72%).

VERB tokens may have the following values of Gender:

Paradigm moći  Masc   Fem    Neut
Number=Sing    mogao  mogla  moglo
Number=Plur    mogli  mogle  mogla

PRON

1517 PRON tokens (29% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (1517; 100%), Person=EMPTY (815; 54%), Number=EMPTY (814; 54%), Case=Nom (772; 51%).

PRON tokens may have the following values of Gender:

Paradigm on  Masc         Fem          Neut
Case=Acc     ga, njega    je, ju, nju  ga, nj, njega, ono
Case=Dat     mu, njemu    joj, njoj    -
Case=Gen     njega        nje, je      -
Case=Ins     njim, njime  njom, njome  njime, njim
Case=Loc     njemu        njoj         -
Case=Nom     on           ona          ono

AUX

615 AUX tokens (5% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (615; 100%), Person=EMPTY (615; 100%), Tense=Past (615; 100%), VerbForm=Part (615; 100%), Number=Sing (489; 80%).

AUX tokens may have the following values of Gender:

Paradigm biti  Masc  Fem   Neut
Number=Sing    bio   bila  bilo
Number=Plur    bili  bile  bila

NUM

611 NUM tokens (19% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (576; 94%), Number=Sing (433; 71%), Case=Nom (322; 53%).

NUM tokens may have the following values of Gender:

Paradigm jedan                     Masc             Fem     Neut
Animacy=Anim|Case=Acc|Number=Sing  jednog           -       -
Animacy=Inan|Case=Acc|Number=Sing  jedan            -       -
Case=Acc|Number=Sing               -                jednu   jedno
Case=Dat|Number=Sing               -                jednoj  -
Case=Gen|Number=Sing               jednog           jedne   jednog, jednoga
Case=Ins|Number=Sing               jednim           jednom  -
Case=Loc|Number=Sing               jednom, jednome  jednoj  jednom
Case=Nom|Number=Sing               jedan            jedna   jedno
Case=Nom|Number=Plur               jedni            -       -

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (17174; 95%), NOUN –[det]–> DET (3190; 98%), PROPN –[flat]–> PROPN (2331; 97%), ADJ –[nsubj]–> NOUN (1448; 93%), NOUN –[flat]–> PROPN (1334; 75%), VERB –[nsubj]–> PROPN (1131; 57%), ADJ –[conj]–> ADJ (774; 94%), PROPN –[conj]–> PROPN (726; 75%), NOUN –[acl]–> ADJ (677; 85%), VERB –[conj]–> VERB (413; 54%).
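The agreement counts above could be gathered roughly as in the sketch below. The page does not say what the percentages are relative to; this sketch assumes they are the share of agreeing pairs among all parent-child pairs of a given type in which both nodes carry a Gender value. The function name `agreement_by_relation`, the token dictionary layout, and the sample sentence are all hypothetical.

```python
from collections import Counter


def agreement_by_relation(sentence):
    """Count Gender agreement per (parent UPOS, deprel, child UPOS) triple."""
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # head 0 (root) has no parent token
        if parent is None or not child["gender"] or not parent["gender"]:
            continue  # assumption: only pairs where both nodes have Gender count
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, total


# Invented sentence: an adjective modifying a noun that is the subject of a verb.
sentence = [
    {"id": 1, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 2, "head": 3, "upos": "NOUN", "deprel": "nsubj", "gender": "Fem"},
    {"id": 3, "head": 0, "upos": "VERB", "deprel": "root", "gender": None},
]
agree, total = agreement_by_relation(sentence)
for key, count in total.items():
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos}: {agree[key]} of {count} agree")
```

Summing the two counters over every sentence of a treebank would give per-relation totals comparable in shape to the list above, provided the denominator assumption holds.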