This page pertains to UD version 2.

Treebank Statistics: UD_Upper_Sorbian-UFAL: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

This is a layered feature with the following layers: Gender, Gender[psor].

4930 tokens (44%) have a non-empty value of Gender. 3298 types (76%) occur at least once with a non-empty value of Gender. 2112 lemmas (69%) occur at least once with a non-empty value of Gender. The feature is used with 9 part-of-speech tags: NOUN (2527; 23% instances), ADJ (1384; 12% instances), PROPN (539; 5% instances), DET (270; 2% instances), PRON (123; 1% instances), VERB (48; 0% instances), NUM (36; 0% instances), AUX (2; 0% instances), ADV (1; 0% instances).
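Per-tag counts of this kind can be recomputed from the treebank's CoNLL-U file. The following is a minimal sketch, assuming the standard 10-column CoNLL-U layout (UPOS in column 4, FEATS in column 6). The helper names and the three-token sample are illustrative and not taken from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Count, per UPOS tag, the tokens that carry a non-empty Gender value."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2), empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if "Gender" in parse_feats(cols[5]):
            counts[cols[3]] += 1
    return counts


# Made-up sample for illustration only (not a sentence from the treebank).
SAMPLE = "\n".join([
    "# text = serbska rěč .",
    "1\tserbska\tserbski\tADJ\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tamod\t_\t_",
    "2\trěč\trěč\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])

print(dict(gender_counts_by_upos(SAMPLE)))  # {'ADJ': 1, 'NOUN': 1}
```

Dividing each count by the total number of tokens gives a percentage comparable to the ones listed above.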

NOUN

2527 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (1688; 67%), Animacy=EMPTY (1386; 55%).
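A co-occurrence list like this can be sketched as follows, assuming that `EMPTY` stands for "the feature is not set on the token" (the page does not define it). The function name and the sample feature dicts are hypothetical.

```python
from collections import Counter


def cooccurring_values(feats_list, target="Gender"):
    """Count other feature values on tokens that have `target` set.

    A feature that is absent from a token is counted as '<name>=EMPTY'.
    Returns the counts and the number of tokens that carry `target`.
    """
    with_target = [f for f in feats_list if target in f]
    other_names = {name for f in with_target for name in f if name != target}
    counts = Counter()
    for feats in with_target:
        for name in other_names:
            counts[f"{name}={feats.get(name, 'EMPTY')}"] += 1
    return counts, len(with_target)


# Hypothetical feature dicts for four NOUN tokens.
NOUN_FEATS = [
    {"Gender": "Fem", "Number": "Sing", "Case": "Nom"},
    {"Gender": "Masc", "Number": "Sing", "Case": "Acc", "Animacy": "Inan"},
    {"Gender": "Neut", "Number": "Plur", "Case": "Nom"},
    {"Number": "Sing"},  # no Gender, so this token is ignored
]

counts, total = cooccurring_values(NOUN_FEATS)
print(counts["Number=Sing"], counts["Animacy=EMPTY"], total)  # 2 2 3
```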

NOUN tokens may have the following values of Gender:

Paradigm dataja (Gender columns: Fem, Neut)
Case=Acc: dataje, daty / daty
Case=Gen: datow

Gender seems to be a lexical feature of NOUN. 99% of lemmas (1012) occur with only one value of Gender.
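The "lexical feature" check can be illustrated by grouping the observed Gender values by lemma and counting the lemmas that have exactly one value. This is a minimal sketch under that assumption. The function name and the token list are hypothetical, although the Fem/Neut variation for dataja mirrors the paradigm above.

```python
from collections import defaultdict


def single_value_lemmas(tokens, feature="Gender"):
    """Return (single, total): lemmas seen with `feature`, and how many of
    them occur with exactly one value of it."""
    values = defaultdict(set)
    for lemma, feats in tokens:
        if feature in feats:
            values[lemma].add(feats[feature])
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical (lemma, feats) pairs.
TOKENS = [
    ("rěč", {"Gender": "Fem"}),
    ("rěč", {"Gender": "Fem"}),
    ("dataja", {"Gender": "Fem"}),
    ("dataja", {"Gender": "Neut"}),
    ("institut", {"Gender": "Masc"}),
    ("a", {}),  # no Gender, so this lemma is not counted
]

single, total = single_value_lemmas(TOKENS)
print(f"{single} of {total} lemmas occur with only one value")  # 2 of 3 ...
```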

ADJ

1384 ADJ tokens (97% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Animacy=EMPTY (1224; 88%), Voice=EMPTY (1203; 87%), VerbForm=EMPTY (1202; 87%), Number=Sing (886; 64%), Degree=EMPTY (873; 63%).

ADJ tokens may have the following values of Gender:

Paradigm serbski (Gender columns: Masc, Fem, Neut)
Animacy=Inan|Case=Acc|Degree=Pos|Number=Dual: serbskej
Case=Acc|Degree=Pos|Number=Sing: serbski / serbske
Case=Acc|Number=Sing: serbsku
Case=Dat|Number=Sing: serbskemu
Case=Dat|Number=Plur: serbskim
Case=Gen|Degree=Pos|Number=Sing: serbskeje
Case=Gen|Number=Sing: Serbskeho / serbskeje
Case=Gen|Number=Plur: serbskich
Case=Ins|Number=Sing: serbskej, serbsku
Case=Loc|Degree=Pos|Number=Sing: Serbskim
Case=Loc|Number=Sing: Serbskim / serbskej
Case=Nom|Degree=Pos|Number=Sing: Serbski, SERBSKI / serbska
Case=Nom|Number=Sing: serbska
Case=Nom|Number=Plur: serbske

PROPN

539 PROPN tokens (90% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (484; 90%).

PROPN tokens may have the following values of Gender:

Paradigm Institut (Gender columns: Masc, Neut)
Animacy=Inan|Case=Acc: Institut
Case=Nom: Institut

Gender seems to be a lexical feature of PROPN. 99% of lemmas (319) occur with only one value of Gender.

DET

270 DET tokens (83% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Abbr=EMPTY (235; 87%), Number[psor]=EMPTY (226; 84%), Person=EMPTY (226; 84%), Poss=EMPTY (197; 73%), Animacy=EMPTY (180; 67%), Number=Sing (164; 61%).

DET tokens may have the following values of Gender:

Paradigm kotryž (Gender columns: Masc, Fem, Neut)
Animacy=Anim|Case=Dat|Number=Plur: kotrymž
Animacy=Anim|Case=Nom|Number=Sing: kotryž
Animacy=Anim|Case=Nom|Number=Plur: kotřiž
Animacy=Inan|Case=Acc|Number=Sing: kotryž
Animacy=Inan|Case=Gen|Number=Plur: kotrychž
Animacy=Inan|Case=Loc|Number=Plur: kotrychž
Animacy=Inan|Case=Nom|Number=Sing: kotryž, kotrež
Animacy=Inan|Case=Nom|Number=Plur: kotrež
Case=Gen|Number=Sing: kotrehož / kotrejež
Case=Ins|Number=Plur: kotrymiž
Case=Loc|Number=Sing: kotrymž / kotrejž
Case=Loc|Number=Plur: kotrychž / kotrychž
Case=Nom|Number=Sing: kotryž / kotraž / kotrež
Case=Nom|Number=Dual: kotrejž
Case=Nom|Number=Plur: kotrež / kotrež / kotrež

PRON

123 PRON tokens (36% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (123; 100%), Number=Sing (112; 91%), Person=EMPTY (73; 59%).

PRON tokens may have the following values of Gender:

Paradigm wón (Gender columns: Masc, Fem, Neut)
Animacy=Anim|Case=Nom|Number=Plur: Woni
Animacy=Inan|Case=Acc|Number=Plur: je
Animacy=Nhum|Case=Acc|Number=Sing: jeho
Case=Acc|Number=Sing: jón, jeho / ju, nju
Case=Acc|Number=Plur: je
Case=Dat|Number=Sing: Jej, jeje, njej
Case=Gen|Number=Sing: njeje
Case=Gen|Number=Plur: nich
Case=Ins|Number=Plur: nimi
Case=Loc|Number=Sing: nim / nim
Case=Nom|Number=Sing: wón / wona / wono, wone
Case=Nom|Number=Plur: wone / wone
Number=Sing: jón

VERB

48 VERB tokens (6% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (46; 96%), Person=EMPTY (46; 96%), Tense=Past (46; 96%), VerbForm=Part (46; 96%), Number=Sing (30; 63%).

VERB tokens may have the following values of Gender:

Paradigm předstajić (Gender columns: Masc, Fem, Neut)
Animacy=Inan|Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin: předstaja
Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin: předstaja
Number=Plur|Tense=Past|VerbForm=Part|Voice=Act: předstajili

NUM

36 NUM tokens (9% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (35; 97%).

NUM tokens may have the following values of Gender:

Paradigm jedyn (Gender columns: Masc, Fem, Neut)
Animacy=Anim|Case=Nom: jedny, jedyn
Animacy=Inan|Case=Acc: jedyn
Animacy=Inan|Case=Gen: jedneho
Animacy=Inan|Case=Nom: jedyn
Case=Acc: jedyn / jednu
Case=Loc: jednym
Case=Nom: jedyn / jedna

AUX

2 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (2; 100%), Number=Sing (2; 100%), Person=EMPTY (2; 100%), Tense=Past (2; 100%), VerbForm=Part (2; 100%), Voice=Act (2; 100%).

AUX tokens may have the following values of Gender:

Paradigm być (Gender columns: Masc, Fem)
był / była

ADV

1 ADV token (0% of all ADV tokens) has a non-empty value of Gender.

The most frequent other feature values with which ADV and Gender co-occurred: Degree=Pos (1; 100%), PronType=EMPTY (1; 100%).

ADV tokens may have the following values of Gender:

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (1052; 96%), NOUN –[det]–> DET (170; 79%), NOUN –[conj]–> NOUN (161; 68%), ADJ –[nsubj]–> NOUN (75; 89%), ADJ –[conj]–> ADJ (62; 97%), PROPN –[conj]–> PROPN (52; 59%), PROPN –[flat]–> PROPN (52; 73%), PROPN –[amod]–> ADJ (41; 95%), PROPN –[nmod]–> NOUN (22; 67%), ADJ –[nsubj]–> DET (21; 95%).
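Agreement counts of this shape can be approximated by walking each dependency edge and comparing the Gender values of the parent and the child. The sketch below assumes that the percentage is taken over edges where both nodes carry Gender, which the page does not state. The token dicts, the function name, and the sample sentence (with a deliberately mismatched determiner) are all hypothetical.

```python
from collections import Counter


def agreement_by_relation(sentences, feature="Gender"):
    """For each (parent UPOS, deprel, child UPOS) triple, count the edges where
    both nodes carry `feature` and the edges where the two values are equal."""
    agree, both = Counter(), Counter()
    for sent in sentences:
        by_id = {tok["id"]: tok for tok in sent}
        for child in sent:
            parent = by_id.get(child["head"])
            if parent is None:  # root (head 0) or head outside the sentence
                continue
            parent_val = parent["feats"].get(feature)
            child_val = child["feats"].get(feature)
            if parent_val is None or child_val is None:
                continue
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if parent_val == child_val:
                agree[key] += 1
    return agree, both


# Hypothetical sentence: the determiner's Gender is mismatched on purpose.
SENTENCES = [[
    {"id": 1, "upos": "DET", "head": 3, "deprel": "det", "feats": {"Gender": "Masc"}},
    {"id": 2, "upos": "ADJ", "head": 3, "deprel": "amod", "feats": {"Gender": "Fem"}},
    {"id": 3, "upos": "NOUN", "head": 0, "deprel": "root", "feats": {"Gender": "Fem"}},
]]

agree, both = agreement_by_relation(SENTENCES)
for key in both:
    parent_upos, deprel, child_upos = key
    share = 100 * agree[key] / both[key]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({agree[key]}; {share:.0f}%)")
```

On the sample, the amod edge agrees (1; 100%) while the det edge does not (0; 0%).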