
This page pertains to UD version 2.

Treebank Statistics: UD_Swedish-Talbanken: Features: Gender

This feature is universal. It occurs with 4 different values: Com, Fem, Masc, Neut.

33614 tokens (35%) have a non-empty value of Gender. 10234 types (68%) occur at least once with a non-empty value of Gender. 6850 lemmas (67%) occur at least once with a non-empty value of Gender. The feature is used with 6 part-of-speech tags; each count is followed by its rounded share of all tokens in the treebank: NOUN (22566; 23%), PRON (4071; 4%), DET (3718; 4%), ADJ (3141; 3%), NUM (91; 0%), VERB (27; 0%).
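Counts of this kind can be reproduced from a CoNLL-U file by reading the UPOS and FEATS columns. The following is a minimal sketch, not the script that generated this page; `SAMPLE` is a hand-made three-token fragment rather than treebank data, and `parse_feats` and `gender_counts_by_upos` are hypothetical helper names.

```python
from collections import Counter

# Hand-made CoNLL-U fragment for illustration only (not taken from the treebank).
SAMPLE = """\
1\tett\ten\tDET\t_\tDefinite=Ind|Gender=Neut|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\thus\thus\tNOUN\t_\tCase=Nom|Definite=Ind|Gender=Neut|Number=Sing\t3\tnsubj\t_\t_
3\tstår\tstå\tVERB\t_\tMood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act\t0\troot\t_\t_
"""


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts_by_upos(conllu_text):
    """Return (tokens with Gender, all tokens), each counted per UPOS tag."""
    with_gender = Counter()
    total = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


with_gender, total = gender_counts_by_upos(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], total[upos])
```

Dividing each `with_gender` entry by the matching `total` entry gives the per-tag share reported in the sections below.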

NOUN

22566 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Case=Nom (21444; 95%), Number=Sing (15358; 68%), Definite=Ind (14738; 65%).

NOUN tokens may have the following values of Gender:

Paradigm äktenskap (Gender values: Neut, Com)
Case=Gen|Definite=Def|Number=Sing: äktenskapets (Neut), äktenskapens (Com)
Case=Gen|Definite=Ind|Number=Sing: äktenskaps
Case=Gen|Definite=Ind|Number=Plur: äktenskaps
Case=Nom|Definite=Def|Number=Sing: äktenskapet (Neut), äktenskapen (Com)
Case=Nom|Definite=Ind|Number=Sing: äktenskap
Case=Nom|Definite=Ind|Number=Plur: äktenskap

Gender seems to be a lexical feature of NOUN. 99% of lemmas (5836) occur with only one value of Gender.
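One way to check whether a feature behaves lexically is to collect, for each lemma, the set of Gender values it occurs with and to count the lemmas whose set has exactly one member. The sketch below assumes that reading of the statistic; `OBSERVATIONS` is invented illustration data and `single_gender_share` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (lemma, UPOS, Gender) observations, for illustration only.
OBSERVATIONS = [
    ("hus", "NOUN", "Neut"),
    ("hus", "NOUN", "Neut"),
    ("bil", "NOUN", "Com"),
    ("äktenskap", "NOUN", "Neut"),
    ("äktenskap", "NOUN", "Com"),
    ("stor", "ADJ", "Com"),
]


def single_gender_share(observations, upos):
    """Return (lemmas with one Gender value, all lemmas, share) for one UPOS tag."""
    values = defaultdict(set)
    for lemma, tag, gender in observations:
        if tag == upos:
            values[lemma].add(gender)
    if not values:
        return 0, 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values), single / len(values)


single, total, share = single_gender_share(OBSERVATIONS, "NOUN")
print(f"{single} of {total} NOUN lemmas have a single Gender value ({share:.0%})")
```

A share close to 100% suggests that Gender is fixed per lemma rather than chosen by inflection.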

PRON

4071 PRON tokens (61% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Poss=EMPTY (3706; 91%), Number=Sing (3653; 90%), Definite=Def (2942; 72%), PronType=Prs (2875; 71%), Case=EMPTY (2391; 59%).

PRON tokens may have the following values of Gender:

Paradigm denna (Gender values: Masc, Neut, Com)
PronType=Dem: denne (Masc), detta (Neut), denna (Com)
PronType=Prs: detta

DET

3718 DET tokens (75% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (3718; 100%), PronType=Art (3222; 87%), Definite=Ind (2325; 63%).

DET tokens may have the following values of Gender:

Paradigm en (Gender values: Neut, Com)
Definite=Def|PronType=Art: det (Neut), den (Com)
Definite=Ind: en
Definite=Ind|PronType=Art: ett (Neut), en (Com)

ADJ

3141 ADJ tokens (37% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (3136; 100%), Case=Nom (3130; 100%), Definite=Ind (3030; 96%), Tense=EMPTY (2578; 82%), VerbForm=EMPTY (2578; 82%), Degree=Pos (2573; 82%).

ADJ tokens may have the following values of Gender:

Paradigm stor (Gender values: Masc, Neut, Com)
Definite=Def|Degree=Sup: störste
Definite=Ind|Degree=Pos: stort, stor

NUM

91 NUM tokens (5% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: Case=Nom (91; 100%), NumType=Card (91; 100%).

NUM tokens may have the following values of Gender:

Paradigm en (Gender values: Neut, Com)
ett (Neut), en (Com)

VERB

27 VERB tokens (0% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (27; 100%), Tense=Past (27; 100%), VerbForm=Part (27; 100%), Voice=EMPTY (27; 100%).

VERB tokens may have the following values of Gender:

Gender seems to be a lexical feature of VERB. 100% of lemmas (22) occur with only one value of Gender.

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (3463; 76%), NOUN –[nmod]–> NOUN (1576; 54%), NOUN –[conj]–> NOUN (1387; 67%), NOUN –[nmod:poss]–> NOUN (552; 60%), NOUN –[nmod:poss]–> PRON (339; 51%), NOUN –[appos]–> NOUN (184; 64%), NOUN –[nsubj]–> NOUN (152; 56%), ADJ –[conj]–> ADJ (125; 79%), ADJ –[expl]–> PRON (112; 92%), ADJ –[nsubj]–> PRON (93; 52%).
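Agreement counts like these can be approximated by comparing the Gender value of each node with that of its head. The sketch below handles a single sentence and rests on one assumption: only parent-child pairs in which both nodes carry Gender are counted, since the page does not state which denominator its percentages use. `SAMPLE` is a hand-made sentence, and `gender_of` and `agreement_by_relation` are hypothetical helper names.

```python
from collections import Counter

# Hand-made single sentence in CoNLL-U format, for illustration only.
SAMPLE = """\
# text = ett stort hus och en bil
1\tett\ten\tDET\t_\tDefinite=Ind|Gender=Neut|Number=Sing|PronType=Art\t3\tdet\t_\t_
2\tstort\tstor\tADJ\t_\tCase=Nom|Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing\t3\tamod\t_\t_
3\thus\thus\tNOUN\t_\tCase=Nom|Definite=Ind|Gender=Neut|Number=Sing\t0\troot\t_\t_
4\toch\toch\tCCONJ\t_\t_\t6\tcc\t_\t_
5\ten\ten\tDET\t_\tDefinite=Ind|Gender=Com|Number=Sing|PronType=Art\t6\tdet\t_\t_
6\tbil\tbil\tNOUN\t_\tCase=Nom|Definite=Ind|Gender=Com|Number=Sing\t3\tconj\t_\t_
"""


def gender_of(feats):
    """Return the Gender value from a FEATS string, or None if absent."""
    for pair in feats.split("|"):
        if pair.startswith("Gender="):
            return pair.split("=", 1)[1]
    return None


def agreement_by_relation(sentence_text):
    """Count, per (parent UPOS, deprel, child UPOS), agreeing and comparable pairs."""
    tokens = {}
    for line in sentence_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tokens[int(cols[0])] = (cols[3], gender_of(cols[5]), int(cols[6]), cols[7])

    agree, both = Counter(), Counter()
    for upos, gender, head, deprel in tokens.values():
        if head == 0 or gender is None:
            continue  # the root has no parent; a child without Gender is skipped
        parent_upos, parent_gender, _, _ = tokens[head]
        if parent_gender is None:
            continue
        key = (parent_upos, deprel, upos)
        both[key] += 1
        if gender == parent_gender:
            agree[key] += 1
    return agree, both


agree, both = agreement_by_relation(SAMPLE)
for key in both:
    print(key, agree[key], both[key])
```

For a whole treebank, the same function would be applied sentence by sentence and the counters summed.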