
This page pertains to UD version 2.

Treebank Statistics: UD_Swedish-Talbanken: Features: Gender

This feature is universal. It occurs with 4 different values: Com, Fem, Masc, Neut.

33306 tokens (34%) have a non-empty value of Gender. 10153 types (67%) occur at least once with a non-empty value of Gender. 6775 lemmas (65%) occur at least once with a non-empty value of Gender. The feature is used with 6 part-of-speech tags: NOUN (22562; 23% instances), PRON (3973; 4% instances), DET (3713; 4% instances), ADJ (2940; 3% instances), NUM (91; 0% instances), VERB (27; 0% instances).
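Counts of this kind can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: it uses a tiny hand-made fragment (`SAMPLE`, an assumption for illustration) rather than the real data, and the helper names `parse_feats`, `read_tokens` and `gender_counts` are hypothetical.

```python
from collections import Counter

# Tiny hand-made CoNLL-U fragment (illustrative only, not taken from the treebank).
SAMPLE = """\
1\tett\ten\tDET\t_\tDefinite=Ind|Gender=Neut|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\thus\thus\tNOUN\t_\tCase=Nom|Definite=Ind|Gender=Neut|Number=Sing\t0\troot\t_\t_
3\toch\toch\tCCONJ\t_\t_\t4\tcc\t_\t_
4\tbilen\tbil\tNOUN\t_\tCase=Nom|Definite=Def|Gender=Com|Number=Sing\t2\tconj\t_\t_
"""


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_tokens(text):
    """Yield (form, lemma, upos, feats) for each regular token line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_counts(text):
    """Return the total token count and a per-UPOS count of tokens with Gender."""
    total = 0
    by_upos = Counter()
    for _form, _lemma, upos, feats in read_tokens(text):
        total += 1
        if "Gender" in feats:
            by_upos[upos] += 1
    return total, by_upos


total, by_upos = gender_counts(SAMPLE)
with_gender = sum(by_upos.values())
print(f"{with_gender} of {total} tokens have a non-empty value of Gender")
for upos, n in by_upos.most_common():
    print(upos, n)
```

To apply the sketch to real data, read the contents of a treebank `.conllu` file into a string and pass it to `gender_counts` in place of `SAMPLE`.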

NOUN

22562 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Case=Nom (21440; 95%), Number=Sing (15353; 68%), Definite=Ind (14734; 65%).

NOUN tokens may have the following values of Gender:

Paradigm äktenskap (columns: Neut, Com)
Case=Gen|Definite=Def|Number=Sing: äktenskapets (Neut), äktenskapens (Com)
Case=Gen|Definite=Ind|Number=Sing: äktenskaps (Neut)
Case=Gen|Definite=Ind|Number=Plur: äktenskaps (Neut)
Case=Nom|Definite=Def|Number=Sing: äktenskapet (Neut), äktenskapen (Com)
Case=Nom|Definite=Ind|Number=Sing: äktenskap (Neut)
Case=Nom|Definite=Ind|Number=Plur: äktenskap (Neut)

Gender seems to be a lexical feature of NOUN: 99% of lemmas (5836) occur with only one value of Gender.
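The lexicality check amounts to collecting, for each lemma, the set of Gender values it occurs with and counting the lemmas whose set has exactly one member. A minimal sketch under that assumption follows; the observation list is hypothetical (only äktenskap with both Neut and Com mirrors the paradigm above), and the function name `single_valued_share` is made up for illustration.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender) observations for NOUN tokens.
OBSERVATIONS = [
    ("hus", "Neut"), ("hus", "Neut"),
    ("bil", "Com"),
    ("äktenskap", "Neut"), ("äktenskap", "Com"),
]


def single_valued_share(observations):
    """Return (lemmas with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, total = single_valued_share(OBSERVATIONS)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```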

PRON

3973 PRON tokens (59% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Poss=EMPTY (3607; 91%), Number=Sing (3555; 89%), Definite=Def (2946; 74%), PronType=Prs (2848; 72%), Case=EMPTY (2292; 58%).

PRON tokens may have the following values of Gender:

Paradigm den (columns: Neut, Com)
ExtPos=ADV|PronType=Prs: det (Neut)
PronType=Dem: det (Neut), den (Com)
PronType=Prs: det (Neut), den (Com)

DET

3713 DET tokens (76% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (3713; 100%), PronType=Art (3209; 86%), Definite=Ind (2320; 62%).

DET tokens may have the following values of Gender:

Paradigm en (columns: Neut, Com)
Definite=Def|PronType=Art: den (Com)
Definite=Ind: en (Com)
Definite=Ind|PronType=Art: ett (Neut), en (Com)

ADJ

2940 ADJ tokens (34% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Case=Nom (2935; 100%), Degree=Pos (2930; 100%), Number=Sing (2929; 100%), Definite=Ind (2893; 98%), Tense=EMPTY (2511; 85%), VerbForm=EMPTY (2511; 85%).

ADJ tokens may have the following values of Gender:

Paradigm stor (columns: Neut, Com)
Definite=Def|Degree=Sup: störste
Definite=Ind|Degree=Pos|Number=Sing: stort (Neut), stor (Com)

NUM

91 NUM tokens (5% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: Case=Nom (91; 100%), NumType=Card (91; 100%).

NUM tokens may have the following values of Gender:

Paradigm en (columns: Neut, Com)
ett (Neut), en (Com)

VERB

27 VERB tokens (0% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (27; 100%), Voice=Pass (27; 100%), Tense=Past (20; 74%), VerbForm=Part (20; 74%).

VERB tokens may have the following values of Gender:

Gender seems to be a lexical feature of VERB: 100% of lemmas (22) occur with only one value of Gender.

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (3477; 77%), NOUN –[nmod]–> NOUN (1706; 54%), NOUN –[conj]–> NOUN (1396; 67%), NOUN –[nmod:poss]–> NOUN (552; 60%), NOUN –[nmod:poss]–> PRON (352; 51%), NOUN –[appos]–> NOUN (180; 63%), NOUN –[nsubj]–> NOUN (152; 55%), NOUN –[obl]–> NOUN (139; 50%), ADJ –[conj]–> ADJ (106; 71%), ADJ –[expl]–> PRON (105; 85%).
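An agreement count of this kind can be approximated by walking every dependency edge and comparing the Gender value of the child with that of its parent. The sketch below is an illustration, not the script behind this page: it handles a single hand-made sentence (`SAMPLE`), the helper names are hypothetical, and it assumes that only edges where both nodes carry Gender are counted. How the percentages above are actually normalized is not stated here, so the `both` counter is just one plausible denominator.

```python
from collections import Counter

# One hand-made sentence in CoNLL-U format (illustrative only).
SAMPLE = """\
1\tett\ten\tDET\t_\tDefinite=Ind|Gender=Neut|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\thus\thus\tNOUN\t_\tCase=Nom|Definite=Ind|Gender=Neut|Number=Sing\t0\troot\t_\t_
3\toch\toch\tCCONJ\t_\t_\t5\tcc\t_\t_
4\ten\ten\tDET\t_\tDefinite=Ind|Gender=Com|Number=Sing|PronType=Art\t5\tdet\t_\t_
5\tbil\tbil\tNOUN\t_\tCase=Nom|Definite=Ind|Gender=Com|Number=Sing\t2\tconj\t_\t_
"""


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_sentence(text):
    """Map token id -> node dict for the regular tokens of one sentence."""
    nodes = {}
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip comments, blank lines, ranges and empty nodes
        nodes[int(cols[0])] = {
            "upos": cols[3],
            "gender": parse_feats(cols[5]).get("Gender"),
            "head": int(cols[6]),
            "deprel": cols[7],
        }
    return nodes


def agreement(nodes):
    """Count edges where both nodes have Gender, and those where the values match."""
    agree, both = Counter(), Counter()
    for node in nodes.values():
        parent = nodes.get(node["head"])  # None for the root (head 0)
        if parent is None or node["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], node["deprel"], node["upos"])
        both[key] += 1
        if node["gender"] == parent["gender"]:
            agree[key] += 1
    return agree, both


agree, both = agreement(read_sentence(SAMPLE))
for (parent_upos, deprel, child_upos), n in both.most_common():
    matched = agree[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos}: {matched} of {n} agree")
```

For a whole treebank, the file would first need to be split into sentences at blank lines, with the counters accumulated across all of them.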