Treebank Statistics: UD_Norwegian-Nynorsk: Features: Gender
This feature is universal.
It occurs with 4 different values: Com, Fem, Masc, Neut.
Some words have combined values of the feature; 1 combination has been observed: Fem|Masc.
95231 tokens (32%) have a non-empty value of Gender.
21495 types (69%) occur at least once with a non-empty value of Gender.
14947 lemmas (66%) occur at least once with a non-empty value of Gender.
The feature is used with 7 part-of-speech tags: NOUN (55465; 18% of instances), ADJ (15206; 5% of instances), DET (11269; 4% of instances), PRON (10383; 3% of instances), PROPN (2829; 1% of instances), NUM (75; 0% of instances), VERB (4; 0% of instances).
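Counts like the ones above can be reproduced from a treebank file by reading the UPOS and FEATS columns of each token line. The following is a minimal sketch, assuming standard tab-separated CoNLL-U input; the three-token `SAMPLE` and the helper names `parse_feats` and `gender_counts` are hypothetical and are not part of the tooling that generated this page.

```python
from collections import Counter

# Hypothetical three-token sample in CoNLL-U format (not copied from the treebank).
SAMPLE = (
    "# text = Boka er ny\n"
    "1\tBoka\tbok\tNOUN\t_\tDefinite=Def|Gender=Fem|Number=Sing\t3\tnsubj\t_\t_\n"
    "2\ter\tvere\tAUX\t_\tMood=Ind|Tense=Pres|VerbForm=Fin\t3\tcop\t_\t_\n"
    "3\tny\tny\tADJ\t_\tDefinite=Ind|Degree=Pos|Gender=Fem,Masc|Number=Sing\t0\troot\t_\t_\n"
)


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_counts(conllu_text):
    """Return (tokens per UPOS, tokens per UPOS with a non-empty Gender)."""
    total = Counter()
    with_gender = Counter()
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence boundaries) and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return total, with_gender


total, with_gender = gender_counts(SAMPLE)
for upos in sorted(total):
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]} of {total[upos]} tokens ({share:.0f}%) have Gender")
```

A combined value such as `Fem,Masc` is counted once here as a non-empty Gender; splitting it on the comma would be needed to count the individual values.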
NOUN
55465 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=EMPTY (40907; 74%), Definite=Ind (33350; 60%).
NOUN tokens may have the following values of Gender:
Fem (12994; 23% of non-empty Gender): regjeringa, saka, verda, tid, boka, tida, kroner, meldinga, klokka, kyrkja
Masc (25620; 46% of non-empty Gender): dag, prosent, kommunen, del, staden, mannen, filmen, vegen, millionar, staten
Neut (16851; 30% of non-empty Gender): år, landet, folk, departementet, arbeidet, språk, politiet, samfunnet, livet, fylket
EMPTY (1065): kap., lag, fjor, nr., kultur-, vare, går, islam, kr, s.
| Paradigm lov | Masc | Fem | Neut |
|---|---|---|---|
| _ | lova, lovi | | |
| Definite=Ind | lov | lov | lov |
| Definite=Ind\|Number=Plur | lover | | |
| Number=Plur | lovene | | |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (11852) occur with only one value of Gender.
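The lexicality check can be illustrated by grouping observations by lemma and counting how many lemmas are seen with exactly one Gender value. This is a minimal sketch; the `OBSERVATIONS` list and the function name `single_valued_share` are hypothetical, and the data are made up for illustration rather than taken from the treebank.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender) observations for NOUN tokens; not real treebank counts.
OBSERVATIONS = [
    ("bok", "Fem"), ("bok", "Fem"),
    ("dag", "Masc"),
    ("år", "Neut"), ("år", "Neut"),
    ("lov", "Fem"), ("lov", "Neut"),
]


def single_valued_share(observations):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen)."""
    values_by_lemma = defaultdict(set)
    for lemma, gender in observations:
        values_by_lemma[lemma].add(gender)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


single, total = single_valued_share(OBSERVATIONS)
print(f"{single} of {total} lemmas ({100 * single / total:.0f}%) occur with only one value of Gender")
```

A high share suggests the value is a property of the lemma rather than of the individual word form, which is the sense in which the feature is called lexical here.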
ADJ
15206 ADJ tokens (52% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Definite=Ind (15205; 100%), Number=EMPTY (15203; 100%), Degree=Pos (12486; 82%), VerbForm=EMPTY (12484; 82%).
ADJ tokens may have the following values of Gender:
Com (725; 5% of non-empty Gender): god, stor, ny, klar, norsk, glad, offentleg, rasjonell, ung, lang
Fem (18; 0% of non-empty Gender): lita, bundi, opa, teki, vedteki
Fem,Masc (5720; 38% of non-empty Gender): ny, stor, god, norsk, mykje, viktig, fast, offentleg, klar, lang
Masc (108; 1% of non-empty Gender): liten, open, kristen, oppteken, god, lunken, medfaren, sliten, velkomen, Forboden
Neut (8635; 57% of non-empty Gender): mykje, godt, heilt, langt, svært, litt, rett, veldig, viktig, norsk
EMPTY (13893): meir, mange, fleire, nye, store, heile, norske, siste, mest, tidlegare
| Paradigm god | Fem,Masc | Masc | Neut | Com |
|---|---|---|---|---|
| | god | god | godt | god |
DET
11269 DET tokens (75% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=EMPTY (11269; 100%), PronType=Art (9190; 82%).
DET tokens may have the following values of Gender:
Com (8; 0% of non-empty Gender): alle
Fem (2493; 22% of non-empty Gender): ei, den, denne, anna, slik, eiga, all, inga, kvar, noka
Fem,Masc (25; 0% of non-empty Gender): alle
Masc (4555; 40% of non-empty Gender): ein, den, denne, kvar, eigen, annan, nokon, ingen, all, slik
Neut (4188; 37% of non-empty Gender): eit, det, anna, noko, dette, kvart, eitt, eige, alt, slikt
EMPTY (3718): dei, andre, alle, same, desse, nokre, sjølv, slike, kva, sjølve
| Paradigm all | Fem,Masc | Masc | Fem | Neut | Com |
|---|---|---|---|---|---|
| | alle | all | all | alt | alle |
PRON
10383 PRON tokens (54% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Animacy=EMPTY (10381; 100%), Number=EMPTY (10379; 100%), PronType=Prs (10351; 100%), Person=3 (9223; 89%), Case=EMPTY (8096; 78%).
PRON tokens may have the following values of Gender:
Fem (978; 9% of non-empty Gender): ho, si, henne, vår, hans, mi, hennar, di, deira, ei
Fem,Masc (171; 2% of non-empty Gender): den, denne, dén
Masc (2129; 21% of non-empty Gender): han, sin, vår, min, hans, nokon, hennar, deira, ingen, din
Neut (7105; 68% of non-empty Gender): det, dette, noko, sitt, alt, vårt, mitt, hans, slikt, hennar
EMPTY (8983): dei, eg, vi, seg, ein, me, du, kva, oss, sine
| Paradigm sin | Masc | Fem | Neut |
|---|---|---|---|
| | sin | si | sitt |
PROPN
2829 PROPN tokens (16% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (653; 23% of non-empty Gender): Marie, Solveig, Kristin, Liv, Sigrid, Tone, Signe, Anne, Heidi, Jorunn
Masc (1899; 67% of non-empty Gender): Olav, Arne, Jan, Dag, Gunnar, Stein, Paulus, Jesus, Erik, Snorre
Neut (277; 10% of non-empty Gender): Stortinget, Framstegspartiet, Senterpartiet, Vestlandet, Dagbladet, Folketinget, Vinmonopolet, Austlandet, Skattedirektoratet, Stortingets
EMPTY (14934): Noreg, Førde, Språkrådet, Sogn, USA, SV, Fjordane, Oslo, Kviteseid, Høgre
Gender seems to be a lexical feature of PROPN. 100% of lemmas (395) occur with only one value of Gender.
NUM
75 NUM tokens (2% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (75; 100%), Number=EMPTY (75; 100%).
NUM tokens may have the following values of Gender:
Fem (27; 36% of non-empty Gender): noko, éi, annakvar
Masc (44; 59% of non-empty Gender): éin, noko, annankvar, èin
Neut (4; 5% of non-empty Gender): Noko, annakvart, halvanna, halvtanna
EMPTY (3973): to, tre, fire, ti, fem, 20, 1, seks, 2005, 2006
| Paradigm noko | Masc | Fem | Neut |
|---|---|---|---|
| | noko | noko | Noko |
VERB
4 VERB tokens (0% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=Ind (4; 100%), Tense=Pres (4; 100%), VerbForm=Fin,Part (4; 100%).
VERB tokens may have the following values of Gender:
Fem,Masc (1; 25% of non-empty Gender): stoppa
Neut (3; 75% of non-empty Gender): blir, innført, lagt
EMPTY (28772): har, seier, er, få, kjem, får, meiner, ha, går, fekk
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (9643; 76%),
NOUN –[nmod:poss]–> PRON (1094; 74%),
ADJ –[expl]–> PRON (825; 90%),
ADJ –[conj]–> ADJ (526; 74%),
DET –[nmod]–> NOUN (144; 69%),
PRON –[acl:relcl]–> ADJ (70; 75%),
PRON –[det]–> DET (28; 67%),
ADJ –[xcomp]–> ADJ (26; 52%),
ADJ –[amod]–> ADJ (24; 60%),
DET –[conj]–> DET (23; 88%).
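An agreement tally of this kind can be sketched by walking each dependency edge and comparing the Gender values of parent and child. The sketch below rests on two assumptions that this page does not state: the percentage is taken over parent-child pairs where both nodes have a non-empty Gender, and a combined value such as `Fem,Masc` agrees with any value it shares. The `SENTENCES` data and the function name `gender_agreement` are hypothetical.

```python
from collections import Counter

# Hypothetical tokens: (id, upos, Gender value or None, head id, deprel).
SENTENCES = [
    [
        (1, "DET", "Fem", 3, "det"),
        (2, "ADJ", "Fem,Masc", 3, "amod"),
        (3, "NOUN", "Fem", 0, "root"),
    ],
    [
        (1, "DET", "Neut", 2, "det"),   # deliberately disagrees with its parent
        (2, "NOUN", "Masc", 0, "root"),
    ],
]


def gender_agreement(sentences):
    """Count agreeing and comparable (parent UPOS, deprel, child UPOS) edges."""
    agree = Counter()
    comparable = Counter()
    for sentence in sentences:
        by_id = {tok[0]: tok for tok in sentence}
        for _tok_id, upos, gender, head, deprel in sentence:
            parent = by_id.get(head)  # None for the root (head 0)
            # Assumption: only pairs where both nodes carry Gender are compared.
            if parent is None or gender is None or parent[2] is None:
                continue
            key = (parent[1], deprel, upos)
            comparable[key] += 1
            # Assumption: agreement means the two value sets overlap.
            if set(gender.split(",")) & set(parent[2].split(",")):
                agree[key] += 1
    return agree, comparable


agree, comparable = gender_agreement(SENTENCES)
for (parent_upos, deprel, child_upos), n in comparable.items():
    share = 100 * agree[(parent_upos, deprel, child_upos)] / n
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({agree[(parent_upos, deprel, child_upos)]}; {share:.0f}%)")
```

Using exact string equality instead of set overlap would treat `Fem,Masc` and `Fem` as disagreeing, which would lower the figures for relations involving adjectives and determiners.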