Treebank Statistics: UD_Lithuanian-HSE: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
2232 tokens (42%) have a non-empty value of Gender.
1636 types (70%) occur at least once with a non-empty value of Gender.
1085 lemmas (68%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (1102; 21% instances), ADJ (399; 7% instances), PROPN (300; 6% instances), VERB (178; 3% instances), PRON (145; 3% instances), DET (91; 2% instances), NUM (14; 0% instances), AUX (3; 0% instances).
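Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function names `parse_feats` and `gender_by_upos` are hypothetical, and the two-token sample is invented for illustration rather than taken from the treebank. It assumes the standard 10-column CoNLL-U layout, with UPOS in column 4 and FEATS in column 6.

```python
from collections import Counter, defaultdict


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Count Gender values (or 'EMPTY' when the feature is absent) per UPOS tag."""
    counts = defaultdict(Counter)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos, feats = cols[3], parse_feats(cols[5])
        counts[upos][feats.get("Gender", "EMPTY")] += 1
    return counts


# Invented two-token sample (an assumption, not real treebank data).
SAMPLE = "\n".join([
    "# text = tauta žino",
    "1\ttauta\ttauta\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tnsubj\t_\t_",
    "2\tžino\tžinoti\tVERB\t_\tMood=Ind|Number=Sing|Person=3\t0\troot\t_\t_",
])

counts = gender_by_upos(SAMPLE)
for upos, c in sorted(counts.items()):
    filled = sum(n for value, n in c.items() if value != "EMPTY")
    share = 100 * filled / sum(c.values())
    print(f"{upos}: {dict(c)} ({share:.0f}% non-empty)")
```

Run over the full treebank file instead of `SAMPLE`, the per-tag totals should correspond to the figures listed above, provided the page counts tokens the same way.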
NOUN
1102 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (714; 65%).
NOUN tokens may have the following values of Gender:
Fem (420; 38% of non-empty Gender): tautos, tauta, tiesa, valstybės, tautą, tolerancijos, abejonės, dauguma, klaida, komedijos
Masc (682; 62% of non-empty Gender): laikais, metų, pasaulyje, pilotų, amžiaus, daugelis, filosofas, metu, pagrindo, prietaisai
EMPTY (3): m, pusėn
Paradigm mąstykla | Masc | Fem |
---|---|---|
Case=Acc | mąstyklą | |
Case=Gen | mąstyklos | |
Case=Loc | mąstykloje | |
Case=Nom | mąstykla | |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (582) occur with only one value of Gender.
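The "lexical feature" observation rests on a simple per-lemma check: collect the set of Gender values seen for each lemma and count how many lemmas have exactly one. A minimal sketch follows, assuming tokens are already available as `(lemma, upos, gender)` triples. The helper name `single_value_share` and the sample rows are hypothetical, and the page's exact counting rules may differ.

```python
from collections import defaultdict


def single_value_share(rows, upos):
    """Return (lemmas with exactly one Gender value, lemmas with any Gender value).

    rows: iterable of (lemma, upos, gender) triples; gender is None when empty.
    """
    values = defaultdict(set)
    for lemma, tag, gender in rows:
        if tag == upos and gender is not None:
            values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Invented rows for illustration only (not real treebank annotations).
ROWS = [
    ("tauta", "NOUN", "Fem"),
    ("tauta", "NOUN", "Fem"),
    ("filosofas", "NOUN", "Masc"),
    ("mąstykla", "NOUN", "Fem"),
    ("mąstykla", "NOUN", "Masc"),
    ("m", "NOUN", None),
]

single, total = single_value_share(ROWS, "NOUN")
print(f"{100 * single / total:.0f}% lemmas ({single}) occur with only one value")
```

A share close to 100% suggests that the value belongs to the lemma itself rather than varying by inflection, which is the pattern reported for NOUN here.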
ADJ
399 ADJ tokens (96% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (374; 94%), Definite=Ind (366; 92%), Number=Sing (245; 61%).
ADJ tokens may have the following values of Gender:
Fem (157; 39% of non-empty Gender): tautinės, tautinė, viena, Laikinosios, didelė, didelės, didžiulės, ekonominės, kitokių, kitos
Masc (235; 59% of non-empty Gender): kitų, vienas, lietuvis, gero, lietuviu, lietuvius, lietuvių, vienintelis, įvairiais, blogesnis
Neut (7; 2% of non-empty Gender): nesunku, sunku, aišku, maža, nekuklu
EMPTY (15): 1939, XIX, šiaip, 1941, 1944, 1961, 2002, 25, 423, XX
Paradigm aiškus | Masc | Fem | Neut |
---|---|---|---|
Case=Dat|Number=Sing | aiškiam | ||
Case=Gen|Number=Sing | aiškios | ||
Case=Nom|Number=Sing | aiškus | ||
Polarity=Pos | | | aišku |
PROPN
300 PROPN tokens (93% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (287; 96%).
PROPN tokens may have the following values of Gender:
Fem (113; 38% of non-empty Gender): Lietuvos, Europos, Rusijos, Lietuva, Rusija, Lietuvoje, Vilma, Lietuvą, Rusijai, Juknaitė
Masc (187; 62% of non-empty Gender): Strepsiado, Sokratas, Sokrato, Strepsiadas, Tu-154, Aristofano, Vytautas, Radžvilas, Stalino, Sąjūdžio
EMPTY (23): BM, MARS, KGB, R., A., JAV, MN-61, NATO, SSRS, TSRS
Gender seems to be a lexical feature of PROPN: 100% of lemmas (142) occur with only one value of Gender.
VERB
178 VERB tokens (25% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (178; 100%), Mood=EMPTY (170; 96%), VerbForm=Part (169; 95%), Definite=Ind (165; 93%), Polarity=Pos (155; 87%), Number=Sing (101; 57%), Case=Nom (97; 54%), Voice=Act (96; 54%).
VERB tokens may have the following values of Gender:
Fem (44; 25% of non-empty Gender): duodama, paskelbta, Mąstanti, apsuptose, atitinkančios, atmesta, atsakyta, atsiradusi, atsižvelgdamos, baigta
Masc (111; 62% of non-empty Gender): vadinami, girdėję, grįžtamasis, laikomas, pastebimi, skirtas, sudužusio, žinojęs, Pradėjęs, apibrėžimas
Neut (23; 13% of non-empty Gender): žinoma, galima, bandoma, Būtina, Kalbama, apima, esama, manoma, mokama, negalima
EMPTY (530): gali, turi, negali, būti, nėra, sako, žino, analizuoja, bando, dera
Paradigm žinoti | Masc | Fem | Neut |
---|---|---|---|
Aspect=Perf|Case=Nom|Number=Sing|Polarity=Pos|Tense=Past|Voice=Act | žinojęs | ||
Case=Loc|Number=Plur|Polarity=Neg|Tense=Pres|Voice=Act | nežinančiose | ||
Case=Nom|Number=Sing|Polarity=Pos|Tense=Pres|Voice=Act | žinomas | ||
Polarity=Pos|Tense=Pres|Voice=Pass | | | žinoma |
PRON
145 PRON tokens (57% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (108; 74%), Person=EMPTY (78; 54%).
PRON tokens may have the following values of Gender:
Fem (40; 28% of non-empty Gender): ji, ją, kurios, jos, kurią, kuri, ja, jai, jas, kokios
Masc (105; 72% of non-empty Gender): jis, to, kuris, jie, jo, nieko, jį, kurie, kurį, viso
EMPTY (109): tai, juos, kas, jų, mes, man, mus, mums, aš, ką
Paradigm kuris | Masc | Fem |
---|---|---|
Case=Acc|Number=Sing | kurį | kurią |
Case=Acc|Number=Plur | kuriuos | kurias |
Case=Dat|Number=Plur | kuriems | |
Case=Gen|Number=Sing | kurio | kurios |
Case=Gen|Number=Plur | kurių | |
Case=Ins|Number=Plur | kuriais | |
Case=Loc|Number=Sing | kurioje | |
Case=Nom|Number=Sing | kuris | kuri |
Case=Nom|Number=Plur | kurie | kurios |
DET
91 DET tokens (55% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (60; 66%).
DET tokens may have the following values of Gender:
Fem (40; 44% of non-empty Gender): kokia, tokia, tokios, tokią, tos, kokias, tokiai, tokias, visa, šioje
Masc (51; 56% of non-empty Gender): tas, tą, to, tų, viso, jokių, kiekvienas, pats, toks, jokiu
EMPTY (75): mūsų, savo, jo, jų, jos, mano, tavo, šių, tam, tie
Paradigm tas | Masc | Fem |
---|---|---|
Case=Acc|Number=Sing | tą | tą |
Case=Gen|Definite=Ind|Number=Plur | tų | |
Case=Gen|Number=Sing | to | tos |
Case=Ins|Number=Plur | tais | |
Case=Loc|Number=Sing | tame | |
Case=Nom|Number=Sing | tas | toji |
Case=Nom|Number=Plur | tie | Tos |
NUM
14 NUM tokens (58% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: Case=Acc (8; 57%), Number=EMPTY (8; 57%).
NUM tokens may have the following values of Gender:
Fem (3; 21% of non-empty Gender): Dešimtys, dvi, trijų
Masc (11; 79% of non-empty Gender): du, trys, šimtus, penkis, trijų, tūkstančius, vieną, šimtą
EMPTY (10): penkiasdešimt, 1994, 30, 4151, 52, 7, 92, dešimt, tūkst.
Paradigm trys | Masc | Fem |
---|---|---|
Case=Gen | trijų | trijų |
Case=Nom | trys | |
AUX
3 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (3; 100%), Person=EMPTY (3; 100%), Polarity=Pos (3; 100%), Tense=Pres (3; 100%), VerbForm=Part (3; 100%), Number=EMPTY (2; 67%), Voice=Act (2; 67%).
AUX tokens may have the following values of Gender:
Masc (1; 33% of non-empty Gender): esąs
Neut (2; 67% of non-empty Gender): Esama, esą
EMPTY (109): buvo, yra, nėra, būtų, būti, nebūtų, būna, esu, nebuvo, bus
Paradigm būti | Masc | Neut |
---|---|---|
Case=Nom|Number=Sing|Voice=Act | esąs | |
Voice=Act | | esą |
Voice=Pass | | Esama |
Relations with Agreement in Gender
The 10 most frequent relations where the parent and child nodes agree in Gender:
NOUN –[amod]–> ADJ (221; 87%),
NOUN –[conj]–> NOUN (91; 69%),
PROPN –[flat]–> PROPN (35; 85%),
ADJ –[conj]–> ADJ (29; 94%),
PROPN –[nmod]–> NOUN (28; 90%),
NOUN –[acl]–> VERB (26; 84%),
NOUN –[amod]–> VERB (22; 81%),
ADJ –[nsubj]–> NOUN (21; 95%),
PROPN –[conj]–> PROPN (15; 75%),
VERB –[conj]–> VERB (14; 58%).
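Agreement figures like those above can be approximated by walking each dependency edge and comparing the Gender of the parent and child. The sketch below is an illustration under stated assumptions, not the page's own generator: it assumes the percentage is the share of agreeing pairs among pairs where both nodes carry a Gender value, the helper name `agreement_stats` is hypothetical, and the two-token sentence is invented.

```python
from collections import Counter


def agreement_stats(tokens):
    """Count agreeing and comparable parent/child pairs per relation type.

    tokens: list of dicts with keys id, head, upos, deprel, gender
    (gender is None when the token has no Gender feature).
    """
    by_id = {t["id"]: t for t in tokens}
    agree, total = Counter(), Counter()
    for child in tokens:
        parent = by_id.get(child["head"])  # None for the root (head == 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, total


# Invented sentence for illustration only (not real treebank data).
SENTENCE = [
    {"id": 1, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
]

agree, total = agreement_stats(SENTENCE)
for (parent, rel, child), n in agree.most_common(10):
    pct = 100 * n / total[(parent, rel, child)]
    print(f"{parent} –[{rel}]–> {child} ({n}; {pct:.0f}%)")
```

For a whole treebank, the two counters would be accumulated across all sentences before ranking the relations by the number of agreeing pairs.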