Treebank Statistics: UD_Lithuanian-HSE: Features: Gender
This feature is universal.
It occurs with 3 different values: Fem, Masc, Neut.
2232 tokens (42%) have a non-empty value of Gender.
1636 types (70%) occur at least once with a non-empty value of Gender.
1085 lemmas (68%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (1102; 21% of instances), ADJ (399; 7% of instances), PROPN (300; 6% of instances), VERB (178; 3% of instances), PRON (145; 3% of instances), DET (91; 2% of instances), NUM (14; 0% of instances), AUX (3; 0% of instances).
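Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: it assumes that a "type" is a distinct word form, that the feature of interest sits in the FEATS column, and it runs on a three-token toy fragment invented for illustration. The names `read_tokens`, `gender_coverage`, and `SAMPLE` are hypothetical.

```python
from collections import Counter

# Toy CoNLL-U fragment invented for illustration; it is not taken from the treebank.
SAMPLE = """\
1\ttautinė\ttautinis\tADJ\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tamod\t_\t_
2\ttauta\ttauta\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tnsubj\t_\t_
3\tžino\tžinoti\tVERB\t_\tMood=Ind|Number=Sing|Person=3\t0\troot\t_\t_
"""


def read_tokens(text):
    """Yield (form, lemma, upos, feats) for each regular token line."""
    for line in text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        feats = {} if cols[5] == "_" else dict(p.split("=", 1) for p in cols[5].split("|"))
        yield cols[1], cols[2], cols[3], feats


def gender_coverage(tokens, feature="Gender"):
    """Count tokens, types, lemmas and UPOS tags that carry a non-empty feature value."""
    total = 0
    with_value = 0
    types, lemmas = set(), set()
    by_upos = Counter()
    for form, lemma, upos, feats in tokens:
        total += 1
        if feature in feats:
            with_value += 1
            types.add(form)      # assumption: a type is a distinct word form
            lemmas.add(lemma)
            by_upos[upos] += 1
    return {
        "tokens": total,
        "with_value": with_value,
        "types": len(types),
        "lemmas": len(lemmas),
        "by_upos": by_upos,
    }


stats = gender_coverage(read_tokens(SAMPLE))
print(stats)
```

Dividing `with_value` and each `by_upos` count by `tokens` gives percentages of all tokens, which is how the per-tag percentages above appear to be computed.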
NOUN
1102 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (714; 65%).
NOUN tokens may have the following values of Gender:
Fem (420; 38% of non-empty Gender): tautos, tauta, tiesa, valstybės, tautą, tolerancijos, abejonės, dauguma, klaida, komedijos
Masc (682; 62% of non-empty Gender): laikais, metų, pasaulyje, pilotų, amžiaus, daugelis, filosofas, metu, pagrindo, prietaisai
EMPTY (3): m, pusėn
| Paradigm mąstykla | Masc | Fem |
|---|---|---|
| Case=Acc | mąstyklą | |
| Case=Gen | mąstyklos | |
| Case=Loc | mąstykloje | |
| Case=Nom | mąstykla | |
Gender seems to be a lexical feature of NOUN. 99% of lemmas (582) occur with only one value of Gender.
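The "lexical feature" observation rests on counting lemmas that only ever occur with a single value. A minimal sketch of that count follows, assuming tokens are available as (lemma, UPOS, feature dictionary) triples; the data is a toy list invented for illustration, `lemma_x` is a placeholder rather than a lemma from the treebank, and `single_value_lemmas` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy (lemma, UPOS, feature dict) triples invented for illustration.
# "lemma_x" is a placeholder, not a lemma from the treebank.
TOKENS = [
    ("tauta", "NOUN", {"Gender": "Fem", "Case": "Nom"}),
    ("tauta", "NOUN", {"Gender": "Fem", "Case": "Gen"}),
    ("lemma_x", "NOUN", {"Gender": "Masc"}),
    ("lemma_x", "NOUN", {"Gender": "Fem"}),
    ("žinoti", "VERB", {}),
]


def single_value_lemmas(tokens, upos, feature="Gender"):
    """Return (count, share) of lemmas of one UPOS seen with exactly one feature value."""
    values = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos and feature in feats:
            values[lemma].add(feats[feature])
    single = sum(1 for seen in values.values() if len(seen) == 1)
    share = single / len(values) if values else 0.0
    return single, share


count, share = single_value_lemmas(TOKENS, "NOUN")
print(f"{share:.0%} of lemmas ({count}) occur with only one value of Gender")
```

Lemmas that never carry the feature are left out of the denominator here; whether the page's own percentage does the same is an assumption.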
ADJ
399 ADJ tokens (96% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (374; 94%), Definite=Ind (366; 92%), Number=Sing (245; 61%).
ADJ tokens may have the following values of Gender:
Fem (157; 39% of non-empty Gender): tautinės, tautinė, viena, Laikinosios, didelė, didelės, didžiulės, ekonominės, kitokių, kitos
Masc (235; 59% of non-empty Gender): kitų, vienas, lietuvis, gero, lietuviu, lietuvius, lietuvių, vienintelis, įvairiais, blogesnis
Neut (7; 2% of non-empty Gender): nesunku, sunku, aišku, maža, nekuklu
EMPTY (15): 1939, XIX, šiaip, 1941, 1944, 1961, 2002, 25, 423, XX
| Paradigm aiškus | Masc | Fem | Neut |
|---|---|---|---|
| Case=Dat\|Number=Sing | aiškiam | | |
| Case=Gen\|Number=Sing | aiškios | | |
| Case=Nom\|Number=Sing | aiškus | | |
| Polarity=Pos | | | aišku |
PROPN
300 PROPN tokens (93% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (287; 96%).
PROPN tokens may have the following values of Gender:
Fem (113; 38% of non-empty Gender): Lietuvos, Europos, Rusijos, Lietuva, Rusija, Lietuvoje, Vilma, Lietuvą, Rusijai, Juknaitė
Masc (187; 62% of non-empty Gender): Strepsiado, Sokratas, Sokrato, Strepsiadas, Tu-154, Aristofano, Vytautas, Radžvilas, Stalino, Sąjūdžio
EMPTY (23): BM, MARS, KGB, R., A., JAV, MN-61, NATO, SSRS, TSRS
Gender seems to be a lexical feature of PROPN. 100% of lemmas (142) occur with only one value of Gender.
VERB
178 VERB tokens (25% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (178; 100%), Mood=EMPTY (170; 96%), VerbForm=Part (169; 95%), Definite=Ind (165; 93%), Polarity=Pos (155; 87%), Number=Sing (101; 57%), Case=Nom (97; 54%), Voice=Act (96; 54%).
VERB tokens may have the following values of Gender:
Fem (44; 25% of non-empty Gender): duodama, paskelbta, Mąstanti, apsuptose, atitinkančios, atmesta, atsakyta, atsiradusi, atsižvelgdamos, baigta
Masc (111; 62% of non-empty Gender): vadinami, girdėję, grįžtamasis, laikomas, pastebimi, skirtas, sudužusio, žinojęs, Pradėjęs, apibrėžimas
Neut (23; 13% of non-empty Gender): žinoma, galima, bandoma, Būtina, Kalbama, apima, esama, manoma, mokama, negalima
EMPTY (530): gali, turi, negali, būti, nėra, sako, žino, analizuoja, bando, dera
| Paradigm žinoti | Masc | Fem | Neut |
|---|---|---|---|
| Aspect=Perf\|Case=Nom\|Number=Sing\|Polarity=Pos\|Tense=Past\|Voice=Act | žinojęs | | |
| Case=Loc\|Number=Plur\|Polarity=Neg\|Tense=Pres\|Voice=Act | nežinančiose | | |
| Case=Nom\|Number=Sing\|Polarity=Pos\|Tense=Pres\|Voice=Act | žinomas | | |
| Polarity=Pos\|Tense=Pres\|Voice=Pass | | | žinoma |
PRON
145 PRON tokens (57% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (108; 74%), Person=EMPTY (78; 54%).
PRON tokens may have the following values of Gender:
Fem (40; 28% of non-empty Gender): ji, ją, kurios, jos, kurią, kuri, ja, jai, jas, kokios
Masc (105; 72% of non-empty Gender): jis, to, kuris, jie, jo, nieko, jį, kurie, kurį, viso
EMPTY (109): tai, juos, kas, jų, mes, man, mus, mums, aš, ką
| Paradigm kuris | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | kurį | kurią |
| Case=Acc\|Number=Plur | kuriuos | kurias |
| Case=Dat\|Number=Plur | kuriems | |
| Case=Gen\|Number=Sing | kurio | kurios |
| Case=Gen\|Number=Plur | kurių | |
| Case=Ins\|Number=Plur | kuriais | |
| Case=Loc\|Number=Sing | kurioje | |
| Case=Nom\|Number=Sing | kuris | kuri |
| Case=Nom\|Number=Plur | kurie | kurios |
DET
91 DET tokens (55% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (60; 66%).
DET tokens may have the following values of Gender:
Fem (40; 44% of non-empty Gender): kokia, tokia, tokios, tokią, tos, kokias, tokiai, tokias, visa, šioje
Masc (51; 56% of non-empty Gender): tas, tą, to, tų, viso, jokių, kiekvienas, pats, toks, jokiu
EMPTY (75): mūsų, savo, jo, jų, jos, mano, tavo, šių, tam, tie
| Paradigm tas | Masc | Fem |
|---|---|---|
| Case=Acc\|Number=Sing | tą | tą |
| Case=Gen\|Definite=Ind\|Number=Plur | tų | |
| Case=Gen\|Number=Sing | to | tos |
| Case=Ins\|Number=Plur | tais | |
| Case=Loc\|Number=Sing | tame | |
| Case=Nom\|Number=Sing | tas | toji |
| Case=Nom\|Number=Plur | tie | Tos |
NUM
14 NUM tokens (58% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: Case=Acc (8; 57%), Number=EMPTY (8; 57%).
NUM tokens may have the following values of Gender:
Fem (3; 21% of non-empty Gender): Dešimtys, dvi, trijų
Masc (11; 79% of non-empty Gender): du, trys, šimtus, penkis, trijų, tūkstančius, vieną, šimtą
EMPTY (10): penkiasdešimt, 1994, 30, 4151, 52, 7, 92, dešimt, tūkst.
| Paradigm trys | Masc | Fem |
|---|---|---|
| Case=Gen | trijų | trijų |
| Case=Nom | trys | |
AUX
3 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (3; 100%), Person=EMPTY (3; 100%), Polarity=Pos (3; 100%), Tense=Pres (3; 100%), VerbForm=Part (3; 100%), Number=EMPTY (2; 67%), Voice=Act (2; 67%).
AUX tokens may have the following values of Gender:
Masc (1; 33% of non-empty Gender): esąs
Neut (2; 67% of non-empty Gender): Esama, esą
EMPTY (109): buvo, yra, nėra, būtų, būti, nebūtų, būna, esu, nebuvo, bus
| Paradigm būti | Masc | Neut |
|---|---|---|
| Case=Nom\|Number=Sing\|Voice=Act | esąs | |
| Voice=Act | | esą |
| Voice=Pass | | Esama |
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (221; 87%),
NOUN –[conj]–> NOUN (91; 69%),
PROPN –[flat]–> PROPN (35; 85%),
ADJ –[conj]–> ADJ (29; 94%),
PROPN –[nmod]–> NOUN (28; 90%),
NOUN –[acl]–> VERB (26; 84%),
NOUN –[amod]–> VERB (22; 81%),
ADJ –[nsubj]–> NOUN (21; 95%),
PROPN –[conj]–> PROPN (15; 75%),
VERB –[conj]–> VERB (14; 58%).
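Agreement figures like those above can be approximated by walking the dependency tree and comparing Gender on each parent and child. The sketch below is one possible reading, not the page's own method: it assumes the count is the number of agreeing pairs and the percentage is their share among pairs where both nodes carry Gender. The input is a toy CoNLL-U fragment invented for illustration, and `read_sentences` and `gender_agreement` are hypothetical names.

```python
from collections import Counter

# Toy CoNLL-U fragment invented for illustration; it is not taken from the treebank.
SAMPLE = """\
1\ttautinė\ttautinis\tADJ\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tamod\t_\t_
2\ttauta\ttauta\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tnsubj\t_\t_
3\tžino\tžinoti\tVERB\t_\tMood=Ind|Number=Sing|Person=3\t0\troot\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of token dicts; blank lines separate sentences."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        # Skip comments, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        feats = {} if cols[5] == "_" else dict(p.split("=", 1) for p in cols[5].split("|"))
        head = int(cols[6]) if cols[6].isdigit() else 0
        sentence.append({"id": int(cols[0]), "upos": cols[3], "feats": feats,
                         "head": head, "deprel": cols[7]})
    if sentence:
        yield sentence


def gender_agreement(sentences, feature="Gender"):
    """Count parent-child pairs that both carry the feature, and those that agree."""
    agree, total = Counter(), Counter()
    for sent in sentences:
        by_id = {tok["id"]: tok for tok in sent}
        for child in sent:
            parent = by_id.get(child["head"])
            if parent is None:
                continue  # root node or head outside the sentence
            child_value = child["feats"].get(feature)
            parent_value = parent["feats"].get(feature)
            if child_value is None or parent_value is None:
                continue  # assumption: only pairs where both nodes have the feature count
            key = (parent["upos"], child["deprel"], child["upos"])
            total[key] += 1
            if child_value == parent_value:
                agree[key] += 1
    return agree, total


agree, total = gender_agreement(read_sentences(SAMPLE))
for (parent, rel, child), n in agree.most_common(10):
    print(f"{parent} -[{rel}]-> {child} ({n}; {n / total[(parent, rel, child)]:.0%})")
```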