Treebank Statistics: UD_Romanian-TueCL: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
1456 tokens (33%) have a non-empty value of Gender.
919 types (59%) occur at least once with a non-empty value of Gender.
698 lemmas (61%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (827; 19% instances), ADJ (219; 5% instances), DET (192; 4% instances), VERB (103; 2% instances), PRON (100; 2% instances), NUM (6; 0% instances), PROPN (6; 0% instances), AUX (3; 0% instances).
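Counts like the ones above can be derived from a treebank's CoNLL-U files. The following is a minimal sketch, assuming the standard 10-column tab-separated CoNLL-U layout with FEATS in the sixth column. The names `SAMPLE`, `parse_feats`, `read_tokens`, and `gender_usage` are hypothetical, and the three-token sample is invented for illustration rather than taken from UD_Romanian-TueCL.

```python
from collections import Counter

# Invented three-token sample in CoNLL-U format (not from the treebank).
SAMPLE = "\n".join([
    "# text = o femeie vede",
    "\t".join(["1", "o", "un", "DET", "_",
               "Case=Acc,Nom|Gender=Fem|Number=Sing|PronType=Ind",
               "2", "det", "_", "_"]),
    "\t".join(["2", "femeie", "femeie", "NOUN", "_",
               "Case=Acc,Nom|Definite=Ind|Gender=Fem|Number=Sing",
               "3", "nsubj", "_", "_"]),
    "\t".join(["3", "vede", "vedea", "VERB", "_",
               "Mood=Ind|Number=Sing|Person=3|Tense=Pres",
               "0", "root", "_", "_"]),
])


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_usage(conllu_text, feature="Gender"):
    """Return (total token count, Counter of tokens per UPOS that carry the feature)."""
    total = 0
    with_feature = Counter()
    for _form, _lemma, upos, feats in read_tokens(conllu_text):
        total += 1
        if feature in feats:
            with_feature[upos] += 1
    return total, with_feature


total, per_upos = gender_usage(SAMPLE)
for upos, count in per_upos.most_common():
    print(f"{upos} ({count}; {100 * count / total:.0f}% instances)")
```

Dividing each per-tag count by the total number of tokens gives the "% instances" figure; dividing by the number of tokens with that tag would instead give the per-tag coverage reported in the sections below.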
NOUN
827 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Typo=EMPTY (738; 89%), Number=Sing (598; 72%), Definite=Ind (521; 63%), Case=Acc,Nom (477; 58%).
NOUN tokens may have the following values of Gender:
Fem (445; 54% of non-empty Gender): femeie, femeia, femeile, femei, fetele, fată, femeii, fete, iubire, mamă
Masc (382; 46% of non-empty Gender): bărbat, PUPICI, bărbatul, bărbații, barbat, bărbați, fund, bărbaților, bani, sutien
EMPTY (17): BITCH, BRO, MILFă, baby, butter, crop, football, girl, mall, party
| Paradigm soț | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Definite=Def\|Number=Sing | soțul | |
| Case=Dat,Gen\|Definite=Def\|Number=Sing | soției | |
| Definite=Ind\|Number=Sing | soț | |
| Definite=Ind\|Number=Plur | soți | |
| Definite=Ind\|Number=Plur\|Typo=Yes | soti |
Gender seems to be a lexical feature of NOUN: 98% of lemmas (443) occur with only one value of Gender.
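The lexicality check behind this statement can be sketched as follows: collect the set of Gender values seen for each lemma of a given part of speech, then report the share of lemmas with exactly one value. This is an illustrative sketch rather than the script that generated these statistics; `single_value_share` and the `TOKENS` list are hypothetical, and the input is assumed to be `(lemma, upos, gender)` triples with `None` for an empty Gender.

```python
from collections import defaultdict

# Invented (lemma, upos, gender) triples; None marks an empty Gender.
TOKENS = [
    ("femeie", "NOUN", "Fem"),
    ("femeie", "NOUN", "Fem"),
    ("bărbat", "NOUN", "Masc"),
    ("soț", "NOUN", "Masc"),
    ("soț", "NOUN", "Fem"),
    ("mall", "NOUN", None),
    ("frumos", "ADJ", "Fem"),
]


def single_value_share(tokens, upos):
    """Return (lemmas with one value, lemmas with any value, share) for one UPOS."""
    values = defaultdict(set)
    for lemma, tag, gender in tokens:
        # Only tokens of the requested tag with a non-empty Gender count.
        if tag == upos and gender is not None:
            values[lemma].add(gender)
    if not values:
        return 0, 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values), single / len(values)


single, total, share = single_value_share(TOKENS, "NOUN")
print(f"{100 * share:.0f}% lemmas ({single}) occur only with one value of Gender.")
```

In this toy input the lemma soț appears with both values, so it is the one lemma out of three that does not count as single-valued.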
ADJ
219 ADJ tokens (93% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Definite=Ind (212; 97%), Degree=Pos (198; 90%), Typo=EMPTY (187; 85%), Number=Sing (157; 72%), Case=EMPTY (112; 51%).
ADJ tokens may have the following values of Gender:
Fem (134; 61% of non-empty Gender): frumoasă, frumoasa, bună, dulce, urâtă, drăguță, existente, feministe, ieftină, rece
Masc (85; 39% of non-empty Gender): DULCI, misogini, FRUMOȘI, atent, libidinoși, misogin, sexual, superb, șocant, Apetisant
EMPTY (16): mare, sexy, așa, DULCI, Hot, SEXSY, bine, mini, nesexy, propriilor
| Paradigm frumos | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing | frumoasă, frumoasa | |
| Case=Acc,Nom\|Number=Sing\|Typo=Yes | frumoasa | |
| Number=Sing | FRUMOS | |
| Number=Sing\|Typo=Yes | frumos | |
| Number=Plur | FRUMOȘI, frumoase | frumoase |
| Number=Plur\|Typo=Yes | FRUMOSE |
DET
192 DET tokens (91% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (172; 90%), Poss=EMPTY (166; 86%), Position=EMPTY (163; 85%), Case=Acc,Nom (154; 80%), Number=Sing (149; 78%), PronType=Ind (129; 67%), Person=EMPTY (113; 59%).
DET tokens may have the following values of Gender:
Fem (112; 58% of non-empty Gender): o, mea, toate, asta, ta, alea, multe, această, alte, cea
Masc (80; 42% of non-empty Gender): un, mulți, a, acestui, toți, unui, acest, al, asta, cel
EMPTY (20): ce, lor, ei, lui, niste, atâta, fiecare, orice, unor
| Paradigm un | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|PronType=Ind | un | o |
| Case=Dat,Gen\|PronType=Ind | unui | unei |
| Case=Dat,Gen\|PronType=Ind\|Typo=Yes | unui | |
| PronType=Art | o |
VERB
103 VERB tokens (19% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (103; 100%), Person=EMPTY (103; 100%), Tense=EMPTY (103; 100%), VerbForm=Part (103; 100%), Typo=EMPTY (91; 88%), Number=Sing (86; 83%).
VERB tokens may have the following values of Gender:
Fem (41; 40% of non-empty Gender): apucat, venit, zis, Sustinuta, agresată, ajuns, auzit, avut, batut, castigat
Masc (62; 60% of non-empty Gender): spus, ajuns, dat, facut, zis, făcut, obligat, văzut, înțeles, PERMIS
EMPTY (441): au, are, dau, face, vrea, știu, ai, fac, fierbe, uite
| Paradigm face | Masc | Fem |
|---|---|---|
| Typo=Yes | facut | facut |
| făcut |
Gender seems to be a lexical feature of VERB: 91% of lemmas (69) occur with only one value of Gender.
PRON
100 PRON tokens (26% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (100; 100%), Person=3 (97; 97%), Variant=EMPTY (91; 91%), Number=Sing (72; 72%), Case=Acc,Nom (65; 65%), PronType=Prs (54; 54%).
PRON tokens may have the following values of Gender:
Fem (57; 57% of non-empty Gender): o, -o, ea, asta, le, una, aia, ele, toate, Nici
Masc (43; 43% of non-empty Gender): el, unu, îl, mulți, altul, astia, unu’, unul, Ala, altu
EMPTY (291): ce, se, care, te, eu, tine, mine, mă, îți, îmi
| Paradigm el | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing\|Strength=Strong | el | ea |
| Case=Acc,Nom\|Number=Plur\|Strength=Strong | ele | |
| Case=Acc\|Number=Sing\|Strength=Weak | îl | o, -o |
| Case=Acc\|Number=Sing\|Strength=Weak\|Variant=Short | l | -o |
| Case=Acc\|Number=Plur\|Strength=Weak | le | le, îi |
NUM
6 NUM tokens (19% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (5; 83%), NumType=Card (5; 83%), Number=Plur (5; 83%).
NUM tokens may have the following values of Gender:
Fem (1; 17% of non-empty Gender): prima
Masc (5; 83% of non-empty Gender): doi, amândoi, doua, trei
EMPTY (25): 10, 2, 3, 9, 1, 112, 12, 12000, 2,5, 20
PROPN
6 PROPN tokens (8% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (5; 83% of non-empty Gender): Elenei, Ezada, Maica, Marea, Sinn
Masc (1; 17% of non-empty Gender): Doamne
EMPTY (66): România, Mirela, Vaida, Irinel, Maria, @KlausIohannis, @Utilizator_x3, ALEXANDRA, Africa, Alex
AUX
3 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (3; 100%), Number=Sing (3; 100%), Person=EMPTY (3; 100%), Tense=EMPTY (3; 100%), VerbForm=Part (3; 100%).
AUX tokens may have the following values of Gender:
Masc (3; 100% of non-empty Gender): fost
EMPTY (250): a, e, ești, sunt, am, este, ai, esti, aș, era
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (177; 89%),
NOUN –[amod]–> ADJ (109; 92%),
NOUN –[list]–> NOUN (18; 95%),
ADJ –[nsubj]–> NOUN (11; 85%),
ADJ –[conj]–> ADJ (10; 83%),
NOUN –[nsubj]–> NOUN (10; 63%),
ADJ –[list]–> ADJ (8; 100%),
ADJ –[obl]–> NOUN (6; 60%),
NOUN –[orphan]–> NOUN (6; 67%),
ADJ –[det]–> DET (4; 100%).
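Agreement figures of this kind can be approximated by walking every dependency edge and comparing the Gender of parent and child. The sketch below assumes that the percentage is the number of agreeing pairs divided by the number of pairs in which both nodes have a non-empty Gender; that denominator is an assumption, not something stated on this page. The function `agreement_stats`, the dictionary keys, and the two sample sentences are hypothetical.

```python
from collections import Counter

# Invented sentences; each token records id, head, UPOS, relation and Gender.
SENTENCES = [
    [
        {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
        {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
        {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    ],
    [
        # Deliberate mismatch between determiner and noun.
        {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Masc"},
        {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    ],
]


def agreement_stats(sentences):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing pairs, pairs with Gender on both)."""
    agree = Counter()
    both = Counter()
    for sentence in sentences:
        by_id = {token["id"]: token for token in sentence}
        for child in sentence:
            parent = by_id.get(child["head"])
            if parent is None:  # head 0 is the artificial root
                continue
            if child["gender"] is None or parent["gender"] is None:
                continue
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if child["gender"] == parent["gender"]:
                agree[key] += 1
    return {key: (agree[key], both[key]) for key in both}


for (parent, rel, child), (ok, seen) in agreement_stats(SENTENCES).items():
    print(f"{parent} -[{rel}]-> {child} ({ok}; {100 * ok / seen:.0f}%)")
```

Pairs in which either node lacks Gender are skipped entirely, so a relation that rarely involves two gendered nodes can still show a high percentage from only a handful of pairs.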