Treebank Statistics: UD_Romanian-TueCL: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
1456 tokens (33%) have a non-empty value of Gender.
919 types (59%) occur at least once with a non-empty value of Gender.
698 lemmas (61%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (827; 19% instances), ADJ (219; 5% instances), DET (192; 4% instances), VERB (103; 2% instances), PRON (100; 2% instances), NUM (6; 0% instances), PROPN (6; 0% instances), AUX (3; 0% instances).
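Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: it assumes the standard 10-column CoNLL-U layout (UPOS in column 4, FEATS in column 6), and the names `SAMPLE`, `read_tokens`, `parse_feats` and `gender_by_upos` are hypothetical. The three sample lines are invented for illustration and are not taken from the treebank.

```python
from collections import Counter

# Toy CoNLL-U fragment, invented for illustration (not treebank data).
SAMPLE = """\
1\to\tun\tDET\t_\tCase=Acc,Nom|Gender=Fem|Number=Sing|PronType=Ind\t2\tdet\t_\t_
2\tfemeie\tfemeie\tNOUN\t_\tCase=Acc,Nom|Definite=Ind|Gender=Fem|Number=Sing\t0\troot\t_\t_
3\tvine\tveni\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres\t2\tacl\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def read_tokens(text):
    """Yield the 10 columns of every regular token line."""
    for line in text.splitlines():
        cols = line.split("\t")
        # Comments, blank lines, multiword ranges (1-2) and empty nodes (1.1)
        # are skipped because they lack 10 columns or a plain integer ID.
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def gender_by_upos(text):
    """Count tokens per UPOS tag, with and without a Gender value."""
    with_gender, total = Counter(), Counter()
    for cols in read_tokens(text):
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return with_gender, total


with_gender, total = gender_by_upos(SAMPLE)
for upos, n in with_gender.most_common():
    print(upos, n, f"{100 * n / total[upos]:.0f}% of {upos} tokens")
```

Note that the sketch prints the share within each tag, which corresponds to the per-tag figures given in the sections below; the percentages in the list above appear to be relative to all tokens in the treebank.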
NOUN
827 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Typo=EMPTY (738; 89%), Number=Sing (598; 72%), Definite=Ind (521; 63%), Case=Acc,Nom (477; 58%).
NOUN tokens may have the following values of Gender:
Fem (445; 54% of non-empty Gender): femeie, femeia, femeile, femei, fetele, fată, femeii, fete, iubire, mamă
Masc (382; 46% of non-empty Gender): bărbat, PUPICI, bărbatul, bărbații, barbat, bărbați, fund, bărbaților, bani, sutien
EMPTY (17): BITCH, BRO, MILFă, baby, butter, crop, football, girl, mall, party
Paradigm soț | Masc | Fem |
---|---|---|
Case=Acc,Nom|Definite=Def|Number=Sing | soțul | |
Case=Dat,Gen|Definite=Def|Number=Sing | soției | |
Definite=Ind|Number=Sing | soț | |
Definite=Ind|Number=Plur | soți | |
Definite=Ind|Number=Plur|Typo=Yes | soti |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (443) occur with only one value of Gender.
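The "lexical feature" figure can be approximated by collecting, for each lemma of a given tag, the set of Gender values it occurs with and counting the lemmas whose set has exactly one member. This is a sketch under assumptions: the function name `single_gender_share` is hypothetical, the sample lines are invented, and the page does not state whether it counts only lemmas that carry Gender at least once, which is what this version does.

```python
from collections import defaultdict

# Toy CoNLL-U fragment, invented for illustration (not treebank data).
SAMPLE = """\
1\tfemeie\tfemeie\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
2\tfemeia\tfemeie\tNOUN\t_\tDefinite=Def|Gender=Fem|Number=Sing\t1\tconj\t_\t_
3\tsoțul\tsoț\tNOUN\t_\tDefinite=Def|Gender=Masc|Number=Sing\t1\tconj\t_\t_
4\tsoției\tsoț\tNOUN\t_\tDefinite=Def|Gender=Fem|Number=Sing\t1\tconj\t_\t_
5\tfrumoasă\tfrumos\tADJ\t_\tGender=Fem|Number=Sing\t1\tamod\t_\t_
"""


def single_gender_share(text, upos):
    """Return (lemmas with exactly one Gender value, lemmas with any Gender)."""
    values = defaultdict(set)
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        feats = dict(f.split("=", 1) for f in cols[5].split("|") if "=" in f)
        if "Gender" in feats:
            # A multi-value such as 'Fem,Masc' is treated as its own value.
            values[cols[2]].add(feats["Gender"])
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, total = single_gender_share(SAMPLE, "NOUN")
print(f"{single} of {total} NOUN lemmas occur with only one Gender value")
```

In the toy data the lemma `soț` is deliberately given both values, mirroring the mixed paradigm shown in the table above, so only one of the two NOUN lemmas counts as single-valued.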
ADJ
219 ADJ tokens (93% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Definite=Ind (212; 97%), Degree=Pos (198; 90%), Typo=EMPTY (187; 85%), Number=Sing (157; 72%), Case=EMPTY (112; 51%).
ADJ tokens may have the following values of Gender:
Fem (134; 61% of non-empty Gender): frumoasă, frumoasa, bună, dulce, urâtă, drăguță, existente, feministe, ieftină, rece
Masc (85; 39% of non-empty Gender): DULCI, misogini, FRUMOȘI, atent, libidinoși, misogin, sexual, superb, șocant, Apetisant
EMPTY (16): mare, sexy, așa, DULCI, Hot, SEXSY, bine, mini, nesexy, propriilor
Paradigm frumos | Masc | Fem |
---|---|---|
Case=Acc,Nom|Number=Sing | frumoasă, frumoasa | |
Case=Acc,Nom|Number=Sing|Typo=Yes | frumoasa | |
Number=Sing | FRUMOS | |
Number=Sing|Typo=Yes | frumos | |
Number=Plur | FRUMOȘI, frumoase | frumoase |
Number=Plur|Typo=Yes | FRUMOSE |
DET
192 DET tokens (91% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number[psor]=EMPTY (172; 90%), Poss=EMPTY (166; 86%), Position=EMPTY (163; 85%), Case=Acc,Nom (154; 80%), Number=Sing (149; 78%), PronType=Ind (129; 67%), Person=EMPTY (113; 59%).
DET tokens may have the following values of Gender:
Fem (112; 58% of non-empty Gender): o, mea, toate, asta, ta, alea, multe, această, alte, cea
Masc (80; 42% of non-empty Gender): un, mulți, a, acestui, toți, unui, acest, al, asta, cel
EMPTY (20): ce, lor, ei, lui, niste, atâta, fiecare, orice, unor
Paradigm un | Masc | Fem |
---|---|---|
Case=Acc,Nom|PronType=Ind | un | o |
Case=Dat,Gen|PronType=Ind | unui | unei |
Case=Dat,Gen|PronType=Ind|Typo=Yes | unui | |
PronType=Art | o |
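
A paradigm table like the one above can be assembled by grouping the word forms of one lemma by their remaining features (rows) and by Gender (columns). The sketch below is only an illustration: `paradigm` is a hypothetical helper, the sample lines are invented, and it keeps every non-Gender feature in the row label, whereas the tables on this page appear to leave some features out.

```python
from collections import defaultdict

# Toy CoNLL-U fragment, invented for illustration (not treebank data).
SAMPLE = """\
1\tun\tun\tDET\t_\tCase=Acc,Nom|Gender=Masc|PronType=Ind\t2\tdet\t_\t_
2\to\tun\tDET\t_\tCase=Acc,Nom|Gender=Fem|PronType=Ind\t3\tdet\t_\t_
3\tunui\tun\tDET\t_\tCase=Dat,Gen|Gender=Masc|PronType=Ind\t4\tdet\t_\t_
4\tfemeie\tfemeie\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
"""


def paradigm(text, lemma):
    """Map 'other features' -> Gender value -> sorted word forms of a lemma."""
    table = defaultdict(lambda: defaultdict(set))
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit() or cols[2] != lemma:
            continue
        feats = dict(f.split("=", 1) for f in cols[5].split("|") if "=" in f)
        gender = feats.pop("Gender", None)
        if gender is None:
            continue  # forms without Gender do not enter the table
        row = "|".join(f"{name}={value}" for name, value in feats.items())
        table[row][gender].add(cols[1])
    return {
        row: {g: sorted(forms) for g, forms in cells.items()}
        for row, cells in table.items()
    }


for row, cells in paradigm(SAMPLE, "un").items():
    masc = ", ".join(cells.get("Masc", []))
    fem = ", ".join(cells.get("Fem", []))
    print(f"{row} | {masc} | {fem} |")
```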
VERB
103 VERB tokens (19% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (103; 100%), Person=EMPTY (103; 100%), Tense=EMPTY (103; 100%), VerbForm=Part (103; 100%), Typo=EMPTY (91; 88%), Number=Sing (86; 83%).
VERB tokens may have the following values of Gender:
Fem (41; 40% of non-empty Gender): apucat, venit, zis, Sustinuta, agresată, ajuns, auzit, avut, batut, castigat
Masc (62; 60% of non-empty Gender): spus, ajuns, dat, facut, zis, făcut, obligat, văzut, înțeles, PERMIS
EMPTY (441): au, are, dau, face, vrea, știu, ai, fac, fierbe, uite
Paradigm face | Masc | Fem |
---|---|---|
Typo=Yes | facut | facut |
făcut |
Gender seems to be a lexical feature of VERB. 91% of lemmas (69) occur with only one value of Gender.
PRON
100 PRON tokens (26% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (100; 100%), Person=3 (97; 97%), Variant=EMPTY (91; 91%), Number=Sing (72; 72%), Case=Acc,Nom (65; 65%), PronType=Prs (54; 54%).
PRON tokens may have the following values of Gender:
Fem (57; 57% of non-empty Gender): o, -o, ea, asta, le, una, aia, ele, toate, Nici
Masc (43; 43% of non-empty Gender): el, unu, îl, mulți, altul, astia, unu’, unul, Ala, altu
EMPTY (291): ce, se, care, te, eu, tine, mine, mă, îți, îmi
Paradigm el | Masc | Fem |
---|---|---|
Case=Acc,Nom|Number=Sing|Strength=Strong | el | ea |
Case=Acc,Nom|Number=Plur|Strength=Strong | ele | |
Case=Acc|Number=Sing|Strength=Weak | îl | o, -o |
Case=Acc|Number=Sing|Strength=Weak|Variant=Short | l | -o |
Case=Acc|Number=Plur|Strength=Weak | le | le, îi |
NUM
6 NUM tokens (19% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (5; 83%), NumType=Card (5; 83%), Number=Plur (5; 83%).
NUM tokens may have the following values of Gender:
Fem (1; 17% of non-empty Gender): prima
Masc (5; 83% of non-empty Gender): doi, amândoi, doua, trei
EMPTY (25): 10, 2, 3, 9, 1, 112, 12, 12000, 2,5, 20
PROPN
6 PROPN tokens (8% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (5; 83% of non-empty Gender): Elenei, Ezada, Maica, Marea, Sinn
Masc (1; 17% of non-empty Gender): Doamne
EMPTY (66): România, Mirela, Vaida, Irinel, Maria, @KlausIohannis, @Utilizator_x3, ALEXANDRA, Africa, Alex
AUX
3 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (3; 100%), Number=Sing (3; 100%), Person=EMPTY (3; 100%), Tense=EMPTY (3; 100%), VerbForm=Part (3; 100%).
AUX tokens may have the following values of Gender:
Masc (3; 100% of non-empty Gender): fost
EMPTY (250): a, e, ești, sunt, am, este, ai, esti, aș, era
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (177; 89%),
NOUN –[amod]–> ADJ (109; 92%),
NOUN –[list]–> NOUN (18; 95%),
ADJ –[nsubj]–> NOUN (11; 85%),
ADJ –[conj]–> ADJ (10; 83%),
NOUN –[nsubj]–> NOUN (10; 63%),
ADJ –[list]–> ADJ (8; 100%),
ADJ –[obl]–> NOUN (6; 60%),
NOUN –[orphan]–> NOUN (6; 67%),
ADJ –[det]–> DET (4; 100%).
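
Agreement counts of this shape can be sketched by walking each dependency edge and comparing the Gender values of parent and child. The code below is an illustration only: the helper names are hypothetical, the two sample sentences are invented, and it assumes the percentage is the share of agreeing pairs among pairs in which both nodes carry Gender, which this page does not spell out.

```python
from collections import Counter

# Two toy sentences in CoNLL-U, invented for illustration (not treebank data).
SAMPLE = """\
1\to\tun\tDET\t_\tGender=Fem|Number=Sing\t2\tdet\t_\t_
2\tfemeie\tfemeie\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tfrumoasă\tfrumos\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_

1\tun\tun\tDET\t_\tGender=Masc|Number=Sing\t2\tdet\t_\t_
2\tfată\tfată\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():  # a blank line ends the sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append(cols)
    if sentence:
        yield sentence


def gender_of(cols):
    """Return the Gender value of a token row, or None if it has none."""
    for feat in cols[5].split("|"):
        if feat.startswith("Gender="):
            return feat[len("Gender="):]
    return None


def agreement_by_relation(text):
    """Count agreeing and comparable pairs per (parent UPOS, deprel, child UPOS)."""
    agree, both = Counter(), Counter()
    for sentence in read_sentences(text):
        by_id = {cols[0]: cols for cols in sentence}
        for child in sentence:
            parent = by_id.get(child[6])  # HEAD 0 (root) has no parent row
            if parent is None:
                continue
            g_child, g_parent = gender_of(child), gender_of(parent)
            if g_child is None or g_parent is None:
                continue
            key = (parent[3], child[7], child[3])
            both[key] += 1
            if g_child == g_parent:
                agree[key] += 1
    return agree, both


agree, both = agreement_by_relation(SAMPLE)
for (parent, rel, child), n in agree.most_common():
    share = 100 * n / both[(parent, rel, child)]
    print(f"{parent} -[{rel}]-> {child} ({n}; {share:.0f}%)")
```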