Treebank Statistics: UD_Romanian-RRT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
90555 tokens (41%) have a non-empty value of Gender.
24186 types (77%) occur at least once with a non-empty value of Gender.
12047 lemmas (70%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (52758; 24% of all tokens), ADJ (14454; 7%), DET (10401; 5%), VERB (7633; 3%), PRON (3418; 2%), NUM (937; <1%), AUX (632; <1%), PROPN (322; <1%).
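Counts like the ones above can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: it assumes the standard 10-column CoNLL-U layout, and the helper names (`parse_feats`, `iter_tokens`, `gender_counts_by_upos`) and the `SAMPLE` sentence are made up for illustration.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def iter_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])

def gender_counts_by_upos(conllu_text):
    """Count tokens with a non-empty Gender per UPOS tag, plus the token total."""
    with_gender, total = Counter(), 0
    for _form, _lemma, upos, feats in iter_tokens(conllu_text):
        total += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return with_gender, total

# Hypothetical four-token sentence (not taken from the treebank).
SAMPLE = "\n".join([
    "# text = Comisia are un timp",
    "1\tComisia\tcomisie\tNOUN\t_\tCase=Acc,Nom|Definite=Def|Gender=Fem|Number=Sing\t2\tnsubj\t_\t_",
    "2\tare\tavea\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres\t0\troot\t_\t_",
    "3\tun\tun\tDET\t_\tCase=Acc,Nom|Gender=Masc|Number=Sing\t4\tdet\t_\t_",
    "4\ttimp\ttimp\tNOUN\t_\tDefinite=Ind|Gender=Masc|Number=Sing\t2\tobj\t_\t_",
])

with_gender, total = gender_counts_by_upos(SAMPLE)
for upos, count in with_gender.most_common():
    print(f"{upos}: {count} ({100 * count / total:.0f}% of all tokens)")
```

Dividing each per-tag count by the total number of tokens gives the percentages in the list above; dividing by the number of tokens with that tag gives the per-tag figures reported in the sections below.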
NOUN
52758 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (38509; 73%), Case=Acc,Nom (28751; 54%), Definite=Def (27145; 51%).
NOUN tokens may have the following values of Gender:
- Fem (32516; 62% of non-empty Gender): conformitate, membre, statele, Comisia, parte, față, partea, fața, comisiei, anexa
- Masc (20242; 38% of non-empty Gender): ani, timp, cazul, loc, timpul, mod, acord, b, lucru, cadrul
- EMPTY (1499): art., a., CE, nr., numele, b., mg, lit., alin., CEE
| Paradigm timp | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Definite=Def\|Number=Sing | timpul | |
| Case=Acc,Nom\|Definite=Def\|Number=Sing\|Variant=Short | timpu' | |
| Case=Acc,Nom\|Definite=Def\|Number=Plur | timpurile | |
| Case=Dat,Gen\|Definite=Def\|Number=Sing | timpului | |
| Definite=Ind\|Number=Sing | timp | |
| Definite=Ind\|Number=Plur | | timpuri |
Gender seems to be a lexical feature of NOUN: 92% of lemmas (7009) occur with only one value of Gender.
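The "lexical feature" check amounts to asking how many lemmas are ever seen with more than one Gender value. Here is a small sketch of that computation. The function name, the input shape (plain `(lemma, upos, gender)` triples) and the sample data are assumptions for illustration, and the exact denominator used by the statistics generator is not stated on this page.

```python
from collections import defaultdict

def single_gender_lemma_share(tokens, upos="NOUN"):
    """Return (lemmas seen with exactly one Gender value, lemmas seen with any).

    tokens: iterable of (lemma, upos, gender) triples; gender is None when
    the token has no Gender feature.
    """
    values = defaultdict(set)
    for lemma, tag, gender in tokens:
        if tag == upos and gender is not None:
            values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)

# Hypothetical tokens: "timp" appears with both values, as in the paradigm above.
tokens = [
    ("timp", "NOUN", "Masc"),
    ("timp", "NOUN", "Fem"),
    ("comisie", "NOUN", "Fem"),
    ("comisie", "NOUN", "Fem"),
    ("an", "NOUN", "Masc"),
    ("art.", "NOUN", None),    # no Gender: ignored
    ("mare", "ADJ", "Fem"),    # other part of speech: ignored
]

single, total = single_gender_lemma_share(tokens)
print(f"{100 * single / total:.0f}% of lemmas ({single}) occur with only one value")
```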
ADJ
14454 ADJ tokens (95% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (14414; 100%), Definite=Ind (13578; 94%), Number=Sing (9598; 66%), Case=EMPTY (8867; 61%).
ADJ tokens may have the following values of Gender:
- Fem (9199; 64% of non-empty Gender): europene, necesare, prezenta, europeană, mică, naționale, română, chimice, prezentei, maximă
- Masc (5255; 36% of non-empty Gender): prezentul, nou, european, prezentului, general, mic, național, bun, românesc, singur
- EMPTY (832): mare, asemenea, mari, mici, standard, noi, vechi, anume, românești, verde
| Paradigm mare | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Definite=Def\|Number=Sing | marele | marea |
| Case=Acc,Nom\|Definite=Def\|Number=Plur | marii | marile |
| Case=Dat,Gen\|Definite=Def\|Number=Sing | marelui | Marii |
| Case=Dat,Gen\|Definite=Ind\|Number=Sing | | mari |
DET
10401 DET tokens (90% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Position=EMPTY (8973; 86%), Number=Sing (8609; 83%), Person=EMPTY (7681; 74%), Poss=EMPTY (7067; 68%), Case=Acc,Nom (6029; 58%), PronType=Ind (5344; 51%).
DET tokens may have the following values of Gender:
- Fem (6251; 60% of non-empty Gender): o, a, ale, unei, toate, această, aceste, cele, alte, multe
- Masc (4150; 40% of non-empty Gender): un, al, unui, acest, cel, său, ai, același, cei, acestui
- EMPTY (1123): lui, orice, unor, fiecare, acestor, niște, tuturor, celor, altor, ce
| Paradigm un | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|ExtPos=ADV\|Number=Sing | un | o |
| Case=Acc,Nom\|Number=Sing | un | o |
| Case=Acc,Nom\|Number=Sing\|Variant=Short | -un | -o |
| Case=Acc,Nom\|Number=Plur\|Person=3\|Position=Prenom | unii | unele |
| Case=Dat,Gen\|Number=Sing | unui | unei |
VERB
7633 VERB tokens (33% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (7633; 100%), Person=EMPTY (7633; 100%), Tense=EMPTY (7633; 100%), VerbForm=Part (7633; 100%), Number=Sing (5574; 73%).
VERB tokens may have the following values of Gender:
- Fem (2844; 37% of non-empty Gender): prevăzute, menționate, prevăzută, stabilite, legate, utilizate, prezentate, asociate, puse, obținute
- Masc (4789; 63% of non-empty Gender): avut, făcut, spus, putut, rupt, dat, murit, devenit, luat, rănit
- EMPTY (15357): poate, trebuie, pot, putea, avea, are, face, era, există, au
| Paradigm avea | Masc | Fem |
|---|---|---|
| Number=Sing | avut | avută |
| Number=Plur | | avute |
PRON
3418 PRON tokens (28% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (3418; 100%), Person=3 (3402; 100%), Variant=EMPTY (2977; 87%), Number=Sing (2512; 73%), PronType=Prs (1951; 57%), Case=Acc,Nom (1929; 56%).
PRON tokens may have the following values of Gender:
- Fem (1675; 49% of non-empty Gender): o, le, ea, ei, ceea, aceasta, acestea, -o, una, ele
- Masc (1743; 51% of non-empty Gender): el, lui, -l, îl, unul, l-, ei, acesta, cel, cei
- EMPTY (8895): se, care, ce, s-, își, -și, lor, și-, îi, -i
| Paradigm el | Masc | Fem |
|---|---|---|
| Case=Acc,Nom\|Number=Sing\|Strength=Strong | el | ea |
| Case=Acc,Nom\|Number=Plur\|Strength=Strong | ei | ele |
| Case=Acc\|Number=Sing\|Strength=Weak | îl | o |
| Case=Acc\|Number=Sing\|Strength=Weak\|Variant=Short | -l, l- | -o |
| Case=Acc\|Number=Plur\|Strength=Weak | îi | le |
| Case=Acc\|Number=Plur\|Strength=Weak\|Variant=Short | -i, i- | le-, -le |
| Case=Dat,Gen\|Number=Sing\|Strength=Strong | lui | ei |
| Case=Dat\|Number=Plur\|Strength=Weak\|Variant=Short | | -i |
NUM
937 NUM tokens (17% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (889; 95%), Number=Plur (484; 52%), NumType=Card (469; 50%).
NUM tokens may have the following values of Gender:
- Fem (579; 62% of non-empty Gender): două, prima, primele, doua, milioane, o, ambele, mii, treia, ultimele
- Masc (358; 38% of non-empty Gender): primul, doi, doilea, ultimii, un, ultimul, unu, primului, amândoi, treilea
- EMPTY (4615): 1, 2, 3, 4, trei, 5, 6, 7, 8, i
| Paradigm doi | Masc | Fem |
|---|---|---|
| Number=Sing\|NumType=Ord | doilea, secund | doua |
| Number=Plur\|NumType=Card | doi | două |
AUX
632 AUX tokens (7% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (632; 100%), Number=Sing (632; 100%), Person=EMPTY (632; 100%), Tense=EMPTY (632; 100%), VerbForm=Part (632; 100%).
AUX tokens may have the following values of Gender:
- Masc (632; 100% of non-empty Gender): fost
- EMPTY (7941): a, este, au, sunt, fi, era, va, ar, am, fie
PROPN
322 PROPN tokens (5% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
- Fem (254; 79% of non-empty Gender): României, Moldovei, Dunării, Europei, Franței, Italiei, Norvegiei, Rusiei, Ungariei, Germaniei
- Masc (68; 21% of non-empty Gender): Carpaților, Iașilor, Jiului, Banatul, Iașii, Israelul, Israelului, Aradului, Banatului, Bucureștiului
- EMPTY (5561): România, Winston, București, Timișoara, Iași, Ion, Paris, Alexandru, O’Brien, Moldova
Gender seems to be a lexical feature of PROPN: 100% of lemmas (104) occur with only one value of Gender.
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (11771; 94%),
NOUN –[nmod]–> NOUN (9120; 53%),
NOUN –[det]–> DET (8145; 82%),
NOUN –[conj]–> NOUN (2490; 73%),
VERB –[nsubj:pass]–> NOUN (1022; 60%),
ADJ –[conj]–> ADJ (660; 93%),
VERB –[conj]–> VERB (527; 61%),
ADJ –[nsubj]–> NOUN (382; 91%),
VERB –[obl:agent]–> NOUN (346; 51%),
NOUN –[appos]–> NOUN (308; 57%).
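The agreement figures above can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below rests on one assumption in particular: that the percentage is taken over edges of the same type where both nodes have a non-empty Gender, which this page does not spell out. The function name, the token tuple layout and the sample sentences are hypothetical.

```python
from collections import Counter

def agreement_by_relation(sentences):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing edges, edges compared).

    sentences: list of sentences, each a list of
    (id, head, upos, deprel, gender) tuples; gender is None when absent.
    Only edges where both parent and child carry a Gender value are compared.
    """
    agree, both = Counter(), Counter()
    for sent in sentences:
        by_id = {tok[0]: tok for tok in sent}
        for _tid, head, upos, deprel, gender in sent:
            parent = by_id.get(head)  # None for the root (head 0)
            if parent is None or gender is None or parent[4] is None:
                continue
            key = (parent[2], deprel, upos)
            both[key] += 1
            if parent[4] == gender:
                agree[key] += 1
    return {key: (agree[key], both[key]) for key in both}

# Two hypothetical sentences (not taken from the treebank).
sentences = [
    [   # "o casă frumoasă"
        (1, 2, "DET", "det", "Fem"),
        (2, 0, "NOUN", "root", "Fem"),
        (3, 2, "ADJ", "amod", "Fem"),
    ],
    [   # a Masc noun with a Fem nominal modifier
        (1, 0, "NOUN", "root", "Masc"),
        (2, 1, "NOUN", "nmod", "Fem"),
    ],
]

for (parent, rel, child), (n_agree, n_both) in agreement_by_relation(sentences).items():
    print(f"{parent} -[{rel}]-> {child} ({n_agree}; {100 * n_agree / n_both:.0f}%)")
```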