Treebank Statistics: UD_Romanian-RRT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
90303 tokens (41%) have a non-empty value of Gender.
24194 types (77%) occur at least once with a non-empty value of Gender.
12077 lemmas (70%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags (counts and shares are relative to all tokens in the treebank): NOUN (52825; 24%), ADJ (14473; 7%), DET (10400; 5%), VERB (7632; 3%), PRON (3079; 1%), NUM (940; <1%), AUX (632; <1%), PROPN (322; <1%).
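The per-tag counts above could in principle be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch under stated assumptions, not the script that generated this page: the helper names `parse_feats` and `gender_by_upos` and the three-token `SAMPLE` are invented for illustration, and the code assumes the standard 10-column CoNLL-U layout with UPOS in column 4 and FEATS in column 6.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_by_upos(conllu_text):
    """Count, per UPOS tag, the tokens that carry a non-empty Gender value."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if "Gender" in parse_feats(cols[5]):
            counts[cols[3]] += 1
    return counts


# Hypothetical toy input, not taken from the treebank.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "o", "un", "DET", "_", "Case=Acc,Nom|Gender=Fem|Number=Sing", "2", "det", "_", "_"],
    ["2", "parte", "parte", "NOUN", "_", "Definite=Ind|Gender=Fem|Number=Sing", "0", "root", "_", "_"],
    ["3", "mare", "mare", "ADJ", "_", "Degree=Pos", "2", "amod", "_", "_"],
])

print(gender_by_upos(SAMPLE))
```

On the toy input, the DET and NOUN tokens are counted once each, while the ADJ token has no Gender value and is skipped.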
NOUN
52825 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (38591; 73%), Case=Acc,Nom (28805; 55%), Definite=Def (27197; 51%).
NOUN tokens may have the following values of Gender:
Fem (32519; 62% of non-empty Gender): conformitate, membre, statele, Comisia, parte, față, partea, fața, comisiei, urmă
Masc (20306; 38% of non-empty Gender): ani, timp, cazul, loc, timpul, mod, acord, b, lucru, cadrul
EMPTY (1431): art., a., nr., CE, b., mg, lit., alin., ml, CEE
| Paradigm timp | Masc | Fem |
|---|---|---|
| Case=Acc,Nom|Definite=Def|Number=Sing | timpul | |
| Case=Acc,Nom|Definite=Def|Number=Sing|Variant=Short | timpu' | |
| Case=Acc,Nom|Definite=Def|Number=Plur | timpurile | |
| Case=Dat,Gen|Definite=Def|Number=Sing | timpului | |
| Definite=Ind|Number=Sing | timp | |
| Definite=Ind|Number=Plur | timpuri |
Gender seems to be a lexical feature of NOUN: 92% of lemmas (7032) occur with only one value of Gender.
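The "lexical feature" check amounts to asking how many lemmas are ever seen with more than one value. A minimal sketch of that computation follows; the function name `single_value_share`, the token triples, and the placeholder lemmas are hypothetical, and it assumes that only lemmas with at least one non-empty value enter the denominator, which this page does not state explicitly.

```python
from collections import defaultdict


def single_value_share(tokens, upos, feature="Gender"):
    """Return (lemmas with exactly one value, lemmas with any value) for `upos`."""
    values = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos and feature in feats:
            values[lemma].add(feats[feature])
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical toy data with placeholder lemmas, not taken from the treebank.
TOKENS = [
    ("lemmaA", "NOUN", {"Gender": "Masc", "Number": "Sing"}),
    ("lemmaA", "NOUN", {"Gender": "Fem", "Number": "Plur"}),
    ("lemmaB", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("lemmaB", "NOUN", {"Gender": "Fem", "Number": "Plur"}),
    ("lemmaC", "NOUN", {}),  # no Gender value: excluded from the denominator
]

single, total = single_value_share(TOKENS, "NOUN")
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A share close to 100% suggests the value is fixed per lemma (lexical) rather than chosen by inflection.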
ADJ
14473 ADJ tokens (95% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (14433; 100%), Definite=Ind (13595; 94%), Number=Sing (9614; 66%), Case=EMPTY (8878; 61%).
ADJ tokens may have the following values of Gender:
Fem (9208; 64% of non-empty Gender): europene, necesare, prezenta, europeană, mică, naționale, română, chimice, prezentei, maximă
Masc (5265; 36% of non-empty Gender): prezentul, nou, european, prezentului, general, mic, național, bun, românesc, singur
EMPTY (824): mare, asemenea, mari, mici, standard, noi, vechi, anume, românești, roșii
| Paradigm mare | Masc | Fem |
|---|---|---|
| Case=Acc,Nom|Definite=Def|Number=Sing | marele | marea |
| Case=Acc,Nom|Definite=Def|Number=Plur | marii | marile |
| Case=Dat,Gen|Definite=Def|Number=Sing | marelui | Marii |
| Case=Dat,Gen|Definite=Ind|Number=Sing | | mari |
DET
10400 DET tokens (86% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Position=EMPTY (8973; 86%), Number=Sing (8608; 83%), Person=EMPTY (7680; 74%), Poss=EMPTY (7069; 68%), Case=Acc,Nom (6027; 58%), PronType=Ind (5339; 51%).
DET tokens may have the following values of Gender:
Fem (6251; 60% of non-empty Gender): o, a, ale, unei, toate, această, aceste, cele, alte, multe
Masc (4149; 40% of non-empty Gender): un, al, unui, acest, cel, său, ai, același, cei, acestui
EMPTY (1624): lui, lor, orice, unor, fiecare, ei, acestor, niște, tuturor, celor
| Paradigm un | Masc | Fem |
|---|---|---|
| Case=Acc,Nom|ExtPos=ADV|Number=Sing | un | o |
| Case=Acc,Nom|Number=Sing | un, -un | o, -o |
| Case=Acc,Nom|Number=Plur|Person=3|Position=Prenom | unii | unele |
| Case=Dat,Gen|Number=Sing | unui | unei |
VERB
7632 VERB tokens (33% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (7632; 100%), Person=EMPTY (7632; 100%), Tense=EMPTY (7632; 100%), VerbForm=Part (7632; 100%), Number=Sing (5574; 73%).
VERB tokens may have the following values of Gender:
Fem (2844; 37% of non-empty Gender): prevăzute, menționate, prevăzută, stabilite, legate, utilizate, prezentate, asociate, puse, obținute
Masc (4788; 63% of non-empty Gender): avut, făcut, spus, putut, rupt, dat, murit, devenit, luat, rănit
EMPTY (15358): poate, trebuie, pot, putea, avea, are, face, era, există, au
| Paradigm avea | Masc | Fem |
|---|---|---|
| Number=Sing | avut | avută |
| Number=Plur | | avute |
PRON
3079 PRON tokens (26% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (3079; 100%), Person=3 (3063; 99%), Variant=EMPTY (2646; 86%), Number=Sing (2173; 71%), Case=Acc,Nom (1930; 63%), PronType=Prs (1614; 52%).
PRON tokens may have the following values of Gender:
Fem (1547; 50% of non-empty Gender): o, le, ea, ceea, aceasta, acestea, -o, una, ele, toate
Masc (1532; 50% of non-empty Gender): el, -l, îl, unul, ei, l-, acesta, cel, cei, acestuia
EMPTY (8729): se, care, ce, s-, își, -și, și-, îi, -se, -i
| Paradigm el | Masc | Fem |
|---|---|---|
| Case=Acc,Nom|Number=Sing|Strength=Strong | el | ea |
| Case=Acc,Nom|Number=Plur|Strength=Strong | ei | ele |
| Case=Acc|Number=Sing|Strength=Weak | îl | o |
| Case=Acc|Number=Sing|Strength=Weak|Variant=Short | -l, l-, l | -o |
| Case=Acc|Number=Plur|Strength=Weak | îi, i | le |
| Case=Acc|Number=Plur|Strength=Weak|Variant=Short | -i, i- | le-, -le |
| Case=Dat,Gen|Number=Sing|Strength=Strong | lui | ei |
| Case=Dat|Number=Plur|Strength=Weak|Variant=Short | -i |
NUM
940 NUM tokens (17% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (892; 95%), Number=Plur (483; 51%), NumType=Ord (472; 50%).
NUM tokens may have the following values of Gender:
Fem (579; 62% of non-empty Gender): două, prima, doua, primele, milioane, o, ambele, mii, treia, ultimele
Masc (361; 38% of non-empty Gender): primul, doi, doilea, ultimii, un, ultimul, unu, primului, amândoi, prim-
EMPTY (4609): 1, 2, 3, 4, trei, 5, 6, 7, 8, i
| Paradigm doi | Masc | Fem |
|---|---|---|
| Number=Sing|NumType=Ord | doilea, secund | doua |
| Number=Plur|NumType=Card | doi | două |
AUX
632 AUX tokens (7% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (632; 100%), Number=Sing (632; 100%), Person=EMPTY (632; 100%), Tense=EMPTY (632; 100%), VerbForm=Part (632; 100%).
AUX tokens may have the following values of Gender:
Masc (632; 100% of non-empty Gender): fost
EMPTY (7942): a, este, au, sunt, fi, era, va, ar, am, fie
PROPN
322 PROPN tokens (5% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (254; 79% of non-empty Gender): României, Moldovei, Dunării, Europei, Franței, Italiei, Norvegiei, Rusiei, Ungariei, Germaniei
Masc (68; 21% of non-empty Gender): Carpaților, Iașilor, Jiului, Banatul, Iașii, Israelul, Israelului, Aradului, Banatului, Bucureștiului
EMPTY (5563): România, Winston, București, Timișoara, Iași, Ion, Paris, Alexandru, O’Brien, Moldova
Gender seems to be a lexical feature of PROPN: 100% of lemmas (104) occur with only one value of Gender.
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (11788; 95%),
NOUN –[nmod]–> NOUN (9059; 54%),
NOUN –[det]–> DET (8143; 78%),
NOUN –[conj]–> NOUN (2496; 73%),
VERB –[nsubj:pass]–> NOUN (1025; 60%),
ADJ –[conj]–> ADJ (662; 93%),
VERB –[conj]–> VERB (527; 61%),
ADJ –[nsubj]–> NOUN (383; 91%),
VERB –[obl:agent]–> NOUN (346; 51%),
NOUN –[appos]–> NOUN (308; 57%).
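
Agreement counts like those listed above can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is one possible reading, not the method used to produce this page: the function name `gender_agreement`, the dict-based token layout, and the toy sentence are hypothetical, and it assumes that only edges where both nodes have a non-empty Gender are counted.

```python
from collections import Counter


def gender_agreement(sentence):
    """Count agreeing and total edges per (parent UPOS, deprel, child UPOS).

    `sentence` is a list of dicts with keys id, head, upos, deprel, gender,
    where gender is None for tokens without a Gender value.
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, total = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        total[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, total


# Hypothetical toy sentence, not taken from the treebank.
SENTENCE = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "gender": "Fem"},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},
    {"id": 4, "head": 2, "upos": "ADJ", "deprel": "amod", "gender": "Masc"},
]

agree, total = gender_agreement(SENTENCE)
for key in total:
    print(key, agree[key], "of", total[key])
```

Dividing the agreeing count by the total for each key gives a percentage comparable in spirit to the ones shown in the list, subject to the counting assumption named above.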