Treebank Statistics: UD_Irish-IDT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
Some words have combined values of the feature; 1 combination has been observed: Fem|Masc.
42712 tokens (37%) have a non-empty value of Gender.
11486 types (77%) occur at least once with a non-empty value of Gender.
6390 lemmas (72%) occur at least once with a non-empty value of Gender.
The feature is used with 7 part-of-speech tags: NOUN (28549; 25% of instances), PROPN (4694; 4% of instances), ADJ (3503; 3% of instances), DET (2428; 2% of instances), ADP (1794; 2% of instances), PRON (1735; 1% of instances), AUX (9; 0% of instances).
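Counts of this kind can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page: the function names and the three-token sample are hypothetical, and it assumes that types are counted as case-sensitive word forms and that multiword-token ranges and empty nodes are skipped.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Masc' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def gender_coverage(conllu_text):
    """Count tokens, word forms and lemmas that carry a non-empty Gender value."""
    total = tokens = 0
    types, lemmas = set(), set()
    by_upos = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences; '#' lines are comments
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        total += 1
        if "Gender" in parse_feats(cols[5]):
            tokens += 1
            types.add(cols[1])    # FORM (assumption: case-sensitive)
            lemmas.add(cols[2])   # LEMMA
            by_upos[cols[3]] += 1  # UPOS
    return {
        "total": total,
        "tokens": tokens,
        "types": len(types),
        "lemmas": len(lemmas),
        "by_upos": by_upos,
    }


# Hypothetical sample built from word forms listed on this page.
SAMPLE_ROWS = [
    ("1", "an", "an", "DET", "_", "Definite=Def|Number=Sing|PronType=Art", "2", "det", "_", "_"),
    ("2", "bás", "bás", "NOUN", "_", "Case=Nom|Definite=Def|Gender=Masc|Number=Sing", "0", "root", "_", "_"),
    ("3", "mór", "mór", "ADJ", "_", "Case=Nom|Gender=Masc|Number=Sing", "2", "amod", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

stats = gender_coverage(SAMPLE)
print(stats["tokens"], "of", stats["total"], "tokens have Gender")
print(dict(stats["by_upos"]))
```

Dividing `tokens` by `total` gives the token percentage; the type and lemma percentages would need the corresponding totals over the whole treebank, which this sketch does not track.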
NOUN
28549 NOUN tokens (85% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: VerbForm=EMPTY (28549; 100%), Number=Sing (23166; 81%), Case=Nom (21926; 77%), Form=EMPTY (20583; 72%), Definite=EMPTY (15469; 54%).
NOUN tokens may have the following values of Gender:
Fem (10889; 38% of non-empty Gender): chuid, réir, leith, bhfeidhm, bliana, cathrach, bhliain, chomhairle, comhairle, bliain
Masc (17660; 62% of non-empty Gender): duine, chéile, daoine, rud, cinn, ábhar, lá, údarás, pobail, am
EMPTY (5093): chur, dhéanamh, fáil, bheith, féidir, thabhairt, éis, dul, dtí, cur
| Paradigm bás | Masc | Fem |
|---|---|---|
| Case=Gen\|Definite=Def\|Form=Ecl\|Number=Sing | mbáis | |
| Case=Gen\|Definite=Def\|Form=Len\|Number=Sing | bháis | |
| Case=Gen\|Form=Len\|Number=Sing | bháis | |
| Case=Gen\|NounType=Strong\|Number=Plur | Básanna | |
| Case=Gen\|Number=Sing | báis | |
| Case=Nom\|Definite=Def\|Form=Ecl\|Number=Sing | mbás | |
| Case=Nom\|Definite=Def\|Form=Len\|Number=Sing | bhás | |
| Case=Nom\|Definite=Def\|Number=Sing | bás, b(h)ás | |
| Case=Nom\|Definite=Def\|Number=Sing\|Typo=Yes | bas | |
| Case=Nom\|Form=Len\|Number=Sing | bhás | |
| Case=Nom\|Form=Len\|Number=Plur | bhásanna | |
| Case=Nom\|Number=Sing | bás | |
| Case=Nom\|Number=Plur | básanna | |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (4216) occur with only one value of Gender.
PROPN
4694 PROPN tokens (86% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Definite=Def (4577; 98%), Number=Sing (4336; 92%), Form=EMPTY (3389; 72%).
PROPN tokens may have the following values of Gender:
Fem (1890; 40% of non-empty Gender): cliath, Gaeltachta, Gaeilge, hÉireann, Ghaeltacht, Éirinn, Ghaeilge, hEorpa, Ghaeltachta, Éire
Masc (2804; 60% of non-empty Gender): Átha, Bhaile, Baile, Seán, mBaile, Béarla, Fómhair, Pádraig, Dhún, nGall
EMPTY (784): AE, Dé, UNESCO, AIE, BCE, TG4, MABS, RTÉ, Gcom, Irish
| Paradigm Ciarraí | Masc | Fem |
|---|---|---|
| Case=Gen\|Form=Len | Chiarraí | |
| Case=Nom\|Form=Len | Chiarraí | Chiarraí |
| Case=Nom | Ciarraí | |
| Form=Len | Chiarraí | |
| | Ciarraí | |
Gender seems to be a lexical feature of PROPN: 98% of lemmas (1572) occur with only one value of Gender.
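The "lexical feature" observation for NOUN and PROPN rests on counting how many lemmas are seen with exactly one Gender value. A minimal sketch of that check follows; the function name and the sample pairs are hypothetical, and it assumes that the input has already been restricted to tokens of one part of speech with a non-empty Gender, with a combined value treated as its own distinct string.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas with exactly one Gender value, all lemmas seen).

    `pairs` is an iterable of (lemma, gender) tuples for tokens of a single
    part of speech whose Gender feature is non-empty.
    """
    values_by_lemma = defaultdict(set)
    for lemma, gender in pairs:
        values_by_lemma[lemma].add(gender)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


# Hypothetical sample: 'Ciarraí' appears with both values, as in the paradigm above.
sample_pairs = [
    ("Baile", "Masc"),
    ("Baile", "Masc"),
    ("Gaeilge", "Fem"),
    ("Ciarraí", "Masc"),
    ("Ciarraí", "Fem"),
]

single, total = single_gender_share(sample_pairs)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A share close to 100% suggests that the value belongs to the lemma rather than to the individual word form, which is what the page reports for both NOUN and PROPN.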
ADJ
3503 ADJ tokens (54% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=EMPTY (3503; 100%), VerbForm=EMPTY (3503; 100%), Case=Nom (3094; 88%), Form=EMPTY (2895; 83%), NounType=EMPTY (2437; 70%), Number=Sing (2428; 69%).
ADJ tokens may have the following values of Gender:
Fem (1270; 36% of non-empty Gender): nua, náisiúnta, mhór, éagsúla, poiblí, amháin, idirnáisiúnta, mhaith, ildánach, chultúrtha
Masc (2233; 64% of non-empty Gender): nua, mór, amháin, sibhialta, céanna, náisiúnta, áirithe, áitiúil, Eorpach, áitiúla
EMPTY (3015): maith, mó, mór, déanta, amháin, léir, háirithe, curtha, fearr, chóir
| Paradigm mór | Masc | Fem |
|---|---|---|
| Case=Gen\|Form=Len\|Number=Sing | mhóir | |
| Case=Gen\|NounType=Strong\|Number=Plur | móra | móra |
| Case=Gen\|NounType=Weak\|Number=Plur | mór | |
| Case=Gen\|Number=Sing | Móir | Móire, Móir |
| Case=Nom\|Form=Len\|NounType=NotSlender\|Number=Plur | mhóra | |
| Case=Nom\|Form=Len\|NounType=Slender\|Number=Plur | mhóra | |
| Case=Nom\|Form=Len\|Number=Sing | mhór | mhór |
| Case=Nom\|NounType=NotSlender\|Number=Plur | móra | móra |
| Case=Nom\|Number=Sing | mór | mór |
| Number=Sing | Mór | |
DET
2428 DET tokens (24% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (2428; 100%), Case=Gen (1896; 78%), Definite=Def (1896; 78%), Person=EMPTY (1896; 78%), Poss=EMPTY (1896; 78%), PronType=Art (1896; 78%).
DET tokens may have the following values of Gender:
Fem (1110; 46% of non-empty Gender): na, a, ‘na, n-a
Fem,Masc (13; 1% of non-empty Gender): a
Masc (1305; 54% of non-empty Gender): an, a, a’
EMPTY (7856): an, na, seo, sin, eile, aon, a, gach, do, mo
| Paradigm a | Fem,Masc | Masc | Fem |
|---|---|---|---|
| Form=Ecl | | | n-a |
| | a | a | a |
ADP
1794 ADP tokens (10% of all ADP tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADP and Gender co-occurred: Number=Sing (1794; 100%), Person=3 (1794; 100%), PronType=EMPTY (1643; 92%).
ADP tokens may have the following values of Gender:
Fem (337; 19% of non-empty Gender): uirthi, di, ina, aici, á, léi, dá, inti, lena, chuici
Fem,Masc (8; 0% of non-empty Gender): ina, dá, lena
Masc (1449; 81% of non-empty Gender): ann, ina, leis, air, á, aige, dá, dó, lena, chuige
EMPTY (16479): ar, i, ag, le, de, sa, chun, do, leis, in
| Paradigm i | Fem,Masc | Masc | Fem |
|---|---|---|---|
| | | ann | inti |
| Poss=Yes | ina | ina, 'na, na | ina |
| Typo=Yes | an |
PRON
1735 PRON tokens (48% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1735; 100%), Person=3 (1735; 100%), PronType=EMPTY (1701; 98%).
PRON tokens may have the following values of Gender:
Fem (350; 20% of non-empty Gender): sí, í, sise, ise, hí
Masc (1385; 80% of non-empty Gender): sé, é, seisean, hé, eisean, éard, se
EMPTY (1884): sin, féin, iad, siad, mé, seo, tú, muid, cén, siúd
AUX
9 AUX tokens (1% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Form=EMPTY (9; 100%), Polarity=EMPTY (9; 100%), Tense=Pres (9; 100%), VerbForm=Cop (9; 100%).
AUX tokens may have the following values of Gender:
Masc (9; 100% of non-empty Gender): sé
EMPTY (1548): is, ba, ní, gur, b’, nach, ar, gurb, nár, an
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[nmod]–> NOUN (4103; 51%),
NOUN –[amod]–> ADJ (3248; 88%),
NOUN –[conj]–> NOUN (1227; 56%),
PROPN –[det]–> DET (497; 62%),
PROPN –[flat:name]–> PROPN (315; 78%),
PROPN –[conj]–> PROPN (131; 56%),
PROPN –[amod]–> ADJ (108; 90%),
NOUN –[compound]–> NOUN (101; 64%),
ADJ –[conj]–> ADJ (96; 97%),
NOUN –[appos]–> NOUN (92; 59%).
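Agreement counts like these can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is an illustration only: the token dictionaries, the function name and the three-word sample are hypothetical, and it assumes that the percentage is taken over edges of the same type where both nodes have a non-empty Gender, which this page does not state.

```python
from collections import Counter


def agreement_by_relation(sentence):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing edges, comparable edges).

    `sentence` is a list of dicts with the keys id, head, upos, deprel and
    gender (None when the token has no Gender feature).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, comparable = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # head 0 is the root and has no parent
        if parent is None or child["gender"] is None or parent["gender"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        comparable[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return {key: (agree[key], comparable[key]) for key in comparable}


# Hypothetical sentence assembled from word forms listed on this page.
sample_sentence = [
    {"id": 1, "head": 0, "upos": "NOUN", "deprel": "root", "gender": "Fem"},   # bliain
    {"id": 2, "head": 1, "upos": "ADJ", "deprel": "amod", "gender": "Fem"},    # mhór
    {"id": 3, "head": 1, "upos": "NOUN", "deprel": "nmod", "gender": "Masc"},  # báis
]

for (parent, rel, child), (hits, seen) in agreement_by_relation(sample_sentence).items():
    print(f"{parent} -[{rel}]-> {child}: {hits} of {seen} agree")
```

Under this reading, a high rate for `amod` reflects adjective-noun concord, while a rate near 50% for `nmod` or `conj` is roughly what two independently gendered nouns would produce.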