This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-PUD: Features: Gender

This feature is universal. It occurs with 2 different values: Fem, Masc.

12007 tokens (58%) have a non-empty value of Gender. 6145 types (90%) occur at least once with a non-empty value of Gender. 4070 lemmas (89%) occur at least once with a non-empty value of Gender. The feature is used with 7 part-of-speech tags: NOUN (5493; 26% of all tokens), ADJ (1940; 9%), VERB (1707; 8%), PROPN (1493; 7%), PRON (1146; 6%), AUX (150; 1%), NUM (78; 0%).
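
Counts of this kind can be recomputed from a treebank's CoNLL-U file. The following is a minimal sketch, not the script that generated this page. The names `parse_feats` and `feature_coverage` and the toy rows are illustrative assumptions; only the CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS, FEATS, ...) comes from the format itself. It also assumes the per-tag percentages are taken over all tokens.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Gen|Gender=Masc' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_coverage(conllu_text, feature="Gender"):
    """Return (total tokens, Counter of UPOS tags for tokens carrying `feature`)."""
    total = 0
    by_upos = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank separator lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        total += 1
        if feature in parse_feats(cols[5]):
            by_upos[cols[3]] += 1
    return total, by_upos


# Toy rows, invented for illustration (not taken from UD_Arabic-PUD).
TOY_ROWS = [
    ["1", "w1", "l1", "NOUN", "_", "Case=Gen|Gender=Masc|Number=Sing", "2", "nsubj", "_", "_"],
    ["2", "w2", "l2", "VERB", "_", "Gender=Masc|Number=Sing|Person=3", "0", "root", "_", "_"],
    ["3", "w3", "l3", "ADP", "_", "_", "4", "case", "_", "_"],
    ["4", "w4", "l4", "NOUN", "_", "Case=Gen|Gender=Fem|Number=Sing", "2", "obl", "_", "_"],
]
TOY_CONLLU = "\n".join("\t".join(row) for row in TOY_ROWS)

total, by_upos = feature_coverage(TOY_CONLLU)
covered = sum(by_upos.values())
print(f"{covered} of {total} tokens ({covered / total:.0%}) have Gender")
for upos, count in by_upos.most_common():
    print(f"{upos}: {count} ({count / total:.0%} of all tokens)")
```

To run it on a real treebank, read the `.conllu` file into a string and pass it to `feature_coverage` in place of `TOY_CONLLU`.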

NOUN

5493 NOUN tokens (98% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Animacy=Nhum (4765; 87%), Definite=Def (4233; 77%), Number=Sing (3967; 72%), Case=Gen (3795; 69%).
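
A co-occurrence list like the one above can be sketched as a simple tally. This is an illustrative assumption about the procedure, not the page's own code: `cooccurring_values` is a hypothetical helper, and the toy tokens are invented. Each token is represented as a `(UPOS, feature dict)` pair.

```python
from collections import Counter


def cooccurring_values(tokens, upos, feature="Gender"):
    """Count other Name=Value pairs on `upos` tokens that carry `feature`.

    Returns (number of matching tokens, Counter keyed by 'Name=Value').
    """
    matching = 0
    counts = Counter()
    for tok_upos, feats in tokens:
        if tok_upos != upos or feature not in feats:
            continue
        matching += 1
        for name, value in feats.items():
            if name != feature:
                counts[f"{name}={value}"] += 1
    return matching, counts


# Toy tokens, invented for illustration (not taken from UD_Arabic-PUD).
toy_tokens = [
    ("NOUN", {"Gender": "Masc", "Number": "Sing", "Case": "Gen"}),
    ("NOUN", {"Gender": "Fem", "Number": "Sing", "Case": "Nom"}),
    ("NOUN", {"Gender": "Masc", "Number": "Plur", "Case": "Gen"}),
    ("NOUN", {"Number": "Sing"}),  # no Gender, so not counted
    ("ADJ", {"Gender": "Masc", "Number": "Sing"}),  # different tag
]

matching, counts = cooccurring_values(toy_tokens, "NOUN")
for pair, count in counts.most_common():
    print(f"{pair} ({count}; {count / matching:.0%})")
```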

NOUN tokens may have the following values of Gender:

Paradigm ra}iys_1, forms by Gender value:
Case=Acc|Definite=Ind|Number=Sing: Masc رئيساً
Case=Gen|Definite=Def|Number=Sing: Masc الرئيس, رئيس
Case=Gen|Definite=Def|Number=Dual: Masc الرئيسين
Case=Gen|Definite=Ind|Number=Sing: Masc رئيسٍ, رئيس; Fem رئيسةٍ
Case=Nom|Definite=Def|Number=Sing: Masc رئيس, الرئيس; Fem رئيسة
Case=Nom|Definite=Ind|Number=Plur: Masc رؤساء

Gender seems to be a lexical feature of NOUN: 97% of lemmas (1960) occur with only one value of Gender.
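
The lexical-feature check amounts to grouping tokens by lemma and counting how many lemmas are seen with a single Gender value. A minimal sketch, assuming `(lemma, value)` pairs have already been extracted for one part of speech; `single_value_share` and the toy pairs are hypothetical and not part of the UD tooling.

```python
from collections import defaultdict


def single_value_share(pairs):
    """Return (lemmas with exactly one value, all lemmas seen with a value)."""
    values_by_lemma = defaultdict(set)
    for lemma, value in pairs:
        values_by_lemma[lemma].add(value)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


# Toy (lemma, Gender) pairs, invented for illustration.
toy_pairs = [
    ("l1", "Masc"), ("l1", "Masc"),
    ("l2", "Fem"),
    ("l3", "Masc"), ("l3", "Fem"),  # this lemma occurs with both values
]

single, total = single_value_share(toy_pairs)
print(f"{single / total:.0%} of lemmas ({single}) occur with only one value")
```

A high share suggests the value is fixed per lemma (lexical) rather than chosen by inflection.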

ADJ

1940 ADJ tokens (96% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1865; 96%), Definite=Def (1215; 63%), Case=Gen (1183; 61%).

ADJ tokens may have the following values of Gender:

Paradigm >aw~al_2, forms by Gender value:
Case=Acc|Definite=Def|Number=Sing: Masc الأول; Fem الأولى
Case=Acc|Definite=Ind|Number=Sing: Masc أول; Fem أولى
Case=Gen|Definite=Def|Number=Sing: Masc الأول; Fem الأولى
Case=Gen|Definite=Def|Number=Plur: Masc أوائل
Case=Gen|Definite=Ind|Number=Sing: Masc أول; Fem أولى
Case=Nom|Definite=Def|Number=Sing: Masc الأول, أول; Fem أولى, الأولى
Case=Nom|Definite=Ind|Number=Plur: أولى

VERB

1707 VERB tokens (96% of all VERB tokens) have a non-empty value of Gender.

The most frequent other feature values with which VERB and Gender co-occurred: Person=3 (1692; 99%), Number=Sing (1592; 93%), Voice=Act (1531; 90%), Tense=Past (919; 54%), Mood=EMPTY (878; 51%), Aspect=Perf (873; 51%).

VERB tokens may have the following values of Gender:

Paradigm kAn-u_1, forms by Gender value:
Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Fut: Masc يكون; Fem تكون
Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Pres: Masc يكون; Fem تكون, تكن
Aspect=Imp|Mood=Ind|Number=Plur|Person=3|Tense=Pres: Masc يكونون
Aspect=Imp|Mood=Jus|Number=Sing|Person=3|Tense=Past: Masc يكن; Fem تكن
Aspect=Imp|Mood=Sub|Number=Sing|Person=3|Tense=Fut: يكون, تكون
Aspect=Imp|Mood=Sub|Number=Sing|Person=3|Tense=Pres: Masc يكون; Fem تكون
Aspect=Perf|Number=Sing|Person=2|Tense=Past: كنت
Aspect=Perf|Number=Sing|Person=3|Tense=Past: Masc كان; Fem كانت
Aspect=Perf|Number=Dual|Person=3|Tense=Past: Fem كانتا
Aspect=Perf|Number=Plur|Person=3|Tense=Past: Masc كانوا

PROPN

1493 PROPN tokens (87% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (1430; 96%), Definite=EMPTY (1174; 79%), Case=EMPTY (1005; 67%), Animacy=Nhum (855; 57%).

PROPN tokens may have the following values of Gender:

Paradigm bikiyn_1, forms by Gender value:
Animacy=Hum: Masc بكين; Fem بكين
Animacy=Nhum: بكين

Gender seems to be a lexical feature of PROPN: 97% of lemmas (901) occur with only one value of Gender.

PRON

1146 PRON tokens (88% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (1013; 88%), Case=Gen (766; 67%), Person=3 (736; 64%).

PRON tokens may have the following values of Gender:

Paradigm Al~a*iy_1, forms by Gender value:
Case=Acc|Number=Sing: Masc الذي; Fem التي
Case=Acc|Number=Plur: Masc الذين
Case=Gen|Number=Sing: Masc الذي; Fem التي
Case=Gen|Number=Dual: Masc اللذين
Case=Gen|Number=Plur: Masc الذين
Case=Nom|Number=Sing: Masc الذي; Fem التي
Case=Nom|Number=Sing|Person=3: Fem التي
Case=Nom|Number=Dual: Masc اللذان
Case=Nom|Number=Plur: Masc الذين

AUX

150 AUX tokens (98% of all AUX tokens) have a non-empty value of Gender.

The most frequent other feature values with which AUX and Gender co-occurred: Number=Sing (141; 94%), Person=3 (141; 94%), Voice=Act (135; 90%), Mood=EMPTY (101; 67%), Tense=Past (99; 66%), Aspect=Perf (92; 61%).

AUX tokens may have the following values of Gender:

Paradigm kAn-u_1, forms by Gender value:
Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Fut: Masc يكون
Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Pres: Masc يكون; Fem تكون
Aspect=Imp|Mood=Jus|Number=Sing|Person=3|Tense=Past: Masc يكن; Fem تكن
Aspect=Imp|Mood=Sub|Number=Sing|Person=3|Tense=Pres: Masc يكون; Fem تكون
Aspect=Perf|Number=Sing|Person=2|Tense=Past: كنت
Aspect=Perf|Number=Sing|Person=3|Tense=Past: Masc كان; Fem كانت
Aspect=Perf|Number=Plur|Person=3|Tense=Past: Masc كانوا

NUM

78 NUM tokens (21% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: Number=Plur (73; 94%), Case=Gen (44; 56%).

NUM tokens may have the following values of Gender:

Paradigm valAv_1, forms by Gender value:
Case=Acc: Masc ثلاث; Fem الثلاثة
Case=Gen: Masc ثلاث; Fem ثلاثة
Case=Nom: Masc ثلاث; Fem ثلاثة

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[amod]–> ADJ (1109; 81%), NOUN –[nmod]–> NOUN (1015; 54%), VERB –[nsubj]–> NOUN (525; 85%), PROPN –[amod]–> ADJ (235; 98%), VERB –[obj]–> NOUN (218; 52%), NOUN –[conj]–> NOUN (176; 66%), VERB –[nsubj]–> PRON (172; 98%), VERB –[nsubj]–> PROPN (169; 91%), NOUN –[acl:relcl]–> VERB (153; 71%), PROPN –[flat]–> PROPN (137; 75%).
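
Agreement counts of this kind can be sketched by walking each dependency edge and comparing the Gender value of parent and child. This is a minimal illustration under stated assumptions, not the page's own script: `agreement_by_relation` is a hypothetical helper, the toy sentences are invented, and the percentage is assumed to be taken over edges of the same type where both nodes carry a Gender value (the page does not say how the denominator is defined).

```python
from collections import Counter


def agreement_by_relation(sentences, feature="Gender"):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing edges, share)."""
    agree = Counter()
    comparable = Counter()
    for sentence in sentences:
        by_id = {tok["id"]: tok for tok in sentence}
        for child in sentence:
            parent = by_id.get(child["head"])
            if parent is None:  # head 0 is the artificial root
                continue
            parent_value = parent["feats"].get(feature)
            child_value = child["feats"].get(feature)
            if parent_value is None or child_value is None:
                continue  # agreement is undefined without both values
            key = (parent["upos"], child["deprel"], child["upos"])
            comparable[key] += 1
            if parent_value == child_value:
                agree[key] += 1
    return {key: (agree[key], agree[key] / comparable[key]) for key in comparable}


# Toy sentences, invented for illustration (not taken from UD_Arabic-PUD).
toy_sentences = [
    [
        {"id": 1, "head": 0, "upos": "NOUN", "deprel": "root", "feats": {"Gender": "Masc"}},
        {"id": 2, "head": 1, "upos": "ADJ", "deprel": "amod", "feats": {"Gender": "Masc"}},
        {"id": 3, "head": 1, "upos": "ADJ", "deprel": "amod", "feats": {"Gender": "Fem"}},
        {"id": 4, "head": 1, "upos": "ADP", "deprel": "case", "feats": {}},
    ],
    [
        {"id": 1, "head": 0, "upos": "VERB", "deprel": "root", "feats": {"Gender": "Fem"}},
        {"id": 2, "head": 1, "upos": "NOUN", "deprel": "nsubj", "feats": {"Gender": "Fem"}},
    ],
]

result = agreement_by_relation(toy_sentences)
for (parent, rel, child), (count, share) in result.items():
    print(f"{parent} -[{rel}]-> {child} ({count}; {share:.0%})")
```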