Treebank Statistics: UD_Bhojpuri-BHTB: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
3957 tokens (59%) have a non-empty value of Gender.
1340 types (80%) occur at least once with a non-empty value of Gender.
1291 lemmas (79%) occur at least once with a non-empty value of Gender.
The feature is used with 14 part-of-speech tags (token count; share of all tokens in the treebank): NOUN (1621; 24%), ADP (571; 9%), VERB (513; 8%), PROPN (347; 5%), AUX (205; 3%), DET (187; 3%), PRON (172; 3%), ADJ (114; 2%), PART (96; 1%), NUM (90; 1%), CCONJ (32; 0%), ADV (4; 0%), INTJ (4; 0%), SCONJ (1; 0%).
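Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: it assumes that a type is a distinct word form and that lemmas come from the LEMMA column. The helper names (`parse_feats`, `read_tokens`, `gender_coverage`) are hypothetical, and `SAMPLE` is a toy fragment whose annotations are invented for illustration.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string ('Gender=Masc|Number=Sing') into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats) for every regular token line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_coverage(tokens):
    """Return (with Gender, total) counts for tokens, types and lemmas."""
    tokens = list(tokens)
    marked = [t for t in tokens if "Gender" in t[3]]
    return {
        "tokens": (len(marked), len(tokens)),
        "types": (len({t[0] for t in marked}), len({t[0] for t in tokens})),
        "lemmas": (len({t[1] for t in marked}), len({t[1] for t in tokens})),
    }


# Toy data only: the annotations below are invented for this example.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "लोग", "लोग", "NOUN", "_", "Gender=Masc|Number=Sing", "0", "root", "_", "_"),
    ("2", "बात", "बात", "NOUN", "_", "Gender=Fem|Number=Sing", "1", "conj", "_", "_"),
    ("3", "में", "में", "ADP", "_", "_", "2", "case", "_", "_"),
    ("4", "लोग", "लोग", "NOUN", "_", "Gender=Masc|Number=Sing", "1", "conj", "_", "_"),
])

for unit, (n, total) in gender_coverage(read_tokens(SAMPLE)).items():
    print(f"{n} of {total} {unit} ({100 * n / total:.0f}%) have a non-empty value of Gender")
```

To run this on a real file, read its contents as UTF-8 text and pass them to `read_tokens` in place of `SAMPLE`.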
NOUN
1621 NOUN tokens (87% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Person=3 (1559; 96%), Number=Sing (1502; 93%), Case=Nom (940; 58%).
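A co-occurrence list like this one amounts to counting every other feature-value pair on the tokens of one part of speech that carry Gender. The percentages are relative to those tokens (for example, 1559 of 1621 is 96%). The sketch below illustrates the idea on invented toy data; the function name `cooccurring_values` and the `(upos, feats)` token representation are assumptions made for this example.

```python
from collections import Counter


def cooccurring_values(tokens, upos, feature="Gender"):
    """List (feature=value, count, percent) for tokens of `upos` that carry `feature`."""
    counts = Counter()
    n = 0  # number of `upos` tokens with a non-empty value of `feature`
    for tag, feats in tokens:
        if tag != upos or feature not in feats:
            continue
        n += 1
        counts.update(
            f"{name}={value}" for name, value in feats.items() if name != feature
        )
    return [(fv, c, round(100 * c / n)) for fv, c in counts.most_common()]


# Toy data only: invented annotations, not taken from the treebank.
TOKENS = [
    ("NOUN", {"Gender": "Masc", "Number": "Sing", "Person": "3"}),
    ("NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("NOUN", {"Number": "Sing"}),  # no Gender, so it is ignored
    ("ADP", {"Gender": "Masc", "AdpType": "Post"}),
]

for fv, count, percent in cooccurring_values(TOKENS, "NOUN"):
    print(f"{fv} ({count}; {percent}%)")
```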
NOUN tokens may have the following values of Gender:
Fem (394; 24% of non-empty Gender): जी, बात, बेर, बिआह, भाषा, अश्लीलता, जय, पत्रिका, स, घेरा
Masc (1227; 76% of non-empty Gender): लोग, देश, रंग, साल, आजु, आदमी, लोगन, साहित्य, कार्यक्रम, विश्वास
EMPTY (233): जब, बिआह, तब, अब, पहिले, उहाँ, कथा, गवनई, चीफ, जहाँ
Paradigm बिआह | Masc | Fem |
---|---|---|
Case=Acc\|Number=Sing | बिआह | बिआह |
Case=Nom\|Number=Sing | बिआह | बिआह |
Case=Nom\|Number=Plur | बिआह | |
Gender seems to be a lexical feature of NOUN. 93% of lemmas (722) occur with only one value of Gender.
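The "lexical feature" heuristic can be illustrated by grouping the observed Gender values by lemma and counting the lemmas seen with exactly one value. This is a minimal sketch under stated assumptions: only lemmas with at least one non-empty Gender value are counted, the function name `single_valued_share` is hypothetical, and `PAIRS` is toy data (बिआह is given both values, as in the paradigm above; the other pairs are invented).

```python
from collections import defaultdict


def single_valued_share(pairs):
    """Return (lemmas with exactly one Gender value, lemmas with any Gender value)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Toy (lemma, Gender) pairs, one per token that has a non-empty Gender.
PAIRS = [
    ("बिआह", "Masc"),
    ("बिआह", "Fem"),
    ("लोग", "Masc"),
    ("लोग", "Masc"),
    ("बात", "Fem"),
]

single, total = single_valued_share(PAIRS)
print(f"{single} of {total} lemmas ({100 * single / total:.0f}%) occur with only one value of Gender")
```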
ADP
571 ADP tokens (58% of all ADP tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADP and Gender co-occurred: AdpType=Post (555; 97%), Number=Sing (474; 83%), Case=Acc (346; 61%).
ADP tokens may have the following values of Gender:
Fem (4; 1% of non-empty Gender): के, खातिर, ले
Masc (567; 99% of non-empty Gender): के, का, वाला, खातिर, ओके, जाके, लेके, उठाके, साथे, हमनीके
EMPTY (418): में, से, पर, के, ले, तबे, अतने, खातिर, का, अपने
Paradigm का | Masc | Fem |
---|---|---|
Case=Acc\|Number=Sing | के | |
Case=Acc\|Number=Plur | के | |
Case=Nom\|Number=Sing | का, के | के |
Case=Nom\|Number=Sing\|Person=3\|Polite=Form | के | |
Case=Nom\|Number=Plur | के | |
Number=Plur\|Person=3 | के | |
Gender seems to be a lexical feature of ADP. 94% of lemmas (33) occur with only one value of Gender.
VERB
513 VERB tokens (67% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Number=Sing (399; 78%), Person=3 (371; 72%), Voice=EMPTY (335; 65%), Aspect=EMPTY (321; 63%), VerbForm=EMPTY (310; 60%).
VERB tokens may have the following values of Gender:
Fem (130; 25% of non-empty Gender): चाहीं, होई, बा, ह, कतहीं, आईं, भइल, आइल, कइलीं, होखी
Masc (383; 75% of non-empty Gender): बा, होखे, भइल, आइल, क, कहल, होला, पहिले, हटे, कइल
EMPTY (254): हो, करे, कर, कहले, ना, बा, लागल, ह, करत, चलि
Paradigm हो | Masc | Fem |
---|---|---|
Aspect=Perf\|Number=Sing\|VerbForm=Part | होखी | |
Aspect=Perf\|Number=Sing\|VerbForm=Part\|Voice=Act | होखी | |
Aspect=Perf\|Number=Plur\|VerbForm=Part | होई | |
Aspect=Perf\|Number=Plur\|VerbForm=Part\|Voice=Act | होखे | |
Case=Acc\|Number=Sing | होखी | |
Case=Acc\|Number=Sing\|Person=3 | होखे | |
Mood=Ind\|Number=Sing\|Person=3\|Tense=Fut\|VerbForm=Fin\|Voice=Act | होई | |
Number=Sing\|Person=3\|VerbForm=Inf\|Voice=Act | होखे | |
Number=Sing\|Person=3\|Voice=Act | हो | |
Number=Sing\|Voice=Act | हो | |
Number=Plur\|Person=3\|Voice=Act | हो | |
Gender seems to be a lexical feature of VERB. 91% of lemmas (195) occur with only one value of Gender.
PROPN
347 PROPN tokens (82% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (345; 99%), Person=3 (344; 99%), Case=Nom (235; 68%).
PROPN tokens may have the following values of Gender:
Fem (122; 35% of non-empty Gender): भोजपुरी, दिल्ली, खाली, भगवती, पाती, प्रियंका, अउरी, कमलेश, चोली, जी
Masc (225; 65% of non-empty Gender): सिंह, जी, प्रियंका, द्विवेदी, प्रसाद, हिन्दुस्तान, उदय, चोपड़ा, प्रकाश, अवधेश
EMPTY (74): पाती, डॉ., पाण्डेय, तिवारी, राय, डा॰, लेखको, मिश्र, अंचल, आंजनेय
Paradigm भोजपुरी | Masc | Fem |
---|---|---|
Case=Acc | भोजपुरी | भोजपुरी |
Case=Nom | भोजपुरी | |
Gender seems to be a lexical feature of PROPN. 96% of lemmas (155) occur with only one value of Gender.
AUX
205 AUX tokens (58% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Number=Sing (192; 94%), Voice=EMPTY (174; 85%), Polite=EMPTY (167; 81%), Person=3 (150; 73%), Aspect=EMPTY (123; 60%), VerbForm=EMPTY (112; 55%), Case=Nom (106; 52%).
AUX tokens may have the following values of Gender:
Fem (76; 37% of non-empty Gender): गइल, बा, जाई, रहीं, रही, रहलीं, कइला, चाहीं, जाला, दिहनी
Masc (129; 63% of non-empty Gender): रहे, रहल, बा, जा, सकेला, गइल, जात, जाला, जाव, सकता
EMPTY (150): बा, जा, जाव, गइल, बानी, बाड़न, हो, दिहलसि, रहल, लागल
Paradigm बा | Masc | Fem |
---|---|---|
Aspect=Perf\|Number=Plur\|VerbForm=Part | बाड़े | |
Case=Nom\|Number=Sing\|Person=3 | बा, बाड़न, बाड़ | बा, बाड़ी |
DET
187 DET tokens (53% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: NumType=EMPTY (187; 100%), PronType=EMPTY (181; 97%), Person=3 (180; 96%), Number=Sing (171; 91%), Case=Nom (163; 87%).
DET tokens may have the following values of Gender:
Fem (57; 30% of non-empty Gender): ई, जवना, अधिका, दूनो, एगो, एही, कवनो
Masc (130; 70% of non-empty Gender): कवनो, एह, अइसन, जवन, ई, सभे, ओह, कतना, कुछु, आजु
EMPTY (166): एह, ओह, कुछ, ओकर, ई, हर, अब, एही, आजु, ईहो
Paradigm ई | Masc | Fem |
---|---|---|
| | ई | ई |
Gender seems to be a lexical feature of DET. 93% of lemmas (41) occur with only one value of Gender.
PRON
172 PRON tokens (51% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (161; 94%), Aspect=EMPTY (160; 93%), VerbForm=EMPTY (160; 93%), Case=Nom (143; 83%), PronType=EMPTY (129; 75%), Person=3 (121; 70%).
PRON tokens may have the following values of Gender:
Fem (48; 28% of non-empty Gender): हमरा, हमनी, उनुका, किडनी, जानतानी, रउवा, ससुरा, खानी, जवना, जापानी
Masc (124; 72% of non-empty Gender): अपना, आपन, रउरा, हमार, ऊ, बिना, जे, केहू, हमरा, ईहे
EMPTY (163): ओकरा, हम, ऊ, केहूँ, काहे, कइसे, आम, ऊहो, एकरा, कहीं
Paradigm हमर | Masc | Fem |
---|---|---|
Case=Acc,Gen\|Person=1\|Poss=Yes\|PronType=Prs | हमरा | |
Case=Acc\|Number=Sing\|Person=3 | हमरा | |
Case=Nom\|Number=Sing\|Person=3 | हमरा | |
ADJ
114 ADJ tokens (46% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (107; 94%), Person=3 (97; 85%), Case=Nom (96; 84%).
ADJ tokens may have the following values of Gender:
Fem (18; 16% of non-empty Gender): तरह, नवका, सतरह, जापानी, बेहूदा, ब्रेकिंग, भाग, भोजपुरियन, भोजपुरिया, भोजपुरी
Masc (96; 84% of non-empty Gender): पूरा, बड़, छोट, बड़हन, अढ़ाई, अश्लील, आधा, खुलल, खेलल, निडिलवाला
EMPTY (135): सांस्कृतिक, तथाकथित, प, खास, चुपचाप, जरूरी, आखिरी, आसान, काव्य, सहज
Paradigm बेहूदा | Masc | Fem |
---|---|---|
Case=Acc | बेहूदा | |
Case=Nom\|Person=3 | बेहूदा | |
Gender seems to be a lexical feature of ADJ. 99% of lemmas (70) occur with only one value of Gender.
PART
96 PART tokens (50% of all PART tokens) have a non-empty value of Gender.
The most frequent other feature values with which PART and Gender co-occurred: Person=3 (91; 95%), Case=Nom (87; 91%), Number=Sing (80; 83%).
PART tokens may have the following values of Gender:
Fem (17; 18% of non-empty Gender): त, बस, ना, अतना, खाली, जादा, नइखी, नाहीं, पास
Masc (79; 82% of non-empty Gender): त, नइखे, ना, बहुते, गमगमावे, घटना, अलावे, तिकवते, वां, विस्तार
EMPTY (96): ना, त, नइखे, भर, ढेर, तनिको, बनवले, बिना, भी, सँ
Paradigm त | Masc | Fem |
---|---|---|
Number=Sing | त | त |
Number=Plur | त | |
NUM
90 NUM tokens (60% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=EMPTY (90; 100%), Case=Nom (85; 94%), Person=3 (85; 94%), Number=Sing (70; 78%).
NUM tokens may have the following values of Gender:
Fem (15; 17% of non-empty Gender): गो, दोसरो, एगो, चार, चारो, जोड़ी, तिसरका, दोसरका, शृंगार
Masc (75; 83% of non-empty Gender): एगो, लोग, दू, 5, कलिग, छठवां, दोसर, दोसरा, सिलसिला, 2
EMPTY (59): एक, कुछ, अनकस, बाकि, 12, 120, 2011, 75, आठ, एगो
Paradigm एगो | Masc | Fem |
---|---|---|
| | एगो | एगो |
Gender seems to be a lexical feature of NUM. 93% of lemmas (25) occur with only one value of Gender.
CCONJ
32 CCONJ tokens (21% of all CCONJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which CCONJ and Gender co-occurred: Case=Nom (30; 94%), Number=Sing (30; 94%), Person=3 (30; 94%).
CCONJ tokens may have the following values of Gender:
Fem (2; 6% of non-empty Gender): आ, रउँआ
Masc (30; 94% of non-empty Gender): बाकिर, अउर, भा, राउर, आखिर, खम्भा, आउर
EMPTY (119): आ, फगुआ, बाकिर, अउर, आउर, खैर, बलुक, सचहूं
ADV
4 ADV tokens (13% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Number=Sing (3; 75%).
ADV tokens may have the following values of Gender:
Fem (1; 25% of non-empty Gender): जल्दी
Masc (3; 75% of non-empty Gender): आजुओ, नाहिंए, शुरू
EMPTY (27): जइसे, हिन्दी, गद्य, ललित, सभ्य, आनन्द, आसानी, जरूर, जल्दी, जसहीं
INTJ
4 INTJ tokens (80% of all INTJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which INTJ and Gender co-occurred: Case=Acc (4; 100%), Number=Sing (4; 100%).
INTJ tokens may have the following values of Gender:
Masc (4; 100% of non-empty Gender): गहरे, अरे, दोसरे
EMPTY (1): अजी
SCONJ
1 SCONJ token (1% of all SCONJ tokens) has a non-empty value of Gender.
SCONJ tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): तकले
EMPTY (117): कि, त, काहेंकि, निकलि, बाकि, लपकि, आँखि, कोच्चि, प्रवृत्ति
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
VERB –[compound]–> NOUN (173; 51%),
NOUN –[nmod]–> NOUN (164; 56%),
NOUN –[compound]–> NOUN (156; 64%),
PROPN –[compound]–> PROPN (78; 54%),
NOUN –[compound]–> DET (54; 57%),
NOUN –[compound]–> NUM (35; 61%),
NOUN –[compound]–> ADJ (33; 62%),
NOUN –[nmod]–> VERB (23; 72%),
NOUN –[obl]–> NOUN (19; 66%),
NOUN –[conj]–> NOUN (17; 65%).
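Agreement counts like these can be sketched by walking every parent-child edge of the dependency trees and comparing the two Gender values. The example below is a hedged illustration, not the script behind this page. It assumes that the percentage is taken over the edges of the same pattern where both nodes carry Gender, which this page does not state. The function name `gender_agreement` and the token dictionaries are hypothetical, and the toy sentence is invented.

```python
from collections import Counter


def gender_agreement(sentences):
    """Map (parent UPOS, deprel, child UPOS) to (agreeing edges, edges with Gender on both nodes)."""
    agree, both = Counter(), Counter()
    for sentence in sentences:
        by_id = {tok["id"]: tok for tok in sentence}
        for child in sentence:
            parent = by_id.get(child["head"])  # None for the root (head 0)
            if parent is None:
                continue
            parent_gender = parent["feats"].get("Gender")
            child_gender = child["feats"].get("Gender")
            if parent_gender is None or child_gender is None:
                continue  # assumption: edges without Gender on both nodes are ignored
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if parent_gender == child_gender:
                agree[key] += 1
    return {key: (agree[key], both[key]) for key in both}


# Toy sentence only: the tree and the features are invented for this example.
SENTENCE = [
    {"id": 1, "upos": "NOUN", "head": 3, "deprel": "nmod", "feats": {"Gender": "Masc"}},
    {"id": 2, "upos": "ADP", "head": 1, "deprel": "case", "feats": {}},
    {"id": 3, "upos": "NOUN", "head": 0, "deprel": "root", "feats": {"Gender": "Masc"}},
    {"id": 4, "upos": "NOUN", "head": 3, "deprel": "conj", "feats": {"Gender": "Fem"}},
]

for (parent, deprel, child), (n, total) in gender_agreement([SENTENCE]).items():
    print(f"{parent} -[{deprel}]-> {child} ({n}; {100 * n / total:.0f}%)")
```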