Treebank Statistics: UD_Pashto-Sikaram: Features: Case
This feature is universal.
It occurs with 6 different values: Abl, Acc, Gen, Loc, Nom, Voc.
1537 tokens (61%) have a non-empty value of Case.
645 types (80%) occur at least once with a non-empty value of Case.
536 lemmas (84%) occur at least once with a non-empty value of Case.
The feature is used with 10 part-of-speech tags: NOUN (563; 22% instances), ADP (392; 16% instances), ADJ (261; 10% instances), PROPN (89; 4% instances), DET (88; 3% instances), VERB (57; 2% instances), PRON (48; 2% instances), NUM (30; 1% instances), AUX (8; 0% instances), ADV (1; 0% instances).
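Counts like the ones above can be derived from a CoNLL-U file by tallying the tokens whose FEATS column contains Case. The following is a minimal sketch, not the script that generated this page. The `SAMPLE` fragment is hypothetical toy data, the helper names (`parse_feats`, `iter_tokens`, `case_counts`) are illustrative, and the percentage is assumed to be the share of all tokens in the data.

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical data, not taken from the treebank).
SAMPLE = """\
1\tژبه\tژبه\tNOUN\t_\tCase=Nom|Gender=Fem|Number=Sing\t2\tnsubj\t_\t_
2\tده\tدى\tAUX\t_\t_\t0\troot\t_\t_

1\tپه\tپه\tADP\t_\tCase=Loc\t2\tcase\t_\t_
2\tژبه\tژبه\tNOUN\t_\tCase=Loc|Gender=Fem|Number=Sing\t0\troot\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def iter_tokens(conllu_text):
    """Yield (form, lemma, upos, feats) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def case_counts(conllu_text):
    """Return (total token count, Counter of tokens with Case per UPOS)."""
    total = 0
    with_case = Counter()
    for _form, _lemma, upos, feats in iter_tokens(conllu_text):
        total += 1
        if "Case" in feats:
            with_case[upos] += 1
    return total, with_case


total, with_case = case_counts(SAMPLE)
for upos, n in with_case.most_common():
    # Assumption: the percentage is relative to all tokens in the data.
    print(f"{upos} ({n}; {round(100 * n / total)}% instances)")
```

To run this on real data, read the treebank's `.conllu` file into a string and pass it to `case_counts` in place of `SAMPLE`.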
NOUN
563 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Case.
The most frequent other feature values with which NOUN and Case co-occurred: Number=Sing (396; 70%), Gender=Masc (302; 54%).
NOUN tokens may have the following values of Case:
Abl (22; 4% of non-empty Case): خوا, مخې, اړخه, کبله, امله, اړخونو, خلکو, دمه, دوده, لاسه
Acc (166; 29% of non-empty Case): ژباړې, ژبې, ژبو, خلکو, هېوادونو, کتابونو, ارزښتونو, خبرو, دودونو, ملتونو
Loc (86; 15% of non-empty Case): ژبه, برخه, توګه, سیمه, وخت, ژوند, بېخبرۍ, خوا, نړۍ, وده
Nom (289; 51% of non-empty Case): ژبه, خبرې, اثر, ارزښت, دود, ستونزه, وده, چاپ, کتابونه, ارزښتونه
Paradigm ژبه | Nom | Acc | Loc |
---|---|---|---|
Number=Sing | ژبه | ژبې | ژبه, ژبې |
Number=Plur | ژبې | ژبو | ژبو |
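A paradigm table of this kind can be assembled by grouping the observed forms of one lemma by their Case value and by their remaining features. The sketch below is a hedged illustration of that idea: the `TOKENS` list simply restates the cells of the ژبه table above, `build_paradigm` is a hypothetical helper, and using all remaining features as the row label is an assumption about how rows are chosen.

```python
from collections import defaultdict

# (form, lemma, feats) triples restating the cells of the paradigm table above.
TOKENS = [
    ("ژبه", "ژبه", {"Case": "Nom", "Number": "Sing"}),
    ("ژبې", "ژبه", {"Case": "Acc", "Number": "Sing"}),
    ("ژبه", "ژبه", {"Case": "Loc", "Number": "Sing"}),
    ("ژبې", "ژبه", {"Case": "Loc", "Number": "Sing"}),
    ("ژبې", "ژبه", {"Case": "Nom", "Number": "Plur"}),
    ("ژبو", "ژبه", {"Case": "Acc", "Number": "Plur"}),
    ("ژبو", "ژبه", {"Case": "Loc", "Number": "Plur"}),
]


def build_paradigm(tokens, lemma, feature="Case"):
    """Map each row label (the other features) to {feature value: sorted forms}."""
    table = defaultdict(lambda: defaultdict(set))
    for form, tok_lemma, feats in tokens:
        if tok_lemma != lemma or feature not in feats:
            continue
        # Row label: all features except the one under study, in sorted order.
        row = "|".join(f"{k}={v}" for k, v in sorted(feats.items()) if k != feature)
        table[row][feats[feature]].add(form)
    return {
        row: {value: sorted(forms) for value, forms in cells.items()}
        for row, cells in table.items()
    }


paradigm = build_paradigm(TOKENS, "ژبه")
for row, cells in paradigm.items():
    print(row, cells)
```

A cell with more than one form, such as Loc in the Number=Sing row, shows up as a list with several entries.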
ADP
392 ADP tokens (100% of all ADP tokens) have a non-empty value of Case.
ADP tokens may have the following values of Case:
Abl (32; 8% of non-empty Case): له, تر, پرته, پورې
Acc (199; 51% of non-empty Case): د, ته, له, څخه, لپاره, سره, تر, پسې, ترمنځ, ترڅنګ
Loc (161; 41% of non-empty Case): په, کې, پر, پۀ, باندې
Paradigm له | Acc | Abl |
---|---|---|
| | له | له |
ADJ
261 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Case.
The most frequent other feature values with which ADJ and Case co-occurred: Gender=Masc (163; 62%), Number=Sing (157; 60%).
ADJ tokens may have the following values of Case:
Abl (3; 1% of non-empty Case): بده, لږه, نړیوالو
Acc (58; 22% of non-empty Case): نورو, ټولنیزو, پوهنیزو, کلتوري, ادبي, اسلامي, افغاني, انساني, ايرانۍ, بېسارو
Loc (32; 12% of non-empty Case): نورو, ټولنیز, لره, وروستیو, ايراني, ايرانۍ, بنګالۍ, تولیدي, خلیجي, فرهنګي
Nom (168; 64% of non-empty Case): جوړ, لږ, نور, هنري, ښه, خپور, زيات, زياتې, سم, ټولنیز
EMPTY (1): خپور
Paradigm نړیوال | Nom | Acc | Loc | Abl |
---|---|---|---|---|
Gender=Masc|Number=Plur | نړیوالو | |||
Gender=Fem|Number=Sing | نړیواله | نړیواله | ||
Gender=Fem|Number=Plur | نړیوالو |
PROPN
89 PROPN tokens (100% of all PROPN tokens) have a non-empty value of Case.
The most frequent other feature values with which PROPN and Case co-occurred: Number=Sing (73; 82%), Gender=Masc (48; 54%).
PROPN tokens may have the following values of Case:
Abl (1; 1% of non-empty Case): پېښوره
Acc (37; 42% of non-empty Case): پښتو, پښتنو, پیتر, اردو, ايران, مریم, اسامه, امريکا, ايرانیانو, ايینې
Loc (20; 22% of non-empty Case): پښتو, انګرېزۍ, ږوب, اردو, افغانستان, امريکا, لورلايي, هند, هندوستان, پاریس
Nom (30; 34% of non-empty Case): پښتو, پښتانه, ايرانیان, افغان, ایګوازو, براون, حبیبي, خان, سمیس, طلوع
Voc (1; 1% of non-empty Case): سامه
Paradigm پښتو | Nom | Acc | Loc |
---|---|---|---|
| | پښتو | پښتو | پښتو |
DET
88 DET tokens (83% of all DET tokens) have a non-empty value of Case.
The most frequent other feature values with which DET and Case co-occurred: Variant=EMPTY (73; 83%), Poss=EMPTY (68; 77%), Reflex=EMPTY (68; 77%), Deixis=EMPTY (50; 57%).
DET tokens may have the following values of Case:
Abl (4; 5% of non-empty Case): دې, هغه, همدې
Acc (14; 16% of non-empty Case): خپل, هرې, خپلو, همدغو, ځینو, کوم, کومې
Loc (21; 24% of non-empty Case): دې, خپله, ټوله, خپل, هره, هماغه, ځینو
Nom (49; 56% of non-empty Case): هغه, خپل, دغه, همدغه, هر, خپله, دا, ټول, ټولې, ځینې
EMPTY (18): داسې, څو, دغسې, هماغسې, څۀ
Paradigm خپل | Nom | Acc | Loc |
---|---|---|---|
Gender=Masc|Number=Sing | خپل | خپل | خپل |
Gender=Masc|Number=Plur | خپل | خپلو | |
Gender=Fem|Number=Sing | خپله | | خپله |
VERB
57 VERB tokens (28% of all VERB tokens) have a non-empty value of Case.
The most frequent other feature values with which VERB and Case co-occurred: Mood=EMPTY (56; 98%), Person=EMPTY (56; 98%), Gender=EMPTY (34; 60%), Number=EMPTY (34; 60%), Tense=EMPTY (34; 60%), VerbForm=Inf (34; 60%).
VERB tokens may have the following values of Case:
Acc (7; 12% of non-empty Case): کولو, شویو, ځلولو, څښلو, څکولو, ړنګېدو
Nom (50; 88% of non-empty Case): شوي, شوى, ژباړل, ګڼل, شوې, نیول, ويل, کړى, کړي, کړې
EMPTY (148): لري, کوي, کړي, شته, شي, ورکوي, کولاى, کړه, شو, وڅېړو
Paradigm کول | Nom | Acc |
---|---|---|
Aspect=Imp|VerbForm=Inf | کول | کولو |
Gender=Masc|Number=Sing|Tense=Past|VerbForm=Part | کړى | |
Gender=Masc|Number=Plur|Tense=Past|VerbForm=Part | کړي | |
Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part | کړې | |
Case seems to be a lexical feature of VERB: 93% of lemmas (28) occur with only one value of Case.
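The "lexical feature" observation rests on a simple count: for each lemma of a given part of speech, collect the Case values it occurs with and check how many lemmas have exactly one. The sketch below illustrates that count under stated assumptions. The `TOKENS` list is hypothetical toy data, `single_value_lemmas` is an illustrative name, and lemmas that never carry Case are assumed to be left out of the denominator.

```python
from collections import defaultdict

# Toy (lemma, upos, feats) triples; hypothetical data for illustration only.
TOKENS = [
    ("کول", "VERB", {"Case": "Nom", "VerbForm": "Inf"}),
    ("کول", "VERB", {"Case": "Acc", "VerbForm": "Inf"}),
    ("ژباړل", "VERB", {"Case": "Nom", "VerbForm": "Inf"}),
    ("ګڼل", "VERB", {"Case": "Nom", "VerbForm": "Inf"}),
    ("لرل", "VERB", {}),  # no Case value, so this lemma is not counted
]


def single_value_lemmas(tokens, upos, feature="Case"):
    """Return (lemmas with exactly one value, lemmas with any value)."""
    values = defaultdict(set)
    for lemma, tok_upos, feats in tokens:
        if tok_upos == upos and feature in feats:
            values[lemma].add(feats[feature])
    single = sum(1 for vals in values.values() if len(vals) == 1)
    return single, len(values)


single, total = single_value_lemmas(TOKENS, "VERB")
share = round(100 * single / total)
print(f"{share}% of lemmas ({single} of {total}) occur with only one value of Case")
```

A high share suggests the value is tied to the lemma rather than to inflection, which is the sense in which the feature is called lexical here.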
PRON
48 PRON tokens (35% of all PRON tokens) have a non-empty value of Case.
The most frequent other feature values with which PRON and Case co-occurred: Poss=EMPTY (48; 100%), Variant=EMPTY (48; 100%), Person=EMPTY (30; 63%).
PRON tokens may have the following values of Case:
Abl (3; 6% of non-empty Case): دې, هرڅه, ټولو
Acc (19; 40% of non-empty Case): دې, هغوى, هغۀ, چا, بل, ما, هغې, دوى
Gen (4; 8% of non-empty Case): ستا, زما, زموږ
Loc (4; 8% of non-empty Case): دې, دوی, هغې
Nom (18; 38% of non-empty Case): دا, همدا, څوک, دوى, موږ, هغه, همدغه, هیڅوک, ځان
EMPTY (90): يې, چې, ور, ځان, څه, یې, هرڅه, هرڅوک, یوبل
Paradigm دا | Nom | Acc | Loc | Abl |
---|---|---|---|---|
Gender=Fem|Number=Sing | دا | دې | دې | |
| | دا | دې | دې | دې |
NUM
30 NUM tokens (97% of all NUM tokens) have a non-empty value of Case.
The most frequent other feature values with which NUM and Case co-occurred: NumType=Card (30; 100%), Gender=Fem (16; 53%).
NUM tokens may have the following values of Case:
Acc (8; 27% of non-empty Case): یوه, یوې
Loc (4; 13% of non-empty Case): یوه, دوه, دوو
Nom (18; 60% of non-empty Case): یو, یوه
EMPTY (1): 1
Paradigm یو | Nom | Acc | Loc |
---|---|---|---|
Gender=Masc | یو | یوه | |
Gender=Fem | یوه | یوې | یوه |
AUX
8 AUX tokens (6% of all AUX tokens) have a non-empty value of Case.
The most frequent other feature values with which AUX and Case co-occurred: Aspect=EMPTY (8; 100%), Mood=EMPTY (8; 100%), Person=EMPTY (8; 100%), Tense=Past (8; 100%), VerbForm=Part (8; 100%), Gender=Masc (7; 88%).
AUX tokens may have the following values of Case:
Nom (8; 100% of non-empty Case): شوي, شوى, شوې
EMPTY (122): ده, به, وي, شي, دي, کېږي, دى, دﺉ, شو, ونه
ADV
1 ADV token (1% of all ADV tokens) has a non-empty value of Case.
ADV tokens may have the following values of Case:
Abl (1; 100% of non-empty Case): اوسه
EMPTY (101): هم, نو, کله, چېرې, ان, اوس, بیا, وروسته, یوازې, دومره
Relations with Agreement in Case
The 10 most frequent relations where parent and child node agree in Case:
NOUN –[case]–> ADP (280; 100%),
NOUN –[amod]–> ADJ (170; 100%),
NOUN –[det]–> DET (82; 83%),
NOUN –[conj]–> NOUN (57; 95%),
PROPN –[case]–> ADP (51; 98%),
NOUN –[nummod]–> NUM (27; 100%),
PRON –[case]–> ADP (24; 55%),
ADJ –[conj]–> ADJ (20; 100%),
ADJ –[nsubj]–> NOUN (15; 100%),
NOUN –[nsubj]–> NOUN (12; 75%).
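Agreement figures of this kind can be computed by walking every dependency edge and comparing the Case value of the child with that of its parent. The sketch below is a hedged illustration, not the script behind this page. The `SENTENCE` data is a hypothetical toy example, `case_agreement` is an illustrative name, and the percentage is assumed to be the share of agreeing pairs among pairs where both nodes carry Case.

```python
from collections import Counter

# Toy sentence as (id, upos, feats, head, deprel) tuples; hypothetical data.
SENTENCE = [
    (1, "ADP", {"Case": "Loc"}, 3, "case"),
    (2, "ADJ", {"Case": "Loc"}, 3, "amod"),
    (3, "NOUN", {"Case": "Loc"}, 0, "root"),
    (4, "NOUN", {"Case": "Nom"}, 3, "conj"),
]


def case_agreement(sentences, feature="Case"):
    """Count agreeing pairs and all comparable pairs per (parent, deprel, child)."""
    agree, both = Counter(), Counter()
    for sent in sentences:
        by_id = {tok[0]: tok for tok in sent}
        for _tok_id, upos, feats, head, deprel in sent:
            parent = by_id.get(head)  # None for the root (head 0)
            if parent is None or feature not in feats or feature not in parent[2]:
                continue
            key = (parent[1], deprel, upos)
            both[key] += 1
            if feats[feature] == parent[2][feature]:
                agree[key] += 1
    return agree, both


agree, both = case_agreement([SENTENCE])
for (parent_upos, deprel, child_upos), n in agree.most_common(10):
    # Assumption: percentage = agreeing pairs / pairs where both nodes have Case.
    pct = round(100 * n / both[(parent_upos, deprel, child_upos)])
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({n}; {pct}%)")
```

Relations with no agreeing pair, such as the NOUN conj pair in the toy sentence, appear in `both` but not in `agree`, so they drop out of the printed ranking.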