UD Hausa NorthernAutogramm
Language: Hausa (code: ha)
Family: Afro-Asiatic
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Bernard Caron.
Repository: UD_Hausa-NorthernAutogramm
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18
License: CC BY-SA 4.0
Genre: spoken
Questions, comments? General annotation questions (either Hausa-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [bernard • l • caron (æt) gmail • com]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.
| Annotation | Source |
|---|---|
| Lemmas | annotated manually |
| UPOS | annotated manually, natively in UD style |
| XPOS | not available |
| Features | annotated manually, natively in UD style |
| Relations | annotated manually, natively in UD style |
Description
This treebank contains the Northern Autogramm data for the Ader dialect of Hausa (Northern Hausa), spoken in the Republic of Niger.
Ader (Northern) Hausa, together with the Sokoto variety, is more archaic than Standard Hausa: some phonological rules of the standard variety have not applied in these varieties.
The treebank contains 400 sentences and 3,919 tokens.
It is maintained in the SUD framework (SUD_Hausa-NorthernAutogramm) and converted automatically to UD.
References
- Caron, Bernard. 1991. Le haoussa de l’Ader (Sprache und Oralität in Afrika). Vol. 10. Berlin: D. Reimer. https://www.academia.edu/110044586/Caron_1991_Le_haoussa_de_lAder?sm=b.
Statistics of UD Hausa NorthernAutogramm
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Aspect – Case – Definite – Deixis – Evident – ExtPos – Foreign – Gender – Mood – Number – PartType – Person – Polarity – PronType – Reflex – Tense – VerbForm – Voice
Relations
acl – acl:relcl – advcl – advcl:cleft – advmod – amod – appos – aux – case – cc – cc:preconj – ccomp – compound – compound:prt – conj – cop – dep – det – discourse – dislocated – fixed – flat – flat:foreign – flat:name – iobj – mark – nmod – nmod:poss – nsubj – nummod – obj – obl – obl:arg – obl:mod – parataxis – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 1305 sentences, 15324 tokens and 15424 syntactic words.
- This corpus contains 1332 tokens (9%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 79 types of words that contain both letters and punctuation. Examples: sa'ànnan, ya', 'YabBàraːya, hàlle-hàllan, ta', kaːc'èː, Gilaːgèː-Gilaːgè, aː'àː, Gilaːgè-Gilaːgè, baː'à, c'eːrèː, sa’ànnan, waːc'èː, 'YabBaraːya, a'àː, s'ayàː, Gilaːgèː-Gilaːgèː, c'ìnkai, du', koː'ìnaː, yas', 'yab, 'yan, aː’àː, c'aːwàd, daːma-daːma, duːc'ìː, dà', gùdaː-gudàn, ha', hàlleː-hàllan, hànc'iː, jimma'àː, mòːc'iː, nà'am, taːs'às, wa'ànda, àrbà'in, 'yag, 'yak, 'yam, 'yash, 'yat, Hàwwaː'ù, bà'à, bùːra', c'inoːniː, c’eːrèː, eː'eː, geːmèː!//]
- This corpus contains 100 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
- There are 33 types of multi-word tokens. Examples: gàrai, shikài, sunkài, ankài, kukài, tai, kai, sukài, kunkài, mis, nikài, akài, bân, sàːmai, tassan, abìnga, ai, askaː, bâsshì, ka, kakài, ki, kwak, mukài, nai, nan, santà, shiː, shì, sunkai, wag, wâː, àihwai.
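The distinction between sentences, surface tokens, syntactic words and multi-word tokens in the list above follows from the CoNLL-U format, where a multi-word token is written as a range line (for example `1-2`) followed by its syntactic words. The following is a minimal sketch of how such counts could be derived, not the script that produced these statistics. The function name `count_units` and the toy sentence are invented for illustration and are not taken from the treebank.

```python
def count_units(conllu_text):
    """Count sentences, surface tokens, syntactic words, multi-word tokens
    and tokens not followed by a space in a CoNLL-U string."""
    counts = {"sentences": 0, "tokens": 0, "words": 0, "mwt": 0, "no_space_after": 0}
    in_sentence = False
    mwt_end = 0  # ID of the last word covered by the current multi-word token
    for line in conllu_text.splitlines():
        if not line.strip():          # a blank line ends the sentence
            in_sentence = False
            mwt_end = 0
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        cols = line.split("\t")
        if not in_sentence:
            counts["sentences"] += 1
            in_sentence = True
        tok_id, misc = cols[0], cols[9]
        if "." in tok_id:             # empty node (enhanced dependencies), skipped
            continue
        if "-" in tok_id:             # multi-word token range such as "1-2"
            mwt_end = int(tok_id.split("-")[1])
            counts["mwt"] += 1
            is_token = True
        else:                         # ordinary syntactic word
            counts["words"] += 1
            is_token = int(tok_id) > mwt_end  # words inside a range are not surface tokens
        if is_token:
            counts["tokens"] += 1
            if "SpaceAfter=No" in misc.split("|"):
                counts["no_space_after"] += 1
    return counts


# Invented toy sentence with placeholder forms (not real Hausa data).
ROWS = [
    ["1-2", "xy", "_", "_", "_", "_", "_", "_", "_", "_"],
    ["1", "x", "x", "PRON", "_", "_", "3", "nsubj", "_", "_"],
    ["2", "y", "y", "AUX", "_", "_", "3", "aux", "_", "_"],
    ["3", "z", "z", "VERB", "_", "_", "0", "root", "_", "SpaceAfter=No"],
    ["4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
]
SAMPLE = "# sent_id = toy-1\n" + "\n".join("\t".join(r) for r in ROWS) + "\n\n"

print(count_units(SAMPLE))
```

On the toy sentence, the range line and its two words count as one surface token but two syntactic words, which is why the corpus reports more syntactic words than tokens.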
Morphology
Tags
- This corpus uses 16 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: SYM
- This corpus contains 33 word types tagged as particles (PART): ba, baːbù, bàƙoː, bàː, bâː, dai, dà, dàk, dàsh, gàː, hwa, kan, koː, kuma, kâm, kèːnan, kòː, maː, mài, màːsu, na, naː, nàː, shîn, ta, taː, tàː, tôː, wai, zan, zâː, àkwai, à~
- This corpus contains 73 lemmas tagged as pronouns (PRON): ", ., can, cân, dukà, indà, ita, ka, kai, karɓ-, keː, ki, koːmi, koːmiː, koːwanèː, koːwaː, ku, kuː, kà, kâi, kâinai, kâinaː, kânka, kânki, mai, maidà, makà, mat, matà, maː, mikì, min, miː, mu, mukù, munà, musù, muː, mìː, nan, nau, naːkà, naːkì, naːkù, naːshì, naːtà, ni, niː, nàː, nân, níː, shi, shiː, shì, su, suwaːnè, suwàː, suː, sà'addà, sù, ta, taːkà, taːsù, wandà, wani, wanèː, waː, waːnèː, wàdà, wàndoː, wàː, wâggàːshi, yà
- This corpus contains 16 lemmas tagged as determiners (DET): ", ., can, cân, dukà, nan, nân, su, waccè, wacèː, wani, wanèː, wasu, wata, wàndoː, wàneː
- Out of the above, 11 lemmas occurred sometimes as PRON and sometimes as DET: ", ., can, cân, dukà, nan, nân, su, wani, wanèː, wàndoː
- This corpus contains 6 lemmas tagged as auxiliaries (AUX): neː, nàː, yaː, yà, yâː, zâi
- Out of the above, 1 lemma occurred sometimes as AUX and sometimes as VERB: yaː
- There are 2 (de)verbal forms:
- Part
- VERB: tàhe, bìye, tsàye, zàmne, shìrye, gùrhwàːne, bànye, bùːɗe, kwànce, màːlìye
- Vnoun
- VERB: yîː, tàhiyàː, sôn, zakkùwaː, bìɗaː, zuwàː, cîn, sôː, bìyash, gàmuwaː
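The tag coverage reported at the top of this list (16 of 17 UPOS tags, with SYM unused) can be checked with a simple set difference against the universal tag inventory. This is only an illustrative sketch; the variable names are invented here, and the list of used tags is copied from the statistics above.

```python
# The 17 universal part-of-speech tags defined by UD.
UNIVERSAL_UPOS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Tags reported as used in this corpus (copied from the list above).
USED_UPOS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "VERB", "X",
}

unused = sorted(UNIVERSAL_UPOS - USED_UPOS)
print(f"{len(USED_UPOS)} of {len(UNIVERSAL_UPOS)} tags used; unused: {unused}")
```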
Nominal Features
- Fem
- ADJ: 'yab, hàlle-hàllan, ƙaːtanyàː, 'yak, 'yash, 'yat, kàrad, wacèː, ƴag, ƴak
- ADP: tsakaɗ
- AUX: tac, tà, taː, kì, tanàː, tay, tak, taz, ta', tas
- DET: wata, tan, wàccân, waccè, wacèː, wàcceː, wàttan
- NOUN: hàukaː, kuːraː, kyàutaː, saːnìyaː, laːhiyàː, gàyyaː, mutuwàː, dàːmisàː, raggàː, duːkìyaː
- NUM: shiddà
- PART: ta, tàː, taː
- PRON: ita, matà, tà, ta, keː, wàccan, naːkì, wàgga, maw, kânki
- PROPN: gaskiyaː
- VERB: tàhiyàː, zakkùwaː, bìyash, gàmuwaː, bìɗaː, hwaːɗùwaː, cêːwaː, kankaryaː, sarɓaː, cèː
- VERB-Vnoun: tàhiyàː, zakkùwaː, bìyash, gàmuwaː, bìɗaː, hwaːɗùwaː, kankaryaː, sarɓaː, cêːwaː, daɗèːwaː
- Masc
- ADJ: ɗan, namijì, ɗam, baƙiː, ƙàramiː, hwarin, hwariː, wajjan, yànkakkeː, ƙaːtòn
- ADP: bàːkin, gàreː
- AUX: yac, yaː, shì, shinàː, kà, yat, yay, yak, yaz, kaː
- DET: wani, wànga, wanèː, wàncân, wàndon, wânga
- NOUN: maulòː, yaːƙìː, gidaː, doːkìː, gàriː, yaːròː, doːkìːnai, sarkiː, mùtun, loːkàcîn
- NUM: buy
- PART: mài, bàƙoː, naː, tôː, àkwai
- PRON: mai, shiː, shì, shi, kai, kà, ka, makà, naːshì, wandà
- PROPN: Ɗiɗìː, Tatìː, Garbaː, Buːzuː, Bàhaushèː, Garbà, Bàgawailèː, zaːrùmiː
- VERB: yîː, sôn, cîn, zuwàː, kwan, sôː, gudùː, zamaː, bugùn, ciːzòn
- VERB-Vnoun: yîː, sôn, cîn, zuwàː, sôː, zamaː, bugùn, ciːzòn, ganin, kwaːnaː
- Plur
- ADJ: jàkkai, maːtaː, hàlle-hàllan, 'yan, hwarhwaruː, mâyyaː, shaːnuː, 'yam, hwarhwarun, hìyayyem
- AUX: sunkà, kù, sunàː, sun, sù, naː, kun, kukà, mun, sukà
- DET: wasu, su, waɗànga, waɗànnan
- NOUN: jàkkai, ruwaː, shaːnuː, mutàːneː, itàːceː, màːlàmmai, baːyuː, yâːra, kaːnuː, mazaː
- PART: màːsu
- PRON: sù, musù, suː, mukù, kù, su, muː, waɗànga, munà, mù
- PROPN: Banìyoːgubà, 'YabBàraːya, Tuːraːwaː, Buːzàːyeː, Hausaːwaː, Baːgayaːwaː, Gàwàllai, Baniyoːgubà
- VERB: tàhiyàkkù, kwaːnankù, shaːnun, yi, ɓaːcìnku
- VERB-Vnoun: kwaːnankù, tàhiyàkkù, ɓaːcìnku
- Sing
- AUX: ìn, naː, baːnì, inàː, bàn, nikà, nikè, niy, bân, nish
- NOUN: shaːnuːnaː, làllaƙàːtaː, makyàːyanaː, maulòːnaː, tsoːhoːnaː, wàndoːnaː, yàboːnaː, zanèːnaː, àbiːnaː, ɓàraːwòːnaː
- PART: mài
- PRON: niː, min, ni, nau, nì, minì, kâinaː, koːwaː, kâina
- VERB: kiràːnaː, shaːnuːnaː
- Acc
- PRON: sù, shì, shi, ni, kà, kù, ka, mù, su, tà
- Dat
- ADP: mà, màː, wà
- PRON: mai, musù, min, matà, makà, mukù, maː, munà, minì, maw
- Gen
- PART: na, ta
- PRON: nau, naːshì, naːkà, naːkì, naːkù, ka, naːtà, taːkà, taːsù, wàndonkà
- Nom
- PRON: shiː, niː, kai, suː, ita, muː, keː, kuː, kêː, shi
- Cons
- ADJ: ɗan, hàlle-hàllan, ɗam, hwarin, jan, 'yab, 'yan, ƙaːtòn, 'yak, 'yam
- DET: wàndon
- NOUN: loːkàcîn, kiːyòn, àbin, kiɗìn, wurin, makyàːyin, ɗan, suːnan, doːkìn, gàrîn
- NUM: buy
- PART: tôː
- PROPN: Banìyoːgubà, Ìlleːlàg, Garbà, Gilaːgyàn, Baniyoːgubà, Garbaː
- VERB: sôn, cîn, bìyash, bugùn, ciːzòn, ganin, tàhiyàkkù, tàhiyàːtai, zaman, kashìn
- VERB-Vnoun: sôn, cîn, bìyash, bugùn, ciːzòn, ganin, tàhiyàːtai, zaman, kashìn, kaːmùn
- Def
- NOUN: dumèn, kòːgôn, naːmàn, tabkìn, tsoːhôn
- Ind
- NOUN: àkaràs, bisàː, asiggìrìː, daːjìː, hwaɗàː, kiːyòː, mutàːneː, takubbàː, ƙwàːriː
- Spec
- DET: wani, wasu, wata
- PRON: wani, wasu, wata
Degree and Polarity
- Neg
- AUX: bài, baːnì, bàn, baːshì, bâi, bàkà, bàsù, baːkà, bàkù, bàmù
- PART: ba, baːbù, bàː, bâː, dàk, dàsh
Verbal Features
- Iter
- PART: ta, kan, zan
- Perf
- AUX: yac, sunkà, yaː, ankà, yat, yay, yak, tac, yaz, kaː
- Prog
- AUX: shinàː, sunàː, kà, anàː, akà, tanàː, baːnì, inàː, shikà, baːshì
- Imp
- VERB: yaː
- Pot
- AUX: amu, ani
- Sub
- AUX: shì, kà, ìn, à, kù, tà, kì, sù, mù, yà
- Fut
- AUX: zâi, zâːki, zaː'à, zân, zâːku, zâːmu, zâːshi, zâːsu, zâːta
- Cau
- VERB: tarsà, tassà, baːsà, taras, tassheː, tarsuwàː, biːsheː, bâssheː, bâːkassà, hîrkassheː
- VERB-Vnoun: baːsuwàː, tarsuwàː
- Stat
- VERB-Part: tàushe, zàmne
- Nfh
- PART: wai
Pronouns, Determiners, Quantifiers
- Dem
- ADV: nan, nân, can, cân, nanânga, ceːnìyaː
- DET: nan, ga, tan, wànga, waɗànga, waɗànnan, wàccân, can, ceːnìyaː, wàncân
- PRON: waɗànga, wàccan, wànga, wànnan, wàgga, wàncân, waɗàncân, wâggàːshi, wannàn, waɗànnan
- Ind
- ADV: koː'ìnaː
- DET: wani, wasu, wata
- PRON: koːwaː, koːmiː, wani, wasu, koːwanèː, suwaːnè, wata, waːnèː
- Int
- ADJ: wacèː
- ADV: ìnaː, ƙàːƙàː, yaushèː
- DET: wanèː, waccè, wacèː, wàcceː
- PRON: miː, waː, mìː, koːwanèː, suwàː, wanèː, koːwaː, wàː
- Prs
- PRON: mai, shiː, niː, shì, sù, shi, musù, kai, min, suː
- Rel
- ADV: indà, wàdà, indàduk, duwwàdà, indàdun, indàdut, koːìnaː, koːƙàːƙàː, kóːkòːindà
- PRON: indà, wandà, waɗàndà, koːmiː, sà'addà, wa'ànda, wàdà, waɗànda
- Tot
- ADV: dukà, duh
- DET: dukà, duk, dug, duy, du', dus, dub, dukàn, dun, dut
- PRON: dukà, duk, duy, dûn, koːwanèː
- Yes
- PRON: kâinai, kânka, kâinaː, kâina, kânki, kânkà
- 1
- AUX: ìn, naː, baːnì, inàː, bàn, nikà, mun, nikè, mù, bàmù
- NOUN: jàkkaina
- PRON: niː, min, ni, nau, muː, munà, mù, nì, minì, kâinaː
- 2
- AUX: kà, kù, kaː, kì, kun, kanàː, bàkà, kaz, kukà, kunkà
- PRON: kai, kà, ka, kù, makà, mukù, maː, naːkà, keː, kânka
- VERB-Vnoun: ɓaːcìnku
- 3
- AUX: yac, sunkà, yaː, shì, shinàː, yat, yay, yak, tac, yaz
- NOUN: màːlàmmai, uwattà
- PRON: mai, shiː, shì, sù, shi, musù, suː, ita, matà, tà
- VERB: yi, jèːwàyèsshi, zamansù
- 4
- AUX: ankà, à, akà, anàː, an, baː'à, baːà, akè, bà'à, zaː'à
Other Features
- Deixis
- Med
- ADV: cân, ceːnìyaː
- DET: wàccân, wàncân
- PRON: wàncân, waɗàncân, wàncéːnìyaː
- ProxH
- ADV: nan
- DET: nan, waɗànnan, tan, wàttan
- PRON: wànnan, waɗànnan, wàttan
- ProxS
- ADV: nân, nanânga
- DET: ga, wànga, waɗànga, wânga
- PRON: waɗànga, wànga, wàgga, wannàn
- Remt
- ADV: can, cân
- DET: can, ceːnìyaː
- PRON: wàccan
- ExtPos
- ADP
- NOUN: cikinsù, tsàkaːninsù, s'àkaːniːnai
- ADV
- ADV: sai
- VERB: tàhe, bìye, tsàye, kwàːni, taːshì, zàmne, shìrye, bànye, bùːɗe, gùrhwàːne
- VERB-Part: tàhe, bìye, tsàye, zàmne, shìrye, bànye, bùːɗe, gùrhwàːne, kwànce, màːlìye
- NOUN
- NOUN: kiɗìn, kiːyòn, ayaː, baːwàːnai, idòːnai, jiːhwàː, kiːshìn, suːnaː, watàn, yawàn
- VERB: yîː, tàhiyàː, sôn, zakkùwaː, zuwàː, cîn, bìɗaː, kwan, sôː, bìyash
- VERB-Part: gùrhwàːne
- VERB-Vnoun: yîː, tàhiyàː, sôn, zakkùwaː, zuwàː, cîn, bìɗaː, sôː, bìyash, gàmuwaː
- PRON
- ADV: duh
- DET: du', dut, duk, dus, duy
- PRON: dûn
- SCONJ
- ADP: har, sai
- Foreign
- Yes
- X: sàlaːmù, àleːkùm
- PartType
- Aspect
- PART: ta, kan, zan
- Case
- PART: na, ta
- Der
- PART: mài, màːsu
- Disc
- PART: kuma, shîn
- Evident
- PART: wai
- Foc
- PART: nàː, naː, tàː, kèːnan, taː
- Neg
- PART: bàː, ba, dàk, dàsh
- Pred
- PART: gàː, àkwai, baːbù, zâː, bâː, dà, à~
- Top
- PART: dai, kòː, kuma, hwa, maː, koː, kâm
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Example: neː.
- This corpus uses 6 lemmas as auxiliaries (aux). Examples: yaː, yà, nàː, zâi, yâː, neː.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (177)
- VERB--PRON (30)
- VERB--PRON-Nom (2)
- VERB-Part--NOUN (3)
- VERB-Part--PRON (1)
- VERB-Vnoun--NOUN (5)
- VERB-Vnoun--PRON (3)
- obj
- VERB--NOUN (431)
- VERB--NOUN-ADP(don) (1)
- VERB--NOUN-ADP(na) (1)
- VERB--PRON (64)
- VERB--PRON-ADP(don) (1)
- VERB--PRON-Acc (80)
- VERB--PRON-Acc-ADP(gà) (1)
- VERB--PRON-Acc-ADP(gàreː) (1)
- VERB--PRON-Gen (2)
- VERB--PRON-Nom (9)
- VERB-Vnoun--NOUN (41)
- VERB-Vnoun--PRON (4)
- iobj
- VERB--NOUN (8)
- VERB--PRON (12)
- VERB--PRON-Acc (35)
- VERB--PRON-Dat (180)
- VERB--PRON-Gen (1)
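Patterns such as `VERB--PRON-Acc` in the list above pair a verbal parent with a nominal or pronominal child for a given relation. The sketch below shows one way such labels could be counted from CoNLL-U rows. It is an assumption-laden simplification, not the tool behind these statistics: it appends `VerbForm` to the parent and `Case` to the child, and it ignores the adposition part of labels such as `ADP(don)`. The helper names and the toy rows are invented.

```python
from collections import Counter

CORE_RELATIONS = {"nsubj", "obj", "iobj"}


def feat(feats, name):
    """Return one feature value from a FEATS string like 'Case=Acc|Number=Sing', or None."""
    if feats == "_":
        return None
    for pair in feats.split("|"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return None


def label(upos, value):
    """Build a label such as 'PRON-Acc', or just 'PRON' when the feature is absent."""
    return f"{upos}-{value}" if value else upos


def core_argument_patterns(sentence_rows):
    """Count (relation, 'PARENT--CHILD') patterns in one sentence.

    sentence_rows is a list of 10-column CoNLL-U rows (lists of strings).
    """
    words = {row[0]: row for row in sentence_rows if row[0].isdigit()}
    patterns = Counter()
    for row in words.values():
        deprel, head = row[7], words.get(row[6])
        if deprel not in CORE_RELATIONS or head is None:
            continue
        if head[3] != "VERB" or row[3] not in ("NOUN", "PRON"):
            continue
        parent = label("VERB", feat(head[5], "VerbForm"))
        child = label(row[3], feat(row[5], "Case"))
        patterns[(deprel, f"{parent}--{child}")] += 1
    return patterns


# Invented toy sentence with placeholder forms (not real Hausa data).
TOY_ROWS = [
    ["1", "x", "x", "PRON", "_", "Case=Nom", "2", "nsubj", "_", "_"],
    ["2", "y", "y", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "z", "z", "PRON", "_", "Case=Acc", "2", "obj", "_", "_"],
    ["4", "w", "w", "NOUN", "_", "_", "2", "obl", "_", "_"],
]

patterns = core_argument_patterns(TOY_ROWS)
for (relation, pattern), count in sorted(patterns.items()):
    print(relation, pattern, count)
```

The oblique noun in the toy sentence is skipped because only `nsubj`, `obj` and `iobj` are counted here.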
Verbs with Reflexive Core Objects
- This corpus contains 4 lemmas that occur at least once with a reflexive core object (obj or iobj). Examples: bany- kânki, baː kâinaː, fans- kâinaː, han- kâinai
Relations Overview
- This corpus uses 9 relation subtypes: acl:relcl, advcl:cleft, cc:preconj, compound:prt, flat:foreign, flat:name, nmod:poss, obl:arg, obl:mod
- The following 6 relation types are not used in this corpus at all: csubj, expl, clf, list, orphan, goeswith
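Both figures in this overview can be reproduced from the relation inventory listed near the top of the page: subtypes are the labels containing a colon, and unused relations are the universal relations whose base label never occurs. The following sketch uses invented variable names, and the set of 37 universal relations is written out by hand from the UD guidelines.

```python
# The 37 universal dependency relations defined by UD.
UNIVERSAL_RELATIONS = {
    "acl", "advcl", "advmod", "amod", "appos", "aux", "case", "cc", "ccomp",
    "clf", "compound", "conj", "cop", "csubj", "dep", "det", "discourse",
    "dislocated", "expl", "fixed", "flat", "goeswith", "iobj", "list", "mark",
    "nmod", "nsubj", "nummod", "obj", "obl", "orphan", "parataxis", "punct",
    "reparandum", "root", "vocative", "xcomp",
}

# Relations reported for this corpus (copied from the Relations list above).
CORPUS_RELATIONS = [
    "acl", "acl:relcl", "advcl", "advcl:cleft", "advmod", "amod", "appos",
    "aux", "case", "cc", "cc:preconj", "ccomp", "compound", "compound:prt",
    "conj", "cop", "dep", "det", "discourse", "dislocated", "fixed", "flat",
    "flat:foreign", "flat:name", "iobj", "mark", "nmod", "nmod:poss", "nsubj",
    "nummod", "obj", "obl", "obl:arg", "obl:mod", "parataxis", "punct",
    "reparandum", "root", "vocative", "xcomp",
]

# Subtypes carry a colon; the base relation is the part before it.
subtypes = sorted(r for r in CORPUS_RELATIONS if ":" in r)
used_base = {r.split(":")[0] for r in CORPUS_RELATIONS}
unused = sorted(UNIVERSAL_RELATIONS - used_base)

print(len(subtypes), "subtypes:", ", ".join(subtypes))
print(len(unused), "unused relations:", ", ".join(unused))
```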