UD Hausa NorthernAutogramm
Language: Hausa (code: ha)
Family: Afro-Asiatic, West Chadic
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Bernard Caron.
Repository: UD_Hausa-NorthernAutogramm
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.14
License: CC BY-SA 4.0
Genre: spoken
Questions, comments? General annotation questions (either Hausa-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [bernard • l • caron (æt) gmail • com]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
This treebank, Northern Autogramm, contains data from the Ader dialect of the Republic of Niger (Northern Hausa).
Ader (Northern) Hausa, together with the Sokoto variety, is more archaic than Standard Hausa, in that some phonological rules have not applied.
The treebank contains 400 sentences, 3,807 tokens and 3,919 syntactic words.
It is maintained in the SUD framework (SUD_Hausa-NorthernAutogramm) and converted automatically to UD.
Acknowledgments
References
- Caron, Bernard. 1991. Le haoussa de l’Ader (Sprache und Oralität in Afrika). Vol. 10. Berlin: D. Reimer. https://www.academia.edu/110044586/Caron_1991_Le_haoussa_de_lAder?sm=b.
Statistics of UD Hausa NorthernAutogramm
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Aspect – Case – Definite – Deixis – ExtPos – Gender – Number – PartType – Person – Polarity – PronType – Reflex – Tense – VerbForm – Voice
Relations
acl – acl:relcl – advcl – advcl:cleft – advmod – amod – appos – aux – case – cc – cc:preconj – ccomp – compound – conj – cop – csubj – dep – det – discourse – dislocated – fixed – flat:name – iobj – mark – nmod – nsubj – nummod – obj – obl – obl:arg – parataxis – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 400 sentences, 3807 tokens and 3919 syntactic words.
- All tokens in this corpus are followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 41 types of words that contain both letters and punctuation. Examples: sa'ànnan, aː'àː, sa’ànnan, waːc'èː, ya', baː'à, du', ta', aː’àː, gàyyaː., kaːc'èː, koː'ìnaː, s'ayàː, wa'ànda, yas', àrbà'in, a', c’eːrèː, c’ìnka, duːc'ìː, dà', geːmèː!//], gùdaː-gudàn, ha', he'è:, his'ariː, ki', kwànce-kwànce, kàmas', láːtà'addù, nà:, nân., s'aisà, s'àkaːniː, s'àkaːnìn, sà'addà, taːs'àt, wàhalà', ƴa', ɗan'ubancìː, ḿː'm̀ː
- This corpus contains 112 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
- There are 80 types of multi-word tokens. Examples: ankài, kukài, shikài, mis, uwattà, uwaːtai, yâːtaː, àbinkà, bambanciyassù, baːyuːnai, bân, cikinsù, rânta, sukài, sunkài, sàːmai, tai, tassan, ƙarhiːnai, abìnga, akài, askaː, bambanciyaːtai, baːyunkì, baːyuːna, biyànka, bàːkinsù, bâsshì, c’ìnkai, dumèːnai, duːkìyattà, ganai, ganiːnai, gidankà, gidansù, indà, iːkòːnai, jàkkainaː, jàkkankì, jàlloːnai, jèːwàyèsshi, ka, kai, ki, kirànka, kyaːwònsu, kânka, maƙaːraːtai, mutàːneː, nai.
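The counts above follow from the CoNLL-U conventions: a multi-word token has a range ID such as `1-2`, while the syntactic words it covers have plain integer IDs. Below is a minimal sketch of how such counts could be derived from a CoNLL-U file. The names `count_units` and `row` and the toy sample are illustrative assumptions, not part of the treebank or of the UD tooling.

```python
def count_units(conllu_text):
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    sentences = words = mwts = covered = 0
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():          # blank line closes a sentence
            in_sentence = False
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        token_id = line.split("\t")[0]
        if "-" in token_id:           # range ID: multi-word token
            start, end = token_id.split("-")
            mwts += 1
            covered += int(end) - int(start) + 1
        elif "." in token_id:         # decimal ID: empty node, not counted
            continue
        else:                         # integer ID: syntactic word
            words += 1
    # Each multi-word token is one surface token replacing the words it covers.
    tokens = words - covered + mwts
    return {"sentences": sentences, "tokens": tokens, "words": words, "mwts": mwts}


def row(token_id, form):
    """Build a 10-column CoNLL-U line with all other fields left as '_'."""
    return "\t".join([token_id, form] + ["_"] * 8)


# Toy data with placeholder forms (an assumption, not taken from the treebank).
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    row("1-2", "ab"), row("1", "a"), row("2", "b"), row("3", "c"),
    "",
    "# sent_id = toy-2",
    row("1", "d"),
    "",
])

print(count_units(SAMPLE))
# {'sentences': 2, 'tokens': 3, 'words': 4, 'mwts': 1}
```

The same relation holds for the figures reported above: 3919 words - 224 covered words (112 multi-word tokens of 2 words each) + 112 multi-word tokens = 3807 tokens.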
Morphology
Tags
- This corpus uses 16 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: SYM
- This corpus contains 26 word types tagged as particles (PART): a', ab, ad, am, an, at, ba, baːbù, bàː, bâː, dai, dà, gàː, hwa, koː, kòː, mài, na, naː, nà:, nàː, shìn, ta, wai, zâː, àkwai
- This corpus contains 59 lemmas tagged as pronouns (PRON): =ai, =ka, =ki, =ku, =kà, =kù, =nai, =naː, =shi, =su, =sù, =ta, =tai, =tay, =taː, =tà, =ya, eː, hm̂ː, ita, ka, kai, keː, koːmi, koːmiː, koːwaː, kâinai, kânka, mai, mat, matà, maː, min, miː, musù, naːkù, naːshì, naːtà, ni, niː, níː, shi, shiː, su, suː, sà'addà, sù, ta, taːkà, taːsù, wandà, wani, wàccan, wàgga, wàncân, wàncéːnìyaː, wànga, wànnan, wâggàːshi
- This corpus contains 9 lemmas tagged as determiners (DET): can, dukà, ga, nan, su, waccè, wani, wanèː, wàccân
- Out of the above, 2 lemmas occurred sometimes as PRON and sometimes as DET: su, wani
- This corpus contains 1 lemma tagged as auxiliary (AUX): _
- There are 2 (de)verbal forms:
  - Part
    - VERB: bìye, tàhe, kwànce, màːlìye, tsàye, tàushe, zàmne
  - Vnoun
    - VERB: tàhiyàː, yîː, zakkùwaː, sôn, gàmuwaː, cîn, gudùː, hwaːɗùwaː, kwaːnaː, taːshìː
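The unused tag and the PRON/DET overlap reported above can be reproduced with plain set operations. The sketch below copies the inventories from the bullets above; the variable names are illustrative, and the 17-tag set is the standard UD v2 UPOS inventory.

```python
# The 17 universal POS tags defined by UD v2.
ALL_UPOS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Tags reported as used in this corpus (copied from the list above).
USED_UPOS = set(
    "ADJ ADP ADV AUX CCONJ DET INTJ NOUN NUM PART PRON PROPN PUNCT SCONJ VERB X".split()
)

# Lemma inventories copied from the PRON and DET bullets above.
PRON_LEMMAS = set((
    "=ai, =ka, =ki, =ku, =kà, =kù, =nai, =naː, =shi, =su, "
    "=sù, =ta, =tai, =tay, =taː, =tà, =ya, eː, hm̂ː, ita, "
    "ka, kai, keː, koːmi, koːmiː, koːwaː, kâinai, kânka, mai, mat, "
    "matà, maː, min, miː, musù, naːkù, naːshì, naːtà, ni, niː, "
    "níː, shi, shiː, su, suː, sà'addà, sù, ta, taːkà, taːsù, "
    "wandà, wani, wàccan, wàgga, wàncân, wàncéːnìyaː, wànga, wànnan, wâggàːshi"
).split(", "))
DET_LEMMAS = set("can, dukà, ga, nan, su, waccè, wani, wanèː, wàccân".split(", "))

unused = sorted(ALL_UPOS - USED_UPOS)        # tags never used in the corpus
shared = sorted(PRON_LEMMAS & DET_LEMMAS)    # lemmas tagged both PRON and DET

print(unused)  # ['SYM']
print(shared)  # ['su', 'wani']
```

Note that the comparison is on exact strings, so lemmas that differ only in tone or length marking (for example `wàccan` and `wàccân`) count as distinct.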
Nominal Features
- Fem
- ADJ: wacèː, ƴag, ƴak
- AUX: tac, tanàː, tà, taː, tas, tay, tab, tak, takè, tat
- DET: wata, wàccân, waccè
- NOUN: dàːmisàː, kuːraː, gàyyaː, duːniyàː, yâː, raːnaː, kwaːnaː, uwaː, rân, uwat
- PART: ta
- PRON: ita, tà, =tà, =ta, wàccan, matà, =kì, wàgga, keː, maw
- VERB: tàhiyàː, zakkùwaː, gàmuwaː, gàyyaː., hwaːɗùwaː, bìɗaː, cêːwaː, daɗèːwaː, kankaryaː, kaːwoːwàː
- VERB-Vnoun: tàhiyàː, zakkùwaː, gàmuwaː, hwaːɗùwaː, bìɗaː, cêːwaː, daɗèːwaː, kankaryaː, kaːwoːwàː, tàhiyàk
- Masc
- ADJ: ɗan, baƙiː, hwarin, hwariː, namijì, ƙàramiː, baƙin, hìyayyem, jan
- AUX: yac, yaː, kaː, shinàː, shì, yaz, kà, yat, yah, yay
- DET: wani, wanèː
- NOUN: kàreː, gidaː, sâː, yaːɗaː, zàkaràː, ɓiki, mùtun, àbin, mùzuːruː, bàːkin
- PRON: shiː, shi, shì, =nai, =tai, mai, =ka, =kà, kà, maː
- PROPN: Buːzuː, Bàhaushèː, Bàgawailèː
- VERB-Vnoun: yîː, sôn, cîn, gudùː, kwaːnaː, taːshìː, hwaɗìː, yîn, zaman, zamaː
- Plur
- ADJ: maːtaː
- AUX: sunkà, sun, sunàː, sukà, sù, kukà, kun, bàkù, kù, mukà
- DET: su, wasu
- NOUN: ruwaː, mutàːneː, ƴan, giːwàːyeː, zàːrùmmai, baːyuː, kuɗɗiː, maːtaː, ƙwàːriː, cinàn
- PRON: =sù, suː, musù, sù, wa'ànda, =kù, =su, su, wasu, =ku
- PROPN: Tuːraːwaː, Buːzàːyeː, Hausaːwaː, Baːgayaːwaː, Gàwàllai
- Sing
- AUX: ìn, naː, inàː, bàn, ani, nikà, nim, nis, nish, niy
- PART: mài
- PRON: niː, min, =na, ni, =taː, =naː, nì, shiː
- Dat
- ADP: mà
- PRON: mai, musù, maː, min, matà, maw
- Gen
- PRON: =nai, =kà, =kù, =ta, naːshì, =kì, =su, =tai, naːkù
- Nom
- PRON: shiː, ita, niː, suː, kai, keː, shi
- Cons
- ADJ: ɗan, hwarin, jan, baƙin, hìyayyem, ƴag, ƴak
- ADV: hakàn
- NOUN: àbin, bàːkin, ƴan, kàram, ɗan, loːkàcin, wurin, yâː, baːyuː, cinàn
- PART: na, ta
- PROPN: Ìlleːlàg
- VERB-Vnoun: sôn, cîn, tàhiyàː, tàhiyàk, yîn, zaman, ɓaːcìn, aihùwaz, biyàn, bìyam
- Def
- ADV: nan
- DET: nan
- NOUN: abìn, dumèn, gàrîn, rân, sân, ɗan, duːniyàg, loːkàcîn, làːbàːrûn, lòːkacìn
- PRON: wànnan
- Spec
- DET: wani, wasu
- PRON: wani, wasu
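Each list above pairs a feature value with the word forms that carry it under a given UPOS tag. Below is a minimal sketch of how such an index could be built from the FEATS column of a CoNLL-U file, assuming the standard `Name=Value|Name=Value` syntax. The function names are hypothetical, and the feature strings in the toy rows are illustrative guesses that are merely consistent with the lists on this page, not copied from the treebank.

```python
from collections import defaultdict


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if feats == "_":                  # '_' marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def forms_by_value(rows):
    """Index word forms by (feature name, feature value, UPOS).

    `rows` is an iterable of (form, upos, feats) triples.
    """
    index = defaultdict(set)
    for form, upos, feats in rows:
        for name, value in parse_feats(feats).items():
            for single in value.split(","):   # multi-valued features use commas
                index[(name, single, upos)].add(form)
    return index


# Illustrative rows (assumed annotations, not taken from the treebank files).
TOY_ROWS = [
    ("ita", "PRON", "Case=Nom|Gender=Fem|Person=3|PronType=Prs"),
    ("wani", "DET", "Definite=Spec|Gender=Masc|PronType=Ind"),
    ("kàreː", "NOUN", "Gender=Masc"),
    (".", "PUNCT", "_"),
]

index = forms_by_value(TOY_ROWS)
print(sorted(index[("Gender", "Masc", "DET")]))   # ['wani']
print(sorted(index[("Gender", "Fem", "PRON")]))   # ['ita']
```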
Degree and Polarity
- Neg
- AUX: bài, baː'à, baːkà, bàkà, bàkù, baːmù, bàn, baːkù, bàsù, bàtà
- PART: baːbù, ba, bâː, bàː
Verbal Features
- Aor
- AUX: shì, à, ìn, tà, kà, sù, kì, kù, mù
- Iter
- PART: ta
- Perf
- AUX: yaː, kaː, naː, sun, taː, kun, mun, an, kyaː
- PerfBkg
- AUX: yac, ankà, sunkà, tac, yaz, yat, yah, yay, yas, yag
- PerfNeg
- AUX: bài, bàkà, bàkù, bàn, bàsù, bàtà
- Prog
- AUX: shinàː, anàː, tanàː, sunàː, inàː, kanàː, nàː, kunàː
- PART: nàː
- ProgBkg
- AUX: akà, sukà, kukà, kà, shikà, mukà, kakà, nikà, takà
- ProgLocBkg
- AUX: takè, shikè, sukè, akè, kukè, kakè
- ProgNeg
- AUX: baː'à, baːkà, baːmù, baːkù
- Fut
- AUX: zâki, zâːku, zâːshi
- Pred
- AUX: ani
- Cau
- VERB: tassheː, bâsshee, hîrkassheː, jèːwàyès, s'aisà, tassà, tàssa, tâːkassà
- Stat
- VERB-Part: bìye, tàhe, kwànce, màːlìye, tsàye, tàushe, zàmne
Pronouns, Determiners, Quantifiers
- Art
- DET: nan
- Dem
- ADV: nân, nanânga, cân, can, nân.
- DET: ga, wàccân
- PRON: wàccan, wàgga, wàncân, wàncéːnìyaː, wànga, wànnan
- Ind
- ADV: koː'ìnaː
- DET: wata, wasu, wani
- PRON: koːmiː, koːwaː, wani, wasu, wâggàːshi, wata
- Int
- ADJ: wacèː
- ADV: ƙàːƙàː, ìnaː
- DET: wanèː, waccè
- PRON: miː
- Prs
- PRON: shiː, shi, shì, =nai, =tai, ita, =sù, niː, suː, tà
- Rel
- ADV: indà, indàduk, duwwàdà, indàdun, indàdut, koːìnaː, koːƙàːƙàː, wàdà
- PRON: sà'addà, wandà, wa'ànda
- Tot
- ADV: duh
- DET: dug, du', dus, dub, dukà, dun, dut, duy, duk, dûn
- Yes
- PRON: kâinai, kânka
- 1
- AUX: ìn, naː, inàː, mukà, mù, baːmù, bàn, mun, ani, munkà
- PRON: niː, min, =na, ni, =taː, =naː, nì, shi
- 2
- AUX: kaː, kà, kukà, kun, bàkà, bàkù, kanàː, kat, kì, kù
- PRON: =ka, =kà, kà, maː, kai, =kì, ka, =kù, keː, =ku
- 3
- AUX: yac, sunkà, yaː, shì, shinàː, tac, tà, yaz, tanàː, yat
- PRON: shiː, shì, shi, =nai, =tai, ita, =sù, suː, tà, =tà
- 4
- AUX: ankà, à, akà, anàː, baː'à, an
Other Features
- Deixis
- Prox
- ADV: nân, nanânga, nân.
- DET: ga, wàccân
- PRON: wàgga, wànga, wànnan
- Remt
- ADV: cân, can, ceːnìyaː
- DET: can, wàccân
- PRON: wàccan, wàncân, wàncéːnìyaː
- ExtPos
- ADJ
- ADJ: hìyayyem
- ADV
- ADV: sai, tsàye
- VERB: bìye, tàhe, kwàːni, kwànce, màːlìye, tsàye, tàushe
- VERB-Part: bìye, tàhe, kwànce, màːlìye, tsàye, tàushe
- NOUN
- PART: mài
- PRON: =ta
- VERB-Vnoun: tàhiyàː, yîː, zakkùwaː, sôn, gàmuwaː, cîn, gudùː, hwaːɗùwaː, kwaːnaː, bìɗaː
- PRON
- ADV: duh
- DET: du', dut, duk, dus, dûn
- PartType
- Int
- PART: ba, hwa, koː, shìn
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Example: _.
- This corpus uses 1 lemma as an auxiliary (aux). Example: _.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (27)
- VERB--PRON (8)
- VERB-Vnoun--NOUN (1)
- VERB-Vnoun--PRON (1)
- obj
- VERB--NOUN (109)
- VERB--NOUN-ADP(dà) (1)
- VERB--PRON (42)
- VERB--PRON-Nom (1)
- VERB-Vnoun--NOUN (2)
- VERB-Vnoun--PRON (1)
- iobj
- VERB--PRON (10)
- VERB--PRON-Dat (28)
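The pair counts above can be approximated by walking the HEAD and DEPREL columns of each sentence. The sketch below is a simplified illustration that distinguishes pairs by UPOS only; it ignores the finer labels used above (such as `VERB-Vnoun` or `PRON-Dat`). The function name and the toy sentence are assumptions, not taken from the treebank or from the UD statistics scripts.

```python
from collections import Counter

CORE_RELATIONS = {"nsubj", "obj", "iobj"}
NOMINAL_UPOS = {"NOUN", "PRON"}


def core_argument_pairs(sentence):
    """Count VERB--NOUN/PRON pairs per core relation in one sentence.

    `sentence` is a list of rows, each a list of the 10 CoNLL-U columns
    (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
    """
    upos_by_id = {r[0]: r[3] for r in sentence if r[0].isdigit()}
    counts = Counter()
    for r in sentence:
        if not r[0].isdigit():        # skip multi-word token and empty-node rows
            continue
        child_upos, head, deprel = r[3], r[6], r[7]
        if (deprel in CORE_RELATIONS
                and child_upos in NOMINAL_UPOS
                and upos_by_id.get(head) == "VERB"):
            counts[(deprel, "VERB--" + child_upos)] += 1
    return counts


# Toy sentence with placeholder forms (not taken from the treebank).
TOY = [
    ["1", "w1", "_", "NOUN", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "w2", "_", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "w3", "_", "PRON", "_", "_", "2", "obj", "_", "_"],
]

for (relation, pair), n in sorted(core_argument_pairs(TOY).items()):
    print(relation, pair, n)
```

Summing the resulting counters over all sentences of a file would give corpus-level totals of the kind listed above.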
Verbs with Reflexive Core Objects
- This corpus contains 1 lemma that occurs at least once with a reflexive core object (obj or iobj). Example: han- kâinai