UD Hausa NorthernAutogramm
Language: Hausa (code: ha)
Family: Afro-Asiatic
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Bernard Caron.
Repository: UD_Hausa-NorthernAutogramm
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.17
License: CC BY-SA 4.0
Genre: spoken
Questions, comments? General annotation questions (either Hausa-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [bernard • l • caron (æt) gmail • com]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.
| Annotation | Source |
|---|---|
| Lemmas | annotated manually |
| UPOS | annotated manually, natively in UD style |
| XPOS | not available |
| Features | annotated manually, natively in UD style |
| Relations | annotated manually, natively in UD style |
Description
This treebank contains Northern Hausa data from Autogramm, representing the Ader dialect spoken in the Republic of Niger.
Ader (Northern) Hausa, together with the Sokoto variety, is more archaic than Standard Hausa: some phonological rules have not applied in these varieties.
The treebank contains 400 sentences and 3,919 tokens.
It is maintained in the SUD framework (SUD_Hausa-NorthernAutogramm) and converted automatically to UD.
References
- Caron, Bernard. 1991. Le haoussa de l’Ader (Sprache und Oralität in Afrika). Vol. 10. Berlin: D. Reimer. https://www.academia.edu/110044586/Caron_1991_Le_haoussa_de_lAder?sm=b.
Statistics of UD Hausa NorthernAutogramm
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Aspect – Case – Definite – Deixis – ExtPos – Gender – Mood – Number – PartType – Person – Polarity – PronType – Reflex – Tense – VerbForm – Voice
Relations
acl – acl:relcl – advcl – advcl:cleft – advmod – amod – appos – aux – case – cc – cc:preconj – ccomp – compound – compound:prt – conj – cop – dep – det – discourse – dislocated – fixed – flat:name – iobj – mark – nmod – nsubj – nummod – obj – obl – obl:arg – parataxis – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 423 sentences, 4116 tokens and 4158 syntactic words.
- All tokens in this corpus are followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 41 types of words that contain both letters and punctuation. Examples: sa'ànnan, aː'àː, sa’ànnan, s'ayàː, ta', waːc'èː, ya', baː'à, du', yas', aː’àː, duːc'ìː, gàyyaː., kaːc'èː, koː'ìnaː, taːs'às, wa'ànda, àrbà'in, c’eːrèː, c’ìnkai, dà', geːmèː!//], gùdaː-gudàn, ha', he'è:, his'ariː, ki', kwànce-kwancèn, kwànce-kwànce, kàmas', láːtà'addù, nân., s'aisà, s'àkaːniːnai, s'àkaːnìn, sà'addà, taːs'àtta, wàhalà', ƴa', ɗan'ubancìː, ḿː'm̀ː
- This corpus contains 42 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
- There are 21 types of multi-word tokens. Examples: ankài, kukài, shikài, mis, bân, sukài, sunkài, sàːmai, tai, tassan, abìnga, akài, askaː, bâsshì, ka, kai, ki, nai, nan, santà, àihwai.
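The counts in this section are related by simple arithmetic: each multi-word token is one surface token that expands into several syntactic words, so 4116 tokens with 42 two-word multi-word tokens give 4116 - 42 + 84 = 4158 words. The sketch below shows one way such counts could be derived from a CoNLL-U file. The function names, the toy data and the `row` helper are illustrative assumptions, not part of the treebank tooling, and the toy forms are placeholders rather than Hausa.

```python
import re
import unicodedata


def count_units(conllu_text):
    """Return (sentences, tokens, words, multiword_tokens) for CoNLL-U text."""
    sentences = tokens = words = mwts = 0
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        rows = [ln for ln in block.splitlines() if ln and not ln.startswith("#")]
        if not rows:
            continue
        sentences += 1
        covered_until = 0  # highest word ID covered by the latest range line
        for row_text in rows:
            word_id = row_text.split("\t")[0]
            if "." in word_id:      # empty node such as "3.1": not counted
                continue
            if "-" in word_id:      # range line such as "2-3": one surface token
                mwts += 1
                tokens += 1
                covered_until = int(word_id.split("-")[1])
            else:                   # ordinary syntactic word
                words += 1
                if int(word_id) > covered_until:
                    tokens += 1     # word is not part of a multi-word token
    return sentences, tokens, words, mwts


def mixes_letters_and_punct(form):
    """True if the form contains both a letter and a punctuation character."""
    cats = [unicodedata.category(ch) for ch in form]
    return any(c.startswith("L") for c in cats) and any(c.startswith("P") for c in cats)


def row(word_id, form):
    """Hypothetical helper: build a 10-column CoNLL-U row with empty annotation."""
    return "\t".join([word_id, form] + ["_"] * 8)


# Toy data (placeholder forms, not taken from the treebank).
TOY = "\n".join([
    "# sent_id = toy-1",
    row("1", "a"), row("2-3", "bc"), row("2", "b"), row("3", "c"), row("4", "."),
    "",
    "# sent_id = toy-2",
    row("1", "d"), row("2", "e"),
])

toy_counts = count_units(TOY)

# Consistency check on the figures reported above.
reported_ok = 4116 - 42 + 42 * 2 == 4158
```

On the toy data, `toy_counts` holds two sentences, five tokens, six words and one multi-word token. `mixes_letters_and_punct` relies on Unicode general categories, so the length mark `ː` counts as a letter while `'`, `’`, `.` and `-` count as punctuation.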
Morphology
Tags
- This corpus uses 16 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: SYM
- This corpus contains 14 word types tagged as particles (PART): ba, baːbù, bàː, bâː, dai, dà, gàː, hwa, kuma, kòː, ta, zâː, àkwai, à~
- This corpus contains 42 lemmas tagged as pronouns (PRON): dukà, ita, ka, kai, keː, koːmi, koːmiː, koːwaː, kâinai, kânka, mai, mat, matà, maː, min, miː, musù, naːkù, naːshì, naːtà, ni, niː, níː, shi, shiː, shì, su, suː, sà'addà, ta, taːkà, taːsù, wandà, wani, wanèː, wàccan, wàgga, wàncân, wàncéːnìyaː, wànga, wànnan, wâggàːshi
- This corpus contains 10 lemmas tagged as determiners (DET): can, dukà, ga, nan, su, waccè, wani, wanèː, wàccân, yak
- Out of the above, 4 lemmas occurred sometimes as PRON and sometimes as DET: dukà, su, wani, wanèː
- This corpus contains 1 lemma tagged as an auxiliary (AUX): _
- There are 2 (de)verbal forms:
- Part
- VERB: bìye, tàhe, kwànce, màːlìye, tsàye, tàushe, zàmne
- Vnoun
- VERB: tàhiyàː, yîː, zakkùwaː, sôn, bìyash, gàmuwaː, cîn, kwaːnaː, gudùː, hwaːɗùwaː
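The tag statistics above can be reproduced with plain set operations. The following sketch recomputes the unused UPOS tag and the lemmas shared between PRON and DET from the listings in this section. The `lemma_set` helper is an illustrative assumption; it applies NFC normalization so that visually identical lemmas compare equal regardless of how their diacritics are encoded.

```python
import unicodedata

# The 17 universal POS tags defined by UD.
ALL_UPOS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Tags listed above as used in this corpus.
USED_UPOS = set(
    "ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, "
    "PUNCT, SCONJ, VERB, X".split(", ")
)


def lemma_set(listing):
    """Hypothetical helper: split a comma-separated listing into NFC lemmas."""
    return {unicodedata.normalize("NFC", item.strip()) for item in listing.split(",")}


PRON_LEMMAS = lemma_set(
    "dukà, ita, ka, kai, keː, koːmi, koːmiː, koːwaː, kâinai, kânka, mai, mat, "
    "matà, maː, min, miː, musù, naːkù, naːshì, naːtà, ni, niː, níː, shi, shiː, "
    "shì, su, suː, sà'addà, ta, taːkà, taːsù, wandà, wani, wanèː, wàccan, "
    "wàgga, wàncân, wàncéːnìyaː, wànga, wànnan, wâggàːshi"
)
DET_LEMMAS = lemma_set(
    "can, dukà, ga, nan, su, waccè, wani, wanèː, wàccân, yak"
)

unused_tags = ALL_UPOS - USED_UPOS            # tags never used in the corpus
pron_and_det = PRON_LEMMAS & DET_LEMMAS       # lemmas tagged both PRON and DET

print(sorted(unused_tags))
print(sorted(pron_and_det))
```

Note that `wàccan` (PRON) and `wàccân` (DET) differ in tone marking and are therefore treated as distinct lemmas, which matches the count of four shared lemmas given above.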
Nominal Features
- Fem
- ADJ: wacèː, ƴag, ƴak
- ADP: ta
- AUX: tac, tanàː, tà, taː, tay, tas, ta', tab, tak, takè
- DET: wata, tan, wàccân, waccè
- NOUN: dàːmisàː, kuːraː, gàyyaː, shìgattà, duːniyàː, rân, raːnakkà, raːnaː, uwattà, uwaːtai
- PRON: ita, tà, wàccan, matà, ta, wàgga, keː, maw, naːtà, wata
- VERB: tàhiyàː, zakkùwaː, bìyash, gàmuwaː, gàyyaː., hwaːɗùwaː, kankaryaː, sarɓaː, bìɗaː, cêːwaː
- VERB-Vnoun: tàhiyàː, zakkùwaː, bìyash, gàmuwaː, hwaːɗùwaː, kankaryaː, sarɓaː, bìɗaː, cêːwaː, daɗèːwaː
- Masc
- ADJ: ɗan, baƙiː, hwarin, hwariː, namijì, ƙàramiː, baƙin, hìyayyem, jan
- AUX: yac, yaː, shinàː, kaː, shì, yaz, kà, yat, yay, yah
- DET: wani, wanèː
- NOUN: kàreː, sarkin, gidaː, bàːkin, sâː, àbin, ɗan, yaːɗaː, kàram, mùtun
- PRON: shiː, shì, shi, mai, kai, maː, ka, kà, wani, wànnan
- PROPN: Buːzuː, Bàhaushèː, Bàgawailèː
- VERB: yîː, sôn, cîn, kwaːnaː, gudùː, sauraːreː, taːshìː, hwaɗìː, yîn, zaman
- VERB-Vnoun: yîː, sôn, cîn, kwaːnaː, gudùː, taːshìː, hwaɗìː, yîn, zaman, zamaː
- Plur
- ADJ: maːtaː
- ADP: cikinsù, tsàkaːninsù
- AUX: sunkà, sunàː, sun, sukà, sù, kukà, kun, bàkù, kù, mukà
- DET: su, wasu
- NOUN: ruwaː, ƴan, mutàːneː, cinàn, giːwàːyeː, ruwan, zàːrùmmai, bambanciyassù, baːyuːnai, gardamàssu
- PRON: suː, musù, sù, su, wa'ànda, wasu, naːkù, taːsù
- PROPN: Tuːraːwaː, Buːzàːyeː, Hausaːwaː, Baːgayaːwaː, Gàwàllai
- VERB-Vnoun: tàhiyàkkù, ɓaːcìnku, ɓaːcìnsu
- Sing
- AUX: ìn, naː, inàː, bàn, ani, nikà, nim, nis, nish, niy
- PRON: niː, min, ni, nì, shiː
- Dat
- ADP: mà
- PRON: mai, musù, maː, min, matà, maw
- Gen
- ADP: cikinsù, tsàkaːninsù
- NOUN: shìgattà, raːnakkà, uwaːtai, yâːtaː, bambanciyassù, baːyuːnai, gardamàssu, rânta, uwattà, àbinkà
- PRON: naːshì, ka, naːkù, naːtà, taːkà, taːsù
- VERB: tàhiyàːtai, biyànka, ganiːnai, kirànka, sônta, taːs'àtta, tàhiyàkkà, tàhiyàkkù, tàhiyàːtay, tàmbayàːtai
- VERB-Vnoun: biyànka, ganiːnai, kirànka, sônta, taːs'àtta, tàhiyàkkà, tàhiyàkkù, tàhiyàːtai, tàhiyàːtay, tàmbayàːtai
- Nom
- PRON: shiː, ita, niː, suː, kai, keː, shi
- Cons
- ADJ: ɗan, hwarin, jan, baƙin, hìyayyem, ƴag, ƴak
- NOUN: sarkin, bàːkin, àbin, ɗan, ƴan, kàram, shìgattà, loːkàcin, wurin, wurîn
- PROPN: Ìlleːlàg
- VERB: sôn, bìyash, cîn, taːs'às, tàhiyàːtai, yîn, zaman, aihùwaz, biyànka, bìyam
- VERB-Vnoun: sôn, bìyash, cîn, taːs'às, yîn, zaman, aihùwaz, biyànka, bìyam, ganiːnai
- Def
- ADV: nan
- DET: nan, tan
- NOUN: dumèn, kòːgôn, naːmàn
- PRON: wànnan
- Ind
- NOUN: ruwaː, kàreː, ayaː, dàːmisàː, gidaː, kuːraː, gàyyaː, sâː, yaːɗaː, mutàːneː
- Spec
- DET: wani, wasu
- PRON: wani, wasu
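Listings like the ones above group word forms by feature value and then by part of speech. As a minimal sketch of how such an index could be built from the FEATS column of a CoNLL-U file, consider the code below. The three sample rows reuse forms listed in this section, but the exact FEATS strings and the pairing of values with the feature names `Gender`, `Number` and `Definite` are assumptions for illustration, not copied from the treebank files.

```python
from collections import defaultdict


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def index_forms(rows):
    """Map (feature, value) -> UPOS -> list of forms, in order of appearance."""
    index = defaultdict(lambda: defaultdict(list))
    for form, upos, feats in rows:
        for feature, value in parse_feats(feats).items():
            if form not in index[(feature, value)][upos]:
                index[(feature, value)][upos].append(form)
    return index


# Assumed sample rows: (FORM, UPOS, FEATS).
SAMPLE = [
    ("kuːraː", "NOUN", "Definite=Ind|Gender=Fem"),
    ("kàreː", "NOUN", "Definite=Ind|Gender=Masc"),
    ("mutàːneː", "NOUN", "Definite=Ind|Number=Plur"),
]

feature_index = index_forms(SAMPLE)
for (feature, value), by_upos in sorted(feature_index.items()):
    for upos, forms in by_upos.items():
        print(f"{feature}={value} {upos}: {', '.join(forms)}")
```

Keying the index on the (feature, value) pair keeps values such as `Ind` unambiguous, since the same value name can occur under more than one feature.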
Degree and Polarity
- Neg
- AUX: bài, baːkà, baː'à, bàkà, bàkù, baːmù, bàn, bàtà, baːkù, bàsù
- PART: ba, baːbù, bâː, bàː
Verbal Features
- Aor
- AUX: shì, à, ìn, tà, kà, sù, kì, kù, mù
- Iter
- PART: ta
- Perf
- AUX: yaː, kaː, naː, taː, sun, kun, mun, an, kyaː
- PerfBkg
- AUX: yac, ankà, sunkà, tac, yaz, yat, yay, yah, yas, yak
- PerfNeg
- AUX: bài, bàkà, bàkù, bàn, bàtà, bàsù
- Prog
- AUX: shinàː, tanàː, anàː, inàː, sunàː, kanàː, nàː, kukà, kunàː
- ProgBkg
- AUX: akà, sukà, kukà, kà, shikà, mukà, kakà, nikà, takà
- ProgNeg
- AUX: baː'à, baːkà, baːmù, baːkù
- Jus
- VERB: bàri, shìga, dìːba, i, rùmaː, tàhi, wùceː, saː, shìryaː, ƙàːraː
- Fut
- AUX: zâːki, zâːku, zâːshi
- Pred
- AUX: ani
- Cau
- VERB: tassheː, bâsshee, hîrkassheː, s'aisà, tassà, tâːkassà, jèːwàyèsshi, tàssa
- Stat
- VERB-Part: bìye, tàhe, kwànce, màːlìye, tsàye, tàushe, zàmne
Pronouns, Determiners, Quantifiers
- Dem
- ADV: nan, nân, cân, nanânga, can, ceːnìyaː, nân.
- DET: ga, wàccân, can, nan, tan
- PRON: wàccan, wàgga, wànnan, wàncân, wâggàːshi, wàncéːnìyaː, wànga
- Ind
- ADV: koː'ìnaː
- DET: wata, wani, wasu
- PRON: koːmiː, koːwaː, wani, wasu, wata
- Int
- ADJ: wacèː
- ADV: ƙàːƙàː, ìnaː
- DET: wanèː, waccè
- PRON: miː, wanèː
- Prs
- PRON: shiː, ita, shì, shi, mai, niː, suː, musù, tà, sù
- Rel
- ADV: indà, indàduk, duwwàdà, indàdun, indàdut, koːìnaː, koːƙàːƙàː, kóːkòːindà, wàdà
- PRON: sà'addà, wa'ànda, wandà
- Tot
- ADV: duh
- DET: dug, du', dus, duy, dub, dukà, dun, dut, duk
- PRON: dûn
- Yes
- PRON: kâinai, kânka
- 1
- AUX: ìn, naː, inàː, mukà, mù, baːmù, bàn, mun, ani, munkà
- NOUN: yâːtaː, baːyuːna, jàkkaina, shaːnuːna, àlkaːwàliːnaː, ƙanèːna
- PRON: niː, min, ni, nì, shi
- 2
- AUX: kaː, kà, kanàː, kukà, kun, bàkà, bàkù, kat, kì, kù
- NOUN: raːnakkà, àbinkà, baːyunkì, gidankà, jàkkankì, kânka, shaːnunkì, tàːkàlminkà, àbinka, ƙarhinkù
- PRON: kai, maː, ka, kà, keː, kânka, naːkù, taːkà
- VERB-Vnoun: biyànka, kirànka, tàhiyàkkà, tàhiyàkkù, ɓaːcìnku
- 3
- ADP: cikinsù, tsàkaːninsù
- AUX: yac, sunkà, yaː, shinàː, shì, tac, tanàː, tà, yaz, sunàː
- NOUN: shìgattà, uwattà, uwaːtai, bambanciyassù, baːyuːnai, gardamàssu, rânta, ƙarhiːnai, bambanciyaːtai, bàːkinsù
- PRON: shiː, ita, shì, shi, mai, suː, musù, tà, sù, matà
- VERB: tàhiyàːtai, c’ìnkai, ganai, ganiːnai, jèːwàyèsshi, sônta, taːs'àtta, tàhiyàːtay, tàmbayàːtai, àːgìzonta
- VERB-Vnoun: ganiːnai, sônta, taːs'àtta, tàhiyàːtai, tàhiyàːtay, tàmbayàːtai, àːgìzonta, ɓaːcìnsu
- 4
- AUX: ankà, anàː, à, akà, baː'à, akè, an
Other Features
- Deixis
- Prox
- ADV: nân, nanânga, nân.
- DET: ga
- PRON: wàgga, wànga, wànnan
- Remt
- ADV: cân, can, ceːnìyaː
- DET: wàccân, can
- PRON: wàccan, wàncân, wàncéːnìyaː
- ExtPos
- ADJ
- ADJ: hìyayyem
- ADV
- ADV: sai, tsàye
- SCONJ: ha', har
- VERB: bìye, tàhe, kwàːni, taːshì, kwànce, màːlìye, tsàye, tàushe
- VERB-Part: bìye, tàhe, kwànce, màːlìye, tsàye, tàushe
- NOUN
- NOUN: ayaː
- VERB-Vnoun: tàhiyàː, yîː, zakkùwaː, sôn, bìyash, gàmuwaː, cîn, kwaːnaː, gudùː, hwaːɗùwaː
- PRON
- ADV: duh
- DET: du', dut, duk, dus, duy
- PRON: dûn
- PartType
- Adv
- PART: ta
- Neg
- PART: ba, bàː
- Pred
- PART: gàː, àkwai, baːbù, bâː, dà, zâː, à~
- Top
- PART: kòː, dai, kuma, hwa
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Example: _.
- This corpus uses 1 lemma as an auxiliary (aux). Example: _.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (21)
- VERB--NOUN-Gen (7)
- VERB--PRON (9)
- VERB-Part--NOUN (2)
- VERB-Vnoun--NOUN (1)
- obj
- VERB--NOUN (104)
- VERB--NOUN-ADP(dà) (1)
- VERB--NOUN-ADP(mài) (1)
- VERB--NOUN-Gen (10)
- VERB--PRON (40)
- VERB--PRON-Nom (1)
- VERB-Vnoun--NOUN (2)
- VERB-Vnoun--PRON (1)
- iobj
- VERB--NOUN (1)
- VERB--PRON (10)
- VERB--PRON-Dat (31)
- VERB--PRON-Gen (1)
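The patterns above pair a verbal parent with a nominal or pronominal child and record the child's case where one is annotated. The sketch below shows how such patterns could be tallied from parsed tokens. It is an approximation under stated assumptions: tokens are plain dictionaries, the `-Part` and `-Vnoun` suffixes are read from a `VerbForm` feature, the `-Dat`, `-Gen` and `-Nom` suffixes from a `Case` feature, and adposition markers such as `ADP(dà)` are not handled. The toy sentence uses placeholder forms, not Hausa.

```python
from collections import Counter

CORE_RELATIONS = {"nsubj", "obj", "iobj"}


def parse_feats(feats):
    """Turn a FEATS string into a dict; '_' means no features."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def label(token):
    """Build labels such as 'VERB-Vnoun' or 'PRON-Dat' (assumed conventions)."""
    feats = parse_feats(token["feats"])
    if token["upos"] == "VERB":
        suffix = feats.get("VerbForm") if feats.get("VerbForm") in {"Part", "Vnoun"} else None
    else:
        suffix = feats.get("Case")
    return token["upos"] + (f"-{suffix}" if suffix else "")


def core_patterns(sentence):
    """Count (relation, 'PARENT--CHILD') patterns for verb parents."""
    by_id = {tok["id"]: tok for tok in sentence}
    counts = Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if (parent is not None and parent["upos"] == "VERB"
                and child["upos"] in {"NOUN", "PRON"}
                and child["deprel"] in CORE_RELATIONS):
            counts[(child["deprel"], f"{label(parent)}--{label(child)}")] += 1
    return counts


# Toy sentence with placeholder forms; the annotation is invented for illustration.
TOY_SENTENCE = [
    {"id": 1, "form": "w1", "upos": "NOUN", "feats": "_", "head": 2, "deprel": "nsubj"},
    {"id": 2, "form": "w2", "upos": "VERB", "feats": "_", "head": 0, "deprel": "root"},
    {"id": 3, "form": "w3", "upos": "PRON", "feats": "Case=Dat", "head": 2, "deprel": "iobj"},
    {"id": 4, "form": "w4", "upos": "NOUN", "feats": "Case=Gen", "head": 2, "deprel": "obj"},
    {"id": 5, "form": "w5", "upos": "ADV", "feats": "_", "head": 2, "deprel": "advmod"},
]

toy_patterns = core_patterns(TOY_SENTENCE)
for (relation, pattern), n in sorted(toy_patterns.items()):
    print(f"{relation}: {pattern} ({n})")
```

The adverb in the toy sentence is skipped because only NOUN and PRON children are considered, mirroring the restriction stated at the top of this section.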
Relations Overview
- This corpus uses 6 relation subtypes: acl:relcl, advcl:cleft, cc:preconj, compound:prt, flat:name, obl:arg
- The following main type is never used alone; it is always subtyped: flat
- The following 6 relation types are not used in this corpus at all: csubj, expl, clf, list, orphan, goeswith
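The three statements in this overview follow directly from the relation inventory listed under Relations near the top of the statistics. As a minimal sketch, the code below recomputes them by comparing that inventory with the 37 universal relations of UD. The variable names are illustrative.

```python
# The 37 universal dependency relations defined by UD.
UNIVERSAL = set(
    "acl advcl advmod amod appos aux case cc ccomp clf compound conj cop "
    "csubj dep det discourse dislocated expl fixed flat goeswith iobj list "
    "mark nmod nsubj nummod obj obl orphan parataxis punct reparandum root "
    "vocative xcomp".split()
)

# Relation labels used in this corpus, as listed in the Relations inventory.
USED = (
    "acl acl:relcl advcl advcl:cleft advmod amod appos aux case cc cc:preconj "
    "ccomp compound compound:prt conj cop dep det discourse dislocated fixed "
    "flat:name iobj mark nmod nsubj nummod obj obl obl:arg parataxis punct "
    "reparandum root vocative xcomp".split()
)

subtypes = sorted(rel for rel in USED if ":" in rel)
used_alone = {rel for rel in USED if ":" not in rel}
subtyped_mains = {rel.split(":")[0] for rel in subtypes}

always_subtyped = subtyped_mains - used_alone          # never appear bare
unused = UNIVERSAL - used_alone - subtyped_mains       # never appear at all

print(len(subtypes), subtypes)
print(sorted(always_subtyped))
print(sorted(unused))
```

A main type counts as used when it appears either bare or with a subtype, which is why `flat` is reported as always subtyped rather than unused.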