
This page pertains to UD version 2.

UD for Hausa

The Hausa language is represented by two treebanks: Northern Autogramm, for the Ader dialect of Niger Republic (Northern Hausa), and Southern Autogramm, for the Zaria dialect of Nigeria (Southern Hausa). Both are different from the Kano variety, generally accepted as Standard Hausa. The Ader (Northern) Hausa, together with the Sokoto variety, is a more archaic version of Standard Hausa, where some phonological rules have not applied. The Zaria (Southern) Hausa, on the other hand, is a “modern” version of the language where the 3-way opposition (masculine / feminine / plural) has been abandoned in the noun system, and only the plurality feature is maintained, while the feminine gender is kept in the pronominal and TAM system.

In Hausa, the TAM system is marked by pre-verbal Auxiliaries that combine Tense, Aspect, Mood, and subject agreement in Person, Number and Gender. These Auxiliaries can vary noticeably from one dialect to another. For example, the backgrounded progressive aspect is marked by // in Ader and /kèː/ in Zaria, while the same // marks the backgrounded perfect in Zaria.

Because it is difficult to select a single lemma that accounts for this dialectal variation, we have chosen to postpone the lemmatisation of auxiliaries until a solution is found for organising related treebanks of the same language.
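To illustrate what an unlemmatised auxiliary might look like to a tool that reads the treebanks, here is a minimal Python sketch. It assumes that the postponed lemma is left as the CoNLL-U underscore placeholder, which is the format's general convention for an unspecified value. The sample line, its feature value, and the helper names `parse_token` and `has_lemma` are hypothetical and are not taken from either Hausa treebank.

```python
# Hypothetical CoNLL-U token line, for illustration only: the form /kèː/ is the
# Zaria auxiliary mentioned above, but the feature value, head and relation are
# assumptions and are not copied from either treebank.
SAMPLE = "1\tkèː\t_\tAUX\t_\tAspect=Prog\t2\taux\t_\t_"

# The ten columns of a CoNLL-U token line, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_token(line):
    """Split one CoNLL-U token line into a dict keyed by column name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    token = dict(zip(FIELDS, values))
    # Expand "A=x|B=y" into a dict; "_" means no features are given.
    feats = token["feats"]
    token["feats"] = (
        {} if feats == "_"
        else dict(pair.split("=", 1) for pair in feats.split("|"))
    )
    return token


def has_lemma(token):
    """Return True when the lemma column holds something other than '_'."""
    return token["lemma"] != "_"


token = parse_token(SAMPLE)
print(token["form"], token["upos"], token["feats"], has_lemma(token))
```

A check like `has_lemma` lets downstream code skip or specially handle auxiliaries rather than treating the placeholder as a real lemma.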

Tokenization and Word Segmentation


This is an overview only. For more detailed discussion and examples, see the list of Hausa POS tags and Hausa features.





There are 2 Hausa UD treebanks: