UD Western Sierra Puebla Nahuatl ITML
Language: Western Sierra Puebla Nahuatl (code: nhi)
Family: Uto-Aztecan
This treebank has been part of Universal Dependencies since the UD v2.11 release.
The following people have contributed to making this treebank part of UD: Robert Pugh, Marivel Huerta Mendez, Mitsuya Sasaki, Francis Tyers.
Repository: UD_Western_Sierra_Puebla_Nahuatl-ITML
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: spoken, fiction, grammar-examples, nonfiction
Questions, comments? General annotation questions (either Western Sierra Puebla Nahuatl-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [pughrob (æt) iu • edu]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
UD Western Sierra Puebla Nahuatl-ITML is a treebank consisting of sentences from written fiction and non-fiction, spontaneous speech, and grammar examples.
The treebank was pre-annotated for morphology using the apertium-nhi morphological analyzer (Pugh et al., 2021).
The morphological analyses were disambiguated and annotated for dependency structure by hand.
Acknowledgments
We would like to thank the following people for giving permission to use their sentences:
- Elizabeth Márquez Hernández
- Jaime Hernández Juárez
- Ubaldo Márquez Pérez
- Petra Schroeder
References
- Pugh, R., Tyers, F., and Huerta Mendez, M. (2021). Towards an open source finite-state morphological analyzer for Zacatlán-Ahuacatlán-Tepetzintla Nahuatl. In Proceedings of the 4th Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers), pages 80–85.
Statistics of UD Western Sierra Puebla Nahuatl ITML
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Animacy[obj] – Aspect – Case – Degree – Foreign – Gender – Mood – Movement – NounType – Number – Number[dat] – Number[obj] – Number[psor] – Number[subj] – Person – Person[dat] – Person[obj] – Person[psor] – Person[subj] – Polarity – Polite – PronType – Reflex – Subcat – Tense – Typo – VerbForm – Voice
Relations
acl – acl:relcl – advcl – advmod – advmod:neg – amod – appos – aux – case – cc – ccomp – compound – conj – cop – csubj – dep – det – discourse – dislocated – fixed – flat – goeswith – iobj – mark – nmod – nsubj – nummod – obj – obl – orphan – parataxis – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 909 sentences, 9224 tokens and 10132 syntactic words.
- This corpus contains 1714 tokens (19%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus does not contain words that contain both letters and punctuation.
- This corpus contains 835 multi-word tokens. On average, one multi-word token consists of 2.09 syntactic words.
- There are 583 types of multi-word tokens. Examples: oquihtoh, onauat, okatka, oyah, opeu, yen, ocatca, opew, nisihtzin, onipiyaya, oyahkeh, octlahtlanih, oniniquiya, owits, den, nakin, nicniu, ocahsito, occahcalaquito, omonacasmahman, onikoh, oniquihtoh, ontlasojtlaskia, oquitac, owalahkeh, oyaj, mai, mokiseguiro, oajsik, ochanchiwatoh, ocholoh, ocmatia, ocpiyaya, octlatlahtlanih, oehcoc, okichihchiwkeh, okichiw, okichiwkeh, okikwah, omochih, onechilhuaya, oniya, opanok, oseahsik, oyajkej, santipitzin, yotiehkokeh, Opanoc, Yomononotskeh, del.
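The distinction between tokens, syntactic words and multi-word tokens can be made concrete with a small counter over CoNLL-U text. The following is a minimal sketch, not the script that produced the statistics above: the helper names `count_units` and `row` are hypothetical, and the toy sample assumes that `oyah` splits into the auxiliary `o` and the verb form `yah` (both appear in the lists on this page, but the exact segmentation is an illustrative assumption).

```python
def count_units(conllu_text):
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    sentences = tokens = words = mwts = 0
    covered_until = 0      # last word ID covered by the current multi-word token
    in_sentence = False
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:                   # blank line ends a sentence
            in_sentence = False
            continue
        if line.startswith("#"):       # sentence-level comment
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
            covered_until = 0
        token_id = line.split("\t")[0]
        if "-" in token_id:            # range line: one surface token, several words
            mwts += 1
            tokens += 1
            covered_until = int(token_id.split("-")[1])
        elif "." in token_id:          # empty node: neither a token nor a word
            continue
        else:
            words += 1
            if int(token_id) > covered_until:
                tokens += 1            # word that is its own surface token
    return sentences, tokens, words, mwts


def row(token_id, form, upos="_"):
    """Build one 10-column CoNLL-U line; unspecified columns are left as '_'."""
    return "\t".join([token_id, form, "_", upos] + ["_"] * 6)


# Toy sample (assumed segmentation, not copied from the treebank).
SAMPLE = "\n".join([
    "# text = oyah.",
    row("1-2", "oyah"),
    row("1", "o", "AUX"),
    row("2", "yah", "VERB"),
    row("3", ".", "PUNCT"),
    "",
])

print(count_units(SAMPLE))  # (1, 2, 3, 1)

# Cross-check of the figures reported on this page.
tokens, words, mwts = 9224, 10132, 835
words_inside_mwts = words - (tokens - mwts)   # words that belong to some multi-word token
print(round(words_inside_mwts / mwts, 2))     # 2.09
print(round(100 * 1714 / tokens))             # 19
```

The cross-check at the end uses only the numbers reported above: 9224 tokens minus 835 multi-word tokens leaves 8389 single-word tokens, so 1743 syntactic words sit inside multi-word tokens, which gives the reported average of 2.09.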
Morphology
Tags
- This corpus uses 15 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: PART, SYM
- This corpus contains 87 lemmas tagged as pronouns (PRON): akaj, akih, akij, akin, akinoh, algo, aquih, aquin, ce, ciqui, ese, eso, itlah, les, lo, miak, miki, mowisiotzin, nada, nakin, namehuan, namejwan, nehhuatl, nehuatl, nehuatluatl, nehwatl, nej, nejuatl, nejwatl, nicanca, nicancah, nikanka, nin, ninih, nochi, nomehwah, non, nonoh, noso, notewah, ocsiquin, ok, okse, oksikin, que, se, semeh, siki, sikin, tehhuan, tehhuatl, tehuan, tehuat, tehuatl, tehwa, tehwah, tehwan, tehwatl, tej, tejuatl, tejwan, tercero, tlan, tleh, tlen, tleno, tlenoh, tlenohoh, tlenoj, tlensaso, todo, touatzin, ye, yehhuan, yehhuatl, yehua, yehuan, yehuatl, yehwa, yehwah, yehwatl, yehwuatl, yej, yejuatl, yejwan, yejwatl, yo
- This corpus contains 42 lemmas tagged as determiners (DET): cada, catqui, ce, ciqui, cualquier, det, dion, el, in, incoyotl, la, las, miac, miak, mic, mik, miqui, nakin, neca, necah, nicanca, nicancah, nikanka, nikankah, nin, nochi, nochtin, non, nonoh, occe, occiqui, okse, oksiki, quesqui, se, siki, sikin, siqui, tlen, tlenoh, un, uno
- Out of the above, 17 lemmas occurred sometimes as PRON and sometimes as DET: ce, ciqui, miak, nakin, nicanca, nicancah, nikanka, nin, nochi, non, nonoh, okse, se, siki, sikin, tlen, tlenoh
- This corpus contains 18 lemmas tagged as auxiliaries (AUX): _, catqui, estar, haber, huili, i, katki, ma, mach, mo, nimi, o, ok, pehua, pewi, ser, uili, wili
- Out of the above, 7 lemmas occurred sometimes as AUX and sometimes as VERB: _, catqui, huili, katki, pewi, ser, wili
- There are 2 (de)verbal forms:
- Fin
- VERB: katki, quihtoh, nauat, yah, mota, katka, niquihtoz, peu, yuwi, nesi
- Inf
- VERB: ver, dar
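Overlap statements such as "lemmas that occurred sometimes as PRON and sometimes as DET" amount to a set intersection over the LEMMA and UPOS columns. The sketch below shows the idea on a toy sample; the function name `lemmas_by_upos` is hypothetical, the lemmas are taken from the lists above, and the rows themselves are made up for illustration (each form is simply set equal to its lemma).

```python
from collections import defaultdict


def lemmas_by_upos(conllu_text):
    """Map each UPOS tag to the set of lemmas that carry it."""
    by_upos = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:   # skip multi-word token ranges and empty nodes
            continue
        by_upos[cols[3]].add(cols[2])          # column 4 is UPOS, column 3 is LEMMA
    return by_upos


def row(token_id, lemma, upos):
    """Toy 10-column line whose FORM equals its LEMMA."""
    return "\t".join([token_id, lemma, lemma, upos] + ["_"] * 6)


# Hypothetical rows; only the lemma/UPOS pairings come from the lists above.
SAMPLE = "\n".join([
    row("1", "se", "DET"),
    row("2", "se", "PRON"),
    row("3", "nochi", "DET"),
    row("4", "nochi", "PRON"),
    row("5", "yehwatl", "PRON"),
    row("6", "in", "DET"),
])

groups = lemmas_by_upos(SAMPLE)
both = sorted(groups["PRON"] & groups["DET"])
print(both)  # ['nochi', 'se']
```

The same intersection applied to `AUX` and `VERB` would give the second overlap list.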
Nominal Features
- Gender
- Fem
- NOUN: escuela, rana, danzas, irana, guerra, fiesta, máquina, conchas, días, historia
- Masc
- ADJ: mexicano, chistoso
- NOUN: pueblo, topueblo, burro, años, frasco, ejemplo, mardomos, pollito, Rancho, amigos
- PRON: lo
- PROPN: estados, unidos
- Number
- Plur
- ADJ: huehhueyen, tlaltitikten, tzahtzayactique, wihwinyeh, amables, malos, titzocotzitzin, tzocotzitzin
- NOUN: tokniwah, ceraokwilimeh, coyomeh, tipemeh, ichcame, siwameh, danzas, años, mopiluan, niconeuan
- NUM: nahuen, naweh, yeyen, millones
- PRON: tehwah, yejwan, tehuan, yehuan, yehwah, tehhuan, notewah, tehwan, tejwan, yehhuan
- PROPN: estados, unidos
- Sing
- ADJ: cualli, kwali, igual, chikawak, kwalli, atrasado, chicahuac, chicauac, cuali, kuali
- NOUN: ich, itich, tonal, atl, ica, miston, ika, itzcuintli, pueblo, telpukatl
- PRON: neh, yeh, teh, yej, touatzin, ye, nehwatl, nej, tej, yehwatl
- PROPN: Ticpintzin
- Case
- Abs
- NOUN: atl, itzcuintli, telpukatl, ilwitl, itskwintli, tlakatl, masatl, altipetl, kowatl, tikitl
- Acc
- PRON: lo
Degree and Polarity
- Degree
- Dim
- ADJ: chihchikichih, hueyihtzin, kualtzin, tzocotzin, titzocotzitzin, tzocotzitzin
- ADV: tzocotzin, kwaltsin, tipitzin, tzokotzitzin
- NOUN: tenantzin, isihtzin, tipitzin, guitarritas, moawitzin, namokniwantzitzin, nisijtzin, nowewetzin, tesihtzin, totlaoltzin
- PRON: namejwantzitzin, mowisiotzin
- PROPN: Ticpintzin
- Polarity
- Neg
- ADV: amo, akmo
Verbal Features
- Aspect
- Imp
- AUX: katka, catca, nicatca, Okatka, catcah, ticatca, ticatcah, uilia, uiliah
- VERB-Fin: katka, nipiyaya, catca, niniquiya, cmatia, cpiyaya, nechilhuaya, niquilhuaya, tiquitiya, yaya
- Perf
- VERB-Fin: quihtoh, nauat, yah, peu, pew, yahkeh, choloh, ctlahtlanih, wits, cahsito
- Prog
- VERB-Fin: tikitok, kichihchiwtok, kipixtok, nitiquitoc, tentok, tsikwintok, cchixtoya, cholohtokeh, cualantoc, ilpitoya
- Mood
- Cnd
- VERB-Fin: ntlasojtlaskia, xnechoncaquinih, cchiuilsquiah, kyektlalani, nmitzwalikiliskia, nmokowani, nmokowiskia, nyani, nyaskia, okaltlachixtoskia
- Imp
- VERB-Fin: xiyo, ixquita, Ixcochi, ixmeua, ixtlaocoya, nikixmati, xiyahkah, xoncualani, Ixcaqui, Ixnechmaka
- Ind
- VERB-Fin: katki, quihtoh, nauat, yah, mota, katka, niquihtoz, peu, yuwi, nesi
- VERB-Inf: ver, dar
- Opt
- AUX: ito
- VERB-Fin: kiseguiro, Chaueh, cequitta, moskalti, motlamochiwa, tiakan, Chuhue, ceicxipalti, cequimpiya, cpiyacan
- Prp
- VERB-Fin: moweyilihtih, cahsito, ccahcalaquito, chanchiwatoh, kipantlatiwitih, cnaliluih, ctemoto, Iktato, Mopatlatiweh, ahcito
- Sub
- VERB-Fin: sea
- Tense
- Fut
- AUX: wilis, niiski, pewis, uilis
- VERB-Fin: niquihtoz, niyaz, tiyas, itmatis, niyas, tlamis, icchiuas, itkwalikas, Tikwikas, atliz
- Past
- AUX: katka, catca, nicatca, peuh, Okatka, catcah, peh, ticatca, ticatcah, uilia
- VERB-Fin: quihtoh, nauat, yah, peu, katka, pew, nipiyaya, yahkeh, catca, choloh
- Pqp
- VERB-Fin: cholohca, yomikka
- Pres
- AUX: pewi, wili, huili, katej, katki, nica, peweh
- VERB-Fin: katki, yuwi, kah, yuweh, kateh, moweyilihtih, nesi, tikitok, ehko, kinchiwaj
- Voice
- Act
- VERB-Fin: tehco, yah
Pronouns, Determiners, Quantifiers
- PronType
- Prs
- PRON: neh, yeh, teh, yej, tehwah, yejwan, touatzin, tlenoh, ye, nehwatl
- Reflex
- Yes
- VERB-Fin: moniki, monacasmahman, mocelebraroa, mochih, mochiwa, mokawa, inmokawas, ixmeua, moapareserowa, mocaquiya
- Person
- 1
- PRON: neh, tehwah, nej, tehuan, nehwatl, tehhuan, notewah, tehwan, tejwan
- 2
- PRON: teh, touatzin, tej
- 3
- PRON: yeh, yej, yejwan, ye, yehuan, yehwah, yehwatl, lo, yehhuan
- Polite
- Form
- NOUN: tonnomaman
- PRON: touatzin
- VERB-Fin: xnechoncaquinih, xoncualani, Inmitzontlasoj, Inmitzontlatlawtia, Innamechonnonotzas, Itkomonikiltijtzinowa, Ixnechonmaka, cmopaleuilih, itconchiuas, itkonikis
- Number[psor]
- Plur
- NOUN: topueblo, tokniwah, iwah, inchan, inuan, iuan, ninmaman, totikih, ima, intempileh
- Sing
- NOUN: ich, itich, ica, ika, itoka, itskwih, irana, nomama, mopiluan, niconeuan
Other Features
- Animacy[obj]
- Hum
- VERB-Fin: Intekakilia, cectemacaz, ntecaquilih
- Nhum
- VERB-Fin: inquintlamaca, tlakwa, tlatemoa, sekintlanamakiltihtok, titlapiyayah, tlahtoua, tlajtowaj, ictlamacasqueh, ictlamacatiueh, ittlacuasqueh
- Foreign
- Yes
- ADJ: atrasado, Primera, mexicano, patronal, primer, reconocido, tranquilo, Pobre, amables, chistoso
- ADP: de, para, a, por, hasta, en, desde, como, sin
- ADV: después, entonces, pues, ahorita, siempre, ahora, igual, bueno, más, casi
- AUX: es
- CCONJ: pero, y, o
- DET: cada, l, las, cualquier, un
- INTJ: bueno, sí, A
- NOUN: pueblo, topueblo, escuela, rana, danzas, irana, burro, guerra, vez, años
- NUM: ocho, quince, dieciocho, nueve, siete, veinte, millones
- PRON: eso, que, nada, tercero, todo
- PROPN: estados, unidos, español, dios
- SCONJ: porque, que, como, cuando, hasta, para, Mejor
- VERB-Fin: sé, Anda, ponen, sale, sea, sirves
- VERB-Inf: ver, dar
- Movement
- Ven
- VERB-Fin: walah, walkisa, chualitac, cualan, hualah, ixquimahsiqui, kanakih, kiyakatitiki, nakwalnotzkej, namechonajsikiw
- NounType
- Relat
- NOUN: ich, itich, ica, ika, iwah, inuan, iuan, icah, ikaj, itch
- Number[dat]
- Plur
- VERB-Fin: inquintlamaca, techmaka, itquinmaca, sekintlanamakiltihtok, techchiuilis, techilweh, kimlwih, namechchiuilisqueh, namechpixquiltisqueh, nankimonchiwiliaj
- Sing
- VERB-Fin: nechilhuaya, niquilhuaya, cnaliluih, icchiuilia, kilwia, kilwiah, nechmakakeh, quiluih, Inkakiltia, Inmitzilwia
- Number[obj]
- Plur
- VERB-Fin: kinchiwaj, kinkowaj, quinpiyaya, quintlasohtlaya, Innamechonnonotzas, Namechcuecuechosque, Techmojmojtijtos, Techtrataroa, cequimpiya, inquincuis
- Sing
- VERB-Fin: quihtoh, niquihtoz, niquihlnamiqui, sekichiwa, kiniki, kita, nipiyaya, ctlahtlanih, kitlasohtla, nicpiya
- Number[subj]
- Plur
- AUX: katka, catca, nicatca, Okatka, catcah, katej, peweh, ticatca, ticatcah, uiliah
- NOUN: itich, mexicanos, tiautoridades
- NUM: tiochoque
- VERB-Fin: katka, sekichiwa, yahkeh, yuweh, yuwi, catca, kateh, cahsito, ccahcalaquito, chanchiwatoh
- Sing
- AUX: pewi, wili, huili, peuh, wilis, ito, katki, nica, niiski, peh
- NOUN: tonnomaman, intemaman, itconetl, itnoconeu, ittlaol, ticol, tipresidente
- VERB-Fin: katki, quihtoh, nauat, yah, mota, niquihtoz, peu, nesi, niquihlnamiqui, niyaz
- Person[dat]
- 1
- VERB-Fin: techmaka, nechilhuaya, nechmakakeh, techchiuilis, techilweh, Ixnechmaka, Ixnechonmaka, Nechilwiaj, nechchiwilayaj, nechihlhuaya
- 2
- VERB-Fin: Inmitzilwia, Inmitzontlasoj, inmitziluis, inmitzwalikilis, mitzonilwij, mitztlamacas, namechchiuilisqueh, namechpixquiltisqueh, nimitzcohuiz, nimitzincohuiliz
- 3
- VERB-Fin: inquintlamaca, itquinmaca, niquilhuaya, cnaliluih, icchiuilia, kilwia, kilwiah, quiluih, sekintlanamakiltihtok, Inkakiltia
- Person[obj]
- 1
- VERB-Fin: nechkokowa, xnechoncaquinih, Ixnechniltocacan, Onechajsik, Techmojmojtijtos, Techtrataroa, Tinechcactoc, itnechowikas, ittechonmatlanis, ixnechcaqui
- 2
- VERB-Fin: nimitzpalehuiz, Inmitzontlatlawtia, Innamechonnonotzas, Itmitzmochiyalia, Namechcuecuechosque, mitzijijtowaj, mitzonwikak, mitzpactiya, namechcuasqueh, namechonajsikiw
- 3
- VERB-Fin: quihtoh, niquihtoz, niquihlnamiqui, sekichiwa, kiniki, kita, nipiyaya, ctlahtlanih, kitlasohtla, nicpiya
- Person[psor]
- 1
- NOUN: topueblo, tokniwah, nomama, tonnomaman, notlahuical, nochan, notzcuin, totikih, nocnihuan, nocompañeros
- 2
- NOUN: mopiluan, mochah, mochantzin, mocniuan, moixkuya, mopaleta, mopapan, mopilhuan, motareas, moticachuan
- 3
- NOUN: ich, itich, ica, ika, itoka, iwah, itskwih, irana, niconeuan, itoca
- Person[subj]
- 1
- AUX: nica, niiski, ticatca, ticatcah
- NOUN: itich, intemaman, tiautoridades
- NUM: tiochoque
- VERB-Fin: niquihtoz, niquihlnamiqui, niyaz, sekichiwa, nipiyaya, inhuili, nicpiya, niniquiya, inquintlamaca, nikoh
- 2
- AUX: nicatca
- NOUN: tonnomaman, itconetl, itnoconeu, ittlaol, ticol, tipresidente
- VERB-Fin: tiyas, itmatis, xiyo, itkwalikas, itquinmaca, ixquita, Ixcochi, Tikwikas, itmati, itpiya
- 3
- AUX: katka, pewi, wili, catca, huili, peuh, wilis, Okatka, catcah, ito
- NOUN: mexicanos
- VERB-Fin: katki, quihtoh, nauat, yah, mota, katka, peu, yuwi, nesi, pew
- Subcat
- Intr
- AUX: katka, pewi, wili, catca, huili, nicatca, peuh, wilis, Okatka, catcah
- VERB-Fin: katki, nauat, yah, katka, mota, peu, yuwi, nesi, niyaz, pew
- Tran
- AUX: ito
- VERB-Fin: quihtoh, niquihtoz, niquihlnamiqui, sekichiwa, kiniki, kita, nipiyaya, ctlahtlanih, itmatis, kitlasohtla
- VERB-Inf: ver, dar
- Typo
- Yes
- ADV: Kanji, Amo
- NOUN: conhas
- VERB-Fin: in, cnamiqia, ik, ittalpachosqueh, ixtzitzqui, kinuij, monquis, nimomochilia, ompoliwia
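Several of the features listed above are layered, for example `Number[psor]` and `Person[obj]`, where the bracketed part names the argument or possessor the value refers to. The following is a minimal sketch of how the FEATS column could be unpacked; the helper names `parse_feats` and `split_layer` are hypothetical, and the example string is assembled from values this page lists for `topueblo`, so it may not be the complete feature set used in the treebank.

```python
import re


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Feature name with an optional bracketed layer, e.g. Number[psor].
LAYERED = re.compile(r"^(?P<name>[A-Za-z0-9]+)(?:\[(?P<layer>[a-z0-9]+)\])?$")


def split_layer(feature_name):
    """Split 'Number[psor]' into ('Number', 'psor'); plain names get layer None."""
    match = LAYERED.match(feature_name)
    if match is None:
        raise ValueError(f"not a feature name: {feature_name!r}")
    return match.group("name"), match.group("layer")


# Assumed FEATS value for 'topueblo', built from the lists on this page.
feats = parse_feats("Foreign=Yes|Gender=Masc|Number[psor]=Plur|Person[psor]=1")

for name, value in feats.items():
    base, layer = split_layer(name)
    print(base, layer, value)
```

Grouping by `layer` in this way separates the possessor features (`psor`) from the subject, object and dative agreement features (`subj`, `obj`, `dat`) that appear on verbs.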
Syntax
Auxiliary Verbs and Copula
- This corpus uses 11 lemmas as copulas (cop). Examples: katki, catqui, yehwatl, ye, ser, yehuatl, yejuatl, i, yehhuatl, yej, yejwatl.
- This corpus uses 14 lemmas as auxiliaries (aux). Examples: o, ma, mo, pewi, _, mach, wili, uili, huili, pehua, estar, haber, nimi, ok.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (3)
- VERB--NOUN-Abs (1)
- VERB--PRON (4)
- VERB-Fin--NOUN (214)
- VERB-Fin--NOUN-ADP(de) (2)
- VERB-Fin--NOUN-Abs (109)
- VERB-Fin--NOUN-Abs-ADP(de) (1)
- VERB-Fin--PRON (212)
- obj
- VERB--NOUN (3)
- VERB--NOUN-Abs (1)
- VERB--PRON (2)
- VERB-Fin--NOUN (214)
- VERB-Fin--NOUN-ADP(de) (1)
- VERB-Fin--NOUN-Abs (99)
- VERB-Fin--NOUN-Abs-ADP(quemeh) (1)
- VERB-Fin--PRON (64)
- VERB-Fin--PRON-Acc (1)
- iobj
- VERB-Fin--NOUN (10)
- VERB-Fin--NOUN-Abs (1)
- VERB-Fin--PRON (5)
- VERB-Inf--NOUN-ADP(a) (1)
- VERB-Inf--PRON (1)
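Patterns such as `VERB-Fin--NOUN-Abs` combine the parent's UPOS and VerbForm with the child's UPOS and Case. A rough sketch of how such patterns could be counted is given below. It is an assumption about the procedure, not the script behind the numbers above: the names `label` and `argument_patterns` are hypothetical, adposition marking such as `ADP(de)` is left out, and the toy tree is made up from word forms listed on this page.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; '_' means no features."""
    return {} if feats == "_" else dict(p.split("=", 1) for p in feats.split("|"))


def label(cols, feature):
    """UPOS, optionally followed by the value of one feature (e.g. VERB-Fin)."""
    value = parse_feats(cols[5]).get(feature)
    return cols[3] if value is None else f"{cols[3]}-{value}"


def argument_patterns(rows, relations=("nsubj", "obj", "iobj")):
    """Count (relation, parent--child) patterns for VERB parents with NOUN/PRON children."""
    words = {cols[0]: cols for cols in rows if cols[0].isdigit()}
    counts = Counter()
    for cols in words.values():
        head = words.get(cols[6])              # None for the root (HEAD = 0)
        if head is None or head[3] != "VERB" or cols[3] not in ("NOUN", "PRON"):
            continue
        relation = cols[7].split(":")[0]       # ignore relation subtypes
        if relation in relations:
            pattern = f"{label(head, 'VerbForm')}--{label(cols, 'Case')}"
            counts[(relation, pattern)] += 1
    return counts


# Toy sentence (illustrative tree, not treebank data).
TOY = [
    ["1", "tlakatl", "_", "NOUN", "_", "Case=Abs", "2", "nsubj", "_", "_"],
    ["2", "quihtoh", "_", "VERB", "_", "VerbForm=Fin", "0", "root", "_", "_"],
    ["3", ".", "_", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]

patterns = argument_patterns(TOY)
print(patterns)  # Counter({('nsubj', 'VERB-Fin--NOUN-Abs'): 1})
```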
Relations Overview
- This corpus uses 2 relation subtypes: acl:relcl, advmod:neg
- The following 3 relation types are not used in this corpus at all: expl, clf, list
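Both statements can be re-derived from the relation list given earlier on this page. The sketch below compares that list with the 37 universal relations of UD v2; the variable names are hypothetical, and the set of universal relations is typed in from the UD guidelines rather than taken from this page.

```python
# The 37 universal dependency relations of UD v2 (from the UD guidelines).
UNIVERSAL_RELATIONS = set("""
acl advcl advmod amod appos aux case cc ccomp clf compound conj cop csubj dep
det discourse dislocated expl fixed flat goeswith iobj list mark nmod nsubj
nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split())

# Relations listed for this treebank in the "Relations" overview above.
USED = """
acl acl:relcl advcl advmod advmod:neg amod appos aux case cc ccomp compound
conj cop csubj dep det discourse dislocated fixed flat goeswith iobj mark nmod
nsubj nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split()

subtypes = sorted(rel for rel in USED if ":" in rel)   # language-specific subtypes
base_used = {rel.split(":")[0] for rel in USED}        # strip subtypes to get base relations
unused = sorted(UNIVERSAL_RELATIONS - base_used)

print(subtypes)  # ['acl:relcl', 'advmod:neg']
print(unused)    # ['clf', 'expl', 'list']
```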