
This page pertains to UD version 2.

UD Western Sierra Puebla Nahuatl MesoTree

Language: Western Sierra Puebla Nahuatl (code: nhi)
Family: Uto-Aztecan

This treebank has been part of Universal Dependencies since the UD v2.11 release.

The following people have contributed to making this treebank part of UD: Robert Pugh, Marivel Huerta Mendez, Mitsuya Sasaki, Francis Tyers, María Ximena Juarez Huerta, Ángeles Márquez Hernández.

Repository: UD_Western_Sierra_Puebla_Nahuatl-MesoTree
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-SA 4.0

Genre: spoken, fiction, grammar-examples, nonfiction

Questions, comments? General annotation questions (either Western Sierra Puebla Nahuatl-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [pughrob (æt) iu • edu]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

UD Western Sierra Puebla Nahuatl-MesoTree combines the existing UD Western Sierra Puebla Nahuatl-IU treebank (ITML), with some annotations updated to correct errors or reflect changed annotation decisions, and new sentences annotated as part of the NSF-funded project “Syntactically-annotated corpora for endangered languages in areal contact” (MesoTree).

The ITML treebank was pre-annotated for morphology using apertium-nhi (Pugh et al., 2021). The morphological analyses were disambiguated, and dependency structure annotated, by hand. The MesoTree data does not include morphological analyses at this time.

The treebank consists of sentences from written fiction and non-fiction, spontaneous speech, and grammar examples. The new additions also include a large set of sentences (ALIMG) translated into two subvarieties of the language, one from San Miguel Tenango, Zacatlán, and the other from Omitlán, Tepetzintla.

Acknowledgments

We would like to thank the following for giving permission to use their sentences.

References

Statistics of UD Western Sierra Puebla Nahuatl MesoTree

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X

Features

Aspect, Case, Degree, ExtPos, Foreign, Gender, Mood, Movement, NounType, Number, Number[obj], Number[psor], Number[subj], Person, Person[obj], Person[psor], Person[subj], Polarity, Polite, PronType, Reflex, Subcat, Tense, Typo, VerbForm, Voice

Relations

acl, acl:relcl, advcl, advmod, advmod:neg, amod, appos, aux, case, cc, ccomp, compound, conj, cop, csubj, dep, det, discourse, dislocated, fixed, flat, goeswith, iobj, mark, nmod, nsubj, nummod, obj, obl, orphan, parataxis, punct, reparandum, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
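
As a rough illustration of that restriction, the following minimal sketch counts dependency relations whose parent is tagged VERB and whose child is tagged NOUN or PRON in CoNLL-U data. It is not the script used to produce the UD statistics. The helper names `read_sentences` and `verb_nominal_relations`, the `child_upos` parameter, and the toy sentence (placeholder forms `w1` to `w6`, not treebank data) are all assumptions made for this example. Whether PROPN children should also count is likewise an assumption left to the caller.

```python
from collections import Counter

# Hypothetical toy sentence in CoNLL-U format (placeholder forms, not treebank data).
SAMPLE = """\
1\tw1\tw1\tVERB\t_\t_\t0\troot\t_\t_
2\tw2\tw2\tNOUN\t_\t_\t1\tobj\t_\t_
3\tw3\tw3\tADP\t_\t_\t4\tcase\t_\t_
4\tw4\tw4\tNOUN\t_\t_\t1\tobl\t_\t_
5\tw5\tw5\tPRON\t_\t_\t1\tnsubj\t_\t_
6\tw6\tw6\tADV\t_\t_\t1\tadvmod\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if rows:
                yield rows
                rows = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep only plain word lines; skip ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                rows.append(cols)
    if rows:
        yield rows


def verb_nominal_relations(text, child_upos=("NOUN", "PRON")):
    """Count DEPREL values where the parent is a VERB and the child is nominal."""
    counts = Counter()
    for sent in read_sentences(text):
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            upos, head, deprel = cols[3], cols[6], cols[7]
            if upos in child_upos and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
    return counts


counts = verb_nominal_relations(SAMPLE)
```

On the toy sentence, the adverb child and the adposition attached to a noun are excluded, so only the `obj`, `obl`, and `nsubj` relations are counted. To run the same count on a real file, read its contents as UTF-8 text and pass the string to `verb_nominal_relations`.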

Relations Overview