
This page pertains to UD version 2.

UD Mbya Guarani Thomas

Language: Mbya Guarani (code: gun)
Family: Tupian

This treebank has been part of Universal Dependencies since the UD v2.4 release.

The following people have contributed to making this treebank part of UD: Guillaume Thomas.

Repository: UD_Mbya_Guarani-Thomas
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15

License: CC BY-NC-SA 4.0

Genre: nonfiction

Questions, comments? General annotation questions (either Mbya Guarani-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [guillaume • thomas (æt) utoronto • ca]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas assigned by a program, not checked manually
UPOS assigned by a program, with some manual corrections, but not a full manual verification
XPOS annotated manually
Features annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
Relations assigned by a program, with some manual corrections, but not a full manual verification

Description


UD Mbya_Guarani-Thomas is a corpus of Mbyá Guaraní (Tupian) texts collected by Guillaume Thomas. The current version of the corpus consists of three speeches by Paulina Kerechu Núñez Romero, a Mbyá Guaraní speaker from Paraguay. These speeches were recorded in August 2017 in the Mbyá Guaraní community Ytu, Caazapá Department, Paraguay. They were transcribed by Ronaldi Recalde Centurion (Ytu community) and translated into Brazilian Portuguese by Alberto Álvares. The texts were interlinearized in SIL FieldWorks Language Explorer (Black and Simons 2006) and manually annotated in UD in Arborator (Gerdes 2013) by Guillaume Thomas. Features were converted automatically from the morphological glosses added in SIL FieldWorks Language Explorer.

The official release is updated only every six months. Consider using the development version of the corpus, which contains the latest improvements.

Acknowledgments

The development of the corpus was supported by a Connaught New Researcher Award to Guillaume Thomas at the University of Toronto.

Special thanks are due to Paulina Kerechu Núñez Romero for allowing us to use these recordings, and to Ronaldi Recalde Centurion and Alberto Álvares for their essential role in transcribing and translating these recordings.

References

Statistics of UD Mbya Guarani Thomas

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB

Features

Clusivity, Clusivity[obj], Clusivity[psor], Clusivity[subj], Mood, Number, Number[psor], NumType, Person, Person[obj], Person[subj], Polarity, PronType, Subcat, VerbForm

Relations

acl, advcl, advmod, amod, appos, case, cc, ccomp, compound, compound:svc, conj, cop, csubj, dep:mod, det, discourse, dislocated, dislocated:cleft, fixed, flat, list, mark, nmod, nsubj, nummod, obj, obl, obl:sentcon, parataxis, parataxis:rep, punct, reparandum, root, vocative, xcomp
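The inventories of POS tags, features and relations listed above can in principle be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch using only the Python standard library, assuming the standard ten-column CoNLL-U layout. The function name `inventory` and the toy sample are illustrative only. The forms `w1`, `w2`, `w3` are placeholders, not real Mbyá Guaraní words, and this is not the script used to generate the official statistics.

```python
from collections import Counter

# Toy CoNLL-U fragment. The forms are placeholders, not real Mbyá Guaraní words.
SAMPLE = """\
# sent_id = toy-1
1\tw1\tw1\tPRON\t_\tPerson=1|PronType=Prs\t2\tnsubj\t_\t_
2\tw2\tw2\tVERB\t_\tPerson[subj]=1\t0\troot\t_\t_
3\tw3\tw3\tNOUN\t_\t_\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def inventory(conllu_text):
    """Count UPOS tags, feature names and relation labels in CoNLL-U text."""
    upos, feats, rels = Counter(), Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or comment line
        cols = line.split("\t")
        # Keep ordinary tokens only: skip multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos[cols[3]] += 1   # column 4: UPOS
        rels[cols[7]] += 1   # column 8: DEPREL
        if cols[5] != "_":   # column 6: FEATS, e.g. "Person=1|PronType=Prs"
            for pair in cols[5].split("|"):
                feats[pair.split("=", 1)[0]] += 1
    return upos, feats, rels


upos, feats, rels = inventory(SAMPLE)
print(sorted(upos), sorted(feats), sorted(rels))
```

To run this on the actual treebank, read a `.conllu` file from the repository with UTF-8 encoding and pass its contents to `inventory`.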

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
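The restriction described above (verb as parent, noun or pronoun as child) could be applied to CoNLL-U data roughly as follows. This is a sketch under stated assumptions: the function name `verb_nominal_relations` is hypothetical, "noun or pronoun" is read as the UPOS tags NOUN and PRON (whether PROPN should also count is not stated here), and the sample forms are placeholders rather than real Mbyá Guaraní words.

```python
from collections import Counter

# Toy CoNLL-U fragment. The forms are placeholders, not real Mbyá Guaraní words.
SAMPLE = """\
# sent_id = toy-1
1\tw1\tw1\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tw2\tw2\tVERB\t_\t_\t0\troot\t_\t_
3\tw3\tw3\tNOUN\t_\t_\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def verb_nominal_relations(conllu_text, child_tags=("NOUN", "PRON")):
    """Count relation labels whose parent is a VERB and whose child is in child_tags."""
    counts = Counter()
    sentence = {}  # token ID -> (UPOS, HEAD, DEPREL) for the current sentence

    def flush():
        # Resolve each token's head within the finished sentence, then reset.
        for upos, head, deprel in sentence.values():
            parent = sentence.get(head)  # None for the root (HEAD = 0)
            if parent is not None and parent[0] == "VERB" and upos in child_tags:
                counts[deprel] += 1
        sentence.clear()

    # The extra empty string guarantees the last sentence is flushed.
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():
            flush()
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Ordinary tokens only: skip multiword ranges and empty nodes.
        if len(cols) == 10 and cols[0].isdigit():
            sentence[cols[0]] = (cols[3], cols[6], cols[7])
    return counts


print(verb_nominal_relations(SAMPLE))
```

Passing `child_tags=("NOUN", "PRON", "PROPN")` would include proper nouns as children, should that reading be preferred.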

Relations Overview