This page pertains to UD version 2.

UD Nheengatu CompLin

Language: Nheengatu (code: yrl)
Family: Tupian

This treebank has been part of Universal Dependencies since the UD v2.11 release.

The following people have contributed to making this treebank part of UD: Leonel Figueiredo de Alencar, Dominick Maia Alexandre.

Repository: UD_Nheengatu-CompLin
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-NC-SA 4.0

Genre: spoken, bible, fiction, nonfiction, grammar-examples

Questions, comments? General annotation questions (either Nheengatu-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [leonel • de • alencar (æt) ufc • br]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation	Source
Lemmas	annotated manually
UPOS	annotated manually, natively in UD style
XPOS	annotated manually
Features	annotated manually, natively in UD style
Relations	annotated manually, natively in UD style

Description

UD_Nheengatu-CompLin is a treebank of Nheengatu, also known as Modern Tupi and Língua Geral Amazônica (ISO 639-3: yrl). It comprises sentences drawn from a wide range of published sources, including spontaneous speech, grammatical descriptions, fables, myths, coursebooks, and dictionaries.

This is the first morphosyntactic treebank of Nheengatu. It remains a work in progress, with ongoing expansion planned for the coming months.

The published sources are freely available online. The sentences were extracted from text-based PDF files, transcribed from non-searchable (image-only) PDFs, or manually converted from phonetic transcriptions into orthography. Throughout the treebank, we generally adopt the spelling system proposed by Avila (2021), diverging from it only in a few cases.

The annotation was performed semi-automatically: we first applied the Yauti morphosyntactic analyzer (de Alencar 2023, 2025) to each sentence and then manually revised the output.

The development of this treebank and related tools is part of the research activities of the Research Group on Computation and Natural Language (Computação e Linguagem Natural — CompLin) at the Humanities Center of the Federal University of Ceará, Brazil. The main contributor to this effort is Leonel Figueiredo de Alencar, coordinator of the CompLin group. Additional annotators include Dominick Maia Alexandre, Hélio Leonam Barroso Silva, and Juliana Lopes Gurgel, who was a scholarship holder in the DACILAT project funded by the São Paulo Research Foundation (Fundação de Amparo à Pesquisa do Estado de São Paulo — FAPESP), Process No. 22/09158-5.

The following repository contains the most up-to-date development version of the treebank, as well as related tools and resources:

https://github.com/CompLin/nheengatu

The treebank currently includes examples from Seixas (1853), Hartt (1872), Magalhães (1876), Sympson (1877), Rodrigues (1890), Aguiar (1898), Costa (1909), Studart (1926), Amorim (1928), Hartt (1938), Moore, Facundes, and Pires (1994), Casasnovas (2006), Cruz (2011), Comunidade de Terra Preta (2013), Stradelli (1929/2014), Navarro (2016), Melgueiro, Câmara, and Martins (2019), Muller et al. (2019), de Alencar (2021), Avila (2021), and Melgueiro (2022), as well as from the Novo Testamento na língua Nyengatu (1973/2019) and issues 3 and 17 of the Leetra Indígena journal (Universidade Federal de São Carlos, 2014, 2015).

Acknowledgments

We thank Eduardo de Almeida Navarro (University of São Paulo) for kindly allowing us to use examples and texts from his coursebook (Navarro 2016), whose glossary served as the initial basis for the morphological analyzer used to annotate the UD_Nheengatu-CompLin treebank.

We are greatly indebted to the dictionary by Avila (2021), from which numerous treebank sentences are drawn. This resource also provided invaluable lexical, grammatical, and semantic information for the further development of the morphological analyzer and related annotation tools. We are especially grateful to its author, Marcel Twardowsky Avila, for making the XML version of the dictionary available to us and for clarifying many questions regarding its entries.

We gratefully acknowledge the scholarships awarded to annotators by the São Paulo Research Foundation (FAPESP), through the DACILAT project (Process No. 22/09158-5), and by the Foundation for the Support and Development of Research in the State of Ceará (FUNCAP).

We are indebted to Gabriela Lourenço Fernandes and Susan Gabriela Huallpa Huanacuni, interns at the Biblioteca Brasiliana Guita e José Mindlin of the University of São Paulo (USP), as well as to its research specialist and curator, João Marcos Cardoso, for their transcriptions of stories from Amorim (1928) and Rodrigues (1890).

We also thank the Federal University of Amazonas Press (Editora da Universidade Federal do Amazonas — UFAM), particularly its director, Sérgio Freire, for granting permission to incorporate texts from Casasnovas (2006) into the treebank.

License

The copyright of the treebank sentences and their translations remains with their respective authors. This data is made available solely to support research, teaching, and the learning of the Nheengatu language. It should not be used for commercial purposes. For more information, see LICENSE.txt.

References

Aguiar, Costa. (1898). Doutrina christã destinada aos naturaes do Amazonas em nhihingatu com traducção portugueza em face. Pap. e Tip. Pacheco, Silva & C.
Avila, Marcel Twardowsky. (2021). Proposta de dicionário nheengatu-português (Doctoral dissertation, University of São Paulo). https://doi.org/10.11606/T.8.2021.tde-10012022-201925
Casasnovas, Afonso. (2006). Noções de língua geral ou nheengatú: Gramática, lendas e vocabulário (2nd ed.). Editora da Universidade Federal do Amazonas; Faculdade Salesiana Dom Bosco.
Comunidade de Terra Preta. (2013). Fábulas de Terra Preta: Uma coletânea bilíngue.
Costa, D. Frederico. (1909). Carta pastoral de D. Frederico Costa bispo do Amazonas a seus amados diocesanos. Typ. Minerva.
Cruz, Aline da. (2011). Fonologia e gramática do nheengatú: A língua falada pelos povos Baré, Warekena e Baniwa. Netherlands National Graduate School of Linguistics.
de Alencar, Leonel Figueiredo. (2021). Uma gramática computacional de um fragmento do nheengatu / A computational grammar for a fragment of Nheengatu. Revista de Estudos da Linguagem, 29(3), 1717–1777. http://dx.doi.org/10.17851/2237-2083.29.3.1717-1777
de Amorim, Antonio Brandão. (1928). Lendas em nheêngatú e em portuguez. Revista do Instituto Historico e Geographico Brasileiro, 154(100), 9–475.
de Magalhães, J. V. C. (1876). O selvagem. Typographia da Reforma.
Hartt, Charles Frederick. (1872). Notes on the Lingoa Geral or Modern Tupi of the Amazonas. Transactions of the American Philological Association, 3, 58–76. https://www.jstor.org/stable/310258
Hartt, Charles Frederick. (1938). Notas sobre a língua geral, ou tupí moderno do Amazonas. Anais da Biblioteca Nacional do Rio de Janeiro, 51, 305–390. Rio de Janeiro: M. E. S. Serviço Gráfico.
Maslova, Irina. (2018). Tradução comentada de mitos e lendas amazônicas do nheengatu para o russo (Master’s thesis, University of São Paulo). https://doi.org/10.11606/D.8.2019.tde-22022019-175350
Melgueiro, Edilson Martins, Câmara, Ana Suelly Arruda, & Martins, Marci Fileti. (2019). Orações relativas em Nheengatú ou Ingatú. Revista Brasileira de Linguística Antropológica, 11(2), 16. https://doi.org/10.26512/rbla.v11i02.28115
Melgueiro, Edilson Martins. (2022). O Nheengatu de Stradelli aos dias atuais: uma contribuição aos estudos lexicais de línguas Tupí-Guaraní em perspectiva diacrônica (Doctoral dissertation, University of Brasília). http://repositorio2.unb.br/jspui/handle/10482/44655
Moore, Denny, Facundes, Sidney, & Pires, Nádia. (1994). Nheengatu (Língua Geral Amazônica), its history, and the effects of language contact. Department of Linguistics, University of California, Berkeley. https://escholarship.org/uc/item/7tb981s1
Muller, Jean-Claude, Dietrich, Wolf, Monserrat, Ruth, Barros, Cândida, Arenz, Karl-Heinz, & Prudente, Gabriel (Eds.). (2019). Dicionário de língua geral amazônica. Universitätsverlag Potsdam; Museu Paraense Emílio Goeldi.
Navarro, Eduardo de Almeida. (2016). Curso de língua geral (nheengatu ou tupi moderno): A língua das origens da civilização amazônica (2nd ed.). Centro Angel Rama, FFLCH, Universidade de São Paulo.
Novo Testamento na língua Nyengatu (2nd ed.). (2019). Missão Novas Tribos do Brasil. (Original work published 1973)
Rodrigues, João Barbosa. (1890). Poranduba amazonense ou kochiyma-uara porandub, 1872–1887. Typ. de G. Leuzinger & Filhos.
Seixas, Manoel Justiniano de. (1853). Vocabulario da lingua indigena geral para o uso do Seminario Episcopal do Pará. Typ. de Mattos e Compª.
Stradelli, Ermanno. (2014). Vocabulário português-nheengatu, nheengatu-português. Ateliê Editorial. (Original work published 1929)
Studart, Jorge. (1926). Ligeiras noções de língua geral. Revista do Instituto do Ceará, 40, 26–38.
Sympson, Pedro Luiz. (1877). Grammatica da lingua brazilica geral, fallada pelos aborigines das provincias do Pará e Amazonas. Typographia do Commercio do Amazonas.
Universidade Federal de São Carlos, Laboratório de Linguagens LEETRA. (2014). Leetra Indígena, 3(3) [Edição especial: Yasú Yapurũgtitá Yẽgatú]. São Carlos, SP: UFSCar.
Universidade Federal de São Carlos, Laboratório de Linguagens LEETRA. (2015). Leetra Indígena, 1(17) [Edição especial: Escola Kariamã conta umbuesá]. São Carlos, SP: UFSCar.

Statistics of UD Nheengatu CompLin

POS Tags

ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X

Features

AdpType – AdvType – Aspect – Case – Clitic – Compound – Definite – Degree – Deixis – Derivation – Evident – ExtPos – Foc – Modality – Mood – Number – Number[grnd] – Number[psor] – NumType – PartType – Person – Person[grnd] – Person[psor] – Polarity – Poss – PronType – PunctType – Red – Rel – Style – Tense – Typo – VerbForm – Voice
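In CoNLL-U files, the features listed above are stored in the FEATS column as Name=Value pairs separated by "|". Layered features carry a bracketed layer (for example Number[psor]), and several values of one feature are joined by commas (for example Acc,Nom). The following Python sketch shows one way to read such a string; the helper name parse_feats is hypothetical, and the example string is assembled from the tables on this page for the pronoun aintá, so the FEATS value actually stored in the treebank may differ.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into a dict {feature name: list of values}."""
    if feats == "_":                      # "_" means that no features are given
        return {}
    parsed = {}
    for pair in feats.split("|"):         # features are separated by "|"
        name, _, value = pair.partition("=")
        parsed[name] = value.split(",")   # "Acc,Nom" becomes ["Acc", "Nom"]
    return parsed


# Illustrative string only (an assumption, not copied from the treebank).
example = parse_feats("Case=Acc,Nom|Number=Plur|Person=3|PronType=Prs")
print(example["Case"])                    # ['Acc', 'Nom']

# Layered features keep their bracketed layer as part of the name.
print(parse_feats("Number[psor]=Sing|Person[psor]=3"))
```

Keeping the values as lists makes it straightforward to test whether a form is compatible with a given value, for instance whether "Nom" is among the Case values.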

Relations

acl – acl:relcl – advcl – advcl:relcl – advmod – amod – appos – aux – case – cc – ccomp – compound – conj – cop – csubj – dep – det – discourse – dislocated – expl – fixed – flat – goeswith – iobj – mark – nmod – nmod:poss – nsubj – nummod – obj – obl – orphan – parataxis – punct – reparandum – root – vocative – xcomp

Tokenization and Word Segmentation

This corpus contains 2839 sentences, 26444 tokens and 26848 syntactic words.

This corpus contains 7905 tokens (30%) that are not followed by a space.

This corpus does not contain words with spaces.
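The figures above follow the CoNLL-U conventions: lines with an integer ID are syntactic words, lines with a range ID such as 2-3 are multi-word tokens, and SpaceAfter=No in the MISC column marks a token that is not followed by a space. A minimal Python sketch of how such counts could be recomputed is given below. The function name corpus_counts is hypothetical, and the toy fragment is assembled from forms listed on this page; it is an illustration, not a sentence copied from the treebank.

```python
def corpus_counts(conllu_text):
    """Return (sentences, tokens, words, tokens_without_space) for CoNLL-U text."""
    sentences = tokens = words = no_space = 0
    covered_until = 0          # last word ID covered by the current multi-word token
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():               # a blank line ends the sentence
            in_sentence = False
            covered_until = 0
            continue
        if line.startswith("#"):           # sentence-level comment
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        cols = line.split("\t")
        word_id, misc = cols[0], cols[9]
        if "." in word_id:                 # empty node: neither token nor word
            continue
        if "-" in word_id:                 # multi-word token line
            covered_until = int(word_id.split("-")[1])
            is_token = True
        else:
            words += 1
            # A word inside a multi-word token is not a surface token itself.
            is_token = int(word_id) > covered_until
        if is_token:
            tokens += 1
            if "SpaceAfter=No" in misc.split("|"):
                no_space += 1
    return sentences, tokens, words, no_space


# Toy fragment (assumption): one sentence with one multi-word token.
SAMPLE = "\n".join([
    "# text = usú kaá-pe.",
    "1\tusú\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "2-3\tkaá-pe\t_\t_\t_\t_\t_\t_\t_\tSpaceAfter=No",
    "2\tkaá\t_\tNOUN\t_\t_\t1\tobl\t_\t_",
    "3\tpe\t_\tADP\t_\t_\t2\tcase\t_\t_",
    "4\t.\t_\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
])
print(corpus_counts(SAMPLE))               # (1, 3, 4, 1)

# The reported share of tokens not followed by a space: 7905 of 26444.
print(round(100 * 7905 / 26444))           # 30
```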

This corpus contains 210 types of words that contain both letters and punctuation. Examples: waá-itá, mira-itá, kwá-itá, amú-itá, kunhã-itá, apigawa-itá, anama-itá, maã-itá, nhaã-itá, kunhã-etá, taína-itá, pirá-itá, raíra-itá, rimirikú-itá, kamarara-itá, kariwa-itá, tayera-itá, yawé-yawé, mirá-piranga, pindá-itá, rundewara-itá, wirá-itá, amú-etá, apigawa-etá, kunhamukú-itá, mimbira-itá, mira-etá, mirá-itá, mú-itá, pirá-mirĩ, suú-itá, taria-itá, taíra-itá, tuixawa-etá, wirá-mirĩ, yepé-yepé, amú-tetamawara, amú-wirandé, arú-itá, ikewara-itá, iwá-itá, kunawarú-etá, kurabí-itá, kurasí-ara, kurumiwasú-itá, kurumĩ-itá, kurupira-itá, kuẽma-piranga, mbira-itá, mena-itá
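One way to identify word forms that mix letters and punctuation, such as the hyphenated plurals above, is to inspect Unicode general categories. The sketch below is an assumption about how such a check could be written, not a description of the script that generated this page; the function name has_letters_and_punct is hypothetical.

```python
import unicodedata


def has_letters_and_punct(form):
    """True if the form contains at least one letter and one punctuation mark."""
    categories = [unicodedata.category(ch) for ch in form]
    has_letter = any(cat.startswith("L") for cat in categories)   # Lu, Ll, ...
    has_punct = any(cat.startswith("P") for cat in categories)    # Pd, Po, ...
    return has_letter and has_punct


print(has_letters_and_punct("waá-itá"))   # True: letters plus a hyphen
print(has_letters_and_punct("mira"))      # False: letters only
print(has_letters_and_punct("[...]"))     # False: punctuation only
```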

This corpus contains 404 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
There are 221 types of multi-word tokens. Examples: árupi, pitérupi, maita, wírupi, iwí-pe, kaá-pe, resú-putari, asú-putari, kupixá-pe, paraname, Maã-ta, igarupá-pe, ipí-pe, kupé-pe, rembií-pe, ukwáu-putari, xamunhã-kwáu, Tupayú-pe, asú-kwáu, gantime, marã, pausá-pe, putiá-pe, remenari-putari, rupitá-pe, usú-putari, uwatá-kwáu, uyuká-putari, uyuyuká-putari, xasú-putari, Piauíwara, Ukiririntu, ambaú-putari, amuriwera, apurakí-putari, awá-ta, mixukúi, pawasá-pe, piá-pe, rasú-kwáu, resá-pe, resú-kwáu, ripí-pe, rumasá-pe, tatá-pe, unheẽwera, upisika-putari, xaseruka-kari, xawitá-kwáu, xibentu.
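The multi-word token figures can be cross-checked against the token and word counts: the difference 26848 - 26444 = 404 equals the number of multi-word tokens, which is what one expects if each of them spans exactly two syntactic words. The Python sketch below collects multi-word tokens from CoNLL-U text and repeats this arithmetic. The function name multiword_tokens is hypothetical, and the toy fragment is built from forms listed on this page rather than copied from the treebank.

```python
def multiword_tokens(conllu_text):
    """Return a list of (surface form, number of words) for each multi-word token."""
    found = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0]:                          # range ID such as "2-3"
            start, end = map(int, cols[0].split("-"))
            found.append((cols[1], end - start + 1))
    return found


# Toy fragment (assumption): one sentence with one multi-word token.
SAMPLE = "\n".join([
    "1\tusú\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "2-3\tkaá-pe\t_\t_\t_\t_\t_\t_\t_\t_",
    "2\tkaá\t_\tNOUN\t_\t_\t1\tobl\t_\t_",
    "3\tpe\t_\tADP\t_\t_\t2\tcase\t_\t_",
    "",
])
mwts = multiword_tokens(SAMPLE)
types = {form for form, _ in mwts}
average = sum(n for _, n in mwts) / len(mwts)
print(mwts, len(types), f"{average:.2f}")           # [('kaá-pe', 2)] 1 2.00

# Consistency of the reported figures: each two-word token adds one extra word.
print(26444 + 404 * (2 - 1) == 26848)               # True
```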

Morphology

Nominal Features

Number

Plur
- AUX-Fin: yasú, yaikú, pesú, yapuderi, Pekũi, Pepuderi, peikú, Tausú, taikú, tasú
- DET: kwá-itá, nhaã-itá, amú-itá
- NOUN: mira-itá, kunhã-itá, apigawa-itá, anama-itá, maã-itá, kunhã-etá, taína-itá, pirá-itá, kamarara-itá, kariwa-itá
- PRON: aintá, yané, ta, yandé, penhẽ, waá-itá, pe, amú-itá, kwá-itá, nhaã-itá
- VERB-Fin: yamunhã, yasú, yamaã, pemunhã, yaú, pemaã, yamanú, taunheẽ, pesendú, yayuká
- VERB-Vnoun: pemanduarisawa, yamanduarisawa, pekwasawa

Sing
- AUX-Fin: asú, xaikú, xasú, aikú, reikú, Ekũi, resú, Kũi, Hapuderi, Hasú
- DET: nhaã, kwá, kwaá, amú, amu
- NOUN: ara, mira, apigawa, manha, igara, tupana, paraná, kunhã, pituna, yautí
- PRON: i, se, waá, aé, ne, ixé, indé, nhaã, kwá, amú
- PROPN: Tupayú
- VERB-Fin: xasú, reputari, rerikú, asú, resú, remaã, remunhã, xarikú, amaã, amunhã
- VERB-Vnoun: remanduarisawa, hamanduarisawa, rekwawasawa, xarikusawa

Case

Acc,Nom
- PRON: aé, aintá, ixé, indé, ta, penhẽ, yandé, iné, yané, aúna

Dat
- PRON: ixéu, inéu, yanéu, indéu

Gen
- PRON: i, se, ne, yané, aintá, pe, ta, xe, yandé, U

Definite

Ind
- DET: yepé, muyepé
- PRON: yepé

Degree and Polarity

Degree

Aug
- ADJ: Sepiasú, panemawasú, pixeasú
- NOUN: buyawasú, miráwasú, pitunawasú, kiririwasú, iwawasú, iwiwasú, marikawasú, piawasú, tiapuwasú, yawaratewasú-itá
- VERB-Fin: kirimawausú, xirĩwasú

Cmp
- ADV: piri

Dim
- ADJ: purangamirĩ
- NOUN: Abumirĩ, fardamirĩ, kunhamirĩ, kurumirĩ, kurusamirĩ-etá, makakaí, wirawasumirĩ-etá, yasimirĩ-itá
- PRON: setaíra

Sup
- ADV: piri

Polarity

Neg
- PART: ti, intí, te, nẽ, nti, tenhẽ, umbaá, Teẽ, intíu, inté

Pos
- PART: eré, Eẽ, Aé

Verbal Features

Aspect

Compl
- PART: pawa, pá, páu, p

Cont
- PART: wé

Freq
- ADV: Asuiwara, Ikewara, kwayewara, sewara, yawewara
- NOUN: arawara, rukawara
- PART: wera, aikwewara
- VERB-Fin: Amanduariwara, Asuwara

Frus
- PART: yepé

Hab
- SCONJ: rametiwa
- VERB-Fin: ambautiwa, ukanhemutiwa, umundutiwa, upinaitikatiwa, upurungitatiwa, usutiwa, uyukatiwa

Imp
- PART: rẽ, ranhẽ, raẽ, raĩ, saĩ

Iter
- AUX-Fin: yayuíri
- VERB-Fin: uyuíri, xayuíri

Perf
- PART: ana, ã, wã, wana

Mood

Imp
- AUX-Fin: Ekũi, Kũi, resú, Pekũi, pesú
- VERB-Fin: remaã, yuri, Ekũi, Epurú, Iruri, retirika, eyuri, ikũi, pemunhã, remeẽ

Imp,Ind
- AUX-Fin: reikú, resú, pesú, Pepuderi
- VERB-Fin: rerikú, remunhã, resú, Remaã, remundú, rembeú, reruri, pemaã, pemunhã, pewatá

Ind
- AUX-Fin: uikú, usú, yasú, asú, xaikú, xasú, aikú, yaikú, reikú, upuderi
- VERB-Fin: unheẽ, usú, usika, umaã, umunhã, urikú, upitá, upisika, umbeú, uri

Tense

Fut
- PART: kurí, arama, arã, ku, rã, warama

Past
- PART: kwera, wera

Pres
- ADV: Asuiwara, Ikewara, kwayewara, sewara, yawewara
- NOUN: arawara, rukawara
- PART: aikwewara
- VERB-Fin: Amanduariwara, Asuwara

Voice

Mid,Pass
- VERB-Fin: uyumunhã, uyuyuká, uyuyumimi, xayumumeú, xayuruyari, Reyumupuranga, Reyuyumimi, Uyupurungitá, Xayumusakú, hayumukwaíra
- VERB-Inf: yumunhã, Yukindawa, Yumuatiri, yemuí, yumumeú, yumuseruka, yumuí, yupiruka, yusalvari

Evident

Nfh
- PART: paá

Pronouns, Determiners, Quantifiers

PronType

Art
- DET: yepé, muyepé
- PRON: yepé

Dem
- ADV: iké, ape, kwá, akití, aape, mi, kí, Mimi, mikití, ké
- DET: nhaã, kwá, kwaá, kwá-itá, aé, nhaã-itá
- PRON: nhaã, kwá, kwá-itá, kwaá, nhaã-itá, aé

Emp
- DET: aité
- PRON: aité

Ind
- ADV: mairamé, makití, mayé, masuí, marupí
- DET: amú, siiya, maã, siía, muíri, setá, yawé, siya, turusú, yawé-yawé
- PRON: maã, awá, amú, manungara, amú-itá, siya, siiya, mukũi-itá, setá, amú-etá

Int
- ADV: mayé, mamé, makití, mairamé, marama, marupí, mayawé, masuí, maita, Maí
- DET: maã, muíri, awá, Mawaá
- PRON: maã, awá, Muíri

Prs
- PRON: i, se, aintá, aé, ne, ixé, indé, yané, ta, yandé

Rel
- ADV: mamé, makití, mayé, marupí, masuí, mairamé
- DET: maã
- PRON: waá, waá-itá, awá, maã

Tot
- DET: panhẽ, upaĩ, muíri, upanhẽ, paiu
- PRON: panhẽ, upaĩ, muíri, pawé, upawé

NumType

Card
- NUM: mukũi, musapiri, yepé, pú-mukũi, sete, 1930, Oito, irundí, kwaru, nove

Ord
- ADV: mukũisawa, primeru

Poss

Yes
- PRON: se, i, ne, yané, aintá, ta, pe, xe, yandé

Person

1
- AUX-Fin: yasú, asú, xaikú, xasú, aikú, yaikú, yapuderi, Hapuderi, Hasú, apuderi
- PRON: se, ixé, yané, yandé, ixéu, xe, yanéu, su
- VERB-Fin: xasú, asú, xarikú, yamunhã, yasú, amaã, amunhã, aputari, xamunhã, yamaã
- VERB-Vnoun: yamanduarisawa, hamanduarisawa, xarikusawa

2
- AUX-Fin: reikú, Ekũi, resú, pesú, Kũi, Pekũi, Pepuderi, peikú, repuderi
- PRON: ne, indé, penhẽ, pe, iné, inéu, indéu, intí, n
- VERB-Fin: reputari, rerikú, resú, remaã, remunhã, pemunhã, reyuri, pemaã, renheẽ, rembeú
- VERB-Vnoun: pemanduarisawa, remanduarisawa, pekwasawa, rekwawasawa

3
- AUX-Fin: uikú, usú, upuderi, Tausú, taikú, tasú, taupuderi, urikú
- PRON: i, aintá, aé, ta, aúna, U, intá
- VERB-Fin: unheẽ, usú, usika, umaã, umunhã, urikú, upitá, upisika, umbeú, uri
- VERB-Vnoun: ukwawasawa, uyumimisawa, ukaamunusawa, umundisá, upukasawa, uputarisá, usikiesá, uyanasá

Number[psor]

Sing
- NOUN: suka, sera, ximirikú, taíra, ximiára, sesá, suaxara, sakakwera, sawa, sukwera

Other Features

AdpType
- Post
  - ADP: upé, kití, suí, irumu, rupí, supé, arama, xupé, resé, ramé
- Prep
  - ADP: até, té

AdvType
- Cau
  - ADV: aresé, ape, aramé, nhaãsé, kurumú, marama, Mairamé, marã
- Con
  - ADV: Ma, nuká
- Deg
  - ADV: reté, katú, piri, xinga, retana, yuíri, mirĩ, turusú, retã, puru
- Loc
  - ADV: iké, apekatú, mamé, ape, makití, marupí, kwá, masuí, akití, arupí
- Man
  - ADV: yawé, mayé, puranga, kwayé, kutara, puxí, kirimbawa, amurupí, katú, kurutẽi
- Mod
  - ADV: kuité, kuté
- Tim
  - ADV: asuí, kuíri, ape, aramé, aiwana, yeperesé, wirandé, ariré, kuxiima, aape

Clitic
- Yes
  - ADP: pe, upé, wara, me, arã
  - ADV: ntu, mi
  - PART: taá, wera, ta

Compound
- Yes
  - AUX-Inf: putari, kwáu, kari, kwá, vutari

Deixis
- Prox
  - ADV: iké, kwá, kí, ké, kwaá
  - DET: kwá, kwaá, kwá-itá
  - PRON: kwá, kwá-itá, kwaá
- Remt
  - ADV: ape, akití, aape, mi, Mimi, mikití, mumi
  - DET: nhaã, aé, nhaã-itá
  - PRON: nhaã, nhaã-itá, aé

Derivation
- Coll
  - NOUN: itatiwa, kapĩtiwa, mirawasutiwa, sakaitiwa, siringatiwa, wakutiwa
- Priv
  - ADJ: iwasuíma, santaíma, uyiima
  - ADV: tiapuíma
  - NOUN: Adana-ima, apisaíma, ara-ima, kiinha-ima, payaíma, sawa-ima, seraíma, tĩ-ima, ximirikú-ima
  - VERB-Fin: kiaíma

ExtPos
- ADV
  - ADV: yawé, Kutara, yawewara
  - PART: intí, ti, aikwé, nẽ
  - PRON: maã
- CCONJ
  - CCONJ: u
- DET
  - ADV: mayé
- PRON
  - ADV: Maí
  - PART: nẽ
- SCONJ
  - PRON: waá

Foc
- Yes
  - PART: tẽ, tenhẽ, té, katú, ra

Modality
- Cond
  - PART: maã, imú, amú, emú
- Proh
  - PART: te, tenhẽ, Teẽ

Number[grnd]
- Sing
  - ADP: sesé, suakí, sakakwera, sesewara, suaxara

PartType
- Emp
  - PART: tẽ, tenhẽ, té, katú, ra
- Exs
  - PART: aikwé, Aikuré, aikwewara
- Int
  - PART: taá, será, ta, taé, tu
- Mod
  - PART: paá, pu, maã, supí, te, eré, tenhẽ, tenupá, tenki, ipú
- Neg
  - PART: ti, intí, nẽ, nti, umbaá, intíu, inté
- Prs
  - PART: xukúi, Kusukúi, Masekúi

Person[grnd]
- 3
  - ADP: sesé, suakí, sakakwera, sesewara, suaxara

Person[psor]
- 3
  - NOUN: suka, sera, ximirikú, taíra, ximiára, sesá, suaxara, sakakwera, sawa, sukwera

PunctType
- Elip
  - PUNCT: [...]

Red
- Yes
  - ADJ: purapuranga, aíwa-aíwa, pixuna-pixuna
  - DET: yawé-yawé
  - NOUN: tapurú-tapurú
  - PRON: yawé-yawé
  - VERB-Fin: uyawiyawika, Akaá-kaá, Tasuú-suú, Utuká-tuká, aganaganari, atuká-tuká, ipukukapukuka, takaú-kaú, ukaúkaú, ukikiri

Rel
- Abs
  - NOUN: uka, tatá, ukara, tuixawa, tetama, timbiú, ukena, pé, teapú, tendawa
- Cont
  - ADP: resé, resewara, ruakí, rakakwera, renundé, aresé, rakwera, ruaxara, renuné, rikuyara
  - NOUN: ruka, raíra, ramunha, retama, rimirikú, rapé, rera, rupitá, rangawa, resá
  - SCONJ: resewara
  - VERB-Fin: rurí, resarái, raisú, rikwé, ranhẽ, rakú, rapí, rawa, renúi
  - VERB-Inf: ripiaka
- NCont
  - ADP: sesé, suakí, sakakwera, sesewara, suaxara
  - NOUN: suka, sera, ximirikú, taíra, ximiára, sesá, suaxara, sakakwera, sawa, sukwera
  - VERB-Fin: surí, sasí, tiapú, sakú, sikwé, setá, tipí, Ikupukú, sesaíma, sawa

Style
- Arch
  - ADP: aresé, resewara
  - AUX-Fin: xaikú, xasú
  - AUX-Inf: ikú
  - NOUN: suá, tuixawa, ukena, rangawa, Rapé, imirikú, ií, sakapira, sapiá, sera
  - PRON: se, yané, ne, yandé, aé, maã, i, pe, ixé, aúna
  - SCONJ: kurumu
  - VERB-Fin: xasú, xarikú, xamunhã, xaputari, xaú, xakwáu, xanheẽ, xawasemu, xayuíri, raisú
  - VERB-Inf: yumunhã, putari, rasú, yuká, munhã, nupá, watá, kutuka, kwáu, yakáu
  - VERB-Vnoun: xarikusawa
- Rare
  - ADP: renuné
  - NOUN: Yukasara, teapú
  - PRON: Se, ixé
  - VERB-Fin: Ururi, upiama, Uxipiá, umunhã, upena-upena
  - VERB-Inf: piamu, Xari, maramunhã, piama, puapuãmu

Typo
- Yes
  - ADJ: Iwatí, katú, menasara, puranga-itá, puriaisúa, suai, xapanema, xapuriaisúa
  - ADP: rũ, aresé, pu, suí, aramé, iruma, pipé, rumu
  - ADV: Maí, Mairamé, inte, Arareneíma, Mamé, Maramé, iramé, maãkití, mené, mumi
  - AUX-Fin: urikú
  - AUX-Inf: vutari
  - CCONJ: yuri
  - DET: muyepé, Maã, amu, paiu, riya
  - NOUN: kaxiwer, kunhã, kunhãbukú, mirikú, rupirunawa, uka, Mukura, Sumura-etá, Tapayuma, ara
  - NUM: Yepé, muyepé
  - PART: Aé, Ti, p, Aikuré, Intí, Masekúi, inté, maã, saĩ, tu
  - PRON: maã, i, Nhaã, U, intá, intí, n, se, su
  - SCONJ: Sa
  - VERB-Fin: ipiama, Humbú, Pempisasúa, Xamuarpakwári, Xapetika, Xenheẽ, Yapituú, a, akanhemu, imaasí

Syntax

Auxiliary Verbs and Copula

This corpus uses 1 lemma as a copula (cop): ikú.

This corpus uses 8 lemmas as auxiliaries (aux). Examples: sú, ikú, putari, kwáu, puderi, kari, kwá, yuíri.

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

nsubj
- VERB-Fin--NOUN (922)
- VERB-Fin--PRON (227)
- VERB-Fin--PRON-Acc,Nom (571)
- VERB-Fin--PRON-Gen (96)
- VERB-Inf--NOUN (40)
- VERB-Inf--PRON (4)
- VERB-Inf--PRON-Acc,Nom (15)
- VERB-Vnoun--NOUN (1)
- VERB-Vnoun--PRON (1)

obj
- VERB-Fin--NOUN (1057)
- VERB-Fin--NOUN-ADP(resé) (3)
- VERB-Fin--PRON (212)
- VERB-Fin--PRON-ADP(irũ) (1)
- VERB-Fin--PRON-Acc,Nom (225)
- VERB-Fin--PRON-Gen (8)
- VERB-Fin--PRON-Gen-ADP(irumu) (1)
- VERB-Inf--NOUN (8)
- VERB-Inf--PRON (1)
- VERB-Inf--PRON-Acc,Nom (4)
- VERB-Inf--PRON-Gen (25)
- VERB-Vnoun--PRON (1)

iobj
- VERB-Fin--NOUN (2)
- VERB-Fin--NOUN-ADP(resé) (1)
- VERB-Fin--NOUN-ADP(rã) (1)
- VERB-Fin--NOUN-ADP(supé) (43)
- VERB-Fin--NOUN-ADP(supé)-ADP(arama) (1)
- VERB-Fin--NOUN-ADP(xupé) (6)
- VERB-Fin--NOUN-ADP(xupé)-ADP(arama) (2)
- VERB-Fin--PRON (3)
- VERB-Fin--PRON-ADP(supé) (1)
- VERB-Fin--PRON-ADP(supé)-ADP(arama) (1)
- VERB-Fin--PRON-Acc,Nom (11)
- VERB-Fin--PRON-Acc,Nom-ADP(arama) (27)
- VERB-Fin--PRON-Acc,Nom-ADP(arã) (18)
- VERB-Fin--PRON-Dat (25)
- VERB-Fin--PRON-Gen-ADP(arama) (2)
- VERB-Fin--PRON-Gen-ADP(supé) (17)
- VERB-Fin--PRON-Gen-ADP(xupé) (48)
- VERB-Fin--PRON-Gen-ADP(xupé)-ADP(arã) (2)
- VERB-Inf--PRON-Acc,Nom-ADP(supé) (1)

Relations Overview

This corpus uses 3 relation subtypes: acl:relcl, advcl:relcl, nmod:poss
The following 2 relation types are not used in this corpus at all: clf, list
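Both statements can be derived from the list of relations given earlier on this page together with the 37 universal relations of UD v2. A short Python sketch of that derivation follows; the variable names are chosen for illustration only.

```python
# The 37 universal dependency relations of UD v2.
UNIVERSAL = {
    "acl", "advcl", "advmod", "amod", "appos", "aux", "case", "cc", "ccomp",
    "clf", "compound", "conj", "cop", "csubj", "dep", "det", "discourse",
    "dislocated", "expl", "fixed", "flat", "goeswith", "iobj", "list", "mark",
    "nmod", "nsubj", "nummod", "obj", "obl", "orphan", "parataxis", "punct",
    "reparandum", "root", "vocative", "xcomp",
}

# Relations listed for this treebank in the Relations section of this page.
USED = [
    "acl", "acl:relcl", "advcl", "advcl:relcl", "advmod", "amod", "appos",
    "aux", "case", "cc", "ccomp", "compound", "conj", "cop", "csubj", "dep",
    "det", "discourse", "dislocated", "expl", "fixed", "flat", "goeswith",
    "iobj", "mark", "nmod", "nmod:poss", "nsubj", "nummod", "obj", "obl",
    "orphan", "parataxis", "punct", "reparandum", "root", "vocative", "xcomp",
]

# A subtype is any relation with a language-specific part after the colon.
subtypes = sorted(rel for rel in USED if ":" in rel)

# A universal relation is unused if it occurs neither bare nor as a subtype base.
unused = sorted(UNIVERSAL - {rel.split(":")[0] for rel in USED})

print(subtypes)   # ['acl:relcl', 'advcl:relcl', 'nmod:poss']
print(unused)     # ['clf', 'list']
```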