UD for Bororo 
Tokenization and Word Segmentation
- Bororo uses all 17 UPOS tags.
- Tokenization and segmentation in Bororo are straightforward. There are no multiwords that require spaces or dashes.
- Words are delimited by whitespace characters.
- According to typographical rules, many punctuation marks are attached to a neighboring word. These are tokenized as separate tokens (words).
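The whitespace and punctuation rules above can be sketched as a small tokenizer. This is a minimal illustration, not the treebank's actual pipeline: the function name `tokenize` and the punctuation inventory are assumptions, and the sample sentence is taken from the Syntax section with a final period added.

```python
import re

# Assumption: this punctuation inventory is illustrative, not the treebank's own list.
PUNCT = r'[.,;:!?()"“”‘’]'

# A token is either a single punctuation mark or a maximal run of
# characters that are neither whitespace nor punctuation.
TOKEN_RE = re.compile(rf"{PUNCT}|(?:(?!{PUNCT})\S)+")


def tokenize(text):
    """Split on whitespace and detach punctuation marks as separate tokens."""
    return TOKEN_RE.findall(text)


print(tokenize("Adygore emage ewido."))
# ['Adygore', 'emage', 'ewido', '.']
```

Dashes are deliberately left out of the punctuation set, so a word containing one would stay a single token.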
Mapping UPOS to XPOS
UPOS | XPOS |
---|---|
ADJ | adj |
ADV | adv |
INTJ | intj |
NOUN | n |
PROPN | ppn |
VERB | v, vi, vt |
ADP | pp |
AUX | aux |
CCONJ | cc |
DET | det |
NUM | num |
PART | pcl |
PRON | pron, bi |
SCONJ | sc |
PUNCT | punct |
SYM | sym |
X | x |
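For conversion scripts, the table can be inverted into a lookup from XPOS to UPOS. The following is a minimal sketch: the names `UPOS_TO_XPOS`, `XPOS_TO_UPOS`, and `upos_for` are illustrative and not part of any UD tool, and falling back to X for unknown tags is an assumption.

```python
# The mapping table above, transcribed as data.
UPOS_TO_XPOS = {
    "ADJ": ["adj"],
    "ADV": ["adv"],
    "INTJ": ["intj"],
    "NOUN": ["n"],
    "PROPN": ["ppn"],
    "VERB": ["v", "vi", "vt"],
    "ADP": ["pp"],
    "AUX": ["aux"],
    "CCONJ": ["cc"],
    "DET": ["det"],
    "NUM": ["num"],
    "PART": ["pcl"],
    "PRON": ["pron", "bi"],
    "SCONJ": ["sc"],
    "PUNCT": ["punct"],
    "SYM": ["sym"],
    "X": ["x"],
}

# Invert it: in the table, each XPOS tag belongs to exactly one UPOS tag.
XPOS_TO_UPOS = {xpos: upos for upos, tags in UPOS_TO_XPOS.items() for xpos in tags}


def upos_for(xpos):
    """Return the UPOS tag for an XPOS tag; unknown tags fall back to X (assumption)."""
    return XPOS_TO_UPOS.get(xpos, "X")


print(upos_for("vt"))  # VERB
```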
Morphology
Nouns
Gender
The gender of nouns in Bororo follows the natural gender of animate nouns, i.e., nouns denoting males take masculine gender and nouns denoting females take feminine gender or the word ‘female, woman’. Inanimate nouns are genderless, but morphologically they follow the masculine pattern. Based on natural gender, some nouns may be marked as feminine. Modifiers may take a gender mark (agreement) only in the feminine singular.
Number
There are different ways of forming the plural of nouns in Bororo: deleting the last syllables of nouns ending in -edu, replacing the last vowel with -e, adding -e to the singular form, adding -doge to the stem, and adding -ge to nouns ending in -rewy, -wy, -epa, or -are. There are also instances of irregular plural forms, ablaut with change of the final vowel, and some forms that do not vary in the plural.
Tags
Person indexes
Person | Before consonant | Before vowel |
---|---|---|
1S | i- | it-, in-, ik- |
2S | a- | ak- |
3S | ∅, u- | |
3Anaf | tu-, pu- | t-, tud-, pud- |
1PL.EX | ce- | ced-, cen-, ceg- |
1PL.IN | pa- | pag- |
2PL | ta- | tag- |
3PL | e- | et-, en-, ek- |
3Anaf | tu-, pu- | t-, tud-, pud- |
The first person plural indexes distinguish between the values Ex (exclusive) and In (inclusive) of the feature Clusivity.
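In CoNLL-U, the person labels from the table could be spelled out as UD feature strings. The sketch below is an assumption about how that mapping might look, not the treebank's documented annotation: it uses the universal features Person, Number, and Clusivity, and it leaves out the anaphoric index (3Anaf), whose encoding is not specified here. The names `PERSON_FEATS` and `feats_for` are illustrative.

```python
# Assumed mapping from the table's person labels to UD features.
PERSON_FEATS = {
    "1S": {"Person": "1", "Number": "Sing"},
    "2S": {"Person": "2", "Number": "Sing"},
    "3S": {"Person": "3", "Number": "Sing"},
    "1PL.EX": {"Person": "1", "Number": "Plur", "Clusivity": "Ex"},
    "1PL.IN": {"Person": "1", "Number": "Plur", "Clusivity": "In"},
    "2PL": {"Person": "2", "Number": "Plur"},
    "3PL": {"Person": "3", "Number": "Plur"},
}


def feats_for(label):
    """Render the features as a CoNLL-U FEATS string, sorted by feature name."""
    feats = PERSON_FEATS[label]
    return "|".join(f"{name}={value}" for name, value in sorted(feats.items()))


print(feats_for("1PL.EX"))  # Clusivity=Ex|Number=Plur|Person=1
```

Only the first person plural carries Clusivity; all other labels reduce to Person and Number.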
- Nouns are either possessed or unpossessed. Possessed nouns are either alienably or inalienably possessed. Inalienably possessed nouns in Bororo are kinship terms and body parts.
Iia
i=ia
1SG=mouth
my mouth
Aparo
a=paro
2SG=axe
Your axe
Bororo has no copula and no auxiliary verbs.
Features
Syntax
Bororo is an ergative language. S, A, and O are marked by the same set of bound indexes. However, S and O appear in the same construction, i.e., they attach to the predicate. A is always marked by a bound index that carries TMA and negation markers and is detached from the predicate.
Imaragodyre
i-maragody=re
1SG-work=ASS
The A argument of transitive verbs is indexed on the mood or aspect marker, and the O argument is bound to the verb.
adygore emage ewido
adygo=re emage e=bito
jaguar=IND they 3.PL=kill
The jaguar killed them
adygore ewido
adygo=re e=bito
jaguar=IND 3.PL=kill
The jaguar killed them
Ure ewido
u=re e=bito
3.SG=IND 3.PL=kill
The jaguar killed them
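The glossed examples above pair each morpheme with a gloss, using `=` and `-` as morpheme boundaries. A small helper can align the two lines, which makes it easy to see which host each index attaches to. This is a minimal sketch: the function name `align_gloss` is illustrative, and the boundary set is taken from the examples in this section.

```python
import re

# Morpheme boundaries that occur in the examples of this section.
BOUNDARY = re.compile(r"[=-]")


def align_gloss(morph_line, gloss_line):
    """Pair every morpheme with its gloss, word by word."""
    words = morph_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("the two lines have different numbers of words")
    aligned = []
    for word, gloss in zip(words, glosses):
        morphs = BOUNDARY.split(word)
        tags = BOUNDARY.split(gloss)
        if len(morphs) != len(tags):
            raise ValueError(f"morpheme count mismatch in {word!r} / {gloss!r}")
        aligned.append(list(zip(morphs, tags)))
    return aligned


for word in align_gloss("adygo=re emage e=bito", "jaguar=IND they 3.PL=kill"):
    print(word)
# [('adygo', 'jaguar'), ('re', 'IND')]
# [('emage', 'they')]
# [('e', '3.PL'), ('bito', 'kill')]
```

In the output, the mood marker sits on the A argument while the 3.PL index sits on the verb, matching the description of where A and O are marked.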
In transitive clauses, nothing may intervene between the A argument and the O-predicate slot. Adjuncts follow the predicate, and if they are fronted, they are morphologically marked.
There is a clear preference for subordinate clauses to precede main clauses, although this is not obligatory.
- Nonverbal predication distinguishes the following semantic types:
Treebanks
There are N Bororo UD treebanks: