Treebank Statistics: UD_Bororo-BDT: POS Tags: X
There are 399 X lemmas (3%), 578 X types (3%) and 2974 X tokens (2%).
Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 9 in number of tokens.
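The lemma, type and token counts above can be reproduced from the treebank's CoNLL-U file. The following is a minimal sketch, assuming the standard 10-column tab-separated CoNLL-U layout and case-sensitive types; the function name `upos_counts` and the five-row sample are hypothetical and are not taken from the treebank.

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Hypothetical rows for illustration only: (ID, FORM, LEMMA, UPOS).
rows = [
    ("1", "nure", "nure", "X"),
    ("2", "duji", "duji", "ADP"),
    ("3", "nure", "nure", "X"),
    ("4", "Om", "_", "X"),
    ("5", "m", "_", "X"),
]
SAMPLE = "\n".join(
    "\t".join([i, form, lemma, tag, "_", "_", "0", "dep", "_", "_"])
    for i, form, lemma, tag in rows
)

print(upos_counts(SAMPLE, "X"))
```

In the sample, the two forms "Om" and "m" share the placeholder lemma "_", which is how a single lemma can account for many types.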
The 10 most frequent X lemmas: _, nure, duji, boeji, ta, gu, jiboe, dukeje, piji, bororoji
The 10 most frequent X types: nure, m, duji, boeji, jiboe, ta, gu, Om, e, piji
The 10 most frequent ambiguous lemmas: _ (NOUN 5910, VERB 3398, ADV 1856, PRON 1359, ADP 1308, PROPN 1165, X 926, PUNCT 459, DET 149, INTJ 122, SCONJ 55, CCONJ 30, PART 29), duji (X 182, ADP 161), boeji (X 111, ADP 18), ta (VERB 134, PRON 104, X 94, NOUN 13, ADV 10), gu (X 83, NOUN 12, VERB 8), jiboe (X 72, PRON 12, VERB 4, SCONJ 1), dukeje (ADV 133, SCONJ 82, X 50, VERB 20, NOUN 4), piji (ADP 484, X 45, ADV 19, VERB 16, NOUN 6), bororoji (X 42, ADP 14), po (X 35, NOUN 7, PROPN 5, VERB 4, ADV 2)
The 10 most frequent ambiguous types: nure (X 470, ADV 26, NOUN 10), m (X 169, INTJ 27), duji (X 184, ADP 168), boeji (X 113, ADP 18), jiboe (X 102, ADV 15, PRON 12), ta (X 91, VERB 65, NOUN 8, ADV 3), gu (X 80, NOUN 1), Om (X 66, INTJ 4, NOUN 3), e (X 55, NOUN 11, ADV 5, INTJ 1), piji (ADP 386, X 53, ADV 12)
Morphology
The form / lemma ratio of X is 1.448622 (the average of all parts of speech is 1.360106).
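The reported ratio is consistent with dividing the number of X types by the number of X lemmas given earlier on this page (578 and 399). A small check, assuming that is how the figure is derived; the helper name `form_lemma_ratio` is hypothetical:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts reported for the X tag on this page.
ratio = form_lemma_ratio(578, 399)
print(f"{ratio:.6f}")
```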
The 1st highest number of forms (227) was observed with the lemma “_”: !’rnhm, ‘m, ,’m, …‘m’o’m, …‘rnhm, 1Pagagomodukare, Akore’m, Anastácio, Baadojeba, Bakorokuduji, Boiboe, Bokodoribaruji, Bokuojeba, Boroge, Burekiabeio, Butoregaduji, Cemanamage, Cibaio, Cláudio, Eceraeduji, Egídio, Esauji, Gaio, Garimperuji, Geralduji, Goio, Goredu, Guio, He, Imuga, Iwiemage, Kaboreuji, Kakoduwubao, Koge, Kuogorewu, Kurirewuji, Loga, Om, Otojiwuji, Pagagarewu, Tugokiarewuji, Umanarewu, Urubarewu, Urucuiao, aidu, akore’rnhm, aroe, ba, bakaruji, bakurireuji, bakuruji, bao, baruji, bataruji, bem, betuji, biadodurewu, biaganure, bibokwarewu, biegarewu, biredu, bitodurewuji, boebao, boeji, boekaguruji, boetoji, boetojiwuji, boetuguji, bogaiboe, boiaruruji, boio, bokwareuji, bokwarewu, bom, botumodedu, braedu, braeduji, bubutuji, butao, butorerewu, buturedu, cedaregoduji, cegire, cegudure’chá’roguji, cereuji, cerewu, cewadaruji, cewiemage, coreuji, duji, e, fariseuji, ga, gae’m, gao, gigudureuji, girirewu, grao, gurao, hm, hum, hum’m, imago, imarugo, imedago, io, ioga, irago, itugo, jameduji, jaruruji, jeduji, jereduduji, jetorogoduji, jetorokareuji, jetororeuji, jetotoreuji, jiboe, jio, jituji, jiwuji, jokiwuji, jokuji, jokurega, joruguji, joruji, jui, juireuji, juji, kaborewu, kakodiboe, kanao, kao, karadega, karegao, kejeboe, kejewuji, kodiboe, koriwuji, krao, kuguji, kui, kuji, kuricigorewuji, kurireuji, kurirewu, kuruji, m, maereuji, maigoduwuji, maiwuji, marenaruji, mariguduwuji, meardu, meruji, moriboe, motuji, motureuji, moturewu, nowuji, o’piga’kuruji, oecerewuji, oeceruji, oinoduji, okoge, okogerewu, okoridowuji, okwabijire, okwamagudui, okwaruji, onaregedu, onaregeduji, onawuio, otobijiboe, otoguruji, padui, pagapagareuji, pagareuji, paruji, pawadaruji, pawobeba, pegadowuji, pegareuji, peguruji, pemegaguraga, pemegareuji, pemegarewu, pemegarewuji, piji, poruji, prefeituji, pugejewuji, rairewuji, rakakareuji, rakareuji, raruji, rekodaji, rekodajiwuji, remage, remawuji, 
reorewu, reraka-rewuji, reruio, rikireuji, rikirewu, roguji, tu’m, tuku’rnhm, ucemage, uiaduji, uiadumage, um, umanamage, umuguio, uomage, ureboeba, uruguduji, uruguji, uwaborewu, uwagedu, uwaraiarewu, uwiemage, uwirerewu, veio, wao, woe, índio.
The 2nd highest number of forms (2) was observed with the lemma “du”: du, dure.
The 3rd highest number of forms (2) was observed with the lemma “ji”: ji, jire.
X occurs with 8 features: Mood (485; 16% instances), Aspect (471; 16% instances), Polarity (33; 1% instances), Number (31; 1% instances), Person (10; 0% instances), Clusivity (4; 0% instances), Nomzr (4; 0% instances), Poss (4; 0% instances)
X occurs with 10 feature-value pairs: Aspect=Prog, Clusivity=Ex, Mood=Ind, Nomzr=Rel, Number=Plur, Number=Sing, Person=1, Person=3, Polarity=Neg, Poss=Yes
X occurs with 12 feature combinations.
The most frequent feature combination is _ (2433 tokens).
Examples: m, duji, boeji, jiboe, ta, gu, Om, e, piji, dukeje
Relations
X nodes are attached to their parents using 14 different relations: dep (1994; 67% instances), root (182; 6% instances), conj (156; 5% instances), nsubj (139; 5% instances), ccomp (135; 5% instances), obl (106; 4% instances), parataxis (98; 3% instances), nmod (69; 2% instances), case (62; 2% instances), advcl (10; 0% instances), obj (9; 0% instances), mark (7; 0% instances), discourse (5; 0% instances), flat (2; 0% instances)
Parents of X nodes belong to 17 different parts of speech (counting the artificial root, which has no tag): VERB (1627; 55% instances), NOUN (540; 18% instances), X (248; 8% instances), [root] (182; 6% instances), ADV (157; 5% instances), PROPN (89; 3% instances), ADP (44; 1% instances), PRON (36; 1% instances), NUM (16; 1% instances), INTJ (14; 0% instances), CCONJ (4; 0% instances), PART (4; 0% instances), PUNCT (4; 0% instances), AUX (3; 0% instances), DET (3; 0% instances), ADJ (2; 0% instances), SCONJ (1; 0% instances)
1946 (65%) X nodes are leaves.
552 (19%) X nodes have one child.
280 (9%) X nodes have two children.
196 (7%) X nodes have three or more children.
The highest child degree of an X node is 10.
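The child-degree figures above (leaves, one child, two children, three or more) can be derived from the HEAD column of each sentence. The following is a minimal sketch, assuming each sentence is given as a list of (id, head, upos) tuples; the function name `child_degree_histogram` and the sample tree are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentence, upos):
    """Count nodes tagged `upos` by number of children (3 means 'three or more')."""
    # Number of children per node id, read off the HEAD values.
    n_children = Counter(head for _, head, _ in sentence)
    hist = Counter()
    for node_id, _, tag in sentence:
        if tag == upos:
            hist[min(n_children[node_id], 3)] += 1
    return hist


# Hypothetical five-node tree: node 2 is the root and has three children.
sentence = [
    (1, 2, "X"),
    (2, 0, "X"),
    (3, 2, "NOUN"),
    (4, 2, "PUNCT"),
    (5, 3, "X"),
]
hist = child_degree_histogram(sentence, "X")
print(dict(hist))
```

As a consistency check on the reported figures, the four buckets (1946, 552, 280 and 196) sum to 2974, the total number of X tokens.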
Children of X nodes are attached using 15 different relations: punct (511; 28% instances), nsubj (440; 24% instances), dep (276; 15% instances), advmod (162; 9% instances), case (95; 5% instances), nmod (80; 4% instances), obj (67; 4% instances), det (63; 3% instances), conj (59; 3% instances), obl (51; 3% instances), parataxis (31; 2% instances), mark (11; 1% instances), advcl (4; 0% instances), discourse (4; 0% instances), cc (3; 0% instances)
Children of X nodes belong to 15 different parts of speech: PUNCT (511; 28% instances), NOUN (332; 18% instances), X (248; 13% instances), PRON (173; 9% instances), ADV (172; 9% instances), PROPN (165; 9% instances), ADP (110; 6% instances), DET (65; 4% instances), VERB (44; 2% instances), SCONJ (14; 1% instances), CCONJ (11; 1% instances), NUM (6; 0% instances), PART (3; 0% instances), INTJ (2; 0% instances), AUX (1; 0% instances)