
This page pertains to UD version 2.

Treebank Statistics: UD_Xavante-XDT: POS Tags: NOUN

There are 130 NOUN lemmas (38%), 173 NOUN types (39%) and 369 NOUN tokens (21%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 2 in number of tokens.
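As an illustration of how the three counts above relate, here is a minimal sketch that tallies distinct lemmas, distinct word forms (types) and tokens for one tag in CoNLL-U text. The function name `pos_counts` and the toy sentences are hypothetical; the annotation in `SAMPLE` is invented for the example and is not taken from the treebank. Treating types as case-sensitive word forms is an assumption.

```python
def pos_counts(conllu_text, upos="NOUN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column, kept case-sensitive (assumption)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Toy data with invented annotation, for illustration only.
SAMPLE = "\n".join([
    "# toy sentence 1",
    "1\ttimama\tmama\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tromhuri\tromhuri\tVERB\t_\t_\t0\troot\t_\t_",
    "",
    "# toy sentence 2",
    "1\tĩmama\tmama\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tromhuri\tromhuri\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tmarĩ\tmarĩ\tNOUN\t_\t_\t2\tobj\t_\t_",
])

print(pos_counts(SAMPLE))
```

In the toy data, two forms share the lemma mama, so the lemma count is lower than the type count, just as 130 lemmas correspond to 173 types in the treebank.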

The 10 most frequent NOUN lemmas: marĩ, pi’õ, aibö, ‘watébrémi, a’uwẽ, wapté, buru, ba’õtõre, mama, höimanadzé

The 10 most frequent NOUN types: marĩ, aibö, ‘watébrémi, pi’õ, a’uwẽ, buru, wapté, ba’õtõ, bötö, Mare

The 10 most frequent ambiguous lemmas: na (ADP 32, NOUN 8, X 1), mare (NOUN 6, ADV 2), mreme (NOUN 4, VERB 1), uptabi (NOUN 4, ADV 1), romhuri (VERB 30, NOUN 3), wẽ (VERB 3, NOUN 2, ADV 1), _ (NOUN 1, VERB 1), (NOUN 1, X 1), höiwahö (ADV 2, NOUN 1), mro (NOUN 1, VERB 1)

The 10 most frequent ambiguous types: Mare (NOUN 6, ADV 1), uptabi (NOUN 4, ADV 1), romhuri (VERB 16, NOUN 2), tete (NOUN 2, AUX 1), Höiwahö (ADV 2, NOUN 1), (NOUN 1, X 1), wẽ (VERB 2, ADV 1, NOUN 1)

Morphology

The form / lemma ratio of NOUN is 1.330769 (the average of all parts of speech is 1.294461).
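The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. A minimal sketch, assuming that definition (the helper name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts reported above: 173 NOUN types and 130 NOUN lemmas.
print(f"{form_lemma_ratio(173, 130):.6f}")  # prints 1.330769
```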

The highest number of forms (7) was observed with the lemma “’ra”: ‘ra, Ti’ra, ti’ranorĩ, ti’ranorĩhã, wa’ranorĩhã, Ĩ’ranorĩ, ĩ’ra.

The 2nd highest number of forms (6) was observed with the lemma “mama”: Aimama, Wamama, timama, wamamanorĩhã, ĩmama, ĩĩmama.

The 3rd highest number of forms (5) was observed with the lemma “tsa”: datsa, tsa, watsa, watsai, ĩtsa.

NOUN occurs with 11 features: Person (74; 20% instances), Number (43; 12% instances), PronType (9; 2% instances), Emph (5; 1% instances), Reflex (5; 1% instances), Degree (3; 1% instances), Poss (3; 1% instances), Number[psor] (2; 1% instances), Case (1; 0% instances), Nomzr (1; 0% instances), Polarity (1; 0% instances)

NOUN occurs with 15 feature-value pairs: Case=Ins, Degree=Dim, Emph=Yes, Nomzr=Ag, Number=Coll, Number=Plur, Number=Sing, Number[psor]=Sing, Person=1, Person=2, Person=3, Polarity=Neg, Poss=Yes, PronType=Gnc, Reflex=Yes

NOUN occurs with 24 feature combinations. The most frequent feature combination is _ (270 tokens). Examples: marĩ, aibö, ‘watébrémi, pi’õ, a’uwẽ, buru, wapté, ba’õtõ, bötö, Mare

Relations

NOUN nodes are attached to their parents using 14 different relations: nsubj (106; 29% instances), obj (68; 18% instances), obl (61; 17% instances), nmod (42; 11% instances), root (31; 8% instances), dislocated (19; 5% instances), parataxis (13; 4% instances), vocative (9; 2% instances), conj (8; 2% instances), advcl (7; 2% instances), iobj (2; 1% instances), acl (1; 0% instances), case (1; 0% instances), ccomp (1; 0% instances)
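The percentages in the list above appear to be each relation's share of the 369 NOUN tokens, rounded to the nearest whole percent. A small sketch under that assumption, using the first three reported counts (the helper name `share` is hypothetical):

```python
def share(count, total):
    """Percentage of `count` in `total`, rounded to a whole percent."""
    return round(100 * count / total)


TOTAL_NOUN_TOKENS = 369  # reported at the top of this page
reported = {"nsubj": 106, "obj": 68, "obl": 61}

percentages = {rel: share(n, TOTAL_NOUN_TOKENS) for rel, n in reported.items()}
print(percentages)
```

Because of this rounding, a relation seen once (such as acl) shows as 0%, and the listed percentages need not sum to exactly 100.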

Parents of NOUN nodes belong to 5 different parts of speech, counting the untagged artificial root as one: VERB (264; 72% instances), NOUN (71; 19% instances), [root] (31; 8% instances), ADV (2; 1% instances), ADP (1; 0% instances)

153 (41%) NOUN nodes are leaves.

118 (32%) NOUN nodes have one child.

58 (16%) NOUN nodes have two children.

40 (11%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 7.
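The child-degree figures above can be reproduced by counting, for every NOUN node, how many tokens name it as their head. The following is a minimal sketch, assuming each sentence is given as a list of `(id, upos, head)` tuples; the function name `child_degree_buckets` and the toy sentence are hypothetical and not taken from the treebank.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="NOUN"):
    """Bucket nodes with the given tag by number of children: 0, 1, 2, or 3 (= three or more)."""
    buckets = Counter()
    for sent in sentences:
        # Number of children per node id = how often that id occurs as a head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                buckets[min(n_children[tok_id], 3)] += 1
    return buckets


# Toy sentence with invented structure: (id, upos, head); head 0 is the root.
toy = [[(1, "DET", 2), (2, "NOUN", 3), (3, "VERB", 0), (4, "NOUN", 3)]]
buckets = child_degree_buckets(toy)
print(dict(buckets))
```

In the treebank, the four reported buckets (153, 118, 58 and 40) add up to the 369 NOUN tokens.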

Children of NOUN nodes are attached using 19 different relations: case (83; 22% instances), det (61; 16% instances), dep (59; 15% instances), nmod (51; 13% instances), punct (40; 10% instances), advmod (17; 4% instances), nsubj (15; 4% instances), discourse (13; 3% instances), conj (8; 2% instances), mark (8; 2% instances), parataxis (8; 2% instances), obl (6; 2% instances), advcl (5; 1% instances), cop (3; 1% instances), dislocated (3; 1% instances), nummod (3; 1% instances), acl (1; 0% instances), aux (1; 0% instances), obj (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: ADP (90; 23% instances), NOUN (71; 18% instances), DET (61; 16% instances), PART (56; 15% instances), PUNCT (40; 10% instances), ADV (19; 5% instances), X (16; 4% instances), PRON (10; 3% instances), VERB (9; 2% instances), SCONJ (5; 1% instances), AUX (4; 1% instances), NUM (3; 1% instances), ADJ (1; 0% instances), INTJ (1; 0% instances)