
This page pertains to UD version 2.

Treebank Statistics: UD_Xibe-XDT: POS Tags: NOUN

There are 795 NOUN lemmas (34%), 831 NOUN types (27%) and 4606 NOUN tokens (30%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 2 in number of types and 1 in number of tokens.
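As a rough illustration of how counts like these can be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag from CoNLL-U text. The function names and the toy English sample are hypothetical. Whether the official statistics normalise case before counting types is not stated on this page, so forms are compared as-is.

```python
def read_tokens(conllu_text):
    """Yield (form, lemma, upos) for each regular word line of CoNLL-U text."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3]


def pos_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for upos."""
    lemmas, forms, n_tokens = set(), set(), 0
    for form, lemma, tag in read_tokens(conllu_text):
        if tag == upos:
            lemmas.add(lemma)
            forms.add(form)
            n_tokens += 1
    return len(lemmas), len(forms), n_tokens


# Hypothetical toy sample, not taken from UD_Xibe-XDT.
SAMPLE = "\n".join([
    "# text = dogs chase dog",
    "1\tdogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tchase\tchase\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tdog\tdog\tNOUN\t_\t_\t2\tobj\t_\t_",
    "",
])

print(pos_counts(SAMPLE, "NOUN"))
```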

The 10 most frequent NOUN lemmas: ᠠᠨᡞᠶᠠ, ᠨᡞᠶᠠᠯᠮᠠ, ᡞᠷᡤᡝᠨ, ᡥᡝᡨᡥᡝ, ᡞᠰᠠᠨ, ᡤᡠᠷᡠᠨ, ᠪᠠᡞᡨᠠ, ᠰᡞᠶᠠᠨ, ᡠᠰᡞᠨ, ᠸᡝᡞᠯᡝᠷᠠᠨ

The 10 most frequent NOUN types: ᠠᠨᡞᠶᠠ, ᠨᡞᠶᠠᠯᠮᠠ, ᡞᠷᡤᡝᠨ, ᡞᠰᠠᠨ, ᡤᡠᠷᡠᠨ, ᠪᠠᡞᡨᠠ, ᡥᡝᡨᡥᡝ, ᠰᡞᠶᠠᠨ, ᡠᠰᡞᠨ, ᠸᡝᡞᠯᡝᠷᠠᠨ

The 10 most frequent ambiguous lemmas: ᠰᡞᠶᠠᠨ (NOUN 51, X 6, PROPN 2), ᡩᠠᡢ (NOUN 44, X 1), ᠪᠣᠣ (NOUN 37, X 1), ᡩᠠ (NOUN 30, PROPN 1), ᠠᠷᠪᡠᠨ (NOUN 29, VERB 1), ᡧᡠ (NOUN 28, PROPN 3, X 1), ᠪᠠ (NOUN 27, X 3, PART 2), ᠸᡝᠨ (NOUN 21, PROPN 1), ᡪᡝᠨ (NOUN 19, PROPN 5, X 1), ᡪᡠᡢᡤᡠᡢ (NOUN 19, PROPN 1)

The 10 most frequent ambiguous types: ᠰᡞᠶᠠᠨ (NOUN 51, X 6, PROPN 2), ᡩᠠᡢ (NOUN 44, X 1), ᡩᠠ (NOUN 30, PROPN 1), ᠠᠷᠪᡠᠨ (NOUN 29, VERB 1), ᠪᠣᠣ (NOUN 29, X 1), ᡧᡠ (NOUN 28, PROPN 3, X 1), ᠪᠠ (NOUN 22, X 3, PART 2), ᠸᡝᠨ (NOUN 21, PROPN 1), ᡪᡝᠨ (NOUN 19, PROPN 5, X 1), ᡪᡠᡢᡤᡠᡢ (NOUN 19, PROPN 1)
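The two listings above can be approximated with a small tally, reading "ambiguous" as "observed with more than one UPOS tag", which is an interpretation inferred from the format of the listings. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos):
    """pairs: iterable of (item, upos_tag), where item is a lemma or a form.

    Returns (item, Counter of tags) for items seen with `upos` and at least
    one other tag, sorted by descending frequency under `upos`.
    """
    tags_by_item = defaultdict(Counter)
    for item, tag in pairs:
        tags_by_item[item][tag] += 1
    ambiguous = [
        (item, tags)
        for item, tags in tags_by_item.items()
        if upos in tags and len(tags) > 1
    ]
    return sorted(ambiguous, key=lambda entry: -entry[1][upos])


# Hypothetical toy data, not taken from UD_Xibe-XDT.
toy_pairs = [
    ("bank", "NOUN"), ("bank", "NOUN"), ("bank", "VERB"),
    ("river", "NOUN"),
    ("run", "VERB"), ("run", "NOUN"),
]
for item, tags in ambiguous_items(toy_pairs, "NOUN"):
    print(item, dict(tags))
```

Passing (lemma, tag) pairs gives the ambiguous lemmas; passing (form, tag) pairs gives the ambiguous types.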

Morphology

The form / lemma ratio of NOUN is 1.045283 (the average of all parts of speech is 1.310593).
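The reported ratio is consistent with dividing the number of NOUN types (831) by the number of NOUN lemmas (795) given above. The short check below reproduces it. The helper name is hypothetical, and the exact counting procedure behind the published figure is an assumption.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


noun_ratio = form_lemma_ratio(831, 795)
print(f"{noun_ratio:.6f}")  # prints 1.045283
```

A value this close to 1 means that most NOUN lemmas appear in a single form in this treebank.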

The 1st highest number of forms (4) was observed with the lemma “ᠨᡞᠶᠠᠯᠮᠠ”: ᠨᡞᠶᠠᠯᠮᠠ, ᠨᡞᠶᠠᠯᠮᠠᠰᠠ, ᠨᡞᠶᠠᠯᠮᠠᡞ, ᠨᡞᠶᠠᠯᠮᠠᡠ.

The 2nd highest number of forms (4) was observed with the lemma “ᠪᠠ”: ᠪᠠ, ᠪᠠᠪᡝ, ᠪᠠᡞ, ᠪᠠᡩᡝ.

The 3rd highest number of forms (3) was observed with the lemma “ᠠᠨᡞᠶᠠ”: ᠠᠨᡞᠶᠠ, ᠠᠨᡞᠶᠠᠴᡞ, ᠠᠨᡞᠶᠠᡞ.

NOUN occurs with 5 features: Case (227; 5% instances), Number (34; 1% instances), Abbr (14; 0% instances), Typo (2; 0% instances), Foreign (1; 0% instances)

NOUN occurs with 10 feature-value pairs: Abbr=Yes, Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Lat, Case=Loc, Foreign=Yes, Number=Plur, Typo=Yes

NOUN occurs with 11 feature combinations. The most frequent feature combination is _ (4328 tokens). Examples: ᠠᠨᡞᠶᠠ, ᠨᡞᠶᠠᠯᠮᠠ, ᡞᠷᡤᡝᠨ, ᡞᠰᠠᠨ, ᡤᡠᠷᡠᠨ, ᠪᠠᡞᡨᠠ, ᡥᡝᡨᡥᡝ, ᠰᡞᠶᠠᠨ, ᡠᠰᡞᠨ, ᠸᡝᡞᠯᡝᠷᠠᠨ
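The three feature summaries above (features, feature-value pairs, feature combinations) can be sketched as three tallies over the FEATS column of CoNLL-U, where pairs are separated by `|` and `_` marks an empty feature set. The function name and the toy input are hypothetical.

```python
from collections import Counter


def feature_stats(feats_values):
    """feats_values: iterable of FEATS strings, e.g. "Case=Gen|Number=Plur".

    Returns three Counters: per feature name, per feature=value pair,
    and per full combination (the raw FEATS string, including "_").
    """
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1
        if feats == "_":
            continue  # no features on this token
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Hypothetical toy input, not taken from UD_Xibe-XDT.
toy_feats = ["_", "Case=Gen", "Case=Gen|Number=Plur", "_"]
features, pairs, combos = feature_stats(toy_feats)
print(len(features), len(pairs), len(combos))
```

On this page the per-feature counts (227 + 34 + 14 + 2 + 1 = 278) and the 4328 tokens without features add up to the 4606 NOUN tokens.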

Relations

NOUN nodes are attached to their parents using 25 different relations: nmod (1385; 30% instances), obj (886; 19% instances), nsubj (573; 12% instances), obl (491; 11% instances), compound (449; 10% instances), conj (289; 6% instances), obl:tmod (112; 2% instances), obl:lmod (92; 2% instances), clf (66; 1% instances), root (63; 1% instances), xcomp (40; 1% instances), flat (39; 1% instances), appos (34; 1% instances), parataxis (21; 0% instances), flat:name (12; 0% instances), advcl (11; 0% instances), acl (10; 0% instances), nmod:poss (7; 0% instances), amod (6; 0% instances), iobj (5; 0% instances), nsubj:pass (5; 0% instances), vocative (5; 0% instances), ccomp (2; 0% instances), fixed (2; 0% instances), acl:relcl (1; 0% instances)

Parents of NOUN nodes belong to 13 different parts of speech: VERB (2150; 47% instances), NOUN (2068; 45% instances), ADJ (127; 3% instances), X (70; 2% instances), NUM (69; 1% instances), [no parent: root node] (63; 1% instances), PROPN (20; 0% instances), PRON (14; 0% instances), DET (12; 0% instances), AUX (7; 0% instances), ADP (3; 0% instances), ADV (2; 0% instances), PART (1; 0% instances)
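A minimal sketch of how the relation and parent statistics could be tallied, assuming each sentence is given as a list of (id, upos, head, deprel) tuples. The function name, the tuple layout and the `ROOT` placeholder are hypothetical. A node whose head is 0 has no parent token, so it has no parent part of speech. This accounts for the entry with 63 instances in the list above, which equals the number of NOUN nodes attached as root.

```python
from collections import Counter


def parent_stats(sentences, upos):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    for sent in sentences:
        tag_of = {tok_id: tag for tok_id, tag, _head, _rel in sent}
        for _tok_id, tag, head, deprel in sent:
            if tag != upos:
                continue
            relations[deprel] += 1
            # Head 0 is the artificial root, which has no UPOS tag.
            parent_tags[tag_of.get(head, "ROOT")] += 1
    return relations, parent_tags


# Hypothetical toy sentence, not taken from UD_Xibe-XDT.
toy_sentences = [[
    (1, "NOUN", 2, "nmod"),
    (2, "NOUN", 3, "nsubj"),
    (3, "VERB", 0, "root"),
], [
    (1, "NOUN", 0, "root"),
]]
relations, parent_tags = parent_stats(toy_sentences, "NOUN")
print(dict(relations), dict(parent_tags))
```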

1508 (33%) NOUN nodes are leaves.

1400 (30%) NOUN nodes have one child.

979 (21%) NOUN nodes have two children.

719 (16%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 11.
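The child-degree figures above can be sketched as a histogram over the number of dependents per node. The function name and the (id, upos, head) tuple layout are hypothetical. The four reported buckets (1508 + 1400 + 979 + 719) sum to the 4606 NOUN tokens.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map number of children -> number of nodes tagged `upos` with that many."""
    histogram = Counter()
    for sent in sentences:
        # How many tokens name each id as their head.
        n_children = Counter(head for _tok_id, _tag, head in sent)
        for tok_id, tag, _head in sent:
            if tag == upos:
                histogram[n_children[tok_id]] += 1
    return histogram


# Hypothetical toy sentence, not taken from UD_Xibe-XDT.
toy_sentences = [[
    (1, "NOUN", 2, ),
    (2, "VERB", 0, ),
    (3, "ADJ", 4, ),
    (4, "NOUN", 2, ),
]]
histogram = child_degree_histogram(toy_sentences, "NOUN")
leaves = histogram[0]
highest_degree = max(histogram)
print(leaves, highest_degree)
```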

Children of NOUN nodes are attached using 33 different relations: nmod (1443; 24% instances), case (1403; 23% instances), amod (524; 9% instances), compound (500; 8% instances), punct (347; 6% instances), nummod (314; 5% instances), conj (295; 5% instances), acl (276; 5% instances), acl:relcl (246; 4% instances), det (182; 3% instances), nmod:poss (101; 2% instances), cc (72; 1% instances), appos (71; 1% instances), nsubj (59; 1% instances), advmod (40; 1% instances), aux (29; 0% instances), obl (26; 0% instances), cop (24; 0% instances), advcl (22; 0% instances), flat (22; 0% instances), mark:adv (20; 0% instances), discourse (18; 0% instances), mark (17; 0% instances), parataxis (12; 0% instances), flat:name (6; 0% instances), mark:plur (5; 0% instances), obl:tmod (4; 0% instances), clf (3; 0% instances), obj (3; 0% instances), csubj (2; 0% instances), mark:rel (2; 0% instances), fixed (1; 0% instances), obl:lmod (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: NOUN (2068; 34% instances), ADP (1406; 23% instances), VERB (578; 9% instances), ADJ (506; 8% instances), PUNCT (347; 6% instances), NUM (332; 5% instances), PRON (255; 4% instances), PROPN (222; 4% instances), DET (89; 1% instances), CCONJ (71; 1% instances), AUX (54; 1% instances), X (52; 1% instances), PART (50; 1% instances), ADV (47; 1% instances), SCONJ (12; 0% instances), INTJ (1; 0% instances)