This page pertains to UD version 2.

Treebank Statistics: UD_Xibe-XDT: POS Tags: NUM

There are 98 NUM lemmas (4%), 100 NUM types (3%) and 489 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 9 in number of tokens.
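
The three counts above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the toy sample are hypothetical, and it assumes that a "type" is a distinct, case-sensitive word form.

```python
# Hypothetical toy CoNLL-U fragment; it is NOT taken from UD_Xibe-XDT.
SAMPLE = (
    "1\tOne\tone\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tbook\tbook\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tone\tone\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tday\tday\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "3\t2019\t2019\tNUM\t_\tNumType=Card\t2\tnmod\t_\t_\n"
)


def pos_counts(conllu: str, upos: str) -> dict:
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu.splitlines():
        # Skip sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return {"lemmas": len(lemmas), "types": len(types), "tokens": tokens}


print(pos_counts(SAMPLE, "NUM"))  # {'lemmas': 2, 'types': 3, 'tokens': 3}
```

In the toy sample, "One" and "one" count as two types but share one lemma, which is why the type count can exceed the lemma count.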

The 10 most frequent NUM lemmas: ᡝᠮᡠ, ᡪᡠᠸᠠᠨ, ᡪᡠᠸᡝ, ᡨᡠᠮᡝᠨ, 2019, ᡞᠯᠠᠨ, 2018, ᠮᡞᡢᡤᠠᠨ, ᡞᠯᠠᠴᡞ, ᡝᠮᡣᡝᠨ

The 10 most frequent NUM types: ᡝᠮᡠ, ᡪᡠᠸᠠᠨ, ᡪᡠᠸᡝ, ᡨᡠᠮᡝᠨ, 2019, ᡞᠯᠠᠨ, 2018, ᠮᡞᡢᡤᠠᠨ, ᡞᠯᠠᠴᡞ, ᡝᠮᡣᡝᠨ

The 10 most frequent ambiguous lemmas: ᡝᠮᡠ (NUM 91, ADJ 1, ADV 1), ᡪᡠᠸᡝ (NUM 29, ADJ 1), ᡨᡠᠮᡝᠨ (NUM 16, ADV 2), ᡞᠯᠠᠴᡞ (NUM 11, ADJ 2), ᡠᠶᡠᠴᡞ (NUM 7, ADJ 2), ᡠᡩᡠ (NUM 6, SCONJ 5, ADV 3, DET 3, NOUN 1), ᡠᡪᡠ (NOUN 14, NUM 6), ᠨᠠᡩᠠᠨ (NUM 5, NOUN 1), ᡪᠠᡞ (CCONJ 94, ADV 4, NUM 4, ADP 1), ᠨᠠᡩᠠᠴᡞ (NUM 3, ADJ 1, PROPN 1)

The 10 most frequent ambiguous types: ᡝᠮᡠ (NUM 91, ADJ 1, ADV 1), ᡪᡠᠸᡝ (NUM 29, ADJ 1), ᡨᡠᠮᡝᠨ (NUM 16, ADV 2), ᡞᠯᠠᠴᡞ (NUM 11, ADJ 2), ᡠᡪᡠᡞ (NUM 9, NOUN 3), ᡠᠶᡠᠴᡞ (NUM 7, ADJ 2), ᡠᡩᡠ (NUM 6, SCONJ 5, ADV 3, DET 3, NOUN 1), ᠨᠠᡩᠠᠨ (NUM 5, NOUN 1), ᠨᠠᡩᠠᠴᡞ (NUM 3, ADJ 1, PROPN 1), ᡪᠠᡞ (CCONJ 94, ADV 4, NUM 3, ADP 1)

Morphology

The form / lemma ratio of NUM is 1.020408 (the average of all parts of speech is 1.310593).
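
The reported ratio is consistent with dividing the 100 NUM types by the 98 NUM lemmas given at the top of this page. The check below is only a sketch of that arithmetic; whether the page generator computes the ratio in exactly this way is an assumption.

```python
# Counts copied from the summary line of this page.
num_types = 100   # distinct NUM word forms
num_lemmas = 98   # distinct NUM lemmas

# Assumed definition: forms per lemma = types / lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.020408
```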

The 1st highest number of forms (2) was observed with the lemma “ᡠᡪᡠ”: ᡠᡪᡠ, ᡠᡪᡠᡞ.

The 2nd highest number of forms (2) was observed with the lemma “ᡪᠠᡞ”: ᡪᠠᡞ, ᡪᠠᡞᡩᡝ.

The 3rd highest number of forms (2) was observed with the lemma “ᡪᡠᠸᠠᠨ”: ᡪᡠᠸᠠ, ᡪᡠᠸᠠᠨ.

NUM occurs with 3 features: NumType (435; 89% instances), Case (7; 1% instances), Typo (1; 0% instances)

NUM occurs with 8 feature-value pairs: Case=Dat, Case=Gen, NumType=Card, NumType=Frac, NumType=Mult, NumType=Ord, NumType=Sets, Typo=Yes

NUM occurs with 10 feature combinations. The most frequent feature combination is NumType=Card (375 tokens). Examples: ᡝᠮᡠ, ᡪᡠᠸᠠᠨ, ᡪᡠᠸᡝ, ᡨᡠᠮᡝᠨ, 2019, ᡞᠯᠠᠨ, 2018, 5, ᠮᡞᡢᡤᠠᠨ, 1

Relations

NUM nodes are attached to their parents using 15 different relations: nummod (339; 69% instances), flat (52; 11% instances), obj (27; 6% instances), compound (19; 4% instances), obl (18; 4% instances), obl:tmod (11; 2% instances), conj (6; 1% instances), nmod (6; 1% instances), nsubj (3; 1% instances), amod (2; 0% instances), appos (2; 0% instances), case (1; 0% instances), flat:name (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)
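
As a consistency check, the relation counts listed above sum to the 489 NUM tokens, and the percentages follow from rounding each share to the nearest whole number. The sketch below uses only the counts from this page; the variable names are illustrative.

```python
# Incoming-relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 339, "flat": 52, "obj": 27, "compound": 19, "obl": 18,
    "obl:tmod": 11, "conj": 6, "nmod": 6, "nsubj": 3, "amod": 2,
    "appos": 2, "case": 1, "flat:name": 1, "parataxis": 1, "xcomp": 1,
}

total = sum(relation_counts.values())

# Share of each relation, rounded to a whole percent.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                                              # 489
print(shares["nummod"], shares["flat"], shares["nsubj"])  # 69 11 1
```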

Parents of NUM nodes belong to 10 different parts of speech: NOUN (332; 68% instances), NUM (75; 15% instances), VERB (67; 14% instances), ADJ (5; 1% instances), X (3; 1% instances), PRON (2; 0% instances), PROPN (2; 0% instances), ADV (1; 0% instances), AUX (1; 0% instances), SYM (1; 0% instances)

312 (64%) NUM nodes are leaves.

135 (28%) NUM nodes have one child.

23 (5%) NUM nodes have two children.

19 (4%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
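
The leaf and child-degree figures above amount to a histogram of how many dependents each NUM node has. A minimal sketch of that computation follows, assuming CoNLL-U input; the function name `child_degree_histogram` and the toy sentence are hypothetical and not drawn from UD_Xibe-XDT.

```python
from collections import Counter

# Hypothetical toy sentence in CoNLL-U format (not from the treebank).
SAMPLE = (
    "1\tabout\tabout\tADP\t_\t_\t2\tcase\t_\t_\n"
    "2\tthree\tthree\tNUM\t_\t_\t3\tnummod\t_\t_\n"
    "3\tbooks\tbook\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "4\ttwo\ttwo\tNUM\t_\t_\t3\tconj\t_\t_\n"
)


def child_degree_histogram(conllu: str, upos: str) -> Counter:
    """Map number of children -> number of nodes with the given UPOS tag."""
    hist = Counter()
    # Sentences are separated by blank lines.
    for sentence in conllu.strip().split("\n\n"):
        rows = []
        for line in sentence.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            if not cols[0].isdigit():  # skip ranges and empty nodes
                continue
            rows.append((cols[0], cols[3], cols[6]))  # ID, UPOS, HEAD
        # Count how many nodes point to each head within this sentence.
        children = Counter(head for _, _, head in rows)
        for node_id, tag, _ in rows:
            if tag == upos:
                hist[children[node_id]] += 1  # 0 for leaves
    return hist


hist = child_degree_histogram(SAMPLE, "NUM")
print(hist[0], hist[1])  # 1 1
```

In the toy sentence, "three" has one child ("about") and "two" is a leaf. On this page, the four buckets (312 + 135 + 23 + 19) add up to the 489 NUM tokens.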

Children of NUM nodes are attached using 20 different relations: case (59; 24% instances), flat (50; 20% instances), clf (40; 16% instances), punct (24; 10% instances), nmod (16; 6% instances), conj (11; 4% instances), compound (10; 4% instances), nummod (10; 4% instances), advmod (8; 3% instances), amod (6; 2% instances), nsubj (4; 2% instances), det (3; 1% instances), appos (2; 1% instances), mark:adv (2; 1% instances), advcl (1; 0% instances), cc (1; 0% instances), fixed (1; 0% instances), flat:name (1; 0% instances), obl (1; 0% instances), obl:tmod (1; 0% instances)

Children of NUM nodes belong to 11 different parts of speech: NUM (75; 30% instances), NOUN (69; 27% instances), ADP (60; 24% instances), PUNCT (24; 10% instances), ADJ (8; 3% instances), ADV (7; 3% instances), DET (3; 1% instances), PART (2; 1% instances), CCONJ (1; 0% instances), SYM (1; 0% instances), X (1; 0% instances)