
This page pertains to UD version 2.

Treebank Statistics: UD_Buryat-BDT: POS Tags: NUM

There are 82 NUM lemmas (3%), 114 NUM types (3%) and 223 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 8 in number of tokens.
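The three counts above differ only in what is being counted: distinct lemmas, distinct word forms (types), or running tokens. The following is a minimal sketch of that distinction, not the script that produced this page. The function name `pos_counts` and the six-row sample are hypothetical; the sample reuses forms listed on this page but is a toy, not treebank data, and it assumes types are compared case-sensitively.

```python
def pos_counts(rows, upos):
    """Count distinct lemmas, distinct forms (types) and tokens for one tag.

    rows: iterable of (form, lemma, upos) tuples, one per token.
    """
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in rows:
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: forms are compared case-sensitively
            tokens += 1
    return len(lemmas), len(types), tokens


# Toy sample (hypothetical annotation, not taken from the treebank files).
rows = [
    ("хоёр", "хоёр", "NUM"),
    ("хоер", "хоёр", "NUM"),
    ("нэгэ", "нэгэ", "NUM"),
    ("нэгые", "нэгэ", "NUM"),
    ("хоёр", "хоёр", "NUM"),
    ("хоёр", "хоёр", "CCONJ"),
]

print(pos_counts(rows, "NUM"))  # (2, 4, 5)
```

In the toy sample the same form хоёр is counted once as a type but twice as a token, and the CCONJ reading is ignored entirely.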

The 10 most frequent NUM lemmas: нэгэ, хоёр, гурбан, зуун, мянган, арбан, нэгэн, табан, юһэн, 1

The 10 most frequent NUM types: нэгэ, зуун, хоёр, арбан, гурбан, юһэн, долоон, мянга, хоер, 1-дэхи

The most frequent ambiguous lemmas (up to 10 are listed; 4 were found): нэгэ (NUM 20, NOUN 1), хоёр (NUM 16, CCONJ 1), нэгэн (NUM 8, PRON 4), гурба (ADJ 4, NUM 1)

The most frequent ambiguous types (up to 10 are listed; 2 were found): хоёр (NUM 7, CCONJ 1), гурбадахи (ADJ 2, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.390244 (the average of all parts of speech is 1.635385).
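The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page (114 and 82). The check below is a sketch under that assumption; the variable names are illustrative only.

```python
# Counts reported on this page for NUM.
num_types = 114   # distinct word forms
num_lemmas = 82   # distinct lemmas

# Assumption: "form / lemma ratio" means types divided by lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.390244
```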

The 1st highest number of forms (4) was observed with the lemma “гурбан”: гурбадахи, гурбан, гурбанһаа, гурбахан.

The 2nd highest number of forms (4) was observed with the lemma “нэгэ”: нэгые, нэгэ, нэгэдэхи, нэгэмнай.

The 3rd highest number of forms (4) was observed with the lemma “хоёр”: Хоёрдохи, хоер, хоердохи, хоёр.

NUM occurs with 2 features: NumType (221; 99% instances), Case (9; 4% instances)

NUM occurs with 3 feature-value pairs: Case=Acc, Case=Nom, NumType=Card

NUM occurs with 4 feature combinations. The most frequent feature combination is NumType=Card (212 tokens). Examples: нэгэ, зуун, хоёр, арбан, гурбан, юһэн, долоон, мянга, хоер, 1-дэхи

Relations

NUM nodes are attached to their parents using 10 different relations: nummod (179; 80% instances), compound (31; 14% instances), nmod (4; 2% instances), flat (2; 1% instances), nsubj (2; 1% instances), appos (1; 0% instances), case (1; 0% instances), conj (1; 0% instances), det (1; 0% instances), root (1; 0% instances)
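The percentages in the list above can be reproduced from the raw counts, which also confirms that the ten relations add up to the 223 NUM tokens. This is a small sketch using only the numbers printed above; the helper name `shares` is hypothetical, and rounding to the nearest integer is an assumption about how the page was generated.

```python
# Counts per relation, copied from the list above.
reported = {
    "nummod": 179, "compound": 31, "nmod": 4, "flat": 2, "nsubj": 2,
    "appos": 1, "case": 1, "conj": 1, "det": 1, "root": 1,
}


def shares(counts):
    """Return {relation: percentage rounded to the nearest integer}."""
    total = sum(counts.values())
    return {rel: round(100 * n / total) for rel, n in counts.items()}


total_num_nodes = sum(reported.values())
percentages = shares(reported)

print(total_num_nodes)  # 223
print(percentages["nummod"], percentages["compound"], percentages["appos"])  # 80 14 0
```

A relation that occurs once is shown as 0% because 1 out of 223 is below half a percent.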

Parents of NUM nodes belong to 10 different parts of speech: NOUN (167; 75% instances), NUM (39; 17% instances), SYM (5; 2% instances), ADJ (3; 1% instances), PRON (2; 1% instances), PUNCT (2; 1% instances), VERB (2; 1% instances), ADV (1; 0% instances), PROPN (1; 0% instances), no tag (1; 0% instances; presumably the artificial root, since one NUM node is attached with the root relation)

169 (76%) NUM nodes are leaves.

47 (21%) NUM nodes have one child.

5 (2%) NUM nodes have two children.

2 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
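The child degree of a node is the number of tokens whose head points to it; a leaf has degree 0. The sketch below shows one way to compute it for NUM nodes from per-token head indices, as found in the ID and HEAD columns of a CoNLL-U file. The function name `child_degrees` and the three-token sentence are hypothetical and serve only as an illustration.

```python
from collections import Counter


def child_degrees(tokens, upos="NUM"):
    """Map the id of each node tagged `upos` to its number of children.

    tokens: list of (id, head, upos) tuples for one sentence; head 0 is the root.
    """
    # Every token contributes one child to its head.
    n_children = Counter(head for _, head, _ in tokens)
    return {tid: n_children[tid] for tid, _, tag in tokens if tag == upos}


# Toy sentence (hypothetical): NUM 1 depends on NUM 2, which depends on NOUN 3.
sentence = [(1, 2, "NUM"), (2, 3, "NUM"), (3, 0, "NOUN")]

degrees = child_degrees(sentence)
print(degrees)  # {1: 0, 2: 1}
```

Here token 1 is a leaf and token 2 has one child, which corresponds to the first two categories in the breakdown above.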

Children of NUM nodes are attached using 12 different relations: compound (27; 40% instances), nummod (13; 19% instances), punct (7; 10% instances), conj (5; 7% instances), case (3; 4% instances), det (3; 4% instances), advmod (2; 3% instances), cc (2; 3% instances), goeswith (2; 3% instances), nmod (2; 3% instances), amod (1; 1% instances), nsubj (1; 1% instances)

Children of NUM nodes belong to 10 different parts of speech: NUM (39; 57% instances), PUNCT (10; 15% instances), NOUN (6; 9% instances), ADP (3; 4% instances), ADJ (2; 3% instances), ADV (2; 3% instances), CCONJ (2; 3% instances), DET (2; 3% instances), INTJ (1; 1% instances), PRON (1; 1% instances)