
This page pertains to UD version 2.

Treebank Statistics: UD_Buryat-BDT: POS Tags: NUM

There are 82 NUM lemmas (3%), 114 NUM types (3%) and 223 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 9 in number of tokens.
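As a rough illustration of how the three counts above differ, the sketch below counts distinct lemmas, distinct word forms (types) and occurrences (tokens) for one tag. The `Token` tuple, the `count_for_tag` helper and the five-token sample are hypothetical and are not taken from the treebank or from the script that generated this page. Treating types as case-sensitive forms is also an assumption.

```python
from collections import namedtuple

# Hypothetical minimal token record: surface form, lemma, universal POS tag.
Token = namedtuple("Token", ["form", "lemma", "upos"])


def count_for_tag(tokens, tag):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    selected = [t for t in tokens if t.upos == tag]
    lemmas = {t.lemma for t in selected}   # distinct lemmas
    types = {t.form for t in selected}     # distinct forms (case-sensitive, assumed)
    return len(lemmas), len(types), len(selected)


# Toy sample for illustration only, not real treebank data.
sample = [
    Token("нэгэ", "нэгэ", "NUM"),
    Token("нэгэдэхи", "нэгэ", "NUM"),
    Token("хоёр", "хоёр", "NUM"),
    Token("хоёр", "хоёр", "NUM"),
    Token("хоёр", "хоёр", "CCONJ"),
]

num_counts = count_for_tag(sample, "NUM")
print(num_counts)
```

In this toy sample the NUM tag has 2 lemmas, 3 types and 4 tokens; the CCONJ occurrence of хоёр is what would make that form ambiguous.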

The 10 most frequent NUM lemmas: нэгэ, хоёр, гурбан, зуун, мянган, арбан, нэгэн, табан, юһэн, 1

The 10 most frequent NUM types: нэгэ, зуун, хоёр, арбан, гурбан, юһэн, долоон, мянга, хоер, 1-дэхи

The most frequent ambiguous lemmas (only 4 are listed): нэгэ (NUM 20, NOUN 1), хоёр (NUM 16, CCONJ 1), нэгэн (NUM 8, PRON 4), гурба (ADJ 4, NUM 1)

The most frequent ambiguous types (only 2 are listed): хоёр (NUM 7, CCONJ 1), гурбадахи (ADJ 2, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.390244 (the average of all parts of speech is 1.633129).
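The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of the page. The following is a minimal check of that arithmetic, assuming this is how the figure is derived; the variable names are illustrative.

```python
# Counts reported on this page for the NUM tag.
num_types = 114   # distinct word forms
num_lemmas = 82   # distinct lemmas

# Assumed definition: form / lemma ratio = types / lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.390244
```

A ratio below the all-tags average of 1.633129 suggests that numerals in this treebank are inflected less than the average word class.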

The 1st highest number of forms (4) was observed with the lemma “гурбан”: гурбадахи, гурбан, гурбанһаа, гурбахан.

The 2nd highest number of forms (4) was observed with the lemma “нэгэ”: нэгые, нэгэ, нэгэдэхи, нэгэмнай.

The 3rd highest number of forms (4) was observed with the lemma “хоёр”: Хоёрдохи, хоер, хоердохи, хоёр.

NUM occurs with 2 features: NumType (221; 99% instances), Case (9; 4% instances)

NUM occurs with 3 feature-value pairs: Case=Acc, Case=Nom, NumType=Card

NUM occurs with 4 feature combinations. The most frequent feature combination is NumType=Card (212 tokens). Examples: нэгэ, зуун, хоёр, арбан, гурбан, юһэн, долоон, мянга, хоер, 1-дэхи

Relations

NUM nodes are attached to their parents using 8 different relations: nummod (181; 81% instances), compound (31; 14% instances), nmod (4; 2% instances), flat (2; 1% instances), nsubj (2; 1% instances), appos (1; 0% instances), conj (1; 0% instances), root (1; 0% instances)
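The percentages in the list above can be reproduced by dividing each relation count by the total number of NUM tokens and rounding to the nearest integer. This is a sketch of that calculation, assuming plain rounding; the dictionary simply restates the counts from the list.

```python
# Relation counts for NUM nodes as reported above.
relation_counts = {
    "nummod": 181, "compound": 31, "nmod": 4, "flat": 2,
    "nsubj": 2, "appos": 1, "conj": 1, "root": 1,
}

total = sum(relation_counts.values())  # should equal the 223 NUM tokens

# Share of each relation, rounded to a whole percent.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, pct in percentages.items():
    print(f"{rel}: {relation_counts[rel]} ({pct}%)")
```

The counts sum to 223, matching the NUM token count, and a single instance rounds down to 0%, which explains the "0% instances" entries.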

Parents of NUM nodes belong to 8 different parts of speech: NOUN (168; 75% instances), NUM (40; 18% instances), SYM (5; 2% instances), ADJ (3; 1% instances), PRON (3; 1% instances), VERB (2; 1% instances), PROPN (1; 0% instances), no part of speech, presumably the artificial root of the one NUM node attached as root (1; 0% instances)

168 (75%) NUM nodes are leaves.

51 (23%) NUM nodes have one child.

3 (1%) NUM nodes have two children.

1 (0%) NUM node has three or more children.

The highest child degree of a NUM node is 4.
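The degree figures above can be cross-checked against each other. Because only one NUM node has three or more children and the highest child degree is 4, that node must have exactly 4 children, so the total number of children follows directly. The sketch below does this arithmetic and compares it with the child relation counts listed further down this page; the variable names are illustrative.

```python
# Number of NUM nodes by child degree, as reported above.
leaves = 168
one_child = 51
two_children = 3
three_or_more = 1
max_degree = 4  # the single node with three or more children has 4

total_nodes = leaves + one_child + two_children + three_or_more
total_children = one_child * 1 + two_children * 2 + three_or_more * max_degree

# Child relation counts reported later on this page.
child_relations = {
    "compound": 26, "nummod": 13, "punct": 7, "case": 4, "det": 3, "advmod": 2,
    "nmod": 2, "amod": 1, "cc": 1, "conj": 1, "nsubj": 1,
}

print(total_nodes)                    # 223
print(total_children)                 # 61
print(sum(child_relations.values()))  # 61
```

Both totals agree: 223 NUM nodes overall and 61 children, which is the denominator behind the percentages in the child relation and child part-of-speech lists.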

Children of NUM nodes are attached using 11 different relations: compound (26; 43% instances), nummod (13; 21% instances), punct (7; 11% instances), case (4; 7% instances), det (3; 5% instances), advmod (2; 3% instances), nmod (2; 3% instances), amod (1; 2% instances), cc (1; 2% instances), conj (1; 2% instances), nsubj (1; 2% instances)

Children of NUM nodes belong to 9 different parts of speech: NUM (40; 66% instances), PUNCT (7; 11% instances), ADP (4; 7% instances), NOUN (3; 5% instances), ADV (2; 3% instances), DET (2; 3% instances), ADJ (1; 2% instances), CCONJ (1; 2% instances), PRON (1; 2% instances)