
This page pertains to UD version 2.

Treebank Statistics: UD_Vietnamese-VTB: POS Tags: NUM

There are 308 NUM lemmas (4%), 308 NUM types (4%) and 1828 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 9 in number of tokens.

The 10 most frequent NUM lemmas: một, hai, ba, 10, bốn, 2, mỗi, 20, 1, 30

The 10 most frequent NUM types: một, hai, ba, 10, bốn, 2, mỗi, 20, 1, 30

The 10 most frequent ambiguous lemmas: một (NUM 535, DET 7, ADV 1), hai (NUM 235, NOUN 2), ba (NUM 78, PROPN 2, NOUN 1), 2 (NUM 29, PROPN 6), mỗi (DET 36, NUM 29, PART 1), 1 (NUM 25, PROPN 5, NOUN 1), 3 (NUM 22, PROPN 1), 5 (NUM 22, PROPN 2), đôi (NUM 21, DET 4, NOUN 1), năm (NOUN 151, NUM 19)

The 10 most frequent ambiguous types: một (NUM 496, DET 6, ADV 1), ba (NUM 69, PROPN 2, NOUN 1), 2 (NUM 29, PROPN 6), mỗi (DET 28, NUM 24, PART 1), 1 (NUM 25, PROPN 5, NOUN 1), 3 (NUM 22, PROPN 1), 5 (NUM 22, PROPN 2), đôi (NUM 19, DET 4, NOUN 1), năm (NOUN 137, NUM 17), 9 (NUM 16, PROPN 1)
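The ambiguity lists above can be approximated by tallying, for each lemma or type, how often it occurs with each tag, and keeping the items that occur as NUM and with at least one other tag. The following is a minimal sketch of that tally. The function name `ambiguous_items` and the sample data are hypothetical, and the ordering (by count under the tag of interest) is an assumption inferred from the lists above rather than a documented rule.

```python
from collections import Counter, defaultdict

def ambiguous_items(pairs, upos):
    """Return items seen with `upos` and at least one other tag.

    pairs: iterable of (item, tag), where item is a lemma or a word form.
    The result is sorted by the item's count under `upos`, highest first
    (assumed ordering), and each item's tags are listed by frequency.
    """
    tags = defaultdict(Counter)
    for item, tag in pairs:
        tags[item][tag] += 1
    ambiguous = [(item, c) for item, c in tags.items() if upos in c and len(c) > 1]
    ambiguous.sort(key=lambda entry: -entry[1][upos])
    return [(item, c.most_common()) for item, c in ambiguous]

# Hypothetical toy data, not taken from the treebank.
sample = (
    [("một", "NUM")] * 3 + [("một", "DET")]
    + [("mỗi", "DET")] * 2 + [("mỗi", "NUM")]
    + [("hai", "NUM")] * 2
)
for item, tag_counts in ambiguous_items(sample, "NUM"):
    print(item, tag_counts)
```

In this toy input, "hai" is left out because it only ever appears as NUM.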

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.001997).
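One plausible reading of the form / lemma ratio is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas for the given tag. The sketch below computes it under that assumption; the page does not state the exact definition (for example, whether forms are case-normalized), and the name `form_lemma_ratio` and the sample rows are hypothetical.

```python
from collections import defaultdict

def form_lemma_ratio(rows, upos):
    """Average number of distinct forms per lemma for one tag.

    rows: iterable of (form, lemma, tag) triples.
    Returns 0.0 when the tag does not occur at all.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma, tag in rows:
        if tag == upos:
            forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma)

# Hypothetical toy data: every NUM lemma has exactly one form.
sample_rows = [
    ("một", "một", "NUM"),
    ("hai", "hai", "NUM"),
    ("một", "một", "NUM"),
    ("năm", "năm", "NOUN"),
]
print(f"{form_lemma_ratio(sample_rows, 'NUM'):.6f}")
```

A ratio of exactly 1 means that no lemma of that tag was observed with more than one form, which matches the 308 lemmas and 308 types reported for NUM.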

The 1st highest number of forms (1) was observed with the lemma “1”: 1.

The 2nd highest number of forms (1) was observed with the lemma “1 , 1”: 1 , 1.

The 3rd highest number of forms (1) was observed with the lemma “1 , 2”: 1 , 2.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 24 different relations: nummod (1402; 77% instances), flat:date (100; 5% instances), nummod:det (71; 4% instances), flat:number (64; 4% instances), nmod (60; 3% instances), compound (39; 2% instances), conj (18; 1% instances), obj (14; 1% instances), obl:tmod (14; 1% instances), nsubj (10; 1% instances), flat:time (9; 0% instances), appos (6; 0% instances), flat:name (3; 0% instances), obl (3; 0% instances), appos:nmod (2; 0% instances), clf:det (2; 0% instances), compound:verbnoun (2; 0% instances), nsubj:nn (2; 0% instances), parataxis (2; 0% instances), amod (1; 0% instances), clf (1; 0% instances), nsubj:pass (1; 0% instances), obl:comp (1; 0% instances), root (1; 0% instances)
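As a consistency check, the relation counts above can be summed and converted back into percentages. The sketch below copies the counts from the list and assumes that percentages are rounded to the nearest integer, which reproduces the figures shown; the helper name `share` is hypothetical.

```python
# Counts copied from the relation list above (relation -> number of NUM nodes).
relation_counts = {
    "nummod": 1402, "flat:date": 100, "nummod:det": 71, "flat:number": 64,
    "nmod": 60, "compound": 39, "conj": 18, "obj": 14, "obl:tmod": 14,
    "nsubj": 10, "flat:time": 9, "appos": 6, "flat:name": 3, "obl": 3,
    "appos:nmod": 2, "clf:det": 2, "compound:verbnoun": 2, "nsubj:nn": 2,
    "parataxis": 2, "amod": 1, "clf": 1, "nsubj:pass": 1, "obl:comp": 1,
    "root": 1,
}

# Every NUM token has exactly one incoming relation, so this sum
# should equal the number of NUM tokens.
total = sum(relation_counts.values())

def share(count, total):
    """Percentage of `total`, rounded to the nearest integer (assumed rounding)."""
    return round(100 * count / total)

for relation in ("nummod", "flat:date", "flat:number", "flat:time"):
    count = relation_counts[relation]
    print(f"{relation} ({count}; {share(count, total)}% instances)")
```

The sum comes to 1828, the NUM token count reported at the top of the page, and relations with fewer than about 9 instances round down to 0%.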

Parents of NUM nodes belong to 10 different parts of speech: NOUN (1646; 90% instances), NUM (94; 5% instances), VERB (51; 3% instances), PROPN (15; 1% instances), ADJ (12; 1% instances), PRON (4; 0% instances), ADV (2; 0% instances), DET (2; 0% instances), ADP (1; 0% instances), no part of speech, i.e. the single NUM node attached as root (1; 0% instances)

1475 (81%) NUM nodes are leaves.

263 (14%) NUM nodes have one child.

51 (3%) NUM nodes have two children.

39 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
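The child degree of a node is the number of tokens that name it as their head. The four groups above (leaves, one child, two children, three or more) add up to the 1828 NUM tokens. The following is a minimal sketch of how such degrees could be derived from a list of head indices; the function names and the four-token example are hypothetical, and the input layout (1-based heads, 0 for the root) follows the usual CoNLL-U convention.

```python
from collections import Counter

def child_degrees(heads):
    """Return the number of children of each token in one sentence.

    heads[i] is the 1-based index of the head of token i + 1; 0 marks the root.
    """
    counts = Counter(h for h in heads if h != 0)
    return [counts[i] for i in range(1, len(heads) + 1)]

def bucket(degree):
    """Map a child count to one of the four groups used above."""
    if degree >= 3:
        return "three or more"
    return ("leaf", "one child", "two children")[degree]

# Hypothetical four-token sentence in which token 2 is the root.
heads = [2, 0, 2, 3]
degrees = child_degrees(heads)
print(degrees)  # [0, 2, 1, 0]
print(Counter(bucket(d) for d in degrees))
```

To restrict the tally to NUM nodes, the degrees would be filtered by each token's tag before bucketing.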

Children of NUM nodes are attached using 29 different relations: clf (133; 26% instances), flat:number (131; 26% instances), advmod:adj (63; 12% instances), punct (45; 9% instances), case (18; 4% instances), advmod (17; 3% instances), nmod (17; 3% instances), conj (12; 2% instances), flat:time (12; 2% instances), det (9; 2% instances), nummod (8; 2% instances), cop (5; 1% instances), compound (4; 1% instances), flat:date (4; 1% instances), acl:subj (3; 1% instances), amod (3; 1% instances), cc (3; 1% instances), flat (3; 1% instances), mark (3; 1% instances), obl (3; 1% instances), obl:tmod (3; 1% instances), advmod:neg (2; 0% instances), clf:det (2; 0% instances), det:pmod (2; 0% instances), discourse (2; 0% instances), nsubj (2; 0% instances), advcl (1; 0% instances), nsubj:nn (1; 0% instances), obl:adj (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: NOUN (180; 35% instances), NUM (94; 18% instances), ADJ (67; 13% instances), SYM (54; 11% instances), PUNCT (45; 9% instances), ADV (19; 4% instances), ADP (18; 4% instances), PRON (8; 2% instances), DET (7; 1% instances), AUX (5; 1% instances), VERB (5; 1% instances), CCONJ (3; 1% instances), SCONJ (3; 1% instances), PART (2; 0% instances), PROPN (2; 0% instances)