
This page pertains to UD version 2.

Treebank Statistics: UD_Cantonese-HK: POS Tags: NUM

There are 32 NUM lemmas (3%), 33 NUM types (2%) and 232 NUM tokens (2%). Out of 15 observed tags, the rank of NUM is: 7 in number of lemmas, 12 in number of types and 10 in number of tokens.

The 10 most frequent NUM lemmas: _, 一, 兩, 三, 三十, 一百, 十, 四, 幾, 二十

The 10 most frequent NUM types: 一, 兩, 三, 幾, 三十, 五, 十, 一百, 四, 半

The 10 most frequent ambiguous lemmas: _ (PUNCT 1377, VERB 1352, NOUN 1283, ADV 853, PART 764, PRON 662, AUX 335, DET 217, ADJ 209, ADP 140, NUM 124, SCONJ 101, CCONJ 93, INTJ 92, PROPN 52), 一 (NUM 32, ADV 1), 幾 (NUM 5, ADV 4)

The 10 most frequent ambiguous types: 一 (NUM 124, ADV 1), 幾 (NUM 8, ADV 4, DET 2), 七十一 (NUM 1, PROPN 1)

Morphology

The form / lemma ratio of NUM is 1.031250 (the average of all parts of speech is 1.624294).
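The ratio above can be reproduced from the counts reported at the top of this page. The sketch below assumes the ratio is simply the number of distinct NUM types divided by the number of distinct NUM lemmas; the function name `form_lemma_ratio` is made up for illustration and is not part of any UD tool.

```python
# Counts reported for NUM on this page.
num_types = 33    # distinct NUM word forms (types)
num_lemmas = 32   # distinct NUM lemmas


def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Return the number of distinct forms per distinct lemma.

    Assumption: the page's ratio is types / lemmas.
    """
    if n_lemmas == 0:
        raise ValueError("cannot compute a ratio with zero lemmas")
    return n_forms / n_lemmas


ratio = form_lemma_ratio(num_types, num_lemmas)
print(f"{ratio:.6f}")  # 1.031250
```

A value this close to 1 is consistent with the statement further down that NUM does not occur with any features.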

The highest number of forms (11) was observed with the lemma “_” (the CoNLL-U placeholder for an unspecified lemma): 一, 七十一, 三, 三十, 三十五, 五, 兩, 六, 十, 半, 幾.

The 2nd highest number of forms (1) was observed with the lemma “一”: 一.

The 3rd highest number of forms (1) was observed with the lemma “一百”: 一百.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 8 different relations: nummod (206; 89% instances), conj (15; 6% instances), reparandum (3; 1% instances), nsubj (2; 1% instances), obj (2; 1% instances), root (2; 1% instances), compound (1; 0% instances), parataxis (1; 0% instances)
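As a cross-check, the percentages in the list above can be recomputed from the raw counts. This is a minimal sketch that assumes the page rounds each share to the nearest whole percent; the variable names are illustrative only.

```python
# Relation counts for NUM nodes, copied from the list above.
counts = {
    "nummod": 206,
    "conj": 15,
    "reparandum": 3,
    "nsubj": 2,
    "obj": 2,
    "root": 2,
    "compound": 1,
    "parataxis": 1,
}

# The counts should add up to the number of NUM tokens (232).
total = sum(counts.values())

# Assumption: shares are rounded to the nearest integer percent.
shares = {rel: round(100 * n / total) for rel, n in counts.items()}

for rel, n in counts.items():
    print(f"{rel}: {n} ({shares[rel]}%)")
```

Under that rounding assumption, relations with a single instance come out as 0%, which matches the figures shown for compound and parataxis.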

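Parents of NUM nodes belong to 5 different parts of speech: NOUN (208; 90% instances), NUM (10; 4% instances), VERB (10; 4% instances), ADJ (2; 1% instances), no tag (2; 1% instances). The untagged entry has no part of speech listed; its count matches the 2 NUM nodes attached as root, which have no parent word.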

102 (44%) NUM nodes are leaves.

114 (49%) NUM nodes have one child.

8 (3%) NUM nodes have two children.

8 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
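The child-degree figures above could be gathered from a CoNLL-U file roughly as follows. This is a sketch under stated assumptions, not the script that generated this page: the function `child_degrees` and the three-token sample sentence are hypothetical, and the parser only handles plain integer token IDs (multiword token ranges and empty nodes are skipped).

```python
from collections import Counter

# Hypothetical toy sentence in CoNLL-U format (tab-separated, 10 columns).
# The numeral governs its classifier via clf and attaches to the noun via nummod.
SAMPLE = """\
1\t三\t三\tNUM\t_\t_\t3\tnummod\t_\t_
2\t本\t本\tNOUN\t_\t_\t1\tclf\t_\t_
3\t書\t書\tNOUN\t_\t_\t0\troot\t_\t_
"""


def child_degrees(conllu: str, upos: str = "NUM") -> Counter:
    """Map number of children -> how many nodes with the given UPOS have it."""
    degrees = Counter()
    for block in conllu.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in block.splitlines()
            if line and not line.startswith("#")  # skip comment lines
        ]
        # Keep ordinary tokens only (IDs like "1-2" or "1.1" are skipped).
        rows = [r for r in rows if r[0].isdigit()]
        # Column 7 (index 6) is HEAD: count how often each ID is a head.
        children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:  # column 4 (index 3) is UPOS
                degrees[children[r[0]]] += 1
    return degrees


print(child_degrees(SAMPLE))
```

In the toy sentence the single NUM node has exactly one child, so the result contains one node of degree 1. Applied to the full treebank, the keys 0, 1, 2 and 3 or more would correspond to the four lines above.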

Children of NUM nodes are attached using 14 different relations: clf (117; 74% instances), punct (10; 6% instances), case (8; 5% instances), conj (7; 4% instances), cc (4; 3% instances), compound (3; 2% instances), nmod (2; 1% instances), advmod (1; 1% instances), det (1; 1% instances), discourse:sp (1; 1% instances), nsubj (1; 1% instances), nummod (1; 1% instances), parataxis (1; 1% instances), reparandum (1; 1% instances)

Children of NUM nodes belong to 10 different parts of speech: NOUN (122; 77% instances), NUM (10; 6% instances), PUNCT (10; 6% instances), PART (7; 4% instances), CCONJ (4; 3% instances), ADP (1; 1% instances), ADV (1; 1% instances), DET (1; 1% instances), SCONJ (1; 1% instances), VERB (1; 1% instances)