
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-HK: POS Tags: NUM

There are 16 NUM lemmas (3%), 20 NUM types (4%) and 49 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 8 in number of lemmas, 8 in number of types and 9 in number of tokens.
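As a rough illustration of how such counts can be obtained, the sketch below tallies tokens, distinct word forms (types) and distinct lemmas for one tag in CoNLL-U text. The sample sentences, the `pos_counts` helper and the choice to copy each form into the lemma column are assumptions made for the example; they are not taken from UD_Chinese-HK or from the script that generated this page.

```python
# Hypothetical sample sentences, one list of 10 CoNLL-U columns per token:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SENTENCES = [
    [
        ["1", "三", "三", "NUM", "_", "_", "3", "nummod", "_", "_"],
        ["2", "個", "個", "NOUN", "_", "_", "1", "clf", "_", "_"],
        ["3", "人", "人", "NOUN", "_", "_", "0", "root", "_", "_"],
    ],
    [
        ["1", "20", "20", "NUM", "_", "_", "2", "nummod", "_", "_"],
        ["2", "年", "年", "NOUN", "_", "_", "0", "root", "_", "_"],
    ],
    [
        ["1", "三", "三", "NUM", "_", "_", "2", "nummod", "_", "_"],
        ["2", "日", "日", "NOUN", "_", "_", "0", "root", "_", "_"],
    ],
]

# Serialize to CoNLL-U text: tab-separated columns, blank line between sentences.
SAMPLE = "\n\n".join(
    "\n".join("\t".join(token) for token in sentence) for sentence in SENTENCES
)


def pos_counts(conllu_text, upos="NUM"):
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag."""
    forms, lemmas, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            forms.add(cols[1])
            lemmas.add(cols[2])
    return {"lemmas": len(lemmas), "types": len(forms), "tokens": tokens}


counts = pos_counts(SAMPLE)
print(counts)
```

In the sample, 三 occurs twice, so there are three NUM tokens but only two types and two lemmas.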

The 10 most frequent NUM lemmas: _、 一、 20、 100、 三、 15、 200、 300、 45、 三十

The 10 most frequent NUM types: 一、 20、 三、 兩、 四、 100、 一百、 三十、 六十六、 十

The 10 most frequent ambiguous lemmas: _ (VERB 114, PUNCT 111, NOUN 69, ADV 63, PART 54, PRON 49, ADJ 21, NUM 19, AUX 18, ADP 10, PROPN 10, DET 8, INTJ 5, SCONJ 1, X 1)

The 10 most frequent ambiguous types: (none listed)

Morphology

The form / lemma ratio of NUM is 1.250000 (the average of all parts of speech is 1.221258).
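The reported value is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page (20 and 16). The snippet below is only an arithmetic check under that assumption; the variable names are illustrative.

```python
# Counts reported on this page for NUM.
num_types = 20   # distinct word forms
num_lemmas = 16  # distinct lemmas

# Assumed definition: form / lemma ratio = types / lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.250000
```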

The 1st highest number of forms (11) was observed with the lemma “_”: 一, 一百, 三, 三十, 三十三, 兩, 六, 六十六, 十, 半, 四.

The 2nd highest number of forms (1) was observed with the lemma “100”: 100.

The 3rd highest number of forms (1) was observed with the lemma “15”: 15.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 4 different relations: nummod (40; 82% of instances), conj (7; 14% of instances), obj (1; 2% of instances), root (1; 2% of instances)
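The percentages can be reproduced from the raw counts, assuming each share is the count divided by the total number of NUM tokens and rounded to the nearest whole percent. A minimal sketch, with illustrative variable names:

```python
# Relation counts for NUM nodes as reported on this page.
relation_counts = {"nummod": 40, "conj": 7, "obj": 1, "root": 1}

total = sum(relation_counts.values())  # should equal the number of NUM tokens

# Assumed rounding: nearest whole percent.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}
print(total, shares)
```

The total comes to 49, which matches the NUM token count given at the top of the page.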

Parents of NUM nodes belong to 6 different parts of speech: NOUN (38; 78% of instances), SYM (6; 12% of instances), NUM (2; 4% of instances), PROPN (1; 2% of instances), [no tag: artificial root] (1; 2% of instances), VERB (1; 2% of instances)

39 (80%) NUM nodes are leaves.

9 (18%) NUM nodes have one child.

0 (0%) NUM nodes have two children.

1 (2%) NUM node has three or more children.

The highest child degree of a NUM node is 4.
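The child degree of a node is the number of tokens whose HEAD points at it. The sketch below derives the degrees of NUM nodes for a single hypothetical sentence; the data and the `child_degrees` helper are assumptions for illustration, not part of the treebank or of the tool that produced these statistics.

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD) triples; not from UD_Chinese-HK.
SENTENCE = [
    (1, "NUM", 3),   # numeral attached to token 3
    (2, "NOUN", 1),  # classifier attached to the numeral (token 1)
    (3, "NOUN", 0),  # root of the sentence
    (4, "NUM", 3),   # a second numeral with no children
]


def child_degrees(sentence, upos="NUM"):
    """Return the number of children of every node carrying the given UPOS tag."""
    children = Counter(head for _, _, head in sentence)
    return [children[tok_id] for tok_id, tag, _ in sentence if tag == upos]


degrees = child_degrees(SENTENCE)
histogram = Counter(degrees)  # maps child degree -> number of NUM nodes
print(degrees, max(degrees))
```

Here one NUM node has a single child and the other is a leaf, so the highest child degree in the sample is 1.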

Children of NUM nodes are attached using 4 different relations: clf (7; 54% of instances), punct (3; 23% of instances), conj (2; 15% of instances), appos (1; 8% of instances)

Children of NUM nodes belong to 3 different parts of speech: NOUN (8; 62% of instances), PUNCT (3; 23% of instances), NUM (2; 15% of instances)