
This page pertains to UD version 2.

Treebank Statistics: UD_Cantonese-HK: POS Tags: NUM

There are 30 NUM lemmas (3%), 30 NUM types (3%) and 107 NUM tokens (2%). Out of 15 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: 一, 兩, 三, 三十, 一百, 十, 四, 幾, 二十, 六十六

The 10 most frequent NUM types: 一, 兩, 三, 三十, 一百, 十, 四, 幾, 二十, 六十六

The 10 most frequent ambiguous lemmas: (NUM 32, ADV 1), (NUM 5, ADV 4)

The 10 most frequent ambiguous types: (NUM 32, ADV 1), (NUM 5, ADV 4)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 0.998084).
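
A ratio of 1.000000 is consistent with the 30 NUM types and 30 NUM lemmas reported above: every lemma appears with exactly one word form. The sketch below shows one way such a ratio could be computed from CoNLL-U text. It assumes the ratio is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas, which may not be exactly how this page's generator defines it. The function names and the two-sentence `SAMPLE` are hypothetical and are not taken from the treebank.

```python
from collections import defaultdict

# Invented toy data in CoNLL-U format (10 tab-separated columns per token).
SAMPLE = (
    "# text = toy sentence 1\n"
    "1\t一\t一\tNUM\t_\t_\t3\tnummod\t_\t_\n"
    "2\t個\t個\tNOUN\t_\t_\t1\tclf\t_\t_\n"
    "3\t人\t人\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# text = toy sentence 2\n"
    "1\t三\t三\tNUM\t_\t_\t0\troot\t_\t_\n"
    "\n"
)


def iter_tokens(conllu_text):
    """Yield the column list of every regular token line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2), empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def form_lemma_ratio(conllu_text, upos):
    """Distinct (lemma, form) pairs divided by distinct lemmas for one UPOS tag."""
    forms_by_lemma = defaultdict(set)
    for cols in iter_tokens(conllu_text):
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma)


if __name__ == "__main__":
    print(f"{form_lemma_ratio(SAMPLE, 'NUM'):.6f}")
```

To apply this to the real data, the text of the treebank's `.conllu` files would be passed in place of `SAMPLE`.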

The 1st highest number of forms (1) was observed with the lemma “一”: 一.

The 2nd highest number of forms (1) was observed with the lemma “一百”: 一百.

The 3rd highest number of forms (1) was observed with the lemma “七”: 七.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 8 different relations: nummod (86; 80% instances), conj (13; 12% instances), obj (2; 2% instances), root (2; 2% instances), det (1; 1% instances), nsubj (1; 1% instances), parataxis (1; 1% instances), reparandum (1; 1% instances)
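
The percentages above are each count divided by the 107 NUM tokens, rounded to a whole number. As a rough illustration, the sketch below tallies the incoming relation (the DEPREL column) of every node with a given UPOS tag in CoNLL-U text and then reproduces the rounding with the counts listed on this page. The function names are hypothetical, and the rounding rule is an assumption that happens to match the figures shown.

```python
from collections import Counter


def relation_counts(conllu_text, upos="NUM"):
    """Count the DEPREL (column 8) of every token whose UPOS (column 4) matches."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Only regular token lines: 10 columns and a plain integer ID.
        if len(cols) == 10 and cols[0].isdigit() and cols[3] == upos:
            counts[cols[7]] += 1
    return counts


def with_percentages(counts):
    """Return (relation, count, rounded percent) triples, most frequent first."""
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Counts copied from the list above, used here only to check the arithmetic.
PAGE_COUNTS = Counter({
    "nummod": 86, "conj": 13, "obj": 2, "root": 2,
    "det": 1, "nsubj": 1, "parataxis": 1, "reparandum": 1,
})

if __name__ == "__main__":
    for rel, n, pct in with_percentages(PAGE_COUNTS):
        print(f"{rel} ({n}; {pct}% instances)")
```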

Parents of NUM nodes belong to 4 different parts of speech: NOUN (92; 86% instances), NUM (9; 8% instances), VERB (4; 4% instances), no part of speech, i.e. the parent is the artificial root (2; 2% instances)

70 (65%) NUM nodes are leaves.

32 (30%) NUM nodes have one child.

0 (0%) NUM nodes have two children.

5 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 4.
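
The leaf, one-child and multi-child figures above describe the child degree of each NUM node, meaning the number of tokens whose HEAD column points at it. A minimal sketch of how such a histogram could be built from CoNLL-U text follows. The function name and the `SAMPLE` sentences are hypothetical and are not taken from the treebank.

```python
from collections import Counter

# Invented toy data: in sentence 1 the numeral has one child (a classifier),
# in sentence 2 the numeral is a leaf.
SAMPLE = (
    "1\t一\t一\tNUM\t_\t_\t3\tnummod\t_\t_\n"
    "2\t個\t個\tNOUN\t_\t_\t1\tclf\t_\t_\n"
    "3\t人\t人\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\t三\t三\tNUM\t_\t_\t0\troot\t_\t_\n"
    "\n"
)


def child_degree_histogram(conllu_text, upos="NUM"):
    """Map child degree -> number of nodes with that degree, for one UPOS tag."""
    histogram = Counter()
    sentence = []  # (id, upos, head) for each token of the current sentence

    def flush():
        # Count how many tokens name each ID as their head.
        n_children = Counter(head for _, _, head in sentence)
        for tok_id, tag, _ in sentence:
            if tag == upos:
                histogram[n_children[tok_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append((cols[0], cols[3], cols[6]))
        elif not line.strip():
            flush()  # a blank line ends the sentence
    flush()  # handle a final sentence without a trailing blank line
    return histogram


if __name__ == "__main__":
    hist = child_degree_histogram(SAMPLE)
    print(dict(hist), "max degree:", max(hist))
```

The highest child degree is then the largest key of the returned histogram, and the leaf count is the value stored under key 0.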

Children of NUM nodes are attached using 10 different relations: clf (27; 55% instances), punct (7; 14% instances), conj (6; 12% instances), cc (3; 6% instances), advmod (1; 2% instances), discourse:sp (1; 2% instances), nsubj (1; 2% instances), nummod (1; 2% instances), parataxis (1; 2% instances), reparandum (1; 2% instances)

Children of NUM nodes belong to 7 different parts of speech: NOUN (27; 55% instances), NUM (9; 18% instances), PUNCT (7; 14% instances), CCONJ (3; 6% instances), ADV (1; 2% instances), PART (1; 2% instances), VERB (1; 2% instances)