Treebank Statistics: UD_Chinese: POS Tags: NUM
There are 1257 NUM lemmas (6%), 1257 NUM types (6%) and 6659 NUM tokens (5%).
Out of 15 observed tags, the rank of NUM is: 4 in number of lemmas, 4 in number of types and 6 in number of tokens.
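Counts like these can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that "types" means distinct word forms and that multiword-token ranges and empty nodes are excluded. The helpers `read_words`, `pos_counts` and `row`, and the `SAMPLE` data, are illustrative names and are not part of any UD tool.

```python
def read_words(conllu_text):
    """Yield the 10 CoNLL-U columns of each syntactic word."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Comments and blank lines do not have 10 columns; multiword
        # ranges (1-2) and empty nodes (1.1) do not have a plain integer ID.
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for cols in read_words(conllu_text):
        if cols[3] == upos:
            forms.add(cols[1])
            lemmas.add(cols[2])
            tokens += 1
    return len(lemmas), len(forms), tokens


def row(idx, form, lemma, upos, head, deprel, feats="_"):
    """Build one CoNLL-U line; unused columns are left as '_'."""
    return "\t".join(
        [str(idx), form, lemma, upos, "_", feats, str(head), deprel, "_", "_"]
    )


# Hypothetical two-sentence sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = 一年",
    row(1, "一", "一", "NUM", 2, "nummod", "NumType=Card"),
    row(2, "年", "年", "NOUN", 0, "root"),
    "",
    "# text = 兩人",
    row(1, "兩", "兩", "NUM", 2, "nummod", "NumType=Card"),
    row(2, "人", "人", "NOUN", 0, "root"),
    "",
])

print(pos_counts(SAMPLE, "NUM"))  # (2, 2, 2)
```

The percentages in parentheses would then be each count divided by the corresponding total over all tags.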
The 10 most frequent NUM lemmas: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
The 10 most frequent NUM types: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
The 10 most frequent ambiguous lemmas: 一 (NUM 1123, NOUN 1), 第一 (NUM 117, PROPN 2), 四 (NUM 84, X 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 雙 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 單 (NUM 26, PART 2), 半 (NUM 24, PART 6), 數 (NUM 22, PART 15), 九 (NUM 16, PROPN 2)
The 10 most frequent ambiguous types: 一 (NUM 1123, NOUN 1), 第一 (NUM 117, PROPN 2), 四 (NUM 84, X 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 雙 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 單 (NUM 26, PART 2), 半 (NUM 24, PART 6), 數 (NUM 22, PART 15), 九 (NUM 16, PROPN 2)
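An ambiguity list of this kind can be derived by grouping tag counts per lemma. This is a minimal sketch, assuming a lemma counts as ambiguous when it occurs with the target tag and at least one other tag. The function name `ambiguous_lemmas` and the sample pairs are hypothetical.

```python
from collections import Counter, defaultdict


def ambiguous_lemmas(pairs, upos):
    """pairs: iterable of (lemma, upos_tag).

    Return lemmas seen with `upos` and at least one other tag,
    ordered by their frequency under `upos` (highest first).
    """
    tags = defaultdict(Counter)
    for lemma, tag in pairs:
        tags[lemma][tag] += 1
    hits = [(lemma, c) for lemma, c in tags.items() if upos in c and len(c) > 1]
    hits.sort(key=lambda item: -item[1][upos])
    return [(lemma, c.most_common()) for lemma, c in hits]


# Hypothetical counts, much smaller than the treebank figures.
pairs = (
    [("多", "NUM")] * 3 + [("多", "ADV")] * 2
    + [("一", "NUM")] * 4 + [("一", "NOUN")]
    + [("三", "NUM")]
)
for lemma, counts in ambiguous_lemmas(pairs, "NUM"):
    print(lemma, counts)
```

The lemma 三 is omitted from the output because it only ever occurs as NUM in the sample.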
Morphology
The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.000266).
The 1st highest number of forms (1) was observed with the lemma “,”: ,.
The 2nd highest number of forms (1) was observed with the lemma “-15”: -15.
The 3rd highest number of forms (1) was observed with the lemma “-154”: -154.
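A ratio of exactly 1 means that every NUM lemma appears in a single form. The sketch below assumes the ratio is the average number of distinct forms per lemma, which is consistent with 1257 types over 1257 lemmas; the exact definition used to generate the page is not stated here. The function name `form_lemma_ratio` is illustrative.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """pairs: iterable of (form, lemma) for the tokens of one part of speech.

    Return the average number of distinct forms per lemma.
    """
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    if not forms:
        return 0.0
    return sum(len(f) for f in forms.values()) / len(forms)


# Each lemma has one form, as in the NUM figures above.
print(form_lemma_ratio([("一", "一"), ("兩", "兩"), ("一", "一")]))  # 1.0

# Hypothetical inflecting case: one lemma with two forms, one with one.
print(form_lemma_ratio([("two", "two"), ("twos", "two"), ("one", "one")]))  # 1.5
```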
NUM occurs with 1 feature: NumType (6659; 100% instances)
NUM occurs with 1 feature-value pair: NumType=Card
NUM occurs with 1 feature combination.
The most frequent feature combination is NumType=Card (6659 tokens).
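Feature statistics such as these come from the FEATS column of CoNLL-U, where pairs are written as `Name=Value` and separated by `|`. The following sketch tallies feature names, feature-value pairs and whole combinations; `feature_stats` is a hypothetical helper.

```python
from collections import Counter


def feature_stats(feats_column):
    """feats_column: iterable of FEATS strings, e.g. 'NumType=Card' or '_'.

    Return three Counters: feature names, feature-value pairs,
    and complete feature combinations.
    """
    names, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":  # '_' marks a token without features
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            names[name] += 1
            pairs[pair] += 1
    return names, pairs, combos


names, pairs, combos = feature_stats(["NumType=Card"] * 3)
print(dict(names), dict(pairs), combos.most_common(1))
```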
Examples: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
Relations
NUM nodes are attached to their parents using 19 different relations: nummod (6198; 93% instances), root (77; 1% instances), obj (62; 1% instances), conj (53; 1% instances), nmod (51; 1% instances), advmod (50; 1% instances), det (40; 1% instances), nsubj (32; 0% instances), dep (26; 0% instances), acl (15; 0% instances), case:suff (10; 0% instances), nmod:tmod (10; 0% instances), appos (8; 0% instances), obl (8; 0% instances), amod (6; 0% instances), ccomp (5; 0% instances), xcomp (4; 0% instances), punct (3; 0% instances), nsubj:pass (1; 0% instances)
Parents of NUM nodes belong to 9 different parts of speech (counting the empty category of root nodes, which have no parent): NOUN (6213; 93% instances), VERB (152; 2% instances), PART (93; 1% instances), no parent (77; 1% instances), NUM (70; 1% instances), X (23; 0% instances), ADJ (15; 0% instances), PROPN (15; 0% instances), SYM (1; 0% instances)
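Both distributions can be obtained by following each node's HEAD index within its sentence. This is a minimal sketch, assuming sentences are already parsed into `(id, upos, head, deprel)` tuples and that a head of 0 means the node has no parent. The function name and the `"(no parent)"` label are illustrative choices.

```python
from collections import Counter


def attachment_stats(sentences, upos):
    """sentences: list of sentences, each a list of (id, upos, head, deprel).

    Return two Counters for nodes tagged `upos`: incoming relations
    and the part of speech of the parent node.
    """
    relations, parent_pos = Counter(), Counter()
    for sent in sentences:
        pos_by_id = {tok_id: tag for tok_id, tag, _, _ in sent}
        for _tok_id, tag, head, deprel in sent:
            if tag != upos:
                continue
            relations[deprel] += 1
            # HEAD 0 is the artificial root, so there is no parent tag.
            parent_pos[pos_by_id.get(head, "(no parent)")] += 1
    return relations, parent_pos


# Hypothetical sentences: a numeral modifying a noun, and a numeral as root.
sentences = [
    [(1, "NUM", 2, "nummod"), (2, "NOUN", 0, "root")],
    [(1, "NUM", 0, "root")],
]
relations, parent_pos = attachment_stats(sentences, "NUM")
print(dict(relations), dict(parent_pos))
```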
6304 (95%) NUM nodes are leaves.
191 (3%) NUM nodes have one child.
56 (1%) NUM nodes have two children.
108 (2%) NUM nodes have three or more children.
The highest child degree of a NUM node is 16.
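The child-degree figures can be sketched by counting, per sentence, how many tokens name each node as their head. The example assumes the same `(id, upos, head, deprel)` tuples as input; `child_degrees` is a hypothetical helper, and the sample sentences are invented.

```python
from collections import Counter


def child_degrees(sentences, upos):
    """Return a Counter mapping number of children -> number of `upos` nodes."""
    degrees = Counter()
    for sent in sentences:
        # How many tokens in this sentence point at each head index.
        n_children = Counter(head for _, _, head, _ in sent)
        for tok_id, tag, _, _ in sent:
            if tag == upos:
                degrees[n_children[tok_id]] += 1
    return degrees


sentences = [
    # A numeral that is a leaf under a noun.
    [(1, "NUM", 2, "nummod"), (2, "NOUN", 0, "root")],
    # A numeral as root with a subject and a punctuation mark as children.
    [(1, "PRON", 2, "nsubj"), (2, "NUM", 0, "root"), (3, "PUNCT", 2, "punct")],
]
degrees = child_degrees(sentences, "NUM")
print(dict(degrees))        # {0: 1, 2: 1}
print(max(degrees))         # highest child degree: 2
```

Nodes with degree 0 are the leaves; the highest key in the result corresponds to the highest child degree.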
Children of NUM nodes are attached using 24 different relations: punct (231; 26% instances), det (113; 13% instances), nsubj (97; 11% instances), cop (93; 10% instances), dep (68; 8% instances), conj (51; 6% instances), cc (44; 5% instances), case:dec (40; 4% instances), nmod (40; 4% instances), advmod (39; 4% instances), acl (25; 3% instances), nummod (11; 1% instances), appos (10; 1% instances), case (10; 1% instances), nmod:tmod (7; 1% instances), csubj (4; 0% instances), flat:foreign (4; 0% instances), mark (2; 0% instances), acl:relcl (1; 0% instances), amod (1; 0% instances), case:pref (1; 0% instances), ccomp (1; 0% instances), obj (1; 0% instances), xcomp (1; 0% instances)
Children of NUM nodes belong to 15 different parts of speech: PUNCT (227; 25% instances), NOUN (226; 25% instances), AUX (93; 10% instances), PART (77; 9% instances), NUM (70; 8% instances), CCONJ (44; 5% instances), VERB (43; 5% instances), ADV (36; 4% instances), ADP (16; 2% instances), DET (15; 2% instances), PROPN (14; 2% instances), PRON (12; 1% instances), X (12; 1% instances), ADJ (6; 1% instances), SYM (4; 0% instances)