Treebank Statistics: UD_Chinese-GSD: POS Tags: NUM
There are 1254 NUM lemmas (6%), 1254 NUM types (6%) and 6659 NUM tokens (5%).
Out of 16 observed tags, the rank of NUM is: 4 in number of lemmas, 4 in number of types and 6 in number of tokens.
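The lemma, type and token counts above can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page; it assumes that "types" means distinct word forms, and the function names `read_conllu` and `pos_counts` are hypothetical.

```python
def read_conllu(lines):
    """Yield (form, lemma, upos) for each regular token line of CoNLL-U text."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # multiword-token range (1-2) or empty node (1.1)
        yield cols[1], cols[2], cols[3]


def pos_counts(lines, upos):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in read_conllu(lines):
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: a "type" is a distinct word form
            tokens += 1
    return len(lemmas), len(types), tokens


# Hypothetical three-token sample, not taken from the treebank.
sample = [
    "# text = 一 本 一",
    "1\t一\t一\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_",
    "2\t本\t本\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\t一\t一\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_",
]
print(pos_counts(sample, "NUM"))
```

To run it on a real file, pass an open file handle instead of `sample`.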
The 10 most frequent NUM lemmas: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
The 10 most frequent NUM types: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
The 10 most frequent ambiguous lemmas: 一 (NUM 1124, NOUN 1), 第一 (NUM 117, ADJ 1, PROPN 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 雙 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 單 (NUM 26, PART 2), 半 (NUM 24, PART 6), 數 (NUM 22, PART 15), 九 (NUM 16, PROPN 2), 眾多 (ADJ 8, NUM 8)
The 10 most frequent ambiguous types: 一 (NUM 1124, NOUN 1), 第一 (NUM 117, ADJ 1, PROPN 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 雙 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 單 (NUM 26, PART 2), 半 (NUM 24, PART 6), 數 (NUM 22, PART 15), 九 (NUM 16, PROPN 2), 眾多 (ADJ 8, NUM 8)
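A lemma counts as ambiguous here when it occurs with NUM and with at least one other tag. The sketch below shows one way to find such lemmas and rank them by their NUM count, which is the order the list above appears to follow. The function name `ambiguous_lemmas` is hypothetical, and the input is assumed to be one (lemma, tag) pair per token.

```python
from collections import Counter, defaultdict


def ambiguous_lemmas(rows, upos):
    """rows: iterable of (lemma, upos_tag) pairs, one per token."""
    tags_by_lemma = defaultdict(Counter)
    for lemma, tag in rows:
        tags_by_lemma[lemma][tag] += 1
    ambiguous = [
        (lemma, tags)
        for lemma, tags in tags_by_lemma.items()
        if upos in tags and len(tags) > 1
    ]
    # Rank by how often the lemma carries the tag of interest.
    ambiguous.sort(key=lambda item: -item[1][upos])
    return ambiguous


# Counts for 多 and 第一 are copied from the list above; 三 gets a made-up
# count of 5 only to show that unambiguous lemmas are filtered out.
rows = (
    [("多", "NUM")] * 83 + [("多", "ADV")] * 28
    + [("多", "ADJ")] * 16 + [("多", "PART")] * 3
    + [("第一", "NUM")] * 117 + [("第一", "ADJ")] * 1 + [("第一", "PROPN")] * 1
    + [("三", "NUM")] * 5
)
for lemma, tags in ambiguous_lemmas(rows, "NUM"):
    print(lemma, tags.most_common())
```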
Morphology
The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.004732).
Every NUM lemma has exactly one form, so the ranking by number of forms is a tie. The first three lemmas listed are “-15”, “-154” and “-300”.
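The form / lemma ratio can be read as the average number of distinct forms per lemma. The sketch below computes it that way; this definition is an assumption consistent with the 1254 types and 1254 lemmas reported for NUM, not a confirmed description of the generator. The name `form_lemma_ratio` is hypothetical.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """pairs: iterable of (form, lemma). Returns distinct forms per lemma on average."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0  # no tokens for this tag
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Chinese numerals do not inflect, so each lemma has a single form.
print(form_lemma_ratio([("一", "一"), ("兩", "兩"), ("一", "一")]))
# An inflecting toy example for contrast: "be" has two forms, "cat" has one.
print(form_lemma_ratio([("is", "be"), ("was", "be"), ("cat", "cat")]))
```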
NUM occurs with 1 feature: NumType (6658; 100% instances)
NUM occurs with 2 feature-value pairs: NumType=Card, NumType=Ord
NUM occurs with 3 feature combinations.
The most frequent feature combination is NumType=Card (6257 tokens).
Examples: 一、 兩、 三、 1、 3、 12、 5、 2、 8、 10
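The three kinds of feature counts above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column. The following is a minimal sketch on made-up input; `feature_stats` is a hypothetical name, and treating the empty value `_` as its own combination is an assumption.

```python
from collections import Counter


def feature_stats(feats_column):
    """feats_column: iterable of FEATS strings such as 'NumType=Card' or '_'."""
    names, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the whole FEATS string is one combination
        if feats == "_":
            continue  # token without any features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            names[name] += 1
            pairs[pair] += 1
    return names, pairs, combos


# Made-up FEATS values for four NUM tokens.
names, pairs, combos = feature_stats(
    ["NumType=Card", "NumType=Card", "NumType=Ord", "_"]
)
print(len(names), "feature(s),", len(pairs), "pair(s),", len(combos), "combination(s)")
print(combos.most_common(1))
```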
Relations
NUM nodes are attached to their parents using 18 different relations: nummod (6237; 94% instances), obj (61; 1% instances), obl (58; 1% instances), conj (57; 1% instances), root (53; 1% instances), nmod (51; 1% instances), parataxis (44; 1% instances), nsubj (30; 0% instances), acl (12; 0% instances), nmod:tmod (12; 0% instances), appos (11; 0% instances), compound (10; 0% instances), advcl (6; 0% instances), amod (6; 0% instances), ccomp (5; 0% instances), xcomp (4; 0% instances), flat (1; 0% instances), nsubj:pass (1; 0% instances)
Parents of NUM nodes belong to 9 different parts of speech: NOUN (6201; 93% instances), VERB (171; 3% instances), PART (95; 1% instances), NUM (72; 1% instances), the artificial root, which has no part of speech (53; 1% instances), PROPN (26; 0% instances), X (24; 0% instances), ADJ (16; 0% instances), SYM (1; 0% instances)
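The percentages in these distributions are each count divided by the total number of NUM tokens, rounded to a whole percent. The sketch below reproduces the parent distribution from the counts listed above. The name `format_distribution` is hypothetical, the `ROOT` label stands in for the parent without a part of speech, and the use of Python's `round` is an assumption about how the original figures were rounded.

```python
def format_distribution(counts):
    """Format {label: count} as 'label (count; P% instances)', most frequent first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        f"{label} ({n}; {round(100 * n / total)}% instances)"
        for label, n in ranked
    ]


# Parent counts copied from the line above. ROOT is the artificial root;
# its 53 instances match the 53 NUM nodes attached with the root relation.
parents = {
    "NOUN": 6201, "VERB": 171, "PART": 95, "NUM": 72, "ROOT": 53,
    "PROPN": 26, "X": 24, "ADJ": 16, "SYM": 1,
}
print(sum(parents.values()))  # total number of NUM tokens
print(", ".join(format_distribution(parents)))
```

The counts sum to 6659, the number of NUM tokens, which is a quick consistency check on the list.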
4240 (64%) NUM nodes are leaves.
2230 (33%) NUM nodes have one child.
73 (1%) NUM nodes have two children.
116 (2%) NUM nodes have three or more children.
The highest child degree of a NUM node is 7.
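The child-degree figures above come from counting, for each NUM node, how many tokens in the same sentence name it as their head. A minimal sketch under that reading follows; `child_degree_buckets` is a hypothetical name, and sentences are assumed to be lists of (id, head, upos) triples already parsed from CoNLL-U.

```python
from collections import Counter


def child_degree_buckets(sentences, upos):
    """Return (Counter of degree buckets 0, 1, 2, 3 where 3 means '3 or more', max degree)."""
    buckets = Counter()
    max_degree = 0
    for sentence in sentences:
        # Number of children per node id within this sentence.
        degree = Counter(head for _tok_id, head, _tag in sentence)
        for tok_id, _head, tag in sentence:
            if tag == upos:
                d = degree[tok_id]
                max_degree = max(max_degree, d)
                buckets[min(d, 3)] += 1
    return buckets, max_degree


# Two made-up sentences: a leaf NUM, and a NUM root with two children.
sentences = [
    [(1, 2, "NUM"), (2, 3, "NOUN"), (3, 0, "VERB")],
    [(1, 0, "NUM"), (2, 1, "NOUN"), (3, 1, "PUNCT")],
]
buckets, max_degree = child_degree_buckets(sentences, "NUM")
print(dict(buckets), max_degree)
```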
Children of NUM nodes are attached using 24 different relations: clf (2033; 72% instances), punct (191; 7% instances), nmod (143; 5% instances), nsubj (97; 3% instances), cop (93; 3% instances), case (59; 2% instances), conj (57; 2% instances), cc (44; 2% instances), advmod (35; 1% instances), acl (22; 1% instances), det (15; 1% instances), parataxis (12; 0% instances), nummod (10; 0% instances), appos (8; 0% instances), nmod:tmod (7; 0% instances), obl (5; 0% instances), csubj (4; 0% instances), mark (2; 0% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), obj (1; 0% instances), xcomp (1; 0% instances)
Children of NUM nodes belong to 16 different parts of speech: NOUN (2233; 79% instances), PUNCT (191; 7% instances), AUX (93; 3% instances), PART (77; 3% instances), NUM (72; 3% instances), CCONJ (44; 2% instances), ADV (35; 1% instances), VERB (24; 1% instances), ADP (17; 1% instances), DET (15; 1% instances), PROPN (14; 0% instances), PRON (12; 0% instances), X (6; 0% instances), SYM (5; 0% instances), ADJ (3; 0% instances), SCONJ (2; 0% instances)