
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-GSDSimp: POS Tags: NUM

There are 1255 NUM lemmas (6%), 1255 NUM types (6%) and 6660 NUM tokens (5%). Out of 16 observed tags, the rank of NUM is: 4 in number of lemmas, 4 in number of types and 6 in number of tokens.
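The three counts above can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page: the sample sentences, `ROWS`, `SAMPLE` and `tag_counts` are hypothetical, and "types" is assumed to mean distinct word forms compared case-sensitively.

```python
# Hypothetical three-sentence sample (not taken from the treebank). Rows are
# whitespace-separated here and converted to tab-separated CoNLL-U below.
ROWS = [
    "1 一 一 NUM _ NumType=Card 3 nummod _ _",
    "2 个 个 NOUN _ _ 1 clf _ _",
    "3 人 人 NOUN _ _ 0 root _ _",
    "",
    "1 两 两 NUM _ NumType=Card 3 nummod _ _",
    "2 本 本 NOUN _ _ 1 clf _ _",
    "3 书 书 NOUN _ _ 0 root _ _",
    "",
    "1 一 一 NUM _ NumType=Card 2 nummod _ _",
    "2 天 天 NOUN _ _ 0 root _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def tag_counts(conllu_text, tag):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep regular token lines only: 10 columns and a plain integer ID.
        # This skips comments, blank lines, ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == tag:
            forms.add(cols[1])
            lemmas.add(cols[2])
            tokens += 1
    return len(lemmas), len(forms), tokens


print(tag_counts(SAMPLE, "NUM"))  # (2, 2, 3)
```

On the real treebank files, the same function would be expected to return the lemma, type and token counts quoted above, provided the counting conventions match.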

The 10 most frequent NUM lemmas: 一、 两、 三、 1、 第一、 3、 12、 5、 2、 8

The 10 most frequent NUM types: 一、 两、 三、 1、 第一、 3、 12、 5、 2、 8

The 10 most frequent ambiguous lemmas: 一 (NUM 1124, NOUN 1), 第一 (NUM 117, ADJ 1, PROPN 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 双 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 单 (NUM 26, PART 2), 半 (NUM 24, PART 6), 数 (NUM 22, PART 15), 九 (NUM 16, PROPN 2), 众多 (ADJ 8, NUM 8)

The 10 most frequent ambiguous types: 一 (NUM 1124, NOUN 1), 第一 (NUM 117, ADJ 1, PROPN 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 双 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 单 (NUM 26, PART 2), 半 (NUM 24, PART 6), 数 (NUM 22, PART 15), 九 (NUM 16, PROPN 2), 众多 (ADJ 8, NUM 8)
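A lemma counts as ambiguous here when it occurs with NUM and with at least one other tag. The sketch below illustrates one way to collect such lemmas; the sample data and the function name `ambiguous_lemmas` are hypothetical, and sorting by descending frequency under the given tag is an assumption based on the order of the lists above.

```python
from collections import Counter, defaultdict

# Hypothetical sample (not taken from the treebank): "多" appears once as NUM
# and once as ADJ, so it is ambiguous; "一" only ever appears as NUM.
ROWS = [
    "1 一 一 NUM _ NumType=Card 3 nummod _ _",
    "2 个 个 NOUN _ _ 1 clf _ _",
    "3 人 人 NOUN _ _ 0 root _ _",
    "",
    "1 多 多 NUM _ NumType=Card 2 nummod _ _",
    "2 人 人 NOUN _ _ 0 root _ _",
    "",
    "1 很 很 ADV _ _ 2 advmod _ _",
    "2 多 多 ADJ _ _ 0 root _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def ambiguous_lemmas(conllu_text, tag):
    """List (lemma, {upos: count}) for lemmas seen with `tag` and another tag."""
    tags_by_lemma = defaultdict(Counter)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # not a regular token line
        tags_by_lemma[cols[2]][cols[3]] += 1
    hits = [
        (lemma, dict(tags))
        for lemma, tags in tags_by_lemma.items()
        if tag in tags and len(tags) > 1
    ]
    # Assumed ordering: most frequent under `tag` first, ties by lemma.
    hits.sort(key=lambda item: (-item[1][tag], item[0]))
    return hits


print(ambiguous_lemmas(SAMPLE, "NUM"))
```

Keying on column 2 (FORM) instead of column 3 (LEMMA) would give the corresponding list of ambiguous types.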

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.004660).

No NUM lemma has more than one form. The first three lemmas in the ranking by number of forms are “-15” (-15), “-154” (-154) and “-300” (-300), each with 1 form.
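The form / lemma figures above can be illustrated with a short sketch. It assumes that the ratio is the average number of distinct forms per lemma and that ties in the ranking are broken by lemma string order; neither convention is confirmed by this page, and all names and sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample (not taken from the treebank).
ROWS = [
    "1 一 一 NUM _ NumType=Card 3 nummod _ _",
    "2 个 个 NOUN _ _ 1 clf _ _",
    "3 人 人 NOUN _ _ 0 root _ _",
    "",
    "1 两 两 NUM _ NumType=Card 3 nummod _ _",
    "2 本 本 NOUN _ _ 1 clf _ _",
    "3 书 书 NOUN _ _ 0 root _ _",
    "",
    "1 一 一 NUM _ NumType=Card 2 nummod _ _",
    "2 天 天 NOUN _ _ 0 root _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def forms_by_lemma(conllu_text, tag):
    """Map each lemma of the given UPOS tag to its set of distinct forms."""
    forms = defaultdict(set)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit() and cols[3] == tag:
            forms[cols[2]].add(cols[1])
    return forms


def form_lemma_ratio(forms):
    """Average number of distinct forms per lemma (assumed definition)."""
    if not forms:
        return 0.0
    return sum(len(fs) for fs in forms.values()) / len(forms)


def top_lemmas(forms, n=3):
    """Lemmas with the most forms; ties broken by lemma (assumed order)."""
    ranked = sorted(forms.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(lemma, sorted(fs)) for lemma, fs in ranked[:n]]


num_forms = forms_by_lemma(SAMPLE, "NUM")
print(form_lemma_ratio(num_forms))  # 1.0
print(top_lemmas(num_forms))
```

A ratio of exactly 1.0, as in the sample and in the treebank figure for NUM, means that no lemma of this tag was observed with more than one form.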

NUM occurs with 1 feature: NumType (6659; 100% instances)

NUM occurs with 2 feature-value pairs: NumType=Card, NumType=Ord

NUM occurs with 3 feature combinations. The most frequent feature combination is NumType=Card (6258 tokens). Examples: 一、 两、 三、 1、 3、 12、 5、 2、 8、 10

Relations

NUM nodes are attached to their parents using 17 different relations: nummod (6237; 94% instances), obj (61; 1% instances), conj (58; 1% instances), obl (58; 1% instances), root (53; 1% instances), nmod (51; 1% instances), parataxis (44; 1% instances), nsubj (30; 0% instances), acl (12; 0% instances), appos (12; 0% instances), nmod:tmod (12; 0% instances), compound (10; 0% instances), advcl (6; 0% instances), amod (6; 0% instances), ccomp (5; 0% instances), xcomp (4; 0% instances), nsubj:pass (1; 0% instances)

Parents of NUM nodes belong to 10 different parts of speech: NOUN (6201; 93% instances), VERB (171; 3% instances), PART (96; 1% instances), NUM (72; 1% instances), no parent [root] (53; 1% instances), PROPN (25; 0% instances), X (22; 0% instances), ADJ (16; 0% instances), DET (3; 0% instances), SYM (1; 0% instances)
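The two distributions above (incoming relation and parent part of speech) can be tallied sentence by sentence from the HEAD and DEPREL columns. This is a minimal sketch with hypothetical names and sample data; nodes whose HEAD is 0 have no parent token, so they are grouped under the made-up label `(root)`.

```python
from collections import Counter

# Hypothetical sample (not taken from the treebank).
ROWS = [
    "1 一 一 NUM _ NumType=Card 3 nummod _ _",
    "2 个 个 NOUN _ _ 1 clf _ _",
    "3 人 人 NOUN _ _ 0 root _ _",
    "",
    "1 三 三 NUM _ NumType=Card 0 root _ _",
    "",
    "1 两 两 NUM _ NumType=Card 3 nummod _ _",
    "2 本 本 NOUN _ _ 1 clf _ _",
    "3 书 书 NOUN _ _ 0 root _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def sentences(conllu_text):
    """Yield each sentence as a list of 10-column token rows."""
    sent = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
        elif not line.strip() and sent:
            yield sent  # a blank line ends the sentence
            sent = []
    if sent:
        yield sent


def attachment_stats(conllu_text, tag):
    """Count incoming relations and parent UPOS tags for nodes of `tag`."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences(conllu_text):
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            if cols[3] != tag:
                continue
            relations[cols[7]] += 1
            # HEAD "0" is not a token ID, so it falls back to "(root)".
            parent_pos[upos_by_id.get(cols[6], "(root)")] += 1
    return dict(relations), dict(parent_pos)


print(attachment_stats(SAMPLE, "NUM"))
```

In this sketch the number of `(root)` parents always equals the number of nodes with HEAD 0, which is consistent with the matching count of 53 for the root relation and for the parent entry without a part of speech.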

4239 (64%) NUM nodes are leaves.

2242 (34%) NUM nodes have one child.

61 (1%) NUM nodes have two children.

118 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.

Children of NUM nodes are attached using 25 different relations: clf (2033; 72% instances), punct (185; 7% instances), nmod (142; 5% instances), nsubj (97; 3% instances), cop (93; 3% instances), case (59; 2% instances), conj (57; 2% instances), cc (44; 2% instances), advmod (34; 1% instances), acl (22; 1% instances), det (15; 1% instances), parataxis (12; 0% instances), nummod (10; 0% instances), appos (8; 0% instances), nmod:tmod (7; 0% instances), obl (5; 0% instances), csubj (4; 0% instances), flat:foreign (4; 0% instances), mark (2; 0% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), obj (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: NOUN (2234; 79% instances), PUNCT (185; 7% instances), AUX (93; 3% instances), PART (77; 3% instances), NUM (72; 3% instances), CCONJ (44; 2% instances), ADV (34; 1% instances), VERB (24; 1% instances), ADP (17; 1% instances), DET (15; 1% instances), PROPN (13; 0% instances), PRON (12; 0% instances), X (10; 0% instances), SYM (4; 0% instances), ADJ (3; 0% instances), SCONJ (2; 0% instances)