
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese: POS Tags: NOUN

There are 8161 NOUN lemmas (36%), 8162 NOUN types (36%) and 34043 NOUN tokens (28%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
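Counts of this kind can be reproduced from a treebank in CoNLL-U format. The following is a minimal sketch, not the script that generated this page. It assumes that a "type" is a distinct word form, that a "lemma" is a distinct value of the LEMMA column, and that multiword-token ranges and empty nodes are skipped. The names `upos_counts` and `row` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict

def upos_counts(conllu_text):
    """Return {UPOS: (n_lemmas, n_types, n_tokens)} for CoNLL-U text."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID
        # (this drops ranges such as "1-2" and empty nodes such as "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

def row(idx, form, lemma, upos):
    """Build one CoNLL-U word line with placeholder values elsewhere."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "0", "dep", "_", "_"])

# Tiny made-up sample, not taken from the treebank.
sample = "\n".join([
    "# sent_id = demo",
    row(1, "人", "人", "NOUN"),
    row(2, "人們", "人", "NOUN"),
    row(3, "年", "年", "NOUN"),
    row(4, "年", "年", "NOUN"),
    row(5, "去", "去", "VERB"),
])
stats = upos_counts(sample)
```

In this sample the NOUN entry has 2 lemmas, 3 types and 4 tokens, which mirrors how the figures in the paragraph above relate to one another.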

The 10 most frequent NOUN lemmas: 年、 個、 月、 人、 日、 等、 種、 次、 人口、 名

The 10 most frequent NOUN types: 年、 個、 月、 日、 人、 等、 種、 次、 人口、 名

The 10 most frequent ambiguous lemmas: 年 (NOUN 1558, PART 6), 月 (NOUN 604, PART 1), 人 (NOUN 385, PART 240, VERB 1), 日 (NOUN 382, PROPN 53, PART 7, NUM 2), 等 (NOUN 231, VERB 3, PART 1), 種 (NOUN 187, PART 5, VERB 1), 次 (NOUN 149, VERB 4, PART 3, NUM 1), 名 (NOUN 128, PART 6, VERB 3), 大學 (NOUN 120, PROPN 1), 世界 (NOUN 107, PROPN 1)

The 10 most frequent ambiguous types: 年 (NOUN 1558, PART 6), 月 (NOUN 604, PART 1), 日 (NOUN 382, PROPN 53, PART 7, NUM 2), 人 (NOUN 365, PART 240, VERB 1), 等 (NOUN 231, VERB 3, PART 1), 種 (NOUN 187, PART 5, VERB 1), 次 (NOUN 149, VERB 4, PART 3, NUM 1), 名 (NOUN 128, PART 6, VERB 3), 大學 (NOUN 120, PROPN 1), 世界 (NOUN 107, PROPN 1)

Morphology

The form / lemma ratio of NOUN is 1.000123 (the average of all parts of speech is 1.000266).

The highest number of forms (2) was observed with the lemma “人”: 人, 人們.

The second highest number of forms (1) was observed with the lemma “8.17”: 8.17.

The third highest number of forms (1) was observed with the lemma “m”: m.
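The ratio and the ranking above can be sketched as follows. This assumes that the form / lemma ratio is the number of distinct (form, lemma) pairs divided by the number of distinct lemmas, which is consistent with the 8162 types and 8161 lemmas reported for NOUN but is not confirmed by the page. The function name `form_lemma_stats` and the alphabetical tie-breaking are illustrative choices.

```python
from collections import defaultdict

def form_lemma_stats(pairs):
    """pairs: iterable of (form, lemma) for tokens of one part of speech."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    # Distinct (form, lemma) pairs divided by distinct lemmas.
    n_pairs = sum(len(forms) for forms in forms_by_lemma.values())
    ratio = n_pairs / len(forms_by_lemma)
    # Lemmas with the most forms first; ties broken by lemma string (assumption).
    ranked = sorted(forms_by_lemma.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return ratio, ranked

# Made-up token list: one lemma with two forms, one lemma with one form.
pairs = [("人", "人"), ("人們", "人"), ("年", "年"), ("年", "年")]
ratio, ranked = form_lemma_stats(pairs)

# Arithmetic check against the figures reported on this page.
reported = f"{8162 / 8161:.6f}"
```

Here `ratio` is 1.5 and the first ranked lemma is 人 with two forms. The arithmetic check formats 8162 / 8161 to six decimals, which gives the 1.000123 quoted above.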

NOUN occurs with 1 feature: Number (20; 0% instances)

NOUN occurs with 1 feature-value pair: Number=Plur

NOUN occurs with 2 feature combinations. The most frequent feature combination is _, i.e. no features (34023 tokens). Examples: 年、 個、 月、 日、 人、 等、 種、 次、 人口、 名

Relations

NOUN nodes are attached to their parents using 28 different relations: nmod (7974; 23% instances), obj (5756; 17% instances), nsubj (5410; 16% instances), clf (2244; 7% instances), case:suff (1954; 6% instances), obl (1916; 6% instances), det (1782; 5% instances), conj (1633; 5% instances), nmod:tmod (1549; 5% instances), appos (887; 3% instances), acl (738; 2% instances), advmod (627; 2% instances), root (542; 2% instances), dep (476; 1% instances), ccomp (203; 1% instances), nsubj:pass (151; 0% instances), xcomp (48; 0% instances), iobj (44; 0% instances), csubj (37; 0% instances), advcl (18; 0% instances), acl:relcl (16; 0% instances), amod (13; 0% instances), dislocated (9; 0% instances), nummod (7; 0% instances), mark (4; 0% instances), case:pref (2; 0% instances), orphan (2; 0% instances), mark:relcl (1; 0% instances)

Parents of NOUN nodes belong to 16 different parts of speech: VERB (14404; 42% instances), NOUN (13276; 39% instances), PART (3726; 11% instances), ADJ (807; 2% instances), PROPN (779; 2% instances), no tag, i.e. the artificial root node (542; 2% instances), NUM (226; 1% instances), ADP (141; 0% instances), X (101; 0% instances), PRON (18; 0% instances), ADV (13; 0% instances), SYM (4; 0% instances), DET (2; 0% instances), PUNCT (2; 0% instances), AUX (1; 0% instances), CCONJ (1; 0% instances)
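The two distributions above (incoming relation and parent part of speech) can be tallied in one pass over the dependency trees. The sketch below assumes each sentence is given as a list of hypothetical `(id, head, upos, deprel)` tuples and that a head of 0 denotes the artificial root, which has no tag and is recorded here as an empty string. The names `attachment_stats` and `percent` are illustrative, and the whole-number rounding of percentages is an assumption about how the page was produced.

```python
from collections import Counter

def attachment_stats(sentences, upos="NOUN"):
    """Count incoming relations and parent tags for nodes tagged `upos`."""
    relations = Counter()
    parent_tags = Counter()
    for sent in sentences:
        tag_of = {node_id: tag for node_id, _, tag, _ in sent}
        for _, head, tag, deprel in sent:
            if tag != upos:
                continue
            relations[deprel] += 1
            # Head 0 is the artificial root: no entry in tag_of, so "".
            parent_tags[tag_of.get(head, "")] += 1
    return relations, parent_tags

def percent(count, total):
    """Share of `total` as a whole-number percentage."""
    return round(100 * count / total)

# Two made-up sentences; the second has a NOUN as its root.
sentences = [
    [(1, 2, "NOUN", "nsubj"), (2, 0, "VERB", "root"),
     (3, 4, "NUM", "nummod"), (4, 2, "NOUN", "obj")],
    [(1, 0, "NOUN", "root")],
]
relations, parent_tags = attachment_stats(sentences)
```

Applied to the reported figures, `percent(7974, 34043)` gives 23 and `percent(151, 34043)` gives 0, which matches the "23% instances" and "0% instances" entries for nmod and nsubj:pass.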

12908 (38%) NOUN nodes are leaves.

10800 (32%) NOUN nodes have one child.

5580 (16%) NOUN nodes have two children.

4755 (14%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 19.
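The child-degree breakdown above (leaves, one child, two children, three or more) can be sketched as follows. The input format, a list of sentences made of hypothetical `(id, head, upos)` tuples, and the function name `child_degree_buckets` are assumptions made for this example.

```python
from collections import Counter

def child_degree_buckets(sentences, upos="NOUN"):
    """Bucket nodes tagged `upos` by number of children: 0, 1, 2, or 3+."""
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # Number of children per node = how often its ID appears as a head.
        degree = Counter(head for _, head, _ in sent)
        for node_id, _, tag in sent:
            if tag != upos:
                continue
            d = degree[node_id]
            max_degree = max(max_degree, d)
            buckets[min(d, 3)] += 1  # key 3 stands for "three or more"
    return buckets, max_degree

# One made-up sentence: node 1 is a leaf, node 2 has one child,
# node 4 has three children.
sentences = [[
    (1, 2, "NOUN"), (2, 3, "NOUN"), (3, 0, "VERB"), (4, 3, "NOUN"),
    (5, 4, "NUM"), (6, 4, "ADJ"), (7, 4, "PART"),
]]
buckets, max_degree = child_degree_buckets(sentences)

# Consistency check on the reported figures: the four buckets
# should add up to the NOUN token count.
reported_total = 12908 + 10800 + 5580 + 4755
```

The four reported bucket sizes add up to 34043, the total number of NOUN tokens, so every NOUN node falls into exactly one bucket.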

Children of NOUN nodes are attached using 34 different relations: nmod (10444; 25% instances), nummod (6100; 15% instances), det (3841; 9% instances), punct (3159; 8% instances), clf (2231; 5% instances), case (2088; 5% instances), case:dec (1788; 4% instances), amod (1759; 4% instances), conj (1625; 4% instances), acl:relcl (1506; 4% instances), acl (1439; 3% instances), cop (1181; 3% instances), nsubj (1115; 3% instances), cc (984; 2% instances), case:pref (633; 2% instances), appos (477; 1% instances), dep (402; 1% instances), advmod (272; 1% instances), mark (80; 0% instances), csubj (67; 0% instances), nmod:tmod (47; 0% instances), dislocated (37; 0% instances), advcl (32; 0% instances), case:suff (32; 0% instances), ccomp (19; 0% instances), mark:relcl (17; 0% instances), aux (10; 0% instances), obj (10; 0% instances), xcomp (9; 0% instances), discourse (7; 0% instances), aux:caus (2; 0% instances), orphan (2; 0% instances), case:aspect (1; 0% instances), mark:advb (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (13276; 32% instances), NUM (6213; 15% instances), PART (4169; 10% instances), ADP (3216; 8% instances), PROPN (3200; 8% instances), PUNCT (3139; 8% instances), VERB (2054; 5% instances), ADJ (1739; 4% instances), AUX (1192; 3% instances), DET (1117; 3% instances), CCONJ (981; 2% instances), PRON (589; 1% instances), ADV (270; 1% instances), X (244; 1% instances), SYM (18; 0% instances)