
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-PUD: POS Tags: NOUN

There are 2218 NOUN lemmas (38%), 2219 NOUN types (38%) and 5413 NOUN tokens (25%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: 年、 個、 人、 月、 次、 公司、 世紀、 地區、 政府、 工作

The 10 most frequent NOUN types: 年、 個、 人、 月、 次、 公司、 世紀、 地區、 政府、 工作

The 10 most frequent ambiguous lemmas: 人 (NOUN 105, PART 3), 工作 (NOUN 26, VERB 5), 位 (NOUN 22, VERB 14), 影響 (NOUN 15, VERB 6), 研究 (NOUN 12, VERB 2), 天 (NOUN 10, NUM 1), 發展 (NOUN 10, VERB 3), 家 (NOUN 9, PART 1), 計劃 (NOUN 9, VERB 1), 可能 (AUX 24, NOUN 8, VERB 1)

The 10 most frequent ambiguous types: 人 (NOUN 91, PART 3), 工作 (NOUN 26, VERB 5), 位 (NOUN 22, VERB 14), 影響 (NOUN 15, VERB 6), 研究 (NOUN 12, VERB 2), 天 (NOUN 10, NUM 1), 發展 (NOUN 10, VERB 3), 家 (NOUN 9, PART 1), 計劃 (NOUN 9, VERB 1), 可能 (AUX 24, NOUN 8, VERB 1)

Morphology

The form / lemma ratio of NOUN is 1.000451 (the average of all parts of speech is 1.006233).
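The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given above. The following is a minimal check under that assumption; the variable names are illustrative and not part of any UD tool.

```python
# Counts copied from the statistics on this page.
noun_types = 2219   # distinct NOUN word forms
noun_lemmas = 2218  # distinct NOUN lemmas

# Assumption: the form / lemma ratio is types divided by lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.000451
```

A ratio this close to 1 means that almost every NOUN lemma appears in exactly one form.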

The 1st highest number of forms (2) was observed with the lemma “人”: 人, 人們.

The 2nd highest number of forms (1) was observed with the lemma “一帶”: 一帶.

The 3rd highest number of forms (1) was observed with the lemma “一會”: 一會.

NOUN occurs with 1 feature: Number (14; 0% instances)

NOUN occurs with 1 feature-value pair: Number=Plur

NOUN occurs with 2 feature combinations. The most frequent feature combination is _ (5399 tokens). Examples: 年、 個、 人、 月、 次、 公司、 世紀、 地區、 政府、 工作
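Feature combinations of this kind can be tallied from the FEATS column of a CoNLL-U file, where `_` marks an empty feature set and multiple features are joined with `|`. The sketch below uses a short hypothetical list of FEATS values rather than the treebank itself; `parse_feats` and `feats_column` are illustrative names.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; "_" means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Hypothetical FEATS values for a handful of NOUN tokens (not real data).
feats_column = ["_", "_", "Number=Plur", "_"]

combinations = Counter(feats_column)
features = Counter(name for f in feats_column for name in parse_feats(f))

print(combinations.most_common())  # [('_', 3), ('Number=Plur', 1)]
print(features.most_common())      # [('Number', 1)]
```

On this page the same tally gives two combinations: `_` on 5399 tokens and `Number=Plur` on the remaining 14 of the 5413 NOUN tokens.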

Relations

NOUN nodes are attached to their parents using 23 different relations: compound (1200; 22% instances), obj (1186; 22% instances), nsubj (1002; 19% instances), obl (536; 10% instances), nmod (360; 7% instances), clf (357; 7% instances), conj (236; 4% instances), obl:tmod (210; 4% instances), root (59; 1% instances), appos (53; 1% instances), dep (50; 1% instances), nsubj:pass (40; 1% instances), ccomp (33; 1% instances), obl:patient (29; 1% instances), xcomp (20; 0% instances), obl:agent (14; 0% instances), iobj (9; 0% instances), advcl (7; 0% instances), acl:relcl (4; 0% instances), acl (3; 0% instances), amod (2; 0% instances), csubj (2; 0% instances), vocative (1; 0% instances)

Parents of NOUN nodes belong to 11 different parts of speech (counting the untagged artificial root as one): VERB (2985; 55% instances), NOUN (2041; 38% instances), ADJ (139; 3% instances), PROPN (77; 1% instances), no tag, i.e. the artificial root (59; 1% instances), ADP (42; 1% instances), NUM (26; 0% instances), PART (19; 0% instances), PRON (14; 0% instances), X (9; 0% instances), DET (2; 0% instances)
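Counts like the ones above can be derived from the HEAD and DEPREL columns of a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the two sentences in `SAMPLE` are hypothetical, and the function names are illustrative. A token whose HEAD is 0 attaches to the artificial root, which has no part-of-speech tag; that is consistent with the 59 untagged parents matching the 59 `root` relations.

```python
from collections import Counter

# Hypothetical CoNLL-U fragment (10 tab-separated columns per token):
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = "\n".join([
    "# hypothetical sentence 1, not taken from the treebank",
    "1\t政府\t政府\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\t發展\t發展\tVERB\t_\t_\t0\troot\t_\t_",
    "3\t地區\t地區\tNOUN\t_\t_\t4\tcompound\t_\t_",
    "4\t工作\t工作\tNOUN\t_\t_\t2\tobj\t_\t_",
    "",
    "# hypothetical sentence 2",
    "1\t公司\t公司\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
])

def read_sentences(text):
    """Yield each sentence as a list of column lists."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if cols[0].isdigit():  # skip multiword ranges (1-2) and empty nodes (1.1)
            sentence.append(cols)
    if sentence:
        yield sentence

def noun_attachment_stats(text):
    """Count relations and parent tags for all NOUN tokens."""
    relations, parent_pos = Counter(), Counter()
    for sentence in read_sentences(text):
        upos_by_id = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            if cols[3] != "NOUN":
                continue
            relations[cols[7]] += 1
            # HEAD 0 is the artificial root, which has no tag: record "".
            parent_pos[upos_by_id.get(cols[6], "")] += 1
    return relations, parent_pos

relations, parent_pos = noun_attachment_stats(SAMPLE)
print(dict(relations))
print(dict(parent_pos))
```

Looking up the parent by ID within each sentence keeps the sketch independent of any parsing library; a real pipeline would more likely use an existing CoNLL-U reader.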

1707 (32%) NOUN nodes are leaves.

1834 (34%) NOUN nodes have one child.

1000 (18%) NOUN nodes have two children.

872 (16%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 10.
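As a quick consistency check, the four child-degree counts above should add up to the 5413 NOUN tokens, and the percentages should follow from rounding to whole numbers. The sketch below assumes ordinary rounding to the nearest integer; the dictionary keys are illustrative labels.

```python
noun_tokens = 5413  # total NOUN tokens reported on this page

degree_counts = {
    "leaves": 1707,
    "one child": 1834,
    "two children": 1000,
    "three or more children": 872,
}

# Every NOUN node falls into exactly one of the four groups.
total = sum(degree_counts.values())

# Assumption: percentages are rounded to the nearest whole number.
percentages = {
    label: round(100 * count / noun_tokens)
    for label, count in degree_counts.items()
}

print(total)        # 5413
print(percentages)  # {'leaves': 32, 'one child': 34, 'two children': 18, 'three or more children': 16}
```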

Children of NOUN nodes are attached using 31 different relations: compound (1598; 22% instances), case (844; 12% instances), nummod (794; 11% instances), nmod (640; 9% instances), punct (585; 8% instances), amod (399; 6% instances), acl:relcl (398; 6% instances), clf (332; 5% instances), det (319; 4% instances), case:loc (297; 4% instances), conj (222; 3% instances), cc (175; 2% instances), cop (143; 2% instances), appos (142; 2% instances), nsubj (118; 2% instances), advmod (59; 1% instances), dep (37; 1% instances), flat:name (28; 0% instances), acl (19; 0% instances), csubj (17; 0% instances), flat (7; 0% instances), obl (7; 0% instances), aux (6; 0% instances), obj (5; 0% instances), xcomp (5; 0% instances), dislocated (4; 0% instances), mark (4; 0% instances), discourse:sp (2; 0% instances), mark:rel (2; 0% instances), obl:tmod (2; 0% instances), ccomp (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (2041; 28% instances), ADP (846; 12% instances), NUM (805; 11% instances), PROPN (623; 9% instances), PUNCT (585; 8% instances), VERB (530; 7% instances), ADJ (406; 6% instances), PART (366; 5% instances), DET (322; 4% instances), PRON (212; 3% instances), CCONJ (175; 2% instances), AUX (149; 2% instances), X (90; 1% instances), ADV (59; 1% instances), SCONJ (2; 0% instances)