Treebank Statistics: UD_Chinese-PUD: POS Tags: NOUN
There are 2218 NOUN lemmas (38%), 2219 NOUN types (38%) and 5413 NOUN tokens (25%).
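Counts like these can be derived from a CoNLL-U file by collecting the distinct lemmas, the distinct word forms (types) and the number of tokens carrying a given UPOS tag. The following is a minimal sketch, not the script used to produce this page; the function name `upos_counts` and the three-token sample are hypothetical and only illustrate the column layout.

```python
# Hypothetical three-token CoNLL-U fragment (illustrative, not from the treebank).
SAMPLE = """\
1\t人們\t人\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\t工作\t工作\tVERB\t_\t_\t0\troot\t_\t_
3\t人\t人\tNOUN\t_\t_\t2\tobj\t_\t_
"""


def upos_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
            tokens += 1
    return len(lemmas), len(types), tokens


print(upos_counts(SAMPLE, "NOUN"))
```

In the sample, the two forms 人 and 人們 share the lemma 人, so the type count exceeds the lemma count by one, the same pattern the treebank figures show.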
Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
The 10 most frequent NOUN lemmas: 年、 個、 人、 月、 次、 公司、 世紀、 地區、 政府、 工作
The 10 most frequent NOUN types: 年、 個、 人、 月、 次、 公司、 世紀、 地區、 政府、 工作
The 10 most frequent ambiguous lemmas: 人 (NOUN 105, PART 3), 工作 (NOUN 26, VERB 5), 位 (NOUN 22, VERB 14), 影響 (NOUN 15, VERB 6), 研究 (NOUN 12, VERB 2), 天 (NOUN 10, NUM 1), 發展 (NOUN 10, VERB 3), 家 (NOUN 9, PART 1), 計劃 (NOUN 9, VERB 1), 可能 (AUX 24, NOUN 8, VERB 1)
The 10 most frequent ambiguous types: 人 (NOUN 91, PART 3), 工作 (NOUN 26, VERB 5), 位 (NOUN 22, VERB 14), 影響 (NOUN 15, VERB 6), 研究 (NOUN 12, VERB 2), 天 (NOUN 10, NUM 1), 發展 (NOUN 10, VERB 3), 家 (NOUN 9, PART 1), 計劃 (NOUN 9, VERB 1), 可能 (AUX 24, NOUN 8, VERB 1)
Morphology
The form / lemma ratio of NOUN is 1.000451 (the average of all parts of speech is 1.006233).
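The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. That this is how the ratio is defined is an assumption; the short check below only shows that the figures agree.

```python
# Figures taken from this page; the formula itself is an assumption.
noun_types = 2219
noun_lemmas = 2218

ratio = noun_types / noun_lemmas
ratio_text = f"{ratio:.6f}"  # six decimal places, as on the page
print(ratio_text)
```

The single extra type comes from the lemma 人, the only NOUN lemma observed with two forms.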
The 1st highest number of forms (2) was observed with the lemma “人”: 人, 人們.
The 2nd highest number of forms (1) was observed with the lemma “一帶”: 一帶.
The 3rd highest number of forms (1) was observed with the lemma “一會”: 一會.
NOUN occurs with 1 feature: Number (14; 0% instances)
NOUN occurs with 1 feature-value pair: Number=Plur
NOUN occurs with 2 feature combinations.
The most frequent feature combination is _ (5399 tokens).
Examples: 年、 個、 人、 月、 次、 公司、 世紀、 地區、 政府、 工作
Relations
NOUN nodes are attached to their parents using 23 different relations: compound (1200; 22% instances), obj (1186; 22% instances), nsubj (1002; 19% instances), obl (536; 10% instances), nmod (360; 7% instances), clf (357; 7% instances), conj (236; 4% instances), obl:tmod (210; 4% instances), root (59; 1% instances), appos (53; 1% instances), dep (50; 1% instances), nsubj:pass (40; 1% instances), ccomp (33; 1% instances), obl:patient (29; 1% instances), xcomp (20; 0% instances), obl:agent (14; 0% instances), iobj (9; 0% instances), advcl (7; 0% instances), acl:relcl (4; 0% instances), acl (3; 0% instances), amod (2; 0% instances), csubj (2; 0% instances), vocative (1; 0% instances)
Parents of NOUN nodes belong to 11 different parts of speech: VERB (2985; 55% instances), NOUN (2041; 38% instances), ADJ (139; 3% instances), PROPN (77; 1% instances), [no tag, i.e. the artificial root] (59; 1% instances), ADP (42; 1% instances), NUM (26; 0% instances), PART (19; 0% instances), PRON (14; 0% instances), X (9; 0% instances), DET (2; 0% instances)
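One entry in the list above has a count but no tag; its count (59) equals the number of NOUN nodes attached with the root relation, whose head is the artificial root node and therefore has no part of speech. The sketch below shows how such a tally might be computed for a single sentence. The function name `parent_upos_counts` and the two-token sample are hypothetical; a full tool would process the file sentence by sentence, since token IDs restart in each sentence.

```python
from collections import Counter

# Hypothetical single-sentence CoNLL-U fragment (illustrative only).
SAMPLE = (
    "1\t政府\t政府\tNOUN\t_\t_\t2\tcompound\t_\t_\n"
    "2\t工作\t工作\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def parent_upos_counts(sentence, upos):
    """Count the UPOS tags of the heads of all nodes tagged `upos` in one sentence."""
    rows = [ln.split("\t") for ln in sentence.splitlines()
            if ln and not ln.startswith("#")]
    rows = [r for r in rows if r[0].isdigit()]  # plain tokens only
    tag_by_id = {r[0]: r[3] for r in rows}
    counts = Counter()
    for r in rows:
        if r[3] == upos:
            # HEAD is column 7; head "0" is the artificial root and has no tag,
            # so it is counted under the empty string.
            counts[tag_by_id.get(r[6], "")] += 1
    return counts


print(parent_upos_counts(SAMPLE, "NOUN"))
```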
1707 (32%) NOUN nodes are leaves.
1834 (34%) NOUN nodes have one child.
1000 (18%) NOUN nodes have two children.
872 (16%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 10.
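The four child-degree buckets partition the NOUN tokens, so their counts should add up to the 5413 NOUN tokens reported at the top of the page. A small consistency check, using only the figures from this page (the variable names are illustrative):

```python
# Child-degree buckets for NOUN nodes, as reported on this page.
buckets = {
    "leaves": 1707,
    "one child": 1834,
    "two children": 1000,
    "three or more children": 872,
}

total = sum(buckets.values())
# Whole-number percentages, rounded as on the page.
percentages = {name: round(100 * n / total) for name, n in buckets.items()}

print(total)
print(percentages)
```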
Children of NOUN nodes are attached using 31 different relations: compound (1598; 22% instances), case (844; 12% instances), nummod (794; 11% instances), nmod (640; 9% instances), punct (585; 8% instances), amod (399; 6% instances), acl:relcl (398; 6% instances), clf (332; 5% instances), det (319; 4% instances), case:loc (297; 4% instances), conj (222; 3% instances), cc (175; 2% instances), cop (143; 2% instances), appos (142; 2% instances), nsubj (118; 2% instances), advmod (59; 1% instances), dep (37; 1% instances), flat:name (28; 0% instances), acl (19; 0% instances), csubj (17; 0% instances), flat (7; 0% instances), obl (7; 0% instances), aux (6; 0% instances), obj (5; 0% instances), xcomp (5; 0% instances), dislocated (4; 0% instances), mark (4; 0% instances), discourse:sp (2; 0% instances), mark:rel (2; 0% instances), obl:tmod (2; 0% instances), ccomp (1; 0% instances)
Children of NOUN nodes belong to 15 different parts of speech: NOUN (2041; 28% instances), ADP (846; 12% instances), NUM (805; 11% instances), PROPN (623; 9% instances), PUNCT (585; 8% instances), VERB (530; 7% instances), ADJ (406; 6% instances), PART (366; 5% instances), DET (322; 4% instances), PRON (212; 3% instances), CCONJ (175; 2% instances), AUX (149; 2% instances), X (90; 1% instances), ADV (59; 1% instances), SCONJ (2; 0% instances)