This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-GSD: POS Tags: NOUN

There are 8159 NOUN lemmas (36%), 8160 NOUN types (36%) and 34044 NOUN tokens (28%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: 年、 個、 月、 人、 日、 等、 種、 次、 人口、 名

The 10 most frequent NOUN types: 年、 個、 月、 日、 人、 等、 種、 次、 人口、 名

The 10 most frequent ambiguous lemmas: 年 (NOUN 1558, PART 6), 月 (NOUN 604, PART 1), 人 (NOUN 385, PART 240, VERB 1), 日 (NOUN 382, PROPN 53, PART 7, NUM 2), 等 (NOUN 231, VERB 4, PART 1), 種 (NOUN 187, PART 5, VERB 1), 次 (NOUN 149, VERB 4, PART 3, NUM 1), 名 (NOUN 128, PART 6, VERB 3), 大學 (NOUN 120, PROPN 1), 世界 (NOUN 107, PROPN 1)

The 10 most frequent ambiguous types: 年 (NOUN 1558, PART 6), 月 (NOUN 604, PART 1), 日 (NOUN 382, PROPN 53, PART 7, NUM 2), 人 (NOUN 365, PART 240, VERB 1), 等 (NOUN 231, VERB 3, PART 1), 種 (NOUN 187, PART 5, VERB 1), 次 (NOUN 149, VERB 4, PART 3, NUM 1), 名 (NOUN 128, PART 6, VERB 3), 大學 (NOUN 120, PROPN 1), 世界 (NOUN 107, PROPN 1)

Morphology

The form / lemma ratio of NOUN is 1.000123 (the average of all parts of speech is 1.004732).
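The reported ratio matches a simple division of the NOUN type count by the NOUN lemma count given at the top of this page. The sketch below reproduces it under that assumption; the variable names are illustrative and not taken from the statistics script.

```python
# Counts copied from the summary line of this page.
noun_types = 8160   # distinct NOUN word forms
noun_lemmas = 8159  # distinct NOUN lemmas

# Assumption: the form / lemma ratio is types divided by lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # prints 1.000123
```

A ratio this close to 1 fits the observation that only one lemma, 人, has more than one form.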

The 1st highest number of forms (2) was observed with the lemma “人”: 人, 人們.

The 2nd highest number of forms (1) was observed with the lemma “m”: m.

The 3rd highest number of forms (1) was observed with the lemma “n=1”: n=1.

NOUN occurs with 1 feature: Number (20; 0% instances)

NOUN occurs with 1 feature-value pair: Number=Plur

NOUN occurs with 2 feature combinations. The most frequent feature combination is _ (34024 tokens). Examples: 年、 個、 月、 日、 人、 等、 種、 次、 人口、 名

Relations

NOUN nodes are attached to their parents using 27 different relations: nmod (9825; 29% instances), obj (5674; 17% instances), nsubj (5518; 16% instances), obl (2581; 8% instances), clf (2247; 7% instances), compound (1954; 6% instances), conj (1659; 5% instances), nmod:tmod (1555; 5% instances), root (571; 2% instances), acl (570; 2% instances), appos (516; 2% instances), parataxis (398; 1% instances), advcl (227; 1% instances), ccomp (193; 1% instances), nsubj:pass (159; 0% instances), obl:patient (141; 0% instances), xcomp (78; 0% instances), iobj (49; 0% instances), obl:agent (44; 0% instances), csubj (34; 0% instances), acl:relcl (15; 0% instances), amod (10; 0% instances), dislocated (10; 0% instances), nsubj:outer (6; 0% instances), nummod (6; 0% instances), case (2; 0% instances), orphan (2; 0% instances)
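As a consistency check, the relation counts above can be summed and converted back into percentages. This is a minimal sketch that assumes each percentage is the count divided by the NOUN token total, rounded to the nearest integer; the dictionary name is illustrative.

```python
# Counts copied from the relation list above (NOUN node -> its parent).
relation_counts = {
    "nmod": 9825, "obj": 5674, "nsubj": 5518, "obl": 2581, "clf": 2247,
    "compound": 1954, "conj": 1659, "nmod:tmod": 1555, "root": 571,
    "acl": 570, "appos": 516, "parataxis": 398, "advcl": 227, "ccomp": 193,
    "nsubj:pass": 159, "obl:patient": 141, "xcomp": 78, "iobj": 49,
    "obl:agent": 44, "csubj": 34, "acl:relcl": 15, "amod": 10,
    "dislocated": 10, "nsubj:outer": 6, "nummod": 6, "case": 2, "orphan": 2,
}

total = sum(relation_counts.values())
print(total)  # prints 34044, the number of NOUN tokens

# Assumption: percentages are rounded to whole numbers.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}
print(shares["nmod"], shares["obj"], shares["nsubj"])  # prints 29 17 16
```

Every NOUN token has exactly one incoming relation, so the counts add up to the token total.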

Parents of NOUN nodes belong to 14 different parts of speech: VERB (14745; 43% instances), NOUN (11211; 33% instances), PART (3699; 11% instances), NUM (2233; 7% instances), ADJ (762; 2% instances), the artificial root, which carries no tag (571; 2% instances), PROPN (468; 1% instances), DET (216; 1% instances), X (68; 0% instances), ADP (36; 0% instances), PRON (17; 0% instances), ADV (14; 0% instances), SYM (3; 0% instances), AUX (1; 0% instances)

14844 (44%) NOUN nodes are leaves.

8124 (24%) NOUN nodes have one child.

5424 (16%) NOUN nodes have two children.

5652 (17%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 14.
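The child-degree figures above could be derived from a CoNLL-U file roughly as follows. This is a sketch, not the script that generated this page: the function name `noun_child_degrees` and the toy sentence are hypothetical, and only basic (integer-ID) nodes are counted.

```python
from collections import Counter

def noun_child_degrees(conllu_text):
    """Bucket NOUN nodes by number of children (0, 1, 2, 3 = three or more)."""
    buckets = Counter()
    max_degree = 0
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        upos = {}
        n_children = Counter()
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            if not cols[0].isdigit():  # skip multiword ranges and empty nodes
                continue
            upos[cols[0]] = cols[3]    # column 4 is UPOS
            n_children[cols[6]] += 1   # column 7 is HEAD
        for node_id, tag in upos.items():
            if tag == "NOUN":
                degree = n_children[node_id]
                max_degree = max(max_degree, degree)
                buckets[min(degree, 3)] += 1
    return buckets, max_degree

# Hypothetical toy sentence, not taken from the treebank.
rows = [
    ["1", "三", "三", "NUM", "_", "_", "2", "nummod", "_", "_"],
    ["2", "個", "個", "NOUN", "_", "_", "3", "clf", "_", "_"],
    ["3", "人", "人", "NOUN", "_", "_", "4", "nsubj", "_", "_"],
    ["4", "看", "看", "VERB", "_", "_", "0", "root", "_", "_"],
    ["5", "書", "書", "NOUN", "_", "_", "4", "obj", "_", "_"],
]
toy = "\n".join("\t".join(r) for r in rows)

buckets, max_degree = noun_child_degrees(toy)
print(dict(buckets), max_degree)
```

In the toy sentence, 個 and 人 each have one child and 書 is a leaf, so the result has two NOUN nodes in bucket 1, one in bucket 0, and a highest child degree of 1.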

Children of NOUN nodes are attached using 32 different relations: nmod (12926; 31% instances), nummod (6104; 15% instances), case (5916; 14% instances), punct (4620; 11% instances), amod (1793; 4% instances), conj (1638; 4% instances), acl:relcl (1507; 4% instances), det (1319; 3% instances), cop (1154; 3% instances), nsubj (1098; 3% instances), cc (984; 2% instances), appos (865; 2% instances), acl (501; 1% instances), parataxis (357; 1% instances), advmod (207; 0% instances), mark (82; 0% instances), advcl (72; 0% instances), obl (68; 0% instances), csubj (60; 0% instances), nmod:tmod (53; 0% instances), dislocated (36; 0% instances), compound (33; 0% instances), ccomp (20; 0% instances), xcomp (17; 0% instances), mark:rel (15; 0% instances), obj (12; 0% instances), aux (10; 0% instances), discourse (8; 0% instances), nsubj:outer (2; 0% instances), orphan (2; 0% instances), mark:adv (1; 0% instances), obl:patient (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: NOUN (11211; 27% instances), NUM (6201; 15% instances), PUNCT (4620; 11% instances), PART (4374; 11% instances), PROPN (3456; 8% instances), ADP (3368; 8% instances), VERB (2100; 5% instances), ADJ (1726; 4% instances), AUX (1165; 3% instances), DET (1128; 3% instances), CCONJ (981; 2% instances), PRON (588; 1% instances), X (252; 1% instances), ADV (206; 0% instances), SCONJ (86; 0% instances), SYM (19; 0% instances)