
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-Beginner: POS Tags: NOUN

There are 698 NOUN lemmas (40%), 698 NOUN types (40%) and 4085 NOUN tokens (20%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
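As a rough illustration of how counts like these could be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens per UPOS tag from CoNLL-U text, then ranks a tag on any of the three. The names `upos_counts` and `rank` and the toy sentence are invented for illustration; this is not the script that generated the page.

```python
from collections import defaultdict

# Invented toy sentence in CoNLL-U column order; not taken from the treebank.
ROWS = [
    ("1", "我", "我", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "喜欢", "喜欢", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "这", "这", "DET", "_", "_", "5", "det", "_", "_"),
    ("4", "个", "个", "NOUN", "_", "_", "3", "clf", "_", "_"),
    ("5", "菜", "菜", "NOUN", "_", "_", "2", "obj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

def rank(counts, upos, index):
    """1-based rank of `upos` by lemmas (index 0), types (1) or tokens (2)."""
    ordered = sorted(counts, key=lambda u: counts[u][index], reverse=True)
    return ordered.index(upos) + 1

counts = upos_counts(SAMPLE)
print(counts["NOUN"], rank(counts, "NOUN", 2))
```

Ties are broken by first appearance here; the ranking reported on this page may resolve them differently.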

The 10 most frequent NOUN lemmas: 个、 家、 人、 老板、 点、 妈妈、 菜、 公司、 工作、 今天

The 10 most frequent NOUN types: 个、 家、 人、 老板、 点、 妈妈、 菜、 公司、 工作、 今天

The 10 most frequent ambiguous lemmas: 点 (NOUN 58, ADV 21, VERB 5, DET 4), 工作 (NOUN 50, VERB 17), 钱 (NOUN 48, PROPN 1), 些 (NOUN 43, DET 1), 班 (NOUN 40, VERB 2), 时候 (NOUN 39, ADP 1), 上 (VERB 48, NOUN 28, ADP 5), 半 (NOUN 28, NUM 11), 晚上 (NOUN 22, VERB 2), 气 (NOUN 20, VERB 1)

The 10 most frequent ambiguous types: 点 (NOUN 58, ADV 21, VERB 5, DET 4), 工作 (NOUN 50, VERB 17), 钱 (NOUN 48, PROPN 1), 些 (NOUN 43, DET 1), 班 (NOUN 40, VERB 2), 时候 (NOUN 39, ADP 1), 上 (VERB 48, NOUN 28, ADP 5), 半 (NOUN 28, NUM 11), 晚上 (NOUN 22, VERB 2), 气 (NOUN 20, VERB 1)

Morphology

The form / lemma ratio of NOUN is 1.000000 (the average of all parts of speech is 1.000000).
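The page does not define the ratio. A minimal sketch, assuming it is the mean number of distinct forms per lemma within one tag (an assumption consistent with 698 types over 698 lemmas giving 1.0), could look like this. The function name `form_lemma_ratio` is hypothetical.

```python
from collections import defaultdict

def form_lemma_ratio(pairs):
    """Mean number of distinct forms per lemma.

    `pairs` is an iterable of (form, lemma) tuples for a single UPOS tag.
    The definition of the ratio is an assumption, not taken from the page.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)

# Every form equals its lemma, so each lemma has exactly one form.
print(form_lemma_ratio([("个", "个"), ("家", "家"), ("个", "个")]))
```

Under this definition a ratio of exactly 1 means no lemma was observed with more than one surface form, which agrees with each of the top-ranked lemmas listed here having a single form.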

The 1st highest number of forms (1) was observed with the lemma “一个”: 一个.

The 2nd highest number of forms (1) was observed with the lemma “一会儿”: 一会儿.

The 3rd highest number of forms (1) was observed with the lemma “一半”: 一半.

NOUN does not occur with any features.

Relations

NOUN nodes are attached to their parents using 20 different relations: obj (1337; 33% instances), nsubj (791; 19% instances), clf (535; 13% instances), nmod (350; 9% instances), root (285; 7% instances), obl:tmod (241; 6% instances), obl (197; 5% instances), obl:arg (136; 3% instances), conj (95; 2% instances), obl:lmod (37; 1% instances), parataxis (17; 0% instances), appos (14; 0% instances), vocative (11; 0% instances), ccomp (10; 0% instances), nsubj:outer (8; 0% instances), flat (6; 0% instances), advcl (5; 0% instances), compound:svc (4; 0% instances), xcomp (4; 0% instances), discourse (2; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: VERB (2238; 55% instances), NOUN (560; 14% instances), ADJ (392; 10% instances), [root] (285; 7% instances), DET (263; 6% instances), NUM (235; 6% instances), PRON (48; 1% instances), AUX (22; 1% instances), ADV (17; 0% instances), ADP (14; 0% instances), PROPN (7; 0% instances), PART (4; 0% instances)
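The two distributions above (incoming relation and parent part of speech) can be sketched as a single pass over each sentence. This is an illustrative sketch only: the function names and the simplified `(id, upos, head, deprel)` tuples are assumptions, and a node whose head is 0 is counted under an empty parent tag because the artificial root has no part of speech.

```python
from collections import Counter

def attachment_stats(sentences, upos="NOUN"):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`.

    `sentences` is a list of sentences, each a list of
    (id, upos, head, deprel) tuples with integer ids and heads.
    """
    relations, parent_pos = Counter(), Counter()
    for sent in sentences:
        pos_by_id = {tok_id: tok_upos for tok_id, tok_upos, _, _ in sent}
        for tok_id, tok_upos, head, deprel in sent:
            if tok_upos != upos:
                continue
            relations[deprel] += 1
            # Head 0 is the artificial root, which has no UPOS tag.
            parent_pos[pos_by_id.get(head, "")] += 1
    return relations, parent_pos

def as_percentages(counter):
    """Whole-number percentages, most frequent key first."""
    total = sum(counter.values())
    return {key: round(100 * n / total) for key, n in counter.most_common()}

# Invented toy data; not taken from the treebank.
toy = [
    [(1, "PRON", 2, "nsubj"), (2, "VERB", 0, "root"), (3, "DET", 5, "det"),
     (4, "NOUN", 3, "clf"), (5, "NOUN", 2, "obj")],
    [(1, "NOUN", 0, "root")],
]
rels, parents = attachment_stats(toy)
print(as_percentages(rels), as_percentages(parents))
```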

2025 (50%) NOUN nodes are leaves.

1424 (35%) NOUN nodes have one child.

371 (9%) NOUN nodes have two children.

265 (6%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 8.
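The child-degree figures above amount to a histogram over the number of dependents per node. A minimal sketch, assuming sentences are given as simplified `(id, upos, head)` tuples (a hypothetical input shape chosen for brevity):

```python
from collections import Counter

def child_degree_histogram(sentences, upos="NOUN"):
    """Return Counter {number_of_children: node_count} for nodes tagged `upos`."""
    hist = Counter()
    for sent in sentences:
        # How many tokens name each id as their head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tok_upos, _ in sent:
            if tok_upos == upos:
                hist[n_children[tok_id]] += 1  # missing ids count as 0 children
    return hist

# Invented toy sentence; not taken from the treebank.
toy = [[(1, "PRON", 2), (2, "VERB", 0), (3, "DET", 5), (4, "NOUN", 3), (5, "NOUN", 2)]]
hist = child_degree_histogram(toy)
print(hist[0], hist[1], max(hist))
```

In this representation, leaves are `hist[0]`, "three or more children" is the sum over keys of 3 and above, and the highest child degree is `max(hist)`.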

Children of NOUN nodes are attached using 32 different relations: nmod (846; 25% instances), nummod (446; 13% instances), det (318; 9% instances), punct (298; 9% instances), case (286; 9% instances), nsubj (168; 5% instances), cop (154; 5% instances), amod (150; 4% instances), advmod (119; 4% instances), acl (104; 3% instances), conj (97; 3% instances), cc (66; 2% instances), dep (59; 2% instances), parataxis (55; 2% instances), clf (40; 1% instances), obl:tmod (27; 1% instances), discourse (24; 1% instances), discourse:sp (20; 1% instances), obj (13; 0% instances), appos (9; 0% instances), obl:arg (8; 0% instances), aux (7; 0% instances), flat (7; 0% instances), advcl (6; 0% instances), csubj (6; 0% instances), ccomp (5; 0% instances), obl:lmod (5; 0% instances), mark (3; 0% instances), nsubj:outer (2; 0% instances), compound (1; 0% instances), fixed (1; 0% instances), obl (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (560; 17% instances), PRON (489; 15% instances), NUM (474; 14% instances), DET (345; 10% instances), PUNCT (298; 9% instances), ADJ (212; 6% instances), PART (183; 5% instances), VERB (177; 5% instances), ADP (163; 5% instances), AUX (162; 5% instances), ADV (133; 4% instances), PROPN (83; 2% instances), CCONJ (51; 2% instances), SCONJ (19; 1% instances), INTJ (2; 0% instances)
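As a quick arithmetic check on the two child breakdowns above, both lists should describe the same set of dependents and therefore sum to the same total. The counts below are copied from this page; the variable names are only illustrative.

```python
# Counts per relation of children of NOUN nodes, in the order listed above.
child_relation_counts = [
    846, 446, 318, 298, 286, 168, 154, 150, 119, 104, 97, 66, 59, 55, 40, 27,
    24, 20, 13, 9, 8, 7, 7, 6, 6, 5, 5, 3, 2, 1, 1, 1,
]
# Counts per part of speech of children of NOUN nodes, in the order listed above.
child_pos_counts = [
    560, 489, 474, 345, 298, 212, 183, 177, 163, 162, 133, 83, 51, 19, 2,
]

total_by_relation = sum(child_relation_counts)
total_by_pos = sum(child_pos_counts)
print(len(child_relation_counts), len(child_pos_counts))
print(total_by_relation, total_by_pos, total_by_relation == total_by_pos)
```

The 560 NOUN children also match the 560 NOUN parents reported for NOUN nodes, as expected, since both figures count the same NOUN-to-NOUN edges.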