This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-HK: POS Tags: NOUN

There are 116 NOUN lemmas (25%), 142 NOUN types (25%) and 278 NOUN tokens (15%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 2 in number of types and 3 in number of tokens.

The 10 most frequent NOUN lemmas: _、 人、 個、 客人、 年、 以前、 歌、 現在、 夜總會、 檔

The 10 most frequent NOUN types: 人、 個、 爺爺、 家、 年、 客人、 以前、 歌、 現在、 閃卡

The 10 most frequent ambiguous lemmas: _ (VERB 114, PUNCT 111, NOUN 69, ADV 63, PART 54, PRON 49, ADJ 21, NUM 19, AUX 18, ADP 10, PROPN 10, DET 8, INTJ 5, SCONJ 1, X 1), 工作 (NOUN 2, VERB 1), 消費 (NOUN 1, VERB 1)

The 10 most frequent ambiguous types: 工作 (NOUN 2, VERB 1), 消費 (NOUN 1, VERB 1)

Morphology

The form / lemma ratio of NOUN is 1.224138 (the average of all parts of speech is 1.221258).
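The reported ratio is consistent with dividing the number of NOUN types (142) by the number of NOUN lemmas (116) given above. The sketch below reproduces that arithmetic; the function name `form_lemma_ratio` is hypothetical, and the assumption that the page computes the ratio exactly this way is inferred from the figures, not confirmed by the source.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the number of distinct forms (types) per distinct lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts taken from the NOUN summary on this page.
noun_ratio = form_lemma_ratio(142, 116)
print(f"{noun_ratio:.6f}")
```

Note that most of the 142 types share the placeholder lemma “_”, so the ratio here says more about missing lemma annotation than about noun inflection.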

The highest number of forms (38) was observed with the lemma “_” (the CoNLL-U placeholder for an unspecified lemma): 事, 人, 今晚, 個, 元, 分, 功課, 卡通片, 口邊, 品味, 哥哥, 塊, 女朋友, 好運, 姐姐, 媽媽, 家, 小朋友, 年, 張, 戰鬥力, 手, 斤, 明天, 晚上, 朋友, 歲, 爺爺, 物品, 運氣, 邏輯, 錄影機, 錢, 錯, 閃卡, 集, 雞翼, 零用錢.

The second highest number of forms (1) was observed with the lemma “70”: 70.

The third highest number of forms (1) was observed with the lemma “80”: 80.

NOUN does not occur with any features.

Relations

NOUN nodes are attached to their parents using 21 different relations: obj (97; 35% instances), nsubj (34; 12% instances), obl (23; 8% instances), clf (21; 8% instances), obl:tmod (19; 7% instances), root (19; 7% instances), conj (13; 5% instances), dislocated (9; 3% instances), vocative (9; 3% instances), nmod (7; 3% instances), parataxis (6; 2% instances), compound (5; 2% instances), appos (3; 1% instances), compound:vo (3; 1% instances), advcl (2; 1% instances), advmod:df (2; 1% instances), xcomp (2; 1% instances), ccomp (1; 0% instances), det (1; 0% instances), goeswith (1; 0% instances), obl:patient (1; 0% instances)
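Counts like these can be tallied from the DEPREL column of a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names `count_relations` and `share_table` are hypothetical, the three-token sentence is a toy example and not treebank data, and the rounding of percentages to whole numbers is an assumption.

```python
from collections import Counter


def count_relations(conllu_text: str, upos: str = "NOUN") -> Counter:
    """Count the DEPREL values of all tokens whose UPOS equals `upos`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges and empty nodes
        if cols[3] == upos:
            counts[cols[7]] += 1  # column 8 (index 7) is DEPREL
    return counts


def share_table(counts: Counter) -> list:
    """Return (relation, count, rounded percentage), most frequent first."""
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Toy sentence for illustration only.
rows = [
    ["1", "我", "我", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "唱", "唱", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "歌", "歌", "NOUN", "_", "_", "2", "obj", "_", "_"],
]
toy_conllu = "# text = 我唱歌\n" + "\n".join("\t".join(r) for r in rows) + "\n"

counts = count_relations(toy_conllu)
print(share_table(counts))
```

Changing the `upos` argument gives the same tally for any other part of speech.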

Parents of NOUN nodes belong to 9 different parts of speech: VERB (190; 68% instances), NOUN (34; 12% instances), no parent, i.e. root (19; 7% instances), ADJ (11; 4% instances), DET (10; 4% instances), NUM (8; 3% instances), PRON (3; 1% instances), PROPN (2; 1% instances), AUX (1; 0% instances)

115 (41%) NOUN nodes are leaves.

89 (32%) NOUN nodes have one child.

43 (15%) NOUN nodes have two children.

31 (11%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 7.
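The child-degree figures above account for every NOUN token: 115 + 89 + 43 + 31 = 278. As a rough sketch of how such a distribution could be derived from the HEAD column, assuming each sentence is already available as a list of `(id, upos, head)` tuples (a hypothetical in-memory format, with a toy sentence and not treebank data):

```python
from collections import Counter


def child_degree_histogram(sentences, upos="NOUN"):
    """Map number of children -> number of `upos` nodes with that many children.

    `sentences` is a list of sentences; each sentence is a list of
    (token_id, upos, head_id) tuples, with head_id 0 meaning the root.
    """
    hist = Counter()
    for sent in sentences:
        # How many tokens in this sentence name each id as their head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1
    return hist


# Toy sentence: token 3 (NOUN) has two children, token 2 (NOUN) is a leaf.
toy = [(1, "DET", 3), (2, "NOUN", 3), (3, "NOUN", 4), (4, "VERB", 0)]
hist = child_degree_histogram([toy])
print(dict(hist))

# Sanity check on the figures reported on this page.
reported = {"0": 115, "1": 89, "2": 43, "3+": 31}
assert sum(reported.values()) == 278
```

The maximum key of the histogram corresponds to the highest child degree.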

Children of NOUN nodes are attached using 22 different relations: punct (57; 19% instances), det (44; 15% instances), nummod (33; 11% instances), case (30; 10% instances), nmod (22; 7% instances), conj (21; 7% instances), acl (18; 6% instances), cop (16; 5% instances), advmod (11; 4% instances), amod (9; 3% instances), nsubj (9; 3% instances), compound (7; 2% instances), parataxis (6; 2% instances), appos (4; 1% instances), clf (3; 1% instances), obl (3; 1% instances), cc (2; 1% instances), case:loc (1; 0% instances), discourse:sp (1; 0% instances), goeswith (1; 0% instances), mark (1; 0% instances), obl:tmod (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: PUNCT (57; 19% instances), DET (43; 14% instances), VERB (40; 13% instances), NUM (38; 13% instances), NOUN (34; 11% instances), ADP (24; 8% instances), PRON (23; 8% instances), ADV (11; 4% instances), ADJ (10; 3% instances), PART (9; 3% instances), PROPN (7; 2% instances), CCONJ (2; 1% instances), SCONJ (1; 0% instances), SYM (1; 0% instances)