
This page pertains to UD version 2.

Treebank Statistics: UD_Cantonese-HK: POS Tags: PRON

There are 29 PRON lemmas (3%), 34 PRON types (2%) and 1183 PRON tokens (8%). Out of 15 observed tags, the rank of PRON is: 8 in number of lemmas, 10 in number of types and 6 in number of tokens.
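As an illustration of how counts like these can be derived, the following is a minimal sketch that counts distinct lemmas, distinct word forms (types) and tokens for one tag in CoNLL-U text. The sample sentences and the function name `pos_counts` are hypothetical and are not taken from the treebank or from the tool that generated this page.

```python
# Hypothetical two-sentence CoNLL-U fragment (columns are tab-separated).
SAMPLE = """\
1\t我\t我\tPRON\t_\t_\t2\tnsubj\t_\t_
2\t鍾意\t鍾意\tVERB\t_\t_\t0\troot\t_\t_
3\t你\t你\tPRON\t_\t_\t2\tobj\t_\t_
4\t。\t。\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\t我\t我\tPRON\t_\t_\t2\tnsubj\t_\t_
2\t走\t走\tVERB\t_\t_\t0\troot\t_\t_
"""


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        if cols[3] == upos:
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
            tokens += 1
    return len(lemmas), len(types), tokens


n_lemmas, n_types, n_tokens = pos_counts(SAMPLE, "PRON")
print(n_lemmas, n_types, n_tokens)
```

On the sample above, 我 occurs twice and 你 once, so the sketch reports two lemmas, two types and three tokens.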

The 10 most frequent PRON lemmas: _、 我、 你、 佢、 我哋、 咩、 佢哋、 自己、 人哋、 呢度

The 10 most frequent PRON types: 我、 你、 佢、 我哋、 大家、 自己、 佢哋、 嗰度、 呢個、 咩

The 10 most frequent ambiguous lemmas: _ (PUNCT 1377, VERB 1352, NOUN 1283, ADV 853, PART 764, PRON 662, AUX 335, DET 217, ADJ 209, ADP 140, NUM 124, SCONJ 101, CCONJ 93, INTJ 92, PROPN 52), 咩 (DET 15, PRON 13, PART 10), 呢啲 (PRON 6, DET 2), 嗰啲 (PRON 5, DET 2), 嗰時 (PRON 4, ADP 3), 邊 (PRON 4, ADV 1), 乜 (DET 2, PRON 2), 呢個 (PART 2, PRON 1), 幾多 (DET 4, PRON 1)

The 10 most frequent ambiguous types: 佢 (PRON 106, VERB 2), 呢個 (PRON 20, DET 16, PART 2), 咩 (PRON 17, DET 15, PART 10), 依個 (DET 25, PRON 9), 呢啲 (DET 7, PRON 7), 嗰啲 (PRON 7, DET 3), 嗰個 (DET 14, PRON 5), 邊 (PRON 5, ADV 1, DET 1), 嗰時 (PRON 4, ADP 3), 乜 (DET 2, PRON 2)

Morphology

The form / lemma ratio of PRON is 1.172414 (the average of all parts of speech is 1.624294).
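The reported ratio is consistent with dividing the number of PRON types (34) by the number of PRON lemmas (29) given above. A quick check, assuming that this is how the ratio is defined:

```python
# Figures quoted on this page for PRON.
pron_types = 34
pron_lemmas = 29

# Assumed definition: distinct forms divided by distinct lemmas.
ratio = pron_types / pron_lemmas
formatted = f"{ratio:.6f}"
print(formatted)  # 1.172414
```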

The highest number of forms (26) was observed with the lemma “_”: 乜嘢, 人哋, 什麼, 你, 你哋, 佢, 佢哋, 依個, 依啲, 依度, 呢個, 呢啲, 呢度, 咩, 嗰個, 嗰啲, 嗰度, 大家, 我, 我們, 我哋, 自己, 講講, 邊, 邊個, 閣下.

The 2nd highest number of forms (1) was observed with the lemma “一個二個”: 一個二個.

The 3rd highest number of forms (1) was observed with the lemma “乜”: 乜.

PRON does not occur with any features.

Relations

PRON nodes are attached to their parents using 21 different relations: nsubj (680; 57% instances), obj (202; 17% instances), nmod (115; 10% instances), reparandum (32; 3% instances), obl (31; 3% instances), appos (23; 2% instances), compound (19; 2% instances), det (14; 1% instances), iobj (12; 1% instances), root (9; 1% instances), obj:periph (8; 1% instances), nsubj:periph (7; 1% instances), advcl (6; 1% instances), case:loc (6; 1% instances), conj (6; 1% instances), dislocated (4; 0% instances), obl:tmod (3; 0% instances), discourse (2; 0% instances), vocative (2; 0% instances), amod (1; 0% instances), obl:agent (1; 0% instances)
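The percentages in this list agree with each relation count divided by the 1183 PRON tokens and rounded to the nearest whole percent. A small sketch that reproduces a few of them (the helper name `pct` is hypothetical, and the rounding rule is an assumption inferred from the figures):

```python
TOTAL_PRON_TOKENS = 1183  # token count quoted at the top of this page

# A few relation counts copied from the list above.
relation_counts = {"nsubj": 680, "obj": 202, "nmod": 115, "advcl": 6, "dislocated": 4}


def pct(count, total):
    """Share of total as a whole-number percentage (assumed rounding rule)."""
    return round(100 * count / total)


shares = {rel: pct(n, TOTAL_PRON_TOKENS) for rel, n in relation_counts.items()}
for rel, share in shares.items():
    print(f"{rel}: {share}%")
```

This also explains why rare relations such as dislocated appear with 0%: four instances out of 1183 round down to zero.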

Parents of PRON nodes belong to 9 different parts of speech: VERB (885; 75% instances), NOUN (207; 17% instances), PRON (27; 2% instances), ADJ (19; 2% instances), AUX (18; 2% instances), ADV (11; 1% instances), [root] (9; 1% instances), PROPN (4; 0% instances), ADP (3; 0% instances)

997 (84%) PRON nodes are leaves.

143 (12%) PRON nodes have one child.

31 (3%) PRON nodes have two children.

12 (1%) PRON nodes have three or more children.

The highest child degree of a PRON node is 10.
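The child-degree figures above partition all PRON nodes: 997 + 143 + 31 + 12 = 1183. The sketch below shows one way such a distribution could be computed from a dependency tree. The tuple layout `(id, head, upos)`, the function name `child_degree_buckets` and the example tree are all hypothetical.

```python
from collections import Counter


def child_degree_buckets(sentence, upos):
    """Bucket nodes with the given tag by number of children (0, 1, 2, 3+).

    sentence: list of (id, head, upos) tuples describing one tree.
    """
    # Each node contributes one child to its head.
    n_children = Counter(head for _, head, _ in sentence)
    buckets = Counter()
    for node_id, _, tag in sentence:
        if tag == upos:
            # Key 3 stands for "three or more children".
            buckets[min(n_children[node_id], 3)] += 1
    return buckets


# Hypothetical tree: node 2 is the root verb, node 3 attaches to node 4.
tree = [(1, 2, "PRON"), (2, 0, "VERB"), (3, 4, "ADP"), (4, 2, "PRON"), (5, 2, "PUNCT")]
buckets = child_degree_buckets(tree, "PRON")
print(dict(buckets))

# Sanity check on the figures reported above.
reported_total = 997 + 143 + 31 + 12
print(reported_total)  # 1183
```

In the example tree, node 1 is a leaf and node 4 has one child, so the result has one node in bucket 0 and one in bucket 1.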

Children of PRON nodes are attached using 23 different relations: punct (79; 31% instances), case (68; 26% instances), discourse:sp (26; 10% instances), reparandum (19; 7% instances), appos (10; 4% instances), advmod (9; 3% instances), acl (8; 3% instances), compound (6; 2% instances), cop (5; 2% instances), nsubj (5; 2% instances), conj (4; 2% instances), case:loc (3; 1% instances), aux (2; 1% instances), ccomp (2; 1% instances), det (2; 1% instances), mark (2; 1% instances), nmod (2; 1% instances), advcl:coverb (1; 0% instances), cc (1; 0% instances), discourse (1; 0% instances), dislocated (1; 0% instances), nsubj:periph (1; 0% instances), obl (1; 0% instances)

Children of PRON nodes belong to 12 different parts of speech: PUNCT (79; 31% instances), PART (67; 26% instances), PRON (27; 10% instances), ADP (26; 10% instances), NOUN (24; 9% instances), VERB (11; 4% instances), ADV (9; 3% instances), AUX (7; 3% instances), PROPN (3; 1% instances), ADJ (2; 1% instances), SCONJ (2; 1% instances), CCONJ (1; 0% instances)