This page pertains to UD version 2.

Treebank Statistics: UD_Cantonese-HK: POS Tags: AUX

There are 28 AUX lemmas (3%), 33 AUX types (2%) and 563 AUX tokens (4%). Out of 15 observed tags, the rank of AUX is: 9 in number of lemmas, 11 in number of types and 7 in number of tokens.
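As a rough illustration of how the three counts above differ, the following sketch tallies distinct lemmas, distinct word forms (types) and tokens for one tag. The function name `pos_summary`, the `(form, lemma, upos)` tuple layout and the sample tokens are assumptions made for this example; they are not taken from the treebank or from the tool that generated this page.

```python
def pos_summary(tokens, upos="AUX"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag.

    `tokens` is an iterable of (form, lemma, upos) tuples; this layout
    is a hypothetical simplification of CoNLL-U rows.
    """
    lemmas, forms, n_tokens = set(), set(), 0
    for form, lemma, pos in tokens:
        if pos == upos:
            lemmas.add(lemma)   # distinct lemmas
            forms.add(form)     # distinct word forms (types)
            n_tokens += 1       # every occurrence counts as a token
    return len(lemmas), len(forms), n_tokens


# Hypothetical mini-sample, not real treebank data.
sample = [
    ("係", "係", "AUX"),
    ("係咪", "係", "AUX"),
    ("係", "係", "VERB"),   # ignored: different tag
    ("咗", "咗", "AUX"),
    ("咗", "咗", "AUX"),
]
summary = pos_summary(sample)
print(summary)
```

Whether the original statistics normalize forms in any way before counting types is not stated here, so the sketch compares forms exactly as written.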

The 10 most frequent AUX lemmas: _、 咗、 係、 要、 唔好、 過、 冇、 可以、 會、 中意

The 10 most frequent AUX types: 係、 咗、 可以、 要、 能夠、 會、 想、 應該、 冇、 過

The 10 most frequent ambiguous lemmas: _ (PUNCT 1377, VERB 1352, NOUN 1283, ADV 853, PART 764, PRON 662, AUX 335, DET 217, ADJ 209, ADP 140, NUM 124, SCONJ 101, CCONJ 93, INTJ 92, PROPN 52), 係 (VERB 65, AUX 44, DET 1), 要 (AUX 25, VERB 4), 過 (AUX 12, ADP 2, VERB 1), 冇 (VERB 37, AUX 11), 中意 (AUX 8, VERB 5), 想 (AUX 6, VERB 1), 唔使 (AUX 5, VERB 2), 得 (VERB 16, ADV 4, AUX 4, PART 2), 有 (VERB 65, AUX 4)

The 10 most frequent ambiguous types: 係 (VERB 312, AUX 99, DET 1), 要 (AUX 55, VERB 6), 會 (AUX 29, NOUN 2), 想 (AUX 28, VERB 2), 冇 (VERB 72, AUX 18), 過 (AUX 17, VERB 4, ADP 2), 有 (VERB 124, AUX 10), 緊 (AUX 9, ADJ 1), 中意 (AUX 8, VERB 5), 可 (AUX 8, INTJ 4, PART 1)

Morphology

The form / lemma ratio of AUX is 1.178571 (the average of all parts of speech is 1.624294).
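The reported ratio appears consistent with dividing the number of AUX types (33) by the number of AUX lemmas (28) given earlier on this page. A minimal check, where the helper name `form_lemma_ratio` is an assumption made for this example:

```python
def form_lemma_ratio(num_types: int, num_lemmas: int) -> float:
    """Distinct word forms (types) per distinct lemma."""
    if num_lemmas <= 0:
        raise ValueError("num_lemmas must be positive")
    return num_types / num_lemmas


# Counts reported on this page for AUX: 33 types, 28 lemmas.
aux_ratio = form_lemma_ratio(33, 28)
print(f"{aux_ratio:.6f}")  # prints 1.178571
```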

The 1st highest number of forms (25) was observed with the lemma “_”: 住, 係, 倒, 冇, 可, 可以, 可能, 咗, 唔使, 唔好, 好, 希望, 得, 必須, 想, 應該, 是, 會, 有, 緊, 能, 能夠, 要, 過, 需要.

The 2nd highest number of forms (2) was observed with the lemma “係”: 係, 係咪.

The 3rd highest number of forms (1) was observed with the lemma “中意”: 中意.

AUX does not occur with any features.

Relations

AUX nodes are attached to their parents using 10 different relations: aux (422; 75% instances), cop (93; 17% instances), conj (21; 4% instances), root (8; 1% instances), ccomp (7; 1% instances), reparandum (6; 1% instances), acl (2; 0% instances), parataxis (2; 0% instances), advcl (1; 0% instances), obj (1; 0% instances)
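A distribution like the one above could be derived from a CoNLL-U file along the following lines. This is a sketch under stated assumptions, not the script that produced this page: the helper names `deprel_counts` and `percentages` and the three-token sample sentence are hypothetical. The second half re-derives the rounded percentages from the counts listed above.

```python
from collections import Counter


def deprel_counts(conllu_text, upos="AUX"):
    """Count the DEPREL column (8th) for tokens whose UPOS column (4th) matches."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts


def percentages(counts):
    """Map each relation to its share of the total, rounded to whole percent."""
    total = sum(counts.values())
    return {rel: round(100 * n / total) for rel, n in counts.items()}


# Hypothetical sentence, built with explicit tabs; not real treebank data.
rows = [
    ["1", "我", "我", "PRON", "_", "_", "3", "nsubj", "_", "_"],
    ["2", "可以", "可以", "AUX", "_", "_", "3", "aux", "_", "_"],
    ["3", "去", "去", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = "# text = 我可以去\n" + "\n".join("\t".join(r) for r in rows) + "\n"
sample_counts = deprel_counts(sample)

# Counts as listed on this page for AUX nodes.
page_counts = {
    "aux": 422, "cop": 93, "conj": 21, "root": 8, "ccomp": 7,
    "reparandum": 6, "acl": 2, "parataxis": 2, "advcl": 1, "obj": 1,
}
page_percentages = percentages(page_counts)
```

The counts sum to 563, the number of AUX tokens, and the rounded shares match the percentages listed above (for example 75 for `aux` and 17 for `cop`).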

Parents of AUX nodes belong to 9 different parts of speech: VERB (412; 73% instances), NOUN (60; 11% instances), AUX (29; 5% instances), ADJ (28; 5% instances), PROPN (11; 2% instances), no parent, i.e. attached to the artificial root (8; 1% instances), ADV (7; 1% instances), PRON (7; 1% instances), PART (1; 0% instances)

461 (82%) AUX nodes are leaves.

67 (12%) AUX nodes have one child.

14 (2%) AUX nodes have two children.

21 (4%) AUX nodes have three or more children.

The highest child degree of an AUX node is 11.
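The child-degree figures above can be understood as counting, for each AUX node, how many other nodes name it as their head. A minimal sketch, assuming each sentence is given as a list of `(id, head, upos)` tuples; the names `child_degrees` and `bucket_degrees` and both sample sentences are hypothetical:

```python
from collections import Counter


def child_degrees(sentence, upos="AUX"):
    """Return the number of children of every node tagged `upos`.

    `sentence` is a list of (id, head, upos) tuples; head 0 is the root.
    """
    children_per_head = Counter(head for _, head, _ in sentence)
    return [children_per_head[tid] for tid, _, pos in sentence if pos == upos]


def bucket_degrees(degrees):
    """Group degrees into the four classes used on this page."""
    buckets = Counter()
    for d in degrees:
        if d == 0:
            buckets["leaf"] += 1
        elif d == 1:
            buckets["one"] += 1
        elif d == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets


# Hypothetical sentences, not real treebank data.
s1 = [(1, 3, "PRON"), (2, 3, "AUX"), (3, 0, "VERB")]    # AUX is a leaf
s2 = [(1, 2, "PRON"), (2, 0, "AUX"), (3, 2, "PUNCT")]   # AUX has two children
degrees = child_degrees(s1) + child_degrees(s2)
buckets = bucket_degrees(degrees)
max_degree = max(degrees)
```

As a consistency check on the figures above, the four classes (461, 67, 14 and 21) add up to the 563 AUX tokens.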

Children of AUX nodes are attached using 20 different relations: advmod (56; 29% instances), punct (35; 18% instances), conj (23; 12% instances), nsubj (18; 9% instances), discourse:sp (12; 6% instances), discourse (8; 4% instances), obj (8; 4% instances), reparandum (6; 3% instances), aux (5; 3% instances), ccomp (5; 3% instances), advcl (4; 2% instances), cc (2; 1% instances), xcomp (2; 1% instances), advmod:df (1; 1% instances), case (1; 1% instances), compound:vo (1; 1% instances), mark (1; 1% instances), obj:periph (1; 1% instances), obl:tmod (1; 1% instances), parataxis (1; 1% instances)

Children of AUX nodes belong to 11 different parts of speech: ADV (59; 31% instances), PUNCT (35; 18% instances), AUX (29; 15% instances), PRON (18; 9% instances), VERB (15; 8% instances), PART (14; 7% instances), NOUN (13; 7% instances), CCONJ (3; 2% instances), INTJ (3; 2% instances), ADJ (1; 1% instances), SCONJ (1; 1% instances)