
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-PUD: POS Tags: PUNCT

There is 1 PUNCT lemma (7%), 26 PUNCT types (0%) and 2902 PUNCT tokens (14%). Out of 15 observed tags, the rank of PUNCT is: 12 in number of lemmas, 13 in number of types and 3 in number of tokens.

The 10 most frequent PUNCT lemmas: _

The 10 most frequent PUNCT types: ,, 。, (, ), ·, 、, ”, “, 《, 》

The 10 most frequent ambiguous lemmas: _ (NOUN 5410, VERB 3467, PUNCT 2902, PART 1881, PROPN 1361, ADP 1288, ADV 1283, NUM 873, PRON 710, ADJ 650, AUX 618, DET 355, X 306, CCONJ 283, SCONJ 28)
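The token share and token rank reported for PUNCT can be re-derived from the per-tag counts on the line above. The following is a minimal sketch, assuming that those counts cover every token in the treebank (which holds if every token carries the placeholder lemma `_`); the variable names are illustrative only.

```python
# Token counts per tag, copied from the ambiguous-lemma line above.
# Assumption: every token has the lemma "_", so these counts sum to the
# total number of tokens in the treebank.
token_counts = {
    "NOUN": 5410, "VERB": 3467, "PUNCT": 2902, "PART": 1881, "PROPN": 1361,
    "ADP": 1288, "ADV": 1283, "NUM": 873, "PRON": 710, "ADJ": 650,
    "AUX": 618, "DET": 355, "X": 306, "CCONJ": 283, "SCONJ": 28,
}

total = sum(token_counts.values())
punct_share = 100 * token_counts["PUNCT"] / total

# Rank 1 is the tag with the most tokens.
ranking = sorted(token_counts, key=token_counts.get, reverse=True)
punct_rank = ranking.index("PUNCT") + 1

print(total, round(punct_share), punct_rank)  # prints: 21415 14 3
```

Under that assumption, the result agrees with the 14% token share and the token rank of 3 stated at the top of the page.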

The 10 most frequent ambiguous types:

Morphology

The form / lemma ratio of PUNCT is 26.000000 (the average of all parts of speech is 388.466667).

The highest number of forms (26) was observed with the lemma “_”: “, (, ), -, …, /, ·, ——, ‘, ’, “, ”, •, ……, 、, 。, 《, 》, 丶, (, ), ,, /, :, ;, ?.

PUNCT does not occur with any features.

Relations

PUNCT nodes are attached to their parents using a single relation: punct (2902; 100% instances)

Parents of PUNCT nodes belong to 12 different parts of speech: VERB (1725; 59% instances), NOUN (587; 20% instances), X (196; 7% instances), PROPN (168; 6% instances), ADJ (136; 5% instances), ADV (49; 2% instances), NUM (15; 1% instances), PRON (11; 0% instances), ADP (6; 0% instances), PART (6; 0% instances), AUX (2; 0% instances), CCONJ (1; 0% instances)

2902 (100%) PUNCT nodes are leaves.

The highest child degree of a PUNCT node is 0.
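The relation statistics in this section could be recomputed from a CoNLL-U file along the following lines. This is a rough sketch rather than the script that generated this page: the function name `punct_stats` and the two-clause sample sentence are hypothetical, and the sample is not taken from the treebank.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U format (10 tab-separated columns per token).
SAMPLE = """\
1\t我\t_\tPRON\t_\t_\t2\tnsubj\t_\t_
2\t来\t_\tVERB\t_\t_\t0\troot\t_\t_
3\t,\t_\tPUNCT\t_\t_\t2\tpunct\t_\t_
4\t他\t_\tPRON\t_\t_\t5\tnsubj\t_\t_
5\t走\t_\tVERB\t_\t_\t2\tconj\t_\t_
6\t。\t_\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def punct_stats(conllu):
    """Count relations, parent tags and leaf nodes for PUNCT tokens."""
    relations, parent_tags, leaves = Counter(), Counter(), 0
    for block in conllu.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep plain word lines; skip multiword ranges (1-2) and empty nodes (1.1).
        rows = [r for r in rows if r[0].isdigit()]
        upos = {r[0]: r[3] for r in rows}   # token ID -> UPOS tag
        heads = {r[6] for r in rows}        # IDs that have at least one child
        for r in rows:
            if r[3] != "PUNCT":
                continue
            relations[r[7]] += 1
            parent_tags[upos.get(r[6], "ROOT")] += 1
            if r[0] not in heads:
                leaves += 1
    return relations, parent_tags, leaves


relations, parent_tags, leaves = punct_stats(SAMPLE)
print(dict(relations), dict(parent_tags), leaves)
```

On the sample, both PUNCT tokens attach to a VERB via `punct` and neither has children, mirroring the pattern reported above for the full treebank.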