
This page pertains to UD version 2.


UD Chinese

[Description not currently available.]

UD Chinese-CFL


The lemma is identical to the word, except when the word contains a spelling error, in which case the lemma is the intended word. See the Tokenization section for UD Chinese-CFL for the definition of a spelling error.

Part-of-speech tags

POS tagging is based on the lemma rather than the word. Hence, in the sentence *不關多貴我也買, the misspelled word 不關 is tagged as SCONJ rather than VERB, because its lemma is 不管.
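The lemma-based lookup can be illustrated with a minimal sketch. The two dictionaries and the function names below are hypothetical and exist only for illustration; they are not part of the treebank or of any UD tool, and the entries simply mirror the example above.

```python
# Hypothetical data for illustration only; not taken from the treebank.
SPELLING_CORRECTIONS = {"不關": "不管"}  # misspelled word -> intended lemma
LEMMA_POS = {"不管": "SCONJ", "不關": "VERB", "買": "VERB"}  # lemma -> POS tag


def lemma_of(word: str) -> str:
    """Return the lemma: the word itself unless it is a known spelling error."""
    return SPELLING_CORRECTIONS.get(word, word)


def pos_of(word: str) -> str:
    """Look up the POS tag via the lemma rather than the surface form."""
    return LEMMA_POS[lemma_of(word)]


print(lemma_of("不關"), pos_of("不關"))  # prints: 不管 SCONJ
```

Tagging the surface form directly would give VERB under this toy lexicon; going through the lemma gives SCONJ, as the guideline requires.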

Determining the POS usually involves two kinds of evidence: the “morphological evidence”, i.e., the linguistic form of the word, and the “distributional evidence”, i.e., its syntactic use in the sentence. In a well-formed sentence, these two kinds of evidence should agree; in learner text, however, they may conflict (Ragheb and Dickinson, 2014).

Consider the word 可怕 kepa “scary” in the sentence *我可怕他 “*I scary him”. Morphological evidence suggests that 可怕 should be tagged as an adjective (ADJ), reflecting its normal usage. Distributional evidence suggests that it should be tagged as a verb (VERB), since the following pronoun 他 ta “him” implies that it is used as a verb with a direct object.

When these two kinds of evidence contradict one another, the morphological evidence prevails. The example sentence is thus tagged as:

我/PN 可怕/ADJ 他/PN

The “distributional POS tag” is nevertheless also recorded (in column 3 of the .conllux file).
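The conflict-resolution rule above can be sketched as follows. The class and function names are hypothetical, chosen only for illustration, and the sketch does not attempt to reproduce the actual .conllux layout; it only shows that the morphological tag becomes the main tag while the distributional tag is kept alongside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagDecision:
    """Hypothetical record holding both tags for one word."""
    pos: str                 # main tag: morphological evidence prevails
    distributional_pos: str  # kept separately, never overrides the main tag


def resolve_pos(morphological: str, distributional: str) -> TagDecision:
    """Apply the rule: on conflict, the morphological evidence wins."""
    return TagDecision(pos=morphological, distributional_pos=distributional)


# 可怕 in *我可怕他: adjective by form, verb by syntactic use.
decision = resolve_pos(morphological="ADJ", distributional="VERB")
print(decision.pos, decision.distributional_pos)  # prints: ADJ VERB
```

When the two kinds of evidence agree, as in well-formed text, both fields simply hold the same tag.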


Features are not currently implemented.

UD Chinese-HK

[Description not currently available.]