
This page pertains to UD version 2.

Syntax

UD Chinese

[Description not currently available.]

UD Chinese-CFL

General Comments

A “literal annotation” is preferred, i.e., one should annotate “as if the sentence were as syntactically well-formed as it can be, possibly ignoring meaning” (Ragheb and Dickinson, 2014).

Missing words

When a word seems to be missing from the learner sentence, we annotate according to the UD guidelines on promotion by head elision. For example, in the sentence fragment 在中國最近幾年 zai zhongguo zuijin ji nian “in China recent few years”, we promote 年 nian “year” to be the root. Although both 中國 zhongguo “China” and 年 nian “year” would be obl dependents if a verb were present, 年 nian “year” is promoted because it is closer to the expected location of the verb.
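The closeness criterion in this example can be sketched in a few lines of Python. This is only an illustration, not part of the treebank tooling: the `promote` helper, the token segmentation, and the assumption that the missing verb would occupy the slot right after the last token are all hypothetical.

```python
def promote(candidates, expected_verb_index):
    """Pick the candidate closest to the slot where the verb would be expected."""
    return min(candidates, key=lambda c: abs(c[0] - expected_verb_index))

# (1-based position, form); this segmentation is assumed for illustration.
tokens = [(1, "在"), (2, "中國"), (3, "最近"), (4, "幾"), (5, "年")]

# The two words that would be obl dependents if a verb were present.
candidates = [tokens[1], tokens[4]]  # 中國 and 年

# Assumption: the missing verb would follow the last token (position 6).
root = promote(candidates, expected_verb_index=6)
print(root)  # (5, '年')
```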

Word-order errors

The annotation should assume that there is no word-order error. Consider, for example, the sentence *我被了他打一頓. The aspect particle 了 le usually modifies the verb that immediately precedes it, and is probably misplaced in this sentence. It is most likely intended to modify 打 da “hit”, and should immediately follow 打 da rather than 被 bei, the passive marker.

To adhere to the principle of “literal annotation”, rather than annotating le as the child of 打 da “hit” with the aux relation, we annotate 了 le as the child of 被 bei with the dep relation.
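The contrast between the two analyses of 了 le can be written out as plain data. The sketch below is a hypothetical illustration: the `literal_arc` helper and the segmentation of the sentence are assumptions, and the helper covers only this example (attach the particle to the word it immediately follows, with dep), not a general rule.

```python
# Segmentation assumed for illustration.
tokens = ["我", "被", "了", "他", "打", "一頓"]

def literal_arc(tokens, particle="了"):
    """Attach the particle to the word immediately before it, as the sentence stands."""
    i = tokens.index(particle)
    return (particle, tokens[i - 1], "dep")

# Literal annotation: no word-order correction is assumed.
literal = literal_arc(tokens)

# Analysis that would result from silently correcting the word order (not used).
corrected = ("了", "打", "aux")

print(literal)    # ('了', '被', 'dep')
print(corrected)  # ('了', '打', 'aux')
```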

dep (unspecified dependency)

When learner errors make it difficult to characterize the grammatical relation between a word and the rest of the sentence, we use the dep relation. Typically, when the POS tag differs from the distributional POS tag, the dep relation is needed.

Consider the sentence *我可怕他 “*I scary him”. Judging by its POS tag, it is unclear how the word 可怕 kepa “scary”, as an adjective, relates to the pronoun 他 ta “him”. We thus treat 可怕 kepa as the head of 他 ta “him” with the dep relation.

When a word has a different distributional POS tag, we also include a “distributional” dependency relation on the basis of the word’s distributional POS tag. This relation is stored in column 4 of the .conllux file. In the example sentence above, the word 可怕 kepa “scary”, as a verb, is the head of 他 ta “him” with the obj relation.
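The two annotation layers for this example can be pictured as a pair of arcs over the same words. The following sketch is purely illustrative: the `Arc` class and `layers_differ` helper are hypothetical names, and the snippet does not model the actual file layout.

```python
from dataclasses import dataclass

@dataclass
class Arc:
    child: str
    head: str
    head_pos: str  # POS tag of the head under this layer
    deprel: str

# Literal layer: 可怕 is tagged as an adjective, so the relation is left unspecified.
literal = Arc(child="他", head="可怕", head_pos="ADJ", deprel="dep")

# Distributional layer: 可怕 behaves like a verb, so 他 is its object.
distributional = Arc(child="他", head="可怕", head_pos="VERB", deprel="obj")

def layers_differ(a: Arc, b: Arc) -> bool:
    """True when the two layers assign different relations to the same arc."""
    return (a.child, a.head) == (b.child, b.head) and a.deprel != b.deprel

print(layers_differ(literal, distributional))  # True
```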

References

Marwa Ragheb and Markus Dickinson. 2014. Developing a Corpus of Syntactically-annotated Learner Language for English. Proceedings of the 13th International Workshop on Treebanks and Linguistic Theories (TLT).

UD Chinese-HK

[Description not currently available.]