This page pertains to UD version 2.

UD Chinese GSDSimp

Language: Chinese (code: zh)
Family: Sino-Tibetan

This treebank has been part of Universal Dependencies since the UD v2.5 release.

The following people have contributed to making this treebank part of UD: Peng Qi, Koichi Yasuoka.

Repository: UD_Chinese-GSDSimp
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.13

License: CC BY-SA 4.0

Genre: wiki

Questions, comments? General annotation questions (either Chinese-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [pengqi (æt) cs • stanford • edu]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas assigned by a program, with some manual corrections, but not a full manual verification
UPOS annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
XPOS annotated manually
Features assigned by a program, with some manual corrections, but not a full manual verification
Relations annotated manually in non-UD style, automatically converted to UD

Description

Simplified Chinese Universal Dependencies dataset converted from the GSD (traditional) dataset with manual corrections.

This is a simplified Chinese version of the UD Chinese GSD treebank. It was first converted automatically into simplified Chinese with the OpenCC tool, using additional patterns for mapping punctuation, and was then corrected manually.
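The two-step automatic conversion described above (character mapping plus punctuation patterns) can be illustrated with a minimal sketch. This is not the treebank's actual pipeline and does not call OpenCC: `CHAR_MAP`, `PUNCT_MAP`, and `to_simplified` are hypothetical names, the character table is a tiny hand-picked stand-in for OpenCC's dictionaries, and the corner-bracket-to-curly-quote rule is only an assumed example of a punctuation pattern.

```python
# Toy stand-in for the OpenCC step: a hand-picked character table
# (assumption; OpenCC's real dictionaries are far larger and context-aware).
CHAR_MAP = {"這": "这", "個": "个", "樹": "树", "庫": "库"}

# Assumed example of a punctuation pattern: corner brackets become
# curly quotation marks. The treebank's real patterns may differ.
PUNCT_MAP = {"「": "“", "」": "”"}

# One translation table covering both kinds of mapping.
TABLE = str.maketrans({**CHAR_MAP, **PUNCT_MAP})


def to_simplified(text: str) -> str:
    """Map each character through TABLE; unmapped characters pass through."""
    return text.translate(TABLE)


print(to_simplified("這個「樹庫」"))
```

A character-by-character table cannot resolve one-to-many mappings, which is one reason a manual correction pass like the one described above remains necessary.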

Acknowledgments

Statistics of UD Chinese GSDSimp

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X

Features

Aspect, Case, Number, NumType, PartType, Person, Polarity, Voice

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, clf, compound, compound:ext, conj, cop, csubj, csubj:pass, det, discourse, discourse:sp, dislocated, flat:foreign, flat:name, iobj, mark, mark:adv, mark:rel, nmod, nmod:tmod, nsubj, nsubj:pass, nummod, obj, obl, obl:patient, orphan, parataxis, punct, reparandum, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
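The restriction stated above can be made concrete with a short sketch that counts dependency relations whose parent is a VERB and whose child is a NOUN or PRON. It assumes standard 10-column, tab-separated CoNLL-U input. The function name `verb_nominal_relations` and the three-token `SAMPLE` sentence are hypothetical illustrations, not data from the treebank, and whether PROPN children should also be counted is left as a parameter because the page does not say.

```python
from collections import Counter

# Hypothetical three-token sentence in CoNLL-U format (not treebank data).
SAMPLE = "\n".join([
    "# text = 我看书",
    "1\t我\t我\tPRON\tPRP\tPerson=1\t2\tnsubj\t_\tSpaceAfter=No",
    "2\t看\t看\tVERB\tVV\t_\t0\troot\t_\tSpaceAfter=No",
    "3\t书\t书\tNOUN\tNN\t_\t2\tobj\t_\tSpaceAfter=No",
    "",
])


def verb_nominal_relations(conllu_text, child_tags=("NOUN", "PRON")):
    """Count deprels where the head is a VERB and the child is in child_tags."""
    counts = Counter()
    # Sentences are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        upos = {}   # token id -> UPOS
        rows = []   # (UPOS, head id, deprel) for each syntactic word
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip comment lines
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword-token ranges and empty nodes
            upos[cols[0]] = cols[3]
            rows.append((cols[3], cols[6], cols[7]))
        for tag, head, deprel in rows:
            if tag in child_tags and upos.get(head) == "VERB":
                counts[deprel] += 1
    return counts


print(verb_nominal_relations(SAMPLE))
```

Passing `child_tags=("NOUN", "PRON", "PROPN")` would extend the count to proper nouns.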

Relations Overview