UD Classical Chinese TueCL
Language: Classical Chinese (code: lzh)
Family: Sino-Tibetan
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Yifei Chen, John Wang, Çağrı Çöltekin.
Repository: UD_Classical_Chinese-TueCL
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.14
License: CC BY-SA 4.0
Genre: fiction
Questions, comments? General annotation questions (either Classical Chinese-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [yifei • chen (æt) student • uni-tuebingen • de, johnz • wang (æt) outlook • com, cagri • coeltekin (æt) uni-tuebingen • de]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
A dependency treebank of “逍遥游” (Enjoyment in Untroubled Ease), written by Zhuangzi.
The full text of the source is available at https://chinesenotes.com/zhuangzi/zhuangzi001.html (in both English and Chinese), and a translation into modern Chinese is available at https://so.gushiwen.cn/shiwenv_5bfecbe60620.aspx (or http://www.acmuller.net/con-dao/zhuangzi.html).
- (citation)
Statistics of UD Classical Chinese TueCL
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – NOUN – NUM – PART – PRON – PROPN – SCONJ – VERB
Features
AdvType – Case – Degree – NameType – NounType – Person – Polarity – PronType – Reflex – Tense
Relations
acl – advcl – advmod – amod – case – cc – ccomp – clf – compound – conj – cop – csubj – det – discourse – discourse:sp – dislocated – fixed – flat – flat:vv – iobj – mark – nmod – nsubj – nummod – obj – obl – obl:lmod – obl:tmod – parataxis – root
Tokenization and Word Segmentation
- This corpus contains 100 sentences and 648 tokens.
- This corpus contains 648 tokens (100%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus does not contain words that contain both letters and punctuation.
- This corpus uses 13 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, PROPN, SCONJ, VERB
- This corpus does not use the following tags: INTJ, SYM, PUNCT, X
- This corpus contains 12 word types tagged as particles (PART): 乎, 也, 哉, 夫, 已, 所, 焉, 然, 矣, 者, 而, 邪
- This corpus contains 13 lemmas tagged as pronouns (PRON): _, 之, 其, 奚, 己, 彼, 惡, 我, 斯, 是, 此, 焉, 自
- This corpus contains 2 lemmas tagged as determiners (DET): 之, 數
- Out of the above, 1 lemma occurred sometimes as PRON and sometimes as DET: 之
- This corpus contains 1 lemma tagged as an auxiliary (AUX): 爲
- Out of the above, 1 lemma occurred sometimes as AUX and sometimes as VERB: 爲
- This corpus does not use the VerbForm feature.
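Counts like the ones listed above can be recomputed from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `corpus_stats` and the `SAMPLE` text are hypothetical, and the sample is a hand-written toy whose annotations are illustrative rather than copied from the treebank.

```python
from collections import Counter

# The 17 universal POS tags defined by UD v2.
UPOS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Toy CoNLL-U input written for this sketch. It is NOT copied from the
# treebank files, and its annotations are illustrative only.
SAMPLE = (
    "# text = 北冥有魚\n"
    "1\t北\t北\tNOUN\t_\tCase=Loc\t2\tnmod\t_\tSpaceAfter=No\n"
    "2\t冥\t冥\tNOUN\t_\tCase=Loc\t3\tnsubj\t_\tSpaceAfter=No\n"
    "3\t有\t有\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "4\t魚\t魚\tNOUN\t_\t_\t3\tobj\t_\tSpaceAfter=No\n"
    "\n"
    "# text = 其名爲鯤\n"
    "1\t其\t其\tPRON\t_\tPerson=3|PronType=Prs\t2\tnmod\t_\tSpaceAfter=No\n"
    "2\t名\t名\tNOUN\t_\t_\t4\tnsubj\t_\tSpaceAfter=No\n"
    "3\t爲\t爲\tAUX\t_\t_\t4\tcop\t_\tSpaceAfter=No\n"
    "4\t鯤\t鯤\tPROPN\t_\tNameType=Giv\t0\troot\t_\tSpaceAfter=No\n"
    "\n"
)


def corpus_stats(conllu_text):
    """Count sentences, tokens, UPOS tags and SpaceAfter=No tokens."""
    sentences = 0
    tokens = 0
    no_space_after = 0
    upos_counts = Counter()
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():
            in_sentence = False  # a blank line ends the current sentence
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip malformed lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tokens += 1
        upos_counts[cols[3]] += 1
        if "SpaceAfter=No" in cols[9].split("|"):
            no_space_after += 1
    return {
        "sentences": sentences,
        "tokens": tokens,
        "no_space_after": no_space_after,
        "upos_used": sorted(upos_counts),
        "upos_unused": sorted(UPOS_TAGS - set(upos_counts)),
    }


stats = corpus_stats(SAMPLE)
print(stats)
```

To apply the sketch to the real data, read the treebank's `.conllu` file as UTF-8 text and pass its contents to `corpus_stats` instead of `SAMPLE`.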
Nominal Features
- Case
  - Loc
    - NOUN: 天, 南, 上, 世, 下, 北, 地, 池, 海, 內
    - PROPN: 楚
  - Tem
    - NOUN: 歲, 年, 後, 春, 秋, 今, 月, 古, 日, 時
Degree and Polarity
- Degree
  - Equ
    - ADV: 猶
    - VERB: 若
  - Pos
    - NOUN: 冥, 廣, 怪, 正
    - VERB: 冥, 厚, 大, 數, 然, 窮, 久, 匹, 太, 夭
- Polarity
  - Neg
    - ADV: 不, 未, 无, 莫
    - VERB: 无, 非
Verbal Features
- Tense
  - Fut
    - ADV: 將
Pronouns, Determiners, Quantifiers
- PronType
  - Dem
    - PRON: 彼, 此, 是, 斯, 焉
  - Int
    - PRON: 奚
  - Prs
    - PRON: 其, 之, 我, 己, 自
- Reflex
  - Yes
    - PRON: 己, 自
- Person
  - 1
    - PRON: 我
  - 3
    - PRON: 其, 之
Other Features
- AdvType
  - Cau
    - ADV: 奚, 何
  - Tim
    - ADV: 則, 乃, 將
- NameType
  - Giv
    - PROPN: 鯤, 彭祖, 榮, 湯
  - Nat
    - PROPN: 楚
  - Sur
    - PROPN: 宋, 閼
- NounType
  - Clf
    - NOUN: 里, 仞
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Example: 爲.
- This corpus does not contain auxiliaries attached with the aux relation.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
  - VERB--NOUN (20)
  - VERB--NOUN-ADP(之) (2)
  - VERB--NOUN-ADP(也) (2)
  - VERB--NOUN-Loc (4)
  - VERB--NOUN-Tem (1)
  - VERB--PRON (14)
- obj
  - VERB--NOUN (38)
  - VERB--NOUN-Loc (7)
  - VERB--NOUN-Tem (5)
  - VERB--PRON (12)
  - VERB--PRON-ADP(以) (1)
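The pattern counts above can be approximated with a short script. This is a sketch under assumptions, not the tool that produced this page: it assumes that a suffix such as `-Loc` is the child's Case value and that `-ADP(之)` names the lemma of an ADP dependent of the child. The names `parse_sentences`, `feature_value`, `argument_patterns` and `SAMPLE` are hypothetical, and the sample is a hand-written toy rather than an excerpt from the treebank.

```python
from collections import Counter

# Toy CoNLL-U input written for this sketch. It is NOT copied from the
# treebank files, and its annotations are illustrative only.
SAMPLE = (
    "# text = 北冥有魚\n"
    "1\t北\t北\tNOUN\t_\tCase=Loc\t2\tnmod\t_\tSpaceAfter=No\n"
    "2\t冥\t冥\tNOUN\t_\tCase=Loc\t3\tnsubj\t_\tSpaceAfter=No\n"
    "3\t有\t有\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "4\t魚\t魚\tNOUN\t_\t_\t3\tobj\t_\tSpaceAfter=No\n"
    "\n"
    "# text = 其名爲鯤\n"
    "1\t其\t其\tPRON\t_\tPerson=3|PronType=Prs\t2\tnmod\t_\tSpaceAfter=No\n"
    "2\t名\t名\tNOUN\t_\t_\t4\tnsubj\t_\tSpaceAfter=No\n"
    "3\t爲\t爲\tAUX\t_\t_\t4\tcop\t_\tSpaceAfter=No\n"
    "4\t鯤\t鯤\tPROPN\t_\tNameType=Giv\t0\troot\t_\tSpaceAfter=No\n"
    "\n"
)


def parse_sentences(conllu_text):
    """Yield each sentence as a list of token dicts (basic tokens only)."""
    sentence = []
    for line in conllu_text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence.append({
            "id": int(cols[0]), "lemma": cols[2], "upos": cols[3],
            "feats": cols[5], "head": int(cols[6]), "deprel": cols[7],
        })
    if sentence:
        yield sentence


def feature_value(feats, name):
    """Return the value of one feature from a FEATS string, or None."""
    if feats == "_":
        return None
    for pair in feats.split("|"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return None


def argument_patterns(conllu_text, relations=("nsubj", "obj")):
    """Count (relation, pattern) pairs for VERB parents with NOUN/PRON children."""
    counts = Counter()
    for sent in parse_sentences(conllu_text):
        by_id = {tok["id"]: tok for tok in sent}
        for tok in sent:
            head = by_id.get(tok["head"])  # None for the root (head 0)
            if tok["deprel"] not in relations or head is None:
                continue
            if head["upos"] != "VERB" or tok["upos"] not in ("NOUN", "PRON"):
                continue
            label = f"VERB--{tok['upos']}"
            case = feature_value(tok["feats"], "Case")
            if case:
                label += f"-{case}"
            # Assumed convention: append the lemma of each ADP dependent.
            for dep in sent:
                if dep["head"] == tok["id"] and dep["upos"] == "ADP":
                    label += f"-ADP({dep['lemma']})"
            counts[(tok["deprel"], label)] += 1
    return counts


patterns = argument_patterns(SAMPLE)
for (relation, label), n in sorted(patterns.items()):
    print(relation, label, n)
```

In the toy input, the subject 名 in the second sentence is skipped because its parent is a PROPN rather than a VERB, which mirrors the restriction to verbal parents stated at the top of this section.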
Verbs with Reflexive Core Objects
- This corpus contains 2 lemmas that occur at least once with a reflexive core object (obj or iobj). Examples: 無 己、 視 自