UD for Classical Chinese 
Tokenization and Word Segmentation
There are neither spaces nor punctuation marks between words or sentences. Every word consists of a single character, except for some (proper) nouns.
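The segmentation rule above can be sketched in a few lines of Python. This is only a minimal illustration, not the tokenizer used by any treebank tool: the function name `segment` and the `multi_char_words` lexicon are hypothetical, and the lexicon entry in the usage example is chosen for illustration only.

```python
def segment(text, multi_char_words=frozenset()):
    """Split text into words: one character per word, unless a
    multi-character entry of the (hypothetical) lexicon matches."""
    # Longest lexicon entry bounds how far ahead we need to look.
    max_len = max((len(w) for w in multi_char_words), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first (greedy longest match).
        for size in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + size] in multi_char_words:
                tokens.append(text[i:i + size])
                i += size
                break
        else:
            # No multi-character word starts here: emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Illustrative lexicon with one multi-character proper noun (assumption).
print(segment("孟子見梁惠王", {"孟子"}))
```

With an empty lexicon the function simply returns one token per character, which is the default case described above.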
Morphology
Tags
The predicate-object-final structure of very early Chinese texts had only three categories of words: predicate, object, and final. In our linguistic model we tentatively call them “verb”, “noun” and “particle” respectively. Some words were specialised as verbs and some as nouns, but around the Zhou (周) dynasty most words were used in two or three of the categories.
In that era, we can already observe very early modifier usages of verbs. Several verbs were specialised as adverbial modifiers, which later gave rise to adverbs. In between verbs and adverbs, auxiliary verbs were almost entirely specialised to auxiliary uses, but were occasionally used as verbs. Adjectival usages of verbs had not yet been specialised into adjectives in that era; on the other hand, some gave rise to prepositions.
For POS-tagging of Classical Chinese texts in UD, we use VERB, ADV, AUX, ADP and SCONJ to fill the UPOS field of each verb-origin word, following the overview of modifier usages above. For noun-origin words we use NOUN, PROPN, PRON, NUM and ADV (noun-origin adverbs, including 何), categorising them from a rather present-day point of view. For particle-origin words we use PART, CCONJ and INTJ, in keeping with the UD v2 guidelines. We rarely use SYM, and do not use ADJ, DET, PUNCT or X.
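As a rough illustration, the tag inventory described above can be written down as a small lookup table. The names `UPOS_BY_ORIGIN`, `RARE_UPOS`, `UNUSED_UPOS` and `check_upos` are hypothetical and only mirror the grouping given in this section.

```python
# UPOS tags grouped by the origin of the word, as listed in this section.
UPOS_BY_ORIGIN = {
    "verb": {"VERB", "ADV", "AUX", "ADP", "SCONJ"},
    "noun": {"NOUN", "PROPN", "PRON", "NUM", "ADV"},
    "particle": {"PART", "CCONJ", "INTJ"},
}
RARE_UPOS = {"SYM"}                        # rarely used
UNUSED_UPOS = {"ADJ", "DET", "PUNCT", "X"}  # not used


def check_upos(tag):
    """Classify a UPOS tag according to the inventory above."""
    if tag in UNUSED_UPOS:
        return "unused"
    if tag in RARE_UPOS:
        return "rare"
    origins = sorted(o for o, tags in UPOS_BY_ORIGIN.items() if tag in tags)
    if origins:
        return "used:" + "/".join(origins)
    return "unknown"


for tag in ("VERB", "ADV", "SYM", "ADJ"):
    print(tag, check_upos(tag))
```

Note that ADV appears under both verb-origin and noun-origin words, so the table cannot recover the origin of an adverb from its UPOS alone.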
Features
- NameType=Sur, Giv, Prs, Nat or Geo for PROPN.
- Case=Loc or Tem, or NounType=Class, for NOUN.
- PronType=Prs with Person=1, 2 or 3 or Reflex=Yes for personal PRON. PronType=Dem for demonstrative PRON. PronType=Int for interrogative PRON. NumType=Ord for zodiac NUM.
- Polarity=Neg or Degree=Pos, Equ, Sup or Cmp for VERB and ADV.
- AdvType=Deg, Tim or Cau with Aspect=Perf or Tense=Past, Pres or Fut for ADV.
- Mood=Pot, Nec or Des, or Voice=Pass, for AUX. VerbType=Cop for copular use of a verb (its UPOS is changed into AUX). VerbForm=Part for adjective use of VERB. VerbForm=Conv for adverbial use of a verb (its UPOS is changed into ADV).
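The feature inventory above can be encoded as a table and used to flag features on a token that this section does not list. This is a sketch under assumptions: `LISTED_FEATURES`, `parse_feats` and `unlisted_features` are hypothetical names, the table contains only what this section mentions (a feature missing from it is "unlisted", not necessarily invalid), and the input is assumed to be a CoNLL-U FEATS string of the form `Name=Value|Name=Value` or `_`.

```python
DEGREE = {"Pos", "Equ", "Sup", "Cmp"}

# Feature values per UPOS, transcribed from the list above.
LISTED_FEATURES = {
    "PROPN": {"NameType": {"Sur", "Giv", "Prs", "Nat", "Geo"}},
    "NOUN": {"Case": {"Loc", "Tem"}, "NounType": {"Class"}},
    "PRON": {"PronType": {"Prs", "Dem", "Int"},
             "Person": {"1", "2", "3"}, "Reflex": {"Yes"}},
    "NUM": {"NumType": {"Ord"}},
    "VERB": {"Polarity": {"Neg"}, "Degree": DEGREE, "VerbForm": {"Part"}},
    "ADV": {"Polarity": {"Neg"}, "Degree": DEGREE,
            "AdvType": {"Deg", "Tim", "Cau"}, "Aspect": {"Perf"},
            "Tense": {"Past", "Pres", "Fut"}, "VerbForm": {"Conv"}},
    "AUX": {"Mood": {"Pot", "Nec", "Des"}, "Voice": {"Pass"},
            "VerbType": {"Cop"}},
}


def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into a dict; '_' means no features."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def unlisted_features(upos, feats):
    """Return (name, value) pairs not listed for this UPOS in the table."""
    listed = LISTED_FEATURES.get(upos, {})
    return [(name, value) for name, value in parse_feats(feats).items()
            if value not in listed.get(name, set())]


print(unlisted_features("VERB", "Degree=Pos|VerbForm=Part"))
print(unlisted_features("NOUN", "VerbForm=Part"))
```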
Syntax
- discourse:sp to annotate the sentence-final particles in the predicate-object-final structure.
- nsubj:pass to annotate passive subjects.
- nsubj:outer and csubj:outer to annotate subjects for predicate clauses.
- obl:tmod to annotate temporal oblique nominals.
- obl:lmod to annotate locational oblique nominals.
- compound:redup (left-to-right) to annotate reduplicated compounds.
- flat:vv (left-to-right) to annotate serial verbs (rather exocentric).
- flat:foreign (left-to-right) to annotate foreign words.
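The relation subtypes above lend themselves to a simple consistency check. The sketch below is illustrative only: the names `LISTED_SUBTYPES`, `LEFT_TO_RIGHT`, `split_deprel` and `direction_ok` are hypothetical, and "left-to-right" is interpreted here as "the head precedes its dependent", which is an assumption about how the annotation is stored in the ID and HEAD columns.

```python
# Relation subtypes listed in this section.
LISTED_SUBTYPES = {
    "discourse:sp", "nsubj:pass", "nsubj:outer", "csubj:outer",
    "obl:tmod", "obl:lmod", "compound:redup", "flat:vv", "flat:foreign",
}
# Subtypes marked "(left-to-right)" above.
LEFT_TO_RIGHT = {"compound:redup", "flat:vv", "flat:foreign"}


def split_deprel(deprel):
    """Split 'obl:tmod' into ('obl', 'tmod'); subtype is None if absent."""
    base, _, subtype = deprel.partition(":")
    return base, subtype or None


def direction_ok(token_id, head_id, deprel):
    """For left-to-right relations, require the head to precede the token
    (assumed reading of 'left-to-right'); other relations always pass."""
    if deprel not in LEFT_TO_RIGHT:
        return True
    return head_id < token_id


print(split_deprel("obl:tmod"))
print(direction_ok(token_id=3, head_id=2, deprel="flat:vv"))
print(direction_ok(token_id=2, head_id=3, deprel="flat:vv"))
```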
Treebanks
There are two Classical Chinese UD treebanks:
- UD_Classical_Chinese-Kyoto (implemented in UD-Kanbun and SuPar-Kanbun)
- UD_Classical_Chinese-TueCL
References
- Koichi Yasuoka: Universal Dependencies Treebank of the Four Books in Classical Chinese, DADH2019: 10th International Conference of Digital Archives and Digital Humanities (December 2019), pp.20-28.