This page pertains to UD version 2.

UD Korean KSL

Language: Korean (code: ko)
Family: Korean

This treebank has been part of Universal Dependencies since the UD v2.15 release.

The following people have contributed to making this treebank part of UD: Hakyung Sung, Gyu-Ho Shin.

Repository: UD_Korean-KSL
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.16

License: CC BY-SA 4.0

Genre: learner-essays

Questions, comments? General annotation questions (either Korean-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [hsung (æt) uoregon • edu]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas: annotated manually
UPOS: annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
XPOS: annotated manually
Features: annotated manually in non-UD style, automatically converted to UD
Relations: annotated manually, natively in UD style

Description

UD_Korean-KSL is a dependency treebank of second-language (L2) Korean.

The treebank contains 12,977 sentences: 10,323 in the training set, 1,311 in the dev set, and 1,343 in the test set. The sentences come from two datasets: (1) the Kyung Hee dataset, whose sentence IDs start with “KH” and which is annotated with classroom proficiency levels (A1–C2); and (2) the KoLLA dataset, whose sentence IDs start with “KL” and whose sentences are grouped as fb (foreign beginners), fi (foreign intermediates), and hb (heritage beginners).
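
As a rough illustration of the sentence-ID convention, the Python sketch below tallies sentences by source dataset. It assumes the standard CoNLL-U `# sent_id = ...` comment line; the function name `count_by_source` and the sample IDs are hypothetical, and only the "KH" and "KL" prefixes come from the description.

```python
from collections import Counter

# Prefix-to-dataset mapping taken from the treebank description.
SOURCES = {"KH": "Kyung Hee", "KL": "KoLLA"}

def count_by_source(lines):
    """Count sentences per source dataset from an iterable of CoNLL-U lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("# sent_id"):
            continue
        # "# sent_id = KH_0001" -> "KH_0001"
        sent_id = line.partition("=")[2].strip()
        # Anything without a known prefix is counted as "other".
        counts[SOURCES.get(sent_id[:2], "other")] += 1
    return counts

# Hypothetical sample; real IDs may differ beyond the two-letter prefix.
sample = [
    "# sent_id = KH_0001",
    "# text = ...",
    "# sent_id = KL_0001",
    "# sent_id = KL_0002",
]
print(count_by_source(sample))
```

To run it on a real split, pass an open file handle instead of the sample list, for example `count_by_source(open(path, encoding="utf-8"))`, where `path` points to one of the treebank's .conllu files.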

Acknowledgments

We acknowledge the original data contributors: the Kyung Hee dataset (credit to Jungyeul Park and Jung Hee Lee; note that this dataset is no longer maintained and its sentences are no longer used for further annotation) and the KoLLA dataset (credit to Markus Dickinson, Ross Israel, and Sun-Hee Lee). We also acknowledge our annotators: Hee-June Koh, Chanyoung Lee, and Youkyung Sung.

Statistics of UD Korean KSL

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, PROPN, PUNCT, SYM, VERB, X

Features

Typo

Relations

acl, advcl, advmod, amod, appos, aux, case, cc, ccomp, compound, conj, cop, csubj, dep, det, discourse, dislocated, flat, goeswith, list, mark, nmod, nmod:poss, nsubj, nummod, obj, obl, parataxis, punct, reparandum, root, vocative

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
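
A count restricted in this way could be reproduced along the following lines. This is a minimal sketch, not the script used to generate the UD statistics: it assumes standard ten-column CoNLL-U input, treats only NOUN and PRON as "nouns or pronouns" by default (add PROPN to `child_tags` if proper nouns should count), and uses a hypothetical function name and a toy sentence that is not taken from the treebank.

```python
from collections import Counter

def verb_nominal_relations(lines, child_tags=("NOUN", "PRON")):
    """Count deprels whose head is a VERB and whose child is a noun or pronoun."""
    counts = Counter()
    sentence = []  # token rows of the sentence currently being read

    def flush():
        upos = {row[0]: row[3] for row in sentence}  # token ID -> UPOS
        for row in sentence:
            # row[6] is HEAD, row[7] is DEPREL; the root (HEAD 0) has no parent tag.
            if row[3] in child_tags and upos.get(row[6]) == "VERB":
                counts[row[7]] += 1
        sentence.clear()

    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            flush()  # a blank line ends a sentence
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Skip multiword-token ranges (1-2) and empty nodes (1.1).
            if len(cols) == 10 and cols[0].isdigit():
                sentence.append(cols)
    flush()  # handle a final sentence with no trailing blank line
    return counts

# Toy sentence for illustration only (not from the treebank).
toy = [
    "# sent_id = toy-1",
    "1\t나는\t_\tPRON\t_\t_\t3\tnsubj\t_\t_",
    "2\t책을\t_\tNOUN\t_\t_\t3\tobj\t_\t_",
    "3\t읽었다\t_\tVERB\t_\t_\t0\troot\t_\t_",
    "",
]
print(verb_nominal_relations(toy))
```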

Relations Overview