This page pertains to UD version 2.

UD Turkish German SAGT

Language: Turkish German (code: qtd)
Family: Code switching

This treebank has been part of Universal Dependencies since the UD v2.7 release.

The following people have contributed to making this treebank part of UD: Özlem Çetinoğlu, Çağrı Çöltekin.

License: CC BY-NC-SA 4.0

Genre: spoken

Questions, comments? General annotation questions (either Turkish German-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [ozlem (æt) ims • uni-stuttgart • de]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

UD Turkish-German SAGT is a Turkish-German code-switching treebank developed as part of the SAGT project.

The treebank consists of bilingual conversation transcriptions annotated with several layers: language IDs, lemmas, POS tags, morphological features, and dependency relations. Language IDs use the tag set of Çetinoğlu (2017). The remaining annotations follow the Universal Dependencies annotation scheme and the conventions used in the monolingual Turkish and German treebanks.
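To make the annotation layers concrete, the following is a minimal sketch of reading them from a CoNLL-U file using only the Python standard library. It assumes that the language ID of each token is stored in the MISC column under a key named LangID with values such as TR and DE; that key name, the values, and the two-token sample sentence are illustrative assumptions, not taken from the treebank itself, so check them against the released files before relying on them.

```python
from collections import Counter

# Invented two-token sample in CoNLL-U layout (10 tab-separated columns).
# ASSUMPTION: the language ID lives in MISC as "LangID=..."; the key name
# and the values TR/DE are illustrative and may differ in the real data.
SAMPLE = (
    "# sent_id = example-1\n"
    "1\tben\tben\tPRON\t_\tCase=Nom|Number=Sing|Person=1\t2\tnsubj\t_\tLangID=TR\n"
    "2\tkomme\tkommen\tVERB\t_\tMood=Ind|Number=Sing|Person=1\t0\troot\t_\tLangID=DE\n"
    "\n"
)

# Column names of the CoNLL-U format, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_key_values(column):
    """Turn 'A=1|B=2' into {'A': '1', 'B': '2'}; '_' means empty."""
    if column == "_":
        return {}
    return dict(item.split("=", 1) for item in column.split("|") if "=" in item)


def read_tokens(text):
    """Yield one dict per syntactic word, skipping comments and blank lines."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        token = dict(zip(FIELDS, cols))
        token["feats"] = parse_key_values(token["feats"])
        token["misc"] = parse_key_values(token["misc"])
        yield token


tokens = list(read_tokens(SAMPLE))

# Count tokens per language ID; tokens without the key fall under "unknown".
lang_counts = Counter(t["misc"].get("LangID", "unknown") for t in tokens)

for t in tokens:
    print(t["form"], t["lemma"], t["upos"], t["deprel"], t["misc"].get("LangID"))
print(dict(lang_counts))
```

To run the same sketch on a real file, replace SAMPLE with the contents of a .conllu file read as UTF-8 text. Counting language IDs per sentence rather than per file would be one simple way to find the sentences that actually contain a switch.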

There are 48 distinct conversations from 17 participants. The majority of the speakers are university students, hence the most frequent age range is 18–25. Common conversation themes include studies, work, travel, free-time activities (such as sports, books, and TV), and future plans.

The audio recordings that accompany the transcriptions are also available as a speech corpus under a separate licence. Please contact ozlem@ims.uni-stuttgart.de for further information.


The development of the treebank is funded by the DFG via project CE 326/1-1 “Computational Structural Analysis of German-Turkish Code-Switching”. We thank Cansu Turgut, Reha Sakızlı, Semanur Ceylan, and Sevde Ceylan for data collection and annotation.


