
This page pertains to UD version 2.

UD Greek Cretan

Language: Greek (code: el)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.16 release.

The following people have contributed to making this treebank part of UD: Socrates Vakirtzian, Stella Markantonatou, Vivian Stamou.

Repository: UD_Greek-Cretan
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.17

License: CC BY-SA 4.0

Genre: fiction

Questions, comments? General annotation questions (either Greek-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [stiliani • markantonatou (æt) gmail • com]. The treebank is developed directly in the UD repository, so bug fixes can be submitted as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS annotated manually
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

The text of the treebank was transcribed with Whisper (trained on Cretan) from nine tapes of folklore narratives told by a single speaker, Ioannis Anagnostakis, who also composed them. The narratives are radio broadcasts in digital format, used with permission from the Audiovisual Department of the Vikelaia Municipal Library of Heraklion, Crete (1998-2001). The data were split into training (70%), dev (10%) and test (20%) sets.
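The 70/10/20 proportions can be illustrated with a minimal sketch. The description does not say whether sentences were shuffled or kept in document order, so the contiguous split below, the function name `split_sentences`, and its parameters are assumptions made purely for illustration, not the procedure used for this treebank.

```python
def split_sentences(sentences, train_pct=70, dev_pct=10):
    """Split a list into contiguous train/dev/test parts.

    Hypothetical helper: the test part receives whatever remains after
    the train and dev parts (20% with the default percentages).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    n = len(sentences)
    n_train = n * train_pct // 100
    n_dev = n * dev_pct // 100
    train = sentences[:n_train]
    dev = sentences[n_train:n_train + n_dev]
    test = sentences[n_train + n_dev:]
    return train, dev, test


# Usage with placeholder sentence ids (not real treebank data).
train, dev, test = split_sentences([f"sent-{i}" for i in range(100)])
print(len(train), len(dev), len(test))
```

Giving the remainder to the test part guarantees that every sentence lands in exactly one set, even when the total is not divisible by 10.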

This is the first treebank for the living (but under-resourced) dialect of East Crete. The dialect diverges from Standard Modern Greek at all levels. The treebank is annotated for euphonics and voicing; these phonological phenomena affect the orthography of the dialect. Active annotation was used for knowledge transfer from GUD, a UD treebank of Standard Modern Greek, and the results were edited manually by a native speaker.

Acknowledgments

We thank Yannis Kazos for his contribution.

References

Socrates Vakirtzian, Vivian Stamou, Yannis Kazos, Stella Markantonatou. 2025. Dialectal treebanks and their relation with the standard variety: The case of East Cretan and Standard Modern Greek. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), Tallinn (Estonia), March 2–5, 2025.

Statistics of UD Greek Cretan

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB

Features

Aspect, Case, Definite, Degree, ExtPos, Gender, Mood, Number, NumType, Person, Polarity, Poss, PronType, Tense, VerbForm, Voice

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, case, cc, ccomp, compound, compound:redup, conj, cop, csubj, det, discourse, dislocated, expl, fixed, flat, iobj, mark, nmod, nsubj, nummod, obj, obl, orphan, parataxis, punct, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
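The restriction to verb parents and noun or pronoun children can be sketched as a filter over CoNLL-U data. This is a minimal illustration that assumes the standard ten-column CoNLL-U layout; the helper names `read_conllu` and `verb_nominal_relations`, the default child tag set, and the three-word sample sentence are hypothetical and are not taken from the treebank or from the script that generates the UD statistics.

```python
from collections import Counter


def read_conllu(text):
    """Yield sentences as lists of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


def verb_nominal_relations(sentences, child_upos=("NOUN", "PRON")):
    """Count deprels whose parent is a VERB and whose child is in child_upos."""
    counts = Counter()
    for sent in sentences:
        upos_by_id = {tid: upos for tid, upos, _, _ in sent}
        for _, upos, head, deprel in sent:
            # The root has head 0, which is absent from upos_by_id.
            if upos in child_upos and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
    return counts


# Invented sample sentence, built with explicit tabs between the 10 columns.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "she", "she", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "reads", "read", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "books", "book", "NOUN", "_", "_", "2", "obj", "_", "_"),
])

print(verb_nominal_relations(read_conllu(SAMPLE)))
```

Whether proper nouns count as nouns here is left open by the text, so `child_upos` is a parameter; passing `("NOUN", "PROPN", "PRON")` would include them.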

Relations Overview