
This page pertains to UD version 2.

UD Scottish Gaelic ARCOSG

Language: Scottish Gaelic (code: gd)
Family: Indo-European, Celtic

This treebank has been part of Universal Dependencies since the UD v2.5 release.

The following people have contributed to making this treebank part of UD: Colin Batchelor.

Repository: UD_Scottish_Gaelic-ARCOSG
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.14

License: CC BY-SA 4.0

Genre: nonfiction, fiction, news, spoken

Questions, comments? General annotation questions (either Scottish Gaelic-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on Github. If you want to collaborate, please contact [colin • r • batchelor (æt) googlemail • com]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually in non-UD style, automatically converted to UD
UPOS annotated manually in non-UD style, automatically converted to UD
XPOS annotated manually
Features annotated manually in non-UD style, automatically converted to UD
Relations annotated manually in non-UD style, automatically converted to UD

Description

A treebank of Scottish Gaelic based on the Annotated Reference Corpus Of Scottish Gaelic (ARCOSG).

The Scottish Gaelic treebank takes its data from ARCOSG, the Annotated Reference Corpus of Scottish Gaelic (Lamb et al. 2016), with an annotation scheme based on that of the Irish UD treebank. Full bibliographic details can be found there.

It contains eight subcorpora, each made up of a varying number of original files of approximately 1000 tokens apiece. All files listed below are in the training set unless they are explicitly marked as being in test or dev. In the ARCOSG documentation the names of contributors are largely given in Gaelic; I have kept these and glossed them with their names in English where they will be familiar to non-Gaelic speakers.

See https://universaldependencies.org/gd/index.html for detailed linguistic documentation.

Acknowledgments

We wish to thank all of the contributors to ARCOSG and fellow Celtic language UD developers Teresa Lynn, Kevin Scannell, Johannes Heinecke and Fran Tyers.

References

Statistics of UD Scottish Gaelic ARCOSG

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X

Features

Case, Definite, Degree, Foreign, Form, Gender, Mood, Number, NumForm, NumType, PartType, Person, Polarity, Poss, PronType, Reflex, Tense, Typo, VerbForm

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux:pass, case, case:voc, cc, ccomp, compound, conj, cop, csubj:cleft, csubj:cop, csubj:outer, dep, det, discourse, dislocated, fixed, flat, flat:foreign, flat:name, mark, mark:prt, nmod, nmod:poss, nsubj, nsubj:outer, nsubj:pass, nummod, obj, obl, obl:smod, obl:tmod, orphan, parataxis, punct, reparandum, root, vocative, xcomp, xcomp:pred
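Inventories like the POS tag, feature and relation lists above can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, assuming the standard ten-column CoNLL-U layout; the function names and the one-sentence sample are made up for illustration, and the sample's annotation is not taken from the treebank and may not match its conventions exactly.

```python
from collections import Counter

# Hand-made illustrative sentence (NOT taken from ARCOSG); columns are
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = """\
# text = Tha mi sgìth .
1\tTha\tbi\tVERB\t_\tMood=Ind|Tense=Pres\t0\troot\t_\t_
2\tmi\tmi\tPRON\t_\tNumber=Sing|Person=1\t1\tnsubj\t_\t_
3\tsgìth\tsgìth\tADJ\t_\tDegree=Pos\t1\txcomp:pred\t_\t_
4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def iter_words(conllu_text):
    """Yield the ten columns of every syntactic word line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        # Multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        # have non-integer IDs and are skipped here.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols


def inventory(conllu_text):
    """Count UPOS tags, feature names and relation labels."""
    upos, feats, rels = Counter(), Counter(), Counter()
    for cols in iter_words(conllu_text):
        upos[cols[3]] += 1
        rels[cols[7]] += 1
        if cols[5] != "_":
            for pair in cols[5].split("|"):
                feats[pair.split("=", 1)[0]] += 1
    return upos, feats, rels


upos, feats, rels = inventory(SAMPLE)
print(sorted(upos), sorted(feats), sorted(rels))
```

To run this over a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `inventory`; the path is whatever the local copy of the repository uses.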

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
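The restriction described above (verb parent, noun or pronoun child) can be sketched as a filter over CoNLL-U data. This is an illustrative sketch only, not the script that generates the official statistics: the function names are hypothetical, the sample sentence is hand-made rather than drawn from the treebank, and whether PROPN should count as a noun is left as a parameter because the text does not say.

```python
from collections import Counter

# Hand-made illustrative sentence (NOT taken from ARCOSG).
SAMPLE = """\
# text = Chunnaic mi an cù .
1\tChunnaic\tfaic\tVERB\t_\tTense=Past\t0\troot\t_\t_
2\tmi\tmi\tPRON\t_\tNumber=Sing|Person=1\t1\tnsubj\t_\t_
3\tan\tan\tDET\t_\tDefinite=Def\t4\tdet\t_\t_
4\tcù\tcù\tNOUN\t_\tNumber=Sing\t1\tobj\t_\t_
5\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def sentences(conllu_text):
    """Yield each sentence as a list of ten-column word rows."""
    sent = []
    for line in conllu_text.splitlines():
        if not line.strip():  # a blank line ends the sentence
            if sent:
                yield sent
                sent = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep plain words only; skip ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
    if sent:
        yield sent


def verb_nominal_relations(conllu_text, child_tags=("NOUN", "PRON")):
    """Count relation labels whose parent is a VERB and whose child
    carries one of child_tags (add "PROPN" to include proper nouns)."""
    counts = Counter()
    for sent in sentences(conllu_text):
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            head, deprel = cols[6], cols[7]
            if cols[3] in child_tags and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
    return counts


print(verb_nominal_relations(SAMPLE))
```

The head lookup is done per sentence because CoNLL-U word IDs restart at 1 in every sentence; the root word has head 0, which never matches a word ID and so is excluded automatically.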

Relations Overview