
This page pertains to UD version 2.

UD Irish IDT

Language: Irish (code: ga)
Family: Indo-European, Celtic

This treebank has been part of Universal Dependencies since the UD v1.0 release.

The following people have contributed to making this treebank part of UD: Teresa Lynn, Jennifer Foster, Sarah McGuinness, Abigail Walsh, Jason Phelan.

Repository: UD_Irish-IDT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.5

License: CC BY-SA 3.0

Genre: news, fiction, web, legal

Questions, comments? General annotation questions (either Irish-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on Github. If you want to collaborate, please contact [teresa • lynn (æt) adaptcentre • ie; jennifer • foster (æt) dcu • ie]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas: annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
UPOS: annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
XPOS: assigned by a program, with some manual corrections, but not a full manual verification
Features: annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
Relations: annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion

Description

A Universal Dependencies 1763-sentence treebank for modern Irish.

The Irish UD Treebank is a conversion of the Irish Dependency Treebank (IDT), which was part of a PhD research project by Teresa Lynn at Dublin City University, Ireland (Lynn, 2016).

The IDT data has been released on GitHub (https://github.com/tlynn747/IrishDependencyTreebank). The treebank contains 1020 sentences taken from the New Corpus of Ireland-Irish (NCII), with text from books, newswire, websites and other media. These sentences are a subset of a gold-standard POS-tagged corpus for Irish.

The conversion from the IDT annotation scheme to the UD annotation scheme was designed by Teresa Lynn and Jennifer Foster at Dublin City University, Ireland. The mapping to UD is reported in Lynn et al. (2016).

The UD Treebank is split into two sets as follows:

Note: the 451 dev trees were taken from the set of newly annotated trees. The rest of the newly annotated trees have been added to the training set.

Acknowledgments

We wish to thank all of the contributors to the original IDT annotation, including Elaine Uí Dhonnchadha for her gold POS-tagged corpus and linguistic advice. We would also like to acknowledge the linguistic advice offered by Kevin Scannell during the conversion to UD.

Expansion of the IUDT from 2019 to 2021 is funded by the Irish Government Department of Culture, Heritage and the Gaeltacht.

This research is partially supported by Science Foundation Ireland through the ADAPT Centre for Digital Content Technology. The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

Statistics of UD Irish IDT

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X

Features

Abbr, Case, Definite, Degree, Dialect, Foreign, Form, Gender, Mood, NounType, Number, NumType, PartType, Person, Polarity, Poss, PrepForm, PronType, Reflex, Tense, VerbForm, Voice

Relations

acl:relcl, advcl, advmod, amod, appos, case, case:voc, cc, ccomp, compound, compound:prt, conj, cop, csubj:cleft, csubj:cop, det, discourse, fixed, flat, flat:foreign, flat:name, list, mark, mark:prt, nmod, nmod:poss, nsubj, nummod, obj, obl, obl:prep, obl:tmod, parataxis, punct, root, vocative, xcomp, xcomp:pred

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
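
The restriction to verb parents and noun or pronoun children can be illustrated with a minimal sketch that counts relation labels in CoNLL-U data. Everything named here is an assumption made for illustration: the function names, the invented one-sentence sample (not taken from the treebank), and the choice of `NOUN` and `PRON` as the child tags, since this page does not say whether `PROPN` is also counted. It is not the script used to produce the official statistics.

```python
from collections import Counter

# Invented illustrative sentence in CoNLL-U format (10 tab-separated columns).
# It is NOT taken from UD_Irish-IDT.
SAMPLE = """\
# text = Chonaic sí an madra.
1\tChonaic\tfeic\tVERB\t_\t_\t0\troot\t_\t_
2\tsí\tsí\tPRON\t_\t_\t1\tnsubj\t_\t_
3\tan\tan\tDET\t_\t_\t4\tdet\t_\t_
4\tmadra\tmadra\tNOUN\t_\t_\t1\tobj\t_\t_
5\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_

"""


def read_sentences(text):
    """Yield each sentence as a list of token rows (lists of fields)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if rows:
                yield rows
                rows = []
        elif not line.startswith("#"):
            fields = line.split("\t")
            # Keep plain word lines only; skip multiword-token ranges
            # such as "1-2" and empty nodes such as "1.1".
            if fields[0].isdigit():
                rows.append(fields)
    if rows:
        yield rows


def verb_nominal_relations(text, child_upos=("NOUN", "PRON")):
    """Count relation labels where the parent is a VERB and the child
    has one of the given UPOS tags (assumed set; adjust as needed)."""
    counts = Counter()
    for sentence in read_sentences(text):
        upos_by_id = {row[0]: row[3] for row in sentence}
        for row in sentence:
            upos, head, deprel = row[3], row[6], row[7]
            if upos in child_upos and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
    return counts


if __name__ == "__main__":
    for relation, count in sorted(verb_nominal_relations(SAMPLE).items()):
        print(relation, count)
```

To run the same count on a real file, the contents of a downloaded `.conllu` file can be read into a string and passed in place of `SAMPLE`.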

Relations Overview