
This page pertains to UD version 2.

UD Middle Armenian ArmTDP

Language: Middle Armenian (code: axm)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.18 release.

The following people have contributed to making this treebank part of UD: Anna S. Danielyan, Marat M. Yavrumyan.

Repository: UD_Middle_Armenian-ArmTDP
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-SA 4.0

Genre: legal, medical

Questions, comments? General annotation questions (either Middle Armenian-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [adanielyan (æt) ysu • am]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

A Universal Dependencies treebank for Middle Armenian, originally developed for UD by the ArmTDP team led by Marat M. Yavrumyan at Yerevan State University.

The UD_Middle_Armenian-ArmTDP treebank is derived from the Middle Armenian component of ArmTDP v3.0 (Հայերենի ծառադարան), a comprehensive corpus of the Armenian language across various genres. The dataset was annotated manually by the ArmTDP team in strict adherence to the Universal Dependencies (UD) guidelines. The processing pipeline, including tokenization and POS tagging, combined glossary-based automation with rigorous manual revision. As the only manually verified corpus of Middle Armenian, it provides exhaustive morphological and syntactic annotation, with a complete dependency tree for every sentence.

Acknowledgments

This work was supported by the Higher Education and Science Committee of the Ministry of Education, Science, Culture and Sports of the Republic of Armenia (Research Project № 27TARGET-6B173). The main contributor, Anna S. Danielyan, was involved in COST Action CA21167 — Universality, Diversity and Idiosyncrasy in Language Technology (UniDive).

References

This treebank can also be referenced:

@misc{UD_Middle_Armenian-ArmTDP,
title={{UD_Middle_Armenian-ArmTDP}: Universal Dependencies for Middle Armenian},
url={https://github.com/UniversalDependencies/UD_Middle_Armenian-ArmTDP},
author={
Anna S. Danielyan and Marat M. Yavrumyan
},
year={2026},
}

Format

UD_Middle_Armenian-ArmTDP data conforms to CoNLL-U format with the following specifics:

Statistics of UD Middle Armenian ArmTDP

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB

Features

AdpType, Animacy, Aspect, Case, Definite, Degree, Deixis, Deixis[psor], ExtPos, Mood, NameType, Number, NumForm, NumType, Person, Polarity, PronType, Reflex, Style, Subcat, Tense, Typo, VerbForm, Voice
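
In CoNLL-U files, the features listed above appear in the FEATS column as a `|`-separated list of `Name=Value` pairs, with `_` marking an empty column. The following is a minimal sketch of reading such a string in Python. The function name `parse_feats` is hypothetical, and the sample feature string is made up for illustration rather than taken from the treebank.

```python
def parse_feats(feats: str) -> dict[str, list[str]]:
    """Parse a CoNLL-U FEATS string into {feature name: [values]}."""
    if feats == "_":  # "_" marks an empty FEATS column
        return {}
    parsed = {}
    for pair in feats.split("|"):
        # Layered features such as Deixis[psor] keep their bracketed layer
        # as part of the feature name.
        name, _, value = pair.partition("=")
        parsed[name] = value.split(",")  # multiple values are comma-separated
    return parsed


# Illustrative string only, not taken from the treebank data.
example = parse_feats("Case=Nom|Deixis[psor]=Prox|Number=Sing")
```

Keeping each value as a list means multi-valued features can be handled the same way as single-valued ones.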

Relations

acl, acl:relcl, advcl, advcl:relcl, advmod, advmod:emph, amod, appos, aux, aux:caus, case, cc, ccomp, compound:lvc, compound:redup, conj, cop, csubj, csubj:outer, det, det:poss, discourse, dislocated, fixed, iobj, mark, nmod, nmod:npmod, nmod:poss, nsubj, nsubj:outer, nsubj:pass, nummod, obj, obl, orphan, parataxis, punct, root, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
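
The restriction above can be reproduced on any CoNLL-U file. The sketch below counts dependency relations on edges whose parent is tagged VERB and whose child is tagged NOUN or PRON. The function name `verb_argument_relations` and the placeholder sentence are hypothetical. Whether proper nouns (PROPN) should count as nouns here is an assumption left to the caller through the `child_upos` parameter.

```python
from collections import Counter


def verb_argument_relations(conllu_text, child_upos=("NOUN", "PRON")):
    """Count DEPREL labels on edges with a VERB parent and a nominal child."""
    counts = Counter()
    for block in conllu_text.strip().split("\n\n"):  # blank line separates sentences
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]  # drop comment lines
        # Keep basic words only: skip multiword ranges (1-2) and empty nodes (1.1).
        words = {r[0]: r for r in rows if r[0].isdigit()}
        for r in words.values():
            parent = words.get(r[6])  # HEAD column; "0" (root) has no parent row
            if parent and parent[3] == "VERB" and r[3] in child_upos:
                counts[r[7]] += 1  # DEPREL column
    return counts


# Placeholder sentence with made-up tokens, not taken from the treebank.
SAMPLE = (
    "# text = placeholder sentence\n"
    "1\tw1\tw1\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tw2\tw2\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tw3\tw3\tADJ\t_\t_\t4\tamod\t_\t_\n"
    "4\tw4\tw4\tNOUN\t_\t_\t2\tobj\t_\t_\n"
    "5\tw5\tw5\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)
sample_counts = verb_argument_relations(SAMPLE)
```

On the placeholder sentence, only the `nsubj` and `obj` edges qualify: the `amod` edge has a NOUN parent, and the `punct` edge has a PUNCT child.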

Relations Overview