This page pertains to UD version 2.

UD Pashto Prince

Language: Pashto (code: ps)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.18 release.

The following people have contributed to making this treebank part of UD: Salwan Aziz, Luigi Talamo, Annemarie Verkerk.

Repository: UD_Pashto-Prince
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-SA 4.0

Genre: fiction, government

Questions, comments? General annotation questions (either Pashto-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [annemarie • verkerk (æt) uni-saarland • de]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

The UD Pashto-Prince treebank contains 64 manually annotated Pashto sentences from two textual sources: 50 sentences from a Pashto version of Le Petit Prince, manually adapted into Northern Pashto, and 14 sentences from a Pashto prose text on Pashtun leadership. All sentences are annotated natively according to the Universal Dependencies guidelines.

The Pashto-Prince treebank is a manually annotated Universal Dependencies (UD) treebank for Pashto. It consists of a total of 64 sentences drawn from two sources:

50 sentences from Le Petit Prince, taken from an online Pashto version. Because that text reflects Afghan (Southern) Pashto, the sentences were manually rewritten and adapted into Northern Pashto before annotation to reflect dialectal differences in morphology, lexicon, and syntax.

14 sentences from a Pashto prose text titled Silent Pashtun Leadership, sourced from an online PDF publication.

All sentences were manually annotated for lemmas, universal part-of-speech tags (UPOS), morphological features, and dependency relations following the Universal Dependencies v2 guidelines. The annotations were performed directly in Pashto without automatic pre-annotation.
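The annotation layers listed above correspond to columns of the CoNLL-U format used by UD treebanks. The following is a minimal sketch of how those columns could be read and tallied, not the treebank's own tooling. The sample sentence is an invented English toy example rather than data from this treebank, and the names `SAMPLE`, `FIELDS` and `read_conllu` are assumptions made for illustration.

```python
from collections import Counter

# Toy CoNLL-U fragment (English, invented for illustration; NOT from the treebank).
SAMPLE = (
    "# sent_id = toy-1\n"
    "# text = She reads books.\n"
    "1\tShe\tshe\tPRON\t_\tCase=Nom|Number=Sing|Person=3|PronType=Prs\t2\tnsubj\t_\t_\n"
    "2\treads\tread\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_\n"
    "3\tbooks\tbook\tNOUN\t_\tNumber=Plur\t2\tobj\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

# The ten CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def read_conllu(text):
    """Return a list of sentences; each sentence is a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as sent_id or text
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        tok = dict(zip(FIELDS, cols))
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in tok["id"] or "." in tok["id"]:
            continue
        current.append(tok)
    if current:
        sentences.append(current)
    return sentences


sentences = read_conllu(SAMPLE)
upos_counts = Counter(tok["upos"] for sent in sentences for tok in sent)
# With XPOS not available, the XPOS column holds only the placeholder "_".
xpos_values = {tok["xpos"] for sent in sentences for tok in sent}
```

To apply the same function to a real file, pass it the file's contents, for example `read_conllu(open(path, encoding="utf-8").read())`, where `path` points to a local copy of a CoNLL-U file.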

Acknowledgments

We thank Salwan Aziz for the manual translation and adaptation of the Le Petit Prince sentences into Northern Pashto and for carrying out the complete manual annotation of all sentences in the treebank. We also thank the course instructors and supervisors for their guidance and feedback on Universal Dependencies annotation standards.

References

de Saint-Exupéry, A. (1943). Le Petit Prince. Pashto version available at: https://pashtogaheez.com/books/637

Silent Pashtun Leadership. Source: https://www.pashtoonkhwa.com/?cnt=3037&page=pashtoonkhwa

Universal Dependencies Consortium. (2024). Universal Dependencies v2. https://universaldependencies.org

Statistics of UD Pashto Prince

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB

Features

Aspect, Case, Deixis, Mood, Number, NumType, Person, Polarity, Poss, PronType, Reflex, Tense, VerbForm

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, compound, compound:lvc, compound:prt, conj, cop, det, det:poss, discourse, expl, fixed, iobj, mark, nmod, nsubj, nsubj:pass, nummod, obj, obl, obl:agent, obl:arg, parataxis, punct, root, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
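The restriction to verb parents and noun or pronoun children can be sketched as a simple filter over dependency edges. This is a hedged illustration only: the sentence is an invented English toy example, the function name `verb_nominal_relations` is hypothetical, and treating "nouns or pronouns" as the UPOS tags NOUN and PRON (without PROPN) is an assumption that can be changed through the `child_tags` parameter.

```python
from collections import Counter

# Toy sentence as (id, form, upos, head, deprel); invented for illustration.
TOY = [
    (1, "She", "PRON", 2, "nsubj"),
    (2, "reads", "VERB", 0, "root"),
    (3, "books", "NOUN", 2, "obj"),
    (4, "in", "ADP", 6, "case"),
    (5, "the", "DET", 6, "det"),
    (6, "garden", "NOUN", 2, "obl"),
    (7, ".", "PUNCT", 2, "punct"),
]


def verb_nominal_relations(sentence, child_tags=frozenset({"NOUN", "PRON"})):
    """Count relation labels on edges whose parent is a VERB and whose
    child carries one of the tags in child_tags."""
    upos_by_id = {tok_id: upos for tok_id, _, upos, _, _ in sentence}
    counts = Counter()
    for _, _, upos, head, deprel in sentence:
        # head 0 is the artificial root and has no UPOS, so .get returns None.
        if upos in child_tags and upos_by_id.get(head) == "VERB":
            counts[deprel] += 1
    return counts


relation_counts = verb_nominal_relations(TOY)
```

In the toy sentence, only the edges for "She", "books" and "garden" pass the filter. The function words and punctuation are excluded because their own tags are not nominal.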

Verbs with Reflexive Core Objects

Relations Overview