
This page pertains to UD version 2.

UD Catalan AnCora

Language: Catalan (code: ca)
Family: Indo-European, Romance

This treebank has been part of Universal Dependencies since the UD v1.3 release.

The following people have contributed to making this treebank part of UD: Héctor Martínez Alonso, Elena Pascual, Daniel Zeman.

Repository: UD_Catalan-AnCora
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.10

License: CC BY 4.0

Genre: news

Questions, comments? General annotation questions (either Catalan-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [zeman (æt) ufal • mff • cuni • cz]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation sources:
Lemmas: annotated manually in non-UD style, automatically converted to UD
UPOS: annotated manually in non-UD style, automatically converted to UD
XPOS: not available
Features: annotated manually in non-UD style, automatically converted to UD
Relations: annotated manually in non-UD style, automatically converted to UD
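
The annotation layers listed above correspond to columns of the CoNLL-U format in which UD treebanks are distributed (LEMMA, UPOS, XPOS, FEATS, DEPREL). The following is a minimal sketch of reading those columns with plain Python. The token line is a hypothetical hand-written example, not a line copied from the treebank, and the helper names parse_token_line and parse_feats are illustrative only.

```python
# Column names defined by the CoNLL-U format, in order.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def parse_token_line(line):
    """Split one tab-separated CoNLL-U token line into a dict keyed by column name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(CONLLU_FIELDS):
        raise ValueError(f"expected {len(CONLLU_FIELDS)} columns, got {len(values)}")
    return dict(zip(CONLLU_FIELDS, values))


def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Fem|Number=Sing' into a dict; '_' means empty."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical example line, written by hand for illustration (not from the treebank).
example = "2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_"
token = parse_token_line(example)
features = parse_feats(token["feats"])
```

In such a line, an unavailable layer like XPOS is represented by an underscore in the fifth column.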


Catalan data from the AnCora corpus.

The original annotation was done in a constituency framework as a part of the AnCora project at the University of Barcelona. It was converted to dependencies and used in the CoNLL 2009 shared task. The CoNLL 2009 version was later converted to HamleDT and to Universal Dependencies.

The license is inherited from the original dataset, downloaded from the AnCora website. Any license-related questions must be directed to the original data providers at the University of Barcelona (that is, not to the UD contact address listed at the end of this README file).


The following paper must be cited when using this corpus:

In addition, the following paper must be cited if coreference information (attributes entity, coreftype, corefsubtype, homophoricDD or entityref) is used:

Additionally, the following paper must be cited when argumental attributes in “sn” or “grup.nom” (attributes func, arg, tem or lexicalid) are used:

Statistics of UD Catalan AnCora

POS Tags






Tokenization and Word Segmentation



Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features


Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
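
The restriction stated above can be sketched as a filter over CoNLL-U token lines: keep only edges whose parent is tagged VERB and whose child is a noun or pronoun, then count the relation labels. This is an illustrative sketch, not the script used to produce the statistics. Treating NOUN, PROPN and PRON as "nouns or pronouns" is an assumption, the function name verb_nominal_relations is hypothetical, and the sentence is hand-written rather than taken from the treebank.

```python
from collections import Counter


def verb_nominal_relations(sentence_lines,
                           child_upos=frozenset({"NOUN", "PROPN", "PRON"})):
    """Count DEPREL labels on edges whose parent is a VERB and whose child is nominal.

    sentence_lines: CoNLL-U lines of one sentence (tab-separated, 10 columns).
    child_upos: UPOS tags treated as 'nouns or pronouns' (an assumption of this sketch).
    """
    tokens = {}
    for line in sentence_lines:
        cols = line.split("\t")
        # Skip comments, multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
            continue
        tokens[cols[0]] = {"upos": cols[3], "head": cols[6], "deprel": cols[7]}

    counts = Counter()
    for tok in tokens.values():
        parent = tokens.get(tok["head"])  # None for the root, whose head is 0
        if parent and parent["upos"] == "VERB" and tok["upos"] in child_upos:
            counts[tok["deprel"]] += 1
    return counts


# Hypothetical hand-written sentence for illustration: "La Maria llegeix llibres."
sentence = [
    "# text = La Maria llegeix llibres.",
    "1\tLa\tel\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tMaria\tMaria\tPROPN\t_\t_\t3\tnsubj\t_\t_",
    "3\tllegeix\tllegir\tVERB\t_\t_\t0\troot\t_\t_",
    "4\tllibres\tllibre\tNOUN\t_\t_\t3\tobj\t_\t_",
    "5\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
]
relation_counts = verb_nominal_relations(sentence)
```

Only the subject and object edges survive the filter here: the determiner attaches to a proper noun rather than a verb, and the punctuation mark is not a nominal child.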

Reflexive Passive

Verbs with Reflexive Core Objects

Relations Overview