This page pertains to UD version 2.

UD Italian MarkIT

Language: Italian (code: it)
Family: Indo-European, Romance

This treebank has been part of Universal Dependencies since the UD v2.10 release.

The following people have contributed to making this treebank part of UD: Teresa Paccosi, Alessio Palmero Aprosio, Sara Tonelli.

Repository: UD_Italian-MarkIT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.13

License: CC BY 4.0

Genre: grammar-examples

Questions, comments? General annotation questions (either Italian-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [aprosio (æt) fbk • eu]. Development of the treebank happens outside the UD repository, so bugs must be fixed either in the original data source or in the conversion procedure. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas assigned by a program, with some manual corrections, but not a full manual verification
UPOS assigned by a program, with some manual corrections, but not a full manual verification
XPOS assigned by a program, with some manual corrections, but not a full manual verification
Features assigned by a program, with some manual corrections, but not a full manual verification
Relations annotated manually, natively in UD style

Description

The MarkIT resource contains around 800 sentences extracted from students’ essays and manually annotated with syntactic dependencies. The treebank covers seven types of marked constructions, plus some ambiguous sentences whose syntax can be wrongly classified as marked.

MarkIT is a treebank of marked constructions in Italian, containing around 1,300 sentences with dependency annotation. The sentences were first annotated automatically using Tint, and the errors were then corrected manually across the whole dataset. The resource covers seven types of marked constructions, plus some ambiguous sentences whose syntax can be wrongly classified as marked.

Acknowledgments

The selection, extraction, and annotation of the dataset have been performed by Teresa Paccosi, Alessio Palmero Aprosio, and Sara Tonelli.

Statistics of UD Italian MarkIT

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, VERB, X

Features

Clitic, Definite, Degree, Gender, Mood, Number, NumType, Person, Poss, PronType, Tense, VerbForm

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, compound, conj, cop, csubj, det, det:poss, det:predet, discourse, dislocated, expl, expl:impers, expl:pass, fixed, flat, flat:foreign, flat:name, iobj, mark, nmod, nsubj, nsubj:outer, nsubj:pass, nummod, obj, obl, obl:agent, orphan, parataxis, punct, root, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
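
The restriction above can be illustrated with a minimal Python sketch that counts relations whose parent is a VERB and whose child is a NOUN or PRON in CoNLL-U text. This is not the script used to produce the UD statistics. The function name verb_nominal_relations and the toy sentence are hypothetical, the annotation of the toy sentence is illustrative only and is not taken from MarkIT, and treating only the UPOS tags NOUN and PRON as "nouns or pronouns" is an assumption.

```python
from collections import Counter

# Toy CoNLL-U fragment invented for illustration (not taken from MarkIT).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = (
    "# text = Il libro lo leggo.\n"
    "1\tIl\til\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tlibro\tlibro\tNOUN\t_\t_\t4\tdislocated\t_\t_\n"
    "3\tlo\tlo\tPRON\t_\t_\t4\tobj\t_\t_\n"
    "4\tleggo\tleggere\tVERB\t_\t_\t0\troot\t_\t_\n"
    "5\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_\n"
)


def verb_nominal_relations(conllu_text):
    """Count deprels whose parent is a VERB and whose child is a NOUN or PRON."""
    counts = Counter()
    # Sentences in CoNLL-U are separated by a blank line.
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep plain word lines only: skip multiword tokens (1-2) and empty nodes (1.1).
        words = {row[0]: row for row in rows if row[0].isdigit()}
        for row in words.values():
            head = words.get(row[6])  # None for the root, whose HEAD is 0
            if head and head[3] == "VERB" and row[3] in {"NOUN", "PRON"}:
                counts[row[7]] += 1
    return counts


if __name__ == "__main__":
    for deprel, n in sorted(verb_nominal_relations(SAMPLE).items()):
        print(deprel, n)
```

To apply the same count to the treebank, the contents of a .conllu file from the UD_Italian-MarkIT repository could be read into a string and passed to the function in place of SAMPLE.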

Reflexive Passive

Relations Overview