This page pertains to UD version 2.

UD Italian ISDT

Language: Italian (code: it)
Family: Indo-European, Romance

This treebank has been part of Universal Dependencies since the UD v1.0 release.

The following people have contributed to making this treebank part of UD: Cristina Bosco, Alessandro Lenci, Simonetta Montemagni, Maria Simi.

Repository: UD_Italian-ISDT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.13

License: CC BY-NC-SA 3.0

Genre: legal, news, wiki

Questions, comments? General annotation questions (either Italian-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [simi (æt) di • unipi • it]. Development of the treebank happens outside the UD repository, so bugs must be fixed in either the original data source or the conversion procedure. Do not submit pull requests against the UD repository.

Annotation	Source
Lemmas	annotated manually in non-UD style, automatically converted to UD
UPOS	annotated manually in non-UD style, automatically converted to UD
XPOS	annotated manually
Features	annotated manually in non-UD style, automatically converted to UD
Relations	annotated manually in non-UD style, automatically converted to UD

Description

The Italian corpus annotated according to the UD annotation scheme was obtained by conversion from ISDT (Italian Stanford Dependency Treebank), released for the dependency parsing shared task of Evalita-2014 (Bosco et al. 2014).

ISDT is a resource annotated according to the Stanford dependencies scheme (de Marneffe et al. 2008, 2013a, 2013b, 2014). It was obtained through a semi-automatic conversion process starting from MIDT (the Merged Italian Dependency Treebank). MIDT, in turn, resulted from an earlier effort to improve the interoperability of the data sets available for Italian by harmonizing and merging two existing dependency-based resources that differ in both corpus composition and annotation scheme, namely:

TUT, the Turin University Treebank (Bosco et al. 2000);
ISST-TANL, first released as ISST-CoNLL for the CoNLL-2007 shared task (Montemagni, Simi 2007), which was developed as a joint effort by the Istituto di Linguistica Computazionale (ILC–CNR) and the University of Pisa and originates from the Italian Syntactic–Semantic Treebank (ISST, Montemagni et al. 2003).

The harmonization and conversion process leading to MIDT is discussed in detail in (Bosco, Montemagni, Simi, 2012). The Stanford annotation, obtained from an enriched version of MIDT, was adapted to the specific features of the Italian language; see (Bosco, Montemagni, Simi, 2013 and 2014) for a discussion.

Acknowledgments

We wish to thank all of the contributors to the original annotation efforts, as well as the supporting organizations, i.e. the Institute for Computational Linguistics “A. Zampolli”, the University of Pisa, and the University of Torino. Thanks also go to Chiara Alzetta and Giulia Venturi for defining the error detection methodology and for manually revising and correcting the automatically identified errors in Version 2.1.

Statistics of UD Italian ISDT

POS Tags

ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – SYM – VERB – X

Features

Clitic – Definite – Degree – Foreign – Gender – Mood – Number – NumType – Person – Polarity – Poss – PronType – Tense – Typo – VerbForm

Relations

acl – acl:relcl – advcl – advmod – amod – appos – aux – aux:pass – case – cc – ccomp – compound – conj – cop – csubj – csubj:pass – dep – det – det:poss – det:predet – discourse – dislocated – expl – expl:impers – expl:pass – fixed – flat – flat:foreign – flat:name – goeswith – iobj – mark – nmod – nsubj – nsubj:pass – nummod – obj – obl – obl:agent – orphan – parataxis – punct – root – vocative – xcomp

Tokenization and Word Segmentation

This corpus contains 14167 sentences, 278423 tokens and 298337 syntactic words.

This corpus contains 39870 tokens (14%) that are not followed by a space.
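
The counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function name `count_units`, the helper `make_conllu`, and the two-sentence sample (including its tags) are invented for illustration. It assumes the standard CoNLL-U conventions: ten tab-separated columns, multi-word tokens on range lines such as `2-3`, and `SpaceAfter=No` in the MISC column.

```python
def make_conllu(rows):
    """Turn space-separated demo rows into tab-separated CoNLL-U text."""
    lines = [r if r.startswith("#") else "\t".join(r.split(" ")) for r in rows]
    return "\n".join(lines) + "\n"


# Invented sample, not taken from the treebank.
SAMPLE = make_conllu([
    "# text = Vado al mare.",
    "1 Vado andare VERB V Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin 0 root _ _",
    "2-3 al _ _ _ _ _ _ _ _",
    "2 a a ADP E _ 4 case _ _",
    "3 il il DET RD Definite=Def|Gender=Masc|Number=Sing|PronType=Art 4 det _ _",
    "4 mare mare NOUN S Gender=Masc|Number=Sing 1 obl _ SpaceAfter=No",
    "5 . . PUNCT FS _ 1 punct _ _",
    "",
    "# text = Sì.",
    "1 Sì sì INTJ I Polarity=Pos 0 root _ SpaceAfter=No",
    "2 . . PUNCT FS _ 1 punct _ _",
    "",
])


def count_units(conllu):
    """Count sentences, surface tokens, syntactic words and no-space tokens."""
    sentences = tokens = words = no_space = 0
    in_sentence = False
    mwt_end = 0  # last word ID covered by the current multi-word token
    for line in conllu.splitlines():
        if not line.strip():          # a blank line closes the sentence
            in_sentence = False
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
            mwt_end = 0
        cols = line.split("\t")
        tok_id, misc = cols[0], cols[9]
        if "." in tok_id:             # empty node: neither token nor word
            continue
        has_no_space = "SpaceAfter=No" in misc.split("|")
        if "-" in tok_id:             # multi-word token range, e.g. 2-3
            mwt_end = int(tok_id.split("-")[1])
            tokens += 1
            no_space += has_no_space
        else:
            words += 1
            if int(tok_id) > mwt_end:  # word that is a token on its own
                tokens += 1
                no_space += has_no_space
    return {"sentences": sentences, "tokens": tokens,
            "words": words, "no_space_after": no_space}


print(count_units(SAMPLE))
# {'sentences': 2, 'tokens': 6, 'words': 7, 'no_space_after': 2}
```

In the sample, `al` counts as one token but two syntactic words, which is why the word count exceeds the token count.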

This corpus does not contain words with spaces.

This corpus contains 209 types of words that contain both letters and punctuation. Examples: l', d', un', l’, art., c', quest', cos', d’, po', v., quest’, n., e', s', dov', 's, l., c’, un’, anch', att., quell', check-up, S., Sant', e-mail, tutt', ss., R.E.M., cinquant', es., Cost., F., artt., ecc., n', trent', com', dell', distr., s.p.a., sett., vent', Civ., Cod., H., Proc., T., W.

This corpus contains 19888 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
There are 810 types of multi-word tokens. Examples: del, della, dei, al, nel, dell', delle, alla, dal, all', nella, degli, dalla, ai, alle, nei, sul, nell', nelle, sulla, dall', dalle, dello, dai, negli, sulle, dell’, sull', agli, sui, dagli, allo, nello, col, sugli, dallo, all’, sullo, farsi, farlo, esserci, essersi, nell’, dall’, farne, coi, muoversi, osservarsi, renderlo, servirsi.
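
The tokenization figures reported in this section can be cross-checked with a little arithmetic. The sketch below assumes that every multi-word token counts as one token and that its parts count as syntactic words, so that the difference between the word and token counts comes only from multi-word tokens; the function name `mean_words_per_mwt` is made up for this example.

```python
def mean_words_per_mwt(tokens, words, mwt_count):
    """Average number of syntactic words per multi-word token.

    A multi-word token made of n words adds 1 to the token count and n to
    the word count, so words - tokens is the sum of (n - 1) over all
    multi-word tokens.
    """
    extra_words = words - tokens
    return (extra_words + mwt_count) / mwt_count


# Figures quoted in this section.
TOKENS, WORDS, MWTS, NO_SPACE = 278423, 298337, 19888, 39870

print(f"{mean_words_per_mwt(TOKENS, WORDS, MWTS):.2f}")  # 2.00
print(f"{NO_SPACE / TOKENS:.0%}")                        # 14%
```

Under these assumptions the multi-word tokens cover 39802 words in total, 26 more than exactly two per token, which would be consistent with a small number of multi-word tokens that split into more than two words.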

Morphology

Nominal Features

Gender

Fem
- ADJ: prima, italiana, altra, altre, stessa, seconda, nuova, nuove, economica, alta
- AUX-Part: stata, state, potuta, andata, fatta
- DET: la, le, una, sua, un', questa, sue, queste, tutte, molte
- NOUN: città, parte, persone, legge, società, proprietà, attività, vita, servitù, commissione
- PRON: la, le, quella, quelle, una, questa, essa, esse, altra, lei
- PROPN: hye
- VERB-Fin: prese
- VERB-Part: fatta, stabilite, fatte, vista, dovuta, considerata, costituita, fondata, nata, chiamata

Masc
- ADJ: primo, nuovo, altri, altro, stesso, vero, secondo, terzo, europeo, italiani
- ADP: du
- ADV: pochissimo
- AUX-Part: stato, stati, potuto, dovuto, voluto, andato, fatto, potuti
- DET: il, i, un, gli, lo, suo, questo, tutti, suoi, alcuni
- NOUN: anni, presidente, anno, fondo, diritto, film, stato, proprietario, mondo, caso
- NOUN-Part: partiti, previsto
- PRON: lo, quello, uno, li, questo, gli, lui, tutto, ciò, tutti
- VERB-Fin: chiamati
- VERB-Part: fatto, visto, vinto, avuto, tenuto, detto, nato, dato, messo, ricevuto
- X: mixer

Number

Plur
- ADJ: altri, grandi, seguenti, nazionali, importanti, locali, altre, speciali, internazionali, italiani
- ADP: quali
- AUX-Fin: sono, hanno, possono, erano, siano, devono, abbiamo, possiamo, siamo, vengono
- AUX-Part: stati, state, potuti
- DET: i, le, gli, tutti, suoi, alcuni, quanti, sue, questi, queste
- NOUN: anni, persone, paesi, opere, cittadini, diritti, giorni, membri, donne, condizioni
- NOUN-Part: partiti
- PRON: ci, li, noi, tutti, altri, loro, quelli, quelle, quali, le
- PROPN: hye
- VERB: hanno, sono, fanno, fatti, stabilite, trovano, stabiliti, applicano, partecipano, vivono
- VERB-Fin: hanno, sono, fanno, trovano, applicano, partecipano, vivono, abbiamo, esistono, lavorano
- VERB-Part: fatti, stabilite, stabiliti, fatte, derivanti, chiamati, appartenenti, costituite, posti, previsti

Sing
- ADJ: grande, presente, primo, comune, mondiale, prima, internazionale, nazionale, possibile, sociale
- ADP: quale, du
- ADV: pochissimo
- AUX-Fin: è, ha, può, era, deve, sia, fu, viene, aveva, venne
- AUX-Part: stato, stata, potuto, dovuto, voluto, andato, potuta, andata, fatta, fatto
- DET: il, la, l', un, una, lo, quale, sua, suo, un'
- NOUN: presidente, parte, anno, fondo, diritto, legge, stato, proprietario, mondo, caso
- NOUN-Part: previsto
- PRON: lo, qual, quanto, mi, quale, quello, uno, la, questo, cosa
- VERB-Fin: ha, è, trova, fa, chiama, dice, morì, significa, vede, era
- VERB-Part: fatto, visto, vinto, avuto, tenuto, detto, nato, dato, messo, ricevuto
- X: cultural, state

Definite

Def
- DET: il, la, i, l', le, gli, lo, l’, the, les
- PRON: le

Ind
- DET: un, una, un', uno, un’, A, dei, Une, delle, l'

Degree and Polarity

Degree

Abs
- ADJ: gravissimo, altissimo, altissima, bellissimo, chiarissimo, durissima, giovanissimi, grandissima, gravissimi, lunghissimo
- ADV: benissimo, moltissimo, pochissimo, fortissimo, lontanissimo, malissimo

Cmp
- ADJ: maggiore, maggior, migliore, inferiore, superiore, minore, maggiori, migliori, superiori, miglior

Polarity

Neg
- INTJ: no

Pos
- INTJ: sì

Verbal Features

Mood

Cnd
- AUX-Fin: sarebbe, potrebbe, avrebbe, dovrebbe, dovrebbero, potrebbero, sarebbero, vorrei, avrebbero, dovremmo
- VERB-Fin: bisognerebbe, comporterebbe, consentirebbe, direi, sarebbe, vorrei, avrebbe, sarebbero, farebbe, gradirei

Imp
- AUX-Fin: devi, dovete, sii
- VERB: v., Nomina, Dimmi, Elenca, vedi, Dammi, andate, clicca, ricorda, usa
- VERB-Fin: v., Nomina, Dimmi, Elenca, vedi, Dammi, clicca, ricorda, usa, vai

Ind
- AUX-Fin: è, sono, ha, può, hanno, era, possono, deve, fu, erano
- VERB-Fin: ha, è, hanno, trova, sono, fa, chiama, fanno, dice, morì

Sub
- AUX-Fin: sia, siano, possa, abbia, fosse, venga, avesse, debba, possano, fossero
- VERB-Fin: abbia, sia, faccia, abbiano, veda, siano, facciano, tratti, disponga, permetta

Tense

Fut
- AUX-Fin: sarà, saranno, potrà, dovrà, potranno, dovranno, verrà, sarò, avrà, potremo
- VERB-Fin: sarà, vedrà, avrà, farà, vedremo, andrà, continuerà, diventerà, saranno, avranno

Imp
- AUX-Fin: era, erano, aveva, avevano, fosse, avesse, poteva, potevano, fossero, avevo
- VERB-Fin: era, aveva, chiamava, erano, avevano, faceva, facevano, diceva, lavorava, lavoravano

Past
- AUX-Fin: fu, venne, furono, vennero, potè, Fui, dovette, poterono
- AUX-Part: stato, stata, stati, state, potuto, dovuto, voluto, andato, potuta, andata
- NOUN-Part: partiti, previsto
- VERB-Fin: morì, scrisse, nacque, ebbe, fu, vide, avvenne, divenne, portò, fece
- VERB-Part: fatto, visto, vinto, avuto, tenuto, detto, nato, dato, messo, ricevuto

Pres
- AUX-Fin: è, sono, ha, può, hanno, possono, deve, sia, viene, ho
- VERB-Fin: ha, è, hanno, trova, sono, fa, chiama, fanno, dice, significa
- VERB-Part: concedente, derivanti, appartenenti, concernente, aventi, esistenti, provenienti, concernenti, appartenente, avente

Pronouns, Determiners, Quantifiers

PronType

Art
- DET: il, la, i, l', le, un, gli, una, lo, un'
- PRON: le

Dem
- DET: questo, questa, questi, tale, queste, quest', quel, tali, quest’, quella
- PRON: quello, questo, ciò, quella, quelli, quelle, questa, questi, coloro, queste

Exc
- DET: che

Ind
- DET: ogni, alcuni, qualche, molti, più, qualsiasi, molte, diversi, alcune, alcuna
- PRON: uno, tutto, tutti, altri, una, altro, nessuno, più, molti, nulla

Int
- DET: quale, che, quanti, quante, quali, quanta, quanto, Qual, quel
- PRON: chi, qual, cosa, quanto, cos', che, quale, quanti, Quali, Quante

Neg
- ADV: non, neppure, nemmeno, no, neanche, mica, nè, perniente

Prs
- ADJ: propria
- DET: sua, suo, loro, suoi, sue, proprio, nostra, mio, nostro, nostri
- PRON: si, ci, lo, ne, c', mi, la, li, gli, lui

Rel
- DET: cui, quali
- PRON: che, cui, chi, quale, quanto, quali, dove, chiunque, quando, quanti
- SCONJ: che

Tot
- DET: tutti, tutte, tutto, tutta, entrambi, entrambe, ambedue, tutt', quanti

NumType

Card
- NUM: due, 1, 2, tre, 3, cinque, 4, mila, quattro, 5
- PROPN: 9/11

Ord
- ADJ: primo, prima, secondo, terzo, seconda, primi, prime, ultimi, ultimo, ii
- NUM: I

Range
- NUM: 3/4, 150/300, 2/3

Poss

Yes
- ADJ: propria
- DET: sua, suo, loro, suoi, sue, proprio, nostra, mio, nostro, nostri
- PRON: sua, suo, suoi, proprio, tuo, mia, miei, mio, nostro, tua

Person

1
- AUX-Fin: ho, abbiamo, possiamo, siamo, sono, vorrei, dobbiamo, devo, stiamo, posso
- PRON: ci, mi, noi, io, me, ce, I, m'
- VERB-Fin: credo, abbiamo, vediamo, so, ho, faccio, mettiamo, facciamo, metto, sappiamo

2
- AUX-Fin: puoi, devi, sei, hai, avete, siete, vuoi, volete, Dovevi, abbiate
- PRON: ti, vi, te, tu, voi, ve
- VERB: v., Nomina, Dimmi, vedi, fai, Elenca, hai, ricevi, Dammi, crei
- VERB-Fin: v., Nomina, Dimmi, vedi, fai, Elenca, hai, ricevi, Dammi, crei

3
- AUX-Fin: è, sono, ha, può, hanno, era, possono, deve, sia, fu
- PRON: si, lo, la, li, gli, lui, le, l', loro, se
- VERB-Fin: ha, è, hanno, trova, sono, fa, chiama, fanno, dice, morì

Other Features

Clitic
- Yes
  - PRON: si, ci, lo, ne, c', mi, la, li, gli, le

Foreign
- Yes
  - NOUN: Award
  - PROPN: Les, Nobody, barbares, knows
  - X: de, Illusions, perdues, la, ad, home, the, Come, Damage, Ecce

Typo
- Yes
  - ADJ: 1
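
Feature listings like the ones in this Morphology section could be derived roughly as follows. This is only a sketch under stated assumptions: it assumes that labels such as `VERB-Fin` or `AUX-Part` combine the UPOS tag with the word's `VerbForm` value, and the names `feature_examples` and `make_conllu`, as well as the sample sentences, are invented for illustration.

```python
from collections import Counter, defaultdict


def make_conllu(rows):
    """Turn space-separated demo rows into tab-separated CoNLL-U text."""
    lines = [r if r.startswith("#") else "\t".join(r.split(" ")) for r in rows]
    return "\n".join(lines) + "\n"


# Invented sample, not taken from the treebank.
SAMPLE = make_conllu([
    "# text = Vado al mare.",
    "1 Vado andare VERB V Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin 0 root _ _",
    "2-3 al _ _ _ _ _ _ _ _",
    "2 a a ADP E _ 4 case _ _",
    "3 il il DET RD Definite=Def|Gender=Masc|Number=Sing|PronType=Art 4 det _ _",
    "4 mare mare NOUN S Gender=Masc|Number=Sing 1 obl _ SpaceAfter=No",
    "5 . . PUNCT FS _ 1 punct _ _",
    "",
    "# text = Sì.",
    "1 Sì sì INTJ I Polarity=Pos 0 root _ SpaceAfter=No",
    "2 . . PUNCT FS _ 1 punct _ _",
    "",
])


def feature_examples(conllu, top=10):
    """Map (feature, value, label) to the most frequent word forms."""
    table = defaultdict(Counter)
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():     # skip multi-word ranges and empty nodes
            continue
        form, upos, feats = cols[1], cols[3], cols[5]
        if feats == "_":
            continue
        pairs = dict(p.split("=", 1) for p in feats.split("|"))
        # Assumed labelling: UPOS, plus "-VerbForm" when the word has one.
        label = upos + ("-" + pairs["VerbForm"] if "VerbForm" in pairs else "")
        for name, value in pairs.items():
            if name != "VerbForm":
                table[(name, value, label)][form] += 1
    return {key: [f for f, _ in c.most_common(top)] for key, c in table.items()}


examples = feature_examples(SAMPLE)
print(examples[("Gender", "Masc", "NOUN")])    # ['mare']
print(examples[("Mood", "Ind", "VERB-Fin")])   # ['Vado']
```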

Syntax

Auxiliary Verbs and Copula

This corpus uses 1 lemma as a copula (cop): essere.

This corpus uses 10 lemmas as auxiliaries (aux). Examples: avere, essere, potere, dovere, volere, stare, venire, andare, fare, sapere.
This corpus uses 6 lemmas as passive auxiliaries (aux:pass). Examples: essere, venire, stare, andare, avere, potere.
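
Lemma inventories per relation, as listed above, can be collected with a single pass over the DEPREL and LEMMA columns. The following is a minimal sketch with an invented two-sentence sample; `lemmas_by_relation` and `make_conllu` are hypothetical names, and the FEATS column is left empty because it is not needed here.

```python
from collections import Counter, defaultdict


def make_conllu(rows):
    """Turn space-separated demo rows into tab-separated CoNLL-U text."""
    lines = [r if r.startswith("#") else "\t".join(r.split(" ")) for r in rows]
    return "\n".join(lines) + "\n"


# Invented sample, not taken from the treebank.
SAMPLE = make_conllu([
    "# text = Il libro è stato letto.",
    "1 Il il DET RD _ 2 det _ _",
    "2 libro libro NOUN S _ 5 nsubj:pass _ _",
    "3 è essere AUX VA _ 5 aux _ _",
    "4 stato essere AUX VA _ 5 aux:pass _ _",
    "5 letto leggere VERB V _ 0 root _ SpaceAfter=No",
    "6 . . PUNCT FS _ 5 punct _ _",
    "",
    "# text = Il mare è bello.",
    "1 Il il DET RD _ 2 det _ _",
    "2 mare mare NOUN S _ 4 nsubj _ _",
    "3 è essere AUX V _ 4 cop _ _",
    "4 bello bello ADJ A _ 0 root _ SpaceAfter=No",
    "5 . . PUNCT FS _ 4 punct _ _",
    "",
])


def lemmas_by_relation(conllu, relations=("cop", "aux", "aux:pass")):
    """List the lemmas attached with each relation, most frequent first."""
    found = defaultdict(Counter)
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():     # skip multi-word ranges and empty nodes
            continue
        lemma, deprel = cols[2], cols[7]
        if deprel in relations:
            found[deprel][lemma] += 1
    return {rel: [l for l, _ in found[rel].most_common()] for rel in relations}


print(lemmas_by_relation(SAMPLE))
# {'cop': ['essere'], 'aux': ['essere'], 'aux:pass': ['essere']}
```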

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

nsubj
- VERB-Fin--NOUN (3772)
- VERB-Fin--PRON (1881)
- VERB-Ger--NOUN (46)
- VERB-Ger--PRON (21)
- VERB-Inf--NOUN (611)
- VERB-Inf--PRON (187)
- VERB-Part--NOUN (1299)
- VERB-Part--NOUN-ADP(di) (1)
- VERB-Part--PRON (559)

obj
- VERB-Fin--NOUN (3532)
- VERB-Fin--PRON (778)
- VERB-Ger--NOUN (360)
- VERB-Ger--PRON (53)
- VERB-Inf--NOUN (2879)
- VERB-Inf--NOUN-ADP(in) (1)
- VERB-Inf--PRON (450)
- VERB-Part--NOUN (1334)
- VERB-Part--PRON (317)

iobj
- VERB-Fin--PRON (316)
- VERB-Ger--PRON (26)
- VERB-Inf--PRON (172)
- VERB-Part--PRON (159)
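
Pattern counts such as `VERB-Fin--NOUN` can be approximated by pairing each nominal dependent with its verbal head. The sketch below assumes that the parent label is the UPOS tag plus the `VerbForm` value and that only `NOUN` and `PRON` children of `VERB` parents are counted; it does not reproduce the adposition suffix seen in labels like `NOUN-ADP(di)`. All function names and the sample are invented for illustration.

```python
from collections import Counter


def make_conllu(rows):
    """Turn space-separated demo rows into tab-separated CoNLL-U text."""
    lines = [r if r.startswith("#") else "\t".join(r.split(" ")) for r in rows]
    return "\n".join(lines) + "\n"


# Invented sample, not taken from the treebank.
SAMPLE = make_conllu([
    "# text = La ragazza legge il libro.",
    "1 La il DET RD _ 2 det _ _",
    "2 ragazza ragazza NOUN S Gender=Fem|Number=Sing 3 nsubj _ _",
    "3 legge leggere VERB V Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 0 root _ _",
    "4 il il DET RD _ 5 det _ _",
    "5 libro libro NOUN S Gender=Masc|Number=Sing 3 obj _ SpaceAfter=No",
    "6 . . PUNCT FS _ 3 punct _ _",
    "",
    "# text = Vuole leggerlo.",
    "1 Vuole volere AUX VM Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 2 aux _ _",
    "2-3 leggerlo _ _ _ _ _ _ _ SpaceAfter=No",
    "2 leggere leggere VERB V VerbForm=Inf 0 root _ _",
    "3 lo lo PRON PC Clitic=Yes|Gender=Masc|Number=Sing|Person=3|PronType=Prs 2 obj _ _",
    "4 . . PUNCT FS _ 2 punct _ _",
    "",
])


def read_sentences(conllu):
    """Yield each sentence as a list of column lists (syntactic words only)."""
    sent = []
    for line in conllu.splitlines():
        if not line.strip():
            if sent:
                yield sent
                sent = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():     # skip multi-word ranges and empty nodes
                sent.append(cols)
    if sent:
        yield sent


def verb_label(cols):
    """UPOS plus VerbForm, e.g. VERB-Fin (assumed labelling)."""
    for pair in cols[5].split("|"):
        if pair.startswith("VerbForm="):
            return cols[3] + "-" + pair.split("=", 1)[1]
    return cols[3]


def core_argument_patterns(conllu, relations=("nsubj", "obj", "iobj")):
    """Count (relation, parent--child) patterns for verb parents."""
    counts = Counter()
    for sent in read_sentences(conllu):
        by_id = {cols[0]: cols for cols in sent}
        for cols in sent:
            head = by_id.get(cols[6])  # None for the root (HEAD = 0)
            if head is None or cols[7] not in relations:
                continue
            if head[3] == "VERB" and cols[3] in ("NOUN", "PRON"):
                counts[(cols[7], f"{verb_label(head)}--{cols[3]}")] += 1
    return counts


for (rel, pattern), n in sorted(core_argument_patterns(SAMPLE).items()):
    print(f"{rel}: {pattern} ({n})")
```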

Reflexive Passive

This corpus contains 150 lemmas that occur at least once with an expl:pass child. Examples: applicare si, osservare si, vedere si, fare si, presumere si, registrare si, usare si, aprire si, considerare si, intendere si, misurare si, produrre si, rendere si, trasferire si, effettuare si, esercitare si, pagare si, parlare si, prescrivere si, ripartire si, tradurre si, conservare si, eseguire si, indicare si, istituire si, ottenere si, ricavare si, valutare si, acquistare si, cambiare si, compiere si, comprendere si, computare si, concedere si, costruire si, determinare si, formare si, giocare si, incontrare si, incorporare si, mettere si, operare si, paragonare si, prevedere si, raggiungere si, richiedere si, ricordare si, ripetere si, ritenere si, sostenere si
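
A list of this kind can be gathered by looking up the head of every word attached as `expl:pass` and pairing the head lemma with the child lemma. The following is a sketch only; the function names and the one-sentence sample are invented, and the FEATS column is left empty because it is not used.

```python
from collections import Counter


def make_conllu(rows):
    """Turn space-separated demo rows into tab-separated CoNLL-U text."""
    lines = [r if r.startswith("#") else "\t".join(r.split(" ")) for r in rows]
    return "\n".join(lines) + "\n"


# Invented sample, not taken from the treebank.
SAMPLE = make_conllu([
    "# text = Si applica la legge.",
    "1 Si si PRON PC _ 2 expl:pass _ _",
    "2 applica applicare VERB V _ 0 root _ _",
    "3 la il DET RD _ 4 det _ _",
    "4 legge legge NOUN S _ 2 nsubj:pass _ SpaceAfter=No",
    "5 . . PUNCT FS _ 2 punct _ _",
    "",
])


def read_sentences(conllu):
    """Yield each sentence as a list of column lists (syntactic words only)."""
    sent = []
    for line in conllu.splitlines():
        if not line.strip():
            if sent:
                yield sent
                sent = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():     # skip multi-word ranges and empty nodes
                sent.append(cols)
    if sent:
        yield sent


def heads_with_child_relation(conllu, relation="expl:pass"):
    """Count (head lemma, child lemma) pairs linked by the given relation."""
    pairs = Counter()
    for sent in read_sentences(conllu):
        lemma_of = {cols[0]: cols[2] for cols in sent}
        for cols in sent:
            if cols[7] == relation and cols[6] in lemma_of:
                pairs[(lemma_of[cols[6]], cols[2])] += 1
    return pairs


pairs = heads_with_child_relation(SAMPLE)
head_lemmas = {head for head, _ in pairs}
print(len(head_lemmas), "lemma(s):",
      ", ".join(f"{h} {c}" for h, c in pairs))
# 1 lemma(s): applicare si
```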

Relations Overview

This corpus uses 11 relation subtypes: acl:relcl, aux:pass, csubj:pass, det:poss, det:predet, expl:impers, expl:pass, flat:foreign, flat:name, nsubj:pass, obl:agent
The following 3 relation types are not used in this corpus at all: clf, list, reparandum
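
Both statements can be checked against the relation list given under "Relations" near the top of this page. The sketch below hard-codes that list together with the 37 universal relations of UD v2; the variable names are chosen for this example only.

```python
# The 37 universal dependency relations of UD v2.
UNIVERSAL = """
acl advcl advmod amod appos aux case cc ccomp clf compound conj cop csubj
dep det discourse dislocated expl fixed flat goeswith iobj list mark nmod
nsubj nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split()

# Relations listed for this treebank on this page.
USED = """
acl acl:relcl advcl advmod amod appos aux aux:pass case cc ccomp compound
conj cop csubj csubj:pass dep det det:poss det:predet discourse dislocated
expl expl:impers expl:pass fixed flat flat:foreign flat:name goeswith iobj
mark nmod nsubj nsubj:pass nummod obj obl obl:agent orphan parataxis punct
root vocative xcomp
""".split()

# A subtype is any label of the form universal:extension.
subtypes = sorted(rel for rel in USED if ":" in rel)

# A universal relation counts as used if it occurs bare or with a subtype.
base_used = {rel.split(":")[0] for rel in USED}
unused = sorted(set(UNIVERSAL) - base_used)

print(len(subtypes), subtypes)
print(len(unused), unused)  # 3 ['clf', 'list', 'reparandum']
```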