This page pertains to UD version 2.

UD Italian TWITTIRO

Language: Italian (code: it)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.5 release.

The following people have contributed to making this treebank part of UD: Alessandra T. Cignarella, Cristina Bosco, Manuela Sanguinetti.

Repository: UD_Italian-TWITTIRO
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-SA 4.0

Genre: social

Questions, comments? General annotation questions (either Italian-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [cigna (æt) di • unito • it]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation	Source
Lemmas	annotated manually in non-UD style, automatically converted to UD
UPOS	annotated manually in non-UD style, automatically converted to UD
XPOS	annotated manually
Features	annotated manually in non-UD style, automatically converted to UD
Relations	annotated manually in non-UD style, automatically converted to UD

Description

TWITTIRÒ-UD is a collection of ironic Italian tweets annotated in Universal Dependencies. The treebank can be used to train NLP systems so that they perform better on social media texts, in particular for irony detection.

TWITTIRÒ-UD has been created by enriching a resource originally developed for training and testing irony detection systems, also exploited as a benchmark for the Italian irony detection task held in EVALITA 2018 (Cignarella et al., 2018c). The treebank comprises both the fine-grained annotation for irony applied in Karoui et al. (2017), and the morphological and syntactic information encoded by the UD format.

The original corpus consists of 1,424 tweets (28,387 tokens). The syntactic annotation was produced by alternating automatic scripting with manual revision, followed by some out-of-domain parsing experiments. The parsing results were also manually revised by two independent annotators.

To meet the requirements of the EU General Data Protection Regulation (GDPR), which entered into force in May 2018, the content of the resource has been pseudonymized by substituting the original tweet IDs and user names.

Warning: A total of 527 tweets overlap with PoSTWITA-UD. The overlapping content, however, has been distributed so that it ends up in the same partition in both treebanks.

Statistics of UD Italian TWITTIRO

POS Tags

ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PRON – PROPN – PUNCT – SCONJ – SYM – VERB – X

Features

Clitic – Definite – Degree – Foreign – Gender – Mood – Number – NumType – Person – Polarity – Poss – PronType – Tense – Typo – VerbForm

Relations

acl – acl:relcl – advcl – advmod – amod – appos – aux – aux:pass – case – cc – ccomp – compound – conj – cop – csubj – csubj:pass – dep – det – det:poss – det:predet – discourse – discourse:emo – dislocated – expl – expl:impers – expl:pass – fixed – flat – flat:foreign – flat:name – goeswith – iobj – list – mark – nmod – nsubj – nsubj:outer – nsubj:pass – nummod – obj – obl – obl:agent – orphan – parataxis – parataxis:appos – parataxis:discourse – parataxis:hashtag – parataxis:insert – parataxis:nsubj – parataxis:obj – punct – root – vocative – vocative:mention – xcomp

Tokenization and Word Segmentation

This corpus contains 1424 sentences, 28384 tokens and 29602 syntactic words.
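The three counts differ because a surface token may be a multi-word token that expands to several syntactic words. The following is a minimal sketch of how such counts could be obtained from CoNLL-U text, not the script that generated this page. The function name `count_units` and the one-sentence sample are hypothetical.

```python
def count_units(conllu_text):
    """Return (sentences, tokens, syntactic_words) for CoNLL-U text."""
    sentences = tokens = words = 0
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():          # a blank line closes the sentence
            in_sentence = False
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tok_id = line.split("\t")[0]
        if "-" in tok_id:             # multi-word token range, e.g. "2-3"
            start, end = map(int, tok_id.split("-"))
            # one surface token replaces the words it covers
            tokens += 1 - (end - start + 1)
        elif "." in tok_id:           # empty node, not a word
            continue
        else:                         # ordinary syntactic word
            tokens += 1
            words += 1
    return sentences, tokens, words


# Hypothetical sample sentence: "Vado al mare."
SAMPLE = "\n".join("\t".join(r.split()) for r in [
    "1 Vado andare VERB _ _ 0 root _ _",
    "2-3 al _ _ _ _ _ _ _ _",
    "2 a a ADP _ _ 4 case _ _",
    "3 il il DET _ _ 4 det _ _",
    "4 mare mare NOUN _ _ 1 obl _ SpaceAfter=No",
    "5 . . PUNCT _ _ 1 punct _ _",
]) + "\n"

print(count_units(SAMPLE))
```

In the sample, "al" is one token but two words (a + il), so the sentence has four tokens and five syntactic words.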

This corpus contains 4471 tokens (16%) that are not followed by a space.
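The percentage follows from the counts on this page: 4471 / 28384 is roughly 15.8%, which rounds to 16%. In CoNLL-U, a missing space is recorded as `SpaceAfter=No` in the MISC column. The sketch below assumes that convention and counts surface tokens only, skipping the words inside a multi-word token; `count_no_space_after` and the sample are hypothetical.

```python
def count_no_space_after(conllu_text):
    """Return (tokens_without_following_space, total_tokens)."""
    total = no_space = 0
    covered_until = 0                 # last word ID covered by a range line
    for line in conllu_text.splitlines():
        if not line.strip():
            covered_until = 0         # reset at sentence boundary
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        tok_id, misc = cols[0], cols[9]
        if "." in tok_id:             # empty node
            continue
        if "-" in tok_id:             # multi-word token: counts as one token
            covered_until = int(tok_id.split("-")[1])
        elif int(tok_id) <= covered_until:
            continue                  # word inside a multi-word token
        total += 1
        if "SpaceAfter=No" in misc.split("|"):
            no_space += 1
    return no_space, total


# Hypothetical sample sentence: "Vado al mare."
SAMPLE = "\n".join("\t".join(r.split()) for r in [
    "1 Vado andare VERB _ _ 0 root _ _",
    "2-3 al _ _ _ _ _ _ _ _",
    "2 a a ADP _ _ 4 case _ _",
    "3 il il DET _ _ 4 det _ _",
    "4 mare mare NOUN _ _ 1 obl _ SpaceAfter=No",
    "5 . . PUNCT _ _ 1 punct _ _",
])

print(count_no_space_after(SAMPLE))
print(round(100 * 4471 / 28384))      # share reported on this page
```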

This corpus does not contain words with spaces.

This corpus contains 677 types of words that contain both letters and punctuation. Examples: @user, #labuonascuola, l', #monti, @user1, @user2, c', #renzi, e', #scuola, @user3, http://t.co/oDPUtx2DvV, #Grillo, #governo, http://t.co/oDPUtxkMK3, l’, un', #midaperruolo, d', po', #tfaordinario, https://t.co/oDPUtx2DvV, #manovra, @user4, cit., http://t.co/oDPUtxTqU7, #labuonascuolauncazzo, #m5s, #mario, #passodopopasso, #Quota96Scuola, #giannini, #jobsact, #rimontiamo, #berlusconi, #fullmonti, #ministri, #sapevatelo, #riformascuola, @user5, cos', http://t.co/8REeGqIhCK, quest', #elezioni, #fatepresto, #lascuolaingiusta, #liberalizzazioni, #oramonti, #pd, #postofisso
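Most of these types are mentions, hashtags, URLs and elided forms. One way to reproduce the test is to check Unicode general categories, as sketched below. Treating category `L*` as letters and `P*` as punctuation is an assumption; the statistics script may define the classes differently.

```python
import unicodedata


def has_letter_and_punct(form):
    """True if the word form mixes at least one letter with punctuation."""
    categories = [unicodedata.category(ch) for ch in form]
    has_letter = any(cat.startswith("L") for cat in categories)
    has_punct = any(cat.startswith("P") for cat in categories)
    return has_letter and has_punct


# Forms taken from the examples above, plus two plain counterexamples.
for form in ["@user", "#renzi", "l'", "cit.", "scuola", "..."]:
    print(form, has_letter_and_punct(form))
```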

This corpus contains 1213 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
There are 174 types of multi-word tokens. Examples: del, della, al, alla, nel, dei, ai, dal, delle, dell', all', alle, nella, sul, degli, dalla, sulla, sui, col, sull', agli, dall', nei, sulle, nelle, dai, dalle, negli, allo, nell', farlo, dagli, glielo, dello, dirlo, ditemi, essersi, farci, farla, ricordargli, AMMETTILO, Armiamoci, Avendolo, Cercasi, Chiediamolo, Consolati, Convincetemi, Dicesi, Eccolo, Fatevi.
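The average can be cross-checked against the counts on this page. A multi-word token of n words adds n - 1 words beyond the token count, so the 29602 - 28384 = 1218 extra words spread over 1213 multi-word tokens give a mean of 1 + 1218 / 1213, about 2.004, which is reported as 2.00. Most multi-word tokens are therefore two-word contractions such as del, and a few have three words. The sketch below computes the same statistic from CoNLL-U text; `mwt_stats` and the sample are hypothetical.

```python
def mwt_stats(conllu_text):
    """Return (number_of_multiword_tokens, mean_words_per_multiword_token)."""
    lengths = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        tok_id = line.split("\t")[0]
        if "-" in tok_id:             # range line such as "2-3"
            start, end = map(int, tok_id.split("-"))
            lengths.append(end - start + 1)
    count = len(lengths)
    mean = sum(lengths) / count if count else 0.0
    return count, mean


# Hypothetical sample sentence: "Vado al mare."
SAMPLE = "\n".join("\t".join(r.split()) for r in [
    "1 Vado andare VERB _ _ 0 root _ _",
    "2-3 al _ _ _ _ _ _ _ _",
    "2 a a ADP _ _ 4 case _ _",
    "3 il il DET _ _ 4 det _ _",
    "4 mare mare NOUN _ _ 1 obl _ SpaceAfter=No",
    "5 . . PUNCT _ _ 1 punct _ _",
])

print(mwt_stats(SAMPLE))

# Cross-check using only the counts reported on this page.
page_mean = 1 + (29602 - 28384) / 1213
print(round(page_mean, 2))
```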

Morphology

Nominal Features

Gender

Fem
- ADJ: buona, bella, italiana, pubblica, prima, unica, igienica, nuova, prime, nuove
- AUX-Part: stata
- DET: la, le, una, un', questa, sua, mia, tutte, quella, tua
- NOUN: scuola, riforma, cosa, casa, crisi, vita, foto, volta, cit., fine
- PRON: la, quella, questa, le, lei, quelle, una, altra, mia, tante
- VERB: fatta, letta, interrogata, varata, iniziata, ritrovata, scritta, trovata, @user, Basta
- VERB-Part: fatta, letta, interrogata, varata, iniziata, ritrovata, scritta, trovata, @user, Basta

Masc
- ADJ: nuovo, primo, buon, italiano, bel, caro, giusto, italiani, unico, bello
- ADV: tutti
- AUX-Part: stato, potuto, stati
- DET: il, i, un, gli, lo, suo, tutti, mio, questo, uno
- NOUN: governo, anni, lavoro, anno, italiani, mesi, mondo, tagli, merito, ministro
- PRON: lo, tutti, tutto, li, gli, quello, questo, altro, nessuno, qualcuno
- PROPN: Folletto
- SYM: #cambiaverso
- VERB-Part: fatto, detto, morto, messo, avuto, dato, letto, arrivato, capito, lasciato
- X: mal

Number

Plur
- ADJ: elementari, grande, italiani, civili, giovani, bravi, prime, altri, brevi, brutti
- ADV: tutti
- AUX-Fin: sono, siamo, hanno, saranno, avete, abbiamo, erano, possono, siete, vogliono
- AUX-Part: stati
- DET: i, le, gli, tutti, tutte, nostri, dei, questi, sue, suoi
- NOUN: docenti, anni, insegnanti, italiani, mesi, tagli, giorni, precari, studenti, scuole
- PRON: ci, tutti, c', noi, li, vi, ce, voi, quelli, altri
- VERB-Fin: fanno, hanno, sono, speriamo, fate, dicono, dite, abbiamo, andiamo, aspettiamo
- VERB-Part: assunti, abilitati, morti, riusciti, tolti, @user, Arrestati, Comprese, Eliminati, acquisite

Sing
- ADJ: buona, nuovo, grande, primo, buon, bella, italiana, possibile, pubblica, italiano
- AUX-Fin: è, ha, era, e', ho, sarà, può, deve, sono, sia
- AUX-Part: stata, stato, potuto
- DET: il, la, un, l', una, lo, l’, suo, ogni, un'
- NOUN: governo, scuola, riforma, lavoro, anno, cosa, casa, vita, mondo, volta
- PRON: lo, mi, ti, la, io, tutto, gli, me, lui, quello
- PROPN: Folletto
- SYM: #cambiaverso
- VERB: continua, è, fa, fatto, ha, detto, dice, va, parla, sembra
- VERB-Fin: continua, è, fa, ha, dice, va, parla, sembra, ho, pare
- VERB-Part: fatto, detto, morto, messo, avuto, dato, letto, arrivato, capito, lasciato
- X: mal

Definite

Def
- DET: il, la, i, l', le, gli, lo, l’, l, a1

Ind
- DET: un, una, un', uno, na, 1, n', nà
- PRON: una

Degree and Polarity

Degree

Abs
- ADJ: PERICOLOSISSIMO, bellissimo, fedelissimi, piccolissimo, utilissimo, vivissimi

Cmp
- ADJ: inferiore, inodori, miglior, superiori

Polarity

Neg
- INTJ: No

Verbal Features

Mood

Cnd
- AUX-Fin: sarebbe, dovrebbe, potrebbe, avrebbe, avrei, vorrei, dovrei, potrei, Sareste, avremmo
- VERB-Fin: bisognerebbe, direbbe, direi, farei, sarebbe, servirebbe, vorrebbe, vorrei, #nelmulinochevorrei, Ebbero

Imp
- AUX-Fin: abbiamo, siate, Siamo
- VERB: CONTINUA, speriamo, vai, dite, leggi, scusate, pensa, venga, Aspettiamo, Venite
- VERB-Fin: CONTINUA, speriamo, vai, dite, leggi, scusate, pensa, venga, Aspettiamo, Venite

Ind
- AUX-Fin: è, ha, sono, era, e', hanno, siamo, ho, sarà, può
- VERB-Fin: continua, è, fa, ha, dice, va, sembra, parla, fanno, ho

Sub
- AUX-Fin: sia, fosse, siano, abbia, avessero, fossero, possa, stia, voglia, Possiamo
- VERB-Fin: vada, dica, faccia, abbia, arrivi, fossero, sia, spieghi, trovi, Dipendesse

Tense

Fut
- AUX-Fin: sarà, saranno, verrà, saremo, avrà, dovrà, potranno, potrà, dovranno, sara'
- VERB-Fin: farà, andremo, metterà, saranno, vedremo, arriverà, avrò, capirete, fara', faranno

Imp
- AUX-Fin: era, fosse, erano, aveva, ero, avevo, poteva, voleva, avessero, avevano
- VERB-Fin: era, aveva, bastava, facevo, fossero, pensavo, Dipendesse, ESISTEVA, Stavo, Voleva

Past
- AUX-Fin: fu, voleste
- AUX-Part: stata, stato, potuto, stati
- VERB-Fin: accorsero, appesi, compresi, dichiarò, entrò, farai, formò, morì, recò, scritte
- VERB-Part: fatto, detto, morto, messo, avuto, dato, letto, arrivato, capito, lasciato

Pres
- AUX-Fin: è, ha, sono, e', siamo, hanno, ho, può, deve, sia
- VERB: continua, è, fa, ha, dice, va, parla, sembra, fanno, ho
- VERB-Fin: continua, è, fa, ha, dice, va, parla, sembra, fanno, ho
- VERB-Part: avente

Pronouns, Determiners, Quantifiers

PronType

Art
- DET: il, la, i, un, l', le, una, gli, lo, l’
- PRON: una

Dem
- DET: questa, questo, sto, quel, quest', quella, questi, quelle, sti, qst
- PRON: quello, quella, questo, questa, quelli, quelle, questi, ciò, stesso, Gelli

Exc
- DET: che, Ke, quanta, quanto

Ind
- ADV: tutti
- DET: ogni, qualche, tutto, tutti, nessun, tutta, tutte, tanti, altro, nessuna
- PRON: tutti, tutto, altro, qualcosa, niente, nessuno, qualcuno, uno, nulla, altri

Int
- DET: che, quale, quanta, quanti, ke, quanto
- PRON: cosa, chi, cos', che, quanto, quale, cos, qual', quanti

Neg
- ADV: non, nn, neanche, no, Mica, nemmeno, neppure

Prs
- ADV: proprio
- DET: suo, mio, sua, mia, loro, nostri, nostro, sue, suoi, tua
- PRON: si, ci, lo, mi, c', ti, la, ne, noi, gli

Rel
- PRON: che, chi, cui, quanto, quale, cha, delinque, dove, quali, quanta

Tot
- DET: tutti, tutte, tutto, #celemeritiamotutte, tutta

NumType

Card
- ADJ: 1'
- NUM: due, 3, 2, mila, tre, 1, 12, 5, 7, 10

Ord
- ADJ: primo, prima, prime, primi, terza, 1', 1mo, 3, ennesima, seconda

Poss

Yes
- ADV: proprio
- DET: suo, mio, sua, mia, loro, nostri, nostro, sue, suoi, tua
- PRON: mia, LORO, mio, nostri, suo, tuo

Person

1
- AUX-Fin: siamo, ho, sono, abbiamo, ero, posso, avevo, dobbiamo, stiamo, possiamo
- PRON: ci, mi, c', noi, io, me, ce, miiiii
- VERB-Fin: ho, so, visto, speriamo, vedo, amo, faccio, abbiamo, andiamo, aspettiamo

2
- AUX-Fin: sei, avete, siete, puoi, vuoi, hai, siate, sarai, volete, 6
- PRON: ti, vi, te, voi, tu, ve, TU', t, t'
- VERB: CONTINUA, vai, fate, dite, hai, vuoi, fai, leggi, scusate, andate
- VERB-Fin: CONTINUA, vai, fate, dite, hai, vuoi, fai, leggi, scusate, andate

3
- AUX-Fin: è, ha, sono, era, e', hanno, sarà, può, deve, sia
- PRON: si, lo, la, li, gli, lui, l', se, le, glie
- VERB-Fin: continua, è, fa, ha, dice, va, sembra, parla, fanno, pare

Other Features

Clitic
- Yes
  - PRON: si, ci, lo, mi, c', ti, la, ne, gli, li

Foreign
- Yes
  - X: indoor

Typo
- Yes
  - ADV: in
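The morphology lists above group word forms by feature value and by part of speech, with verbs and auxiliaries further split by VerbForm (for example AUX-Part or VERB-Fin). The following sketch shows one way such a grouping could be built from parsed CoNLL-U rows. It is an illustration under assumptions, not the generator of this page: `parse_feats`, `forms_by_feature` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def parse_feats(feats):
    """Parse a FEATS cell such as 'Gender=Fem|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def forms_by_feature(rows, feature):
    """Map (value, POS label) to a Counter of word forms."""
    table = defaultdict(Counter)
    for cols in rows:
        if not cols[0].isdigit():     # skip ranges and empty nodes
            continue
        feats = parse_feats(cols[5])
        if feature not in feats:
            continue
        label = cols[3]
        if cols[3] in ("VERB", "AUX") and "VerbForm" in feats:
            label += "-" + feats["VerbForm"]
        for value in feats[feature].split(","):
            table[(value, label)][cols[1]] += 1
    return table


# Hypothetical sample: "la scuola è stata chiusa"
ROWS = [r.split() for r in [
    "1 la il DET _ Definite=Def|Gender=Fem|Number=Sing|PronType=Art 2 det _ _",
    "2 scuola scuola NOUN _ Gender=Fem|Number=Sing 5 nsubj:pass _ _",
    "3 è essere AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 5 aux _ _",
    "4 stata essere AUX _ Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part 5 aux:pass _ _",
    "5 chiusa chiudere VERB _ Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part 0 root _ _",
]]

gender = forms_by_feature(ROWS, "Gender")
for key in sorted(gender):
    print(key, dict(gender[key]))
```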

Syntax

Auxiliary Verbs and Copula

This corpus uses 1 lemma as a copula (cop). Example: essere.

This corpus uses 7 lemmas as auxiliaries (aux). Examples: avere, essere, potere, dovere, volere, stare, andare.
This corpus uses 2 lemmas as passive auxiliaries (aux:pass). Examples: essere, venire.
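Inventories like these can be derived by collecting the LEMMA of every word attached with a given relation. The sketch below assumes rows already split into the ten CoNLL-U columns; `lemmas_by_relation` and the sample rows are hypothetical.

```python
from collections import defaultdict


def lemmas_by_relation(rows, relations=("cop", "aux", "aux:pass")):
    """Return {relation: sorted list of lemmas attached with that relation}."""
    found = defaultdict(set)
    for cols in rows:
        if not cols[0].isdigit():     # skip ranges and empty nodes
            continue
        lemma, deprel = cols[2], cols[7]
        if deprel in relations:
            found[deprel].add(lemma)
    return {rel: sorted(found[rel]) for rel in relations}


# Hypothetical sample: "la riforma deve essere stata approvata"
ROWS = [r.split() for r in [
    "1 la il DET _ _ 2 det _ _",
    "2 riforma riforma NOUN _ _ 6 nsubj:pass _ _",
    "3 deve dovere AUX _ _ 6 aux _ _",
    "4 essere essere AUX _ _ 6 aux _ _",
    "5 stata essere AUX _ _ 6 aux:pass _ _",
    "6 approvata approvare VERB _ _ 0 root _ _",
]]

print(lemmas_by_relation(ROWS))
```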

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

nsubj
- VERB-Fin--NOUN (330)
- VERB-Fin--NOUN-ADP(di) (2)
- VERB-Fin--PRON (158)
- VERB-Fin--PRON-ADP(in) (1)
- VERB-Ger--NOUN (2)
- VERB-Ger--PRON (5)
- VERB-Inf--NOUN (24)
- VERB-Inf--PRON (15)
- VERB-Part--NOUN (65)
- VERB-Part--NOUN-ADP(quattro) (1)
- VERB-Part--PRON (40)

obj
- VERB-Fin--NOUN (424)
- VERB-Fin--NOUN-ADP(a) (1)
- VERB-Fin--NOUN-ADP(di) (2)
- VERB-Fin--PRON (182)
- VERB-Ger--NOUN (11)
- VERB-Ger--PRON (10)
- VERB-Inf--NOUN (201)
- VERB-Inf--NOUN-ADP(da) (1)
- VERB-Inf--NOUN-ADP(in) (1)
- VERB-Inf--PRON (77)
- VERB-Part--NOUN (96)
- VERB-Part--NOUN-ADP(di) (1)
- VERB-Part--PRON (40)

iobj
- VERB-Fin--NOUN-ADP(a) (1)
- VERB-Fin--PRON (126)
- VERB-Fin--PRON-ADP(a) (3)
- VERB-Ger--PRON (3)
- VERB-Inf--NOUN (1)
- VERB-Inf--NOUN-ADP(a) (1)
- VERB-Inf--PRON (22)
- VERB-Inf--PRON-ADP(a) (1)
- VERB-Part--NOUN-ADP(a) (1)
- VERB-Part--PRON (35)
- VERB-Part--PRON-ADP(a) (1)
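Each pattern above joins the parent verb's VerbForm with the child's part of speech and, where the child has an adposition attached as case, that adposition in parentheses. The sketch below reproduces this labeling under assumptions: it uses the lemma of the `case` child and considers only NOUN and PRON children of VERB parents. The exact rules of the statistics script are not stated on this page, and `argument_patterns` and the sample sentence are hypothetical.

```python
from collections import Counter


def verb_form(feats):
    """Extract the VerbForm value from a FEATS cell, or None."""
    for pair in feats.split("|"):
        if pair.startswith("VerbForm="):
            return pair.split("=", 1)[1]
    return None


def argument_patterns(rows, relation):
    """Count parent--child labels for one relation within one sentence."""
    words = {int(c[0]): c for c in rows if c[0].isdigit()}
    # adposition lemma attached to each head via the case relation
    case_of = {}
    for cols in words.values():
        if cols[7] == "case" and cols[3] == "ADP":
            case_of.setdefault(int(cols[6]), cols[2])
    counts = Counter()
    for wid, cols in words.items():
        if cols[7] != relation or cols[3] not in ("NOUN", "PRON"):
            continue
        parent = words.get(int(cols[6]))
        if parent is None or parent[3] != "VERB":
            continue
        form = verb_form(parent[5])
        label = ("VERB-" + form if form else "VERB") + "--" + cols[3]
        if wid in case_of:
            label += "-ADP(" + case_of[wid] + ")"
        counts[label] += 1
    return counts


# Hypothetical sample, annotated only to exercise the three relations.
ROWS = [r.split() for r in [
    "1 il il DET _ _ 2 det _ _",
    "2 governo governo NOUN _ _ 3 nsubj _ _",
    "3 dà dare VERB _ Mood=Ind|VerbForm=Fin 0 root _ _",
    "4 i il DET _ _ 5 det _ _",
    "5 fondi fondo NOUN _ _ 3 obj _ _",
    "6 a a ADP _ _ 7 case _ _",
    "7 noi noi PRON _ _ 3 iobj _ _",
]]

for rel in ("nsubj", "obj", "iobj"):
    print(rel, dict(argument_patterns(ROWS, rel)))
```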

Reflexive Passive

This corpus contains 13 lemmas that occur at least once with an expl:pass child. Examples: fare si, anticipare si, commemorare si, dare se, diagnosticare si, distruggere si, guardare si, mandare si, prevedere si, riassumere si, sentire si, tagliare si, vistare s'
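A list like this can be obtained by finding every word attached as `expl:pass` and recording the lemma of its head. The sketch below assumes sentences given as lists of ten-column rows; `lemmas_with_expl_pass` and the sample sentence are hypothetical.

```python
def lemmas_with_expl_pass(sentences):
    """Return the sorted lemmas that have at least one expl:pass child."""
    lemmas = set()
    for rows in sentences:
        words = {int(c[0]): c for c in rows if c[0].isdigit()}
        for cols in words.values():
            if cols[7] == "expl:pass":
                parent = words.get(int(cols[6]))
                if parent is not None:
                    lemmas.add(parent[2])
    return sorted(lemmas)


# Hypothetical sample: "si tagliano i fondi"
SENTENCES = [[r.split() for r in [
    "1 si si PRON _ Clitic=Yes|Person=3|PronType=Prs 2 expl:pass _ _",
    "2 tagliano tagliare VERB _ Mood=Ind|VerbForm=Fin 0 root _ _",
    "3 i il DET _ _ 4 det _ _",
    "4 fondi fondo NOUN _ _ 2 nsubj:pass _ _",
]]]

print(lemmas_with_expl_pass(SENTENCES))
```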

Relations Overview

This corpus uses 20 relation subtypes: acl:relcl, aux:pass, csubj:pass, det:poss, det:predet, discourse:emo, expl:impers, expl:pass, flat:foreign, flat:name, nsubj:outer, nsubj:pass, obl:agent, parataxis:appos, parataxis:discourse, parataxis:hashtag, parataxis:insert, parataxis:nsubj, parataxis:obj, vocative:mention
The following 2 relation types are not used in this corpus at all: clf, reparandum
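Both statements can be checked against the relation list given under Relations near the top of this page. A subtype is any label containing a colon, and the unused types are the universal relations whose base label never appears. The sketch below hard-codes the 37 universal relations of UD v2; the variable names are hypothetical.

```python
# Relations listed on this page for UD Italian TWITTIRO.
USED = """
acl acl:relcl advcl advmod amod appos aux aux:pass case cc ccomp compound
conj cop csubj csubj:pass dep det det:poss det:predet discourse
discourse:emo dislocated expl expl:impers expl:pass fixed flat flat:foreign
flat:name goeswith iobj list mark nmod nsubj nsubj:outer nsubj:pass nummod
obj obl obl:agent orphan parataxis parataxis:appos parataxis:discourse
parataxis:hashtag parataxis:insert parataxis:nsubj parataxis:obj punct root
vocative vocative:mention xcomp
""".split()

# The 37 universal relation types of UD v2.
UNIVERSAL = """
acl advcl advmod amod appos aux case cc ccomp clf compound conj cop csubj
dep det discourse dislocated expl fixed flat goeswith iobj list mark nmod
nsubj nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split()

subtypes = sorted(rel for rel in USED if ":" in rel)
used_bases = {rel.split(":")[0] for rel in USED}
unused = sorted(set(UNIVERSAL) - used_bases)

print(len(subtypes), subtypes)
print(unused)
```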