UD English Atis
Language: English (code: en)
Family: IE
This treebank has been part of Universal Dependencies since the UD v2.9 release.
The following people have contributed to making this treebank part of UD: Aslı Kuzgun, Neslihan Cesur, Olcay Taner Yıldız.
Repository: UD_English-Atis
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: nonfiction, news
Questions, comments? General annotation questions (either English-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on Github. If you want to collaborate, please contact [kuzgunasli (æt) gmail • com]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually in non-UD style, automatically converted to UD |
Relations | annotated manually, natively in UD style |
Description
UD Atis Treebank is a manually annotated treebank consisting of the sentences in the ATIS (Airline Travel Information System) dataset, which contains transcriptions of people asking automated inquiry systems for flight information.
The annotation was done manually over the ATIS data, which is split into 4224 training, 586 test, and 572 development items.
Acknowledgments
We thank Starlang Software for funding and supporting this work.
References
The ATIS corpus: https://github.com/howl-anderson/ATIS_dataset/blob/master/README.en-US.md
Statistics of UD English Atis
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – SYM – VERB
Features
Case – Degree – Gender – Mood – Number – NumType – Person – Poss – PronType – Tense – VerbForm
Relations
acl – acl:relcl – advcl – advmod – amod – appos – aux – aux:pass – case – cc – cc:preconj – ccomp – compound – compound:prt – conj – cop – csubj – dep – det – det:predet – discourse – dislocated – expl – fixed – flat – iobj – list – mark – nmod – nmod:poss – nmod:tmod – nsubj – nsubj:outer – nummod – obj – obl – obl:tmod – parataxis – reparandum – root – xcomp
Tokenization and Word Segmentation
- This corpus contains 5432 sentences and 61879 tokens.
- All tokens in this corpus are followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 11 types of words that contain both letters and punctuation. Examples: 'd, st., 's, o'clock, 'm, 're, 'll, 've, a.m., n't, o'hare
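Counts like the ones above can be recomputed from a CoNLL-U file with a few lines of Python. The sketch below is only an illustration: the sample sentences are hypothetical (not copied from the treebank), the helper names are made up for this example, and the letters-plus-punctuation test is an assumed definition, since the page does not say how that statistic was computed.

```python
import unicodedata


def to_conllu(rows):
    """Join column lists with tabs; comment and blank strings pass through."""
    return "\n".join(r if isinstance(r, str) else "\t".join(r) for r in rows) + "\n"


# Hypothetical sample, written for illustration only (not copied from the treebank).
SAMPLE = to_conllu([
    "# text = show me flights",
    ["1", "show", "show", "VERB", "_", "VerbForm=Inf", "0", "root", "_", "_"],
    ["2", "me", "I", "PRON", "_", "Case=Acc|Number=Sing|Person=1|PronType=Prs", "1", "iobj", "_", "_"],
    ["3", "flights", "flight", "NOUN", "_", "Number=Plur", "1", "obj", "_", "_"],
    "",
    "# text = list fares",
    ["1", "list", "list", "VERB", "_", "VerbForm=Inf", "0", "root", "_", "_"],
    ["2", "fares", "fare", "NOUN", "_", "Number=Plur", "1", "obj", "_", "_"],
])


def count_sentences_and_words(conllu_text):
    """Return (sentence_count, word_count) for CoNLL-U formatted text."""
    sentences = 0
    words = 0
    in_sentence = False
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        words += 1
    return sentences, words


def has_letters_and_punctuation(form):
    """Assumed definition: at least one letter and one Unicode punctuation character."""
    has_letter = any(ch.isalpha() for ch in form)
    has_punct = any(unicodedata.category(ch).startswith("P") for ch in form)
    return has_letter and has_punct


print(count_sentences_and_words(SAMPLE))  # (2, 5)
print(has_letters_and_punctuation("o'clock"))  # True
```

Run on the full treebank files, the first function should give totals comparable to those listed above, provided the same counting conventions are used.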
Morphology
Tags
- This corpus uses 14 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, SYM, VERB
- This corpus does not use the following tags: SCONJ, PUNCT, X
- This corpus contains 2 word types tagged as particles (PART): 's, to
- This corpus contains 14 lemmas tagged as pronouns (PRON): I, be, each, it, one, that, there, they, this, we, what, which, who, you
- This corpus contains 19 lemmas tagged as determiners (DET): I, a, all, another, any, both, each, either, like, no, some, that, the, these, they, this, what, which, you
- Out of the above, 8 lemmas occurred sometimes as PRON and sometimes as DET: I, each, that, they, this, what, which, you
- This corpus contains 9 lemmas tagged as auxiliaries (AUX): be, can, do, have, may, must, should, will, would
- Out of the above, 3 lemmas occurred sometimes as AUX and sometimes as VERB: be, do, have
- There are 3 (de)verbal forms:
  - Fin
    - AUX: is, are, does, do, 's, 'm, 're, am, 've
    - VERB: need, show, want, are, arrive, is, have, leaves, go, arrives
  - Inf
    - AUX: be, will
    - VERB: show, list, like, fly, give, leave, find, have, go, tell
  - Part
    - AUX: being
    - VERB: leaving, arriving, going, departing, used, flying, connecting, looking, stopping, using
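The tag inventory and the PRON/DET overlap reported in the Tags list can be rechecked directly from the lists given there. The following sketch copies those lists into Python sets; the variable names are chosen for this example only, and the 17-tag inventory is the standard UD UPOS set.

```python
# The 17 universal POS tags defined by UD.
ALL_UPOS = (
    "ADJ ADP ADV AUX CCONJ DET INTJ NOUN NUM PART PRON PROPN "
    "PUNCT SCONJ SYM VERB X"
).split()

# Tags reported as used in this corpus.
USED_UPOS = "ADJ ADP ADV AUX CCONJ DET INTJ NOUN NUM PART PRON PROPN SYM VERB".split()

# Lemma lists copied from the statistics above.
PRON_LEMMAS = {
    "I", "be", "each", "it", "one", "that", "there", "they",
    "this", "we", "what", "which", "who", "you",
}
DET_LEMMAS = {
    "I", "a", "all", "another", "any", "both", "each", "either", "like", "no",
    "some", "that", "the", "these", "they", "this", "what", "which", "you",
}

# Tags that never occur: the set difference between the inventory and the used tags.
unused_upos = sorted(set(ALL_UPOS) - set(USED_UPOS))

# Lemmas that occur with both tags: the set intersection.
shared_pron_det = sorted(PRON_LEMMAS & DET_LEMMAS)

print(unused_upos)      # ['PUNCT', 'SCONJ', 'X']
print(shared_pron_det)  # ['I', 'each', 'that', 'they', 'this', 'what', 'which', 'you']
```

The AUX/VERB overlap cannot be rechecked the same way from this page, because the full list of VERB lemmas is not given.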
Nominal Features
- Gender
  - Neut
    - PRON: it
- Number
  - Plur
    - DET: their
    - NOUN: flights, fares, airlines, dollars, airports, cities, meals, times, prices, types
    - PRON: we, they, them
    - PROPN: airlines, tuesdays, mondays, sundays, thursdays, fridays, sunday
  - Sing
    - AUX-Fin: is, does, 's, 're
    - DET: your, my
    - NOUN: flight, pm, morning, wednesday, fare, trip, ground, round, transportation, class
    - PRON: me, i, you, it
    - PROPN: san, boston, denver, francisco, atlanta, pittsburgh, dallas, baltimore, philadelphia, washington
    - VERB-Fin: is, leaves, arrives, serves, goes, has, flies, stops, makes, uses
- Case
  - Acc
    - PRON: me, them
  - Nom
    - PRON: i, you, it, we
Degree and Polarity
- Degree
  - Cmp
    - ADJ: less, more, earlier
    - ADV: less
  - Pos
    - ADJ: available, first, next, early, like, many, expensive, daily, seventh, last
    - ADV: o'clock, now, much, back, also, early, then, only, first, again
  - Sup
    - ADJ: cheapest, earliest, latest, least, lowest, shortest, smallest, most, closest, highest
    - ADV: most, earliest
Verbal Features
- Mood
  - Ind
    - AUX-Fin: is, are, does, do, 's, 'm, 're, am, 've
    - VERB-Fin: need, show, want, are, arrive, is, have, leaves, go, arrives
- Tense
  - Past
    - VERB-Fin: served, provided, serviced, called, got, represented, used, wanted
    - VERB-Part: used, interested, served, bound, carried, come, included, listed, located, offered
  - Pres
    - AUX-Fin: is, are, does, do, 's, 'm, 're, am, 've
    - AUX-Part: being
    - VERB-Fin: need, show, want, are, arrive, is, have, leaves, go, arrives
    - VERB-Part: leaving, arriving, going, departing, flying, connecting, looking, stopping, using, returning
Pronouns, Determiners, Quantifiers
- PronType
  - Art
    - DET: the, a, all, any, an, that, this, no, some, both
  - Dem
    - PRON: there
  - Int,Rel
    - ADV: how, where, when
    - DET: what, which, like, that
    - PRON: what, which, that, who, this, those
  - Prs
    - DET: your, my, their
    - PRON: me, i, you, it, 's, one, this, we, each, that
- NumType
  - Card
    - NUM: one, twenty, 5, 6, 10, 8, 7, 12, 4, 9
  - Ord
    - ADJ: first, seventh, second, eighth, third, fifth, sixth, fourth, ninth, tenth
    - ADV: first
- Poss
  - Yes
    - DET: your, my, their
- Person
  - 1
    - PRON: me, i, we
  - 2
    - PRON: you
  - 3
    - AUX-Fin: is, does, 's, 're
    - PRON: it, they, them
    - VERB-Fin: is, leaves, arrives, serves, goes, has, flies, stops, makes, uses
Other Features
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Example: be.
- This corpus uses 9 lemmas as auxiliaries (aux). Examples: will, do, would, can, be, may, should, have, must.
- This corpus uses 1 lemma as a passive auxiliary (aux:pass). Example: be.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
  - VERB-Fin--NOUN (333)
  - VERB-Fin--PRON (36)
  - VERB-Fin--PRON-Nom (507)
  - VERB-Inf--NOUN (234)
  - VERB-Inf--NOUN-ADP(of) (2)
  - VERB-Inf--PRON (32)
  - VERB-Inf--PRON-Nom (628)
  - VERB-Part--NOUN (50)
  - VERB-Part--NOUN-ADP(for) (1)
  - VERB-Part--PRON (3)
  - VERB-Part--PRON-Nom (56)
- obj
  - VERB-Fin--NOUN (640)
  - VERB-Fin--NOUN-ADP(about) (1)
  - VERB-Fin--PRON (15)
  - VERB-Fin--PRON-ADP(for) (1)
  - VERB-Fin--PRON-Nom (5)
  - VERB-Inf--NOUN (2244)
  - VERB-Inf--NOUN-ADP(about) (23)
  - VERB-Inf--NOUN-ADP(if) (1)
  - VERB-Inf--NOUN-ADP(of) (5)
  - VERB-Inf--NOUN-ADP(out)-ADP(about) (1)
  - VERB-Inf--NOUN-ADP(than) (1)
  - VERB-Inf--NOUN-ADP(with) (2)
  - VERB-Inf--PRON (35)
  - VERB-Inf--PRON-Acc (3)
  - VERB-Inf--PRON-Nom (1)
  - VERB-Part--NOUN (24)
  - VERB-Part--NOUN-ADP(for) (10)
  - VERB-Part--NOUN-ADP(in) (3)
  - VERB-Part--NOUN-ADP(on) (1)
- iobj
  - VERB-Fin--PRON-Acc (164)
  - VERB-Inf--PRON-Acc (1073)
  - VERB-Inf--PRON-Acc-ADP(for) (1)
  - VERB-Inf--PRON-Nom (2)
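The parent/child patterns above combine the verb's VerbForm with the child's UPOS tag and Case. A simplified way to tally such patterns from CoNLL-U text might look like the sketch below. It is not the script that produced this page: the function names and the sample sentence are hypothetical, and the `ADP(...)` suffix for adposition dependents is deliberately left out to keep the example short.

```python
from collections import Counter


def to_conllu(rows):
    """Join column lists with tabs; comment and blank strings pass through."""
    return "\n".join(r if isinstance(r, str) else "\t".join(r) for r in rows) + "\n"


# Hypothetical sample sentence, for illustration only.
SAMPLE = to_conllu([
    "# text = show me flights",
    ["1", "show", "show", "VERB", "_", "VerbForm=Inf", "0", "root", "_", "_"],
    ["2", "me", "I", "PRON", "_", "Case=Acc|Number=Sing|Person=1|PronType=Prs", "1", "iobj", "_", "_"],
    ["3", "flights", "flight", "NOUN", "_", "Number=Plur", "1", "obj", "_", "_"],
])


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Acc|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def read_sentences(conllu_text):
    """Yield each sentence as a list of 10-column rows (basic words only)."""
    sentence = []
    for line in conllu_text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if cols[0].isdigit():  # ignores ranges like "1-2" and empty nodes like "1.1"
            sentence.append(cols)
    if sentence:
        yield sentence


def count_argument_patterns(conllu_text, relations=("nsubj", "obj", "iobj")):
    """Count (relation, 'VERB-Form--CHILD-Case') pairs for verb parents."""
    counts = Counter()
    for sentence in read_sentences(conllu_text):
        by_id = {cols[0]: cols for cols in sentence}
        for cols in sentence:
            if cols[7] not in relations or cols[3] not in ("NOUN", "PRON"):
                continue
            head = by_id.get(cols[6])
            if head is None or head[3] != "VERB":
                continue
            parent = "VERB"
            verb_form = parse_feats(head[5]).get("VerbForm")
            if verb_form:
                parent += "-" + verb_form
            child = cols[3]
            case = parse_feats(cols[5]).get("Case")
            if case:
                child += "-" + case
            counts[(cols[7], parent + "--" + child)] += 1
    return counts


for (relation, pattern), n in sorted(count_argument_patterns(SAMPLE).items()):
    print(relation, pattern, n)
```

On the sample, this reports one `iobj` with pattern `VERB-Inf--PRON-Acc` and one `obj` with pattern `VERB-Inf--NOUN`, the two most frequent patterns for those relations in the list above.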
Relations Overview
- This corpus uses 9 relation subtypes: acl:relcl, aux:pass, cc:preconj, compound:prt, det:predet, nmod:poss, nmod:tmod, nsubj:outer, obl:tmod
- The following 5 relation types are not used in this corpus at all: vocative, clf, orphan, goeswith, punct
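Both statements in this overview follow from the relation list under "Relations" earlier on this page. As a quick check, the sketch below copies that list and compares it with the 37 universal relations of UD v2; the variable names are chosen for this example only.

```python
# Relations used in this corpus, copied from the "Relations" list on this page.
USED_RELATIONS = """
acl acl:relcl advcl advmod amod appos aux aux:pass case cc cc:preconj ccomp
compound compound:prt conj cop csubj dep det det:predet discourse dislocated
expl fixed flat iobj list mark nmod nmod:poss nmod:tmod nsubj nsubj:outer
nummod obj obl obl:tmod parataxis reparandum root xcomp
""".split()

# The 37 universal dependency relations of UD v2.
UNIVERSAL_RELATIONS = """
acl advcl advmod amod appos aux case cc ccomp clf compound conj cop csubj dep
det discourse dislocated expl fixed flat goeswith iobj list mark nmod nsubj
nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split()

# A subtype is any label of the form "universal:extension".
subtypes = sorted(rel for rel in USED_RELATIONS if ":" in rel)

# Base relations that occur, with or without a subtype extension.
used_base = {rel.split(":")[0] for rel in USED_RELATIONS}

# Universal relations that never occur in the corpus.
unused = sorted(set(UNIVERSAL_RELATIONS) - used_base)

print(len(subtypes), subtypes)
print(unused)  # ['clf', 'goeswith', 'orphan', 'punct', 'vocative']
```

The absence of `punct` is consistent with the PUNCT tag being unused in this corpus.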