
This page pertains to UD version 2.


UD English GUM

Language: English (code: en)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.2 release.

The following people have contributed to making this treebank part of UD: Siyao Peng, Amir Zeldes.

Repository: UD_English-GUM
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-NC-SA 4.0

Genre: academic, blog, email, fiction, government, legal, news, nonfiction, social, spoken, web, wiki

Questions, comments? General annotation questions (either English-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on Github. If you want to collaborate, please contact [amir • zeldes (æt) georgetown • edu]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation	Source
Lemmas	annotated manually
UPOS	annotated manually in non-UD style, automatically converted to UD
XPOS	annotated manually
Features	annotated manually in non-UD style, automatically converted to UD
Relations	annotated manually, natively in UD style

Description

Universal Dependencies syntax annotations from the GUM corpus (https://gucorpling.org/gum/)

GUM, the Georgetown University Multilayer corpus, is an open source collection of richly annotated texts from multiple text types. The corpus is collected and expanded by students as part of the curriculum in the course LING-4427 “Computational Corpus Linguistics” at Georgetown University. The selection of text types is meant to represent different communicative purposes, while coming from sources that are readily and openly available (usually Creative Commons licenses), so that new texts can be annotated and published with ease.

The dependencies in the corpus up to GUM version 5 were originally annotated using Stanford Typed Dependencies (de Marneffe & Manning 2013) and converted automatically to UD using DepEdit (https://gucorpling.org/depedit/). The rule-based conversion took into account gold annotations found in other annotation layers of the GUM corpus (e.g. entity annotations), and its output has since been corrected manually in native UD. The original conversion script can be found in the GUM build bot code from version 5, available from the (non-UD) GUM repository. Documents from version 6 of GUM onwards were annotated directly in UD, and subsequent manual error correction to all GUM data has also been done directly using the UD guidelines. Enhanced dependencies were added semi-automatically from version 7.1 of the corpus. For more details see the corpus website.

Acknowledgments

GUM annotation team (so far - thanks for participating!)

Abhishek Purushothama, Adrienne Isaac, Akitaka Yamada, Alex Giorgioni, Alexandra Berends, Alexandra Slome, Amani Aloufi, Amber Hall, Amelia Becker, Anastasia Kelly, Andrea Price, Andrew O’Brien, Ángeles Ortega Luque, Anika Lippke, Aniya Harris, Anna Prince, Anna Runova, Anne Butler, Arianna Janoff, Aryaman Arora, Aurora Smedvig, Ayşenur Sağdiç, Ayan Mandal, Bennett Gilhuly, Bertille Baron, Bielasan Zaina, Bradford Salen, Brandon Tullock, Brent Laing, Caitlyn Pineault, Calvin Engstrom, Candice Penelton, Carlotta Hübener, Caroline Gish, Charlie Dees, Chenyue Guo, Chloe Evered, Cindy Luo, Colleen Diamond, Connor O’Dwyer, Cristina Lopez, Cynthia Li, Dan DeGenaro, Dan Simonson, Derek Reagan, Devika Tiwari, Diana Robson, Didem Ikizoglu, Edwin Ko, Eliza Rice, Emile Zahr, Emily Pace, Emma Manning, Emma Rafkin, Emma Thronson, Ethan Beaman, Felipe De Jesus, Garrison Smith, Han Bu, Hana Altalhi, Hang Jiang, Hannah Wingett, Hanwool Choe, Hassan Munshi, Helen Dominic, Ho Fai Cheng, Hortensia Gutierrez, Hyun Min, Jakob Prange, James Maguire, Janine Karo, Jehan al-Mahmoud, Jemm Excelle Dela Cruz, Jess Godes, Jessica Cusi, Jessica Kotfila, Jingni Wu, Joaquin Gris Roca, John Chi, Jongbong Lee, Juliet May, Jungyoon Koh, Kat Scarborough, Katarina Starcevic, Katelyn Carroll, Katelyn MacDougald, Katherine Conhaim, Katherine Vadella, Khalid Alharbi, Kohei Kajikawa, Kristen Cook, Kushaan Vardhan, Lanni Bu, Lara Bryfonski, Lauren Levine, Leah Northington, Lillian Ehrhart, Lin Ai, Lindley Winchester, Linxi Zhang, Lucia Donatelli, Luke Gessler, Mackenzie Gong, Margaret Anne Rowe, Margaret Borowczyk, Maria Laura Zalazar, Maria Stoianova, Mariko Uno, Mary Henderson, Maya Barzilai, Md. Jahurul Islam, Micaela Wells, Michael Kranzlein, Michaela Harrington, Mikayla Campbell, Mingyeong Choi, Minnie Annan, Mitchell Abrams, Mohammad Ali Yektaie, Naomee-Minh Nguyen, Negar Siyari, Nicholas Mararac, Nicholas Workman, Nicole Steinberg, Nitin Venkateswaran, Nola Goodwin, Parker DiPaolo, Phoebe Fisher, Rachel Kerr, Rachel Thorson, Rebecca Childress, Rebecca Farkas, Riley Breslin Amalfitano, Rima Elabdali, Robert Maloney, Ruizhong Li, Ryan Mannion, Ryan Murphy, Sakol Suethanapornkul, Sarah Bellavance, Sarah Carlson, Sasha Slone, Saurav Goswami, Sean Macavaney, Sean Simpson, Seyma Toker, Shane Quinn, Shannon Mooney, Shelby Lake, Shira Wein, Sichang Tu, Siddharth Singh, Siona Ely, Siyao Peng, Siyu Liang, Stephanie Kramer, Sylvia Sierra, Talal Alharbi, Tatsuya Aoyama, Tess Feyen, Timothy Ingrassia, Trevor Adriaanse, Ulie Xu, Wai Ching Leung, Wenxi Yang, Wesley Scivetti, Xiaopei Wu, Xiulin Yang, Yang Liu, Yi-Ju Lin, Yifu Mu, Yilun Zhu, Yingzhu Chen, Yiran Xu, Young-A Son, Yu-Tzu Chang, Yuhang Hu, Yunjung Ku, Yushi Zhao, Zhijie Song, Zhuosi Luo, Zhuxin Wang, Amir Zeldes

… and other annotators who wish to remain anonymous!

References

The best paper to cite depends on the data you are using. To cite the corpus in general, please refer to the following article (but note that the corpus has changed and grown a lot in the time since); otherwise see different citations for specific aspects below:

Zeldes, Amir (2017) “The GUM Corpus: Creating Multilayer Resources in the Classroom”. Language Resources and Evaluation 51(3), 581–612.

@Article{Zeldes2017,
author = {Amir Zeldes},
title = {The {GUM} Corpus: Creating Multilayer Resources in the Classroom},
journal = {Language Resources and Evaluation},
year = {2017},
volume = {51},
number = {3},
pages = {581--612},
doi = {10.1007/s10579-016-9343-x}
}

If you are using the Reddit subset of GUM in particular, please use this citation instead:

Behzad, Shabnam and Zeldes, Amir (2020) “A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging”. In: Proceedings of the 12th Web as Corpus Workshop (WAC-XII).

@InProceedings{BehzadZeldes2020,
author = {Shabnam Behzad and Amir Zeldes},
title = {A Cross-Genre Ensemble Approach to Robust {R}eddit Part of Speech Tagging},
booktitle = {Proceedings of the 12th Web as Corpus Workshop (WAC-XII)},
pages = {50--56},
year = {2020},
}

For papers focusing on the discourse relations, discourse markers or other discourse signal annotations, please cite the eRST paper:

@article{zeldes-etal-2025-erst,
title = "e{RST}: A Signaled Graph Theory of Discourse Relations and Organization",
author = "Zeldes, Amir and
Aoyama, Tatsuya and
Liu, Yang Janet and
Peng, Siyao and
Das, Debopam and
Gessler, Luke",
journal = "Computational Linguistics",
volume = "51",
number = "1",
year = "2025",
address = "Cambridge, MA",
publisher = "MIT Press",
url = "https://aclanthology.org/2025.cl-1.3/",
doi = "10.1162/coli_a_00538",
pages = "23--72"
}

For papers using GDTB/PDTB style shallow discourse relations, please cite:

Yang Janet Liu, Tatsuya Aoyama, Wesley Scivetti, Yilun Zhu, Shabnam Behzad, Lauren Elizabeth Levine, Jessica Lin, Devika Tiwari, and Amir Zeldes (2024), “GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains”. In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics: Miami, USA.

@inproceedings{liu-etal-2024-gdtb,
title = "{GDTB}: Genre Diverse Data for {E}nglish Shallow Discourse Parsing across Modalities, Text Types, and Domains",
author = "Liu, Yang Janet and
Aoyama, Tatsuya and
Scivetti, Wesley and
Zhu, Yilun and
Behzad, Shabnam and
Levine, Lauren Elizabeth and
Lin, Jessica and
Tiwari, Devika and
Zeldes, Amir",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.emnlp-main.684/",
doi = "10.18653/v1/2024.emnlp-main.684",
pages = "12287--12303"
}

If you are using the OntoNotes schema version of the coreference annotations (a.k.a. OntoGUM data in coref/ontogum/), please cite this paper instead:

@InProceedings{ZhuEtAl2021,
author = {Yilun Zhu and Sameer Pradhan and Amir Zeldes},
booktitle = {Proceedings of ACL-IJCNLP 2021},
title = {{OntoGUM}: Evaluating Contextualized {SOTA} Coreference Resolution on 12 More Genres},
year = {2021},
pages = {461--467},
address = {Bangkok, Thailand}
}

For papers focusing on named entities or entity linking (Wikification), please cite this paper instead:

@inproceedings{lin-zeldes-2021-wikigum,
title = {{W}iki{GUM}: Exhaustive Entity Linking for Wikification in 12 Genres},
author = {Jessica Lin and Amir Zeldes},
booktitle = {Proceedings of The Joint 15th Linguistic Annotation Workshop (LAW) and
3rd Designing Meaning Representations (DMR) Workshop (LAW-DMR 2021)},
year = {2021},
address = {Punta Cana, Dominican Republic},
url = {https://aclanthology.org/2021.law-1.18},
pages = {170--175},
}

For papers focusing on the salience annotations, please cite this paper instead:

@inproceedings{lin-zeldes-2024-gumsley,
title = "{GUM}sley: Evaluating Entity Salience in Summarization for 12 {E}nglish Genres",
author = "Lin, Jessica and
Zeldes, Amir",
editor = "Graham, Yvette and
Purver, Matthew",
booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)",
year = "2024",
address = "St. Julian{'}s, Malta",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.eacl-long.158/",
pages = "2575--2588"
}

For papers focusing on bridging anaphora, please cite this paper instead:

@inproceedings{levine-zeldes-2026-gumbridge,
title = "{GUMBridge}: a Corpus for Varieties of Bridging Anaphora",
author = "Levine, Lauren and
Zeldes, Amir",
booktitle = "Proceedings of LREC 2026",
year = "2026",
address = "Mallorca",
}

Statistics of UD English GUM

POS Tags

ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – SYM – VERB – X

Features

Abbr – Case – Definite – Degree – ExtPos – Foreign – Gender – Mood – Number – NumForm – NumType – Person – Polarity – Poss – PronType – Reflex – Style – Tense – Typo – VerbForm – Voice

Relations

acl – acl:relcl – advcl – advcl:relcl – advmod – amod – appos – aux – aux:pass – case – cc – cc:preconj – ccomp – compound – compound:prt – conj – cop – csubj – csubj:outer – csubj:pass – dep – det – det:predet – discourse – dislocated – expl – fixed – flat – goeswith – iobj – list – mark – nmod – nmod:desc – nmod:poss – nmod:unmarked – nsubj – nsubj:outer – nsubj:pass – nummod – obj – obl – obl:agent – obl:unmarked – orphan – parataxis – punct – reparandum – root – vocative – xcomp

Tokenization and Word Segmentation

This corpus contains 14353 sentences, 252284 tokens and 256739 syntactic words.

This corpus contains 34860 tokens (14%) that are not followed by a space.

This corpus does not contain words with spaces.

This corpus contains 531 types of words that contain both letters and punctuation. Examples: 's, n't, ’s, 're, 'm, n’t, 've, 'll, 'd, U.S., ’re, Mr., ’m, ’ve, ’ll, ’d, W., e.g., etc., th-, T., L'Enfant, al., St., A., Mrs., w-, Dr., n-, c., d-, d., f-, i.e., m., non-avian, s-, D.C., a.m., Mof-Ávvi, b., pro-Beijing, p., t-, Naqsh-e, cross-sectional, #logos, #sharedvalues, #systemanalysis, J.

This corpus contains 4455 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
There are 645 types of multi-word tokens. Examples: it's, don't, I'm, that's, you're, gonna, it’s, there's, didn't, they're, I've, we're, can't, don’t, I'll, he's, let's, doesn't, I’m, that’s, cannot, you’re, I'd, what's, isn't, wasn't, you'll, she's, you've, won't, didn’t, we’re, can’t, haven't, we've, we'll, you'd, city's, couldn't, who's, let’s, wanna, she’s, world's, wouldn't, Warhol's, aren't, they'll, he’s, shouldn't.
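
The token and word figures above are consistent with each other: 256739 syntactic words minus 252284 tokens gives 4455, the number of multi-word tokens, which is what one expects when each multi-word token spans two words. The following is a minimal Python sketch of how such counts can be derived from a CoNLL-U file, assuming the standard 10-column layout in which a multi-word token appears as a range line (for example 2-3) followed by its syntactic words. The function name `corpus_counts` and the sample sentence are illustrative only; the actual UD statistics scripts may treat details such as SpaceAfter=No differently.

```python
from collections import Counter

# Invented one-sentence sample ("I don't know."), not taken from GUM.
SAMPLE = "\n".join([
    "# text = I don't know.",
    "1\tI\tI\tPRON\tPRP\tCase=Nom|Number=Sing|Person=1|PronType=Prs\t4\tnsubj\t_\t_",
    "2-3\tdon't\t_\t_\t_\t_\t_\t_\t_\t_",
    "2\tdo\tdo\tAUX\tVBP\tMood=Ind|Tense=Pres|VerbForm=Fin\t4\taux\t_\t_",
    "3\tn't\tnot\tPART\tRB\tPolarity=Neg\t4\tadvmod\t_\t_",
    "4\tknow\tknow\tVERB\tVB\tVerbForm=Inf\t0\troot\t_\tSpaceAfter=No",
    "5\t.\t.\tPUNCT\t.\t_\t4\tpunct\t_\t_",
    "",
])


def corpus_counts(conllu_text):
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    counts = Counter()
    covered_until = 0      # highest word ID covered by the current range line
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():               # a blank line ends the sentence
            in_sentence = False
            covered_until = 0
            continue
        if line.startswith("#"):           # sentence-level comment
            continue
        cols = line.split("\t")
        tok_id, misc = cols[0], cols[9]
        if not in_sentence:
            counts["sentences"] += 1
            in_sentence = True
        if "." in tok_id:                  # empty node (enhanced graph only)
            continue
        no_space = "SpaceAfter=No" in misc.split("|")
        if "-" in tok_id:                  # range line: one surface token
            covered_until = int(tok_id.split("-")[1])
            counts["tokens"] += 1
            counts["multiword_tokens"] += 1
            counts["no_space_after"] += no_space
        else:                              # ordinary syntactic word
            counts["words"] += 1
            if int(tok_id) > covered_until:   # not part of a multi-word token
                counts["tokens"] += 1
                counts["no_space_after"] += no_space
    return counts


counts = corpus_counts(SAMPLE)
print(dict(counts))
```

In this sketch a word inside a range is counted as a syntactic word but not as a surface token, which is why the two totals differ by exactly the number of two-word multi-word tokens.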

Morphology

Nominal Features

Gender

Fem
- PRON: she, her, herself

Fem,Masc
- PRON: s/he

Masc
- PRON: he, his, him, himself

Neut
- PRON: it, its, itself, it's

Number

Plur
- AUX-Fin: are, were, have, 're, do, did, had, 've, ’re, was
- DET: these, those
- NOUN: people, years, things, days, guys, minutes, others, ways, words, months
- PRON: we, they, our, their, them, you, us, those, these, 's
- PROPN: States, Americans, Nations, skittles, Polaroids, Chathams, Pirates, Mets, Sox, Democrats
- VERB-Fin: have, are, had, know, want, need, got, make, do, go

Ptan
- NOUN: clothes, thanks, pants, means, glasses, 1960s, politics, 1970s, jeans, surroundings
- PROPN: Olympics, Netherlands, Paralympics, Philippines, Vans, Analytics, Forties, Maldives, McMunchies

Sing
- AUX: is, was, 's, has, do, 'm, did, ’s, had, does
- AUX-Fin: is, was, 's, has, do, 'm, did, ’s, had, does
- DET: this, that, half
- NOUN: time, way, day, year, world, life, today, city, work, lot
- NUM: half
- PRON: i, it, you, he, that, your, his, my, this, she
- PROPN: University, President, America, York, New, south, north, State, Warhol, figure
- SYM: %
- VERB: know, said, think, have, has, had, is, 's, let, mean
- VERB-Fin: know, said, think, have, has, had, is, 's, let, mean

Case

Acc
- PRON: it, you, me, them, us, him, her, 's, itself, yourself

Gen
- PRON: your, his, my, our, their, its, her, it's, he, it

Nom
- PRON: i, you, it, we, they, he, she, him, me, them

Definite

Def
- DET: the

Ind
- DET: a, an

Degree and Polarity

Degree

Cmp
- ADJ: more, better, greater, larger, further, higher, lower, easier, older, smaller
- ADV: more, later, less, longer, better, earlier, further, sooner, closer, faster

Pos
- ADJ: other, new, many, good, little, same, first, different, last, own
- ADV: really, well, back, still, again, too, much, actually, probably, away

Sup
- ADJ: most, best, least, largest, worst, greatest, highest, latest, biggest, smallest
- ADV: most, best, least, longest, fastest, foremost

Polarity

Neg
- ADV: no
- CCONJ: nor, neither
- INTJ: no, naw
- PART: not, n't, n’t, n`t

Pos
- INTJ: yeah, yes

Verbal Features

Mood

Imp
- AUX-Fin: be, do
- VERB-Fin: let, look, see, make, get, try, use, add, place, take

Ind
- AUX-Fin: is, was, are, 's, do, were, have, has, 're, 'm
- VERB-Fin: have, know, said, think, had, has, want, are, is, 's

Sub
- AUX-Fin: were, be
- VERB-Fin: become, clean, collaborate, face, look, make, remain, rise, rule, wear

Tense

Past
- AUX-Fin: was, were, did, had, 'd, ’d, got, where
- AUX-Part: been, 'd, had
- VERB-Fin: said, had, got, made, went, came, was, took, wanted, did
- VERB-Part: united, called, made, based, used, got, known, done, given, seen

Pres
- AUX-Fin: is, are, 's, do, have, has, 're, 'm, ’s, does
- AUX-Part: being, having, getting, doing
- VERB-Fin: have, know, think, has, want, are, is, 's, mean, says
- VERB-Part: going, gon, doing, trying, using, including, making, getting, looking, according

Voice

Pass
- VERB-Part: called, based, used, known, made, given, done, born, found, taken

Pronouns, Determiners, Quantifiers

PronType

Art
- DET: the, a, an

Dem
- ADV: then, here, there
- DET: this, that, these, those, yonder
- PRON: there, that, this, those, these

Emp
- PRON: itself, themselves, himself, yourself

Ind
- DET: some, all, any, every, another, each, such, both, either, quite
- PRON: something, anything, someone, anyone, somebody, anybody

Int
- ADV: when, how, where, why, whither, whenever, whereupon
- DET: which, what
- PRON: what, who, which, Whoever, whose

Neg
- DET: no, neither
- PRON: nothing, one, nobody

Prs
- PRON: i, it, you, we, they, he, your, his, my, our

Rcp
- DET: each
- PRON: one

Rel
- ADV: where, how, why, when, whenever, wherever, however, whereby
- DET: what
- PRON: that, which, what, who, whatever, whom, whose, whoever, Whosoever, wish

Tot
- DET: all, both, each, every
- PRON: everything, everyone, everybody, ev

NumType

Card
- NOUN: 1960s, 1970s, 1980s, 1990s, 1830s, 1950s, 1920s, 1930s, 1940s, 2000s
- NUM: one, two, 1, three, 2, 3, five, four, 4, 10
- PROPN: EIGHT, One

Frac
- ADV: half
- DET: half
- NOUN: half, quarter, third, thirds, quarters, fifth, fifths, halves, hundredths, millionth
- NUM: 2.0, 7.2, 1.5, 6.8, .08, 1.3, 4.0, half, 1.2, 1.4

Mult
- ADV: once, twice

Ord
- ADJ: first, second, third, 19th, 20th, fourth, fifth, 30th, 3rd, 10th
- ADV: first, second, third, 135th, Fifth, Fourth, 15th, sixth

Poss

Yes
- PRON: your, his, my, our, their, its, her, whose, yours, mine

Reflex

Yes
- PRON: itself, yourself, himself, themselves, myself, herself, ourselves, yourselves

Person

1
- AUX-Fin: 'm, do, was, have, am, 've, are, 're, did, were
- PRON: i, we, my, our, me, us, 's, myself, ’s, mine
- VERB-Fin: think, have, mean, know, had, thank, want, got, thought, said

2
- AUX-Fin: 're, do, are, have, did, ’re, 've, be, were, ’ve
- PRON: you, your, yourself, yours, ye, ya, y', yourselves
- VERB-Fin: know, let, have, get, want, see, look, make, think, put

3
- AUX-Fin: is, was, 's, are, has, were, have, ’s, had, does
- PRON: it, they, he, his, their, she, her, them, its, him
- VERB-Fin: said, has, are, had, have, is, 's, says, made, makes

Other Features

Abbr
- Yes
  - ADJ: OK, Eng., Epis., U., voc.
  - ADP: @, vs., vs
  - ADV: e.g., i.e., c., ca., PS, approx.
  - INTJ: OK
  - NOUN: etc., AI, TV, a.m., DNA, GIS, p.m., p., No., Ph.D.
  - PROPN: U.S., US, Mr., NASA, NATO, W., CC, T., USI, UK
  - VERB-Part: b., d., div., m., Wed, encl
  - X: al., P.S., Mlle., Ave

ExtPos
- ADP
  - ADJ: such, due
  - ADP: out, because, as, off, up, On
  - ADV: instead, prior, as, out, next
  - SYM: –, -, /, :
  - VERB-Part: according, depending
- ADV
  - ADJ: more, less, close, fewer
  - ADP: of, up, in, As
  - ADV: as, more, less, just, close
  - DET: all
  - NOUN: kind, sort
  - PRON: that
- CCONJ
  - ADV: as, rather
  - VERB-Fin: let
  - VERB-Inf: let
- NOUN
  - ADJ: such
- PRON
  - DET: each
  - PRON: one
- PROPN
  - ADJ: Happy, Simple
  - VERB-Fin: Walk, Walks
  - VERB-Inf: Write
  - VERB-Part: United, Wed
- SCONJ
  - ADJ: such
  - ADP: in
  - ADV: instead, rather, As, prior
  - SCONJ: so, as, in

Foreign
- Yes
  - ADJ: National
  - ADP: x
  - ADV: Ne, pas
  - DET: Une
  - INTJ: sh-
  - NUM: 62
  - PROPN: Shobhajatra, Mangal, de, Cérebro, Escola, do, et, Catarin, Conservatoire, Federal
  - PUNCT: !, ,, -, ?, “, ”
  - SYM: 33A, 56A
  - X: de, alcalde, 樋口, Ciao, Información, Montejo, Módulo, Palacio, Paseo, Shobha-

NumForm
- Combi
  - ADJ: 19th, 20th, 30th, 3rd, 10th, 17th, 21st, 13th, 15th, 25th
  - ADV: 135th, 15th
  - NOUN: 1960s, 1970s, 1980s, 1990s, 1830s, 1950s, 1920s, 1930s, 1940s, 2000s
- Digit
  - NUM: 1, 2, 3, 4, 10, 20, 6, 5, 15, 7
- Roman
  - NUM: II, I, IV, III, VI, XIV, XV, XVII
- Word
  - ADJ: first, second, third, fourth, fifth, ninth, seventh, sixth, tenth
  - ADV: first, once, twice, second, third, half, Fifth, Fourth, sixth
  - DET: half
  - NOUN: half, quarter, third, thirds, quarters, fifth, fifths, halves, hundredths, millionth
  - NUM: one, two, three, five, four, six, million, ten, twenty, hundred
  - PROPN: EIGHT, One

Style
- Coll
  - PART: ta
  - PRON: em, ya, ’em
- Expr
  - INTJ: hmm, Hmmm, Wow-eee, eee
- Vrnc
  - SCONJ: cause, cuz, 'cuz, ‘cuz
  - VERB-Fin: wan, ai
  - VERB-Inf: wan
  - VERB-Part: gon

Typo
- Yes
  - ADJ: residential, 2D, I.=, Indie, Water, beautiful, completed, crowed, digital, first
  - ADP: on, to, of, with, a, as, fro, from, in, is
  - ADV: aka, all, before, Non, a, abaut, ie, p, really, them
  - AUX-Fin: are, is, was, can, will, get, has, have, were, where
  - AUX-Inf: be
  - AUX-Part: been
  - CCONJ: and, n
  - DET: a, the, an, he, on, some, this, to
  - INTJ: y-, Ca-, Ro-, T-, alreet, alroot, f-, i-, n-, plo-
  - NOUN: lotos, etc, kind, nite, per, type, dodge, fisherman, m, order
  - NUM: 1, 19, one, 6:00, fiftyfive, five, to
  - PART: do, the, not
  - PRON: em, it, you, ya, i, it's, we, She, Who, ev
  - PROPN: sea, skittles, #langu, American, Chatnam, Fla., Hutter, JOHN, Misalette, Oija
  - PUNCT: ., ", –, -, (, ;, [, |, ’
  - SCONJ: cuz, cause, 'til, Altho, despite, that, then, whil, 'cuz, ‘cuz
  - VERB: got, dwibbling, Pre, questi, se, set, under, understand, United, Untied
  - VERB-Fin: got, set, address, ate, begun, belidve, belie-, beraded, cause, construe
  - VERB-Ger: fighting, leading, preceeding, recurring, traightening
  - VERB-Inf: understand, breath, experience, fall, go, happen, loose, makke, r, recieve
  - VERB-Part: dwibbling, got, United, Untied, charged, deeping, disappeared, exper-, food, going

Syntax

Auxiliary Verbs and Copula

This corpus uses 1 lemma as a copula (cop): be.

This corpus uses 13 lemmas as auxiliaries (aux). Examples: have, be, do, can, will, would, could, should, may, might, must, shall, need.
This corpus uses 2 lemmas as passive auxiliaries (aux:pass). Examples: be, get.
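
Lemma inventories like the ones above can be reproduced by grouping the LEMMA column by the DEPREL column. The sketch below is one possible way to do this in Python, assuming standard 10-column CoNLL-U input; the function name `lemmas_by_relation` and the two sample sentences are invented for illustration and are not part of GUM or of the UD tooling.

```python
from collections import Counter, defaultdict

# Two invented sentences: "She has been seen" and "He is happy".
SAMPLE = "\n".join([
    "1\tShe\tshe\tPRON\tPRP\t_\t4\tnsubj:pass\t_\t_",
    "2\thas\thave\tAUX\tVBZ\t_\t4\taux\t_\t_",
    "3\tbeen\tbe\tAUX\tVBN\t_\t4\taux:pass\t_\t_",
    "4\tseen\tsee\tVERB\tVBN\t_\t0\troot\t_\t_",
    "",
    "1\tHe\the\tPRON\tPRP\t_\t3\tnsubj\t_\t_",
    "2\tis\tbe\tAUX\tVBZ\t_\t3\tcop\t_\t_",
    "3\thappy\thappy\tADJ\tJJ\t_\t0\troot\t_\t_",
    "",
])


def lemmas_by_relation(conllu_text, relations=("cop", "aux", "aux:pass")):
    """Map each requested relation to a Counter of the lemmas bearing it."""
    found = defaultdict(Counter)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():      # skip range lines and empty nodes
            continue
        lemma, deprel = cols[2], cols[7]
        if deprel in relations:        # exact match: aux and aux:pass stay separate
            found[deprel][lemma] += 1
    return found


for relation, lemmas in sorted(lemmas_by_relation(SAMPLE).items()):
    print(relation, len(lemmas), sorted(lemmas))
```

Relations are matched exactly here, so aux:pass is reported separately from aux, mirroring the separate counts given on this page.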

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

nsubj
- VERB--PRON-Nom (3)
- VERB-Fin--NOUN (2425)
- VERB-Fin--PRON (925)
- VERB-Fin--PRON-Nom (4571)
- VERB-Ger--NOUN (23)
- VERB-Ger--NOUN-ADP(of) (1)
- VERB-Ger--PRON-Gen (1)
- VERB-Ger--PRON-Nom (9)
- VERB-Inf--NOUN (571)
- VERB-Inf--PRON (212)
- VERB-Inf--PRON-Nom (1909)
- VERB-Part--NOUN (469)
- VERB-Part--PRON (153)
- VERB-Part--PRON-Gen (1)
- VERB-Part--PRON-Nom (1261)

obj
- VERB--PRON-Acc (1)
- VERB-Fin--NOUN (3524)
- VERB-Fin--PRON (307)
- VERB-Fin--PRON-Acc (720)
- VERB-Fin--PRON-Gen (3)
- VERB-Ger--NOUN (512)
- VERB-Ger--PRON (21)
- VERB-Ger--PRON-Acc (39)
- VERB-Inf--NOUN (2711)
- VERB-Inf--PRON (390)
- VERB-Inf--PRON-Acc (559)
- VERB-Part--NOUN (1344)
- VERB-Part--PRON (152)
- VERB-Part--PRON-Acc (175)

iobj
- VERB-Fin--NOUN (47)
- VERB-Fin--PRON-Acc (157)
- VERB-Ger--NOUN (5)
- VERB-Ger--PRON (2)
- VERB-Ger--PRON-Acc (5)
- VERB-Inf--NOUN (59)
- VERB-Inf--PRON (1)
- VERB-Inf--PRON-Acc (109)
- VERB-Part--NOUN (11)
- VERB-Part--PRON (1)
- VERB-Part--PRON-Acc (34)
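
The pattern labels above pair a verbal parent (with its VerbForm value) and a nominal child (with its Case value, if any). The following Python sketch shows one way such labels could be tallied from CoNLL-U data. It is an approximation under stated assumptions: the labelling scheme is inferred from the lists above rather than taken from the UD statistics scripts, adposition suffixes such as ADP(of) are not reproduced, relation subtypes such as nsubj:pass are not folded into nsubj, and the names `read_sentences`, `feature` and `core_argument_patterns` are hypothetical.

```python
from collections import Counter, defaultdict

CORE_RELATIONS = ("nsubj", "obj", "iobj")

# Invented sample sentence: "She gave him books."
SAMPLE = "\n".join([
    "1\tShe\tshe\tPRON\tPRP\tCase=Nom|Gender=Fem|Number=Sing|Person=3|PronType=Prs\t2\tnsubj\t_\t_",
    "2\tgave\tgive\tVERB\tVBD\tMood=Ind|Tense=Past|VerbForm=Fin\t0\troot\t_\t_",
    "3\thim\the\tPRON\tPRP\tCase=Acc|Gender=Masc|Number=Sing|Person=3|PronType=Prs\t2\tiobj\t_\t_",
    "4\tbooks\tbook\tNOUN\tNNS\tNumber=Plur\t2\tobj\t_\tSpaceAfter=No",
    "5\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_",
    "",
])


def read_sentences(text):
    """Yield each sentence as a list of column lists (syntactic words only)."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():          # skip range lines and empty nodes
                sentence.append(cols)
    if sentence:
        yield sentence


def feature(feats, name):
    """Return the value of one feature from a FEATS string, or None."""
    if feats == "_":
        return None
    for pair in feats.split("|"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return None


def core_argument_patterns(text):
    """Count parent--child labels for verb-headed nsubj, obj and iobj."""
    patterns = defaultdict(Counter)
    for sentence in read_sentences(text):
        by_id = {cols[0]: cols for cols in sentence}
        for cols in sentence:
            upos, feats, head, deprel = cols[3], cols[5], cols[6], cols[7]
            parent = by_id.get(head)
            if deprel not in CORE_RELATIONS or upos not in ("NOUN", "PRON"):
                continue
            if parent is None or parent[3] != "VERB":
                continue
            parent_label = "-".join(filter(None, ["VERB", feature(parent[5], "VerbForm")]))
            child_label = "-".join(filter(None, [upos, feature(feats, "Case")]))
            patterns[deprel][parent_label + "--" + child_label] += 1
    return patterns


for relation, counter in core_argument_patterns(SAMPLE).items():
    print(relation, dict(counter))
```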

Verbs with Reflexive Core Objects

This corpus contains 85 lemmas that occur at least once with a reflexive core object (obj or iobj). Examples: find yourself, find himself, call themselves, find myself, find themselves, force yourself, give yourself, lick themselves, proclaim himself, teach himself, ask yourself, assert himself, associate itself, attach itself, beat yourself, better myself, better yourself, bind ourselves, bring myself, bring themselves, buy myself, call myself, coin myself, comfort yourself, confine ourselves, confine yourself, consider himself, consider themselves, convince yourself, declare himself, declare myself, defend himself, devote himself, discover herself, distinguish himself, distinguish itself, establish herself, exalt itself, expose yourself, fashion himself, feel himself, find itself, fire yourself, fling themselves, get themselves, get yourself, give themselves, go yourself, govern himself, haul themselves
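
Pairs such as "find yourself" combine the lemma of the governing verb with the form of a reflexive pronoun attached as obj or iobj. A minimal Python sketch of that lookup follows, assuming the pronoun carries Reflex=Yes in its FEATS column and that the parent is tagged VERB. The function name `reflexive_object_pairs` and the sample sentences are invented for illustration, and the exact criteria used to produce the count above may differ.

```python
from collections import Counter

# Two invented sentences: "He taught himself" and "They call themselves artists".
SAMPLE = "\n".join([
    "1\tHe\the\tPRON\tPRP\tCase=Nom\t2\tnsubj\t_\t_",
    "2\ttaught\tteach\tVERB\tVBD\tVerbForm=Fin\t0\troot\t_\t_",
    "3\thimself\thimself\tPRON\tPRP\tCase=Acc|Reflex=Yes\t2\tobj\t_\t_",
    "",
    "1\tThey\tthey\tPRON\tPRP\tCase=Nom\t2\tnsubj\t_\t_",
    "2\tcall\tcall\tVERB\tVBP\tVerbForm=Fin\t0\troot\t_\t_",
    "3\tthemselves\tthemselves\tPRON\tPRP\tCase=Acc|Reflex=Yes\t2\tobj\t_\t_",
    "4\tartists\tartist\tNOUN\tNNS\tNumber=Plur\t2\txcomp\t_\t_",
    "",
])


def reflexive_object_pairs(conllu_text):
    """Count (verb lemma, reflexive form) pairs for obj and iobj dependents."""
    pairs = Counter()
    sentence = {}                          # word ID -> columns, current sentence

    def flush():
        for cols in sentence.values():
            is_reflexive = "Reflex=Yes" in cols[5].split("|")
            if cols[7] in ("obj", "iobj") and is_reflexive:
                parent = sentence.get(cols[6])
                if parent is not None and parent[3] == "VERB":
                    pairs[(parent[2], cols[1].lower())] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()                        # sentence boundary
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():          # skip range lines and empty nodes
                sentence[cols[0]] = cols
    flush()                                # last sentence if no trailing blank line
    return pairs


pairs = reflexive_object_pairs(SAMPLE)
print(sorted(verb + " " + pronoun for verb, pronoun in pairs))
print(len({verb for verb, _ in pairs}), "verb lemmas with a reflexive core object")
```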

Relations Overview

This corpus uses 15 relation subtypes: acl:relcl, advcl:relcl, aux:pass, cc:preconj, compound:prt, csubj:outer, csubj:pass, det:predet, nmod:desc, nmod:poss, nmod:unmarked, nsubj:outer, nsubj:pass, obl:agent, obl:unmarked
The following relation type is not used in this corpus at all: clf
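
Both statements can be checked against the relation inventory listed in the Relations section of this page. The Python sketch below separates subtyped relations (those containing a colon) from base relations and compares the base relations with the 37 universal relations of UD version 2. The variable and function names are illustrative, and the relation list is copied from this page rather than read from the treebank files.

```python
# The 37 universal dependency relations defined in UD version 2.
UNIVERSAL_RELATIONS = set("""
acl advcl advmod amod appos aux case cc ccomp clf compound conj cop csubj
dep det discourse dislocated expl fixed flat goeswith iobj list mark nmod
nsubj nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split())

# Relation inventory as listed in the Relations section of this page.
PAGE_RELATIONS = """
acl acl:relcl advcl advcl:relcl advmod amod appos aux aux:pass case cc
cc:preconj ccomp compound compound:prt conj cop csubj csubj:outer csubj:pass
dep det det:predet discourse dislocated expl fixed flat goeswith iobj list
mark nmod nmod:desc nmod:poss nmod:unmarked nsubj nsubj:outer nsubj:pass
nummod obj obl obl:agent obl:unmarked orphan parataxis punct reparandum root
vocative xcomp
""".split()


def summarize_relations(relations):
    """Return (sorted subtypes, sorted universal relations that never occur)."""
    subtypes = sorted(r for r in relations if ":" in r)
    used_base = {r.split(":")[0] for r in relations}   # nsubj:pass counts as nsubj
    unused = sorted(UNIVERSAL_RELATIONS - used_base)
    return subtypes, unused


subtypes, unused = summarize_relations(PAGE_RELATIONS)
print(len(subtypes), "subtypes:", ", ".join(subtypes))
print("unused universal relations:", ", ".join(unused))
```

Run on the list above, this yields 15 subtypes and clf as the only unused universal relation, matching the overview.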