This page pertains to UD version 2.

Enhanced Dependencies in UD v2

The first version of the guidelines provided very little guidance on the enhanced representation, and so far only very few treebanks contain additional dependencies. For v2, we propose the following guidelines for the enhanced representation:

Ellipsis

(See also the notes on ellipsis.)

In the enhanced representation, we add special null nodes in clauses in which a predicate is elided.

I like tea and you E5.1 coffee .

nsubj(like-2, I-1)
obj(like-2, tea-3)
nsubj(E5.1-6, you-5)
conj(like-2, E5.1-6)
obj(E5.1-6, coffee-7)

Mary wants to buy a book and Jenny E8.1 E8.2 a CD .

nsubj(wants-2, Mary-1)
xcomp(wants-2, buy-4)
obj(buy-4, book-6)
conj(wants-2, E8.1-9)
nsubj(E8.1-9, Jenny-8)
xcomp(E8.1-9, E8.2-10)
obj(E8.2-10, CD-12)

Note that this is a case in which the enhanced UD graph is not a supergraph of the basic tree as the basic tree contains orphan relations, which are not present in the enhanced UD graph.
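The supergraph claim can be checked mechanically. The sketch below compares edge sets for the first example. It assumes that the basic tree promotes "you" to the head of the second conjunct and attaches "coffee" to it with orphan. That basic analysis is not shown on this page, so treat it as an assumption. Word forms stand in for token ids, and function words and punctuation are omitted.

```python
# Edges are (relation, governor, dependent) triples.
# Assumed basic tree for "I like tea and you coffee ." (not shown on this page).
basic = {
    ("nsubj", "like", "I"),
    ("obj", "like", "tea"),
    ("conj", "like", "you"),
    ("orphan", "you", "coffee"),
}

# Enhanced graph with the null node E5.1, as given in the example above.
enhanced = {
    ("nsubj", "like", "I"),
    ("obj", "like", "tea"),
    ("conj", "like", "E5.1"),
    ("nsubj", "E5.1", "you"),
    ("obj", "E5.1", "coffee"),
}

# Basic edges that have no counterpart in the enhanced graph.
missing_from_enhanced = basic - enhanced

# True only if every basic edge is also an enhanced edge.
is_supergraph = basic <= enhanced
```

Under these assumptions, the conj and orphan edges of the basic tree are the ones missing from the enhanced graph.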

Controlled/raised subjects

The basic trees lack a subject dependency between a controlled verb and its controller or between an embedded verb and its raised subject. In the enhanced graph, there is an additional dependency between the embedded verb and the subject of the matrix clause.

Mary wants to buy a book .

nsubj(wants, Mary)
xcomp(wants, buy)
nsubj(buy, Mary)

She seems to be reading a book .

nsubj(seems, She)
xcomp(seems, reading)
nsubj(reading, She)
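As a rough illustration, the additional subject edge could be derived from a basic tree as follows. This is a minimal sketch, not a reference implementation: the function name add_controlled_subjects is hypothetical, word forms stand in for token ids, and it assumes subject control or raising only (object control would need a different controller).

```python
# Edges are (relation, governor, dependent) triples, mirroring the
# relation(governor, dependent) notation used on this page.

def add_controlled_subjects(edges):
    """Return the edges plus an nsubj edge from each xcomp dependent
    to the subject of its matrix verb.

    Assumption: subject control or raising only.
    """
    enhanced = list(edges)
    for rel, gov, dep in edges:
        if rel != "xcomp":
            continue
        # Look up the subject(s) of the matrix verb `gov`.
        for rel2, gov2, dep2 in edges:
            if rel2 == "nsubj" and gov2 == gov:
                new_edge = ("nsubj", dep, dep2)
                if new_edge not in enhanced:
                    enhanced.append(new_edge)
    return enhanced


basic = [("nsubj", "wants", "Mary"), ("xcomp", "wants", "buy"), ("obj", "buy", "book")]
enhanced = add_controlled_subjects(basic)
```

For the first example above, the only edge added is nsubj(buy, Mary).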

Propagation of Conjuncts

In the basic representation, the governor and dependents of a conjoined phrase are all attached to the first conjunct. The enhanced representation also contains dependencies between the other conjuncts and the governor and dependents of the phrase.

Conjoined verbs and verb phrases

When two verbs share their objects (or other complements), the subject and the object of the conjoined verbs are attached to every conjunct.

The store buys and sells cameras .

nsubj(buys, store)
nsubj(sells, store)
conj(buys, sells)
obj(buys, cameras)
obj(sells, cameras)

However, if the complements of the second verb are not shared, only the subject is attached to every conjunct.

She was reading or watching a movie .

nsubj(reading, She)
nsubj(watching, She)
conj(reading, watching)
obj(watching, movie)
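The two cases above could be sketched as one function that copies selected dependents of the first conjunct to the other conjuncts. Whether an object is shared cannot be read off the basic tree, so the sketch takes the set of shared relations as a parameter. The function name propagate_dependents and its shared parameter are hypothetical, and word forms stand in for token ids.

```python
# Edges are (relation, governor, dependent) triples.

def propagate_dependents(edges, shared=("nsubj",)):
    """Copy dependents of the first conjunct whose relation is in `shared`
    to every other conjunct."""
    enhanced = list(edges)
    for rel, first, other in edges:
        if rel != "conj":
            continue
        for rel2, gov, dep in edges:
            if gov == first and rel2 in shared:
                new_edge = (rel2, other, dep)
                if new_edge not in enhanced:
                    enhanced.append(new_edge)
    return enhanced


# "The store buys and sells cameras .": subject and object are shared.
buys_sells = [
    ("nsubj", "buys", "store"),
    ("conj", "buys", "sells"),
    ("obj", "buys", "cameras"),
]
shared_both = propagate_dependents(buys_sells, shared=("nsubj", "obj"))

# "She was reading or watching a movie .": only the subject is shared.
reading = [
    ("nsubj", "reading", "She"),
    ("conj", "reading", "watching"),
    ("obj", "watching", "movie"),
]
subject_only = propagate_dependents(reading)
```

The decision about which relations to pass in shared is left to the annotator or to a language-specific heuristic.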

Conjoined subjects and objects

When the subject is a conjoined noun phrase, each of the conjuncts is attached to the predicate.

Paul and Mary are running .

nsubj(running, Paul)
nsubj(running, Mary)
conj(Paul, Mary)

The same is true for conjoined objects.

Paul bought apples and oranges .

nsubj(bought, Paul)
obj(bought, apples)
obj(bought, oranges)
conj(apples, oranges)
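Attaching every conjunct to the governor of the coordinated phrase could be sketched as follows. The function name propagate_governor is hypothetical, word forms stand in for token ids, and the sketch relies on the basic-tree convention stated above that all non-first conjuncts are attached to the first conjunct with conj.

```python
# Edges are (relation, governor, dependent) triples.

def propagate_governor(edges):
    """For each conj(first, other), attach `other` to the governor of
    `first` with the same relation that `first` has."""
    enhanced = list(edges)
    for rel, first, other in edges:
        if rel != "conj":
            continue
        for rel2, gov, dep in edges:
            # The incoming edge of the first conjunct (skip conj itself).
            if dep == first and rel2 != "conj":
                new_edge = (rel2, gov, other)
                if new_edge not in enhanced:
                    enhanced.append(new_edge)
    return enhanced


# "Paul bought apples and oranges ."
basic = [
    ("nsubj", "bought", "Paul"),
    ("obj", "bought", "apples"),
    ("conj", "apples", "oranges"),
]
enhanced = propagate_governor(basic)
```

Here the only edge added is obj(bought, oranges). A first conjunct that is the root has no incoming edge in this representation, so nothing is added for it.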

This leads to slightly strange dependencies in the case of collective subjects or objects:

Paul and Mary are meeting .

nsubj(meeting, Paul)
nsubj(meeting, Mary)
conj(Paul, Mary)

Mary is eating mac and cheese .

nsubj(eating, Mary)
obj(eating, mac)
conj(mac, cheese)
obj(eating, cheese)

However, as the distinction between distributive and collective readings is often context-dependent, we propose to take the simplest approach and to always attach all conjuncts to the predicate.

When the subject is attached to a control or raising predicate, there is a dependency between the matrix verb and each conjunct and between the embedded verb and each conjunct.

Mary and John wanted to buy a hat .

nsubj(wanted, Mary)
nsubj(wanted, John)
conj(Mary, John)
xcomp(wanted, buy)
nsubj(buy, Mary)
nsubj(buy, John)

Conjoined modifiers

Each conjunct in a conjoined modifier phrase gets attached to the governor of the modifier phrase. For example, the following phrase contains a conjoined adjectival phrase that modifies a noun. In the enhanced representation, there is an additional amod relation between the noun river and the second conjunct wide.

a long and wide river

amod(river, long)
amod(river, wide)
conj(long, wide)

Arguments of passive verbs

(See also the notes on core dependents for a detailed discussion of the new analysis of passive constructions in the basic representation.)

We propose that we no longer use a special nsubjpass relation in the basic representation. However, the distinction between regular subjects and subjects in passive constructions is still highly useful for many NLP tasks. We therefore propose to use the relations nsubj:pass and obl:agent for the arguments of a passivized verb.

The book was written by the author .

nsubj:pass(written, book)
obl:agent(written, author)

She was given the book .

nsubj:pass(given, She)
obj(given, book)

Relative clauses

In basic trees, relative pronouns are attached to the main predicate of the relative clause (typically with an nsubj or obj relation). In the corresponding enhanced graphs, the relative pronoun is attached to its antecedent with the special ref relation, and the governor of the relative clause is attached as an argument to the main predicate of the relative clause. Note that such graphs contain a cycle.

Basic tree:

The boy who lived .

acl:relcl(boy, lived)
nsubj(lived, who)

Enhanced graph:

The boy who lived .

acl:relcl(boy, lived)
ref(boy, who)
nsubj(lived, boy)

Basic tree:

The book that I read .

acl:relcl(book, read)
obj(read, that)

Enhanced graph:

The book that I read .

acl:relcl(book, read)
ref(book, that)
obj(read, book)
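The step from the basic tree to the enhanced graph could be sketched as below. The function name enhance_relative_clauses and the relative_pronouns parameter are hypothetical. The sketch assumes that relative pronouns can be recognized by word form and that the pronoun is a direct dependent of the relative-clause predicate. Word forms stand in for token ids.

```python
# Edges are (relation, governor, dependent) triples.

def enhance_relative_clauses(edges, relative_pronouns):
    """Replace rel(predicate, pronoun) by ref(antecedent, pronoun) and
    rel(predicate, antecedent) for each acl:relcl(antecedent, predicate)."""
    # Map each relative-clause predicate to its antecedent.
    antecedent_of = {dep: gov for rel, gov, dep in edges if rel == "acl:relcl"}
    enhanced = []
    for rel, gov, dep in edges:
        if gov in antecedent_of and dep in relative_pronouns:
            antecedent = antecedent_of[gov]
            enhanced.append(("ref", antecedent, dep))
            enhanced.append((rel, gov, antecedent))
        else:
            enhanced.append((rel, gov, dep))
    return enhanced


# "The boy who lived ."
basic = [("acl:relcl", "boy", "lived"), ("nsubj", "lived", "who")]
enhanced = enhance_relative_clauses(basic, relative_pronouns={"who", "that", "which"})
```

In the result, acl:relcl(boy, lived) and nsubj(lived, boy) form the cycle mentioned above.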

Case Information

Adding prepositions (or case information) to the relation names of non-core dependents often makes it possible to disambiguate their semantic roles. We therefore augment nmod, obl, acl and advcl relation labels with the preposition or the case of the modifier.

the house on the hill

nmod:on(house, hill)
case(hill, on)

He went to a diner after leaving work .

obl:to(went, diner)
case(diner, to)
advcl:after(went, leaving)
mark(leaving, after)

die Zerstörung der Stadt
the destruction the.GEN city.GEN

nmod:gen(Zerstörung, Stadt)
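For the preposition-based examples, the label augmentation could be sketched as follows. The function name add_case_info is hypothetical. The sketch assumes the lowercased word form of the case or mark dependent is used as the suffix, and it does not cover morphological case as in the German example, which would require the Case feature of the modifier.

```python
# Edges are (relation, governor, dependent) triples.

AUGMENTED = {"nmod", "obl", "acl", "advcl"}

def add_case_info(edges):
    """Append the word form of a modifier's case or mark dependent
    to its nmod, obl, acl or advcl relation label."""
    # Map each token to the (lowercased) form of its case/mark dependent.
    markers = {gov: dep.lower() for rel, gov, dep in edges if rel in ("case", "mark")}
    enhanced = []
    for rel, gov, dep in edges:
        if rel in AUGMENTED and dep in markers:
            rel = f"{rel}:{markers[dep]}"
        enhanced.append((rel, gov, dep))
    return enhanced


# "He went to a diner after leaving work ."
basic = [
    ("obl", "went", "diner"),
    ("case", "diner", "to"),
    ("advcl", "went", "leaving"),
    ("mark", "leaving", "after"),
]
enhanced = add_case_info(basic)
```

The case and mark edges themselves are kept unchanged, as in the examples above.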

We are aware that adding all of the discussed relations for the enhanced representation will require a significant amount of work. Further, many treebanks have been automatically converted from existing treebanks and many of them might contain only some of the information that is needed to add the relations for the enhanced representation. At the same time, we believe that having some of the additional relations is better than not having any additional relations at all. We therefore leave it up to the treebank maintainers how much of this proposal they want to implement.

However, maintainers should be aware that the different types of relations are not completely independent of each other and adding one type of information (e.g., null nodes) might require changes to existing additional relations. If the additional relations are added manually, we recommend the following order of annotations:

Additional enhancements

Some postprocessing steps, such as demoting light nouns that behave like quantificational determiners (as described in this paper), can improve the usability of the dependency graphs for downstream applications. However, as most of these additions are highly language-specific, we do not provide any universal guidelines for such a representation. Anything beyond the above additions is not part of the UD standard and should not be added to the officially released treebanks.