
This page pertains to UD version 2.

Syntax: General Principles

Syntactic annotation in the UD scheme consists of typed dependency relations between words. The basic dependency representation forms a tree, where exactly one word is the head of the sentence, dependent on a notional ROOT, and all other words are dependent on another word in the sentence, as exemplified below (where we explicitly represent the root dependency, which will otherwise be left implicit).

ROOT she wanted to buy and eat an apple
nsubj(wanted, she)
root(ROOT, wanted)
mark(buy, to)
xcomp(wanted, buy)
cc(eat, and)
conj(buy, eat)
det(apple, an)
obj(buy, apple)
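
The tree property described above can be checked mechanically. The following is a minimal sketch, not part of the UD tooling: the helper names `parse_relations` and `is_tree` are hypothetical, and words are identified by their form alone, which only works here because no word form is repeated in the sentence.

```python
import re

# Relations copied from the example above, in rel(head, dependent) notation.
RELATIONS = """
nsubj(wanted, she)
root(ROOT, wanted)
mark(buy, to)
xcomp(wanted, buy)
cc(eat, and)
conj(buy, eat)
det(apple, an)
obj(buy, apple)
"""

def parse_relations(text):
    """Return a list of (relation, head, dependent) triples."""
    pattern = re.compile(r"([\w:]+)\((\S+),\s*(\S+)\)")
    return [m.groups() for m in pattern.finditer(text)]

def is_tree(triples, root="ROOT"):
    """Check that each word has one head, one word hangs off ROOT, and all words reach ROOT."""
    heads = {}
    for _rel, head, dep in triples:
        if dep in heads:              # a word with two heads is not a tree
            return False
        heads[dep] = head
    if sum(1 for h in heads.values() if h == root) != 1:
        return False
    for word in heads:                # follow head links upward; detect cycles
        seen = set()
        while word != root:
            if word in seen or word not in heads:
                return False
            seen.add(word)
            word = heads[word]
    return True

triples = parse_relations(RELATIONS)
print(is_tree(triples))  # True
```

Adding a second incoming edge for any word, such as nsubj(eat, she), makes the check fail, which is exactly the situation the enhanced representation allows.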

In addition to the basic dependency representation, which is obligatory for all UD treebanks, it is possible to give an enhanced dependency representation, which adds (and in a few cases changes) relations in order to give a more complete basis for semantic interpretation. The enhanced representation is in general not a tree but a general graph structure, as shown below (enhanced dependencies in blue).

# visual-style 5 2 nsubj color:blue
# visual-style 7 2 nsubj color:blue
# visual-style 3 7 xcomp color:blue
# visual-style 7 4 mark color:blue
# visual-style 7 9 obj color:blue
1	ROOT	_	_	_	_	0	root	_	_
2	she	_	_	_	_	3	nsubj	5:nsubj|7:nsubj	_
3	wanted	_	_	_	_	1	root	_	_
4	to	_	_	_	_	5	mark	7:mark	_
5	buy	_	_	_	_	3	xcomp	_	_
6	and	_	_	_	_	7	cc	_	_
7	eat	_	_	_	_	5	conj	3:xcomp	_
8	an	_	_	_	_	9	det	_	_
9	apple	_	_	_	_	5	obj	7:obj	_
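
To make the difference between the tree and the graph concrete, the sketch below collects the basic edge of each row together with the extra edges listed in the ninth column. It assumes, as in the rows above, that this column lists only the added edges; the tuple layout and the function names are hypothetical simplifications of the ten-column format.

```python
# (ID, FORM, HEAD, DEPREL, DEPS) taken from the rows above; "_" means no added edges.
ROWS = [
    (1, "ROOT", 0, "root", "_"),
    (2, "she", 3, "nsubj", "5:nsubj|7:nsubj"),
    (3, "wanted", 1, "root", "_"),
    (4, "to", 5, "mark", "7:mark"),
    (5, "buy", 3, "xcomp", "_"),
    (6, "and", 7, "cc", "_"),
    (7, "eat", 5, "conj", "3:xcomp"),
    (8, "an", 9, "det", "_"),
    (9, "apple", 5, "obj", "7:obj"),
]

def enhanced_edges(rows):
    """Return basic plus added edges as (head_id, dependent_id, relation) triples."""
    edges = []
    for idx, _form, head, rel, deps in rows:
        edges.append((head, idx, rel))
        if deps != "_":
            for item in deps.split("|"):
                head_id, extra_rel = item.split(":", 1)
                edges.append((int(head_id), idx, extra_rel))
    return edges

def incoming_counts(edges):
    """Count incoming edges per dependent; any count above 1 means the structure is not a tree."""
    counts = {}
    for _head, dep, _rel in edges:
        counts[dep] = counts.get(dep, 0) + 1
    return counts

counts = incoming_counts(enhanced_edges(ROWS))
print(counts[2], counts[7], counts[8])  # 3 2 1
```

Here she (ID 2) ends up with three incoming edges and eat (ID 7) with two, so the enhanced structure is a graph rather than a tree.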

In the rest of this document, we discuss the fundamental principles of our dependency annotation, focusing on aspects that are common to both the basic and the enhanced representation. For more information about basic and enhanced dependencies, we refer to the detailed annotation guidelines:

The goal of the typed dependency relations is a set of broadly observed “universal dependencies” that work across languages. Such dependencies seek to maximize parallelism by allowing the same grammatical relation to be annotated the same way across languages, while making enough crucial distinctions such that different things can be differentiated. Two things should be noted from the outset:

We now try to lay down some general principles that should guide the use of universal dependencies to achieve as much parallelism as possible (but not more) across languages. These fall under three headings:

The Primacy of Content Words

Dependency relations hold primarily between content words, rather than being indirect relations mediated by function words.

The cat could have chased all the dogs down the street . nsubj(chased, cat) obj(chased, dogs) obl(chased, street)

Function words attach as direct dependents of the most closely related content word.

The cat could have chased all the dogs down the street . det(cat, The) aux(chased, could) aux(chased, have) det(dogs, all) det(dogs, the-7) case(street, down) det(street, the-10)

Punctuation attaches to the head of the clause or phrase to which it belongs.

The cat could have chased all the dogs down the street . punct(chased, .)

Putting this together gives a complete dependency tree where internal nodes are content words and where function words and punctuation appear as leaves.

The cat could have chased all the dogs down the street . nsubj(chased, cat) obj(chased, dogs) obl(chased, street) det(cat, The) aux(chased, could) aux(chased, have) det(dogs, all) det(dogs, the-7) case(street, down) det(street, the-10) punct(chased, .)
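
The claim that function words and punctuation appear as leaves can be verified for this example with a few lines of code. This is only an illustrative sketch: the `FUNCTIONAL` set covers just the relations used in this sentence, and the function name is hypothetical.

```python
# Relations that attach function words or punctuation in this example only.
FUNCTIONAL = {"det", "aux", "case", "punct"}

# (relation, head, dependent) triples copied from the complete tree above.
TRIPLES = [
    ("nsubj", "chased", "cat"),
    ("obj", "chased", "dogs"),
    ("obl", "chased", "street"),
    ("det", "cat", "The"),
    ("aux", "chased", "could"),
    ("aux", "chased", "have"),
    ("det", "dogs", "all"),
    ("det", "dogs", "the-7"),
    ("case", "street", "down"),
    ("det", "street", "the-10"),
    ("punct", "chased", "."),
]

def non_leaf_function_words(triples):
    """Return function words or punctuation marks that have dependents of their own."""
    heads = {head for _rel, head, _dep in triples}
    return sorted(
        dep for rel, _head, dep in triples
        if rel in FUNCTIONAL and dep in heads
    )

print(non_leaf_function_words(TRIPLES))  # []
```

The empty result reflects that only chased, cat, dogs and street act as heads in this tree.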

Preferring content words as heads maximizes parallelism between languages because content words vary less than function words between languages. In particular, one commonly finds the same grammatical relation being expressed by morphology in some languages or constructions and by function words in other languages or constructions, while some languages may not mark the information at all (such as not marking tense or definiteness).

On a dormi ... nsubj(dormi, On) aux(dormi, a)
We slept ... nsubj(slept, We)
Ivan is the best dancer nsubj(dancer, Ivan) cop(dancer, is) det(dancer, the) amod(dancer, best)
Ivan lučšij tancor nsubj(tancor, Ivan) amod(tancor, lučšij)

The Status of Function Words

The primacy of content words implies that function words normally do not have dependents of their own. In particular, it means that multiple function words related to the same content word always appear as siblings, never in a nested structure, regardless of their interpretation. A typical case is that of auxiliary verbs, which never depend on each other.

She/PRON could/AUX have/AUX been/AUX injured/VERB . aux(injured, could) aux(injured, have) aux:pass(injured, been)

Note that copula verbs are also counted as auxiliaries in this respect. In copula constructions, auxiliaries will therefore often be attached to predicates that are not verbs.

She/PRON could/AUX have/AUX been/AUX sick/ADJ . aux(sick, could) aux(sick, have) cop(sick, been)

Similarly, multiple determiners are always attached to the head noun.

All/DET these/DET three/NUM books/NOUN . det(books, All) det(books, these) nummod(books, three)

An exception to the rule that function words are not chained is the demonstrative+classifier construction, which occurs, for example, in Chinese. Here the classifier forms a constituent with the demonstrative (which is a det) and is attached as a child of the demonstrative.

乘坐 這 輛 巴士 \n Chéngzuò zhè liàng bāshì \n Take this CLF bus
obj(乘坐, 巴士)
det(巴士, 這)
clf(這, 輛)
obj(Chéngzuò, bāshì)
det(bāshì, zhè)
clf(zhè, liàng)
obj(Take, bus)
det(bus, this)
clf(this, CLF)

We are aware that the choice to treat function words formally as dependents of content words is at odds with many versions of dependency grammar, which prefer the opposite relation for many syntactic constructions. We prefer to view the relations between content words and function words, not as dependency relations in the narrow sense, but as operations that modify the grammatical category of the content word so that it can participate in different dependency relations with other content words. We refer to these relations as functional relations or function word relations when we want to emphasize that they are different from dependency relations between content words. This view makes function words functionally (but not structurally) similar to morphological operations and is compatible with Tesnière’s notion of the nucleus as the locus of syntactic dependencies.

Nevertheless, there are four important exceptions to the rule that function words do not take dependents:

  1. Multiword function words
  2. Coordinated function words
  3. Function word modifiers
  4. Promotion by head elision

Multiword Function Words

The word forms that make up a fixed multiword expression are connected using the special dependency relation fixed. By convention, the first word is always taken as the head, so when the multiword expression is a functional element, the initial word form will superficially look like a function word with dependents.

They saw each/DET other/ADJ fixed(each, other) obj(saw, each)

Whether an expression should be treated as a fixed multiword expression has to be decided for each language, and in some cases this will require somewhat arbitrary conventions, because it involves choosing a cut point along a path of grammaticalization. Nevertheless, most languages have some very common multiword expressions that effectively behave like other function words as linkers, marks, or case particles, and it would be highly undesirable not to recognize them as multiword function words. Examples in English include as well as (as a coordinating connective, like and), so that (a complex subordinating connective), and each other (as a reciprocal pronoun). Fixed multiword expressions are contrasted with other headless and/or idiomatic expressions below.

Coordinated Function Words

Head coordination is a syntactic process that can apply to almost any word category, including function words like conjunctions and prepositions. In such cases, the standard analysis of coordination is used and function words have dependents.

She drove to and from work . case(work,to) conj(to, from) cc(from, and)
I will do that if and when it happens . mark(happens,if) conj(if, when) cc(when, and)

Function Word Modifiers

Certain types of function words can take a restricted class of modifiers, mainly light adverbials (including negation). Typical cases are modified determiners like not every (linguist) and exactly two (papers) and modifiers of subordinating conjunctions.

not every linguist det(linguist, every) advmod(every, not)
exactly two papers nummod(papers, two) advmod(two, exactly)
just when you thought it was over mark(thought, when) advmod(when, just)

Negation can modify any function word, but other types of modifiers are disallowed for function words that express properties of the head word often expressed morphologically in other languages. This class, which we refer to as pure function words, includes auxiliary verbs, case markers (adpositions), and articles, but needs to be defined explicitly for each language. When pure function words appear with modifiers other than negation, we take the modifier to apply to the entire phrase and therefore attach it to the head word of the function word, as illustrated in the following example.

right before midnight case(midnight, before) advmod(midnight, right)

The analysis here is that right modifies the entire phrase before midnight and therefore attaches to midnight, which is the head of this phrase. (It is a general property of dependency trees that phrase modification is structurally indistinguishable from head modification.) Further support for this analysis comes from the possibility of replacing before midnight by the adverb then.

right then advmod(then, right)

Making sure that pure function words do not have dependents of their own facilitates comparison with languages where the corresponding properties are expressed morphologically, as well as conversion to the enhanced representation, where this difference is neutralized.

To sum up, our treatment of function word modifiers can be expressed in three principles:

  1. Pure function words can only be modified by negation.
  2. Other function words can also take (other) light adverbial modifiers.
  3. When in doubt, prefer a flat structure where function words attach to a content word.

Note also that the language-specific documentation should specify what words (if any) are treated as pure function words in that language.
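
The first principle lends itself to a simple consistency check. The sketch below is hypothetical and not an official validation rule: the two word lists are invented samples (the guidelines leave the inventory of pure function words to each language), and only advmod edges are inspected, so that fixed expressions, coordination and promotion are not flagged.

```python
# Hypothetical sample lists; real inventories belong in language-specific documentation.
PURE_FUNCTION_WORDS = {"before", "the", "could", "have"}
NEGATION_WORDS = {"not"}

def disallowed_modifiers(triples):
    """Return advmod edges that attach a non-negation modifier to a pure function word."""
    return [
        (rel, head, dep)
        for rel, head, dep in triples
        if rel == "advmod"
        and head in PURE_FUNCTION_WORDS
        and dep not in NEGATION_WORDS
    ]

# "right before midnight": the analysis given above, and the one it rules out.
preferred = [("case", "midnight", "before"), ("advmod", "midnight", "right")]
dispreferred = [("case", "midnight", "before"), ("advmod", "before", "right")]
# "not every linguist": "every" is not a pure function word, so the modifier is fine.
allowed = [("det", "linguist", "every"), ("advmod", "every", "not")]

print(disallowed_modifiers(preferred))     # []
print(disallowed_modifiers(dispreferred))  # [('advmod', 'before', 'right')]
print(disallowed_modifiers(allowed))       # []
```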

Promotion by Head Elision

When the natural head of a function word is elided, the function word will be “promoted” to the function normally assumed by the content word head. This type of analysis should in general be preferred over an analysis using the orphan relation, because it disrupts the structure less. The orphan analysis of ellipsis should only be used when there is no function word that can be promoted. The following examples illustrate promotion of auxiliaries, prepositions and subordinating conjunctions (but only the first example illustrates the exception to the rule that function words have no dependents).

Bill could not answer , but Ann could . nsubj(answer, Bill) aux(answer, could-2) conj(answer, could-8) nsubj(could-8, Ann)
The address she wrote to . acl:relcl(address, wrote) nsubj(wrote, she) obl(wrote, to)
I know how . nsubj(know, I) ccomp(know, how)

The Taxonomy of Typed Dependencies

We now review some of the key ideas underlying our taxonomy of typed dependency relations, focusing first on the central dependency relations between content words.

Core Arguments vs. Oblique Modifiers

The UD taxonomy is centered around the fairly clear distinction between core arguments (subjects, objects, clausal complements) and other dependents. It does not make a distinction between adjuncts (general modifiers) and oblique arguments (arguments said to be selected by a head but not expressed as a core argument). The rest of this section expands on the linguistic basis of these choices, and may be skipped.

The definition of core arguments

The core/oblique distinction is ultimately an information packaging distinction. All or nearly all languages have a basic way of expressing the one or two arguments of most verbs (intransitive and transitive verbs), and this unmarked form of argument expression is as a core argument. If additional arguments can appear that are treated similarly to these arguments, they may also be regarded as core arguments. (Some languages have no additional core arguments, while other languages allow multiple object arguments, for instance.) Status as a core argument is decoupled from the semantic roles of participants. Normally, depending on the meaning of a verb, many different semantic roles can be expressed by the same means of encoding core arguments. Nevertheless, there is a correlation: agent and patient or theme roles of predicates in their unmarked valence are normally realized as core arguments.

Syntactically, there is not a single criterion which can be used crosslinguistically to distinguish core arguments from obliques, though there are often good and useful criteria for particular languages. These include:

At the end of the day, the distinction must be drawn and documented on language-particular grounds. For example, many languages have certain verbs which take arguments in oblique cases such as dative or an experiencer case, but these arguments should be regarded as core arguments based on their syntactic behavior being parallel to the arguments of other transitive verbs.

Avoiding an argument/adjunct distinction

Many grammatical frameworks suggest that some obliques are selected by or are arguments of a head (for instance, a source phrase such as from the Queen is an argument of the head receive), while other obliques are general adjuncts, which can appear with any predicate without the head selecting for them (for instance, a temporal modifier such as after the holidays).

However, the argument/adjunct distinction is subtle, unclear, and frequently argued over. For instance, syntacticians at certain times have argued for various obliques to be arguments, while at other times arguing that they are adjuncts, particularly for certain semantic roles such as oblique instruments or sources. We take the distinction to be sufficiently subtle (and its existence as a categorical distinction sufficiently questionable) that the best practical solution is to eliminate it. This approach echoes the viewpoint of the original Penn Treebank annotators.

The core-oblique distinction is generally accepted in language typology as being both more relevant and easier to apply cross-linguistically than the argument-adjunct distinction. See, for example:

A Mixed Functional-Structural System

One major role of dependencies is to represent function, but the Universal Dependencies also encode structural notions. On the structural side, languages are taken to principally involve three things:

This three-way distinction is generally encoded in dependency names. For example, if a verb is taking an adverbial modifier, it may bear one of three relations, obl, advcl, or advmod, depending on which of these three sorts it is:

John talked in the movie theatre case(theatre, in) det(theatre, the) compound(theatre, movie) obl(talked, theatre)
John talked while we were watching the movie mark(watching, while) nsubj(watching, we) aux(watching, were) advcl(talked, watching) det(movie, the) obj(watching, movie)
John talked very quickly advmod(quickly, very) advmod(talked, quickly)

Similarly, the core grammatical relations differentiate core arguments that are clauses (e.g., csubj, ccomp) from those that are nominal phrases (e.g., nsubj, obj).

Clausal Dependents

To classify dependents of the main predicate in a clause, the UD taxonomy obeys the following principles:

Additional distinctions (for example, with respect to voice) can be captured via language-specific subtypes (such as nsubj:pass for the subject of a passivized verb). Note that the UD taxonomy does not attempt to differentiate finite from nonfinite clauses.

Coordination

UD in principle assumes a symmetric relation between conjuncts, which have equal status as syntactic heads of the coordinate structure. However, because the dependency tree format does not allow this analysis to be encoded directly, the first conjunct in the linear order is by convention treated as the parent (or “technical head”) and all the other conjuncts are attached to it via the conj relation. Coordinating conjunctions and punctuation delimiting the conjuncts are attached to the associated conjunct using the cc and punct relations respectively.

He came home , took a shower and immediately went to bed .
conj(came, took)
conj(came, went)
punct(took, ,-4)
cc(went, and)
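
The convention of treating the first conjunct as the technical head can be expressed as a small helper. This is a sketch under simplifying assumptions: the function `coordinate` and its `linkers` argument are hypothetical, and the conjunct heads are assumed to be given in linear order.

```python
def coordinate(conjuncts, linkers):
    """Build (relation, head, dependent) triples for a coordinate structure.

    conjuncts: heads of the conjuncts in linear order.
    linkers: maps a non-first conjunct to the (word, relation) pairs that
             introduce it, e.g. a conjunction (cc) or a comma (punct).
    """
    first, rest = conjuncts[0], conjuncts[1:]
    # Every non-first conjunct attaches to the first one via conj.
    edges = [("conj", first, conjunct) for conjunct in rest]
    # Conjunctions and delimiting punctuation attach to their own conjunct.
    for conjunct in rest:
        for word, rel in linkers.get(conjunct, []):
            edges.append((rel, conjunct, word))
    return edges

edges = coordinate(
    ["came", "took", "went"],
    {"took": [(",-4", "punct")], "went": [("and", "cc")]},
)
for rel, head, dep in edges:
    print(f"{rel}({head}, {dep})")
```

For the sentence above this prints the same four relations as the annotated example: two conj edges from came, the comma under took, and and under went.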

Lexical Relations

UD provides the compound relation for head-modifier combinations that morphosyntactically resemble single lexemes, e.g. apple juice and work out. The criteria for compound need to be established on a language-specific basis.

Multiword Expressions and Headless Structures

Multiword expressions are combinations of words that (in some respect and to different degrees) behave as lexical units rather than compositional syntactic phrases, in particular by being semantically non-compositional. Since the UD annotation is concerned with morphosyntactic structure, most multiword expressions are not recognized as such in the UD annotation. The only exception is the class of fixed expressions like connective as well as and reciprocal pronoun each other, which are completely frozen and (often) morphosyntactically irregular. As discussed above, such expressions are annotated using the fixed relation to indicate that their internal structure is not regular and productive. Some other relations, such as compound and flat, are often appropriate for expressions that also happen to be non-compositional, but they are defined by morphosyntactic criteria and not by non-compositionality or other properties characteristic of multiword expressions.

Structures analyzed with fixed and flat are headless by definition and are consistently annotated by attaching all non-first elements to the first and only allowing outgoing dependents from the first element.

I like dogs as/ADV well/ADV as/ADP cats . fixed(as-4, well) fixed(as-4, as-6) cc(cats, as-4) conj(dogs, cats)
Barack/PROPN Obama/PROPN won the election . nsubj(won,Barack) flat(Barack,Obama)
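
The attachment pattern for headless structures is regular enough to state as code. The helper below is a hypothetical illustration, not part of any UD tool: it simply attaches every non-first element to the first one.

```python
def headless(words, relation):
    """Attach all non-first words to the first word with the given relation."""
    if relation not in {"fixed", "flat"}:
        raise ValueError("only fixed and flat are treated as headless here")
    first = words[0]
    return [(relation, first, word) for word in words[1:]]

print(headless(["as-4", "well", "as-6"], "fixed"))
# [('fixed', 'as-4', 'well'), ('fixed', 'as-4', 'as-6')]
print(headless(["Barack", "Obama"], "flat"))
# [('flat', 'Barack', 'Obama')]
```

Relations to the rest of the sentence, such as cc(cats, as-4) or nsubj(won, Barack), then involve only the first element.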

By contrast, compounds are annotated to show their modification structure, including a regular concept of head:

I bought a computer disk drive enclosure . nsubj(bought, I) det(enclosure, a) compound(drive, computer) compound(drive, disk) compound(enclosure, drive) obj(bought, enclosure)

Special Relations

Besides core dependency relations, functional relations, and relations for analyzing coordination and headless structures, the UD taxonomy includes a number of special relations for handling things like punctuation (punct), orthographic errors in text (goeswith), disfluencies in speech (reparandum), and list structures without internal syntactic structure (list).