This page pertains to UD version 2.

Universal Dependencies

The following table lists the 37 universal syntactic relations used in UD v2. It is a revised version of the relations originally described in Universal Stanford Dependencies: A cross-linguistic typology (de Marneffe et al. 2014).

The upper part of the table follows the main organizing principles of the UD taxonomy:

The lower part of the table lists relations that are not dependency relations in the narrow sense:

                    | Nominals                        | Clauses             | Modifier words     | Function Words
Core arguments      | nsubj, obj, iobj                | csubj, ccomp, xcomp |                    |
Non-core dependents | obl, vocative, expl, dislocated | advcl               | advmod*, discourse | aux, cop, mark
Nominal dependents  | nmod, appos, nummod             | acl                 | amod               | det, clf, case

Coordination | Headless    | Loose           | Special                                | Other
conj, cc     | fixed, flat | list, parataxis | compound, orphan, goeswith, reparandum | punct, root, dep
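
As a rough cross-check of the count of 37, the relations listed above can be written out as a Python mapping. This is only a sketch: the variable and function names are illustrative, and the assignment of relations to the lower-part groups is inferred from the order in which they are listed on this page.

```python
# Relations grouped by row (upper part) and by column heading (lower part).
# The grouping of the lower part is an assumption based on listing order.
UPPER = {
    "Core arguments": ["nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp"],
    "Non-core dependents": ["obl", "vocative", "expl", "dislocated", "advcl",
                            "advmod", "discourse", "aux", "cop", "mark"],
    "Nominal dependents": ["nmod", "appos", "nummod", "acl", "amod",
                           "det", "clf", "case"],
}
LOWER = {
    "Coordination": ["conj", "cc"],
    "Headless": ["fixed", "flat"],
    "Loose": ["list", "parataxis"],
    "Special": ["compound", "orphan", "goeswith", "reparandum"],
    "Other": ["punct", "root", "dep"],
}

UNIVERSAL_RELATIONS = [
    rel for part in (UPPER, LOWER) for rels in part.values() for rel in rels
]


def universal_part(label: str) -> str:
    """Strip a language-specific subtype: 'acl:relcl' -> 'acl'."""
    return label.split(":", 1)[0]


def is_known_label(label: str) -> bool:
    """True if the label, ignoring any subtype, is one of the listed relations."""
    return universal_part(label) in UNIVERSAL_RELATIONS
```

With these definitions, len(UNIVERSAL_RELATIONS) gives the total number of relations, and a subtyped label such as advmod:emph is accepted because its universal part is advmod.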

acl: clausal modifier of noun (adnominal clause)

acl stands for finite and non-finite clauses that modify a nominal. The acl relation contrasts with the advcl relation, which is used for adverbial clauses that modify a predicate. The head of the acl relation is the noun that is modified, and the dependent is the head of the clause that modifies the noun.

the issues as he sees them
acl(issues, sees)
There are many online sites offering booking facilities .
acl(sites, offering)
I have a parakeet named cookie .
acl(parakeet, named)
A president certain that they are correct is dangerous . 
acl(president, certain)
ccomp(certain, correct)
nsubj(dangerous, president)
I just want a simple way to get my discount .
acl(way, get)
Cette affaire à suivre \n This case to follow 
acl(affaire, suivre)
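
The examples on this page write each dependency as relation(head, dependent) on its own line under the sentence. A minimal sketch of reading such a line into a triple is given below; the regular expression and the function name are assumptions made for illustration, not an official parser for this notation.

```python
import re

# relation, optionally with one ":subtype", followed by "(head, dependent)".
DEP_LINE = re.compile(r"^([a-z]+(?::[a-z]+)?)\((.+?), (.+)\)$")


def parse_dep(line):
    """Return (relation, head, dependent), or None if the line is not a dependency."""
    match = DEP_LINE.match(line.strip())
    if match is None:
        return None
    return match.groups()


print(parse_dep("acl(issues, sees)"))
print(parse_dep("acl:relcl(man, love)"))
print(parse_dep("the issues as he sees them"))
```

Sentence lines do not match the pattern and yield None, so the same function can be used to separate sentences from their annotations.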

A relative clause is an instance of acl, characterized by finiteness and usually omission of the modified noun in the embedded clause. Some languages use a language-particular subtype acl:relcl for the traditional class of relative clauses.

I saw the man you love
acl:relcl(man, love)

Some languages allow finite clausal complements for a subset of nouns such as fact or report. These look roughly like relative clauses, but do not have any omitted role in the dependent clause. This is the class of “content clauses” in Huddleston and Pullum (2002). These are also analyzed as acl.

the fact that nobody cares
acl(fact, cares)

This relation is no longer used for optional depictives: advcl should be used instead.

acl:relcl: relative clause modifier

A relative clause modifier of a nominal is a clause that modifies the nominal, whereas the nominal is coreferential with a constituent inside the relative clause (here the constituent may be realized as a relative pronoun, another relative word, or it may not be overtly realized at all). The acl:relcl relation points from the head of the modified nominal to the head of the relative clause.

Depending on the language, relative clauses may be required to be finite. For example, English non-finite clauses are traditionally not termed relative; therefore, the girl that was born today contains a relative clause because the clause is finite, while the clause in the girl born today is non-finite (the participle is not accompanied by a finite auxiliary) and uses the plain acl relation. In other languages, however, the distinction between finite and non-finite clauses may not exist or may not be used as a criterion for relative clauses.

I saw the man you love
acl:relcl(man, love)
I saw the book which you bought
acl:relcl(book, bought)

advcl: adverbial clause modifier

An adverbial clause modifier is a clause which modifies a verb or other predicate (adjective, etc.), as a modifier not as a core complement. This includes things such as a temporal clause, consequence, conditional clause, purpose clause, etc. The dependent must be clausal (or else it is an advmod) and the dependent is the main predicate of the clause.

The accident happened as night was falling
advcl(happened, falling)
If you know who did it, you should tell the teacher
advcl(tell, know)
He talked to him in order to secure the account
advcl(talked, secure)
He was upset when I talked to him
advcl(upset, talked)
They heard about you missing classes.
advcl(heard, missing)
With the kids in school , I have plenty of free time
advcl(have, school)
mark(school, With)
nsubj(school, kids)
case(school, in)
She entered the room while sad
advcl(entered, sad)

Modifying Nominal Predicates

An advcl never modifies a nominal as such (then it would be acl instead) but it can modify a clausal predicate that is realized as a nominal, with or without copula. One has to distinguish whether the modifier clause modifies the whole predication of the matrix clause, or just the entity denoted by the nominal. Hence we have advcl in

He is a teacher , although he no longer teaches .
advcl(teacher, teaches)

but acl:relcl in

He is a teacher whom the students really love .
acl:relcl(teacher, love)

Optional Depictives

This relation is also used for optional depictive adjectives, where the adjective is introduced in clause structure independently of the nominal it describes (contrast: acl if the adjective is an adnominal predicate). The depictive adjective is treated as an adverbial clause modifier of the higher clause. The adjective also provides a secondary predication, where the nominal predicand may or may not be overt; if it is overt, the secondary predication can be represented with an enhanced dependency. See xcomp for further discussion of resultatives and depictives.

She entered the room sad
advcl(entered, sad)

Sad describes the person entering the room, not the manner of entering—but is still taken to modify the verb. Note the similarity to the while sad example above. Omitting the nominal predicand she does not change the basic analysis:

Entering the room sad is not recommended
advcl(Entering, sad)

advcl:relcl: adverbial relative clause modifier

This relation applies to a relative clause that modifies a clause (as opposed to typical relative clauses, which are adnominal and use acl:relcl).

For example, the antecedent is a clause in:

I tried to explain myself – which was a bad idea .
advcl:relcl(tried, idea)
nsubj(idea, which)

advmod: adverbial modifier

An adverbial modifier of a word is a (non-clausal) adverb or adverbial phrase that serves to modify a predicate or a modifier word.

In some situations in some languages, a limited set of adverbs can also modify nominals (e.g., only on Monday). The advmod relation or its subtype has to be used in such cases, too (see also advmod:emph).

Note that in some grammatical traditions, the term adverbial modifier covers constituents that function like adverbs regardless of whether they are realized by adverbs, adpositional phrases, or nouns in particular morphological cases. We differentiate adverbials realized as adverbs (advmod) and adverbials realized by noun phrases or adpositional phrases (obl). However, we do not differentiate between modifiers of predicates (adverbials in a narrow sense) and modifiers of other modifier words like adjectives or adverbs (sometimes called qualifiers). These functions are all subsumed under advmod.

Genetically modified food
advmod(modified, Genetically)
less often
advmod(often, less)
Where/ADV do/AUX you/PRON want/VERB to/PART go/VERB later/ADV ?/PUNCT
advmod(go, Where)
advmod(go, later)
This is where/ADV I lived when/ADV I was born
nsubj(where, This)
cop(where, is)
advcl:relcl(where, lived)
advcl(lived, born)
advmod(born, when)
About 200 people came to the party
advmod(200, About)
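
The division of labour described in this section and under advcl (adverbs take advmod, noun phrases and adpositional phrases take obl, clauses take advcl) can be sketched as a lookup on the dependent's head word. This is a deliberately coarse illustration: the function name and the set of nominal tags are assumptions, and real annotation decisions depend on more than the part of speech (for example, optional depictive adjectives take advcl, and a nominal that depends on a noun takes nmod).

```python
# UPOS tags treated as nominal for this sketch (an assumption, not a UD rule).
NOMINAL_UPOS = {"NOUN", "PROPN", "PRON"}


def predicate_modifier_relation(dependent_upos: str, is_clause: bool = False) -> str:
    """Pick a relation for a non-core modifier of a predicate (coarse sketch)."""
    if is_clause:
        return "advcl"   # a clausal modifier is never advmod
    if dependent_upos == "ADV":
        return "advmod"  # adverbial realized as an adverb
    if dependent_upos in NOMINAL_UPOS:
        return "obl"     # adverbial realized as a noun phrase or adpositional phrase
    raise ValueError(f"not covered by this sketch: {dependent_upos}")


print(predicate_modifier_relation("ADV"))                   # later -> advmod
print(predicate_modifier_relation("NOUN"))                  # (after the) rehearsal -> obl
print(predicate_modifier_relation("VERB", is_clause=True))  # (as night was) falling -> advcl
```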

advmod:emph: emphasizing word, intensifier

This is a special class of adverbial modifiers. It corresponds to the words that are attached in the analytical layer of PDT with the label AuxZ. In the tectogrammatical layer they often get the label (functor) RHEM (rhematizers).

While other adverbial modifiers usually modify verbs, adjectives or adverbs, these emphasizers often modify noun phrases, including prepositional phrases.

zvlášť v pondělí \n especially on Monday
advmod:emph(pondělí, zvlášť)
advmod:emph(Monday, especially)
jen 15 procent \n only 15 percent
advmod:emph(procent, jen)
advmod:emph(percent, only)

Other examples:

advmod:lmod: locative adverbial modifier

A locative adverbial modifier is a subtype of the advmod relation: if the modifier is specifying a location, it is labeled as lmod.

Danish:

Han bøjer sig ned . \n He bends himself down .
advmod:lmod(bøjer, ned)

amod: adjectival modifier

An adjectival modifier of a noun (or pronoun) is any adjectival phrase that serves to modify the noun (or pronoun). The relation applies whether the meaning of the noun is modified in a compositional way (e.g., large house) or an idiomatic way (hot dogs).

An amod dependent may have its own modifiers (e.g., very large house) but the dependent should not be a clause. If it is a clause, then acl should be used.

Sam eats large hot dogs
amod(dogs, large)
amod(dogs, hot)
There is nothing wrong with it
amod(nothing, wrong)

appos: appositional modifier

An appositional modifier of a noun is a nominal immediately following the first noun that serves to define, modify, name, or describe that noun. It includes parenthesized examples, as well as defining abbreviations in one of these structures.

Sam , my brother , arrived
appos(Sam-1, brother-4)
Bill ( John 's cousin )
appos(Bill-1, cousin-5)
The Australian Broadcasting Corporation ( ABC )
appos(Corporation-4, ABC-6)

appos is intended to be used between two nominals. In general, modulo punctuation, the two halves of an apposition can be switched. For example, you could also say My brother, Sam, arrived. There are somewhat similar constructions with titles where the title is less than a full nominal, such as state senator Paul Mnuchin, where reversal is impossible or would require insertion of a determiner to make a full nominal. Some grammatical traditions, descending from Latin, call state senator in such cases a “fixed (or close) apposition” and take the name as the head. However, we seem to have only one nominal not two here. For example:

President Obama

*Obama President

state senator Paul Mnuchin

*Paul Mnuchin state senator

appos should not be used in such cases. However, the examples can usually be rendered in a fuller form, corresponding to “loose (or wide) apposition” in the Latin tradition, where there are two full phrases. Then the relation appos is appropriate, for example:

Paul Mnuchin , the senior Oregon state senator
appos(Mnuchin-2, senator-8)

As is often the case, there are borderline cases. In formal writing, punctuation is usually a good signal of apposition, but there are certainly cases of apposition where no punctuation is used:

the leader of the militant Lebanese Shiite group Hassan Nasrallah
appos(leader-2, Hassan-9)
flat(Hassan-9, Nasrallah-10)

Good tests include asking whether the two halves are full nominals, whether the two halves can be swapped, and whether there is case or agreement concord (in a language with rich morphology). So we have:

I met the French actor Gaspard Ulliel
nsubj(met-2, I-1)
det(actor-5, the-3)
amod(actor-5, French-4)
obj(met-2, actor-5)
appos(actor-5, Gaspard-6)
flat(Gaspard-6, Ulliel-7)
I met Gaspard Ulliel the French actor 
nsubj(met-2, I-1)
obj(met-2, Gaspard-3)
flat(Gaspard-3, Ulliel-4)
det(actor-7, the-5)
amod(actor-7, French-6)
appos(Gaspard-3, actor-7)
I met Gaspard Ulliel , the French actor 
nsubj(met-2, I-1)
obj(met-2, Gaspard-3)
flat(Gaspard-3, Ulliel-4)
punct(Gaspard-3, ,-5)
det(actor-8, the-6)
amod(actor-8, French-7)
appos(Gaspard-3, actor-8)
I met French actor Gaspard Ulliel
nsubj(met-2, I-1)
amod(actor-4, French-3)
obj(met-2, actor-4)
flat(actor-4, Gaspard-5)
flat(actor-4, Ulliel-6)

While items like abbreviations are generally reversible, the determiner test suggested above doesn’t quite work there, since the determiner seems to belong with the main item:

The ABC ( Australian Broadcasting Corporation )
appos(ABC-2, Corporation-6)

In the rare cases of more than one appositive nominal, all nouns should be marked as modifying the first noun, rather than being chained:

Sam , my brother , John 's cousin , arrived
appos(Sam-1, brother-4)
appos(Sam-1, cousin-8)

Note however that nested apposition cannot be completely excluded. It may occur in combination with coordination:

You can choose between four subjects , language ( German or French ) , economy , technology and art .
appos(subjects, language)
conj(language, economy)
conj(language, technology)
conj(language, art)
cc(art, and)
appos(language, German)
conj(German, French)
cc(French, or)

appos is also used to link key-value pairs in addresses, signature blocks, etc. (see also the list label):

Steve Jones Phone: 555-9814 Email: jones@abc.edf
flat:name(Steve-1, Jones-2)
list(Steve-1, Phone:-3)
list(Steve-1, Email:-5)
appos(Phone:-3, 555-9814-4)
appos(Email:-5, jones@abc.edf-6)

aux: auxiliary

An aux (auxiliary) of a clause is a function word associated with a verbal predicate that expresses categories such as tense, mood, aspect, voice or evidentiality. It is often a verb (which may have non-auxiliary uses as well) but many languages have nonverbal TAME markers and these are also treated as instances of aux.

New from v2: Auxiliaries used to construct the passive voice are now also labeled aux, although we strongly encourage the use of the subtype aux:pass in languages that have a grammaticalized (periphrastic) passive.

Reagan has died
aux(died-3, has-2)
He should leave
aux(leave-3, should-2)
Do you think that he will have left by the time we come ?
aux(think, Do)
aux(left, will)
aux(left, have)

aux:pass: passive auxiliary

A passive auxiliary of a clause is a form of the auxiliary verb být “to be” used to construct the periphrastic passive voice (in any tense or in the infinitive).

Kennedy byl zabit . \n Kennedy was killed .
aux:pass(zabit, byl)
aux:pass(killed, was)
Kennedy bude zabit . \n Kennedy will-be killed .
aux:pass(zabit, bude)
aux:pass(killed, will-be)
Kennedy netušil , že jeho osudem je být zabit . \n Kennedy did-not-anticipate that his fate is to-be killed .
aux:pass(zabit, být)
aux:pass(killed, to-be)

Note that the passive participle may also be used as a nominal predicate with a copula. Hence it may be difficult to distinguish a passive construction from a copula construction. The former focuses on the process while the latter emphasizes the result.

Smlouva byla podepsána v Bílém domě . \n Contract was signed in White House .
aux:pass(podepsána, byla)
aux:pass(signed, was)
Smlouva byla podepsána červeným inkoustem . \n Contract was signed in-red ink .
cop(podepsána, byla)
cop(signed, was)

case: case marking

The case relation is used for any case-marking element which is treated as a separate syntactic word (including prepositions, postpositions, and clitic case markers). Case-marking elements are treated as dependents of the noun they attach to or introduce. (Thus, contrary to SD, UD abandons treating a preposition as a mediator between a modified word and its object.) The case relation aims at providing a more uniform analysis of nominal elements, prepositions and case in morphologically rich languages: a nominal in an oblique case will receive the same dependency structure as a nominal introduced by an adposition.

the Chair 's office
det(Chair-2, the-1)
nmod(office-4, Chair-2)
case(Chair-2, 's-3)
the office of the Chair
det(office-2, the-1)
nmod(office-2, Chair-5)
case(Chair-5, of-3)
det(Chair-5, the-4)

French:

le bureau de le président \n the office of the Chair
det(bureau, le-1)
nmod(bureau, président)
case(président, de)
det(président, le-4)

Hebrew:

hwa/PRON rah/VERB at/PART[Case=Acc] h/DET klb/NOUN \n he saw ACC the dog  
obj(rah-2, klb-5)
case(klb-5, at-3)

When case markers are morphemes, they are not divided off the noun as a separate case dependent, but the noun as a whole is analyzed as obl (if dependent on a predicate) or nmod (if dependent on a noun). To overtly mark case, POS tags and features are included in the representation, as shown below in a Russian example.

# I wrote the letter with a quill.
1   Я         ja         PRON   _   Case=Nom|Number=Sing|Person=1|PronType=Prs        2   nsubj   _   I
2   написал   napisat'   VERB   _   Gender=Masc|Number=Sing|VerbForm=Part|Voice=Act   0   root    _   wrote
3   письмо    pis'mo     NOUN   _   Case=Acc|Gender=Neut|Number=Sing                  2   obj    _   the-letter
4   пером     pero       NOUN   _   Case=Ins|Gender=Neut|Number=Sing                  2   obl    _   with-a-quill
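
To make the point about morphological case concrete, the sketch below reads the four rows above and pairs each word's Case feature with its relation. It is an illustration only: the helper names are made up, and the rows are split on whitespace for convenience, whereas real CoNLL-U files are tab-separated.

```python
SAMPLE = """\
1 Я ja PRON _ Case=Nom|Number=Sing|Person=1|PronType=Prs 2 nsubj _ I
2 написал napisat' VERB _ Gender=Masc|Number=Sing|VerbForm=Part|Voice=Act 0 root _ wrote
3 письмо pis'mo NOUN _ Case=Acc|Gender=Neut|Number=Sing 2 obj _ the-letter
4 пером pero NOUN _ Case=Ins|Gender=Neut|Number=Sing 2 obl _ with-a-quill
"""


def parse_feats(feats):
    """Turn 'Case=Ins|Number=Sing' into {'Case': 'Ins', 'Number': 'Sing'}."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_rows(text):
    """Read ten-column rows, skipping blank lines and '#' comment lines."""
    rows = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split()  # whitespace split is enough for this sample
        rows.append({
            "id": cols[0], "form": cols[1], "upos": cols[3],
            "feats": parse_feats(cols[5]), "head": cols[6], "deprel": cols[7],
        })
    return rows


rows = parse_rows(SAMPLE)
case_by_form = {r["form"]: (r["feats"].get("Case"), r["deprel"]) for r in rows}
for form, (case, deprel) in case_by_form.items():
    print(form, case, deprel)
```

The instrumental noun пером ends up with the relation obl and no separate case dependent, which is the structure an adposition-marked nominal would also receive apart from its extra case edge.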

This treatment provides parallelism between different constructions across and within languages. A good result is that we now have greater parallelism between prepositional phrases and subordinate clauses, which are often introduced by a preposition in some languages (but note that the relation should be mark in those cases):

Sue left after the rehearsal
nsubj(left-2, Sue-1)
obl(left-2, rehearsal-5)
det(rehearsal-5, the-4)
case(rehearsal-5, after-3)
Sue left after we did
nsubj(left-2, Sue-1)
advcl(left-2, did-5)
mark(did-5, after-3)
nsubj(did-5, we-4)

We also obtain parallel constructions for

the Chair 's office
det(Chair-2, the-1)
nmod(office-4, Chair-2)
case(Chair-2, 's-3)
the office of the Chair
det(office-2, the-1)
nmod(office-2, Chair-5)
case(Chair-5, of-3)
det(Chair-5, the-4)
etsiä ilman johtolankaa \n to_search without clue.PARTITIVE
obl(etsiä, johtolankaa)
case(johtolankaa, ilman)
etsiä taskulampun kanssa \n to_search torch.GENITIVE with
obl(etsiä, taskulampun)
case(taskulampun, kanssa)
etsiä johtolangatta \n to_search clue.ABESSIVE
obl(etsiä, johtolangatta)
give the children the toys
obj(give, toys)
iobj(give, children)
give the toys to the children
obj(give, toys)
obl(give, children)
case(children, to)
# give the toys to the children
1     donner    donner   VERB   _   VerbForm=Inf               0   root   _   give
2     les       le       DET    _   Definite=Def|Number=Plur   3   det    _   the
3     jouets    jouet    NOUN   _   Gender=Masc|Number=Plur    1   obj   _   toys
4-5   aux       _        _      _   _                          _   _      _   _
4     à         à        ADP    _   _                          6   case   _   to
5     les       le       DET    _   Definite=Def|Number=Plur   6   det    _   the
6     enfants   enfant   NOUN   _   Gender=Masc|Number=Plur    1   obl   _   children
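
The French rows above include a range line (4-5 aux) for a contracted surface form that is split into two syntactic words. A sketch of how such a line might be skipped when collecting dependencies follows; the row tuples keep only the ID, form, head and relation columns, and the function name is illustrative.

```python
# (ID, FORM, HEAD, DEPREL) taken from the example; other columns are omitted.
ROWS = [
    ("1", "donner", "0", "root"),
    ("2", "les", "3", "det"),
    ("3", "jouets", "1", "obj"),
    ("4-5", "aux", "_", "_"),   # surface token covering words 4 and 5
    ("4", "à", "6", "case"),
    ("5", "les", "6", "det"),
    ("6", "enfants", "1", "obl"),
]


def dependency_edges(rows):
    """Return (relation, head form, dependent form), ignoring range lines."""
    forms = {i: form for i, form, _, _ in rows if "-" not in i}
    edges = []
    for i, form, head, rel in rows:
        if "-" in i:
            continue  # a range line carries the surface form only, no syntax
        edges.append((rel, forms.get(head, "ROOT"), form))
    return edges


for edge in dependency_edges(ROWS):
    print(edge)
```

Only the six syntactic words contribute edges; the preposition à attaches to enfants as case, parallel to the English to in give the toys to the children.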

Another advantage of this new analysis is that it provides a treatment of prepositional phrases that are predicative complements of “be” that is consistent with the treatment of nominal predicative complements:

Sue is in shape
nsubj(shape-4, Sue-1)
cop(shape-4, is-2)
case(shape-4, in-3)

When prepositions are stacked (that is, there is a sequence of prepositions), there are two possible analyses. If the sequence is a frozen combination with a specific meaning, then the best analysis is as fixed. An English example of this is out of:

Out of all this , something good will come .
case(this-4, Out-1)
fixed(Out-1, of-2)
det(this-4, all-3)
obl(come, this-4)

However, if various combinations of prepositions can be used to express different meaning combinations or nuances, then each preposition is independently analyzed as a case dependent. Examples of this in English include up beside (which can alternate with down beside or up near) and except during (which can alternate with as during or except after):

The cafe up beside the lookout
det(cafe-2, The-1)
case(lookout-6, up-3)
case(lookout-6, beside-4)
det(lookout-6, the-5)
nmod(cafe-2, lookout-6)
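
The two analyses of stacked prepositions can be sketched as follows. The set of frozen combinations is a hypothetical stand-in for whatever list a treebank maintains; only out of comes from this page.

```python
# Hypothetical lexicon of frozen preposition sequences (lower-cased).
FROZEN = {("out", "of")}


def attach_prepositions(preps, noun):
    """Return edges for a sequence of prepositions introducing `noun`."""
    key = tuple(p.lower() for p in preps)
    if key in FROZEN:
        first, rest = preps[0], preps[1:]
        # Frozen combination: one case dependent, later words joined by fixed.
        return [("case", noun, first)] + [("fixed", first, p) for p in rest]
    # Free combination: every preposition is its own case dependent.
    return [("case", noun, p) for p in preps]


print(attach_prepositions(["Out", "of"], "this"))
print(attach_prepositions(["up", "beside"], "lookout"))
```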

cc: coordinating conjunction

A cc is the relation between a conjunct and an associated coordinating conjunction.

Bill is big and honest
conj(big, honest)
cc(honest, and)
We have apples , pears , oranges , and bananas .
obj(have, apples)
conj(apples, pears)
conj(apples, oranges)
conj(apples, bananas)
cc(bananas, and)
punct(pears, ,-4)
punct(oranges, ,-6)
punct(bananas, ,-8)

A coordinating conjunction may also appear at the beginning of a sentence. This is also attached as cc, even though the sentence lacks multiple conjuncts joined with a conj relation.

And then we left .
cc(left, And)
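
The attachment pattern in these examples (every later conjunct depends on the first via conj, and the conjunction depends on the conjunct it introduces) can be sketched as below. The function is illustrative and assumes at most one conjunction, placed before the last conjunct.

```python
def coordination_edges(conjuncts, conjunction=None):
    """Return conj/cc edges for a list of conjuncts in sentence order."""
    first, rest = conjuncts[0], conjuncts[1:]
    edges = [("conj", first, c) for c in rest]
    if conjunction is not None:
        # cc attaches to the conjunct that follows the conjunction; with a
        # sentence-initial conjunction that is the only "conjunct" present.
        edges.append(("cc", conjuncts[-1], conjunction))
    return edges


print(coordination_edges(["big", "honest"], "and"))
print(coordination_edges(["apples", "pears", "oranges", "bananas"], "and"))
print(coordination_edges(["left"], "And"))
```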

cc:preconj: preconjunct

A preconjunct is the relation between the head of coordination and the word that appears at the beginning of the coordination (which could be seen as the first part of a multi-word coordinating conjunction). English examples include either … or, neither … nor, both … and.

Both the boys and the girls are here
cc:preconj(boys, Both)

ccomp: clausal complement

A clausal complement of a verb or adjective is a dependent clause which is a core argument. That is, it functions like an object of the verb or adjective.

He says that you like to swim
ccomp(says, like)
mark(like, that)
He says you like to swim
ccomp(says, like)

Such clausal complements may be finite or nonfinite. However, if the subject of the clausal complement is controlled (that is, must be the same as the higher subject or object, with no other possible interpretation) the appropriate relation is xcomp.

The boss said to start digging
ccomp(said, start)
mark(start, to)
We started digging
xcomp(started, digging)

The key difference here is that, while it is possible to interpret the first sentence to mean that the boss will not be doing any digging, in the second sentence it is clear that the subject of digging can only be we. This is what distinguishes ccomp and xcomp.

Adjectives may also license ccomp:

I was afraid/ADJ that this would happen
ccomp(afraid, happen)

Reported Speech

With a speech verb like say, the content of reported speech is considered to be part of the verb’s valency. It therefore attaches as ccomp—not only when integrated within the clause as an indirect quotation (said that…), but also when set off as a direct quotation, even with inverted order:

He said that he knew the muffin man .
ccomp(said, knew)
I asked : " Do you know the muffin man ? "
ccomp(asked, know)
" Do you know the muffin man ? " I asked .
ccomp(asked, know)
" I had hoped to remain anonymous , " said the muffin man , who was tracked down Sunday at his home on Drury Lane .
ccomp(said, hoped)
nsubj(said, man)

Quoted content is considered to be ccomp even if it is a sentence fragment:

" Three/NUM muffins/NOUN , " he answered .
nummod(muffins, Three)
ccomp(answered, muffins)

If the speech verb interrupts the reported speech content, parataxis is used instead. The speech verb attaches to the root of the reported speech (all in the following example):

" Three muffins , " he answered , " are all that I need today . "
parataxis(all, answered)
nsubj(all, muffins)
Weapons of mass destruction , the report explained , are designed to target civilian populations .
parataxis(designed, explained)
nsubj:pass(designed, Weapons)
the impact that the group 's practices , law enforcement officials say , are having on the most vulnerable within the sect
acl:relcl(impact, having)
nsubj(having, practices)
parataxis(having, say)
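
Note that the direction of the dependency reverses between the two cases: with ccomp the speech verb is the head, while with parataxis the root of the reported speech is the head. A small sketch of that choice, with an illustrative function name and a boolean that the annotator is assumed to supply:

```python
def attach_reported_speech(speech_verb, content_root, verb_interrupts):
    """Return (relation, head, dependent) for a speech verb and its content."""
    if verb_interrupts:
        # The speech verb sits inside the quoted material: it becomes a
        # parataxis dependent of the content's root.
        return ("parataxis", content_root, speech_verb)
    # Otherwise the content is a core argument of the speech verb.
    return ("ccomp", speech_verb, content_root)


print(attach_reported_speech("said", "knew", verb_interrupts=False))
print(attach_reported_speech("answered", "all", verb_interrupts=True))
```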

clf: classifier

A clf (classifier) is a word which accompanies a noun in certain grammatical contexts. The most canonical use is numeral classifiers, where the word is used with a number for counting objects. A classifier generally reflects some kind of conceptual classification of nouns, based principally on features of their referents. Classifiers usually derive historically from nouns, and the words may still be used as independent nouns, but in their classifier use they have scant semantics left. In most cases, the most appropriate UPOS for classifiers will still be NOUN, though you may wish to give the words a feature indicating their special status as a classifier. (There is at present no Universal feature for classifiers, but NounType=Clf might be apt.)

The clf function is intended for languages which have highly grammaticalized systems of classifiers. The greatest density of such languages is in Asia. As well as core classifiers, there are often also other words, sometimes called “massifiers”, that are used in counting with similar behavior to classifiers. These typically include words for containers (“cup”, “box”) and units (“month”, “inch”), such as Chinese 袋 ‘bag’ in 一袋米 [one bag rice] ‘a bag of rice’. In a classifier language, it is usually most appropriate to also analyze these words as classifiers.

Most other languages also count things with units. For these languages, such as English, clf is not used; standard noun phrase relations are used instead (despite incipient grammaticalization in many cases, including English). See the examples for English at the end.

Here are some examples from Mandarin/Putonghua Chinese:

Syntactically, the classifier groups with the numeral rather than the noun and we therefore treat classifiers as functional dependents of numerals (or possessives) using the new clf relation. (This is one of Greenberg’s universals and is true in almost all cases. A couple of exceptions are noted in Aikhenvald (2000: 105) Classifiers, OUP, but it is noticeable that in those languages the putative head noun is in the genitive case.)

sān gè xuéshēng \n three clf student
nummod(xuéshēng, sān)
clf(sān, gè)

Sometimes a classifier is inserted between a demonstrative and a noun (instead of numeral and noun) [zh]:

乘坐 這 輛 巴士 \n Chéngzuò zhè liàng bāshì \n Take this CLF bus
obj(乘坐, 巴士)
det(巴士, 這)
clf(這, 輛)
obj(Chéngzuò, bāshì)
det(bāshì, zhè)
clf(zhè, liàng)
obj(Take, bus)
det(bus, this)
clf(this, CLF)

Classifier words also occur in various other constructions, and so it is important to distinguish the word in a particular language from the universal classifier function proposed in UD. We go through here some further examples with Chinese classifiers.

Sometimes no noun appears with the number and classifier. In this case, the classifier takes the role of the missing noun, and we promote the classifier to be the head. So 我 買 兩 本 “I am buying two” is regarded as “I am buying two [books-CLF]”.

我 買 兩 本 \n I buy two CLF
obj(買, 本)
nummod(本, 兩)

In some languages, including Chinese, a classifier can also appear without a number, and frequently then has some sort of determinative function. We use the relation det for such uses of a classifier. For instance, in Cantonese ‘She bought a/the book’:

佢 買 咗 本 書 \n keoi maai zo bun syu \n 3sg buy PERF CLF book
obj(買, 書)
det(書, 本)
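
The four configurations shown so far (numeral plus classifier plus noun, demonstrative plus classifier plus noun, numeral plus classifier with no noun, and a bare classifier with a noun) can be summarized in one sketch. The function and its parameters are illustrative; configurations not shown on this page are rejected rather than guessed.

```python
def classifier_edges(clf, noun=None, numeral=None, demonstrative=None):
    """Return edges for a classifier in the configurations exemplified above."""
    if noun is not None and numeral is not None:
        # three CLF student: the classifier depends on the numeral
        return [("nummod", noun, numeral), ("clf", numeral, clf)]
    if noun is not None and demonstrative is not None:
        # this CLF bus: the classifier depends on the demonstrative
        return [("det", noun, demonstrative), ("clf", demonstrative, clf)]
    if noun is None and numeral is not None:
        # two CLF, no noun: the classifier is promoted to head
        return [("nummod", clf, numeral)]
    if noun is not None:
        # CLF book, no numeral: the classifier acts as a determiner
        return [("det", noun, clf)]
    raise ValueError("configuration not covered by the examples on this page")


print(classifier_edges("gè", noun="xuéshēng", numeral="sān"))
print(classifier_edges("輛", noun="巴士", demonstrative="這"))
print(classifier_edges("本", numeral="兩"))
print(classifier_edges("本", noun="書"))
```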

For languages without highly grammaticalized classifier systems, standard nominal modification relationships are used even when things are being counted in groups (with “massifiers”). For example, in English:

three cups of rolled oats
nummod(cups, three)
case(oats, of)
amod(oats, rolled)
nmod(cups, oats)
three cups rolled oats
nummod(cups, three)
amod(oats, rolled)
nmod(cups, oats)

compound: compound

The compound relation is used to analyze compounds, that is, combinations of lexemes that morphosyntactically behave as single words. Commonly occurring cases are:

Musa bé lá èbi \n Musa came took knife \n Musa came to take the knife
nsubj(bé, Musa)
compound:svc(bé, lá)
obj(bé, èbi)

Each language that uses compound should develop its own specific criteria based on morphosyntax (rather than lexicalization or semantic idiomaticity), though elsewhere the term “compound” may be used more broadly.

See also:

English Examples

phone book
compound(book, phone)
ice cream flavors
compound(cream, ice)
compound(flavors, cream)
Sam took out a 3 million dollar loan
compound(loan, dollar)
Sam took out a $ 3 million loan
compound(loan, $)
put up
compound:prt(put, up)

Not compound

Just because an expression is lexicalized or idiomatic does not mean compound applies. In English, adjective-noun combinations, prepositional phrases, and light verb constructions are better described with other relations:

hot dog
amod(dog, hot)
the state of play
det(state, the)
nmod(state, play)
case(play, of)
make a decision
obj(make, decision)
det(decision, a)

compound:lvc: light verb construction

This subtype of compound covers light verbs. In a light-verb construction the verb does not have much semantic content. The semantics of the construction are determined by the non-head word, often a noun or adjective.

Onlar treni tercih ediyor . \n They prefer the train .
compound:lvc(ediyor, tercih)
obj(ediyor, treni)
nsubj(ediyor, Onlar)

The most common light verb is et-. However, many others are possible.

Yıllarca çile çektiler . \n They suffered for years .
compound:lvc(çektiler, çile)

Although the semantically loaded component of a light-verb construction is generally an adjective or a noun, it is common to observe verbs in this position particularly in code-switching settings.

Partiyi cancel ettik . \n We canceled the party .
compound:lvc(ettik, cancel)

compound:prt: phrasal verb particle

The phrasal verb particle relation identifies an idiomatic phrasal verb, and holds between the verb and its particle (tagged as ADP). It is a subtype of the compound relation.

They shut down the station
compound:prt(shut, down)
They shut the station down
compound:prt(shut, down)

This relation excludes literal/directional uses of prepositions/particles, such as up, down, in, out, etc. These would typically become an ADV with the relation advmod:

The house was on fire and they ran out screaming.
advmod(ran, out)

compound:redup: reduplicated compounds

This subtype of compound covers a range of reduplicated forms in Turkish. Reduplication is a common process especially for adverbs and adjectives. Except for m-reduplication (see below), the head is the last word.

The reduplication typically involves two identical words, but some morpho-phonological alternations (as in m-reduplication in example 3 below) are possible.

Koca koca adamlar oyun oynuyorlar . \n _Big (+emph)_ men are playing games .
compound:redup(koca-2, Koca-1)
Açık açık söylüyorum . \n I am telling it _clearly_
compound:redup(açık-2, Açık-1)
Araba maraba almışlar . \n They bought (a) car (and things like that)
compound:redup(Araba, maraba)
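
The head rule stated above (the last word is the head, except in m-reduplication, where the first word is) can be sketched as follows. Whether a pair is an m-reduplication is passed in explicitly here; detecting it automatically is outside the scope of this illustration, and the function name is made up.

```python
def redup_edge(first, second, m_reduplication=False):
    """Return the compound:redup edge for two adjacent reduplicated words."""
    if m_reduplication:
        head, dependent = first, second   # Araba maraba: the base word is the head
    else:
        head, dependent = second, first   # Koca koca: the last word is the head
    return ("compound:redup", head, dependent)


print(redup_edge("Koca", "koca"))
print(redup_edge("Araba", "maraba", m_reduplication=True))
```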

For lexicalized multi-word items with repetition where one or more of the words are not free lexemes (e.g. paldır küldür, ufak tefek), we use fixed.

compound:svc: serial verb compounds

The relation compound:svc is used for serial verb constructions. In this type of construction, several verbs are combined to describe the same action.

# visual-style 2 4 compound:svc	color:blue
# visual-style 4	bgColor:blue
# visual-style 4	fgColor:white
# visual-style 2	bgColor:blue
# visual-style 2	fgColor:white
1	dem	them	PRON	PRON	_	2	nsubj	_	_
2	enter	enter	VERB	VERB	_	0	root	_	_
3	bus	bus	NOUN	NOUN	_	2	obj	_	_
4	go	go	VERB	VERB	_	2	compound:svc	_	_
5	work	work	NOUN	NOUN	_	4	obj	_	_

1	they	_	_	_	_	0	_	_	_
2	enter	_	_	_	_	0	_	_	_
3	bus	_	_	_	_	0	_	_	_
4	go	_	_	_	_	0	_	_	_
5	work	_	_	_	_	0	_	_	_

1	They	_	_	_	_	0	_	_	_
2	take	_	_	_	_	0	_	_	_
3	the	_	_	_	_	0	_	_	_
4	bus	_	_	_	_	0	_	_	_
5	to	_	_	_	_	0	_	_	_
6	work	_	_	_	_	0	_	_	_

The verbs in a serial verb construction share the same subject but not necessarily the same object.

# visual-style 4 7 compound:svc	color:blue
# visual-style 4	bgColor:blue
# visual-style 4	fgColor:white
# visual-style 7	bgColor:blue
# visual-style 7	fgColor:white
# visual-style 13 15 compound:svc	color:blue
# visual-style 13	bgColor:blue
# visual-style 13	fgColor:white
# visual-style 15	bgColor:blue
# visual-style 15	fgColor:white
1	so	so	ADV	SCONJ	_	4	advmod	_	_
2	we	we	PRON	PRON	_	4	nsubj	_	_
3	don	don	AUX	AUX	_	4	aux	_	_
4	carry	carry	VERB	VERB	_	0	root	_	_
5	di	the	DET	DET	_	6	det	_	_
6	matter	matter	NOUN	NOUN	_	4	obj	_	_
7	come	come	VERB	VERB	_	4	compound:svc	_	_
8	again	again	ADV	ADV	_	7	advmod	_	_
9	as	as	SCONJ	ADP	_	13	mark	_	_
10	we	we	PRON	PRON	_	13	nsubj	_	_
11	dey	be	AUX	AUX	_	13	aux	_	_
12	always	always	ADV	ADV	_	13	advmod	_	_
13	carry	carry	VERB	VERB	_	7	advcl	_	_
14	am	he	PRON	PRON	_	13	obj	_	_
15	come	come	VERB	VERB	_	13	compound:svc	_	_

1	so	_	_	_	_	0	_	_	_
2	we	_	_	_	_	0	_	_	_
3	have	_	_	_	_	0	_	_	_
4	carry	_	_	_	_	0	_	_	_
5	the	_	_	_	_	0	_	_	_
6	matter	_	_	_	_	0	_	_	_
7	come	_	_	_	_	0	_	_	_
8	again	_	_	_	_	0	_	_	_
9	as	_	_	_	_	0	_	_	_
10	we	_	_	_	_	0	_	_	_
11	be	_	_	_	_	0	_	_	_
12	always	_	_	_	_	0	_	_	_
13	carry	_	_	_	_	0	_	_	_
14	it	_	_	_	_	0	_	_	_
15	come	_	_	_	_	0	_	_	_

1	so	_	_	_	_	0	_	_	_
2	we	_	_	_	_	0	_	_	_
3	have	_	_	_	_	0	_	_	_
4	brought	_	_	_	_	0	_	_	_
5	the	_	_	_	_	0	_	_	_
6	issue	_	_	_	_	0	_	_	_
7	again	_	_	_	_	0	_	_	_
8	as	_	_	_	_	0	_	_	_
9	we	_	_	_	_	0	_	_	_
10	always	_	_	_	_	0	_	_	_
11	do	_	_	_	_	0	_	_	_

An adjective may be used in place of a verb in a serial verb construction.

# visual-style 3 4 compound:svc	color:blue
# visual-style 4	bgColor:blue
# visual-style 4	fgColor:white
# visual-style 3	bgColor:blue
# visual-style 3	fgColor:white
1	di	the	DET	DET	_	2	det	_	_
2	guy	guy	NOUN	NOUN	_	3	nsubj	_	_
3	fine	fine	ADJ	ADJ	_	0	root	_	_
4	reach	arrive	VERB	VERB	_	3	compound:svc	_	_
5	me	I	PRON	PRON	_	4	obj	_	_

1	the	_	_	_	_	0	_	_	_
2	guy	_	_	_	_	0	_	_	_
3	fine	_	_	_	_	0	_	_	_
4	reach	_	_	_	_	0	_	_	_
5	me	_	_	_	_	0	_	_	_

1	Is	_	_	_	_	0	_	_	_
2	the	_	_	_	_	0	_	_	_
3	guy	_	_	_	_	0	_	_	_
4	as	_	_	_	_	0	_	_	_
5	handsome	_	_	_	_	0	_	_	_
6	as	_	_	_	_	0	_	_	_
7	I	_	_	_	_	0	_	_	_
8	am	_	_	_	_	0	_	_	_

Comparatives

In Naija, serial verb constructions are also used for comparatives. In these constructions, the adjective used to draw the comparison is followed by the verb pass.

# visual-style 2 3 compound:svc	color:blue
# visual-style 3	bgColor:blue
# visual-style 3	fgColor:white
# visual-style 2	bgColor:blue
# visual-style 2	fgColor:white
1	farmer	farmer	NOUN	NOUN	_	2	nsubj	_	_
2	happy	happy	ADJ	ADJ	_	0	root	_	_
3	pass	pass	VERB	VERB	_	2	compound:svc	_	_
4	when	when	ADV	ADV	_	6	mark	_	_
5	rain	rain	NOUN	NOUN	_	6	nsubj	_	_
6	fall	fall	VERB	VERB	_	2	advcl	_	_
7	like	like	ADP	ADP	_	8	case	_	_
8	dis	this	DET	DET	_	6	obl	_	_

1	farmers	_	_	_	_	0	_	_	_
2	happy	_	_	_	_	0	_	_	_
3	exceed	_	_	_	_	0	_	_	_
4	when	_	_	_	_	0	_	_	_
5	rain	_	_	_	_	0	_	_	_
6	fall	_	_	_	_	0	_	_	_
7	like	_	_	_	_	0	_	_	_
8	this	_	_	_	_	0	_	_	_

1	Farmers	_	_	_	_	0	_	_	_
2	become	_	_	_	_	0	_	_	_
3	happier	_	_	_	_	0	_	_	_
4	when	_	_	_	_	0	_	_	_
5	rain	_	_	_	_	0	_	_	_
6	falls	_	_	_	_	0	_	_	_
7	like	_	_	_	_	0	_	_	_
8	this	_	_	_	_	0	_	_	_

conj: conjunct

A conjunct is the relation between two elements connected by a coordinating conjunction, such as and, or, etc. Coordinate structures are in principle symmetrical, but the first conjunct is by convention treated as the parent (or “technical head”) of all subsequent conjuncts via the conj relation.

Bill is big and honest
conj(big, honest)
We have apples , pears , oranges , and bananas .
obj(have, apples)
conj(apples, pears)
conj(apples, oranges)
conj(apples, bananas)
cc(bananas, and)
punct(pears, ,-4)
punct(oranges, ,-6)
punct(bananas, ,-8)

Coordinated clauses are treated the same way as coordination of other constituent types:

He came home , took a shower and immediately went to bed .
conj(came, took)
conj(came, went)
punct(took, ,-4)
cc(went, and)

Coordination may be asyndetic, which means that the coordinating conjunction is omitted. Commas or other punctuation symbols will delimit the conjuncts in the typical case. Asyndetic coordination may be more frequent in some languages, while in others, conjunction will appear between every two conjuncts (John and Mary and Bill).

Veni , vidi , vici .
conj(Veni, vidi)
conj(Veni, vici)
punct(vidi, ,-2)
punct(vici, ,-4)

Shared Dependents and Effective Parents in Coordination

Note that the current basic annotation scheme cannot distinguish between a dependent of the first conjunct and a shared dependent of the whole coordination:

He met her at the station and kissed her .
conj(met, kissed)
nsubj(met, He)

vs.

He met her at the station and she kissed him .
conj(met, kissed)
nsubj(met, He)
nsubj(kissed, she)

In contrast, the additional dependencies in the enhanced representation can be used to encode the fact that in the first case, he is also subject of kissed:

He met her at the station and kissed her .
conj(met, kissed)
nsubj(met, He)
nsubj(kissed, He)

Furthermore, the enhanced representation can also capture the relation of each conjunct to the parent of the coordination. Nevertheless, the effective parents can be found algorithmically and showing them explicitly is for convenience only, while the information about shared dependents is otherwise not available.

I saw that he met her at the station and kissed her .
conj(met, kissed)
nsubj(met, he)
nsubj(kissed, he)
ccomp(saw, met)
ccomp(saw, kissed)
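
The claim that effective parents can be found algorithmically can be illustrated with a minimal sketch. It assumes a basic tree stored as two dictionaries keyed by token id (the names `heads`, `deprels` and `effective_parents` are hypothetical, not part of any UD tool) and simply climbs conj links until it reaches the first conjunct, whose own head and relation are then reported. It is not a full enhanced-graph converter and ignores nested coordination.

```python
def effective_parents(heads, deprels):
    """Map each token id to the (head, deprel) of the first conjunct it belongs to.

    heads:   dict token id -> head id in the basic tree (0 for the root)
    deprels: dict token id -> basic dependency relation
    """
    result = {}
    for tok in heads:
        node, seen = tok, set()
        # Climb conj links; `seen` guards against malformed cyclic input.
        while deprels[node] == "conj" and node not in seen:
            seen.add(node)
            node = heads[node]
        result[tok] = (heads[node], deprels[node])
    return result


# "I saw that he met her at the station and kissed her ."
# Only the tokens named in the example above: saw-2, he-4, met-5, kissed-11.
heads = {2: 0, 4: 5, 5: 2, 11: 5}
deprels = {2: "root", 4: "nsubj", 5: "ccomp", 11: "conj"}
parents = effective_parents(heads, deprels)
print(parents[11])  # effective parent of "kissed"
```

Under these assumptions, kissed inherits the ccomp attachment to saw from met, which matches the enhanced relation ccomp(saw, kissed) shown above. The shared subject he cannot be recovered this way, which is the asymmetry the text describes.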

If a dependent is shared among conjuncts, the basic representation always links it to the first conjunct (coordination head), while the enhanced representation shows all dependencies. In the following example, relations that are only part of the enhanced representation are shown in red.

# visual-style 6 1 amod color:red
# visual-style 4 3 amod color:red
# visual-style 6 3 amod color:red
1 American   _ _ _ _ 4 amod 6:amod        _
2 and        _ _ _ _ 3 cc   _             _
3 British    _ _ _ _ 1 conj 4:amod|6:amod _
4 professors _ _ _ _ 0 root _             _
5 and        _ _ _ _ 6 cc   _             _
6 students   _ _ _ _ 4 conj 0:root        _
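
As a rough illustration of how the enhanced column in the example above can be read, the following sketch splits the ninth column into head:relation pairs and lists the relations that differ from the basic head and relation. The helper names are hypothetical, and the code assumes whitespace-separated columns and integer head ids (so empty nodes with ids such as 8.1 are not handled).

```python
def parse_deps(field):
    """Split a DEPS value such as '4:amod|6:amod' into (head, relation) pairs."""
    if field == "_":
        return []
    pairs = []
    for item in field.split("|"):
        # Split on the first colon only, so subtypes like nmod:poss stay intact.
        head, _, rel = item.partition(":")
        pairs.append((int(head), rel))
    return pairs


def enhanced_only(rows):
    """Return (head, dependent, relation) triples present only in DEPS."""
    extra = []
    for row in rows:
        cols = row.split()
        tok, head, rel = int(cols[0]), int(cols[6]), cols[7]
        for h, r in parse_deps(cols[8]):
            if (h, r) != (head, rel):
                extra.append((h, tok, r))
    return extra


rows = [
    "1 American   _ _ _ _ 4 amod 6:amod        _",
    "2 and        _ _ _ _ 3 cc   _             _",
    "3 British    _ _ _ _ 1 conj 4:amod|6:amod _",
    "4 professors _ _ _ _ 0 root _             _",
    "5 and        _ _ _ _ 6 cc   _             _",
    "6 students   _ _ _ _ 4 conj 0:root        _",
]
extra = enhanced_only(rows)
```

On this input the three amod triples correspond to the relations marked in red; the 0:root entry on students is also reported because it differs from the basic conj attachment.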

Nested Coordination

Note further that the basic annotation scheme has only a limited capability to capture nested coordination such as apples and pears or oranges and lemons. Consider the three possible bracketings of a coordination of A, B and C: (A, B, C), ((A, B), C) and (A, (B, C)).

The first two cases, i.e., (A, B, C) and ((A, B), C), lead to the same tree:

A B C
conj(A, B)
conj(A, C)

Only the right-nesting case (A, (B, C)) can be distinguished because its tree is different:

A B C
conj(B, C)
conj(A, B)
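
The ambiguity described above can be checked with a small sketch. It assumes that a coordination is written as a nested Python tuple and that every group is headed by its first (leftmost) word; the function names are hypothetical.

```python
def first_leaf(node):
    """Return the leftmost word of a (possibly nested) coordination."""
    while isinstance(node, tuple):
        node = node[0]
    return node


def conj_edges(node):
    """Return the set of (head, dependent) conj edges for a nested coordination."""
    if not isinstance(node, tuple):
        return set()
    edges = set()
    for child in node:
        edges |= conj_edges(child)  # edges inside nested groups
    head = first_leaf(node[0])
    for child in node[1:]:
        # Each later conjunct attaches to the first conjunct of this group.
        edges.add((head, first_leaf(child)))
    return edges


flat = conj_edges(("A", "B", "C"))
left = conj_edges((("A", "B"), "C"))
right = conj_edges(("A", ("B", "C")))
print(flat == left, flat == right)
```

Under these assumptions the flat and left-nested structures produce identical edge sets, while the right-nested one differs, mirroring the two trees shown above.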

Etc.

The item etc., used as a set-expander—especially in coordinations after at least two other items, and typically not preceded by a conjunction (though and etc. is attested in English)—is treated as a NOUN and final conjunct. Its distribution is, however, atypical of nouns in that it is restricted to enumeration contexts, does not permit modification except by reduplication, and may be post-coordinated with things that are not nominals. Note that this guideline applies to English and other languages that borrowed the string etc. from Latin. The situation may be different in languages that have their own equivalent of etc. For example, German usw. (und so weiter) and Czech atd. (a tak dále), both meaning literally “and so further”, are ADV rather than NOUN, because their main element is an adverb; yet they are still attached as conj to the head of the preceding list or coordination.

We have apples/NOUN , pears/NOUN , etc./NOUN
nsubj(have, We)
obj(have, apples)
conj(apples, pears)
conj(apples, etc.)
punct(pears, ,-4)
punct(etc., ,-6)
nur ein paar Minuten Fußmarsch zu Fisherman/PROPN 's Wharf , Lombard/PROPN Street , usw/ADV ...
advmod(Minuten, nur)
det(paar, ein)
det(Minuten, paar)
nmod(Minuten, Fußmarsch)
case(Fisherman, zu)
flat(Fisherman, 's)
flat(Fisherman, Wharf)
conj(Fisherman, Lombard)
punct(Lombard, ,-10)
flat(Lombard, Street)
conj(Fisherman, usw)
punct(usw, ,-13)
punct(Minuten, ...)
People were running/VERB , jumping/VERB , dancing/VERB , etc./NOUN all around us .
nsubj(running, People)
aux(running, were)
conj(running, jumping)
conj(running, dancing)
conj(running, etc.)
punct(jumping, ,-4)
punct(dancing, ,-6)
punct(etc., ,-8)
obl(running, us)
case(us, around)
punct(running, .)
They gave Amy an apple , Bob a banana , Carl a carrot , etc./NOUN
nsubj(gave, They)
iobj(gave, Amy)
obj(gave, apple)
conj(gave, banana)
conj(gave, carrot)
conj(gave, etc.)
orphan(banana, Bob)
orphan(carrot, Carl)
det(apple, an)
det(banana, a-8)
det(carrot, a-12)
punct(banana, ,-6)
punct(carrot, ,-10)
It is commonplace to buy flowers etc./NOUN for Valentine 's Day .
conj(flowers, etc.)

cop: copula

A cop (copula) is the relation of a function word used to link a subject to a nonverbal predicate, including the expression of identity predication (e.g. sentences like “Kim is the President”). It is often a verb but nonverbal (pronominal) copulas are also frequent in the world’s languages. Verbal copulas are tagged AUX, not VERB. Pronominal copulas are tagged PRON or DET.

The cop relation should only be used for pure copulas that add at most TAME categories to the meaning of the predicate, which means that most languages have at most one copula, and only when the nonverbal predicate is treated as the head of the clause.

As a concrete example, in many European languages the equivalent of the English verb to be is the only word that can appear with the cop relation. In Spanish and related languages, both ser and estar can be copulas. In Czech and related languages, both být and bývat are copulas (because they are morphological variants of the same lexeme, and the reason they have two lemmas is that aspect-related morphology is treated as derivational in these languages). In contrast, the equivalents of to become are not copulas despite the fact that traditional grammar may label them as such. Existential to be can be copula only if it is the same verb as in equivalence clauses (John is a teacher). If a language uses two different verbs, then the existential one is not a copula. Some more discussion of the topic is archived here.

Bill is honest
nsubj(honest, Bill)
cop(honest, is)
Ivan is the best dancer
nsubj(dancer-5, Ivan-1)
cop(dancer-5, is-2)
det(dancer-5, the-3)
amod(dancer-5, best-4)
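
The tagging constraint stated above (verbal copulas are AUX; pronominal copulas are PRON or DET) lends itself to a simple consistency check. The sketch below is only an illustration: the token dictionaries and the function name are assumptions, not the format or API of any UD validation tool.

```python
ALLOWED_COP_UPOS = {"AUX", "PRON", "DET"}


def cop_tag_violations(tokens):
    """Return the forms of tokens attached as cop whose UPOS is not allowed."""
    return [
        tok["form"]
        for tok in tokens
        # Compare the base relation so that subtypes such as cop:xyz are covered.
        if tok["deprel"].split(":")[0] == "cop"
        and tok["upos"] not in ALLOWED_COP_UPOS
    ]


# "Bill is honest", tagged once correctly and once with the copula as VERB.
good = [
    {"form": "Bill", "upos": "PROPN", "deprel": "nsubj"},
    {"form": "is", "upos": "AUX", "deprel": "cop"},
    {"form": "honest", "upos": "ADJ", "deprel": "root"},
]
bad = [dict(tok, upos="VERB") if tok["form"] == "is" else tok for tok in good]
print(cop_tag_violations(good), cop_tag_violations(bad))
```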

The copula be is not treated as the head of a clause, but rather the nonverbal predicate, as exemplified above.

Such an analysis is motivated by the fact that many languages often or always lack an overt copula in such constructions, as in the following Russian and Hebrew examples:

Ivan lučšij tancor \n Ivan best dancer
nsubj(tancor, Ivan)
amod(tancor, lučšij)
ani Kim \n I am Kim
nsubj(Kim-2, ani-1)

In informal English, this may also arise.

Email usually free if you have Wifi.
nsubj(free, Email)

This analysis is adopted also when the predicate is a prepositional phrase, provided that the same copula (or absence thereof) is used here, in which case the nominal part of the prepositional phrase is the head of the clause.

Sue is in shape
nsubj(shape, Sue)
cop(shape, is)
case(shape, in)

If the copula is accompanied by other verbal auxiliaries for tense, aspect, etc., then they are also given a flat structure, and taken as dependents of the lexical predicate:

Sue has been helpful
nsubj(helpful, Sue)
cop(helpful, been)
aux(helpful, has)

The motivation for this choice is that this structure is parallel to the flat structure which we give to auxiliary verbs accompanying verbs. In particular, in languages such as English, it is often very difficult to decide whether to regard a participle as a verb or an adjective. Perhaps the following sentence is such a case:

The presence of troops will be destabilizing .
nsubj(destabilizing, presence)
cop/aux(destabilizing, be)
aux(destabilizing, will)

While a part of speech (and associated deprel: cop vs. aux) has to be decided in such cases, it would be unfortunate if the choice of part of speech also changed the dependency structure. Note, however, that the exact distribution of the copula construction is subject to language-specific variation.

Finally, the cop may mark a predicate clause, i.e., a full clause serving as the predicate within an outer copular clause. In such cases, nsubj:outer or csubj:outer can be used to distinguish the outer subject:

-ROOT- The problem is that this has never been tried .
nsubj:outer(tried, problem)
cop(tried, is)
mark(tried, that)
nsubj:pass(tried, this)
aux(tried, has)
advmod(tried, never)
aux:pass(tried, been)
root(-ROOT-, tried)
The important thing is to keep calm .
nsubj:outer(keep, thing)
cop(keep, is)
mark(keep, to)
xcomp(keep, calm)

csubj: clausal subject

A clausal subject is a clausal syntactic subject of a clause, i.e., the subject is itself a clause. The governor of this relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb. The dependent is the main lexical verb or other predicate of the subject clause. In the following examples, what she said (that is, said) is the clausal subject of makes and interesting, respectively.

New from v2: The csubj relation is also used for the clausal subject of a passive verb or verb group. For languages that have a grammaticalized passive transformation, it is strongly recommended to use the subtype csubj:pass in such cases. If the subject is of a copular clause whose predicate is itself a clause, csubj:outer may be used.

What she said makes sense
csubj(makes, said)
What she said is interesting
csubj(interesting, said)
What she said was well received
csubj:pass(received, said)

See also expletive subject examples under expl that use csubj.

csubj:outer: outer clause clausal subject

This relation specifies a clausal subject of a copular clause whose predicate is itself a clause, to signal that it is not the subject of the nested clause. See discussion of Predicate Clauses.

-ROOT- To hike in the mountains is to experience the best of nature .
root(-ROOT-, experience)
csubj:outer(experience, hike)
obl(hike, mountains)
mark(hike, To)
cop(experience, is)
mark(experience, to)
obj(experience, best)
For us to not attempt to solve the problem is for us to acknowledge defeat .
mark(attempt, For)
nsubj(attempt, us-2)
mark(attempt, to-3)
xcomp(attempt, solve)
csubj:outer(acknowledge, attempt)
cop(acknowledge, is)
mark(acknowledge, for)
nsubj(acknowledge, us-12)
obj(acknowledge, defeat)

The nominal counterpart of this relation is nsubj:outer.

The :outer subtype is not intended for most clausal subjects of copular clauses—only those where the predicate is itself a clause. Plain csubj (or another subtype) will be appropriate if the copular clause predicate is a nominal, adjective, etc.:

It is very important that your students respect you .
expl(important, It)
csubj(important, respect)

csubj:pass: clausal passive subject

A clausal passive subject is a clausal syntactic subject of a passive clause.

Bylo mi doporučeno , abych to velmi dobře zvážil . \n It-has-been to-me recommended , that-I it very well weigh .
csubj:pass(doporučeno, zvážil)
csubj:pass(recommended, weigh)

Reflexive passive (the meaning is “You are not expected to come before nine o’clock.”)

Nepředpokládá se , že přijdete před devátou . \n It-does-not-expect itself , that you-will-come before nine .
csubj:pass(Nepředpokládá, přijdete)
csubj:pass(It-does-not-expect, you-will-come)

dep: unspecified dependency

A dependency can be labeled as dep when it is impossible to determine a more precise relation. This may be because of a weird grammatical construction, or a limitation in conversion or parsing software. The use of dep should be avoided as much as possible.

my dad does nt really not that good
nmod(dad, my)
nsubj(does, dad)
advmod(does, nt)
advmod(does, really)
dep(does, good)
advmod(good, not)
advmod(good, that)

det: determiner

The relation determiner (det) holds between a nominal head and its determiner. Most commonly, a word of POS DET will have the relation det and vice versa. The known exceptions at present are:

The man is here
det(man, The)
Which book do you prefer ?
det(book, Which)

det:numgov: pronominal quantifier governing the case of the noun

Pronominal quantifiers in Slavic languages are labeled det:numgov instead of det because they normally do not agree with the quantified noun in case (unlike non-quantifying determiners).

The quantifier requires the counted noun to be in its genitive form. The whole phrase (quantifier + noun) is treated as a singular neuter noun phrase and it can fill roles where nominative, accusative or vocative noun phrases are expected.

To increase parallelism across languages (and also across morphological cases within one language), the quantifier is not annotated as the head of the nominal. However, the det:numgov label is used to preserve the information about case conditions.

Czech:

Kolik mužů hrálo karty ? \n How-many men played cards ?
det:numgov(mužů, Kolik)
nsubj(hrálo, mužů)
obj(hrálo, karty)
punct(hrálo, ?-5)
det:numgov(men, How-many)
nsubj(played, men)
obj(played, cards)
punct(played, ?-11)

See also nummod:gov and det:nummod.

det:nummod: pronominal quantifier agreeing in case with the noun

Pronominal quantifiers in Slavic languages are labeled det:nummod or det:numgov instead of det because they normally do not agree with the quantified noun in case (unlike non-quantifying determiners). They do agree only if the whole phrase (quantifier + noun) fills a role where genitive, dative, locative or instrumental noun phrases are expected. In these situations they are labeled det:nummod.

Czech:

Nepamatuji si , s kolika muži jsem hrál karty . \n I-do-not-remember myself , with how-many men I-have played cards .
ccomp(Nepamatuji, hrál)
expl:pv(Nepamatuji, si)
punct(hrál, ,-3)
aux(hrál, jsem)
obj(hrál, karty)
iobj(hrál, muži)
case(muži, s)
det:nummod(muži, kolika)
punct(Nepamatuji, .-10)
ccomp(I-do-not-remember, played)
expl:pv(I-do-not-remember, myself)
punct(played, ,-14)
aux(played, I-have)
obj(played, cards)
iobj(played, men)
case(men, with)
det:nummod(men, how-many)
punct(I-do-not-remember, .-21)

See also nummod:gov and det:numgov.

det:poss: possessive determiner

Whenever there is a possessive determiner, det:poss should be used instead of det. All possessive determiners have the feature Possessive defined as Yes and the only instances of the det:poss relation attested in the Italian Treebank appear with those elements.

Sarà mia cura verificare . 
det:poss(cura, mia)
Ha da poco annunciato le proprie dimissioni . 
det:poss(dimissioni, proprie)

discourse: discourse element

This is used for interjections and other discourse particles and elements (which are not clearly linked to the structure of the sentence, except in an expressive way). We generally follow the guidelines of what the Penn Treebanks count as an INTJ. They define this to include: interjections (oh, uh-huh, Welcome), fillers (um, ah), and non-adverbial discourse markers (well, like, but not you know or actually).

These discourse elements are attached to the head of the most relevant nearby clause, which is why they are grouped with non-core clausal dependents even though they are normally not dependents of the predicates as such.

Iguazu is in Argentina :)
discourse(Argentina-4, :)-5)

dislocated: dislocated elements

The dislocated relation is used for fronted or postposed elements that do not fulfill the usual core grammatical relations of a sentence. These elements often appear to be in the periphery of the sentence, and may be separated off with a comma intonation.

It is used for fronted elements that introduce the topic of a sentence, as in the following Japanese and Greek examples. The dislocated element attaches to the head of the clause to which it belongs:

象 は 鼻 が 長い \n zoo wa hana ga naga-i \n elephant TOPIC nose SUBJ long-PRES
dislocated(長い-5, 象-1)
to jani ton kserume poli kala \n the John-Acc him know-1pl very well 
dislocated(kserume, jani)

However, it would not be used for a topic-marked noun that is also the subject of the sentence; this would be an nsubj.

It is also used for postposed elements. The dislocated elements attach to the same governor as the dependent that they double for. Right dislocated elements are frequent in spoken languages. French and Greek examples follow.

Il faut pas la manger , la plasticine \n It must not it eat , the playdough
obj(manger, la-4)
dislocated(manger, plasticine)
obj(eat, it-13)
dislocated(eat, playdough)
ton kserume oli mas edho poli kala , to jani
dislocated(kserume, jani)

expl: expletive

This relation captures expletive or pleonastic nominals. These are nominals that appear in an argument position of a predicate but which do not themselves satisfy any of the semantic roles of the predicate. The main predicate of the clause (the verb or predicate adjective or noun) is the governor. In English, this is the case for some uses of it and there: the existential there, and it when used in extraposition constructions. (Note that both it and there also have non-expletive uses.)

There is a ghost in the room
expl(is, There)
It is clear that we should decline .
expl(clear, It)

Some languages do not have expletives of the English sort, including most languages with free pro-drop (the ability to use zero anaphora rather than overt pronouns). In languages with expletives of this sort, they can be positioned where normally a core argument appears: the subject and direct object (and even indirect object) slots, as in the examples below. Note that in the analysis of these examples, we treat the postposed subject or clausal argument as a regular core argument, and mark the expletive with expl.

There is a ghost in the room
expl(is, There)
nsubj(is, ghost)
obl(is, room)
I believe there to be a ghost in the room
nsubj(believe, I)
expl(believe, there)
xcomp(believe, be)
nsubj(be, ghost)
obl(be, room)
It is clear that we should decline .
expl(clear, It)
csubj(clear, decline)
That we should decline is clear .
csubj(clear, decline)
I mentioned it to Mary that Sue is leaving
nsubj(mentioned, I)
expl(mentioned, it)
obl(mentioned, Mary)
ccomp(mentioned, leaving)

A second, related, use of the expl relation is for cases of true clitic doubling. For languages in which clitics and lexical nominals are usually in complementary distribution – languages, such as French, which obey “Kayne’s generalization” – then whichever of a clitic or a lexical nominal occurs will get the appropriate role, such as obj or iobj. In such languages, when doubling does occur, such as in spoken French, the right analysis is to regard the lexical nominal as dislocated (see the examples there). As such, the analysis will be the same as when a noun phrase doubles another noun phrase or a regular pronoun that fills a nominal argument position. However, other languages, such as Greek and Bulgarian, standardly allow doubling of a lexical nominal and a pronominal clitic, with the former still appearing in its regular role as an argument of the predicate. In these cases, if only one of the lexical nominal and the clitic appear in a clause, then whichever appears will be given the grammatical role of obj, iobj, etc. – parallel to the treatment of lexical nominals and pronouns in other languages, modulo the clitic pronoun having a different position in the sentence. However, if both occur, the lexical nominal will be given the grammatical role of obj, iobj, etc., and the clitic will be treated as a pronominal copy, which does not receive its own semantic role, and hence will get the role expl. Modulo the different word order, this is fairly parallel to the treatment of it and there in English mentioned above, where another phrase satisfies the semantic role of the predicate. Examples from Greek and Bulgarian follow:

Της τον έδωσε της Καίτης τον αναπτήρα \n PRON.Fem.Gen PRON.Masc.Acc gave ART.Fem.Gen Keti.Gen ART.Masc.Acc lighter.Acc
expl(έδωσε, Της-1)
iobj(έδωσε, Καίτης)
det(Καίτης, της-4)
expl(έδωσε, τον-2)
obj(έδωσε, αναπτήρα)
det(αναπτήρα, τον-6)
Marija mu izprati pismo na rabotnika \n Maria 3.S.M.IO sent letter to the.worker
expl(izprati, mu)
obj(izprati, pismo)
iobj(izprati, rabotnika)
case(rabotnika, na)

Reflexives

The expletive relation is also used for reflexive pronouns (see the feature u-feat/Reflex) attached to inherently reflexive verbs, i.e. verbs that cannot occur without the reflexive pronoun and thus the pronoun does not play the role of a normal object (otherwise it would be possible to substitute it with an irreflexive pronoun or other nominal).

UD recognizes several functions of reflexive pronouns (clitics) that are usually distinguished with the help of subtypes of the expl relation (see also the report from the 2015 Uppsala discussion of clitics where this approach was approved):

A Czech example:

Martin se bojí zvířat . \n Martin REFLEX fears animals .
expl:pv(bojí, se)
expl:pv(fears, REFLEX)

Further general discussion of expletives can be found in Postal, P. M., and G. K. Pullum (1988) “Expletive Noun Phrases in Subcategorized Positions,” Linguistic Inquiry 19(4): 635–670. The status of clitic doubling, and arguments for the lexical nominal being an argument with the clitic a kind of pronominal copy, appear inter alia in Boris Harizanov (2014) Clitic doubling at the syntax-morphology interface: A-movement and morphological merger in Bulgarian. Natural Language and Linguistic Theory.

expl:impers: impersonal expletive

The relation expl:impers is a sub-class of expl, specific for the impersonal use of the clitic pronoun si. We can have an impersonal construction for every verb (transitive or intransitive) when the role of subject is played by the clitic itself, as an undefined subject.

Si prevede che viaggerà .
expl:impers(prevede, Si)

If there is a clitic in a construction with a modal or an auxiliary verb, then it is generally an impersonal construction.

Si può procedere a sequestro .
expl:impers(procedere, Si)

In the construction with both ci and si (construction of the impersonal ci), the first clitic is marked as expl, while si as expl:impers, as follows.

E' stata quello che ci si attendeva .
expl:impers(attendeva, si)
expl(attendeva, ci)

expl:pass: reflexive pronoun used in reflexive passive

Reflexive pronouns (see the feature cs-feat/Reflex) are used in various constructions in Czech, including so-called reflexive passive. In PDT, their relation to the verb is labeled AuxR. The corresponding label in Czech UD is called expl:pass (since UD 2.0; in previous versions it was labeled auxpass:reflex).

To se řekne snadno . \n It is said easily .
expl:pass(řekne, se)
expl:pass(said, is)

expl:pv: reflexive clitic with an inherently reflexive verb

Reflexive pronouns (see the feature cs-feat/Reflex) usually replace objects of verbs. However, some verbs are inherently reflexive, i.e. the verb always occurs with a reflexive pronoun, and the pronoun cannot be replaced by a non-reflexive pronoun.

With these verbs, the reflexive pronoun is attached as expl:pv instead of obj. (Note that the expl relation is first used for this purpose in the UD release 1.2, and it is further subtyped as expl:pv since UD 2.0, to increase parallelism with other languages. In the previous releases this usage of reflexive se/si was labeled compound:reflex.)

Martin se bojí zvířat . \n Martin REFLEX fears animals .
expl:pv(bojí, se)
expl:pv(fears, REFLEX)

fixed: fixed multiword expression

The fixed relation is used for certain fixed grammaticized expressions. Such expressions tend to behave like function words. For example, in spite of is a fixed expression functioning as a preposition in English; bien que (‘although’, lit. ‘well that’) functions as a subordinating conjunction in French; and vare sig (‘either’, lit. ‘be itself’) functions as a (pre)conjunction in Swedish. The scope of fixed MWEs corresponds roughly to the fixed expressions category of Sag et al. and should not be used for multiword expressions that are morphosyntactically flexible.

Criteria

Fixed expressions typically do not allow intervening words, except in a few special cases such as clitics that go in a fixed position in the clause and can interrupt even fixed expressions. In addition, there may be inherently discontiguous fixed expressions, such as för … sedan in Swedish, corresponding to the English ago, which is syntactically irregular and always encloses a temporal expression, as in för 10 år sedan [“10 years ago”].

The creation of fixed multiword expressions is the end phase of a process of grammaticalization and there are always going to be cases of multiword expressions that are only somewhat grammaticalized. For practical treebanking, it is recommended to restrict this relation to the most grammaticalized cases and to treat them as a closed class by writing language-specific documentation listing the fixed expressions of the language.

Structure

Fixed MWEs are annotated in a flat structure, where all subsequent words in the expression are attached to the first one using the fixed label. The assumption is that these expressions do not have any internal syntactic structure (except from a historical perspective) and that the structural annotation is in principle arbitrary. In practice, however, it is highly desirable to use a consistent annotation of all fixed MWEs in all languages.

Fixed MWEs should not have any internal modification. Therefore, if a word attaches as fixed, it should not have any dependents (except perhaps punct, goeswith, and reparandum dependents, as these are not true syntactic relations).

I like dogs as well as cats
fixed(as-4, well-5)
fixed(as-4, as-6)
He cried because of you
fixed(because, of)
Je préfère prendre un dessert plutôt qu' une entrée \n I prefer getting a dessert rather than an appetizer
fixed(plutôt, qu')
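
The structural constraints described above (later words attach to the first word, and a word attached as fixed has no dependents other than punct, goeswith, and reparandum) can be expressed as a small check. This is a sketch under stated assumptions: tokens are given as (id, form, head, deprel) tuples, and the function name and messages are hypothetical rather than taken from the official UD validator.

```python
ALLOWED_UNDER_FIXED = {"punct", "goeswith", "reparandum"}


def fixed_violations(tokens):
    """Check (id, form, head, deprel) tuples against the fixed guidelines."""
    by_id = {tok[0]: tok for tok in tokens}
    problems = []
    for tid, form, head, rel in tokens:
        base = rel.split(":")[0]  # ignore language-specific subtypes
        parent = by_id.get(head)
        if (parent is not None
                and parent[3].split(":")[0] == "fixed"
                and base not in ALLOWED_UNDER_FIXED):
            problems.append((form, "dependent of a fixed word"))
        if base == "fixed" and head > tid:
            problems.append((form, "fixed word precedes its head"))
    return problems


# "He cried because of you", with the fixed relation from the example above.
ok = [
    (1, "He", 2, "nsubj"),
    (2, "cried", 0, "root"),
    (3, "because", 5, "case"),
    (4, "of", 3, "fixed"),
    (5, "you", 2, "obl"),
]
# A deliberately wrong variant: "you" attached under the fixed word "of".
wrong = ok[:4] + [(5, "you", 4, "nmod")]
print(fixed_violations(ok), fixed_violations(wrong))
```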

New from v2: The fixed relation replaces the old mwe relation to prevent misunderstanding regarding its scope. For v2.14, this page has been revised to more clearly articulate the relationship to multiword expressions.

flat: flat expression

The flat relation is used to combine the elements of an expression where none of the immediate components can be identified as the sole head using standard substitution tests. This includes both cases where more than one component passes the head test – as in the name John Smith, where either John or Smith can replace the whole in most contexts – and cases where no component does – as in San Francisco (in English). Note also that the flat relation is appropriate in such cases only when no more specific relation applies. For example, in coordination structures annotated with the conj relation, any of the conjuncts can usually replace the whole.

Flat expressions are annotated with a flat structure, where all subsequent components in the expression are attached to the first one using the flat label. The assumption is that in these expressions, the flat relations are not syntactic head-modifier relations, and that the structural annotation is in principle arbitrary. The components of a flat expression may have their own dependents, including nested flat structures. For example, in the name Mary Jane Tyler Smith, both the first name (Mary Jane) and the last name (Tyler Smith) are flat expressions, which are combined into a larger flat name (the tree appears below).

The prototypes for flat are: (i) personal names, (ii) foreign expressions, (iii) iconic sequences, and (iv) items separated for readability. These are illustrated in the sections below. The application of flat may extend beyond these prototypes to, e.g., various kinds of name and number expressions. However, even if an expression is idiosyncratic or follows a specialized pattern, every effort should be made to find a head rather than employing flat. If a head can be found but no substantive dependency relation is appropriate, dep can be used.

Note that what is considered to be transparent linguistic syntax (as opposed to flat structure) is subject to treebank-specific policies. (E.g., some treebanks might provide proper grammatical analyses in the presence of code-switching, or treat mathematical notation as following linguistic strategies like predication.)

Some languages opt to subcategorize usages of flat via subtypes. In particular, many treebanks use the flat:name and flat:foreign subtypes converted from the v1 relations name and foreign. The examples on this page simply use plain flat.

Names

A person’s name (or parts thereof) may lack the hallmarks of general constructions in the language, such that no single word can be identified as the head, in which case a flat structure applies.

Hillary Rodham Clinton
flat(Hillary, Rodham)
flat(Hillary, Clinton)

Nesting is possible:

Mary Jane Tyler Smith
flat(Mary, Jane)
flat(Tyler, Smith)
flat(Mary, Tyler)
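
The head-initial attachment described above can be sketched in a few lines of Python. This is only an illustration, not part of any UD tool: the function name flat_edges and the nested-list input format are assumptions made for this example, and words stand in for token ids, so repeated words inside one expression would need real indices.

```python
def flat_edges(expr):
    """Return (head, edges) for a possibly nested flat expression.

    expr is either a word (str) or a list of sub-expressions.
    Each edge is a (relation, head, dependent) triple.
    """
    if isinstance(expr, str):
        return expr, []
    heads, edges = [], []
    for part in expr:
        head, inner = flat_edges(part)
        heads.append(head)
        edges.extend(inner)
    first = heads[0]
    # All later components attach to the first one with the flat label.
    edges.extend(("flat", first, head) for head in heads[1:])
    return first, edges


head, edges = flat_edges([["Mary", "Jane"], ["Tyler", "Smith"]])
for rel, h, d in edges:
    print(f"{rel}({h}, {d})")
```

For the nested name, the sketch yields the same three relations as the example above; for a simple list such as ["Hillary", "Rodham", "Clinton"], every later word attaches to the first.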

On occasion, an expression with no clear head at the top level will have internal syntactic modifiers or punctuation:

Dwayne " The Rock " Johnson
flat(Dwayne, Rock)
flat(Dwayne, Johnson)
det(Rock, The)
punct(Rock, "-2)
punct(Rock, "-5)

The scope of flat may extend beyond names of persons to names of other kinds of entities that depart from general headed structure. The expressions under this category must be established by language-specific criteria.

Flat vs. non-flat names

Names that have a regular syntactic structure, like The Lord of the Rings and Captured By Aliens, should be annotated with regular syntactic relations rather than flat structures:

The Lord of the Rings
det(Lord, The)
nmod(Lord, Rings)
case(Rings, of)
det(Rings, the)
The king of Sweden
det(king-2, The-1)
nmod(king-2, Sweden-4)
case(Sweden-4, of-3)

For organization names with clear syntactic modification structure, the dependencies should also reflect the syntactic modification structure using regular syntactic relations, as in:

Natural Resources Conservation Service
amod(Resources-2, Natural-1)
compound(Conservation-3, Resources-2)
compound(Service-4, Conservation-3)

In addition, regular syntactic relations are used (i) for a modifying determiner or similar function word and (ii) to connect the words of a description or name that involves embedded prepositional phrases, sentences, etc. This holds when these relations are (a) recognized in the language being annotated (i.e., the analyses below are for French, German, and Spanish, not English) and (b) deemed not to be grammaticalized to the extent that the original role of the function words has been lost.

Le Japon
det(Japon-2, Le-1)
Ludwig van Beethoven
case(Beethoven, van)
nmod(Ludwig, Beethoven)
Miguel de Cervantes y Saavedra
conj(Cervantes, Saavedra)
cc(Saavedra, y)
case(Cervantes, de)
nmod(Miguel, Cervantes)
Río de la Plata
case(Plata-4, de-2)
det(Plata-4, la-3)
nmod(Río-1, Plata-4)

A name may combine flat and non-flat structure. In a Portuguese text, the surname Paulo da Silva would be analyzed as follows:

Roberto Paulo da Silva
flat(Roberto, Paulo)
nmod(Paulo, Silva)
case(Silva, da)

The above analyses of Ludwig van Beethoven and Miguel de Cervantes y Saavedra assume that van and de, respectively, are prepositions. This is true in the languages of the names’ origin, but it can be expected to change when the name is used in foreign text or when sufficient grammaticalization has taken place. For example, when names like this are annotated in English, the appropriate analysis is as a flat name:

Ludwig van Beethoven was a famous German composer .
flat(Ludwig, van)
flat(Ludwig, Beethoven)
det(composer, a)
amod(composer, famous)
amod(composer, German)
cop(composer, was)
nsubj(composer, Ludwig)
punct(composer, .)
Río de la Plata
flat(Río-1, de-2)
flat(Río-1, la-3)
flat(Río-1, Plata-4)
Al Arabiya is a Saudi-owned news organization
flat(Al-1, Arabiya-2)
nsubj(organization-7, Al-1)

And in Modern German or French, these prepositions have generally just become a fossilized part of a family name and regularly appear without the given name. Again, here, the flat analysis seems correct:

Von Hohenlohe gewann das Rennen . \n Von Hohenlohe won the race .
flat(Von-1, Hohenlohe-2)
nsubj(gewann-3, Von-1)

Foreign expressions

This encompasses expressions that may have been borrowed or quoted, but whose original grammatical structure is not necessarily accessible to speakers of the language(s) being annotated.

And then she went : gjiko frac zen .
parataxis(went, gjiko)
flat(gjiko, frac)
flat(gjiko, zen)

“Foreign” includes not just natural languages but also notational systems that are considered external to natural language proper and are governed by separate rules (e.g., musical chord progressions, software code excerpts).

The Vienna Game move order is 1. e4 e5 2. Nc3 .
nsubj(1., order)
cop(1., is)
flat(1., e4)
flat(1., e5)
flat(1., 2.)
flat(1., Nc3)

See further discussion at Foreign Expressions and Code-Switching.

History: UD v1 had a foreign relation, but this is no longer part of the relation taxonomy and has been subsumed under flat.

Iconic sequences

Sequences for which neither head-dependent nor coordination relationships apply include onomatopoeia (quack quack quack), “filler” words (do re mi), and gibberish (blargety blarg blarg).

The duck said quack quack quack
obj(said, quack-4)
flat(quack-4, quack-5)
flat(quack-4, quack-6)

Items separated for readability

Here the units separated by spaces or punctuation cannot really be construed as separate lexemes. A common case is telephone numbers:

Call 0118 999 881 999 119 725 3
obj(Call, 0118)
flat(0118, 999-3)
flat(0118, 881)
flat(0118, 999-5)
flat(0118, 119)
flat(0118, 725)
flat(0118, 3)

But not all “unnecessary” spaces are flat:


flat:foreign: foreign words

Some treebanks use flat:foreign to label sequences of foreign words. These are given a linear analysis: the head is the first token in the foreign phrase.

flat:foreign does not apply to loanwords or to foreign names. It applies to quoted foreign text incorporated in a sentence/discourse of the host language (unless we want to and know how to annotate the internal structure according to the syntax of the foreign language).

Jarmusch se objevil ve Wangově snímku Modrá ve tváři ( Blue in the Face ) .
flat:foreign(Blue, in)
flat:foreign(Blue, the)
flat:foreign(Blue, Face)

See the general policy on Foreign Expressions and Code-Switching.


flat:name: names

The flat:name relation is a specialization of flat used for names.

Ecco l'arringa di Tiziana Maiolo . 
flat:name(Tiziana, Maiolo)

Names are annotated in a flat, head-initial structure, in which all words in the name modify the first one using the flat:name label. This also works for prepositions or determiners and numerals that are part of the names.

Formula 1/NUM . 
flat:name(Formula, 1)
Marcello Dell' Utri . 
flat:name(Marcello, Dell')
flat:name(Marcello, Utri)

Words joined by flat:name should all be part of a minimal noun phrase; otherwise regular syntactic relations should be used. For organization names with clear syntactic modification structure, the dependencies should reflect the syntactic modification structure using regular syntactic relations.

L' ordine Mauriziano
det(ordine, L')
amod(ordine, Mauriziano)
Il Ministero di gli Interni 
det(Ministero, Il)
nmod(Ministero, Interni)
det(Interni, gli)
case(Interni, di)

In addition, regular syntactic relations are used:

Mariatersa Di Lascia
name(Mariatersa, Lascia)
case(Lascia, Di)
Università di Pristina 
name(Università, Pristina)
case(Pristina, di)


goeswith: goes with

This relation links two or more parts of a word that are separated in text that is not well edited. These parts should be written together as one word according to the orthographic rules of a given language. The head is always the first part; the other parts are attached to it with the goeswith relation (for consistency, as in flat, fixed and conj).

The first part of the word is given the part of speech that the word would have been given if written together, while the later parts of the word are given the POS X. Similarly, only the first part can have a lemma and morphological features. And while the annotation of morphological features is optional, if the treebank does have features, then Typo=Yes must be used with the goeswith head.

Note also that only the last word part may be annotated with SpaceAfter=No.

They come here with/ADP[Typo=Yes] out/X legal permission
goeswith(with-4, out-5)
never/ADV[Typo=Yes] the/X less/X[SpaceAfter=No] ,
goeswith(never, the)
goeswith(never, less)
For/VERB[Mood=Imp|Typo=Yes|VerbForm=Fin] get/X that !
goeswith(For, get)
obj(For, that)
punct(For, !)
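
The constraints stated above (first part is the head, later parts are tagged X without lemma or features, Typo=Yes on the head, SpaceAfter=No only on the last part) can be checked mechanically. The following Python sketch is an illustration only: the function check_goeswith, the helper tok, and the dictionary layout are hypothetical simplifications of CoNLL-U rows, and the sketch assumes the treebank annotates features.

```python
def tok(tid, form, upos, head, deprel, feats="_", lemma="_", misc="_"):
    """Build a simplified token record (hypothetical stand-in for a CoNLL-U row)."""
    return dict(id=tid, form=form, lemma=lemma, upos=upos,
                feats=feats, head=head, deprel=deprel, misc=misc)


def check_goeswith(tokens):
    """Return a list of human-readable problems in goeswith annotations."""
    by_id = {t["id"]: t for t in tokens}
    groups = {}
    for t in tokens:
        if t["deprel"] == "goeswith":
            groups.setdefault(t["head"], []).append(t)

    problems = []
    for head_id, parts in groups.items():
        head = by_id[head_id]
        # Assumes features are annotated, so the head must carry Typo=Yes.
        if "Typo=Yes" not in head["feats"].split("|"):
            problems.append(f"{head['form']}: head lacks Typo=Yes")
        for part in parts:
            if part["id"] < head_id:
                problems.append(f"{part['form']}: precedes its head")
            if part["upos"] != "X":
                problems.append(f"{part['form']}: UPOS should be X")
            if part["lemma"] != "_" or part["feats"] != "_":
                problems.append(f"{part['form']}: lemma/features belong on the first part")
        # Only the last part of the word may carry SpaceAfter=No.
        word = sorted([head] + parts, key=lambda t: t["id"])
        for t in word[:-1]:
            if "SpaceAfter=No" in t["misc"].split("|"):
                problems.append(f"{t['form']}: SpaceAfter=No on a non-final part")
    return problems


good = [tok(4, "with", "ADP", 7, "case", feats="Typo=Yes"),
        tok(5, "out", "X", 4, "goeswith")]
print(check_goeswith(good))
```

The well-formed "with out" fragment produces an empty problem list; dropping Typo=Yes from the head or tagging "out" as ADP would each add one message.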


iobj: indirect object

  WARNING
⚠️ The traditional term “indirect object”, associated with morphosyntactic encoding of certain types of arguments (especially datives/recipients) in a clause, has a wide range of interpretations across languages and linguistic frameworks. In UD, universal-level relations do not distinguish arguments and adjuncts; rather, the distinction is between core arguments and oblique modifiers. iobj must only be used for core arguments, never for obliques, as described below. The naming of this relation may be changed in the next major revision of the UD guidelines.

In UD, the indirect object of a verb is any nominal phrase that is a core argument of the verb but is not its subject or (direct) object. The prototypical example is the recipient of ditransitive verbs of exchange:

She gave me a raise
iobj(gave, me)
nsubj(gave, She)

However, many languages allow other semantic roles as additional objects. The most common case is allowing benefactives, but some languages allow other roles. Examples include instruments, such as in the Kinyarwanda example below, or comitatives. At the other extreme, some languages lack all indirect objects.

Umukoóbwa a-ra-andik-iish-a íbárúwa íkárámu \n girl 1-PRS-write-APPL-ASP letter pen
obj(a-ra-andik-iish-a, íbárúwa)
iobj(a-ra-andik-iish-a, íkárámu)

In languages distinguishing morphological cases, the recipient will often be marked by the dative case. However, the iobj relation can be used only for a core argument. The morphological dative may signal a core argument in some languages (such as Basque) but in many others it is just oblique (like the English preposition to). For instance, in many Indo-European languages, the recipient should be attached as obl and not iobj, regardless of the traditional grammar which may label it as “indirect object”.

In the following Czech example, the verb takes two objects. Both are nouns in the accusative case, which is rather unusual: for most other verbs, one of the arguments would be in the dative and would thus be treated as oblique in UD. However, a bare accusative signals a core object, and a verb with one nominative and two accusatives is ditransitive in UD. One of the accusatives is the direct object (patient); the other is the indirect object (recipient). This is parallel to how the English translation would be annotated (where there is no morphological case marking) and also to verbs of giving in English (consider a similar sentence, he gave my daughter a class of maths).

On učí mou dceru matematiku . \n He teaches my daughter.Acc maths.Acc .
obj(učí, matematiku)
iobj(učí, dceru)
obj(teaches, maths.Acc)
iobj(teaches, daughter.Acc)

Predicates in Basque can cross-reference (by morphological agreement on the auxiliary verb) up to three arguments in different morphological cases: ergative, absolutive, and dative. The morphological cross-reference is a strong indicator that all three are core arguments. Therefore, if all three are present, we have a double-object situation and the dative argument will be iobj (while the ergative argument will be nsubj and the absolutive obj). Even if the absolutive argument is omitted for a verb which licenses three arguments, the dative argument is still iobj.

(Nik)/Case=Erg (zuri)/Case=Dat liburua/Case=Abs eman dizut . \n (I) (you) book given I-have-you-it .
nsubj(eman, (Nik))
iobj(eman, (zuri))
obj(eman, liburua)
aux(eman, dizut)
punct(eman, .-6)
nsubj(given, (I))
iobj(given, (you))
obj(given, book)
aux(given, I-have-you-it)
punct(given, .-13)
Mariari/Case=Dat eman nion liburua/Case=Abs . \n To-Maria given I-have-her-it book .
iobj(eman, Mariari)
obj(eman, liburua)
aux(eman, nion)
punct(eman, .-5)
iobj(given, To-Maria)
obj(given, book)
aux(given, I-have-her-it)
punct(given, .-11)
Mariari/Case=Dat eman nion . \n To-Maria given I-have-her-it .
iobj(eman, Mariari)
aux(eman, nion)
punct(eman, .-4)
iobj(given, To-Maria)
aux(given, I-have-her-it)
punct(given, .-9)
Liburua/Case=Abs eman nion . \n Book given I-have-her-it .
obj(eman, Liburua)
aux(eman, nion)
punct(eman, .-4)
obj(given, Book)
aux(given, I-have-her-it)
punct(given, .-9)

Nevertheless, Basque also has a class of verbs that license only two core arguments, one ergative and one dative. Here the ergative has the A function and the dative the P function (Zúñiga and Fernández 2014), meaning that the dative is obj rather than iobj, as in “The teacher has looked angrily at the students.”

Irakasleak/Case=Erg haserre begiratu die ikasleei/Case=Dat . \n Teacher angrily looked he-has-them to-students .
nsubj(begiratu, Irakasleak)
advmod(begiratu, haserre)
aux(begiratu, die)
obj(begiratu, ikasleei)
punct(begiratu, .-6)
nsubj(looked, Teacher)
advmod(looked, angrily)
aux(looked, he-has-them)
obj(looked, to-students)
punct(looked, .-13)

Another class of transitive verbs in Basque license one dative and one absolutive argument. Here the dative has the A function and the absolutive the P function, meaning that the dative is nsubj and the absolutive is obj, as in “The boy likes the soup very much.”

Zopa/Case=Abs izugarri gustatzen zaio mutilari/Case=Dat . \n Soup greatly pleasing it-is-him to-boy .
obj(gustatzen, Zopa)
advmod(gustatzen, izugarri)
aux(gustatzen, zaio)
nsubj(gustatzen, mutilari)
punct(gustatzen, .-6)
obj(pleasing, Soup)
advmod(pleasing, greatly)
aux(pleasing, it-is-him)
nsubj(pleasing, to-boy)
punct(pleasing, .-13)

In Tagalog, core arguments are marked by the prepositions ang and ng (or by corresponding inflection of personal pronouns), while oblique dependents are typically marked by the preposition sa (sometimes glossed as the dative). Giving somebody something is a (mono)transitive predicate.

# text = Nagbigay ang lalaki ng libro sa babae.
# text_en = The man gave a book to the woman.
1	Nagbigay	bigay	VERB	_	Aspect=Perf|Mood=Ind|VerbForm=Fin|Voice=Act	0	root	_	Gloss=gave
2	ang	ang	ADP	_	Case=Nom	3	case	_	Gloss=the
3	lalaki	lalaki	NOUN	_	_	1	nsubj	_	Gloss=man
4	ng	ng	ADP	_	Case=Gen	5	case	_	_
5	libro	libro	NOUN	_	_	1	obj	_	Gloss=book
6	sa	sa	ADP	_	Case=Dat	7	case	_	Gloss=DIR
7	babae	babae	NOUN	_	_	1	obl	_	Gloss=woman|SpaceAfter=No
8	.	.	PUNCT	_	_	1	punct	_	Gloss=.

However, locative dependents can be topicalized if the verb morphology signals the “locative voice”. The locative noun phrase then switches to nominative and becomes a core argument, while the original two core arguments keep their core coding, too. We therefore have a ditransitive clause with three core arguments, even for verbs that are not associated with ditransitives in other languages:

# sent_id = 3.111c/tl
# text = Aalisan ng babae ng bigas ang sako para sa bata.
# gloss = FUT-take.out-DP ACT woman OBJ rice PIV sack BEN child
# text_en = A/the woman will take some rice out of the sack for a/the child.
# DP = directional pivot; PIV = pivot marker
1	Aalisan	alis	VERB	_	Aspect=Prog|Mood=Ind|VerbForm=Fin|Voice=Lfoc	0	root	_	Gloss=will-take-out|MSeg=a-alis-an|MGloss=FUT-take.out-DP
2	ng	ng	ADP	_	Case=Gen	3	case	_	_
3	babae	babae	NOUN	_	_	1	iobj:agent	_	Gloss=woman
4	ng	ng	ADP	_	Case=Gen	5	case	_	_
5	bigas	bigas	NOUN	_	_	1	obj:patient	_	Gloss=rice
6	ang	ang	ADP	_	Case=Nom	7	case	_	Gloss=the
7	sako	sako	NOUN	_	_	1	nsubj:loc	_	Gloss=sack
8	para	para	ADP	_	_	10	case	_	Gloss=for
9	sa	sa	ADP	_	Case=Dat	10	case	_	Gloss=BEN
10	bata	bata	NOUN	_	_	1	obl	_	Gloss=child|SpaceAfter=No
11	.	.	PUNCT	_	_	1	punct	_	Gloss=.
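
To see the three core arguments at a glance, one can filter the dependents of the verb by the universal part of their relation label. The Python sketch below is a minimal illustration under stated assumptions: the function core_dependents is hypothetical, the sample rebuilds the sentence above with lemma, feature and MISC columns blanked out for brevity, and the set of core relations is taken from the table at the top of this page.

```python
# Core-argument relations from the UD relation table.
CORE = {"nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp"}


def core_dependents(conllu, head_id):
    """List (form, deprel) for core-argument dependents of the given head."""
    found = []
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        # Compare only the universal part of the label (before any subtype).
        if cols[6] == str(head_id) and cols[7].split(":")[0] in CORE:
            found.append((cols[1], cols[7]))
    return found


# Abbreviated copy of the Tagalog sentence above: (id, form, upos, head, deprel).
rows = [
    ("1", "Aalisan", "VERB", "0", "root"),
    ("2", "ng", "ADP", "3", "case"),
    ("3", "babae", "NOUN", "1", "iobj:agent"),
    ("4", "ng", "ADP", "5", "case"),
    ("5", "bigas", "NOUN", "1", "obj:patient"),
    ("6", "ang", "ADP", "7", "case"),
    ("7", "sako", "NOUN", "1", "nsubj:loc"),
    ("8", "para", "ADP", "10", "case"),
    ("9", "sa", "ADP", "10", "case"),
    ("10", "bata", "NOUN", "1", "obl"),
    ("11", ".", "PUNCT", "1", "punct"),
]
sample = "\n".join(
    "\t".join([tid, form, "_", upos, "_", "_", head, rel, "_", "_"])
    for tid, form, upos, head, rel in rows
)
print(core_dependents(sample, 1))
```

Run on this sentence, the sketch returns babae, bigas and sako with their subtyped labels, while the benefactive bata (obl) is left out.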

In Plains Cree (Wolvengrey 2011), transitive verbs cross-reference subjects and animate objects but not inanimate objects. With a verb of giving, the theme is typically inanimate while the recipient is typically animate. Assuming that nsubj and obj are reserved for the two core arguments cross-referenced by the verb, the theme has to be iobj (if it is a core argument at all; otherwise it would have to be obl; but real oblique nominals in Plains Cree take a locative case affix, which is not present here).

# text = Nikī-miyāw anima masinahikan.
# text_en = I gave him/her that book.
1	Nikī-miyāw	miy	VERB	_	Animacy=Anim|Mood=Ind|Number[high]=Sing|Number[low]=Sing|Person[high]=1|Person[low]=3|Tense=Past|Voice=Dir	0	root	_	Gloss=I-gave-him/her|MSeg=ni-kī-miy-ā-w|MGloss=1-PAST-give.to-DIR-3SG
2	anima	anima	DET	_	Animacy=Inan|Number=Sing|PronType=Dem	3	det	_	Gloss=that|MGloss=DEM.0's
3	masinahikan	masinahikan	NOUN	_	Animacy=Inan|Number=Sing	1	iobj	_	Gloss=book|SpaceAfter=No
4	.	.	PUNCT	_	_	1	punct	_	Gloss=.

In the above example, the verb stem used is for animate objects, while masinahikan “book” is inanimate. This shows that the 3rd person singular cross-reference on the verb does not refer to the book but to an animate recipient that is not overtly represented in the sentence.

If the language has a prototypical iobj (occurring in a double object construction with obj), then morphosyntactic criteria need to be established for when a sole object is obj and when it is iobj [1]. Depending on the language, potential reasons to consider a sole object in a clause as an iobj include:

For example, in English, the verb teach may occur with obj, iobj, or both:

She teaches the students introductory logic .
iobj(teaches, students)
obj(teaches, logic)
She teaches introductory logic .
obj(teaches, logic)
She teaches the first-year students .
iobj(teaches, students)
She teaches her students that good writing is important .
iobj(teaches, students)
ccomp(teaches, important)
She teaches her students to write well .
iobj(teaches, students)
xcomp(teaches, write)

However, not all verbs license two objects (or an object plus ccomp), in which case the sole object should be plain obj even if it has recipient-like semantics:

She questions her students about their interests .
obj(questions, students)
obl(questions, interests)
She helps her students to succeed .
obj(helps, students)
xcomp(helps, succeed)

References

  1. This is an amended policy as described on the changes page


list: list

The list relation is used for chains of comparable items. In lists with more than two items, all items of the list should modify the first one. If a list is something like a list of paragraphs (for example, describing items in a catalogue), then each item will be one or more sentences and no list relations appear, as we do not have relations between sentences. However, informal and web text often contains passages which are meant to be interpreted as lists but are parsed as single sentences. For example, email signatures often contain these structures, in the form of contact information: the different contact information items are labeled as list.

Steve Jones sj@abc.xyz University of Arizona
flat:name(Steve, Jones)
list(Steve, sj@abc.xyz)
list(Steve, University)
nmod(University, Arizona)
case(Arizona, of)

If the fields in the list are explicit and have a key-value structure, the key-value pair relations are labeled as appos.

Steve Jones Phone: 555-9814 Email: jones@abc.edf
flat:name(Steve-1, Jones-2)
list(Steve-1, Phone:-3)
list(Steve-1, Email:-5)
appos(Phone:-3, 555-9814-4)
appos(Email:-5, jones@abc.edf-6)

Another place where list has been used is for a sequence of attributes or descriptive terms used as the title line of a review (such as product or restaurant reviews, etc.):

Long Lines , Silly Rules , Rude Staff , Ok Food
list(Lines, Rules)
list(Lines, Staff)
list(Lines, Food)
amod(Lines, Long)
amod(Rules, Silly)
amod(Staff, Rude)
amod(Food, Ok)
punct(Rules, ,-3)
punct(Staff, ,-6)
punct(Food, ,-9)

However, list should not be over-used. If a construction can easily be analyzed using the grammatical relations of standard sentences, typically as a coordinated structure, then it should be analyzed with these more standard relations, even if it is laid out as a list typographically. In particular, when the list is written as a single sentence, with commas and overt coordination, then it should be analyzed as a coordinated structure.

For list items, the de facto decision taken in issue 156 is as follows. In enumerated lists, regardless of whether the items are numbered with Arabic, Roman, or other numerals, or are given letters, the item content is the head, and the item enumerator is a nummod dependent of it with the part of speech NUM. Any punctuation with the list item is a punct dependent of the item enumerator. For itemized lists with a bullet, dash, or similar marker, the current standard is to give the marker the part of speech PUNCT and attach it with the relation punct to the head of the item content.
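
The enumerator and bullet conventions can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the function annotate_list_marker is hypothetical, marker tokens are assumed to occupy positions 1..n of the sentence, and the regular expression is a deliberately loose guess at what counts as an enumerator (digits, Roman-numeral letters, or a single letter).

```python
import re

# Loose, illustrative pattern: digits, Roman-numeral letters, or one letter.
ENUMERATOR = re.compile(r"^(\d+|[ivxlcdm]+|[a-z])$", re.IGNORECASE)


def annotate_list_marker(marker_tokens, content_head_id):
    """Return (id, form, upos, head, deprel) rows for the leading marker tokens.

    marker_tokens: tokens before the item content, e.g. ["1", "."] or ["-"].
    content_head_id: token id of the head of the item content.
    """
    enum_id = next(
        (i for i, form in enumerate(marker_tokens, start=1) if ENUMERATOR.match(form)),
        None,
    )
    rows = []
    for i, form in enumerate(marker_tokens, start=1):
        if enum_id is None:
            # Bullet, dash or similar marker: punct on the item content head.
            rows.append((i, form, "PUNCT", content_head_id, "punct"))
        elif i == enum_id:
            # Enumerator: NUM, attached as nummod to the item content head.
            rows.append((i, form, "NUM", content_head_id, "nummod"))
        else:
            # Punctuation accompanying the enumerator attaches to the enumerator.
            rows.append((i, form, "PUNCT", enum_id, "punct"))
    return rows


print(annotate_list_marker(["1", "."], 4))
print(annotate_list_marker(["-"], 3))
```

With ["1", "."] the numeral becomes a nummod of the content head and the period a punct dependent of the numeral; with ["-"] the dash attaches directly to the content head as punct.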


mark: marker

A marker is the word marking a clause as subordinate to another clause. For a complement clause, this is a word like [en] that or whether. For an adverbial clause, the marker is typically a subordinating conjunction like [en] while or although. The marker is a dependent of the subordinate clause head. In a relative clause, it is a normally uninflected word that simply introduces the relative clause, such as [he] še. (In this last use, one needs to distinguish relative clause markers, which are mark, from relative pronouns such as [en] who or that, which fill a regular verbal argument or modifier grammatical relation.)

Forces engaged in fighting after insurgents attacked
mark(attacked, after)
He says that you like to swim
mark(like, that)

Infinitive markers (e.g. English to, German zu) in infinitival clauses are also attached as mark:

Er kam wieder , um das Werk zu Ende zu bringen \n He came again , so-that the work to end to bring
mark(bringen, um)
mark(bringen, zu-10)
mark(bring, so-that)
mark(bring, to-22)


nmod: nominal modifier

The nmod relation is used for nominal dependents of another noun or noun phrase and functionally corresponds to an attribute, or genitive complement.

New from v2: The nmod relation was previously used also for nominal dependents of verbs, adjectives, and adverbs. These are now covered by the new obl relation.

In conjunction with the case relation, nmod provides a uniform analysis for the possessive alternation (with the option of a subtype like nmod:poss to distinguish non-adpositional case):

the office of the Chair
det(office-2, the-1)
nmod(office-2, Chair-5)
case(Chair-5, of-3)
det(Chair-5, the-4)
the Chair 's office
det(Chair-2, the-1)
nmod:poss(office-4, Chair-2)
case(Chair-2, 's-3)


nmod:poss: possessive nominal modifier

nmod:poss is used for a possessive nominal modifier. In English, for example, it is marked with the genitive case clitic 's or one of its variant forms.

Marie 's book
nmod:poss(book, Marie)
case(Marie, 's)


nmod:tmod: temporal modifier

A temporal nominal modifier of another nominal is a subtype of the nmod relation: if the modifier specifies a time, it is labeled nmod:tmod.

Are you free for lunch some day this week ?
nmod:tmod(day, week)


nsubj: nominal subject

A nominal subject (nsubj) is a nominal which is the syntactic subject and the proto-agent of a clause. That is, it is in the position that passes typical grammatical tests for subjecthood, and this argument is the more agentive, the do-er, or the proto-agent of the clause. This nominal may be headed by a noun, or it may be a pronoun or relative pronoun or, in ellipsis contexts, other things such as an adjective.

New from v2: The nsubj relation is also used for the nominal subject of a passive verb or verb group, even though the subject is then not typically the proto-agent argument due to valency changing operations. For languages that have a grammaticalized passive transformation, it is strongly recommended to use the subtype nsubj:pass in such cases. If the subject is of a copular clause whose predicate is itself a clause, nsubj:outer may be used.

The governor of the nsubj relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb, which can be an adjective or noun, including a noun marked by a preposition, as in the examples below.

The nsubj role is only applied to semantic arguments of a predicate. When there is an empty argument in a grammatical subject position (sometimes called a pleonastic or expletive), it is labeled as expl. If there is then a displaced subject in the clause, as in the English existential there construction, it will be labeled as nsubj.

Clinton defeated Dole
nsubj(defeated, Clinton)
Dole was defeated by Clinton
nsubj:pass(defeated, Dole)
The car is red .
nsubj(red, car)
Sue is a true patriot .
nsubj(patriot, Sue)
We are in the barn .
nsubj(barn, We)
Agatha is in trouble .
nsubj(trouble, Agatha)
There is a ghost in the room .
expl(is, There)
nsubj(is, ghost)
These links present the many viewpoints that existed .
acl:relcl(viewpoints, existed)
nsubj(existed, that)


nsubj:outer: outer clause nominal subject

This relation specifies a nominal subject of a copular clause whose predicate is itself a clause, to signal that it is not the subject of the nested clause. See discussion of Predicate Clauses.

-ROOT- The problem is that this has never been tried .
nsubj:outer(tried, problem)
cop(tried, is)
mark(tried, that)
nsubj:pass(tried, this)
aux(tried, has)
advmod(tried, never)
aux:pass(tried, been)
root(-ROOT-, tried)
The title is Some Like It Hot .
nsubj:outer(Like, title)
cop(Like, is)
nsubj(Like, Some)
obj(Like, It)
xcomp(Like, Hot)

There may be an outer subject with no inner subject:

The important thing is to keep calm .
nsubj:outer(keep, thing)
cop(keep, is)
mark(keep, to)
xcomp(keep, calm)

The clausal counterpart of this relation is csubj:outer.

Only subjects are required to be distinguished in this way. There may, for example, be inner and outer copulas, both attaching as cop:

The important thing is to be calm .
nsubj:outer(calm, thing)
cop(calm, is)
mark(calm, to)
cop(calm, be)

The :outer subtype is not intended for most nominal subjects of copular clauses—only those where the predicate is itself a clause. Plain nsubj (or another subtype) will be appropriate if the copular clause predicate is a nominal, adjective, etc.:

That book is very good .
nsubj(good, book)
The title is Green Eggs and Ham .
nsubj(Eggs, title)
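
Since the :outer subtype is meant for copular clauses, one rough sanity check is that the head of every outer subject also has a cop dependent. The Python sketch below illustrates this under explicit assumptions: the function name and the (relation, head, dependent) triple format are hypothetical, words stand in for token ids, and the check only makes sense for languages with an overt copula, as in the English examples in this section.

```python
def outer_subjects_without_copula(edges):
    """Return (head, dependent) pairs for outer subjects whose head has no cop.

    edges: list of (relation, head, dependent) triples.
    """
    heads_with_cop = {head for rel, head, _ in edges if rel == "cop"}
    return [
        (head, dep)
        for rel, head, dep in edges
        if rel in ("nsubj:outer", "csubj:outer") and head not in heads_with_cop
    ]


# Relations copied from the example "The title is Some Like It Hot ."
edges = [
    ("nsubj:outer", "Like", "title"),
    ("cop", "Like", "is"),
    ("nsubj", "Like", "Some"),
    ("obj", "Like", "It"),
    ("xcomp", "Like", "Hot"),
]
print(outer_subjects_without_copula(edges))
```

The example sentence passes with an empty result; an nsubj:outer edge whose head lacks any cop dependent would be reported. The check does not verify that the predicate is itself a clause, which is the other half of the condition.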


nsubj:pass: passive nominal subject

A passive nominal subject is a noun phrase which is the syntactic subject of a passive clause.

Schwarzenberg byl poražen Zemanem . \n Schwarzenberg was defeated by-Zeman .
nsubj:pass(poražen, Schwarzenberg-1)
nsubj:pass(defeated, Schwarzenberg-7)

Reflexive passive (the meaning is “This will be solved tomorrow.”)

Tohle se bude řešit zítra . \n This itself will solve tomorrow .
nsubj:pass(řešit, Tohle)
nsubj:pass(solve, This)


nummod: numeric modifier

A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun with a quantity.

Sam ate 3 sheep
nummod(sheep, 3)
Sam spent forty dollars
nummod(dollars, forty)
Sam spent $ 40
nummod($, 40)

Note that indefinite quantifiers such as few, many are tagged DET rather than NUM. Therefore their relation to the quantified noun is not nummod but det:

Sam ate many sheep
det(sheep, many)

Furthermore, a number that serves as a label for an entity rather than denoting quantity is not nummod. For example, in The meeting will be in room 4, the number is the name of a particular room; this differs from the expression 4 rooms. Note that the label of the room could also be non-numeric, as in The meeting will be in room A. UD analyzes the number as a nominal (even if keeping the UPOS tag NUM for it). Hence the number is attached as nmod to the noun it modifies, unless there is clear morphosyntactic evidence in the language for the opposite direction. See also §3.6.3 of de Marneffe et al. (2021).

The meeting will be in room 4
det(meeting, The)
nsubj(room, meeting)
aux(room, will)
cop(room, be)
case(room, in)
nmod(room, 4)


nummod:gov: numeric modifier governing the case of the noun

nummod:gov differs from nummod in that the numeral requires the counted noun to be in its genitive form. The whole phrase (numeral + noun) is treated as a singular neuter noun phrase and it can fill roles where nominative, accusative or vocative noun phrases are expected. This construction occurs in many Slavic languages.

To increase parallelism across languages (and also across morphological cases within one language), the numeral is not annotated as the head of the nominal. However, the nummod:gov label is used to preserve the information about case conditions.

Czech:

Pět mužů hrálo karty . \n Five men played cards .
nummod:gov(mužů, Pět)
nsubj(hrálo, mužů)
obj(hrálo, karty)
punct(hrálo, .-5)
nummod:gov(men, Five)
nsubj(played, men)
obj(played, cards)
punct(played, .-11)

See also det:numgov and det:nummod.


obj: object

The object of a verb is the second most core argument of a verb after the subject. Typically, it is the noun phrase that denotes the entity acted upon or which undergoes a change of state or motion (the proto-patient).

She gave me a raise
obj(gave, raise)

In languages distinguishing morphological cases, the object will often be marked by the accusative case. If a verb dictates another case (dative, genitive…), the fundamental question is whether such cases qualify as core in the given language. Often these cases are oblique, regardless of the presence or absence of an adposition. Consequently they cannot use the obj relation and must use obl, even if the traditional grammar calls such dependents “objects”.

If there are two or more objects, one of them should be obj and the others should be iobj. In such cases it is necessary to decide what is the most directly affected object (patient). If there is just one object, it should likely be obj unless it is morphosyntactically more similar to clear cases of iobj in the language than it is to prototypical patient arguments.
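
The rule that only one of several objects is obj can be turned into a simple consistency check. The Python sketch below is illustrative only: the function predicates_with_multiple_obj and the triple format are assumptions made for this example, and words stand in for token ids.

```python
from collections import Counter


def predicates_with_multiple_obj(edges):
    """Return heads that have more than one obj dependent (subtypes included).

    edges: list of (relation, head, dependent) triples.
    """
    counts = Counter(
        head for rel, head, _ in edges if rel.split(":")[0] == "obj"
    )
    return sorted(head for head, n in counts.items() if n > 1)


# "She gave me a raise": one obj and one iobj, so nothing is reported.
ok = [("nsubj", "gave", "She"), ("iobj", "gave", "me"), ("obj", "gave", "raise")]
print(predicates_with_multiple_obj(ok))

# Labelling both objects as obj violates the rule and is reported.
wrong = [("nsubj", "gave", "She"), ("obj", "gave", "me"), ("obj", "gave", "raise")]
print(predicates_with_multiple_obj(wrong))
```

The check says nothing about which object should be obj; that decision still rests on identifying the most directly affected object, as described above.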

There is further discussion of the two kinds of object at iobj. If possible, language-specific documentation should be available to help identify the primary (or direct) object.


obl: oblique nominal

The obl relation is used for a nominal (noun, pronoun, noun phrase) functioning as a non-core (oblique) argument or adjunct. This means that it functionally corresponds to an adverbial attaching to a verb, adjective or other adverb.

The obl relation can be further specified by the case. In conjunction with the case relation, it provides a uniform analysis for:

etsiä ilman johtolankaa \n to_search without clue.PARTITIVE
obl(etsiä, johtolankaa)
case(johtolankaa, ilman)
etsiä taskulampun kanssa \n to_search torch.GENITIVE with
obl(etsiä, taskulampun)
case(taskulampun, kanssa)
etsiä johtolangatta \n to_search clue.ABESSIVE
obl(etsiä, johtolangatta)
give the children the toys
obj(give, toys)
iobj(give, children)
give the toys to the children
obj(give, toys)
obl(give, children)
case(children, to)
# give the toys to the children
1     donner    donner   VERB   _   VerbForm=Inf               0   root   _   give
2     les       le       DET    _   Definite=Def|Number=Plur   3   det    _   the
3     jouets    jouet    NOUN   _   Gender=Masc|Number=Plur    1   obj   _   toys
4-5   aux       _        _      _   _                          _   _      _   _
4     à         à        ADP    _   _                          6   case   _   to
5     les       le       DET    _   Definite=Def|Number=Plur   6   det    _   the
6     enfants   enfant   NOUN   _   Gender=Masc|Number=Plur    1   obl   _   children

obl is also used for temporal and locational nominal modifiers:

Last night , I swam in the pool
obl(swam, night)
obl(swam, pool)

and for the agent of a passive verb (with the optional subtype obl:agent):

the cat was chased by the dog
nsubj:pass(chased, cat)
obl:agent(chased, dog)


obl:agent: agent modifier

The relation obl:agent is used for agents of passive verbs. In Czech, the agent is a nominal in the instrumental case.

Cena byla udělena děkanem fakulty . \n Prize was awarded by-dean of-faculty .
obl:agent(udělena, děkanem)
obl:agent(awarded, by-dean)

Typical agents are animate, but this is not a rule. Inanimate agents are sometimes difficult to distinguish from instruments, which are also coded by the instrumental case. Instruments are attached using the plain relation obl. Consider the following two examples; the first is active and the second is passive.

Praštil psa klackem . \n He-hit dog with-a-stick .
obl(Praštil, klackem)
obl(He-hit, with-a-stick)
Pes byl praštěn klackem . \n Dog was hit with-a-stick .
obl(praštěn, klackem)
obl(hit, with-a-stick)

However, in passive sentences like Byl přejet autem “He was run over by a car,” the car could be analyzed as an inanimate agent, but also as an instrument, which is supported by the plausibility of the active counterpart, Přejeli ho autem “They ran over him with a car.”


obl:arg: oblique argument

The relation obl:arg is used for oblique arguments and distinguishes them from adjuncts, which use the plain obl relation. It is thus possible to preserve the notion of object as it is defined in the traditional grammar of some languages, where it essentially follows the distinction between arguments and adjuncts (a distinction that is otherwise not reflected in the main UD relation types; see the discussion of the argument/adjunct distinction in the syntax overview). A Czech example:

Spoléhám se na jeho instinkt . \n I-rely REFL on his instinct .
obl:arg(Spoléhám, instinkt)
obl:arg(I-rely, instinct)
case(instinkt, na)
case(instinct, on)

Arguments are selected by the predicate. Their coding (preposition and morphological case) is determined by the predicate; within the set of arguments of this predicate, the coding maps the argument to a particular semantic role. In contrast, the semantics of an adjunct is relatively independent of the predicate, and typical adjuncts (such as specifications of time, location, manner or instrument) can combine with a large number of different predicates.

Hence in the above example, the preposition na “on” and the accusative case of the noun instinkt “instinct” are selected by the verb spoléhat “to rely”. Other verbs may also select the same preposition and case but the meaning will be different: for instance, myslet na někoho “to think of someone.” Finally, the preposition na itself has an adessive or allative meaning (see the corresponding values of the Case feature). This meaning is suppressed when the preposition is selected by a predicate but it is more recognizable in adjuncts. In the following example, the preposition combines with a noun phrase in the locative case and marks a locational modifier:

Konference se koná na Slovensku . \n Conference REFL takes-place in Slovakia .
obl(koná, Slovensku)
obl(takes-place, Slovakia)
case(Slovensku, na)
case(Slovakia, in)


obl:lmod: locative modifier

A locative modifier is a subtype of the obl relation: if the modifier specifies a location, it is labeled obl:lmod.

Danish: Drive the road you are told.

Kør den vej , du får besked på . \n Drive the road , you get order to .
obl:lmod(Kør, vej)


obl:tmod: temporal modifier

A temporal modifier is a subtype of the obl relation: if the modifier specifies a time, it is labeled obl:tmod.

Last night , I swam in the pool
obl:tmod(swam, night)
You need to turn in your homework by next week
obl:tmod(turn, week)


orphan: orphan

The ‘orphan’ relation is used in cases of head ellipsis where simple promotion would result in an unnatural and misleading dependency relation. The typical case is predicate ellipsis where one of the core arguments has to be promoted to clausal head.

Marie won gold and Peter bronze
nsubj(won, Marie)
obj(won, gold)
conj(won, Peter)
cc(Peter, and)
orphan(Peter, bronze)

In this example, the subject Peter is promoted to the head position in the second conjunct. Attaching the object bronze to the subject is necessary to preserve the integrity of the clause, but using the standard relation obj would be misleading because bronze is not the object of Peter. Therefore, the orphan relation is used to indicate that this is a non-standard attachment. By contrast, the coordinating conjunction and performs essentially the same function as in the non-elliptical case and therefore retains its normal relation cc.

See further discussion of ellipsis.


parataxis: parataxis

The parataxis relation (from Greek for “place side by side”) is a relation between a word (often the main predicate of a sentence) and other elements, such as a sentential parenthetical or a clause after a “:” or a “;”, placed side by side without any explicit coordination, subordination, or argument relation with the head word. Parataxis is a discourse-like equivalent of coordination, and so usually obeys an iconic ordering. Hence it is normal for the first part of a sentence to be the head and the second part to be the parataxis dependent, regardless of the headedness properties of the language. But things do get more complicated, such as cases of parentheticals, which appear medially.

Let 's face it we 're annoyed
parataxis(Let, annoyed)
The guy , John said , left early in the morning
parataxis(left, said)
punct(said, ,-3)
punct(said, ,-6)

An inventory of constructions to which parataxis has been applied

The following material is duplicated in the syntax overview.

Side-by-side sentences (“run-on sentences”)

The parataxis relation is used for a pair of what could have been standalone sentences, but which are being treated together as a single sentence. This may happen because sentence segmentation was done primarily on the basis of sentence-final punctuation, and these clauses are joined by punctuation such as a colon or comma, or not delimited by punctuation at all. In a spoken corpus, it may happen because what is labeled as a sentence is more commonly an utterance turn. Even if the treebanker is doing the sentence division, it may happen because there seems to be a clear discourse relation linking two clauses. Sometimes more than two sentences are joined in this way. In this case we make all the later sentences dependents of the first one, to maximize similarity to the analysis used for conjunction.

Bearded dragons are sight hunters , they need to see the food to move .
parataxis(hunters, need)
punct(need, ,)
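
The attachment rule for several run-on sentences (every later sentence depends on the first) can be sketched in a few lines. This is a minimal illustration, assuming the head word of each side-by-side clause has already been identified; the function name attach_run_on and the (relation, head, dependent) triple format are hypothetical.

```python
def attach_run_on(clause_heads):
    """Attach every later clause head to the first one via parataxis.

    clause_heads: head words of the side-by-side clauses, in surface order
    (assumed to be given; finding them is outside this sketch).
    """
    first, *later = clause_heads
    return [("parataxis", first, head) for head in later]

# The bearded dragons example above has two clauses, headed by "hunters" and "need".
print(attach_run_on(["hunters", "need"]))
```

With three or more clause heads, all edges start from the first head rather than forming a chain, which parallels the way conj dependents attach to the first conjunct.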

This relation may happen with units that are smaller than sentences:

Divided world the CIA
amod(world, Divided)
parataxis(world, CIA)
det(CIA, the)

Paired clauses with non-conjunction connective (“X so Y” etc.)

The relation is also used for clauses connected by a word like so, then, therefore, or however if neither clause is interpreted as modifying the other, and there is no coordinating conjunction:

He claimed to be a wizard ; however/ADV , he turned out to be a humbug .
parataxis(claimed, turned)
advmod(turned, however)
I 'm hungry , so/ADV I 'm getting a bagel .
parataxis(hungry, getting)
advmod(getting, so)

The following, by contrast, are advcl modifiers:

Eat now so/ADV you wo n't be hungry later .
advcl(Eat, hungry)
advmod(hungry, so)
If/SCONJ you build it , then/ADV they will come .
advcl(come, build)
mark(build, If)
advmod(come, then)

Note that if-clauses should almost always be analyzed as subordinate, even when then is present.

Reported speech

When a speech verb interrupts reported speech content, the interruption is treated as a parenthetical parataxis:

The guy , John said , left early in the morning
parataxis(left, said)
punct(said, ,-3)
punct(said, ,-6)

See further discussion of reported speech at ccomp.

News article bylines

We have used the parataxis relation to connect the parts of a news article byline. There does not seem to be a better relation to use.

Washington ( CNN ) :
parataxis(Washington, CNN)
punct(CNN, ()
punct(CNN, ))
punct(CNN, :)

Interjected clauses

Single word or phrase interjections are analyzed as discourse, but when a whole clause is interjected, we use the relation parataxis.

Calafia has great fries ( they are to die for ! )
parataxis(has, are)
punct(are, ()
punct(are, ))
Just to let you all know Matt has confirmed the booking for 3rd Dec is OK .
parataxis(confirmed, let)

In the second example, we treat the second half as the head of the dependency because the first half feels like a whole clause interjection, not like the main clause of the utterance.

Tag questions

We also use the parataxis relation for tag questions such as isn’t it? or haven’t you?

It 's not me , is it ?
parataxis(me, is)
punct(is, ,)


punct: punctuation

This is used for any piece of punctuation in a clause, if punctuation is being retained in the typed dependencies. Note that symbols tagged SYM are not punctuation and cannot be attached via the punct relation.

Go home !
punct(Go, !)

Tokens with the relation punct always attach to content words (except in cases of ellipsis) and can never have dependents. Since punct is not a normal dependency relation, the usual criteria for determining the head word do not apply. Instead, we use the following principles:

  1. A punctuation mark separating coordinated units is attached to the following conjunct.
  2. A punctuation mark preceding or following a dependent unit is attached to that unit.
  3. Within the relevant unit, a punctuation mark is attached at the highest possible node that preserves projectivity.
  4. Paired punctuation marks (e.g. quotes and brackets, sometimes also dashes, commas and other) should be attached to the same word unless that would create non-projectivity. This word is usually the head of the phrase enclosed in the paired punctuation.

We have apples , pears , oranges , and bananas .
obj(have, apples)
conj(apples, pears)
conj(apples, oranges)
conj(apples, bananas)
cc(bananas, and)
punct(pears, ,-4)
punct(oranges, ,-6)
punct(bananas, ,-8)
Der Mann , den Sie gestern kennengelernt haben , kam wieder .
punct(kennengelernt, ,-3)
punct(kennengelernt, ,-9)
punct(kam, .)
A.K.A. , AKA , or a/k/a may refer to : “ Also known as ” , used to introduce pseudonyms , aliases , etc. ( Compare f.k.a. for “ formerly known as ” . )
punct(AKA, ,-2)
punct(a/k/a, ,-4)
punct(refer, :)
punct(known-13, “-11)
punct(known-13, ”-15)
punct(used, ,-16)
punct(aliases, ,-21)
punct(etc., ,-23)
punct(Compare, (-25)
punct(Compare, )-35)
punct(known-31, “-29)
punct(known-31, ”-33)
punct(Compare, .-34)
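
Two of the constraints above lend themselves to a mechanical check: a punct token never has dependents, and paired marks should share a head. The following is a minimal sketch, assuming a tree is given as (id, form, head, deprel) tuples. The function name check_punct and the small table of pairs are illustrative assumptions, and the projectivity exception in principle 4 is not modeled.

```python
# Hypothetical checker; the tuple layout (id, form, head, deprel) is an assumption.
PAIRS = {"(": ")", "“": "”", "[": "]"}

def check_punct(tokens):
    """Return a list of human-readable violations of the punct constraints."""
    problems = []
    punct_ids = {tid for tid, _, _, rel in tokens if rel == "punct"}
    # Constraint: punct tokens can never have dependents.
    for tid, form, head, rel in tokens:
        if head in punct_ids:
            problems.append(f"token {tid} ({form}) is attached to a punct token")
    # Principle 4: paired marks should be attached to the same word.
    stack = []
    for tid, form, head, rel in tokens:
        if rel != "punct":
            continue
        if form in PAIRS:
            stack.append((form, head, tid))
        elif stack and form == PAIRS[stack[-1][0]]:
            _, open_head, open_id = stack.pop()
            if open_head != head:
                problems.append(f"paired marks {open_id} and {tid} have different heads")
    return problems

# "Washington ( CNN ) :" with both brackets attached to CNN.
good = [(1, "Washington", 0, "root"), (2, "(", 3, "punct"),
        (3, "CNN", 1, "parataxis"), (4, ")", 3, "punct"), (5, ":", 3, "punct")]
print(check_punct(good))
```

A tree in which the closing bracket is attached to Washington instead of CNN would be reported as a pair with different heads.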

See also examples at parataxis.


reparandum: overridden disfluency

We use reparandum to indicate disfluencies overridden in a speech repair. The disfluency is the dependent of the repair.

Go to the righ- to the left .
obl(Go-1, left-7)
reparandum(left-7, righ-)
case(righ-, to-2)
det(righ-, the-3)
case(left-7, to-5)
det(left-7, the-6)


root: root

The root grammatical relation points to the root of the sentence. A fake node ROOT is used as the governor. The ROOT node is indexed with 0, since the indexing of real words in the sentence starts at 1. (The ROOT node is not represented explicitly in CoNLL-U.)

ROOT I love French fries .
root(ROOT, love)

New from v2: There should be just one node with the root dependency relation in every tree. If the main predicate is not present (due to ellipsis) and there are multiple orphaned dependents, one of these is promoted to the head (root) position and the other orphans are attached to it. (This rule has in practice been followed since release v1.2 but was not explicitly stated in the original v1 guidelines.)

ROOT And Robert the fourth place .
root(ROOT, Robert)
cc(Robert, And)
orphan(Robert, place)
punct(Robert, .)
amod(place, fourth)
det(place, the)
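
The single-root requirement is easy to verify mechanically. The following is a minimal sketch, assuming a tree is given as (id, form, head, deprel) tuples with head 0 standing for the artificial ROOT node. The name has_single_root is hypothetical, and the sketch additionally assumes that the root relation and attachment to node 0 always go together.

```python
def has_single_root(tokens):
    """True if exactly one token has deprel 'root' and it is the only token with head 0."""
    roots = [tid for tid, _, _, rel in tokens if rel == "root"]
    zero_heads = [tid for tid, _, head, _ in tokens if head == 0]
    return len(roots) == 1 and roots == zero_heads

# "And Robert the fourth place ." from the example above.
tree = [
    (1, "And", 2, "cc"),
    (2, "Robert", 0, "root"),
    (3, "the", 5, "det"),
    (4, "fourth", 5, "amod"),
    (5, "place", 2, "orphan"),
    (6, ".", 2, "punct"),
]
print(has_single_root(tree))
```

If place were attached to node 0 as a second root instead of being an orphan of Robert, the check would fail.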


vocative: vocative

The vocative relation is used to mark a dialogue participant addressed in a text (common in conversations, dialogue, emails, newsgroup postings, etc.). The relation links the addressee’s name to its host sentence. A vocative commonly co-occurs with a null subject, as in the first example below. If the nominal is clearly vocative in intent, the preference is to use the vocative relation.

Guys , take it easy !
vocative(take, Guys)
Marie , comment vas - tu ?
vocative(vas, Marie)


xcomp: open clausal complement

An open clausal complement (xcomp) of a verb or an adjective is (i) a core argument of the verb, (ii) which is without its own subject and (iii) for which the reference of the subject is necessarily determined by an argument external to the xcomp. The third requirement is often referred to as obligatory control. An xcomp can also be described as a predicative complement. The subject of the xcomp is normally, but not always, controlled by the object of the next higher clause, if there is one, or else by the subject of the next higher clause. These clauses tend to be non-finite in many languages, but they can be finite as well. The name xcomp is borrowed from Lexical-Functional Grammar (see Joan Bresnan, 2001, Lexical-Functional Syntax, chapter on “Predication Relations”).

We expect them to change their minds
xcomp(expect, change)
obj(expect, them)
Sue asked George to respond to her offer
xcomp(asked, respond)
iobj(asked, George)
I started to work there yesterday
xcomp(started, work)
You look great
xcomp(look, great)
I consider him a fool
obj(consider, him)
xcomp(consider, fool)
Louise struck me as a fool
obj(struck, me)
case(fool, as)
xcomp(struck, fool)
I consider her honest
obj(consider, her)
xcomp(consider, honest)
I regard her as honest
obj(regard, her)
mark(honest, as)
xcomp(regard, honest)
We got COVID-19 under control
obj(got, COVID-19)
case(control, under)
xcomp(got, control)
Susan is liable to be arrested
cop(liable, is)
xcomp(liable, arrested)

The predicative complement can be headed by various parts of speech, including a VERB, ADJ, or NOUN. A nominal predicative complement can be marked by a preposition (in English, often but not always by as). The xcomp-taking predicate of the higher clause can be a VERB or ADJ.

Contrast xcomp with other complement clauses where there is an overt subject or no obligatory control, which use ccomp:

He says that you like to swim
ccomp(says, like)
I suggest eating now before the food gets cold
ccomp(suggest, eating)

The Inherited Subject Criterion

In examples like “I consider her honest”, the UD analysis corresponds to traditional grammar and what was termed “raising to object” in early generative grammar: the nominal “her” in these constructions is treated as the object of the higher clause (as its accusative morphology and ability to passivize suggest).

Note that the above condition “without its own subject” does not mean that a clause is an xcomp just because its subject is not overt. The subject must necessarily be inherited from a fixed position in the higher clause. That is, there should be no available interpretation where the subject of the lower clause may be distinct from the specified role of the upper clause. In cases where the missing subject may or must be distinct from a fixed role in the higher clause, ccomp should be used instead, as below. This includes cases of arbitrary subjects and anaphoric control. In the following examples, the subject of start or starting does not have to be the boss; it can be any contextually relevant person or group of people. In addition, in these cases, the complement clause can often be replaced by a pronoun like it or that, and it can sometimes be passivized (Starting the project was recommended by the boss).

The boss said to start the project
ccomp(said, start)
The boss recommended starting the project
ccomp(recommended, starting)

Pro-drop languages have clauses where the subject is not present as a separate word, yet it is inherently present (and often deducible from the form of the verb). The relation between clauses with pro-drop may or may not be xcomp. The implicit subjects of a subordinate clause and a higher clause may be coincidentally coreferent, warranting ccomp or advcl:

Píšu , protože jsem to slíbil . \n I-write , because I-have it promised .
advcl(Píšu, slíbil)
advcl(I-write, promised)
aux(slíbil, jsem)
aux(promised, I-have)
obj(slíbil, to)
obj(promised, it)
mark(slíbil, protože)
mark(promised, because)
Slíbil jsem , že budu psát . \n Promised I-have , that I-will write .
ccomp(Slíbil, psát)
ccomp(Promised, write)
aux(Slíbil, jsem)
aux(Promised, I-have)
aux(psát, budu)
aux(write, I-will)
mark(psát, že)
mark(write, that)

It is only xcomp if the implicit subject depends on an argument from a higher clause (one cannot be varied without the other):

Slíbil jsem psát . \n Promised I-have to-write .
xcomp(Slíbil, psát)
xcomp(Promised, to-write)
aux(Slíbil, jsem)
aux(Promised, I-have)

Secondary Predicates

The following is excerpted from a more detailed discussion of secondary predicates.

The xcomp relation is also used in constructions that are known as secondary predicates or predicatives. An example is She declared the cake beautiful.

We could paraphrase this sentence using a subordinate clause: She declared that the cake was beautiful. Two predications are combined in one clause: (1) she declared something, and (2) the cake was beautiful (in her opinion). The secondary predicate is attached to the main predicate as an xcomp:

She declared the cake beautiful .
nsubj(declared, She)
obj(declared, cake)
xcomp(declared, beautiful)

The subject of “beautiful” is again obligatorily controlled by a role in the higher clause (here, the object “cake”). In the enhanced representation, there is an additional subject link showing the secondary predication:

She declared the cake beautiful .
nsubj(declared, She)
obj(declared, cake)
xcomp(declared, beautiful)
nsubj(beautiful, cake)
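
The default control pattern described earlier (the object of the higher clause if there is one, otherwise its subject) can be sketched as a heuristic that proposes the extra subject edge. This is only an approximation: control is "normally, but not always" resolved this way, so the output should be treated as a candidate, not a final annotation. The function name propose_xcomp_subjects and the (relation, head, dependent) triple format are assumptions for illustration, and word forms stand in for token ids.

```python
def propose_xcomp_subjects(deps):
    """Given basic (rel, head, dep) triples, propose enhanced nsubj edges for xcomp dependents."""
    proposed = []
    for rel, head, dep in deps:
        if rel != "xcomp":
            continue
        # Prefer an object of the governing predicate, then fall back to its subject.
        objects = [d for r, h, d in deps if h == head and r in ("obj", "iobj")]
        subjects = [d for r, h, d in deps if h == head and r.split(":")[0] == "nsubj"]
        controller = (objects or subjects or [None])[0]
        if controller is not None:
            proposed.append(("nsubj", dep, controller))
    return proposed

basic = [
    ("nsubj", "declared", "She"),
    ("obj", "declared", "cake"),
    ("xcomp", "declared", "beautiful"),
]
print(propose_xcomp_subjects(basic))
```

For I started to work there yesterday, which has no object, the same function falls back to the subject and proposes nsubj(work, I).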

A Czech example:

jmenovat někoho generálem \n to-appoint someone as-a-general
obj(jmenovat, někoho)
xcomp(jmenovat, generálem)

Remember that xcomp is used for core arguments of predicates so it will not be used for non-core instances of secondary predication. For instance, in She entered the room sad we also have a double predication (she entered the room; she was sad). But sad is not a core argument of enter: leaving it out will neither affect grammaticality nor significantly alter the meaning of the verb. On the other hand, leaving out beautiful in she declared the cake beautiful will either render the sentence ungrammatical or lead to a different interpretation of declared.

The result is that in She entered the room sad, sad is considered a modifier (not complement) of the verb, with the relation advcl instead of xcomp. (This was changed from the previous approach which analyzed the secondary predication directly with acl, because the nominal predicand is not always overt, and even when it is, the adjective does not really belong to the same nominal phrase.)

She entered the room sad .
nsubj(entered, She)
det(room, the)
obj(entered, room)
advcl(entered, sad)
punct(entered, .)
Entering the room sad is not recommended .
csubj(recommended, Entering)
det(room, the)
obj(Entering, room)
advcl(Entering, sad)
cop(recommended, is)
advmod(recommended, not)
punct(recommended, .)

Notice that while can be inserted before sad, clearly marking it as a clause.

A Czech example:

Vstoupila do místnosti smutná . \n She-entered to room sad .
advcl(Vstoupila, smutná)
advcl(She-entered, sad)

There is no need to decide whether an example like the following is a depictive or a manner adverbial:

Linda found the money walking our dog .
nsubj(found, Linda)
det(money, the)
obj(found, money)
advcl(found, walking)
det(dog, our)
obj(walking, dog)
punct(found, .)

The optional secondary predication or controlled adjunct subject relation can be represented with an enhanced dependency edge in addition to the advcl relation.

Some other cases that could be regarded as secondary predicates are just treated as obliques. In particular, locative arguments of verbs are always treated as obliques:

She put a book on the table .
nsubj(put, She)
det(book, a)
obj(put, book)
case(table, on)
det(table, the)
obl(put, table)
punct(put, .)
