Dependencies
Note: nmod, neg, and punct appear in two places.
acl
: clausal modifier of noun (adjectival clause)
acl stands for finite and non-finite clauses that modify a nominal. The acl relation contrasts with the advcl relation, which is used for adverbial clauses that modify a predicate. The head of the acl relation is the noun that is modified, and the dependent is the head of the clause that modifies the noun.
In Portuguese, acl has three language-specific subtypes: acl:inf, acl:part, and acl:relcl.
Examples:
Uma defesa bem parecida com aquela de Lula, no começo das denúncias contra José Paulo Bisol.
acl(defesa, parecida)
O caso nasceu de uma "vendetta", obra de um certo Francesco Farina, dono do Modena, um time rebaixado à 3ª divisão.
acl(time, rebaixado)
A Liga de Assistência e Recuperação, órgão ligado à Prefeitura de Salvador, está desenvolvendo um projeto.
acl(órgão, ligado)
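The examples on this page write each dependency as relation(head, dependent). As a minimal sketch of how that notation could be read programmatically, the Python snippet below parses one such line. The names `Dep` and `parse_dep` are hypothetical, introduced here only for illustration; they are not part of any UD tool.

```python
import re
from typing import NamedTuple


class Dep(NamedTuple):
    """One dependency in the page's notation: rel(head, dependent)."""
    rel: str
    head: str
    dependent: str


# Relation label (may contain ':' for subtypes), then "(head, dependent)".
_DEP_RE = re.compile(r"^\s*([\w:]+)\s*\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$")


def parse_dep(line: str) -> Dep:
    """Parse a line such as 'acl(defesa, parecida)' into a Dep tuple."""
    match = _DEP_RE.match(line)
    if match is None:
        raise ValueError(f"not in rel(head, dependent) form: {line!r}")
    return Dep(*match.groups())


if __name__ == "__main__":
    dep = parse_dep("acl(time, rebaixado)")
    # The head is the modified noun; the dependent heads the modifying clause.
    print(dep.rel, dep.head, dep.dependent)
```

The first argument is always the governor, so for acl the modified noun comes first and the head of the modifying clause second.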
acl:inf
: acl:inf
This relation is a language-specific subtype of acl. There are also 2 other language-specific subtypes of acl: acl:part and acl:relcl.
acl:inf stands for infinitive clauses that modify a nominal.
Example:
Não tenho nada a perder.
acl:inf(nada,perder)
acl:part
: acl:part
This document is a placeholder for the language-specific documentation for acl:part.
acl:relcl
: acl:relcl
A relative clause modifier of a noun is a relative clause modifying the noun. The relation points from the noun that is modified to the head of the relative clause. Relative clauses are finite.
A seca que atingiu as áreas produtoras de grãos não deve causar grandes estragos.
acl:relcl(seca, atingiu)
advcl
: adverbial clause modifier
An adverbial clause modifier is a clause which modifies a verb or other predicate (adjective, etc.), as a modifier not as a core complement. This includes things such as a temporal clause, consequence, conditional clause, purpose clause, etc. The dependent must be clausal (or else it is an advmod) and the dependent is the main predicate of the clause.
Na Vila, quando recebo a bola, tenho que ficar olhando sua trajetória, para não ser surpreendido.
advcl(olhando, recebo) (temporal clause)
advcl(olhando, surpreendido) (purpose clause)
Eles tiveram que fazer isso se os escravos fossem superiores em qualidades que os próprios brancos valorizavam, onde estaria a justificativa moral para mantê-los escravizados?
advcl(tiveram, superiores) (conditional clause)
Note that in the example above, the advcl relation holds between an ADJ and a VERB: the head of the adverbial clause (“se os escravos fossem superiores em qualidades que os próprios brancos valorizavam”) is “superiores”, because the verb to be (“fossem”) holds a cop relation and therefore is not the head (the main predicate) of the clause.
advmod
: adverbial modifier
An adverbial modifier of a word is a (non-clausal) adverb or adverbial phrase that serves to modify the meaning of the word.
Note that in some grammatical traditions, the term adverbial modifier covers constituents that function like adverbs regardless of whether they are realized by adverbs, adpositional phrases, or nouns in particular morphological cases. We differentiate adverbials realized as adverbs (advmod) and adverbials realized by noun phrases or adpositional phrases (nmod).
Seca afeta pouco a produção de grãos.
advmod(afeta, pouco)
O fã daquela época vai ser fã sempre.
advmod(fã, sempre)
Talvez isto seja muito barulho por nada.
advmod(barulho, talvez)
Note that in the last example, the relation holds between the adverb (“talvez”) and the head of the main clause (“barulho”, since it is a copular clause).
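The split between advcl, advmod, and nmod depends only on how the adverbial is realized: as a clause, as an adverb, or as a noun phrase or adpositional phrase. A rough sketch of that decision in Python follows. The helper `adverbial_relation` and the set of tags treated as nominal are assumptions made for this example, not part of the guidelines.

```python
# Assumption for this sketch: these UPOS tags count as nominal heads.
NOMINAL_UPOS = {"NOUN", "PROPN", "PRON"}


def adverbial_relation(is_clause: bool, upos: str) -> str:
    """Pick the relation for an adverbial dependent.

    is_clause: True when the dependent is the main predicate of a clause.
    upos: universal POS tag of the dependent's head word.
    """
    if is_clause:
        return "advcl"   # clausal adverbial, e.g. "quando recebo a bola"
    if upos == "ADV":
        return "advmod"  # adverb, e.g. "pouco" in "afeta pouco"
    if upos in NOMINAL_UPOS:
        return "nmod"    # noun or adpositional phrase, e.g. "Na noite passada"
    raise ValueError(f"no adverbial relation chosen for UPOS {upos!r}")


if __name__ == "__main__":
    print(adverbial_relation(False, "ADV"))
    print(adverbial_relation(False, "NOUN"))
    print(adverbial_relation(True, "VERB"))
```

Checking the clausal case first mirrors the rule that a dependent must be clausal to be an advcl, whatever the part of speech of its head.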
amod
: adjectival modifier
An adjectival modifier of a noun is any adjectival phrase that serves to modify the meaning of the noun. This relation is universal.
Desde o último dia 13, «Confissões de Adolescente» pode ser vista pelos teens portugueses.
amod(dia, último)
"Câmera Manchete" é o nome do novo programa jornalístico.
amod(programa, jornalístico)
amod(programa, novo)
Na época, o então ministro da Fazenda, Fernando Henrique Cardoso, fez um pronunciamento em cadeia nacional.
amod(ministro, então)
amod(cadeia, nacional)
Note that in the last example, “então” behaves as an adjective (denotes “o atual ministro”).
Note that “Confissões de Adolescente” and “ministro da Fazenda” are multiword expressions; “de Adolescente” and “da Fazenda” are therefore part of the mwe token in the current version of the Portuguese corpus, so they do not hold the amod relation.
appos
: appositional modifier
An appositional modifier of a noun is a nominal immediately following the first noun that serves to define or modify that noun (aposto). It includes parenthesized examples, as well as defining abbreviations in one of these structures. This relation is universal.
O modelo Lx 810, da Epson, é vendido em Miami por US$ 178.
appos(modelo, Lx 810)
nmod(modelo, Epson)
O Applause, um sedã quatro portas, com motor 1.6, é o carro mais caro da Daihatsu.
appos(Applause, sedã)
O nome oficial do projeto é Depse 1 (Deep Space Program Science Experiment).
appos(Depse 1, Deep Space Program Science Experiment)
In case of more than one appositive nominal, all nouns should be marked as modifying the first noun, rather than being chained:
Para o terceiro réu, Alexandre Cardoso , 21 , o "Topeira" , o juiz determinou uma pena de 20 anos .
appos(réu, Alexandre Cardoso)
appos(réu, 21)
appos(réu, Topeira)
Note however that nested apposition cannot be completely excluded. It may occur in combination with coordination:
Você pode escolher entre quatro matérias : língua ( alemão ou francês) , economia , tecnologia e arte .
appos(matérias, língua)
conj(língua, economia)
conj(língua, tecnologia)
conj(língua, arte)
cc(língua, e)
appos(língua, alemão)
conj(alemão, francês)
cc(alemão, ou)
appos is also used to link key-value pairs in addresses, signatures, etc. (see also the list label):
Steve Jones Fone: 555-9814 Email: jones@abc.edf
appos(Fone, 555-9814)
appos(Email, jones@abc.edf)
aux
: auxiliary
An auxiliary of a clause is a non-main verb of the clause, e.g., a modal auxiliary, or a form of ser, fazer or ter in a periphrastic tense.
Note that in Portuguese, verbs such as “começar”, “acabar”, and “terminar” are analysed as aspectualizers of the main verb (following ROCIO, S. Começar e acabar: aspectualizadores em processo de gramaticalização), so they should be tagged as aux.
Exception: an auxiliary verb used to construct the passive voice is labeled auxpass, not aux.
músicos talvez tenham estudado um pouco demais
aux(estudado, tenham)
O mesmo não se pode dizer
aux(dizer, pode)
Acabou assinando com o Interscope
aux(assinando, acabou)
auxpass
: passive auxiliary
A passive auxiliary of a clause is a non-main verb of the clause which contains the passive information.
algo precisa ser feito
auxpass(feito, ser)
o Exército é chamado para ajudar
auxpass(chamado, é)
é chegado o momento
auxpass(chegado, é)
case
: case marking
The case relation is used for any case-marking element which is treated as a separate syntactic word (including prepositions, postpositions, and clitic case markers). Case-marking elements are treated as dependents of the noun or clause they attach to or introduce. The case relation aims at providing a more uniform analysis of nominal elements, prepositions and case in morphologically rich languages: a nominal in an oblique case will receive the same dependency structure as a nominal introduced by an adposition (“para ele” will have the same dependency relation as “lhe”, for example).
preços de os funileiros artesanais
det(funileiros, os)
case(funileiros, de)
moradores de renda baixa
case(renda, de)
Os policiais foram a uma casa, indicada por as duas mulheres, onde estariam outros integrantes de a quadrilha
det(casa, uma)
case(casa, a)
det(mulheres, as)
case(mulheres, por)
case(quadrilha, de)
det(quadrilha, a)
cc
: coordinating conjunction
For more on coordination, see the conj relation.
A cc is the relation between the first conjunct and the coordinating conjunction delimiting another conjunct. (Note: different dependency grammars have different treatments of coordination. We take the first conjunct as the head of the coordination.)
repúblicas gregas e romanas
cc(gregas, e)
conj(gregas, romanas)
Insana, mas saudável
cc(Insana, mas)
conj(Insana, saudável)
opção de retorno ou acesso à avenida Santo Amaro
cc(retorno, ou)
A coordinating conjunction may also appear at the beginning of a sentence. This is also called a cc, and it depends on the root predicate of the sentence.
E chegou lá.
cc(chegou, E)
Note that when many elements are coordinated, the cc relation holds between the first conjunct and the connective.
Aqui era o quarto pobre , simples , limpo e acolhedor.
amod(quarto, pobre)
conj(pobre, simples)
conj(pobre, limpo)
conj(pobre, acolhedor)
cc(pobre, e)
ccomp
: clausal complement
A clausal complement of a verb or adjective is a dependent clause which is a core argument. That is, it functions like an object of the verb, or adjective.
o PMDB vai procurar os presidentes
ccomp(vai, procurar)
Agora imagina que é chegado o momento.
ccomp(imagina, chegado)
Such clausal complements may be finite or nonfinite. However, if the subject of the clausal complement is controlled (that is, must be the same as the higher subject or object, with no other possible interpretation) the appropriate relation is xcomp.
The boss said to start digging
ccomp(said, start)
mark(start, to)
We started digging
xcomp(started, digging)
The key difference here is that, while it is possible to interpret the first sentence to mean that the boss will not be doing any digging, in the second sentence it is clear that the subject of digging can only be we. This is what distinguishes ccomp and xcomp.
Additionally, ccomp is used with copulas.
The important thing is to keep calm.
ccomp(is, keep)
The problem is that this has never been tried .
ccomp(is, tried)
(In these cases, the copula is treated as a head. This is a somewhat inconsistent and ugly feature of the current UD. An alternative solution was adopted for this case in the Turku TDT. It may be worth considering adopting it in a revision of UD.)
Note: In earlier versions of SD/USD, complement clauses with nouns like fact or report were also analyzed as ccomp. However, we now analyze them as acl. Hence, ccomp does not appear in nominals. This makes sense, since nominals normally do not take core arguments.
compound
: compound
compound is one of the three relations in UD for compounding.
It is used for:
- any kind of X0 compounding: noun compounds (e.g., linha vermelha), but also verb and adjective compounds that are more common in other languages (guarda-chuva, nova era)
nova era
compound(era, nova)
- numbers
a décima primeira rodada
compound(décima, primeira)
vendas de US$ 150 milhões
compound(milhões, 150)
The two other compounding relations are:
- mwe for fixed grammaticized expressions with function words
- name for proper nouns constituted of multiple nominal elements
conj
: conjunct
A conjunct is the relation between two elements connected by a coordinating conjunction, such as e, ou, mas, etc. We treat conjunctions asymmetrically: the head of the relation is the first conjunct and all the other conjuncts depend on it via the conj relation.
Palmeiras e Corinthians treinam hoje
conj(Palmeiras, Corinthians)
Aqui era o quarto pobre , simples , limpo e acolhedor.
amod(quarto, pobre)
conj(pobre, simples)
conj(pobre, limpo)
conj(pobre, acolhedor)
cc(pobre, e)
Coordinate clauses are treated the same way as coordination of other constituent types:
Ronaldo tomou conhecimento da pesquisa e procurou Belluzzo.
conj(tomou, procurou)
punct(tomou, .)
cc(tomou, e)
Coordination may be asyndetic, which means that the coordinating conjunction is omitted. Commas or other punctuation symbols will delimit the conjuncts in the typical case.
O tom familiar , coloquial , benigno de suas crônicas foi pouco a pouco vencendo as resistências do público .
conj(familiar, coloquial)
conj(familiar, benigno)
punct(familiar, ,-4)
punct(familiar, ,-6)
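The asymmetric treatment of coordination can be sketched in a few lines of Python: the first conjunct heads every conj relation, and the coordinating conjunction, when present, also attaches to the first conjunct with cc. The function name `coordinate` and the tuple layout (relation, head, dependent) are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence, Tuple

Relation = Tuple[str, str, str]  # (label, head, dependent)


def coordinate(conjuncts: Sequence[str],
               conjunction: Optional[str] = None) -> List[Relation]:
    """Build conj/cc relations headed by the first conjunct.

    Pass conjunction=None for asyndetic coordination, where the
    conjuncts are only delimited by punctuation.
    """
    if not conjuncts:
        raise ValueError("at least one conjunct is required")
    head, *rest = conjuncts
    relations = [("conj", head, other) for other in rest]
    if conjunction is not None:
        relations.append(("cc", head, conjunction))
    return relations


if __name__ == "__main__":
    # "pobre , simples , limpo e acolhedor"
    for rel in coordinate(["pobre", "simples", "limpo", "acolhedor"], "e"):
        print(rel)
    # "familiar , coloquial , benigno" (no conjunction)
    for rel in coordinate(["familiar", "coloquial", "benigno"]):
        print(rel)
```

Because every later conjunct points back to the first one, the structure stays flat no matter how many elements are coordinated.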
cop
: copula
A copula is the relation between the complement of a copular verb and the copular verb. In Portuguese, cop covers only the verbs ‘ser’ and ‘estar’. We normally take a copula as a dependent of its complement (predicativo do sujeito).
Nigéria é campeã
nsubj(campeã, Nigéria)
cop(campeã, é)
As 15 pessoas são membros da Peta
nsubj(membros, pessoas)
cop(membros, são)
The copula ser/estar is not treated as the head of a clause, but rather the dependent of a lexical predicate, as exemplified above.
Such an analysis is motivated by the fact that many languages often or always lack an overt copula in such constructions, as in Russian and Greek.
In informal Portuguese or specific textual genres (such as news headlines and conversation), this may also arise.
E-mail grátis se você tem wifi
nsubj(grátis, E-mail)
This analysis is adopted also when the predicate is a prepositional phrase, in which case the nominal part of the prepositional phrase is the head of the clause.
Susan está em forma
nsubj(forma, Susan)
cop(forma, está)
case(forma, em)
If the copula is accompanied by other verbal auxiliaries for tense, aspect, etc., then they are also given a flat structure, and taken as dependents of the lexical predicate:
a nossa opção estratégica tem sido a mais correcta
nsubj(correcta, opção)
cop(correcta, sido)
aux(correcta, tem)
The motivation for this choice is that this structure is parallel to the flat structure which we give to auxiliary verbs accompanying verbs.
In particular, in languages such as English and Portuguese, it is often very difficult to decide whether to regard a participle as a verb or an adjective.
Perhaps the following sentence is such a case:
os aparelhos estão a ser equipados com um sistema de iluminação
nsubj(equipados, aparelhos)
cop(equipados, estão)
aux(equipados, ser)
While a part of speech has to be decided in such cases, it would be unfortunate if the choice of part of speech also changed the dependency structure.
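The flat analysis of copular clauses can be illustrated with a small Python sketch: the lexical predicate is the head, and the subject, the copula, any preposition, and any further auxiliaries all attach directly to it. The helper `copular_clause` and its parameter names are hypothetical, chosen only for this example.

```python
from typing import Iterable, List, Optional, Tuple

Relation = Tuple[str, str, str]  # (label, head, dependent)


def copular_clause(predicate: str,
                   copula: str,
                   subject: Optional[str] = None,
                   case: Optional[str] = None,
                   auxiliaries: Iterable[str] = ()) -> List[Relation]:
    """Attach every element of a copular clause to the lexical predicate."""
    relations: List[Relation] = []
    if subject is not None:
        relations.append(("nsubj", predicate, subject))
    # The copula is a dependent of its complement, never the head.
    relations.append(("cop", predicate, copula))
    if case is not None:
        # Prepositional predicate: the nominal is still the head.
        relations.append(("case", predicate, case))
    # Other auxiliaries are siblings of the copula (flat structure).
    relations.extend(("aux", predicate, aux) for aux in auxiliaries)
    return relations


if __name__ == "__main__":
    print(copular_clause("campeã", "é", subject="Nigéria"))
    print(copular_clause("forma", "está", subject="Susan", case="em"))
    print(copular_clause("correcta", "sido", subject="opção", auxiliaries=["tem"]))
```

Since the structure does not change with the part of speech of the predicate, the same call covers adjectival, nominal, and participial predicates.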
Finally, ccomp is used with copulas. Only in this case is the structure different: we take the copula as the head:
O importante é manter a calma.
ccomp(é, manter)
nsubj(é, importante)
O problema é que ele nunca tentou.
ccomp(é, tentou)
nsubj(é, problema)
If we took the main verb as the head, it would have two subjects, which would be unworkable. Examples like the above could be analyzed reversed with the initial noun phrase as the predicate, but in addition to this seeming undesirable, it would fail to be a solution if there were a clause on both sides of be, such as in: Não tentar resolver um problema é reconhecer a derrota.
Note that it is possible to have cop constructions without a subject.
É muito engraçado .
cop(engraçado, é)
advmod(engraçado, muito)
punct(engraçado, .)
csubj
: clausal subject
A clausal subject is a clausal syntactic subject of a clause, i.e., the subject is itself a clause.
Ir a uma feira livre serve de aula para os principiantes.
csubj(serve, Ir)
The governor of this relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb. The dependent is the main lexical verb or other predicate of the subject clause.
É um equívoco reduzir a liberalização comercial ao neoliberalismo.
csubj(equívoco, reduzir)
é necessário alertar toda a sociedade.
csubj(necessário, alertar)
csubjpass
: clausal passive subject
A clausal passive subject is a clausal syntactic subject of a passive clause (or more generally, any voice where the proto-agent argument does not become the subject of the clause). In the example below, Que ela mentiu is the subject.
Que ela mentiu foi suspeito por todos.
csubjpass(suspeito, mentiu)
dep
: unspecified dependency
A dependency is labeled as dep when a system is unable to determine a more precise dependency relation between two words. This may be because of a weird grammatical construction, a limitation in software (e.g. the Stanford Dependency conversion), a parser error, or because of an unresolved long distance dependency.
Fica assim totalmente sinalizado o percurso
dep(Fica, sinalizado)
In this example dep is used due to the ambiguity of sinalizado between adjective and verb participle.
det
: determiner
The relation determiner (det) holds between a nominal head and its determiner. Most commonly, a word of POS DET will have the relation det and vice versa. It is a universal relation.
Meu passado muito me orgulha
det(passado, meu)
Meu intuito é tentar entender que papel teve a imprensa nessa história.
det(intuito, meu)
det(imprensa, a)
det(história, essa)
Outros três suspeitos estão foragidos.
det(suspeitos, outros)
det:poss
: det:poss
This document is a placeholder for the language-specific documentation for det:poss.
discourse
: discourse element
This is used for interjections and other discourse particles and elements (which are not clearly linked to the structure of the sentence, except in an expressive way). We generally follow the guidelines of what the Penn Treebanks count as an INTJ. They define this to include: interjections (claro, não, pronto), fillers (hum, hahaha), and discourse markers (bem, na verdade, but not você sabe).
ROOT Não , não e não
root(ROOT, Não-2)
discourse(Não-2, não-4)
discourse(Não-2, não-6)
cc(Não-2, e)
dislocated
: dislocated elements
The dislocated relation is used for fronted or postposed elements that do not fulfill the usual core grammatical relations of a sentence. These elements often appear to be in the periphery of the sentence, and may be separated off with a comma intonation.
It is used for fronted elements that introduce the topic of a sentence. The dislocated element attaches to the head of the clause to which it belongs:
Quantos artistas , quer estrangeiros quer nacionais , a festa de toiros tem motivado.
dislocated(motivado, artistas)
However, it would not be used for a topic-marked noun that is also the subject of the sentence; this would be an nsubj.
It is also used for postposed elements. The dislocated elements attach to the same governor as the dependent that they double for. Right dislocated elements are frequent in spoken languages.
O fado , esse , ficou aquém.
dislocated(ficou, esse)
dobj
: direct object
The direct object of a verb is the second most core argument of a verb after the subject. Typically, it is the noun phrase that denotes the entity acted upon or which undergoes a change of state or motion (the proto-patient).
Euller fez mais duas jogadas .
dobj(fez, jogadas)
In general, if there is just one object, it should be labeled dobj, regardless of the morphological case or semantic role that it bears. If there are two or more objects, one of them should be dobj and the others should be iobj. In such cases it is necessary to decide what is the most directly affected object (patient). The one exception is when there is a clausal complement. Then the clausal complement is regarded as a “clausal direct object” and an object nominal will be an iobj.
não te dizem nada
iobj(dizem, te)
dobj(dizem, nada)
Note that oblique pronouns are tagged as iobj.
faltou lhes inteligência
nsubj(faltou, inteligência)
iobj(faltou, lhes)
expl
: expletive
There is no expl in Portuguese.
This relation captures expletive or pleonastic nominals. These are nominals that appear in an argument position of a predicate but which do not themselves satisfy any of the semantic roles of the predicate. The main predicate of the clause (the verb or predicate adjective or noun) is the governor. In English, this is the case for some uses of it and there: the existential there, and it when used in extraposition constructions. (Note that both it and there also have non-expletive uses.)
There is a ghost in the room
expl(is, There)
It is clear that we should decline .
expl(clear, It)
Some languages, such as Portuguese, do not have expletives of the English sort; this includes most languages with free pro-drop (the ability to use zero anaphora rather than overt pronouns). In languages that do have expletives of this sort, they can be positioned where a core argument normally appears: the subject and direct object (and even indirect object) slots. In the Portuguese examples below, existential haver takes no expletive, and the nominal is annotated as dobj.
ROOT há um processo de conglomerização de empresas
root(ROOT, há)
dobj(há, processo)
Caso não haja fila , o período de uso pode ser maior .
mark(haja, caso)
dobj(haja, fila)
neg(haja, não)
advcl(maior, haja)
nsubj(maior, período)
foreign
: foreign words
We use foreign to label sequences of foreign words. These are given a linear analysis: the head is the first token in the foreign phrase.
foreign does not apply to loanwords or to foreign names. It applies to quoted foreign text incorporated in a sentence/discourse of the host language (unless we want to and know how to annotate the internal structure according to the syntax of the foreign language).
Eu acho que c' est la vie
nsubj(acho-2, Eu-1)
ccomp(acho-2, c'-4)
mark(c'-4, que-3)
foreign(c'-4, est-5)
foreign(c'-4, la-6)
foreign(c'-4, vie-7)
goeswith
: goes with
This relation links two parts of a word that are separated in text that is not well edited. The head is in some sense the “main” part, often the second part.
Computa dor é um aparelho .
goeswith(Computa, dor)
iobj
: indirect object
The indirect object of a verb is any nominal phrase that is a core argument of the verb but is not its subject or direct object. The prototypical example is the recipient of ditransitive verbs of exchange:
Ela me deu um aumento
iobj(deu, me)
In general, if there is just one object, it should be labeled dobj, regardless of the morphological case or semantic role. For example, in Portuguese, ensinar can take either the subject matter or the recipient as the only object, and in both cases it would be analyzed as the dobj:
Ela ensina lógica
dobj(ensina, lógica)
Ela ensina os alunos do primeiro ano
dobj(ensina, alunos)
Ela ensina lógica a os alunos do primeiro ano
dobj(ensina, lógica)
iobj(ensina, alunos)
This is consistent with the analysis of Huddleston and Pullum (2002) “The Cambridge Grammar of the English Language”, chapter 4 section 4 (p. 251). As they note, it is no different to the same semantic role being sometimes the subject and sometimes the object in intransitive/transitive alternations. The one exception is when there is a clausal complement. Then the clausal complement is regarded as a “clausal direct object” and an object nominal will be an iobj, parallel to the simple ditransitive case:
Ela disse a os estudantes que eles precisam estudar esta noite
iobj(disse, estudantes)
ccomp(disse, precisam)
Ela disse o plano a os estudantes
iobj(disse, estudantes)
dobj(disse, plano)
If there are two or more objects, one of them should be dobj and the others should be iobj. In such cases it is necessary to decide what is the most directly affected object (patient). In Portuguese, an iobj usually comes with a preposition (a, de, em) or in oblique case (me, se, lhe).
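The rules for choosing between dobj and iobj can be summarized as a short decision procedure. The Python sketch below is one possible reading of them; the function `label_objects` and its parameters are hypothetical, and deciding which object is the patient is left to the annotator and passed in as an argument.

```python
from typing import Dict, Optional, Sequence


def label_objects(objects: Sequence[str],
                  patient: Optional[str] = None,
                  has_ccomp: bool = False) -> Dict[str, str]:
    """Label the nominal objects of one verb as dobj or iobj.

    objects: head words of the nominal objects.
    patient: the most directly affected object (needed when there are several).
    has_ccomp: True when the verb also takes a clausal complement.
    """
    if has_ccomp:
        # The clause counts as the "clausal direct object".
        return {obj: "iobj" for obj in objects}
    if len(objects) == 1:
        # A single object is dobj whatever its case or semantic role.
        return {objects[0]: "dobj"}
    if objects and patient not in objects:
        raise ValueError("with several objects, name the patient among them")
    return {obj: ("dobj" if obj == patient else "iobj") for obj in objects}


if __name__ == "__main__":
    print(label_objects(["lógica"]))                              # Ela ensina lógica
    print(label_objects(["alunos"]))                              # Ela ensina os alunos
    print(label_objects(["lógica", "alunos"], patient="lógica"))  # both objects
    print(label_objects(["estudantes"], has_ccomp=True))          # disse ... que ...
```

The clausal-complement check comes first because it overrides the single-object rule: in "Ela disse a os estudantes que eles precisam estudar", the only nominal object is still an iobj.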
list
: list
The list relation is used for chains of comparable items. In lists with more than two items, all items of the list should modify the first one. Informal and web text often contains passages which are meant to be interpreted as lists but are parsed as single sentences. Email signatures often contain these structures, in the form of contact information: the different contact information items are labeled as list; the key-value pair relations are labeled as appos.
Steve Jones Phone: 555-9814 Email: jones@abc.edf
name(Steve-1, Jones-2)
list(Steve-1, Phone:-3)
list(Steve-1, Email:-5)
appos(Phone:-3, 555-9814-4)
appos(Email:-5, jones@abc.edf-6)
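The signature analysis above combines two relations: each contact item attaches to the first token with list, and each value attaches to its key with appos. A minimal Python sketch of that pattern follows; `signature_relations` is a hypothetical helper, and it leaves out the name relation inside the person's name.

```python
from typing import List, Sequence, Tuple

Relation = Tuple[str, str, str]  # (label, head, dependent)


def signature_relations(head: str,
                        pairs: Sequence[Tuple[str, str]]) -> List[Relation]:
    """Build list/appos relations for key-value items in a signature.

    head: first token of the signature (all list items modify it).
    pairs: (key, value) tuples such as ("Phone:", "555-9814").
    """
    relations: List[Relation] = []
    for key, value in pairs:
        relations.append(("list", head, key))    # item -> first token
        relations.append(("appos", key, value))  # value -> its key
    return relations


if __name__ == "__main__":
    pairs = [("Phone:", "555-9814"), ("Email:", "jones@abc.edf")]
    for rel in signature_relations("Steve", pairs):
        print(rel)
```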
Another place where list has been used is for a sequence of attributes or descriptive terms used as the title line of a review (such as product or restaurant reviews):
Long Lines , Silly Rules , Rude Staff , Ok Food
list(Lines, Rules)
list(Lines, Staff)
list(Lines, Food)
However, list should not be over-used. If a construction can be easily analyzed using the grammatical relations of standard sentences, such as when there is overt coordination, then it should be analyzed with these more standard relations, even if it is laid out as a list typographically.
mark
: marker
A marker is the word introducing a finite clause subordinate to another clause. For a complement clause, this is words like que or se. For an adverbial clause, the marker is typically a subordinating conjunction like enquanto or embora. The mark is a dependent of the subordinate clause head. In a relative clause, it is a normally uninflected word, which simply introduces a relative clause, such as que. In this last use, one needs to distinguish between relative clause markers, which are mark, and relative pronouns, which fill a regular verbal argument or modifier grammatical relation.
Era de facto por ali que começava a surgir perigo
mark(surgir, que)
Note that this relation holds between the marker and the head of the subordinate clause, which can be a non verbal element.
Sugere ainda que seja elaborada uma circular
mark(elaborada, que)
mwe
: multi-word expression
The multi-word expression (modifier) relation is one of the three relations (compound, mwe, name) for compounding.
It is used for certain fixed grammaticized expressions that behave like function words or short adverbials.
The scope of mwe annotation corresponds roughly to the fixed expressions category of Sag et al., but excludes any relations in scope of name or compound. Additionally, limited morphosyntactic variation may be allowed for MWEs in exceptional cases.
fluido está para vítreo assim como viscoso está para translúcido
mwe(assim, como)
o que é mais 48,31 por cento
mwe(o, que)
mwe(por, cento)
todos os candidatos recebem os dois pontos
mwe(todos, os)
det(candidatos, todos)
Multiword expressions are annotated in a flat, head-initial structure, in which all words in the expression modify the first one using the mwe label.
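A flat, head-initial structure is straightforward to generate: the first word is the head and every later word depends on it with the same label. The Python sketch below shows this for mwe; the function name `flat_head_initial` is an assumption made for illustration, and the label is a parameter so the same shape can be reused for other flat relations.

```python
from typing import List, Sequence, Tuple

Relation = Tuple[str, str, str]  # (label, head, dependent)


def flat_head_initial(label: str, words: Sequence[str]) -> List[Relation]:
    """Attach every non-initial word to the first word with `label`."""
    if len(words) < 2:
        raise ValueError("a multiword expression needs at least two words")
    head = words[0]
    return [(label, head, word) for word in words[1:]]


if __name__ == "__main__":
    print(flat_head_initial("mwe", ["assim", "como"]))
    print(flat_head_initial("mwe", ["por", "cento"]))
```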
name
: name
name is one of the three relations for compounding in UD (together with compound and mwe).
It is used for proper nouns constituted of multiple nominal elements. For example, name would be used between the words of Hillary Clinton, Rio de Janeiro, or Dom Pedro I, but not to replace the usual relations in a phrasal or clausal name like O rei da Suécia or the novel O senhor dos anéis.
Words joined by name should all be part of a minimal noun phrase; otherwise regular syntactic relations should be used. This is basically similar to the treatment of noun compounds with compound, except that in many cases parts of the name may be another nominal element such as an adjective (Páginas Amarelas).
In general, names are annotated in a flat, head-initial structure, in which all words in the name modify the first one using the name label.
Dom Pedro I
name(Dom-1, I-3)
name(Dom-1, Pedro-2)
For organization names with clear syntactic modification structure, the dependencies should reflect the syntactic modification structure using regular syntactic relations, as in:
Procuradoria Geral da República
amod(Procuradoria, Geral)
compound(Procuradoria, República)
In addition, regular syntactic relations are used: (i) for a modifying determiner or (ii) to connect together the words of a description or name which involve embedded prepositional phrases, sentences, etc.
os Estados Unidos
det(Estados, os)
Miguel de Cervantes
name(Miguel, Cervantes)
case(Cervantes, de)
O rei de a Suécia
det(rei-2, O-1)
nmod(rei-2, Suécia-5)
case(Suécia-5, de-3)
Rio de Janeiro
case(Janeiro, de)
nmod(Rio, Janeiro)
In the case of proper entities named after people, e.g. Fundação Getúlio Vargas, the name relation should only be used inside the person name, with the rest of the construction analyzed compositionally using normal syntactic relations:
Fundação Getúlio Vargas
compound(Fundação, Getúlio)
name(Getúlio, Vargas)
neg
: negation modifier
The negation modifier is the relation between a negation word and the word it modifies. This relation is universal.
Modifiers labeled neg depend either on a noun (group “noun dependents”) or on a predicate (group “non-core dependents of clausal predicates”).
O Emeraude não carrega armamento nuclear .
neg(carrega, não)
nmod
: nominal modifier
The nmod relation is used for nominal modifiers. They depend either on another noun (group “noun dependents”) or on a predicate (group “non-core dependents of clausal predicates”).
nmod is a noun (or noun phrase) functioning as a non-core (oblique) argument or adjunct.
This means that it functionally corresponds to an adverbial when it attaches to a verb, adjective or other adverb.
But when attaching to a noun, it corresponds to an attribute, or genitive complement (the terms are less standardized here).
We differentiate adverbials realized by noun phrases or adpositional phrases (nmod, as in “Na noite passada, eu nadei”) from adverbials realized as adverbs (advmod, as in “Ontem, eu nadei”).
"PT no governo"
nmod(PT, governo)
"«Confissões» chega a Portugal"
nmod(chega, Portugal)
nsubj
: nominal subject
A nominal subject (nsubj) is a nominal which is the syntactic subject and the proto-agent of a clause. That is, it is in the position that passes typical grammatical tests for subjecthood, and this argument is the more agentive, the do-er, or the proto-agent of the clause. (See csubj for when the subject is clausal. See nsubjpass and csubjpass for when the subject is not the proto-agent argument due to valence changing operations.) This nominal may be headed by a noun, or it may be a pronoun or relative pronoun, or in ellipsis contexts, other things such as an adjective.
Euller fez mais duas jogadas .
nsubj(fez, Euller)
The governor of the nsubj relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb, which can be an adjective or noun, including a noun marked by a preposition, as in the examples below.
o tom já é outro .
nsubj(outro, tom)
Note that when the subject appears after the verb, it is still tagged as nsubj.
É um povo apaixonado , o povo basco .
nsubj(povo-3, povo-7)
Note that complex subjects are treated as conjunctions and only the first element of the conjunction holds the nsubj relation.
O Grupo Champalimaud, a Petrogal, a TAP, a Marconi, são algumas das que mais investiram.
nsubj(algumas, Grupo)
conj(Grupo, Petrogal)
conj(Grupo, TAP)
conj(Grupo, Marconi)
nsubjpass
: passive nominal subject
A passive nominal subject is a noun phrase which is the syntactic subject of a passive clause (or more generally, any voice where the proto-agent argument does not become the subject of the clause).
Peças podem ser encontradas em leilão
nsubjpass(encontradas, Peças)
nummod
: numeric modifier
A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun with a quantity.
a realização de mais 30 episódios
nummod(episódios, 30)
dois árbitros resolveram contar todos os podres
nummod(árbitros, dois)
ele aluga o imóvel por US$ 1.000
nummod(US$, 1.000)
Note that indefinite quantifiers such as poucos, muitos are tagged DET rather than NUM. Therefore their relation to the quantified noun is not nummod but det:
Há muitos servidores da Internet
det(servidores, muitos)
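Because the choice between nummod and det follows the part-of-speech tag of the quantifying word, it can be expressed as a simple lookup. The Python sketch below is an illustration only; the helper `quantifier_relation` is hypothetical, and it covers just the two tags discussed in this section.

```python
# Mapping from the UPOS tag of the quantifying word to its relation.
_QUANTIFIER_RELATION = {
    "NUM": "nummod",  # definite numbers: 30, dois, 1.000
    "DET": "det",     # indefinite quantifiers: poucos, muitos
}


def quantifier_relation(upos: str) -> str:
    """Return the relation between a quantifying word and its noun."""
    try:
        return _QUANTIFIER_RELATION[upos]
    except KeyError:
        raise ValueError(f"not a quantifier tag handled here: {upos!r}") from None


if __name__ == "__main__":
    print(quantifier_relation("NUM"))  # e.g. "dois árbitros"
    print(quantifier_relation("DET"))  # e.g. "muitos servidores"
```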
parataxis
: parataxis
The parataxis relation (from Greek for “place side by side”) is a relation between a word (often the main predicate of a sentence) and other elements, such as a sentential parenthetical or a clause after a “:” or a “;”, placed side by side without any explicit coordination, subordination, or argument relation with the head word. Parataxis is a discourse-like equivalent of coordination, and so usually obeys an iconic ordering. Hence it is normal for the first part of a sentence to be the head and the second part to be the parataxis dependent, regardless of the headedness properties of the language. But things do get more complicated, such as cases of parentheticals, which appear medially.
A cama não era um leito de enferma, era um trono de rainha.
parataxis(leito, trono)
Folha -- O sr. acredita ter influenciado estes filmes ?
parataxis(Folha, acredita)
An inventory of constructions to which parataxis has been applied
Side-by-side sentences (“run-on sentences”)
The relation parataxis is used for a pair of what could have been standalone sentences, but which are being treated together as a single sentence. This may happen because sentence segmentation was done primarily on the basis of sentence-final punctuation, and these clauses are joined by punctuation such as a colon or comma, or not delimited by punctuation at all. In a spoken corpus, it may happen because what is labeled as a sentence is more commonly an utterance turn. Even if the treebanker is doing the sentence division, it may happen because there seems to be a clear discourse relation linking two clauses. Sometimes there are more than two sentences joined in this way. In this case we make all the later sentences dependents of the first one, to maximize similarity to the analysis used for conjunction.
Fácil subir nos palanques , defendendo que é necessário « mudar tudo o que está aí .
parataxis(Fácil, defendendo)
This relation may happen with units that are smaller than sentences:
Mundo dividido a CIA
amod(Mundo, dividido)
parataxis(Mundo, CIA)
det(CIA, a)
Treatment of reported speech
For this reported speech example:
Encontramos um homem-gol de que sentíamos falta há muitos anos , disse Van Himst.
parataxis(Encontramos, disse)
There are paraphrases that convey essentially the same meaning but with a different syntactic structure. When the reported speech is embedded in a subordinate clause (with or without an overt complementizer que; the latter is infrequent in Portuguese but common in many languages, such as English), the subordinate clause is a ccomp of the speech verb.
Van Himst disse que eles encontraram um homem-gol .
ccomp(disse, encontraram)
When the reported speech follows the speech verb and is separated by a colon (:), the reported speech forms a main clause that attaches to the preceding main clause with a parataxis relation, hence with the speech verb as its head.
Van Himst disse : Eles encontraram um homem - gol .
parataxis(disse, encontraram)
However, when the speech verb occurs as a medial or final parenthetical, the relation is reversed and the speech verb is treated as a parataxis of the reported speech.
Eles encontraram um homem - gol, disse Van Himst.
parataxis(encontraram, disse)
This analysis is not uncontroversial but follows many authorities, such as Huddleston and Pullum (2002), The Cambridge Grammar of the English Language (see chapter 11, section 9).
An argument for this analysis is that in the cases analyzed as embedding, the entire clause can be further embedded (Eu estava triste quando o Van Himst disse que eles encontraram um homem-gol.), while this is not possible with medial or final placement of the speech verb (*Eu estava triste quando encontraram um homem-gol, disse Van Himst.). Note that this sentence is not ungrammatical, but it has a different meaning.
News article bylines
We have used the parataxis relation to connect the parts of a news article byline. There does not seem to be a better relation to use.
Lisboa ( Público ) :
parataxis(Lisboa, Público)
Interjected clauses
Single word or phrase interjections are analyzed as discourse, but when a whole clause is interjected, we use the relation parataxis.
Calafia tem ótimas batatas ( a gente morre por isto ! )
parataxis(tem, morre)
Só para confirmar João conclui a reversa do dia três.
parataxis(confirmar, conclui)
In the second example, we treat the second half as the head of the dependency because the first half feels like a whole clause interjection, not like the main clause of the utterance.
Tag questions
We also use the parataxis relation for tag questions such as isn’t it? or haven’t you?.
Isso é para mim , não é ?
parataxis(mim-4, é-7)
punct
: punctuation
This is used for any piece of punctuation in a clause, if punctuation is being retained in the typed dependencies. This relation is universal.
Tem sentido -- aliás, muitíssimo sentido.
punct(Tem, --)
punct(sentido, ,)
punct(Tem, .)
Tokens with the relation punct always attach to content words (except in cases of ellipsis) and can never have dependents.
Since punct is not a normal dependency relation, the usual criteria for determining the head word do not apply.
Instead, we use the following principles:
- A punctuation mark separating coordinated units is attached to the first conjunct.
- A punctuation mark preceding or following a subordinated unit is attached to this unit.
- Within the relevant unit, a punctuation mark is attached at the highest possible node that preserves projectivity.
- Paired punctuation marks (quotes and brackets) should be attached to the same word unless that would create non-projectivity. This word is usually the head of the phrase enclosed in the paired punctuation.
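One constraint stated above, that a punct token can never have dependents, is easy to check mechanically. The following is a minimal sketch of such a check. The token table is hypothetical: the vocative link comes from an example on this page, but the two punct attachments are assumptions chosen only to illustrate the check.

```python
# Each token: (id, form, head, deprel). The punct attachments below are an
# illustrative assumption, not taken from the treebank.
tokens = [
    (1, "Senhor", 3, "vocative"),
    (2, ",", 1, "punct"),
    (3, "fale", 0, "root"),
    (4, "!", 3, "punct"),
]


def punct_with_dependents(tokens):
    """Return the ids of punct tokens that govern some other token."""
    punct_ids = {tid for tid, _, _, rel in tokens if rel == "punct"}
    return sorted({head for _, _, head, _ in tokens if head in punct_ids})


print(punct_with_dependents(tokens))
```

An empty result means no punctuation token is used as a head. The projectivity principles in the list would need a separate check and are not covered here.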
remnant
: remnant in ellipsis
The remnant relation is used to provide a satisfactory treatment of ellipsis (in the case of gapping and stripping, where a predicational or verbal head gets elided). This was lacking in earlier versions of Stanford Dependencies (SD), and it provides a basis for reconstructing dependencies in the enhanced representation of UD. In particular, the goal was to achieve this without having to postulate empty nodes in the basic representation.
To motivate the analysis, consider first a sentence without ellipsis:
Maria foi para Paris e Miriam foi para Praga
nsubj(foi-2, Maria-1)
root(root-0, foi-2)
nmod(foi-2, Paris-4)
case(Paris-4, para-3)
cc(foi-2, e-5)
nsubj(foi-7, Miriam-6)
conj(foi-2, foi-7)
case(Praga-9, para-8)
nmod(foi-7, Praga-9)
The question is then how to treat the sentence “Maria foi para Paris e Miriam para Praga”, in which the second verb is elided:
Maria foi para Paris e Miriam para Praga
nsubj(foi-2, Maria-1)
root(root-0, foi-2)
nmod(foi-2, Paris-4)
case(Paris-4, para-3)
cc(foi-2, e-5)
case(Praga-8, para-7)
One option would be to pretend that there is an empty verb and to have the final elements be dependents of it: Maria foi para Paris e Miriam ∅ para Praga. This analysis has some appeal but also has some problems and at any rate stops the basic dependency graph from being simply a tree of dependencies over the words of a sentence. A second option is to simply promote the final elements and to have them as dependents of the main verb of the sentence (foi-2) or of root-0. But then (in general) one loses the ability to successfully reconstruct the correct predicate-argument structure of the sentence from the basic dependency representation.
Therefore, UD adopts an analysis that notes that in ellipsis a remnant corresponds to a correlate in a preceding clause. The remnant relation connects each remnant to its correlate in the basic dependency representation. This is then a sufficient representation to reconstruct the predicate-argument structure in the enhanced representation. So, for this example, we have:
Maria foi para Paris e Miriam para Praga
nsubj(foi-2, Maria-1)
root(root-0, foi-2)
nmod(foi-2, Paris-4)
case(Paris-4, para-3)
cc(foi-2, e-5)
case(Praga-8, para-7)
remnant(Maria-1, Miriam-6)
remnant(Paris-4, Praga-8)
Even in the more complex example below, the remnant
relations enable us to correctly retrieve the subjects and objects in
the clauses with an elided verb.
João ganhou bronze , Maria prata , e Sandy ouro
nsubj(ganhou-2, João-1)
dobj(ganhou-2, bronze-3)
remnant(João-1, Maria-5)
remnant(Maria-5, Sandy-9)
remnant(bronze-3, prata-6)
remnant(prata-6, ouro-10)
Note in particular that (unlike for conj), remnant uses a chaining analysis where each subsequent remnant depends on the immediately preceding remnant/correlate. The reason for this is that otherwise in a sentence with 2 or more chained ellipses the dependency structure would no longer track which remnants go together. It would become impossible to determine whether Maria ganhou prata e Sandy ouro, or Maria ganhou ouro e Sandy prata.
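To make the chaining analysis concrete, here is a minimal sketch of how the remnant chains could be used to regroup remnants into their elided clauses. It walks each chain back to the overt argument in the complete clause, copies that argument's relation, and uses the chain depth to decide which remnants belong together. The function name reconstruct and the triple format are assumptions for illustration, and the sketch only handles the case where the complete clause comes first.

```python
# (relation, head, dependent) triples for:
# João ganhou bronze , Maria prata , e Sandy ouro
deps = [
    ("nsubj", "ganhou-2", "João-1"),
    ("dobj", "ganhou-2", "bronze-3"),
    ("remnant", "João-1", "Maria-5"),
    ("remnant", "Maria-5", "Sandy-9"),
    ("remnant", "bronze-3", "prata-6"),
    ("remnant", "prata-6", "ouro-10"),
]


def reconstruct(deps):
    """Group remnants into clauses by their distance from the overt correlate."""
    # Map each remnant to the token it is attached to.
    correlate_of = {d: h for rel, h, d in deps if rel == "remnant"}
    # Map each overt argument to its (relation, predicate).
    role_of = {d: (rel, h) for rel, h, d in deps if rel != "remnant"}
    clauses = {}
    for remnant in correlate_of:
        depth, node = 0, remnant
        # Walk the chain back to the overt argument in the complete clause.
        while node in correlate_of:
            node = correlate_of[node]
            depth += 1
        rel, predicate = role_of[node]
        clauses.setdefault(depth, []).append((rel, predicate, remnant))
    return clauses


for depth, args in sorted(reconstruct(deps).items()):
    print(depth, args)
```

Remnants at the same depth end up in the same reconstructed clause, which is the information that would be lost if every remnant were attached directly to the first correlate.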
It is also possible that the incomplete part precedes the complete one in the sentence:
78 % para Bush e 4 % para o discurso do Clinton
remnant(%-7, %-2)
remnant(Clinton, Bush)
The remnant relation is used when no predicational material is present. In contrast, in right-node-raising (RNR) and VP-ellipsis constructions in which some kind of predicational or verbal material is still present, the remnant relation is not used. In RNR, the verbs are coordinated and the object is a dobj of the first verb:
João comprou e comeu uma maçã
nsubj(comprou-2, João-1)
cc(comprou-2, e-3)
conj(comprou-2, comeu-4)
det(maçã-6, uma-5)
dobj(comprou-2, maçã-6)
In VP-ellipsis, we keep the auxiliary as the head, as shown below:
João vai ganhar ouro e Maria vai também
nsubj(ganhar-3, João-1)
aux(ganhar-3, vai-2)
dobj(ganhar-3, ouro-4)
cc(ganhar-3, e-5)
conj(ganhar-3, vai-7)
nsubj(vai-7, Maria-6)
advmod(vai-7, também-8)
reparandum
: overridden disfluency
We use reparandum to indicate disfluencies overridden in a speech repair. The disfluency is the dependent of the repair.
Vá para a direi- para a esquerda .
nmod(Vá, esquerda)
reparandum(esquerda, direi-)
case(direi-, para-2)
det(direi-, a-3)
case(esquerda, para-5)
det(esquerda, a-6)
root
: root
The root grammatical relation points to the root of the sentence. A fake node ROOT is used as the governor. The ROOT node is indexed with 0, since the indexing of real words in the sentence starts at 1. This relation is universal.
Eu quero viver !
root(ROOT, quero)
There should be just one node with the root dependency relation in every tree. If the main predicate is not present (due to ellipsis) and there are multiple orphaned dependents, the leftmost dependent should be promoted to the head (root) position and the other orphans should be attached to it.
ROOT PT no governo .
root(ROOT, PT)
case(governo, em)
nmod(PT, governo)
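The requirement that every tree has exactly one root can be verified with a simple count. The sketch below does this over the triples of the example above; the function name has_single_root is hypothetical.

```python
# (relation, head, dependent) triples from the example above.
deps = [
    ("root", "ROOT", "PT"),
    ("case", "governo", "em"),
    ("nmod", "PT", "governo"),
]


def has_single_root(deps):
    """Check that exactly one dependent carries the root relation."""
    roots = [d for rel, h, d in deps if rel == "root"]
    return len(roots) == 1


print(has_single_root(deps))
```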
vocative
: vocative
The vocative relation is used to mark a dialogue participant addressed in a text (common in conversations, dialogue, emails, newsgroup postings, etc.). The relation links the addressee’s name to its host sentence. A vocative commonly co-occurs with a null subject, as in the first example below. If the nominal is clearly vocative in intent, the preference is to use the vocative relation.
Senhor , fale!
vocative(fale, Senhor)
Sr. Oliveira , o seu filme é o meu favorito .
vocative(favorito, Oliveira)
xcomp
: open clausal complement
An open clausal complement (xcomp) of a verb or an adjective is a predicative or clausal complement without its own subject. The reference of the subject is necessarily determined by an argument external to the xcomp (normally by the object of the next higher clause, if there is one, or else by the subject of the next higher clause). This is often referred to as obligatory control.
These clauses tend to be non-finite in many languages, but they can be finite as well. The name xcomp is borrowed from Lexical-Functional Grammar.
Dois árbitros resolveram contar todos os podres.
xcomp(resolveram, contar)
Os três continentes mais obstinados em cortar o reinado de Havelange.
xcomp(obstinados, cortar)
Volpi foi dos mais influentes pintores do país.
xcomp(foi, pintores)
Disse que não conseguia vislumbrar artifícios fraudulentos.
xcomp(conseguia, vislumbrar)
ccomp(Disse, conseguia)
Mas me considero um piloto rápido.
xcomp(considero, piloto)
Nós esperamos que eles mudem de idéia.
xcomp(esperamos, mudem)
Note that the above condition “without its own subject” does not mean that a clause is an xcomp just because its subject is not overt. The subject must be necessarily inherited from a fixed position in the higher clause. That is, there should be no available interpretation where the subject of the lower clause may be distinct from the specified role of the upper clause. In cases where the missing subject may or must be distinct from a fixed role in the higher clause, ccomp should be used instead, as below. This includes cases of arbitrary subjects and anaphoric control.
O chefe disse para começar a cavar.
ccomp(disse, começar)
Pro-drop languages, such as Portuguese, have clauses where the subject is not present as a separate word, yet it is inherently present (and often deducible from the form of the verb) and it does not depend on arguments from a higher clause.
O tomate foi projetado para manter o sabor
advcl(projetado, manter)
Os empresários abriram mão de posições históricas , eventualmente visando sua proteção , para construir e defender idéias
nsubj(abriram, empresários)
advcl(abriram, visando)
advcl(abriram, construir)
conj(construir, defender)
Secondary Predicates
The xcomp relation is also used in constructions that are known as secondary predicates or predicatives. Examples:
- Ela declarou o bolo lindo.
- Ela declarou o bolo um sucesso.
We could paraphrase the sentence using a subordinate clause: Ela declarou que o bolo estava lindo. There are two predicates mixed in one clause: 1. ela declarou algo, and 2. o bolo estava lindo (segundo ela). The secondary predicate will be attached to the main predicate as an xcomp:
Ela declarou o bolo lindo .
nsubj(declarou, Ela)
dobj(declarou, bolo)
xcomp(declarou, lindo)
nsubj(lindo, bolo)
In the enhanced representation, there is an additional subject link showing the secondary predication (bolo is the subject of lindo).
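The additional subject link can be derived from the basic dependencies using the control rule given earlier: the subject of an xcomp is normally the object of the higher clause if there is one, or else its subject. The following is a rough sketch of that heuristic, not a definitive implementation of the enhanced representation; the function name controlled_subjects is an assumption.

```python
# Basic (relation, head, dependent) triples for: Ela declarou o bolo lindo .
deps = [
    ("nsubj", "declarou", "Ela"),
    ("dobj", "declarou", "bolo"),
    ("xcomp", "declarou", "lindo"),
]


def controlled_subjects(deps):
    """Derive an nsubj link for each xcomp, preferring the governor's object."""
    extra = []
    for rel, head, dep in deps:
        if rel != "xcomp":
            continue
        objects = [d for r, h, d in deps if r == "dobj" and h == head]
        subjects = [d for r, h, d in deps if r == "nsubj" and h == head]
        # Object control if the higher clause has an object, else subject control.
        candidates = objects or subjects
        if candidates:
            extra.append(("nsubj", dep, candidates[0]))
    return extra


print(controlled_subjects(deps))
```

For a sentence without an object, such as Dois árbitros resolveram contar todos os podres, the same function falls back to the subject of the higher clause.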
Remember that xcomp is used for core arguments of clausal predicates, so it will not be used for other instances of secondary predication. For instance, in Ela entrou na sala triste we also have a double predication (ela entrou na sala; ela estava triste). But triste is not a core argument of entrar: leaving it out will neither affect grammaticality nor significantly alter the meaning of the verb. On the other hand, leaving out lindo in ela declarou o bolo lindo will either render the sentence ungrammatical or lead to a different interpretation of declarou.
The result is that in Ela entrou na sala triste, triste will depend on Ela and the relation will be acl instead of xcomp.
xcomp:adj
: xcomp:adj
This document is a placeholder for the language-specific documentation for xcomp:adj.