Dependencies
Note: nmod, neg, and punct appear in two places.
acl
: clausal modifier of noun
acl stands for finite and non-finite clauses that modify a nominal.
The head of the acl relation is the noun that is modified, and the dependent is the head of the clause that modifies the noun.
Like non-clausal adjectives, most adjectival clauses in Turkish precede the noun they modify. The only exception is adjectival clauses formed by ki, which is similar to the English relative pronouns “which” or “who” (not to be confused with the suffix -ki).
The primary means of subordination, including forming adjectival clauses, is through subordinating suffixes attached to the head of the subordinate clause. Adjectival clauses formed by ki are less frequent and cover only a limited range of the uses of adjectival clauses.
Almost all adjectival clauses in Turkish are relative clauses. There are only a few marginal constructions where a pronoun referring to the modified noun can be present in the subordinate clause.
We currently do not mark (non-)relative clauses differently.
advcl
: adverbial clause modifier
An adverbial clause modifier is a clause which modifies a verb or other predicate (adjective, etc.), as a modifier not as a core complement. This includes things such as a temporal clause, consequence, conditional clause, purpose clause, etc. The dependent must be clausal (or else it is an advmod) and the dependent is the main predicate of the clause.
Note that we treat conditional clauses specially and mark them with a subtype: advcl:cond.
Turkish adverbial clauses are mainly formed by a set of converbial suffixes.
The subordinator ki and a few other subordinating words may also form adverbial clauses.
A large number of adverbials and adverbial clauses are formed by postpositions attached to nouns or noun clauses.
We do not mark these as adverbial (advmod or advcl).
For both cases we use nmod (see the discussion of subordination in tr-overview/specific-syntax).
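The choice between advcl, advcl:cond, advmod and nmod described here can be summarized as a small decision function. This is only a sketch: the function name and its boolean parameters are hypothetical and stand in for judgments an annotator makes by hand.

```python
def adverbial_relation(is_clausal: bool,
                       is_conditional: bool = False,
                       via_postposition: bool = False) -> str:
    """Pick a label for an adverbial dependent (illustrative helper only)."""
    if via_postposition:
        # Adverbials formed by a postposition attached to a noun or a
        # noun clause are labeled nmod, not advmod or advcl.
        return "nmod"
    if is_clausal:
        # Conditional clauses get their own subtype of advcl.
        return "advcl:cond" if is_conditional else "advcl"
    # Non-clausal adverbs and adverbial phrases.
    return "advmod"
```

For example, a converbial clause would give `adverbial_relation(True)`, while a noun clause followed by a postposition would give `adverbial_relation(True, via_postposition=True)`.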
advcl:cond
: conditional adverbial clause modifier
This relation is a subtype of advcl. It is used for conditional clauses.
In Turkish, conditional clauses are formed by a verbal inflection on the head of the clause. There are also two redundant words, eğer and şayet, that may be included at the beginning or end of the phrase. These words are used only for emphasis or as an early signal that a conditional clause follows. We use discourse (not mark) for the relation between these words and the head of the conditional clause.
advmod
: adverbial modifier
An adverbial modifier of a word is a (non-clausal) adverb or adverbial phrase that serves to modify the meaning of the word.
Note that nouns in particular morphological cases, or followed by an adposition are marked using nmod even if they function as adverbial modifiers.
We use a special label, advmod:emph for adverbial modifiers that are used for emphasis.
advmod:emph
: emphasizing word, intensifier
This is a subtype of advmod. It is used for (non-clausal) modifiers that emphasize or intensify their heads.
amod
: adjectival modifier
An adjectival modifier of a noun is any adjectival phrase that serves to modify the meaning of the noun.
amod is not used for all modifiers of nouns.
We use det for determiners (tagged tr-pos/DET), and for so-called “bare noun compounds” we use compound.
appos
: appositional modifier
An appositional modifier of a noun is a nominal immediately following the first noun that serves to define or modify that noun. It includes parenthesized examples, as well as defining abbreviations in one of these structures.
appos is also used to link key-value pairs in addresses, signatures, etc. (see also the list label).
aux
: auxiliary
An auxiliary of a clause is a non-main verb of the clause.
In Turkish, two verbs, ol- and, in formal registers, bulun-, complement the main verb with additional tense/aspect/modality suffixes that cannot be attached to the main verb due to morphological restrictions (or sometimes for stylistic reasons).
The auxiliary use of ol- is different from its use as a copula, where the cop relation is used.
We use a subtype of aux, aux:q, for the question particle mi.
We also use aux when the bound auxiliary -abil is separated from the main verb.
aux:q
: question particle
This is a subtype of aux, used for question particle -mI (mı/mi/mu/mü).
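The four surface forms of -mI follow standard Turkish vowel harmony with the last vowel of the preceding word. The sketch below illustrates that mapping; the table and function names are hypothetical, the input is assumed to be lowercase, and any person or tense suffixes on the particle are ignored.

```python
# Last vowel of the preceding word -> harmonic form of the particle -mI.
HARMONY = {
    "a": "mı", "ı": "mı",
    "e": "mi", "i": "mi",
    "o": "mu", "u": "mu",
    "ö": "mü", "ü": "mü",
}


def question_particle(preceding_word: str) -> str:
    """Return the bare question particle that follows `preceding_word`.

    Assumes lowercase input written with the eight plain Turkish vowels.
    """
    for ch in reversed(preceding_word):
        if ch in HARMONY:
            return HARMONY[ch]
    raise ValueError(f"no vowel found in {preceding_word!r}")
```

All four forms receive the same aux:q label, so a function like this is only useful for recognizing or generating the particle, not for choosing the relation.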
The question particle, when attached to a predicate, typically carries some of the tense/aspect/modality suffixes as well as person/number agreement suffixes.
Although it does not function as an auxiliary when attached to non-predicate words or phrases, we use aux:q for all uses of the question particle.
The annotation of question particles is not well specified in UD and is currently under discussion.
auxpass
: passive auxiliary
We do not use auxpass.
case
: case marking
The case relation is used for any case-marking element which is treated as a separate syntactic word (including prepositions, postpositions, and clitic case markers). Case-marking elements are treated as dependents of the noun or clause they attach to.
In Turkish, case marking is typically done through suffixation, in which case the case-marked word carries the appropriate Case feature.
The case relation marks postpositions and some of the case-like suffixes that are tokenized as separate syntactic tokens (inflectional groups).
Currently, we also use case for some not-so-case-like modifiers.
cc
: coordinating conjunction
A cc is the relation between the first conjunct and the coordinating conjunction delimiting another conjunct.
Note that we currently diverge from the UD specification by marking the last conjunct as the head. See the conj relation for more information.
Note that in the instrumental or comitative usage of ile, the case relation is used.
ccomp
: clausal complement
A clausal complement of a predicate is a dependent clause which is a core argument. That is, it functions like an object of the predicate.
We split the verbal noun suffixes and mark them as the head of the subordinate clause.
The unit with the subordinating suffix is tagged as a noun.
However, we still use ccomp for the relation between the higher-level clause and the clausal object.
At present, we use ccomp only for direct objects, i.e., non-finite noun phrases in accusative or nominative case.
Arguments in other cases are marked using the nmod relation or an appropriate subtype of it.
See also xcomp.
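The case-based rule for clausal objects can be sketched as follows. The function name is hypothetical, and the case values are assumed to be the UD Case feature values (`Nom`, `Acc`, `Dat`, and so on) found on the split verbal noun.

```python
# Cases in which a clausal argument counts as a direct object here.
DIRECT_OBJECT_CASES = {"Nom", "Acc"}


def clausal_object_relation(case: str) -> str:
    """Return the base label for a clausal argument with the given Case.

    For other cases this returns plain "nmod"; choosing one of the nmod
    subtypes, where appropriate, is left to the annotator.
    """
    return "ccomp" if case in DIRECT_OBJECT_CASES else "nmod"
```

The same case test applies to nominal direct objects, which are labeled dobj instead of ccomp.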
ccomp:cau
: clausal complement of a causative verb which refers to the "causee"
This is the clausal counterpart of dobj:cau.
compound
: compound
compound is one of the relations in UD for compounding.
In Turkish it is used for bare noun compounds, compound verbal forms and numbers.
compound:lvc
: compound:lvc
This document is a placeholder for the language-specific documentation for compound:lvc.
compound:redup
: reduplicated compounds
This subtype of compound covers a range of reduplicated forms in Turkish. Reduplication is a common process especially for adverbs and adjectives, but it is also used for reduplicated noun and verb forms.
The reduplication typically involves two identical words, but some morpho-phonological alternations (as in m-reduplication in example 3 below) are possible. In some cases one of the words in reduplicated forms may also be modified individually by other words (see example 4 below).
For lexicalized multi-word items with repetition where one or more of the words are not free lexemes (e.g. paldır küldür, ufak tefek), we use mwe.
conj
: conjunct
A conjunct is the relation between two elements connected by a coordinating conjunction, such as and, or, etc.
We diverge from the UD specification by marking the last (instead of the first) conjunct as the head of the relation.
All the other conjuncts depend on the last via the conj relation.
See the relation cc for a few more examples.
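The difference between the two head choices can be made concrete with a small sketch that builds the conj edges for a list of conjunct token ids. The function and its `head_last` parameter are hypothetical names used only for illustration.

```python
def conj_edges(conjunct_ids, head_last=True):
    """Build (dependent, head, label) triples for one coordination.

    head_last=True follows the Turkish convention described above;
    head_last=False gives the first-conjunct head of the UD specification.
    """
    ids = list(conjunct_ids)
    if len(ids) < 2:
        return []  # a single element is not a coordination
    head = ids[-1] if head_last else ids[0]
    return [(dep, head, "conj") for dep in ids if dep != head]
```

With conjuncts at positions 2, 4 and 6, `conj_edges([2, 4, 6])` attaches tokens 2 and 4 to token 6, whereas `conj_edges([2, 4, 6], head_last=False)` attaches tokens 4 and 6 to token 2.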
cop
: copula
A copula is the relation between a subject complement and a copular verb or copular suffix. We always mark the copula as a dependent of the subject complement.
In Turkish, the auxiliary verb ol- and, in some constructions, the negative particle değil act like a free copula. The main means of forming copular constructions, however, is through the bound morpheme -(y) and (infrequently) its clitic form i-. Since the morpheme -(y) consists only of a “buffer” consonant, it is not realized in some morphological contexts.
Copular morphemes carry features, e.g., Number and Person, that may conflict with those of the complement they are attached to. Furthermore, the copular suffixes can also attach to verbal nouns, causing conflicting dependency relations in addition to further feature conflicts. As a result, all copular markers, including the “zero copula”, are treated as separate syntactic tokens.
When an overt subject is present, it is headed by the subject complement (not the copula).
csubj
: clausal subject
A clausal subject is a clausal syntactic subject of a clause, i.e., the subject is itself a clause.
TODO: link to the explanation of splitting of subordinating suffixes.
The following needs more discussion: we also analyze nominal predicates with clausal subjects formed by the subordinating conjunction ki similarly. In the METU-Sabancı treebank they are marked (somewhat inconsistently) as modifiers rather than main predicates.
csubjpass
: clausal passive subject
A clausal passive subject is a clausal syntactic subject of a passive clause.
dep
: unspecified dependency
A dependency is labeled as dep when a system is unable to determine a more precise dependency relation between two words.
We currently do not use the dep label.
det
: determiner
The relation determiner (det) holds between a nominal head and its determiner.
discourse
: discourse element
This is used for interjections and other discourse particles and elements (which are not clearly linked to the structure of the sentence, except in an expressive way).
dislocated
: dislocated elements
The dislocated relation is used for fronted or postposed elements that do not fulfill the usual core grammatical relations of a sentence.
These elements often appear to be in the periphery of the sentence, and may be separated off with a comma intonation.
dobj
: direct object
The direct object of a verb is the noun phrase that denotes the entity acted upon.
In Turkish, direct objects take either nominative (unmarked) or accusative case.
We do not mark arguments of verbs in other cases with dobj.
(NOTE: The Kyrgyz treebank marks ablatives as in pastadan aldı ‘he took from the cake’. We may consider doing the same. At least we should try to unify the analyses.)
Note also that we mark objects of intransitive causative verbs using dobj:cau.
We also mark non-case-marked or accusative noun phrases as dobj even if they are not the entities that are acted upon.
dobj:cau
: direct object of an intransitive causative verb
This is a subtype of dobj. We mark direct objects of causative-voice intransitive verbs with this subtype, since their interpretation differs from that of a direct object of a non-causative verb. In general, if the verb is intransitive, the direct object indicates the “causee”, the subject of the content verb, or the entity that performs the action. If the verb is transitive, the direct object is the entity that is acted upon, as in the non-causative case (see nmod:cau).
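How the causee is labeled therefore depends on the transitivity of the base (content) verb. A minimal sketch, with a hypothetical function name and a boolean that stands in for the annotator's judgment about the base verb:

```python
def causee_relation(base_verb_is_transitive: bool) -> str:
    """Return the label used for the causee of a causative predicate."""
    if base_verb_is_transitive:
        # The direct object slot is taken by the entity acted upon,
        # so the causee is an oblique nominal modifier.
        return "nmod:cau"
    # With an intransitive base verb the causee is the direct object.
    return "dobj:cau"
```

The entity acted upon by a transitive base verb keeps the plain dobj label in both the causative and the non-causative clause.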
expl
: expletive
Turkish does not have expletives.
foreign
: foreign words
We use foreign to label sequences of foreign words. These are given a linear analysis: the head is the first token in the foreign phrase.
goeswith
: goes with
This relation links two parts of a word that are separated in text that is not well edited. The head is in some sense the “main” part, often the second part.
iobj
: indirect object
We do not use the dependency label iobj.
TODO: link to argument/adjunct discussion.
list
: list
The list relation is used for chains of comparable items. In lists with more than two items, all items of the list should modify the first one. Informal and web text often contains passages which are meant to be interpreted as lists but are parsed as single sentences. Email signatures often contain these structures, in the form of contact information: the different contact information items are labeled as list; the key-value pair relations are labeled as appos.
mark
: marker
A marker is the subordinating conjunction introducing a finite clause subordinate to another clause. The mark is a dependent of the subordinate clause head.
mwe
: multi-word expression
The multi-word expression (modifier) relation is one of the three relations (compound, mwe, name) for compounding.
It is used for certain fixed grammaticized expressions that behave like function words or short adverbials.
Note that we mark most of the expressions that are marked MWE in the METU-Sabancı treebank as compound.
mwe is only used for fixed expressions that do not show any morphological variation.
name
: name
name is one of the three relations for compounding in UD (together with compound and mwe).
It is used for proper nouns constituted of multiple nominal elements.
For phrasal or clausal names the usual relations are used.
neg
: negation modifier
The negation modifier is the relation between a negation word and the word it modifies.
In Turkish, negation is typically done through suffixation.
We use neg only for the non-predicative use of the word değil.
nmod
: nominal modifier
The nmod relation is used for nominal modifiers.
They depend either on another noun (group “noun dependents”) or on a predicate (group “non-core dependents of clausal predicates”).
nmod is a noun (or noun phrase) functioning as a non-core (oblique) argument or adjunct.
This means that it functionally corresponds to an adverbial when it attaches to a verb, adjective or other adverb.
But when attaching to a noun, it corresponds to an attribute, or genitive complement (the terms are less standardized here).
The nmod relation is further specified by the Case feature or case relation.
We also use the following language-specific subtypes for nmod:
- nmod:cau: nominal modifier of a causative predicate that marks the causee
- nmod:comp: a comparative nominal modifier
- nmod:pass: nominal modifier of a passive predicate that expresses the actor (subject of the active predicate)
- nmod:tmod: nominal modifier that indicates time
- nmod:own: owner in a possessive existential sentence
- nmod:poss: possessor in a genitive-possessive construction
- nmod:part: noun modifier specifying the whole-part relation
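For tooling that checks labels, the subtypes listed above can be kept in a small lookup table. The names `NMOD_SUBTYPES` and `is_known_nmod_label` are hypothetical; only the seven subtype labels themselves come from the list.

```python
# The language-specific nmod subtypes with short glosses.
NMOD_SUBTYPES = {
    "nmod:cau": "causee of a causative predicate",
    "nmod:comp": "comparative nominal modifier",
    "nmod:pass": "actor of a passive predicate",
    "nmod:tmod": "nominal modifier that indicates time",
    "nmod:own": "owner in a possessive existential sentence",
    "nmod:poss": "possessor in a genitive-possessive construction",
    "nmod:part": "whole-part relation",
}


def is_known_nmod_label(label: str) -> bool:
    """True for plain nmod or one of the subtypes listed above."""
    return label == "nmod" or label in NMOD_SUBTYPES
```

A check such as `is_known_nmod_label("nmod:tmod")` can then flag misspelled or unsupported subtypes in an annotated file.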
We do not currently distinguish between core arguments and adjuncts (TODO: link to discussion).
nmod:cau
: nominal modifier indicating the causee of a causative predicate
This subtype of nmod is used for marking the performer of the real action, “causee”, in a causative predicate. The subject of the causative predicate is the actor who causes the action to be taken. Occasionally the performer of the action is also included in the phrase/sentence, and it is useful to mark it. The causee is predictable for intransitive verbs, since it is the object of the causative predicate. For transitive verbs, it is often marked with dative Case, but it is ambiguous as a dative nominal modifier also has the function of marking the “beneficiary” (and possibly others).
Note that the above is ambiguous. It may also mean “The teacher made someone read the book to/for Ali”. In that case nmod should be used.
nmod:comp
: comparative modifier of an adjective or adverb
This subtype of nmod is used for marking the comparative modifier of an adjective or adverb.
nmod:own
: owner in a possessive existential sentence
This needs more discussion
This subtype of nmod is used for marking the owner of a possessive existential sentence.
In Turkish, possessive sentences (e.g., “I have a book”) resemble existential sentences where the subject is the entity that “exists” or is “owned”.
The nmod:own relation marks the entity that “owns” the subject.
The head of the relation is the predicate, as opposed to the subject noun phrase (this allows a uniform analysis in case the subject is dropped).
It should not be confused with nmod:poss, which is used in (genitive-)possessive constructions.
nmod:part
: nominal modifier indicating part-whole relations
This subtype of nmod is used for marking part-whole relations. The structure is similar to nmod:poss in most cases, but the range of structures expressing “part of” is diverse, and the distinction is often useful.
nmod:pass
: nominal modifier indicating the actor of a passive predicate
This subtype of nmod is used for marking the performer of action (the subject in the corresponding active sentence) in a passive predicate.
nmod:poss
: possessive nominal modifier
This subtype of nmod is used in (genitive-)possessive constructions.
Typically, the head of the construction is a possessive noun phrase,
and the dependent is in genitive case.
We also use nmod:poss in the alternative construction where the modifier is not in genitive case.
So-called “bare noun compounds” are marked using the compound relation.
nmod:tmod
: temporal modifier
A temporal modifier is a subtype of the nmod relation: if the modifier is specifying time, it is labeled as nmod:tmod.
nsubj
: nominal subject
A nominal subject is a noun phrase which is the syntactic subject of a clause.
For existential sentences, “the thing that exists” is the subject. This includes possessive existentials.
Although we currently mark the head of the verbal nouns as nouns, we use csubj when they are in the subject position.
nsubjpass
: passive nominal subject
A passive nominal subject is a noun phrase which is the syntactic subject of a passive clause.
The distinction between nsubj and nsubjpass is not strictly necessary in Turkish, since the predicate will always be morphologically marked as passive.
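Because the passive is always visible on the predicate, the subject label can in principle be derived from the predicate's morphological features. The sketch below assumes a CoNLL-U style FEATS string and assumes that the passive is encoded as `Voice=Pass`; the function names are hypothetical.

```python
def parse_feats(feats: str) -> dict:
    """Parse a CoNLL-U FEATS string such as 'Tense=Past|Voice=Pass'."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def subject_relation(predicate_feats: str) -> str:
    """Choose nsubj or nsubjpass from the features of the predicate."""
    voice = parse_feats(predicate_feats).get("Voice", "")
    # A feature may carry several comma-separated values.
    return "nsubjpass" if "Pass" in voice.split(",") else "nsubj"
```

For instance, `subject_relation("Tense=Past|Voice=Pass")` yields the passive label, while a predicate with no Voice feature yields nsubj.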
nummod
: numeric modifier
A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun with a quantity.
parataxis
: parataxis
The parataxis relation (from Greek for “place side by side”) is a relation between the main verb of a clause and other sentential elements, such as a sentential parenthetical, a clause after a “:” or a “;”, or two sentences placed side by side without any explicit coordination or subordination.
punct
: punctuation
This is used for any piece of punctuation in a clause. See punct for details.
remnant
: remnant in ellipsis
The remnant relation is used to provide a satisfactory treatment of certain instances of ellipsis.
reparandum
: overridden disfluency
We use reparandum to indicate disfluencies overridden in a speech repair.
The disfluency is the dependent of the repair.
root
: root
The root grammatical relation points to the root of the sentence.
A fake node “ROOT” is used as the governor.
The ROOT node is indexed with “0”, since the indexation of real words in the sentence starts at 1.
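The indexing convention can be illustrated with a sketch that locates the root token from a list of head indices. The function name is hypothetical, and the check for exactly one root is an assumption of this sketch, not something stated above.

```python
def find_root(heads):
    """Return the 1-based index of the token attached to ROOT.

    heads[i] is the head index of token i + 1; the value 0 stands for
    the artificial ROOT node.
    """
    roots = [i for i, head in enumerate(heads, start=1) if head == 0]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]
```

In a three-token sentence with heads `[2, 0, 2]`, the second token is the root and the other two tokens depend on it.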
vocative
: vocative
The vocative relation is used to mark a dialogue participant addressed in the text. The relation links the addressee’s name to its host sentence.
xcomp
: open clausal complement
An open clausal complement of a predicate is a predicative or clausal complement without its own subject. The reference of the subject is necessarily determined by an argument external to the xcomp (normally by the object of the next higher clause, if there is one, or else by the subject of the next higher clause). These complements are always non-finite, and they are complements (arguments of the higher predicate) rather than adjuncts/modifiers, such as a purpose clause.
In the majority of cases, we use xcomp for the verbal nouns formed by the suffix -mAk.
Note that we split the nominal part, and mark the noun as the head of the predicate.
(TODO: link to the subordination discussion)
In addition, we also use xcomp for secondary predicates, or in general, what Göksel & Kerslake (2005) call “small clauses”.
The decision between a secondary predicate and an adverb analysis is often difficult, since most adjectives also function as adverbs.
References
Aslı Göksel and Celia Kerslake. Turkish: A Comprehensive Grammar. London: Routledge, 2005.