
This page still pertains to UD version 1.

Specific constructions

Please note: this language-specific overview of guidelines for specific constructions is a work in progress.


Subjects and objects

Finnish subjects and objects are straightforward to recognize in their prototypical cases, but both phenomena also have some difficult cases, which are discussed here.

The subject is the primary complement of the verb, usually denoting the entity doing something. In addition to the basic subject (see ISK §910), existential subjects (eksistentiaalisubjekti, e-subjekti) are also considered subjects in UD Finnish.

Tien vieressä on talo . \n Road beside is house .
case(Tien-1, vieressä-2)
nmod(on-3, Tien-1)
nsubj(on-3, talo-4)
punct(on-3, .-5)

Possessive clauses (omistuslause) are considered a subtype of existential clauses, and are analyzed similarly. The owner in possessive clauses is marked using the type nmod:own. The owner must be an animate being or a group of animate beings.

Hänellä on oma asunto . \n At_him is own apartment .
nmod:own(on-2, Hänellä-1)
nsubj(on-2, asunto-4)
amod(asunto-4, oma-3)
punct(on-2, .-5)

The genitive subject in, for instance, necessive structures is also annotated as nsubj. (This is not to be confused with the genitive subject of a noun, nmod:gsubj.)

Minun on pakko mennä kotiin . \n I(gen.) is obligation go home .
nsubj(mennä-4, Minun-1)
cop(pakko-3, on-2)
xcomp:ds(pakko-3, mennä-4)
nmod(mennä-4, kotiin-5)
punct(pakko-3, .-6)

In UD Finnish, subjects are allowed to be in the nominative, genitive and partitive cases, and in addition, also an accusative subject is possible (the accusative case only exists for certain pronouns). Two notable situations where a complement in the accusative form is analyzed as the subject are:

  1. Nonfinite clausal complements (Sain hänet itkemään. “I made him cry.”)
  2. Possessive clauses (Minulla on sinut. “I have you.”)

The same cases are allowed for objects as for subjects: the nominative, the partitive, the genitive and the accusative. Nominal and adjectival complements (other than predicatives), however, can be in other cases as well.

Object-cased amount adverbials (objektin sijainen määrän adverbiaali, OSMA; ISK §972), which, as the name implies, use the same cases as objects, are analyzed as nominal modifiers. However, certain verbs can take as their object an expression that would otherwise be considered an amount adverbial. Examples where an amount is considered the object include:

Examples

Unlike in English, for instance, Finnish passive verb forms take a direct object rather than a passive subject.

Oppitunti valmisteltiin huolellisesti . \n Lesson was_prepared carefully .
dobj(valmisteltiin-2, Oppitunti-1)
advmod(valmisteltiin-2, huolellisesti-3)
punct(valmisteltiin-2, .-4)

However, there are certain verbs, so-called derived passives (ISK §336), which may resemble passive verb forms in meaning, but which in fact take a subject, not an object. (In English, the Finnish derived passives generally correspond to intransitive uses of a verb, such as the door opens, sometimes termed inchoative.)

Minä avasin oven . \n I opened the_door .
nsubj(avasin-2, Minä-1)
dobj(avasin-2, oven-3)
punct(avasin-2, .-4)
Ovi aukeaa . \n The_door opens .
nsubj(aukeaa-2, Ovi-1)
punct(aukeaa-2, .-3)

References

Copulas

This section discusses first defining copular verbs and predicatives, then copulas in combination with auxiliaries, and finally the distinction between the subject and the predicative in copular clauses.

What can be a predicative?

In the UD scheme, the head of a copular clause is the predicative, not the verb (copula), unlike in other clauses. The Finnish language only has one copular verb, olla “to be” (ISK §891), and in order to avoid marking other verbs as copular and to prevent copular clauses from having multiple head words, rules are needed to define what is accepted as a predicative.

The basic alternatives for predicatives are nominals (nouns, adjectives, pronouns and numerals). Words of these parts of speech are required to be in nominative, partitive or genitive case to be accepted as predicatives.

Varpunen on pieni lintu . \n Sparrow is small bird(nom.) .
nsubj:cop(lintu-4, Varpunen-1)
cop(lintu-4, on-2)
amod(lintu-4, pieni-3)
punct(lintu-4, .-5)
Maali oli valkoista . \n Paint was white(part.) .
nsubj:cop(valkoista-3, Maali-1)
cop(valkoista-3, oli-2)
punct(valkoista-3, .-4)
Tämä kirja on minun . \n This book is mine(gen.) .
det(kirja-2, Tämä-1)
nsubj:cop(minun-4, kirja-2)
cop(minun-4, on-3)
punct(minun-4, .-5)

Nominals in any other case are not marked as predicatives, even if they occur with the verb olla. Like adpositional phrases, they are marked as nominal modifiers (nmod) when they are modifiers, or with one of the clausal complement types (xcomp, xcomp:ds) when they are complements involving secondary predication. In both situations the verb is marked as the head of the clause, even if it is olla “to be”.

Lapset olivat pihalla . \n Children were on_yard .
nsubj(olivat-2, Lapset-1)
nmod(olivat-2, pihalla-3)
punct(olivat-2, .-4)
Lapset olivat talon takana . \n Children were behind house .
nsubj(olivat-2, Lapset-1)
nmod(olivat-2, talon-3)
case(talon-3, takana-4)
punct(olivat-2, .-5)

This restriction is to prevent a clause from having two predicatives and hence two heads, which would be the case in a sentence such as the following:

Examples

Here both Oulusta “from Oulu” and ystävältäni “from my friend” could be interpreted as predicatives, resulting in a clause with two heads, or alternatively, a decision between two equally likely head-candidates. Therefore, only nominative, genitive and partitive are allowed as cases for predicatives.

Note that cases not allowed for predicatives include the essive case; this is to avoid marking verbs other than olla as copulas.

Mies oli portsarina baarissa . \n Man was doorman(essive) in_bar .
nsubj(oli-2, Mies-1)
nmod(oli-2, portsarina-3)
nmod(oli-2, baarissa-4)
punct(oli-2, .-5)
Mies toimi portsarina baarissa . \n Man worked doorman(essive) in_bar .
nsubj(toimi-2, Mies-1)
nmod(toimi-2, portsarina-3)
nmod(toimi-2, baarissa-4)
punct(toimi-2, .-5)

In addition to nominals, adverbs can also act as predicatives, provided that they do not express location or time. With adverbs there is no restriction on case; the only requirement is that they are not locational or temporal. As a result, adverbs such as täällä “here” or huomenna “tomorrow” cannot act as predicatives, but others, such as naimisissa “married” (inessive adverb) and raskaana “pregnant” (essive adverb), can, regardless of their case.
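The eligibility rules for nominal and adverbial predicatives stated above can be summarized as a small check. This is only a sketch: the function name, the UD-style tag and case labels (Nom, Par, Gen) and the coarse adverb classes are assumptions made for illustration, and clausal predicatives are not covered.

```python
# Hypothetical helper; tag names, case labels and adverb classes are assumptions.
NOMINAL_POS = {"NOUN", "PROPN", "ADJ", "PRON", "NUM"}
PREDICATIVE_CASES = {"Nom", "Par", "Gen"}  # nominative, partitive, genitive


def can_be_predicative(upos, case=None, adverb_kind=None):
    """Return True if a word occurring with 'olla' may head the copular clause."""
    if upos in NOMINAL_POS:
        # Nominals must be in one of the three permitted cases.
        return case in PREDICATIVE_CASES
    if upos == "ADV":
        # Adverbs are unrestricted in case but must not be locational or temporal.
        return adverb_kind not in {"location", "time"}
    return False


print(can_be_predicative("NOUN", "Nom"))                  # lintu
print(can_be_predicative("NOUN", "Ess"))                  # portsarina
print(can_be_predicative("ADV", adverb_kind="location"))  # täällä
```

A word that fails the check is attached to olla as an ordinary dependent, and the verb then heads the clause.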

In UD Finnish, also a full clause can act as a predicative, in addition to nominals and adverbs. In these cases, the head of the clause acting as the predicative becomes also the head of the main clause. (If the clause acting as the predicative is also a copular clause, this results in the predicative clause seemingly having two copula subjects and copulas. However, this is not how the analysis should be interpreted.)

Tarkoitus on järjestää lopuksi juhlat . \n The_meaning is to_arrange in_the_end a_party .
nsubj:cop(järjestää-3, Tarkoitus-1)
cop(järjestää-3, on-2)
dobj(järjestää-3, juhlat-5)
advmod(järjestää-3, lopuksi-4)
punct(järjestää-3, .-6)

Diffs

In FinnTreeBank (FI_FTB), in addition to copular clauses, also state clauses and result clauses (ISK § 891) contain predicatives. This results in a larger group of verbs accepted as copular verbs, e.g. tulla “to become”, muuttua “to turn” and tehdä “to make”. (See FinnTreeBank Annotation Manual: 16.9 Predicative.)

In FI_FTB, none of the adverbs can act as predicatives (e.g. naimisissa “married” or raskaana “pregnant”).

Copulas and auxiliaries

In the Finnish-specific version of the UD scheme, copular verbs and auxiliaries take no dependents of their own. In cases of two auxiliaries or an auxiliary of a copular verb, all auxiliaries as well as the copular verb are attached to the main predicate or the predicative. The same principle applies also to negation verbs.

Hänkin on joskus ollut nuori . \n He_too has some_time been young .
nsubj:cop(nuori-5, Hänkin-1)
aux(nuori-5, on-2)
advmod(nuori-5, joskus-3)
cop(nuori-5, ollut-4)
punct(nuori-5, .-6)
Minun ei ehkä olisi pitänyt sanoa niin . \n I not maybe have should said so .
nsubj(sanoa-6, Minun-1)
neg(sanoa-6, ei-2)
advmod(sanoa-6, ehkä-3)
aux(sanoa-6, olisi-4)
aux(sanoa-6, pitänyt-5)
advmod(sanoa-6, niin-7)
punct(sanoa-6, .-8)

The distinction between the predicative and the subject

Distinguishing the subject from the predicative in copular clauses can be difficult, as it would often be possible to invert the word-order and thus swap the positions of the two elements. For instance in the following sentences, either kirahvit “giraffes” or eläimiä “animals” could be the subject and the other the predicative.

Examples

In UD Finnish, the main rule in annotating copular structures is that the leftmost element is the subject and the rightmost one the predicative. Hence, the above sentences would be annotated in the following manner:

Kirahvit ovat mielenkiintoisimpia eläimiä . \n Giraffes are the_most_interesting animals .
nsubj:cop(eläimiä-4, Kirahvit-1)
cop(eläimiä-4, ovat-2)
amod(eläimiä-4, mielenkiintoisimpia-3)
punct(eläimiä-4, .-5)
Mielenkiintoisimpia eläimiä ovat kirahvit . \n The_most_interesting animals are giraffes .
amod(eläimiä-2, Mielenkiintoisimpia-1)
nsubj:cop(kirahvit-4, eläimiä-2)
cop(kirahvit-4, ovat-3)
punct(kirahvit-4, .-5)

Semantic considerations such as which concept is a subconcept of the other are not taken into account in the annotation. However, it is possible to mark the leftmost element the predicative in cases where the word order is clearly inverted. This occurs for instance in (indirect) questions and sometimes relative clauses. Note that especially in questions, several different word orders are possible.

Millainen matka oli ? \n What_like trip was ?
nsubj:cop(Millainen-1, matka-2)
cop(Millainen-1, oli-3)
punct(Millainen-1, ?-4)
Kysyin , oliko matka mukava . \n I_asked , whether_was trip nice .
ccomp(Kysyin-1, mukava-5)
punct(Kysyin-1, .-6)
punct(mukava-5, ,-2)
cop(mukava-5, oliko-3)
nsubj:cop(mukava-5, matka-4)
yhdistys , jonka puheenjohtaja Matikainen on \n association , of_which chairman Matikainen is
acl:relcl(yhdistys-1, puheenjohtaja-4)
punct(puheenjohtaja-4, ,-2)
nmod:poss(puheenjohtaja-4, jonka-3)
nsubj:cop(puheenjohtaja-4, Matikainen-5)
cop(puheenjohtaja-4, on-6)

Also, if the leftmost element of the copular clause is an adjective rather than a noun or pronoun, it is considered that the word order is inverted, and thus the adjective is marked as the predicative, not the subject.

Kaunishan tämä talo on . \n Beautiful this house is .
nsubj:cop(Kaunishan-1, talo-3)
det(talo-3, tämä-2)
cop(Kaunishan-1, on-4)
punct(Kaunishan-1, .-5)
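The positional rule and its two exceptions can be written out as a short sketch. The function name and arguments are hypothetical, and deciding whether the word order is "clearly inverted" remains a judgment the annotator makes; here it is simply passed in as a flag.

```python
def choose_predicative(left_head_upos, clearly_inverted=False):
    """Return 'left' or 'right': which element of a copular clause is the predicative.

    left_head_upos is the part of speech of the head word of the leftmost
    element (for example NOUN for 'Mielenkiintoisimpia eläimiä').
    """
    # An adjective in leftmost position is treated as inverted word order.
    if left_head_upos == "ADJ" or clearly_inverted:
        return "left"
    # Main rule: leftmost element is the subject, rightmost the predicative.
    return "right"


print(choose_predicative("NOUN"))                         # Kirahvit ovat ... eläimiä
print(choose_predicative("ADJ"))                          # Kaunishan tämä talo on
print(choose_predicative("NOUN", clearly_inverted=True))  # jonka puheenjohtaja Matikainen on
```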

References

External subjects

Open clausal complements share their subject with another verb (see also the documentation for xcomp). The fact that the subject of the main verb is also the subject of the complement cannot be annotated using basic dependencies, as this would violate the treeness restriction. Therefore, in UD Finnish these subjects are marked on the second layer of annotation (DEPS field) using the standard dependency types nsubj and nsubj:cop. Note that an open clausal complement may not always have a subject, for instance in passive constructions.

Note that while some related schemes such as SD and TDT differentiate second-layer (or “additional”) external subject dependencies by applying a specific type such as xsubj, nsubj is used on both the basic and second layer in UD Finnish.

Matti ryhtyi lukemaan . \n Matti started_to read .
nsubj(ryhtyi-2, Matti-1)
xcomp(ryhtyi-2, lukemaan-3)
punct(ryhtyi-2, .-4)
nsubj(lukemaan-3, Matti-1)
Hän vaikutti olevan hiljainen . \n He appeared to_be silent .
nsubj(vaikutti-2, Hän-1)
xcomp(vaikutti-2, hiljainen-4)
cop(hiljainen-4, olevan-3)
punct(vaikutti-2, .-5)
nsubj:cop(hiljainen-4, Hän-1)
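To illustrate why the external subject cannot live in the basic layer, the following sketch parses the triple notation used on this page and reports tokens that would receive more than one head. The regular expression and the function names are assumptions made for this example; they are not part of any UD tooling.

```python
import re
from collections import Counter

# Matches lines such as "nsubj:cop(hiljainen-4, Hän-1)".
DEP = re.compile(r"^([\w:]+)\((.+)-(\d+), (.+)-(\d+)\)$")


def parse(line):
    """Return (relation, head_index, dependent_index) for one triple."""
    match = DEP.match(line)
    if match is None:
        raise ValueError(f"not a dependency triple: {line!r}")
    rel, _head, head_idx, _dep, dep_idx = match.groups()
    return rel, int(head_idx), int(dep_idx)


def multi_headed(lines):
    """Return the indices of tokens that have more than one head."""
    counts = Counter(parse(line)[2] for line in lines)
    return sorted(idx for idx, n in counts.items() if n > 1)


basic = ["nsubj(ryhtyi-2, Matti-1)", "xcomp(ryhtyi-2, lukemaan-3)", "punct(ryhtyi-2, .-4)"]
second_layer = ["nsubj(lukemaan-3, Matti-1)"]

print(multi_headed(basic))                 # basic layer alone is a tree
print(multi_headed(basic + second_layer))  # Matti would get two heads
```

Keeping the extra nsubj arc in the second layer leaves the basic layer a tree while still recording the shared subject.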

Appositions and appellation modifiers

The Finnish Grammar (see ISK §1059, §1062) distinguishes between three similar phenomena: the apposition, the appellation modifier (nimikemäärite) and the supporting noun (tukisubstantiivi). Out of these, the apposition and the appellation modifier (compound:nn) are distinguished in TDT, and supporting noun structures are considered appositions.

All of these structures have in common that they all include two (usually adjacent) elements, most often noun phrases, which refer to the same entity or entities and have the same function in the sentence. Thus, in order to be considered an apposition, an appellation modifier or a supporting noun structure, a structure has to fulfill the following criteria (the same as in the Finnish grammar §1059):

  1. Both elements of the structure must refer to the same entity or group of entities.
  2. Both elements of the structure must have the same function in the sentence (for instance, the subject).

These criteria are interpreted rather loosely, and there are no restrictions on the part of speech of the elements involved. Most appositions (and appellation modifiers) in TDT consist of noun phrases, but there are occurrences of different parts of speech as appositions; notably the fiction section of the treebank contains a few examples of verbal appositions.

Among the expressions that fulfill criteria 1 and 2, six common cases can be distinguished according to inflection and punctuation.

  1. singular, both elements in nominative, no punctuation: professori Matti Tamminen “professor Matti Tamminen”
  2. singular, first element in nominative, second element inflected: professori Matti Tammisen mukaan “according to professor Matti Tamminen”
  3. singular, both elements in nominative, punctuation in between: professori, Matti Tamminen “the professor, Matti Tamminen”
  4. singular, first element inflected, second element in nominative: romaanissa Putkinotko “in the novel Putkinotko”
  5. singular, both elements inflected: professorin, Matti Tammisen, mukaan “according to the professor, Matti Tamminen”
  6. plural, elements either in nominative or inflected: professorit Matti Tamminen ja Erkki Koivunen “the professors Matti Tamminen and Erkki Koivunen” or professoreiden, Matti Tammisen ja Erkki Koivusen, mukaan “according to the professors, Matti Tamminen and Erkki Koivunen” or professoreiden Matti Tamminen ja Erkki Koivunen mukaan “according to the professors Matti Tamminen and Erkki Koivunen”

Out of these six cases, the first two are considered appellation modifiers, and thus marked with the dependency type compound:nn. Note that the governor of the dependency in appellation modifiers is the latter of the two words.

Professori Matti Tamminen pitää puheen . \n Professor Matti Tamminen gives a_speech .
compound:nn(Matti-2, Professori-1)
name(Matti-2, Tamminen-3)
nsubj(pitää-4, Matti-2)
dobj(pitää-4, puheen-5)
punct(pitää-4, .-6)

The remaining four cases are all considered appositions and marked with the type appos. Contrary to appellation modifiers, in apposition structures the first word is considered the governor.

Professori , Matti Tamminen , luennoi tänään . \n The_professor , Matti Tamminen , lectures today .
appos(Professori-1, Matti-3)
punct(Matti-3, ,-2)
punct(Matti-3, ,-5)
name(Matti-3, Tamminen-4)
nsubj(luennoi-6, Professori-1)
advmod(luennoi-6, tänään-7)
punct(luennoi-6, .-8)

It should be noted that case number 4 is in fact an example of a supporting noun structure, but in TDT, these are marked as appositions. In plural (case number 6), all possible case combinations are considered appositions.

The only difference between the cases 1 and 3 is the presence or absence of punctuation. Often, said punctuation is a comma, but also parentheses, a dash or a colon are possible. As can be seen from the examples above, the punctuation produces a semantic difference, which is taken into account in the annotation. Punctuation variations of the cases 2, 4, and 5 need not be considered, as these variations are ungrammatical. (Naturally, ungrammatical phenomena can and do occur in a corpus of actual language, but these cases are resolved on a case-by-case basis.)
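The six cases and their treatment can be condensed into a small lookup. This is a sketch under stated assumptions: the function and argument names are hypothetical, "inflected" is taken to mean "not in the nominative", and combinations that the text describes as ungrammatical punctuation variants return None because they are resolved case by case.

```python
def classify_pair(plural, first_inflected, second_inflected, punctuation):
    """Return (dependency type, governor) for two coreferent elements, or None."""
    if plural:
        return ("appos", "first")            # case 6: all combinations
    if not first_inflected and not punctuation:
        return ("compound:nn", "second")     # cases 1 and 2
    if not first_inflected and not second_inflected and punctuation:
        return ("appos", "first")            # case 3
    if first_inflected and not second_inflected and not punctuation:
        return ("appos", "first")            # case 4 (supporting noun structure)
    if first_inflected and second_inflected and punctuation:
        return ("appos", "first")            # case 5
    return None                              # not among the six listed cases


print(classify_pair(False, False, False, False))  # professori Matti Tamminen
print(classify_pair(False, False, False, True))   # professori, Matti Tamminen
```

The second element of the returned pair records the direction difference noted above: the latter word governs in compound:nn, the first word in appos.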

Examples

References

Verbal dependents: Clauses, non-clauses, complements and modifiers

One particularly challenging task in annotating in the UD Finnish scheme is selecting the correct dependency type for dependents that are verbal. Verbal dependents include different kinds of subordinate clauses as well as infinitive and participial complements and modifiers. A simplified description of the decision procedure for verbal dependents is given in Table 1, and the full details are given below.

TABLE 1 OMITTED

Some basic cases are relatively easy to decide. If the dependent is a regular subordinate clause, the choices are clear. For relative clauses the type to be used is acl:relcl and as indirect questions are clausal complements, the correct type for them is ccomp.

If the subordinate clause is a conjunction clause, it can be either a complement or a modifier. Complement clauses are marked with ccomp and modifier ones with advcl. In the majority of cases, conjunction clauses starting with the conjunction että are complements and clauses starting with any other conjunction are modifiers. However, it should be noted that the conjunction että can be used instead of the conjunction jotta, and conversely, jotta can also (especially in spoken language) be used instead of että.

Examples

In these cases, a clause starting with että is a modifier, and a clause starting with jotta is a complement.

If the dependent is not a subordinate clause, the next deciding factor is the part of speech of the governor. If the governor is a noun, the dependent can be an infinitive modifier or a participle modifier, both marked with acl.

If, in turn, the governor is a verb, then the dependent can be either a complement or a modifier. With complements, there are three alternative dependency types available: xcomp, ccomp, and xcomp:ds.

If the subject of the dependent is shared with the governor (subject control), the correct type to use is xcomp. If any other sentence element is inherited from the higher clause (for example a dobj), the correct type is xcomp:ds, and otherwise ccomp.

Examples

The dependent can also be a participial complement that resembles adjectival complements. The above-mentioned three clausal complement types should be used in these cases as well.

Examples

If the dependent is not a complement but a modifier, then the correct dependency type is advcl. These cases are usually recognized as lauseenvastike (“substitute of a clause”) or non-complement participles.

Examples
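The decision procedure described in this section can be sketched as a single function. It is one reading of the prose above, not a reproduction of Table 1; the function name, the argument names and the string labels for clause kinds are all hypothetical.

```python
def verbal_dependent_type(kind, governor_upos="VERB", is_complement=False,
                          shares_subject=False, inherits_other=False):
    """Pick a dependency type for a verbal dependent.

    kind: 'relative', 'indirect_question', 'conjunction' or 'nonclausal'.
    """
    # Regular subordinate clauses.
    if kind == "relative":
        return "acl:relcl"
    if kind == "indirect_question":
        return "ccomp"
    if kind == "conjunction":
        return "ccomp" if is_complement else "advcl"

    # Infinitives and participles: the governor's part of speech decides next.
    if governor_upos == "NOUN":
        return "acl"
    if not is_complement:
        return "advcl"
    if shares_subject:
        return "xcomp"       # subject control
    if inherits_other:
        return "xcomp:ds"    # some other element, e.g. a dobj, is inherited
    return "ccomp"


print(verbal_dependent_type("nonclausal", is_complement=True, shares_subject=True))
```

Whether a conjunction clause is a complement is passed in explicitly because, as noted above, the conjunction alone (että versus jotta) does not settle it.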

References

Attachment issues: word-order-dependent structures and ambiguity

Occasionally determining the correct head word for a dependency may be difficult. Some structures are inherently ambiguous, and with some structures, often ones involving nominal modifiers, the dependent is most naturally seen to modify different sentence elements depending on the word-order. The following classic example is ambiguous:

Examples

In this example, it is possible that the shooting happened while wearing the pajamas, in which case the correct syntax tree would be as follows:

Ammuin elefantin pyjamassani . \n I_shot an_elephant in_my_pajamas .
dobj(Ammuin-1, elefantin-2)
nmod(Ammuin-1, pyjamassani-3)
punct(Ammuin-1, .-4)

On the other hand, it is also possible that the elephant wore the pajamas, in which case the correct analysis is:

Ammuin elefantin pyjamassani . \n I_shot an_elephant in_my_pajamas .
dobj(Ammuin-1, elefantin-2)
nmod(elefantin-2, pyjamassani-3)
punct(Ammuin-1, .-4)

Ambiguities such as this one are resolved as far as possible, and also context is used to determine the correct reading where applicable. That is, if in the context there exists another sentence which makes it clear whether the shooter or the elephant wore the pajamas, then that sentence is used to disambiguate the structure.

If, however, the ambiguity cannot be resolved even given context, or if an element seems to modify two or more elements simultaneously, then the attachment higher in the tree is chosen. In the case of the previous example, this would be the reading in which the shooting happens wearing the pajamas.
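The fallback of choosing the higher attachment can be sketched as follows. The representation (a dictionary mapping each token index to its head, with 0 for the root) and the function names are assumptions made for this example.

```python
def depth(heads, node):
    """Number of arcs between a token and the root (the root's head is 0)."""
    steps = 0
    while heads[node] != 0:
        node = heads[node]
        steps += 1
    return steps


def prefer_higher(heads, candidates):
    """Among candidate governors, pick the one closest to the root."""
    return min(candidates, key=lambda node: depth(heads, node))


# Ammuin-1 is the root; elefantin-2 depends on it.
heads = {1: 0, 2: 1}
print(prefer_higher(heads, [2, 1]))  # index of the chosen governor for pyjamassani
```

For the elephant example the verb Ammuin is closer to the root than elefantin, which matches the reading in which the shooter wears the pajamas.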

In some structures, the most natural analysis may be word order dependent. Consider the following two examples.

Examples

In the former example, there is clearly a man in a brown coat, whereas in the latter case, the coming into the train happened while wearing a brown coat. Therefore, the correct analyses for these examples differ in their attachment of the phrase in a brown coat. These attachment rules are akin to those used in the Prague Dependency Treebank.

Mies ruskeassa takissa tuli junaan . \n Man brown in_coat came into_train .
nmod(Mies-1, takissa-3)
amod(takissa-3, ruskeassa-2)
nsubj(tuli-4, Mies-1)
nmod(tuli-4, junaan-5)
punct(tuli-4, .-6)
Mies tuli junaan ruskeassa takissa . \n Man came into_train brown in_coat .
nsubj(tuli-2, Mies-1)
nmod(tuli-2, junaan-3)
nmod(tuli-2, takissa-5)
amod(takissa-5, ruskeassa-4)
punct(tuli-2, .-6)

References

Relative clauses

Relative clauses most often modify noun phrases, but it is also possible for them to modify a whole clause. From a prescriptive perspective, the relativizer that should be used in relative clauses that modify noun phrases is joka, and the relative clause should always modify the word directly before it. The relativizer that should be used in relative clauses modifying full clauses is mikä. However, in real, especially spoken, language, the use of the two relativizers is mixed, and not every joka clause actually refers to the word adjacent to it. In UD Finnish, the actual reference for the relative clause is chosen as the head of the acl:relcl dependency wherever possible. For this reason, the head of the acl:relcl relation can occasionally be a verb.

Annoin hänelle kirjan , joka sitä oli pyytänyt . \n I_gave him the_book , who it had asked_for .
nmod(Annoin-1, hänelle-2)
dobj(Annoin-1, kirjan-3)
acl:relcl(hänelle-2, pyytänyt-8)
punct(pyytänyt-8, ,-4)
nsubj(pyytänyt-8, joka-5)
dobj(pyytänyt-8, sitä-6)
aux(pyytänyt-8, oli-7)
punct(Annoin-1, .-9)

The relativizer is annotated with the standard syntactic role that it plays in the relative clause, such as nsubj or dobj. (Note that this treatment differs from the annotation of relative clauses in previously proposed related schemes, which used specific dependency types (e.g. rel) to mark the relativizer. In particular, in the TDT corpus the basic dependency layer used rel and the second annotation layer identified the actual syntactic role.)

Lapsi , jonka hän sai itkemään , parkui yhä surkeasti . \n The_child , whom he made cry , wailed still miserably .
acl:relcl(Lapsi-1, sai-5)
punct(sai-5, ,-2)
nsubj(itkemään-6, jonka-3)
nsubj(sai-5, hän-4)
xcomp:ds(sai-5, itkemään-6)
punct(sai-5, ,-7)
nsubj(parkui-8, Lapsi-1)
advmod(parkui-8, yhä-9)
advmod(parkui-8, surkeasti-10)
punct(parkui-8, .-11)
Tuon lapsen hän sai itkemään . \n That child he made cry .
det(lapsen-2, Tuon-1)
nsubj(sai-4, hän-3)
xcomp:ds(sai-4, itkemään-5)
nsubj(itkemään-5, lapsen-2)
punct(sai-4, .-6)

Note also that the dependent of this dependency is always the head of the relative phrase, which may or may not be the relative word itself.

Nainen , jonka auto hajosi , seisoo tuolla . \n Lady , whose car broke , stands there .
acl:relcl(Nainen-1, hajosi-5)
punct(hajosi-5, ,-2)
punct(hajosi-5, ,-6)
nmod:poss(auto-4, jonka-3)
nsubj(hajosi-5, auto-4)
nsubj(seisoo-7, Nainen-1)
advmod(seisoo-7, tuolla-8)
punct(seisoo-7, .-9)

Diffs

FinnTreeBank (FI_FTB) uses the universal acl relation instead of the language-specific acl:relcl.

Units, measures and amounts

There are several ways to express amounts. The simplest case is expressing an amount with numbers: three apples, sixteen litres.

kolme litraa \n three litres
nummod(litraa-2, kolme-1)

The semantic head, litraa “litres” in the above example, is selected as the head, and the number is marked as a numeral modifier, nummod. (Morpho-syntactically, the number kolme “three” could also be considered the head, as it determines the case used for the word litra “litre”.) For more information on the internal structure of numerical expressions, see Section 5.12.

Amount can also be expressed with adverbs. This, too, is handled by selecting the semantic head as the head of the structure, that is, the noun.

paljon maitoa \n a_lot_of milk
advmod(maitoa-2, paljon-1)

In addition, amount can be expressed using a nominal, often in expressions such as kuppi kahvia “a cup of coffee” or joku pojista lit. someone from the boys “one of the boys”. In these cases, the first nominal is marked as the head.

Hän joi kupin kahvia . \n He drank a_cup_of coffee .
nsubj(joi-2, Hän-1)
dobj(joi-2, kupin-3)
nmod(kupin-3, kahvia-4)
punct(joi-2, .-5)
Joku pojista voisi auttaa minua . \n Someone from_boys could help me .
nmod(Joku-1, pojista-2)
nsubj(auttaa-4, Joku-1)
aux(auttaa-4, voisi-3)
dobj(auttaa-4, minua-5)
punct(auttaa-4, .-6)

These structures are considered different from the amount expressions with numerals or adverbs, as their inflection behaves differently. Consider the following examples.

Examples

In the first example, both parts of the amount expression inflect as required by the verb kieltäytyä “to refuse”, whereas in the latter case, only the first nominal inflects, signaling that the head, the thing refused in this expression, is the cup. The structure Joku pojista behaves and is annotated similarly.
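The three patterns for amount expressions differ in which word is the head and which dependency type links the two words. The lookup below is a minimal sketch; the function name and the pattern labels are hypothetical, and the annotator is assumed to have already decided which pattern an expression follows.

```python
def amount_analysis(pattern):
    """Return (head, dependency type) for an amount expression.

    pattern: 'numeral' (kolme litraa), 'adverb' (paljon maitoa)
             or 'nominal' (kuppi kahvia, joku pojista).
    """
    analyses = {
        # The noun is the head; the amount word modifies it.
        "numeral": ("noun", "nummod"),
        "adverb": ("noun", "advmod"),
        # The first nominal is the head; the other noun modifies it.
        "nominal": ("first nominal", "nmod"),
    }
    return analyses[pattern]


print(amount_analysis("numeral"))
print(amount_analysis("nominal"))
```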

Two things should be noted about the above analysis of joku pojista lit. someone from the boys “one of the boys”. First, this analysis leads to yksi pojista “one of the boys” being analyzed similarly to joku pojista rather than yksi poika “one boy”.

Yksi pojista juoksi ulos . \n One from_boys ran out .
nsubj(juoksi-3, Yksi-1)
nmod(Yksi-1, pojista-2)
advmod(juoksi-3, ulos-4)
punct(juoksi-3, .-5)

Second, this analysis allows a structure like joku pojista to act as a predicative, as the head of the expression is in nominative.

Se oli joku pojista . \n It was someone from_boys .
nsubj:cop(joku-3, Se-1)
cop(joku-3, oli-2)
nmod(joku-3, pojista-4)
punct(joku-3, .-5)

Diffs

Contrary to the special cases described above, in FI_FTB (FinnTreeBank) the amounts expressed using a nominal are treated similarly to the amounts expressed with a number or an adverb. This means that the semantic nucleus of the phrase is marked as the head in spite of its case (often the partitive or elative case) as in kuppi kahvia “a cup of coffee” or joku pojista “one of the boys”.

Noun phrases without nouns

In UD Finnish, it is possible for a phrase with a head word other than a noun (or pronoun) to act as a noun phrase. Typical cases of this include adjective-headed and participle-headed noun phrases.

Examples

These structures are analyzed as standard noun phrases. For instance, they can be marked as the subject of a clause, or a nominal modifier, regardless of the part of speech of the head word.

Ikkunan takana oli jotain sinistä . \n Window behind was something blue .
case(Ikkunan-1, takana-2)
nmod(oli-3, Ikkunan-1)
nsubj(oli-3, sinistä-5)
det(sinistä-5, jotain-4)
punct(oli-3, .-6)
Onnettomuudessa olleille suositeltiin terapiaa . \n In_accident been(_ones) was_recommended therapy .
nmod(olleille-2, Onnettomuudessa-1)
nmod(suositeltiin-3, olleille-2)
dobj(suositeltiin-3, terapiaa-4)
punct(suositeltiin-3, .-5)

Comparatives and superlatives

This section describes the annotation of comparative and superlative structures, which, in UD Finnish, are considered to include also certain similar structures that do not contain a comparative or superlative wordform.

Comparatives

Structures with comparative adjectives and adverbs may be difficult to annotate: they are often elliptical, and it may be difficult to tell what is being compared with what. To annotate comparative constructions, dependency types advcl and mark are used.

The basic usage of these two types is as follows. The comparative adjective or adverb acts as the head for a advcl dependency, and the element being compared is its dependent. The element being compared also acts as the head for a mark dependency, the dependent of which is a comparative conjunction, nearly always kuin.

Keittiö on pienempi kuin olohuone . \n Kitchen is smaller than livingroom .
nsubj:cop(pienempi-3, Keittiö-1)
cop(pienempi-3, on-2)
advcl(pienempi-3, olohuone-5)
mark(olohuone-5, kuin-4)
punct(pienempi-3, .-6)

Note that the comparative adjective or adverb remains the head of the advcl dependency even if the word order is such that the dependency becomes non-projective.

Matilla on isompi auto kuin Pekalla . \n At_Matti is bigger car than at_Pekka .
nmod:own(on-2, Matilla-1)
nsubj(on-2, auto-4)
amod(auto-4, isompi-3)
advcl(isompi-3, Pekalla-6)
mark(Pekalla-6, kuin-5)
punct(on-2, .-7)

From the previous example it can also be seen that comparative structures are often elliptical in some way. Strictly speaking, the example does not compare Matti and Pekka, but rather their cars, and the car owned by Pekka is not explicitly present in the sentence. As a general rule of thumb, the different kinds of ellipsis present in comparative structures are not marked with null tokens, but rather the available elements are used wherever possible.

It is also possible to make comparisons without the comparative conjunction kuin. In these cases, only the dependency type advcl is used, marking the comparative adjective or adverb as the head, and the element compared as the dependent, just as in the case with the comparative conjunction present.

Olohuone on keittiötä suurempi . \n Livingroom is (than_)kitchen bigger .
nsubj:cop(suurempi-4, Olohuone-1)
cop(suurempi-4, on-2)
advcl(suurempi-4, keittiötä-3)
punct(suurempi-4, .-5)
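The two comparative patterns, with and without kuin, produce the following dependencies. This is a sketch only; the function name and the triple layout (type, head index, dependent index) are assumptions.

```python
def comparative_deps(comparative, standard, kuin=None):
    """Return the dependencies of a comparative construction as triples.

    comparative: index of the comparative adjective or adverb
    standard:    index of the element being compared
    kuin:        index of the comparative conjunction, if present
    """
    deps = [("advcl", comparative, standard)]
    if kuin is not None:
        # The conjunction attaches to the compared element, not to the adjective.
        deps.append(("mark", standard, kuin))
    return deps


print(comparative_deps(3, 5, kuin=4))  # Keittiö on pienempi kuin olohuone
print(comparative_deps(4, 3))          # Olohuone on keittiötä suurempi
```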

Also some structures not involving a comparative adjective or adverb can be marked as comparatives. In order to qualify as a comparative construction, a structure has to contain either a comparative word form or a word form that otherwise semantically entails comparison, such as samanlainen “similar”, sama “same”, erilainen “different” or eri “differing, separate”. (Note that for example the word sama “same” is in fact a pronoun in Finnish.)

Luin saman kirjan kuin Pekka . \n I_read same book as Pekka .
dobj(Luin-1, kirjan-3)
det(kirjan-3, saman-2)
advcl(saman-2, Pekka-5)
mark(Pekka-5, kuin-4)
punct(Luin-1, .-6)

An additional difficulty is posed by the fact that in Finnish, the comparative conjunction kuin can also appear as a subordinating conjunction as well as an adverb. Borderline situations are resolved on a case-by-case basis, considering whether or not there is a comparison involved in the structure and, secondarily, whether the dependent structure is a clause. (Comparative structures can also occasionally be full clauses.)

Superlatives

Superlatives are less problematic than comparatives but deserve some attention nevertheless. The basic case with superlatives is simple: a lone superlative modifying a noun. The superlative form in this case is not marked in any particular way in the syntax annotation, but the structure is annotated similarly to any adjective modifying a noun. The same strategy of not marking the superlative in any particular way is also used in cases where the superlative acts as a predicative.

Suurin paketti oli muiden takana . \n Biggest package was others behind .
amod(paketti-2, Suurin-1)
nsubj(oli-3, paketti-2)
nmod(oli-3, muiden-4)
case(muiden-4, takana-5)
punct(oli-3, .-6)

Often a superlative is modified by a nominal in some manner. A very common phenomenon is a genitive modifier modifying a superlative. For instance, in an expression such as

Suomen paras kokki \n Finland's best cook
nmod:poss(paras-2, Suomen-1)
amod(kokki-3, paras-2)

the cook is the best of those in/of Finland and thus the correct head word for the genitive modifier is paras “best”. Similarly, an ordinal number can act as the head of a genitive modifier. For example, in

Virtasen kuudes mestaruus \n Virtanen's sixth championship
nmod:poss(kuudes-2, Virtasen-1)
nummod(mestaruus-3, kuudes-2)

the championship is the sixth out of those of Virtanen, and thus the genitive modifier should modify the ordinal number.

However, it is still possible for the noun to act as the head word in some cases. For instance, in

Rusakon pahin vihollinen \n The_hare's worst enemy
nmod:poss(vihollinen-3, Rusakon-1)
amod(vihollinen-3, pahin-2)

the enemy is not the worst of the hare, but rather it is an enemy of the hare, and it is the worst enemy. Thus, the head word of the genitive modifier should be the noun vihollinen “enemy”.

As a rule of thumb, if the noun phrase containing the genitive modifier can be turned into a copular clause in the following fashion, then the genitive modifier should modify the superlative or ordinal number.

Examples

are perfectly valid, but

Examples

is questionable at best. Thus, in Suomen paras kokki and Virtasen kuudes mestaruus, the genitive modifier is considered to modify the superlative adjective, but in rusakon pahin vihollinen, it is considered to modify the noun directly.

In this context, it should also be noted that, in addition to superlatives, certain other adjectives can act as the head of a genitive modifier. These adjectives can be semantically superlative-like, such as viimeinen “last”, but there are also many others, such as oma “own”, kaltainen “-like”, välinen “between (adj.)”, and vastainen “against (adj.)”.

Also other nominal modifiers are possible, expressing the set of beings from which the objects are drawn when making the comparison. These are treated similarly to the genitive modifiers, making the superlative wordform the head of the modifier if the modifier expresses the set of beings to draw from.

Kukista kaunein oli ikkunalaudalla . \n From_the_flowers most_beautiful was on_windowsill .
nmod(kaunein-2, Kukista-1)
nsubj(oli-3, kaunein-2)
nmod(oli-3, ikkunalaudalla-4)
punct(oli-3, .-5)

Note how in the previous example the phrase kukista kaunein can act as a noun phrase (it is the subject of the clause), even though its head word is an adjective.

Subordinate clauses and expressions of time

Many subordinate clauses, especially ones starting with the conjunction kun “when”, come with an adverbial, usually expressing time. Consider the following examples.

Examples

It is often unclear where these time adverbials should be attached. On the one hand, they seem to modify the main clause, expressing when the action of the main clause takes place. On the other hand, they could also modify the subordinate clause, being a part of the time condition given in the subordinate clause. A third option would be to make the time adverbial depend on the subordinating conjunction, yielding either a multi-part conjunction or a conjunction with an adverbial modifier.

In UD Finnish, a very limited number of these cases are considered especially tightly bound with the subordinating conjunction. These cases are considered multi-part subordinating conjunctions and listed as such in the documentation for mark. Otherwise, these adverbials are consistently made dependents of the subordinating conjunction.

Tulen sinne heti , kun pääsen . \n I_will_come there right_away , when I_can .
advmod(Tulen-1, sinne-2)
advcl(Tulen-1, pääsen-6)
advmod(kun-5, heti-3)
punct(kun-5, ,-4)
mark(pääsen-6, kun-5)
punct(Tulen-1, .-7)

However, it should be noted that not all subordinate clauses are dependents of the main verb. As discussed in the documentation for ccomp, clausal complements can depend on nouns, pronouns or adverbs. Similar situations can occur with subordinate clauses that are modifiers, and they are analyzed similarly. Most commonly this occurs with the pronoun se “it”.

Hänet säikäytti se , kun poika putosi hevosen selästä . \n Him scared it , when boy fell horse's from_back .
dobj(säikäytti-2, Hänet-1)
nsubj(säikäytti-2, se-3)
advcl(se-3, putosi-7)
punct(putosi-7, ,-4)
mark(putosi-7, kun-5)
nsubj(putosi-7, poika-6)
nmod(putosi-7, selästä-9)
nmod:poss(selästä-9, hevosen-8)
punct(säikäytti-2, .-10)

Diffs

To prevent pure function words from having dependents when possible, the first of the three options has been chosen in FinnTreeBank (FI_FTB). The time adverbial modifies the main clause and the following subordinate clause modifies the adverbial. If the time adverbial could not stand on its own, a multi-part subordinating conjunction is considered (e.g. ennen kuin “before”).

Subjects and objects of a noun

In Finnish, it is possible for certain nouns which either are direct derivations of a verb or otherwise have a verb counterpart (verbivastineellinen substantiivi ISK §560; in Finnish) to take a subject- or object-like complement. Both of these are identical in form to more general genitive modifiers of a noun, marked with the dependency type nmod:poss in the UD Finnish scheme.

talon katto \n house(gen.) roof(N)
nmod:poss(katto-2, talon-1)

Genitive objects of a noun are marked with the type nmod:gobj, which is a subtype of the more general genitive-modifier type nmod:poss. Both nominal derivations and other nouns with verb counterparts can take a genitive object, with the exception of JA-derivations, the genitive modifier of which is never considered an object in UD Finnish (talon rakentaja “the builder of the house”).

talon rakentaminen \n house(gen.) building(N+deriv.)
nmod:gobj(rakentaminen-2, talon-1)

Genitive subjects, in turn, are marked using the nmod:gsubj dependency type, also a subtype of nmod:poss. Only nouns that are marked as derivations of a verb in the morphological tagging receive an nmod:gsubj dependent.

maljakon putoaminen \n vase(gen.) falling(N+deriv.)
nmod:gsubj(putoaminen-2, maljakon-1)

References

Diffs

In the current release of FinnTreeBank (FI_FTB) only minen-derivations of nouns can take a genitive object or subject. The information about being a verb-derived nominal does not occur in the morphological tagging of these nouns.

Numerical expressions

The dependency type compound is used for numerical expressions. Generally, with multi-token numerical expressions, the rightmost token of the expression is considered the head and the dependencies are chained.

Poikasia on yleensä 3 - 5 . \n Youngsters are usually 3 to 5 .
nsubj:cop(5-6, Poikasia-1)
cop(5-6, on-2)
advmod(5-6, yleensä-3)
compound(--5, 3-4)
compound(5-6, --5)
punct(5-6, .-7)
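
The default chaining described above can be sketched in a few lines of Python. This is only an illustration, not part of any UD tooling: the function name chain_compound and its start parameter are hypothetical, and the sketch covers only flat expressions with no internal structure.

```python
def chain_compound(tokens, start=1):
    """Return chained `compound` relations for a flat numerical expression.

    tokens: the tokens of the expression, left to right.
    start:  1-based sentence position of the first token.
    Each token depends on the token to its right, so the rightmost
    token ends up as the head of the whole expression.
    """
    relations = []
    for offset in range(len(tokens) - 1):
        dep = f"{tokens[offset]}-{start + offset}"
        head = f"{tokens[offset + 1]}-{start + offset + 1}"
        relations.append(f"compound({head}, {dep})")
    return relations


# "3 - 5" occupies positions 4..6 in "Poikasia on yleensä 3 - 5 ."
for relation in chain_compound(["3", "-", "5"], start=4):
    print(relation)
# compound(--5, 3-4)
# compound(5-6, --5)
```

The attachment of the head of the expression (here 5-6) to the rest of the sentence is outside the scope of the sketch.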

However, it is possible that rather complex expressions are considered numerical, and in these cases the structure of the expression is also marked, showing the parts of which the expression consists. Often these complex expressions involve dates, which are also considered numerical expressions in UD Finnish.

3. joulukuuta 1510 - 15. kesäkuuta 1579 \n 3rd December 1510 to 15th June 1579
compound(joulukuuta-2, 3.-1)
compound(1510-3, joulukuuta-2)
compound(--4, 1510-3)
compound(1579-7, --4)
compound(kesäkuuta-6, 15.-5)
compound(1579-7, kesäkuuta-6)

Dates can be expressed in many different forms, and all full dates are considered numerical expressions in UD Finnish, including those where some or all parts of the date are written out in words. Even partial dates such as

3. joulukuuta \n 3rd December
compound(joulukuuta-2, 3.-1)

are considered numerical expressions. However, year expressions such as the following are not considered dates in UD Finnish, and thus not complex numerical expressions.

sanoi vuonna 1996 \n said in_the_year 1996
nmod(sanoi-1, vuonna-2)
nummod(vuonna-2, 1996-3)
tapahtui kesällä 1972 \n happened in_the_summer 1972
nmod(tapahtui-1, kesällä-2)
nummod(kesällä-2, 1972-3)

If a date expression has a clear internal syntactic structure, this structure is annotated instead of the default chain of compound dependencies.

syyskuun 3. ja 4. päivä \n September's 3rd and 4th day
nmod:poss(3.-2, syyskuun-1)
cc(3.-2, ja-3)
conj(3.-2, 4.-4)
nummod(päivä-5, 3.-2)

If a date has a more specific time (such as kello kuudelta “at six o’clock”) attached to it, the date is considered the head of the expression, and the more specific time depends on it. Clock expressions, alone or in conjunction with a date, are not considered dates or numerical expressions in UD Finnish.

6. joulukuuta kello 18 \n 6th December o'clock 18
compound(joulukuuta-2, 6.-1)
nmod(joulukuuta-2, kello-3)
nummod(kello-3, 18-4)

In addition to dates, there is one more case of numerical expressions that deserves attention: numerical expressions with multiple units. If a single amount expression involves multiple units, the units are considered a compound unit, so to speak, and are combined using the dependency type compound:nn.

2 kg 315 g
nummod(kg-2, 2-1)
compound:nn(g-4, kg-2)
nummod(g-4, 315-3)

In rare cases, however, the previous situation may occur with the rightmost part of the expression lacking the unit. These cases are annotated flatly as numerical expressions, with no compound units.

2 kg 315
compound(kg-2, 2-1)
compound(315-3, kg-2)
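
The two treatments of multi-unit amounts can be contrasted in a short Python sketch. The function name annotate_two_part_amount is hypothetical, and the sketch deliberately handles only the two shapes shown in the examples above (number unit number unit, and number unit number); the guidelines do not say how longer expressions would be attached, so those are rejected rather than guessed.

```python
def annotate_two_part_amount(tokens):
    """Annotate a two-part amount expression.

    Four tokens (number, unit, number, unit): each number is a nummod
    of its unit and the units are joined with compound:nn.
    Three tokens (the rightmost unit is missing): a flat chain of
    compound relations with the rightmost token as head.
    """
    ids = [f"{token}-{pos}" for pos, token in enumerate(tokens, start=1)]
    if len(ids) == 4:
        num1, unit1, num2, unit2 = ids
        return [
            f"nummod({unit1}, {num1})",
            f"compound:nn({unit2}, {unit1})",
            f"nummod({unit2}, {num2})",
        ]
    if len(ids) == 3:
        return [f"compound({ids[i + 1]}, {ids[i]})" for i in range(2)]
    raise ValueError("only the two shapes from the examples are covered")


print(annotate_two_part_amount("2 kg 315 g".split()))
print(annotate_two_part_amount("2 kg 315".split()))
```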

Diffs

In FinnTreeBank (FI_FTB), the dependency type compound is not used for numerical expressions. If no clear internal syntactic structure is discernible in a numerical expression, the rightmost token of the expression is considered the head of a chain consisting of nummod or nmod dependents. Correspondingly, numerical expressions with multiple units are annotated using a conj relation.

Participial modifiers and predicatives

In connection with participial modifiers, predicatives are given a slightly different treatment than in other contexts. In a regular copular clause, the analysis is as follows.

Eeva on raskaana . \n Eeva is pregnant .
nsubj:cop(raskaana-3, Eeva-1)
cop(raskaana-3, on-2)
punct(raskaana-3, .-4)

However, if the same analysis were applied in a situation where olla acts as a participial modifier, this would result in a non-tree structure:

Raskaana oleva nainen on nälkäinen . \n Pregnant being woman is hungry .
cop(Raskaana-1, oleva-2)
nsubj:cop(Raskaana-1, nainen-3)
nsubj:cop(nälkäinen-5, nainen-3)
cop(nälkäinen-5, on-4)
punct(nälkäinen-5, .-6)

Therefore, in conjunction with participial modifiers, copular verbs are analyzed similarly to regular verbs, in order to avoid non-tree structures.

Raskaana oleva nainen on nälkäinen . \n Pregnant being woman is hungry .
advmod(oleva-2, Raskaana-1)
acl(nainen-3, oleva-2)
nsubj:cop(nälkäinen-5, nainen-3)
cop(nälkäinen-5, on-4)
punct(nälkäinen-5, .-6)
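
The tree violation in the rejected analysis can be made explicit with a small check. The following Python sketch parses relations written in the rel(head-i, dep-j) notation used on this page and reports every token that has more than one head. The function name multi_headed and the regular expression are assumptions made for illustration; the check covers only the single-head property, not cycles or connectedness.

```python
import re
from collections import defaultdict

# One relation per line, e.g. "nsubj:cop(nälkäinen-5, nainen-3)".
RELATION = re.compile(r"^(?P<rel>\S+?)\((?P<head>\S+), (?P<dep>\S+)\)$")


def multi_headed(lines):
    """Return {dependent: [heads]} for every token with more than one head."""
    heads = defaultdict(list)
    for line in lines:
        match = RELATION.match(line.strip())
        if match is None:
            raise ValueError(f"not a relation: {line!r}")
        heads[match["dep"]].append(match["head"])
    return {dep: hs for dep, hs in heads.items() if len(hs) > 1}


copular_analysis = [
    "cop(Raskaana-1, oleva-2)",
    "nsubj:cop(Raskaana-1, nainen-3)",
    "nsubj:cop(nälkäinen-5, nainen-3)",
    "cop(nälkäinen-5, on-4)",
    "punct(nälkäinen-5, .-6)",
]
verbal_analysis = [
    "advmod(oleva-2, Raskaana-1)",
    "acl(nainen-3, oleva-2)",
    "nsubj:cop(nälkäinen-5, nainen-3)",
    "cop(nälkäinen-5, on-4)",
    "punct(nälkäinen-5, .-6)",
]

print(multi_headed(copular_analysis))  # nainen-3 has two heads
print(multi_headed(verbal_analysis))   # empty: every token has one head
```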

The same rule is applied to certain special constructions that are normally considered passive structures but can also appear in conjunction with participial modifiers. Here the application of the rule results in two chained participial modifiers.

Resurssit ovat käytettävissä . \n Resources are usable .
dobj(käytettävissä-3, Resurssit-1)
auxpass(käytettävissä-3, ovat-2)
punct(käytettävissä-3, .-4)
Käytettävissä olevat resurssit ovat rajalliset . \n Usable being resources are limited .
xcomp(olevat-2, Käytettävissä-1)
acl(resurssit-3, olevat-2)
nsubj:cop(rajalliset-5, resurssit-3)
cop(rajalliset-5, ovat-4)
punct(rajalliset-5, .-6)

Diffs

As the passive-verb-derived, idiomatic structures olla tehtävissä / tehtävillä (“to be doable”) are considered root (or other) + advcl in FinnTreeBank (FI_FTB), the rule relating to certain passive structures does not apply to FinnTreeBank.

Necessive structures and clausal subjects

A clause can act as a subject to another clause (as well as an object, but these are marked as clausal complements, ccomp), in which case it should be marked as a clausal subject, csubj, or, if the main clause is copular, a clausal copular subject, csubj:cop. However, in the case of a clausal copular subject, it may be difficult to determine whether a clause is, in fact, the subject of another clause, as the construction is similar to a necessive structure. Consider the following example.

Examples

At first glance, it seems that the clause syödä hyvin is the subject of on tärkeää. However, in UD Finnish, this is not considered a clausal subject. Instead, it is considered a necessive structure, as on tärkeää can be given a subject in the genitive form:

Examples

The whole structure is considered a single unit, and the genitive subject is considered the subject of the latter verb (which expresses what it is that is necessary).

Hänen on pakko mennä kotiin . \n He has to go home .
nsubj(mennä-4, Hänen-1)
cop(pakko-3, on-2)
xcomp:ds(pakko-3, mennä-4)
nmod(mennä-4, kotiin-5)
punct(pakko-3, .-6)

The name necessive structure comes from the fact that these structures often express the necessity of doing something, but it does not mean that all of these structures would have such a meaning; for example, on vaikea(a) “it is difficult” is a necessive structure the meaning of which does not express necessity. Common necessive structures include expressions such as on pakko, on tärkeää, on oleellista and on välttämätöntä. They usually, but not always, involve the verb olla and an adjective. There are also some verbs, such as kannattaa “be worth it” and kuulua “be supposed to”, that are analyzed in a necessive manner.

FIGURE MISSING

If it is not possible to insert a genitive subject into the clause, then the structure is considered a clausal subject case.

Examples

On mahtavaa mennä ulos . \n (it)_is splendid to_go out .
cop(mahtavaa-2, On-1)
csubj:cop(mahtavaa-2, mennä-3)
advmod(mennä-3, ulos-4)
punct(mahtavaa-2, .-5)

Note that due to the copular nature of the main clause, the clausal subjects in these clauses which resemble necessive structures are in fact clausal copular subjects. There are also other clausal subjects which cannot be confused with necessive structures.

Hänen aikomuksenaan oli mennä ulos . \n His intention(essive) was to_go out .
nmod:poss(aikomuksenaan-2, Hänen-1)
nmod(oli-3, aikomuksenaan-2)
csubj(oli-3, mennä-4)
advmod(mennä-4, ulos-5)
punct(oli-3, .-6)

Passive structures and zeroth person constructions

The Finnish language has two notable cases of subjectless expressions: the passive voice and the zeroth person. In most cases, distinguishing these two is rather simple, as the zeroth person uses the same verb forms as the third person, whereas there is a morphological passive form that is used in constructions considered passive. However, there are at least two particular phenomena that deserve special attention. First, the on tehtävä structure is worth examining:

Examples

The form tehtävä is morphologically a passive participle of the verb tehdä “to do”. Still, on tehtävä can take a subject, which could perhaps point towards the subjectless version being zeroth person after all.

Examples

In UD Finnish, we use the presence or absence of a subject as a cue to whether the structure is passive or not. If a subject is present, the structure is marked as an active construction, and if not, it is assumed to be passive.

Tämä työ on tehtävä tänään . \n This work has_to_be done today .
det(työ-2, Tämä-1)
dobj(tehtävä-4, työ-2)
auxpass(tehtävä-4, on-3)
advmod(tehtävä-4, tänään-5)
punct(tehtävä-4, .-6)
Matin on tehtävä työ tänään . \n Matti has_to do work today .
nsubj(tehtävä-3, Matin-1)
aux(tehtävä-3, on-2)
dobj(tehtävä-3, työ-4)
advmod(tehtävä-3, tänään-5)
punct(tehtävä-3, .-6)
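
The subject cue can be written down as a small decision function. The Python sketch below is illustrative only: the function name on_tehtava_relations and its parameters are hypothetical, and it produces just the relations that differ between the two analyses plus the object.

```python
def on_tehtava_relations(participle, auxiliary, obj, subject=None):
    """Core relations of an `on tehtävä` clause.

    With a (genitive) subject the clause is treated as active:
    the subject is nsubj and `on` is a plain aux.
    Without a subject the clause is assumed to be passive:
    `on` is auxpass. The object is dobj in both cases.
    """
    relations = []
    if subject is not None:
        relations.append(f"nsubj({participle}, {subject})")
        relations.append(f"aux({participle}, {auxiliary})")
    else:
        relations.append(f"auxpass({participle}, {auxiliary})")
    relations.append(f"dobj({participle}, {obj})")
    return relations


# "Tämä työ on tehtävä tänään ." (no subject, passive)
print(on_tehtava_relations("tehtävä-4", "on-3", "työ-2"))
# "Matin on tehtävä työ tänään ." (genitive subject, active)
print(on_tehtava_relations("tehtävä-3", "on-2", "työ-4", subject="Matin-1"))
```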

Second, the on tehtävissä structure deserves a mention. Like tehtävä, tehtävissä is a passive participle; in fact, the only difference between the two forms is that tehtävissä is the plural inessive form of the base participle tehtävä. The annotation of on tehtävissä follows a strategy similar to the previous one. In general, it is assumed that the structure is passive.

FIGURE MISSING

Unlike on tehtävä, on tehtävissä cannot take a genitive form subject:

Examples

However, in some cases it is possible to attach a possessive suffix to the participle and use a corresponding personal pronoun as a nominal modifier (this is a rare phenomenon and not seen with many verbs). This case is analyzed as an active structure.

FIGURE MISSING

However, as can be seen from the example, no subject is marked, but rather an object. It is still understood that means are the object of using in this example.

Morphological distinctions

Distinctions between certain dependency types, most commonly between participial modifiers (acl) and adjectival modifiers (amod) as well as adverbial modifiers (advmod) and nominal modifiers (nmod), are based on the corresponding morphological distinction, which can sometimes be rather difficult. This section describes heuristics used to make these two most common morphology-based distinctions. Some of these heuristics resemble those used in the Penn Treebank.

Participles versus adjectives

The distinction between verb participles and adjectives is difficult in several languages, and Finnish is no exception. In UD Finnish, this distinction affects the syntax annotation of mainly two kinds of structures. First, it affects the choice between the dependency types acl (participial modifier) and amod (adjectival modifier).

Tunnettu näyttelijä John Travolta \n Well-known actor John Travolta
amod/acl?(näyttelijä-2, Tunnettu-1)
compound:nn(John-3, näyttelijä-2)
name(John-3, Travolta-4)

Second, it affects whether certain structures should be marked as copular clauses, or alternatively, as passive clauses in the present or past perfect form (perfekti and pluskvamperfekti in Finnish grammar). The same structure can be considered copular if the head word is an adjective, or a passive clause if the head word is considered a passive participle.

Uiminen järvessä on kielletty . \n Swimming in_lake is\/has_been forbidden .
nsubj:cop/dobj?(kielletty-4, Uiminen-1)
nmod(Uiminen-1, järvessä-2)
cop/auxpass?(kielletty-4, on-3)
punct(kielletty-4, .-5)

Some words have several possible readings, and it is fairly common that a word can be given either a participial reading or an adjectival one. The following heuristics are used when deciding whether a word is an adjective or a participle.

If a word can receive comparative and superlative forms, it is likely to be an adjective. For instance, the word tunnettu “well-known”, which has both an adjectival and a participial reading, inflects in these forms: tunnettu, tunnetumpi, tunnetuin.

If, on the other hand, the word is modified by for instance a nominal or adverbial modifier, it is likely to be a verb participle. For instance, with the word tunnettu, the following contexts would be possible:

Examples

Thus, it is the case that the same word can act both as an adjective and as a verbal participle, depending on context, and the decisions are made on a case-by-case basis. As a third heuristic used in the decision, the annotators are asked to consider whether someone is actively doing something in the example under consideration. If so, then the word is likely a verbal participle, otherwise it is an adjective. Consider the following examples:

Examples

In the first example, the husband is not actively doing anything, he simply is going to be Maija’s husband in the future. Thus tuleva in this example would be considered an adjective. In the second example, he is actively coming from the direction of Turku, and thus tuleva here would be a verbal participle.

As a rule of thumb, if an adjectival reading is possible in a given context, it is generally preferred. For instance, in tunnettu näyttelijä “well-known actor”, if it is not specified by whom or for what the actor is known, it is assumed that the adjectival reading is intended. Similarly, in uiminen on kielletty “swimming is forbidden”, if the context does not reveal that there has been active forbidding of the swimming (the example is genuinely ambiguous), then it is assumed that it is a property of the swimming that it is forbidden.

Adverbs versus nouns

Due to the fact that certain Finnish adverbs have a partial case inflection, it is sometimes difficult to decide whether a word is an inflected form of a noun (or adjective), or rather an adverb. For instance, the word pääasiassa “mainly” could be analyzed as an adverb, or alternatively, as an inflected form of the noun pääasia “the main thing”.

This distinction affects the choice between the dependency types advmod (adverb modifier) and nmod (nominal modifier). Additionally, it can affect whether a word can be marked as a predicative (if it is an adverb) and thus the head of the clause, or whether it should be marked as a nominal modifier of the verb olla. In the latter case, the structure of the whole clause is affected by the decision.

Pääasiassa tämä vaikuttaa koron suuruuteen . \n Mainly this affects interest's level .
advmod/nmod?(vaikuttaa-3, Pääasiassa-1)
nsubj(vaikuttaa-3, tämä-2)
nmod(vaikuttaa-3, suuruuteen-5)
nmod:poss(suuruuteen-5, koron-4)
punct(vaikuttaa-3, .-6)
Elisa ja Elias ovat naimisissa . \n Elisa and Elias are married .
cc(Elisa-1, ja-2)
conj(Elisa-1, Elias-3)
nsubj:cop?(naimisissa-5, Elisa-1)
cop?(naimisissa-5, ovat-4)
punct(naimisissa-5, .-6)
Matti oli humalassa . \n Matti was drunk .
nsubj?(oli-2, Matti-1)
nmod?(oli-2, humalassa-3)
punct(oli-2, .-4)

Again, the main source of information while annotating is the morphological analysis of the word, but occasionally it is possible that the syntactic annotation uses a reading that has been omitted. It is less common that both an adverb and noun reading would be available. Decision heuristics are needed here as well.

The main deciding factor between a noun and an adverb reading is whether there exists a corresponding noun in its baseform and whether and to what degree the word under question is related to that noun. For example, in the case of pääasiassa “mainly” there exists a corresponding noun pääasia “main thing”, but in the case of naimisissa “married” the only candidate for such a noun would be naiminen, which could technically be translated as “marrying”, but is in fact more often used (usually in spoken language) in the meaning “having sex”. As for humalassa “drunk”, there is a candidate noun, humala, which can be used to refer to the state of being drunk.

As a test used to see whether the possible candidate noun is closely (enough) related to the word under question, annotators are asked to reflect on the hypothetical baseform of the noun reading and on whether it could be imagined to be involved in the current sentence. For instance, is there a main thing (pääasia) in which the interest rate is affected? Is there a state of being married (“naimiset”) in which Elisa and Elias are? Is there a state of being drunk (humala) in which Matti is? The answer to the first two questions is no, and thus pääasiassa and naimisissa are considered adverbs. The answer to the third question, however, is yes, and therefore the word humalassa is analyzed as an inflected form of the noun humala.

References

Attaching punctuation

Dependencies signaling punctuation are labeled with the dependency type punct, and the main rule is that the dependency should be attached to that element which it delimits. Thus, sentence-delimiting punctuation, such as “.”, “!” or “?” should be attached to the main verb (or predicative) of the sentence.

Söin jäätelöä . \n I_ate ice-cream .
dobj(Söin-1, jäätelöä-2)
punct(Söin-1, .-3)

According to the same rule, the comma delimiting a subordinate clause should be attached to the head word of said clause.

Jos sataa , menen sisälle . \n If it_rains , I_go inside .
mark(sataa-2, Jos-1)
punct(sataa-2, ,-3)
advcl(menen-4, sataa-2)
advmod(menen-4, sisälle-5)
punct(menen-4, .-6)

If there are several subordinate clauses within each other and the punctuation could delimit any of them, the shortest-spanning (closest) clause is selected.

Jos syöt sieniä , jotka ovat myrkyllisiä , kuolet . \n If you_eat mushrooms , that are poisonous , you_die .
mark(syöt-2, Jos-1)
dobj(syöt-2, sieniä-3)
acl:relcl(sieniä-3, myrkyllisiä-7)
nsubj:cop(myrkyllisiä-7, jotka-5)
punct(myrkyllisiä-7, ,-4)
cop(myrkyllisiä-7, ovat-6)
punct(myrkyllisiä-7, ,-8)
advcl(kuolet-9, syöt-2)
punct(kuolet-9, .-10)
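
The shortest-span rule can be sketched as follows in Python. The function name closest_clause is hypothetical, and two simplifications are assumptions of the sketch rather than part of the guidelines: clause spans are given as first and last token positions without the delimiting punctuation, and a punctuation token is taken to be a candidate delimiter of a clause when it sits directly before or after that span.

```python
def closest_clause(punct_pos, clauses):
    """Pick the clause head that a punctuation token attaches to.

    punct_pos: 1-based position of the punctuation token.
    clauses:   {head: (first, last)} token spans of candidate clauses.
    Among the clauses bordering the punctuation token, the one with
    the shortest span wins. Returns None if no clause borders it.
    """
    candidates = [
        (last - first, head)
        for head, (first, last) in clauses.items()
        if punct_pos in (first - 1, last + 1)
    ]
    if not candidates:
        return None
    return min(candidates)[1]


# "Jos syöt sieniä , jotka ovat myrkyllisiä , kuolet ."
clauses = {
    "syöt-2": (1, 7),         # Jos ... myrkyllisiä
    "myrkyllisiä-7": (5, 7),  # jotka ovat myrkyllisiä
}
print(closest_clause(4, clauses))  # myrkyllisiä-7
print(closest_clause(8, clauses))  # myrkyllisiä-7
```

The second comma borders both clauses, and the shorter relative clause is selected, matching the annotation above.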

In coordinations, the punctuation symbols (usually commas) are treated similarly to the coordinating conjunction and attached to the head of the coordination, which is the first coordinated element.

kivet , kannot ja männynkävyt \n rocks , stumps and pinecones
punct(kivet-1, ,-2)
conj(kivet-1, kannot-3)
cc(kivet-1, ja-4)
conj(kivet-1, männynkävyt-5)
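
The attachment pattern for a flat coordination can be sketched in Python. The function name coordination_relations is hypothetical, and the sketch assumes that every token in the span is either a coordinated element, a comma, or a coordinating conjunction; nested or modified conjuncts are not handled.

```python
def coordination_relations(tokens, conjunct_positions):
    """Attach conjuncts, commas and conjunctions to the first conjunct.

    tokens:             tokens of the coordination, left to right.
    conjunct_positions: 1-based positions of the coordinated elements.
    """
    head_pos = conjunct_positions[0]
    head = f"{tokens[head_pos - 1]}-{head_pos}"
    relations = []
    for pos, token in enumerate(tokens, start=1):
        if pos == head_pos:
            continue  # the first conjunct is the head itself
        if pos in conjunct_positions:
            rel = "conj"
        elif token == ",":
            rel = "punct"
        else:
            rel = "cc"  # assumed to be a coordinating conjunction
        relations.append(f"{rel}({head}, {token}-{pos})")
    return relations


for relation in coordination_relations(
    ["kivet", ",", "kannot", "ja", "männynkävyt"], [1, 3, 5]
):
    print(relation)
```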

Punctuation related to coordination-like parataxis, that is, parataxis used in connection with a semicolon, colon or dash, is attached as in coordinations.

Matti tuli töistä ; Maija oli jo kotona . \n Matti came from_work ; Maija was already home .
nsubj(tuli-2, Matti-1)
nmod(tuli-2, töistä-3)
punct(tuli-2, ;-4)
parataxis(tuli-2, oli-6)
nsubj(oli-6, Maija-5)
advmod(oli-6, jo-7)
advmod(oli-6, kotona-8)
punct(tuli-2, .-9)

Punctuation with parataxis of the direct speech type is attached to the first element.

" Älä sotke itseäsi " , äiti sanoi . \n " Don't mess yourself " , mother said .
neg(sotke-3, Älä-2)
dobj(sotke-3, itseäsi-4)
punct(sotke-3, "-1)
punct(sotke-3, "-5)
punct(sotke-3, ,-6)
parataxis(sotke-3, sanoi-8)
nsubj(sanoi-8, äiti-7)
punct(sotke-3, .-9)

Single and double quotes as well as parentheses are attached to the head of the quoted/parenthetical clause or phrase. Dashes signifying quotes are also attached to the head of the quote.

Illan elokuva on " Kuninkaan puhe " . \n Tonight's movie is " The_King's speech " .
nmod:poss(elokuva-2, Illan-1)
nsubj:cop(puhe-6, elokuva-2)
cop(puhe-6, on-3)
punct(puhe-6, "-4)
nmod:poss(puhe-6, Kuninkaan-5)
punct(puhe-6, "-7)
punct(puhe-6, .-8)
Matikainen ( s. 1943 ) on ammatiltaan kirjailija . \n Matikainen ( born 1943 ) is by_profession author .
nsubj:cop(kirjailija-8, Matikainen-1)
acl(Matikainen-1, s.-3)
punct(s.-3, (-2)
nmod(s.-3, 1943-4)
punct(s.-3, )-5)
cop(kirjailija-8, on-6)
nmod(kirjailija-8, ammatiltaan-7)
punct(kirjailija-8, .-9)
- Älä sotke itseäsi , sanoi äiti . \n - Don't mess yourself , said mother .
punct(sotke-3, --1)
neg(sotke-3, Älä-2)
dobj(sotke-3, itseäsi-4)
punct(sotke-3, ,-5)
parataxis(sotke-3, sanoi-6)
nsubj(sanoi-6, äiti-7)
punct(sotke-3, .-8)

If the quotes or parentheses contain two or more items, such as parts of a coordination, then the punctuation is attached to the closest enclosed element, so as to avoid unnecessary non-projectivity.

Hän pitää kirjoista ( ja näytelmistä ) . \n He likes books ( and plays ) .
nsubj(pitää-2, Hän-1)
dobj(pitää-2, kirjoista-3)
cc(kirjoista-3, ja-5)
conj(kirjoista-3, näytelmistä-6)
punct(pitää-2, .-8)
punct(ja-5, (-4)
punct(näytelmistä-6, )-7)

Punctuation can also delimit short additions, such as nominal modifiers or appositions, and in such cases, the punctuation should be attached to the head of the addition.

Matti Tamminen , professori \n Matti Tamminen , the_professor
name(Matti-1, Tamminen-2)
appos(Matti-1, professori-4)
punct(professori-4, ,-3)
Lähden matkalle , ainakin viikoksi . \n I_am_going on_a_trip , at_least for_a_week .
nmod(Lähden-1, matkalle-2)
nmod(Lähden-1, viikoksi-5)
punct(Lähden-1, .-6)
punct(viikoksi-5, ,-3)
advmod(viikoksi-5, ainakin-4)

Finally, list item markers such as bullets of a bulleted list are marked as punctuation attached to the head of the list item.

* Käy kaupassa . \n * Visit store .
punct(Käy-2, *-1)
punct(Käy-2, .-4)
nmod(Käy-2, kaupassa-3)