This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

Specific constructions

Clausal structures

Reflexive pronouns

Czech has a reflexive personal pronoun that takes different forms in different cases and these forms differ from the normal, irreflexive pronouns:

Case:    Gen    Dat    Acc    Loc    Ins
Clitic:  -      si     se     -      -
Full:    sebe   sobě   sebe   sobě   sebou

The clitic forms se, si are very frequent and serve various purposes. Their default function is to represent an object that is identical to the subject of the same verb. The test is whether they could be substituted by a normal personal pronoun. Such instances are attached to the verb as dobj or iobj.

The Czech reflexive pronoun is also used in reciprocal actions where other languages use a special reciprocal pronoun. These instances are still attached as dobj or iobj, respectively:

If the reflexive pronoun can be substituted by another nominal but it is not a core argument (object) of the verb, it will be attached as nmod.

The reflexive pronoun can be used to form a passive construction. This is called the reflexive passive; there is also the “normal” passive built with the passive participle and the auxiliary verb být “to be”. A reflexive pronoun that forms a reflexive passive is attached as auxpass:reflex.

There are inherently reflexive verbs, i.e. the verb always occurs with a reflexive pronoun, and the pronoun cannot be replaced by a non-reflexive pronoun or any other nominal.

With these verbs, the reflexive pronoun is attached as expl.

If a reflexive verb (inherently reflexive or not) has been turned into a verbal noun, the reflexive pronoun is attached to the noun as nmod:

Finally, the dative reflexive si is sometimes used in situations where it is redundant. Such instances are attached as discourse:
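The attachment rules for the reflexive pronoun listed in this section can be summarized as a small lookup table. The following Python sketch is only an illustration: the key names and the function candidate_relations are hypothetical and do not come from any UD tool. Where the text allows two relations (dobj or iobj), both are returned, because the choice depends on the argument structure of the particular verb.

```python
from typing import Dict, Tuple

# Candidate UD v1 relations for each use of the Czech reflexive pronoun,
# as described in the text. The key names are hypothetical labels
# invented for this sketch.
REFLEXIVE_RELATIONS: Dict[str, Tuple[str, ...]] = {
    "core_object": ("dobj", "iobj"),          # substitutable by a personal pronoun
    "reciprocal": ("dobj", "iobj"),           # reciprocal action
    "non_core_nominal": ("nmod",),            # substitutable, but not a core argument
    "reflexive_passive": ("auxpass:reflex",),
    "inherently_reflexive": ("expl",),        # pronoun cannot be replaced
    "verbal_noun": ("nmod",),                 # attached to the verbal noun
    "redundant_dative": ("discourse",),       # redundant dative "si"
}


def candidate_relations(use: str) -> Tuple[str, ...]:
    """Return the relation(s) the text lists for the given use."""
    try:
        return REFLEXIVE_RELATIONS[use]
    except KeyError:
        raise ValueError(f"unknown reflexive use: {use!r}") from None


if __name__ == "__main__":
    for use in REFLEXIVE_RELATIONS:
        print(use, "->", " or ".join(candidate_relations(use)))
```

Note that the verbal-noun case applies whether or not the underlying verb is inherently reflexive, so an annotator would check it before the other cases.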

Adjectival and adverbial constructions

Comparatives (degree)

Unlike in English, most Czech adjectives and adverbs have morphological comparative and superlative forms (see the Degree feature): chytrý “smart”, chytřejší “smarter”, nejchytřejší “smartest”. Periphrastic constructions such as English more intelligent cannot be completely excluded but they are infrequent and often deemed poor style: inteligentnější is preferred over více inteligentní. The exception is when the adjective or adverb applies less to the entity being compared than to the entity being compared to: méně inteligentní “less intelligent” is the only way of reversing the comparison. Equality comparisons are also periphrastic.

To keep the analyses of the morphological and the periphrastic cases parallel (and also to keep the analyses parallel cross-linguistically), in the periphrastic examples the entity compared to still modifies the adjective and not the adverb:

If a property is compared to a clause, the clause is attached as advcl instead of nmod and the conjunction (než, jako) is attached to the subordinate clause as mark.

Very commonly the complement clause in a comparative undergoes various amounts of partial reduction or ellipsis, sometimes to quite an extreme extent. In general, we treat whatever remnant remains as still an advcl, as above.

The limiting case is that only a nominal is present; then we analyze it as an nmod, although one could see Martin is more intelligent than Vojta as a reduced expression of Martin is more intelligent than how Vojta is intelligent. We lean towards minimizing the postulation of unobserved structure and opt to treat these cases as just an oblique nominal complement.
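The choice between the clausal and the nominal analysis of the comparative complement can be sketched in Python as follows. This is a minimal illustration, not code from any UD tool: the function name and the tuple layout (id, form, head, deprel) are assumptions, and the example tree is our own rendering of the Czech equivalent of Martin is more intelligent than Vojta, not a tree copied from the treebank. The labels for než/jako follow the rule that the conjunction is mark with a clause and case with a bare nominal.

```python
from typing import List, Tuple


def comparative_complement_labels(complement_is_clause: bool) -> Tuple[str, str]:
    """Return (relation of the complement, relation of the word než/jako)."""
    if complement_is_clause:
        # A clause, even a heavily reduced one, stays an advcl with a mark.
        return ("advcl", "mark")
    # A bare nominal is an oblique nominal complement with a case marker.
    return ("nmod", "case")


# Illustrative analysis of "Martin je inteligentnější než Vojta."
# Each token is (id, form, head, deprel); head 0 is the artificial root.
EXAMPLE: List[Tuple[int, str, int, str]] = [
    (1, "Martin", 3, "nsubj"),
    (2, "je", 3, "cop"),
    (3, "inteligentnější", 0, "root"),
    (4, "než", 5, "case"),
    (5, "Vojta", 3, "nmod"),
    (6, ".", 3, "punct"),
]

if __name__ == "__main__":
    print(comparative_complement_labels(complement_is_clause=True))
    print(comparative_complement_labels(complement_is_clause=False))
```

In the example, the complement Vojta depends on the adjective itself, which keeps the structure parallel to the periphrastic comparative.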

Comparatives (quantity)

In the periphrastic comparatives in the previous section, the words více “more” and méně “less” are comparative forms of the adverbs hodně/mnoho “much/many” and málo “little”, respectively. However, in other situations they combine directly with nouns and act as quantifiers (termed indefinite numerals in Czech grammar but labeled DET in accord with our definition). They behave syntactically like high-value numerals (see nummod for details) and we attach them as det:numgov or det:nummod.

As with qualitative comparisons, we use nmod instead of advcl and case instead of mark when the comparative complement is reduced to just a nominal:

In certain contexts the comparative complement combines both the action or adjective that is being compared and the quantity it is compared to:

In these cases we consider více než to be a multi-word expression because the two words are inseparable. One cannot say *více procent než 90 (the word procent can be pulled to the front but then it will skip the whole MWE, as in těch procent nebylo více než 90 lit. the percent were-not more than 90.)

Ellipsis

Ellipsis means that something is missing from the sentence: it has been omitted from the surface form, although it is understood by both the speaker and the listener. Various phenomena can be classified as ellipsis; the most important and difficult are those where the missing word has dependents. Where do we attach these orphans?

Several different solutions can be found in treebanks. One of them is to include an empty node (labeled NULL, #Fantom etc.) that represents the missing word. Orphans are then attached to the empty node with their real dependency relation labels. Such an analysis would be linguistically adequate but it would violate our principle that dependencies exist between real syntactic words. (It would also make parsing more difficult.) We do not insert empty nodes.

If empty nodes are not an option, some treebanks attach all orphans to the grandparent, i.e. to the parent of the missing parent node. Then they may

Another possibility is that one of the orphans gets promoted to the place of the missing parent and the other orphans are attached to it.

We use a combination of approaches in the Czech UD. The only limitation is that we do not reconstruct nodes that are not present in the surface sentence form.

If the head noun is missing from a noun phrase, i.e. there is just an adjective, possibly also a numeral or a determiner, then one orphan is selected as the main dependent and it gets promoted:

Note that Czech does not have promotion of auxiliaries like in English I did not come but he did. Occasionally yes/no is used to construct similar sentences, as in Já jsem nepřišel, ale on ano. lit. I have not-come, but he yes.

We do not use promotion when a verb is missing and two or more arguments of the verb are present. A frequent special case of this is coordination of clauses that share the same verb but only the first occurrence of the verb is retained on the surface, while the other copies have been deleted and only their dependents remain: Pavel si objednal hovězí a Markéta [si objednala] vepřové. “Pavel ordered beef and Markéta [ordered] pork.” Universal Dependencies annotates such cases using the remnant relation, which enables reconstruction of the functions of the arguments, without inserting an empty node for the missing verb:
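How the remnant relation allows the functions of the orphaned arguments to be reconstructed can be sketched in Python. The sketch assumes the UD v1 convention that each remnant is attached to its correspondent in the first clause. The Token type and the function resolve_remnants are hypothetical names, the tree is our own rendering of the sentence above, and the label iobj on si is an assumption made only to complete the tree.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    id: int
    form: str
    head: int      # 0 is the artificial root
    deprel: str


# "Pavel si objednal hovězí a Markéta vepřové."
SENTENCE: List[Token] = [
    Token(1, "Pavel", 3, "nsubj"),
    Token(2, "si", 3, "iobj"),        # assumption: label chosen for illustration
    Token(3, "objednal", 0, "root"),
    Token(4, "hovězí", 3, "dobj"),
    Token(5, "a", 3, "cc"),
    Token(6, "Markéta", 1, "remnant"),   # corresponds to Pavel
    Token(7, "vepřové", 4, "remnant"),   # corresponds to hovězí
    Token(8, ".", 3, "punct"),
]


def resolve_remnants(tokens: List[Token]) -> Dict[str, Tuple[str, str]]:
    """Map each remnant form to (reconstructed relation, form of the shared verb)."""
    by_id = {t.id: t for t in tokens}
    resolved: Dict[str, Tuple[str, str]] = {}
    for tok in tokens:
        if tok.deprel != "remnant":
            continue
        # Follow remnant links until a token with a real function is found.
        target = by_id[tok.head]
        while target.deprel == "remnant":
            target = by_id[target.head]
        resolved[tok.form] = (target.deprel, by_id[target.head].form)
    return resolved


if __name__ == "__main__":
    for form, (rel, verb) in resolve_remnants(SENTENCE).items():
        print(f"{form}: {rel} of {verb}")
```

The loop handles chains of remnants, which would arise if more than two clauses shared the same verb.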

Sometimes a verb is missing but there is no coordination and no overt copy of the verb, hence we cannot use the remnant analysis. In particular, there are sentence-like segments that lack the main verb: A co na to [říká] MF? “And what [does] MF [say] to it?”

Since release 1.2 of the Czech UD treebank, there is just one node with the root dependency relation in every tree; when there are multiple orphaned dependents at the top level of the tree, the leftmost dependent is promoted to the head (root) position and the other orphans are attached to it.
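The single-root rule can be sketched in Python. The function promote_leftmost is a hypothetical helper, not part of any UD tool, and it computes head indices only: the text does not say which relation labels the attached orphans receive, so none are assigned here.

```python
from typing import Dict, Iterable


def promote_leftmost(orphan_ids: Iterable[int]) -> Dict[int, int]:
    """Return a head for every top-level orphan (token id -> head id).

    The leftmost orphan (lowest token id) becomes the root, marked by
    head 0; all other orphans are attached to it.
    """
    ordered = sorted(orphan_ids)
    if not ordered:
        raise ValueError("at least one top-level node is required")
    root = ordered[0]
    heads = {root: 0}
    for token_id in ordered[1:]:
        heads[token_id] = root
    return heads


if __name__ == "__main__":
    # Three orphans at token positions 2, 4 and 7 (hypothetical positions).
    print(promote_leftmost([4, 2, 7]))
```

With the orphans at positions 2, 4 and 7, token 2 becomes the root and tokens 4 and 7 are attached to it, so the tree has exactly one root node.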
