
This page pertains to UD version 2.

UD German LIT

Language: German (code: de)
Family: Indo-European, Germanic

This treebank has been part of Universal Dependencies since the UD v2.4 release.

The following people have contributed to making this treebank part of UD: Alessio Salomoni.

Repository: UD_German-LIT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.13

License: CC BY-NC-SA 4.0

Genre: nonfiction

Questions, comments? General annotation questions (either German-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [alessio • salomoni (æt) unibg • it]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas assigned by a program, with some manual corrections, but not a full manual verification
UPOS annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
XPOS assigned by a program, with some manual corrections, but not a full manual verification
Features not available
Relations annotated manually, natively in UD style


This treebank aims to gather texts from German literary history. Currently, it hosts Fragments of early Romanticism, i.e. aphorism-like texts mainly dealing with philosophical issues concerning art, beauty and related topics.

In the long term, this treebank aims to gather texts from different genres and different authors of German literary history. Currently, it exclusively hosts Fragments of early Romanticism (end of the 18th century, modern German), i.e. very short texts, often in aphorism-like form, that deal with philosophical issues in a witty and cryptic way. They mainly deal with aesthetics, i.e. the philosophy of art and beauty. This treebank is mainly intended for corpus-based stylistic analysis that can benefit from the dependency relations as well as from all the other levels of annotation (currently LEMMA and both UPOS and XPOS).

Version 2.5 hosts the following texts (each text is followed by a reference to the original edition from which it was digitized, as well as by a permalink to the online source of the raw digital text):

Each sentence in the treebank is preceded by comment lines introduced by ‘#’, which encode the following information, in this order:

In this case, the sentence following the set of comments would be the first sentence of the first fragment of the collection “Blüthenstaub” written by Novalis. We use comments in this way because we want to preserve the parallelism between the treebanked data and the source texts as much as possible. From this perspective, the treebank aims to be the linguistically annotated counterpart of the original texts, preserving the categories traditionally used when working on literary texts.
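As a rough illustration of how such sentence-level comments can be read back, the following Python sketch collects the ‘#’ lines that precede each sentence in a CoNLL-U file. It is only a minimal sketch under stated assumptions: the function names read_sentences and comment_fields are hypothetical, the sample sentence is invented, and only the generic UD keys sent_id and text are shown, not the treebank-specific keys used in UD German LIT.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Invented sample in CoNLL-U format (NOT taken from the treebank).
# Only the generic UD v2 comment keys "sent_id" and "text" are used here.
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = Das ist gut.\n"
    "1\tDas\tdas\tPRON\t_\t_\t3\tnsubj\t_\t_\n"
    "2\tist\tsein\tAUX\t_\t_\t3\tcop\t_\t_\n"
    "3\tgut\tgut\tADJ\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
)

Sentence = Tuple[List[str], List[List[str]]]


def read_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield (comments, token_rows) for every sentence block."""
    comments: List[str] = []
    tokens: List[List[str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence block.
            if tokens:
                yield comments, tokens
            comments, tokens = [], []
        elif line.startswith("#"):
            # Comment line: keep its content without the leading '#'.
            comments.append(line[1:].strip())
        else:
            # Token line: ten tab-separated CoNLL-U columns.
            tokens.append(line.split("\t"))
    if tokens:
        # Handle a final sentence that is not followed by a blank line.
        yield comments, tokens


def comment_fields(comments: Iterable[str]) -> Dict[str, str]:
    """Turn 'key = value' comments into a dict; other comments are skipped."""
    fields: Dict[str, str] = {}
    for comment in comments:
        key, sep, value = comment.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for comments, tokens in read_sentences(SAMPLE.splitlines()):
        print(comment_fields(comments), len(tokens))
```

Keeping the comments as a plain key-to-value mapping makes it straightforward to group sentences by whatever source-text categories the comments encode, without hard-coding any particular key.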


Many thanks to Daniel Zeman, who promptly solved some fundamental problems concerning the data format and showed great interest in this project right from the beginning. …

Statistics of UD German LIT

POS Tags






Tokenization and Word Segmentation



Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features


Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
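The restriction above can be sketched in Python as a filter over CoNLL-U token rows. This is a hedged illustration, not the procedure used to produce the statistics: the function name verb_nominal_relations and the sample rows are invented, and treating PROPN as a noun alongside NOUN and PRON is an assumption.

```python
from collections import Counter
from typing import Dict, List

# Assumption: "nouns or pronouns" is read as these three UPOS tags.
NOMINAL_UPOS = {"NOUN", "PROPN", "PRON"}

# Invented sample rows with the ten CoNLL-U columns (NOT from the treebank).
ROWS: List[List[str]] = [
    ["1", "Er", "er", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "liest", "lesen", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "Bücher", "Buch", "NOUN", "_", "_", "2", "obj", "_", "SpaceAfter=No"],
    ["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]


def verb_nominal_relations(tokens: List[List[str]]) -> Counter:
    """Count relation labels where the parent is a VERB and the child is nominal."""
    # Map each token ID to its UPOS tag so that heads can be looked up.
    upos_by_id: Dict[str, str] = {row[0]: row[3] for row in tokens}
    counts: Counter = Counter()
    for row in tokens:
        tok_id, upos, head, deprel = row[0], row[3], row[6], row[7]
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in tok_id or "." in tok_id:
            continue
        if upos in NOMINAL_UPOS and upos_by_id.get(head) == "VERB":
            counts[deprel] += 1
    return counts


if __name__ == "__main__":
    print(verb_nominal_relations(ROWS))
```

In the sample, the punctuation mark also depends on the verb but is excluded because its UPOS tag is not nominal.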

Verbs with Reflexive Core Objects

Relations Overview