
This page pertains to UD version 2.

UD German LIT

Language: German (code: de)
Family: Indo-European, Germanic

This treebank has been part of Universal Dependencies since the UD v2.4 release.

The following people have contributed to making this treebank part of UD: Alessio Salomoni.

Repository: UD_German-LIT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.2

License: CC BY-NC-SA 4.0

Genre: nonfiction

Questions, comments? General annotation questions (either German-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [alessio • salomoni (æt) unibg • it]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas assigned by a program, with some manual corrections, but not a full manual verification
UPOS annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
XPOS assigned by a program, with some manual corrections, but not a full manual verification
Features not available
Relations annotated manually, natively in UD style

Description

This treebank aims to gather texts from German literary history. It currently hosts Fragments of early Romanticism, i.e. aphorism-like texts that mainly address philosophical questions about art, beauty and related topics.

In the long term, this treebank aims to gather texts from different genres and authors of German literary history. Currently, it exclusively hosts Fragments of early Romanticism (end of the 18th century, modern German), i.e. very short texts, often in aphorism-like form, that deal with philosophical issues in a witty and cryptic way. They mainly concern aesthetics, i.e. the philosophy of art and beauty. The treebank is mainly intended for stylistic analysis, which can benefit from the dependency formalism and from the ability to retrieve syntactic information automatically and quickly.

Version 2.4 hosts the following texts (each text is followed by the reference to the original edition from which it was digitized, as well as by the permalink to the online source of the raw text):

Each sentence in the treebank file is preceded by some comments introduced by ‘#’, through which the following information is preserved:

‘# newdoc id = bluethenstaub’
‘# newpar id = bluethenstaub-f1’
‘# author = Novalis’
‘# work = Blüthenstaub’
‘# sent_id = bluethenstaub-f1-s1’

We made this choice because the treebank is intended as a structured version, in dependency formalism, of the original texts; we therefore want to preserve the parallelism between the treebanked data and the source texts as much as possible.
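The comment lines described above can be read back with a few lines of Python. The following is only a minimal sketch, not part of the treebank tooling: the function name `read_sentence_metadata` is hypothetical, the token line in the sample is an invented placeholder rather than treebank data, and it assumes that `newdoc id` and `newpar id` stay valid until they are redefined.

```python
def read_sentence_metadata(conllu_text):
    """Yield one dict of '# key = value' comments per sentence."""
    inherited = {}      # assumption: newdoc/newpar ids persist until redefined
    current = {}        # comments attached to the sentence being read
    has_tokens = False
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if has_tokens:
                yield {**inherited, **current}
            current, has_tokens = {}, False
        elif line.startswith("#"):
            key, sep, value = line[1:].partition("=")
            if sep:
                key, value = key.strip(), value.strip()
                if key in ("newdoc id", "newpar id"):
                    inherited[key] = value
                else:
                    current[key] = value
        else:
            has_tokens = True
    if has_tokens:      # file did not end with a blank line
        yield {**inherited, **current}


# Comment lines as shown above; the token line is an invented placeholder.
SAMPLE = (
    "# newdoc id = bluethenstaub\n"
    "# newpar id = bluethenstaub-f1\n"
    "# author = Novalis\n"
    "# work = Blüthenstaub\n"
    "# sent_id = bluethenstaub-f1-s1\n"
    "1\tBeispiel\t_\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

metadata = list(read_sentence_metadata(SAMPLE))
for entry in metadata:
    print(entry["sent_id"], entry["author"], entry["work"])
```

Grouping the resulting dictionaries by `work` or `newpar id` would be one way to line sentences up with the fragments of the source edition.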

Acknowledgments

Many thanks to Daniel Zeman, who promptly solved some fundamental problems concerning the data format and showed great interest in this project right from the beginning. …

References

Statistics of UD German LIT

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X

Features

Relations

acl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, compound, compound:prt, conj, cop, csubj, dep, det, det:poss, expl, fixed, flat, iobj, mark, nmod, nmod:poss, nsubj, nsubj:pass, nummod, obj, obl, obl:agent, orphan, parataxis, punct, root, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
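A restriction of this kind could be reproduced on a CoNLL-U file roughly as follows. This is a hedged sketch rather than the script behind the official statistics: the function name `verb_nominal_relations` is hypothetical, the sample sentence is invented, and treating "nouns or pronouns" as the UPOS tags NOUN and PRON is an assumption.

```python
from collections import Counter


def verb_nominal_relations(conllu_text, child_upos=("NOUN", "PRON")):
    """Count relation labels where the parent is a VERB and the child a nominal."""
    counts = Counter()
    sentence = []
    # The extra "" guarantees that the last sentence is flushed.
    for line in conllu_text.splitlines() + [""]:
        if line.strip() and not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():   # skip multiword ranges (1-2) and empty nodes (1.1)
                sentence.append(cols)
        elif not line.strip():
            # Columns: 0 = ID, 3 = UPOS, 6 = HEAD, 7 = DEPREL
            upos_by_id = {cols[0]: cols[3] for cols in sentence}
            for cols in sentence:
                if cols[3] in child_upos and upos_by_id.get(cols[6]) == "VERB":
                    counts[cols[7]] += 1
            sentence = []
    return counts


# Invented example sentence, not taken from the treebank.
SAMPLE = (
    "# sent_id = example-s1\n"
    "1\tSie\tsie\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tliest\tlesen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tBücher\tBuch\tNOUN\t_\t_\t2\tobj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

relation_counts = verb_nominal_relations(SAMPLE)
print(relation_counts.most_common())
```

Passing a different `child_upos` tuple, for example one that also includes PROPN, changes which children count as nominal.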

Relations Overview