This page pertains to UD version 2.

UD Swedish LinES

Language: Swedish (code: sv)
Family: Indo-European, Germanic

This treebank has been part of Universal Dependencies since the UD v1.3 release.

The following people have contributed to making this treebank part of UD: Lars Ahrenberg.

Repository: UD_Swedish-LinES
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.5

License: CC BY-NC-SA 4.0

Genre: fiction, nonfiction, spoken

Questions, comments? General annotation questions (either Swedish-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [lars • ahrenberg (æt) liu • se]. The treebank is developed outside the UD repository, so any bug must be fixed either in the original data source or in the conversion procedure. Do not submit pull requests against the UD repository.

Annotation Source
Lemmas annotated manually in non-UD style, automatically converted to UD
UPOS annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
XPOS annotated manually
Features assigned by a program, not checked manually
Relations annotated manually in non-UD style, automatically converted to UD, with some manual corrections of the conversion
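The five annotation layers listed above correspond to the LEMMA, UPOS, XPOS, FEATS and DEPREL columns of the CoNLL-U format used by UD releases. As a minimal sketch, the function below reads one tab-separated token line into named fields. The `Token` type, the function name and the example line are hypothetical illustrations: the line is invented and not taken from the treebank, and its XPOS value is left as the placeholder `_`.

```python
from typing import NamedTuple


class Token(NamedTuple):
    form: str
    lemma: str    # LEMMA column
    upos: str     # UPOS column
    xpos: str     # XPOS column (treebank-specific tag)
    feats: dict   # FEATS column, split into key/value pairs
    head: int     # HEAD column (0 marks the root)
    deprel: str   # DEPREL column


def parse_token_line(line):
    """Parse one ordinary word line of a CoNLL-U file (10 tab-separated columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    # FEATS is either "_" or a "|"-separated list of Key=Value pairs.
    feats = {} if cols[5] == "_" else dict(
        pair.split("=", 1) for pair in cols[5].split("|")
    )
    return Token(cols[1], cols[2], cols[3], cols[4], feats, int(cols[6]), cols[7])


# Invented example line, not treebank data.
EXAMPLE_LINE = "3\tboken\tbok\tNOUN\t_\tDefinite=Def|Number=Sing\t2\tobj\t_\t_"
token = parse_token_line(EXAMPLE_LINE)
print(token.lemma, token.upos, token.feats, token.deprel)
```

The sketch handles ordinary word lines only. Multiword-token ranges and empty nodes, whose IDs are not plain integers, would need separate handling.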

Description

UD Swedish_LinES is the Swedish half of the LinES Parallel Treebank with UD annotations. All segments are translations from English, and the sources cover literary genres, online manuals and Europarl data.

All segments are translations of the corresponding English segments found in the UD English_LinES treebank. The original dependency annotation was first automatically converted to Universal Dependencies and then partially reviewed (Ahrenberg, 2015). In January and February 2017 it was converted to UD version 2 and reviewed again for errors. Lemmata and morphological features were added in version 2.1.

The treebank is being developed continuously.

Acknowledgments

Three of the source texts were collected as part of the Linköping Translation Corpus (Merkel, 1999). The treebank was first developed in the project ‘Micro- and macro-level analysis of translations’ funded by the Swedish Research Council (Ahrenberg, 2007).

Statistics of UD Swedish LinES

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X

Features

Abbr, Case, Definite, Degree, Gender, Mood, Number, NumType, Person, Polarity, Poss, PronType, Tense, VerbForm, Voice

Relations

acl, acl:cleft, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, compound, compound:prt, conj, cop, csubj, csubj:pass, dep, det, discourse, dislocated, expl, fixed, flat, iobj, mark, nmod, nmod:poss, nsubj, nsubj:pass, nummod, obj, obl, obl:agent, orphan, parataxis, punct, reparandum, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
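The restriction to verb parents and noun or pronoun children can be reproduced on a CoNLL-U file roughly as follows. This is a minimal sketch rather than the procedure used to generate these statistics: the function name is hypothetical, the sample sentence is invented (not treebank data), and mapping "nouns or pronouns" to the UPOS tags NOUN and PRON only (excluding PROPN) is an assumption.

```python
from collections import Counter

# Assumption: "nouns or pronouns" means UPOS NOUN and PRON; PROPN is left out.
NOMINAL_UPOS = {"NOUN", "PRON"}


def verb_nominal_relations(conllu_text):
    """Count relation labels where the parent is a VERB and the child is nominal."""
    counts = Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        rows = {}
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip comment lines such as "# text = ..."
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
            rows[cols[0]] = cols
        for cols in rows.values():
            parent = rows.get(cols[6])  # HEAD column; "0" (root) has no parent row
            if parent is not None and parent[3] == "VERB" and cols[3] in NOMINAL_UPOS:
                counts[cols[7]] += 1  # DEPREL column
    return counts


# Invented sample sentence, not taken from the treebank.
SAMPLE = (
    "# text = Hon läser boken.\n"
    "1\tHon\thon\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tläser\tläsa\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tboken\tbok\tNOUN\t_\t_\t2\tobj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

print(verb_nominal_relations(SAMPLE))
```

On the sample, only the `nsubj` and `obj` arcs are counted: the `punct` arc has a non-nominal child, and the `root` token has no parent.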

Relations Overview