
This page pertains to UD version 2.

UD French ALTS

Language: French (code: fr)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.16 release.

The following people have contributed to making this treebank part of UD: Natalia Romanova, Rayan Ziane, Khensa Daoudi.

Repository: UD_French-ALTS
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.16

License: CC BY-SA 4.0

Genre: legal

Questions, comments? General annotation questions (either French-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [natalia • romanova (æt) unicaen • fr]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

ALTS (AUTOMATED Sixteenth-century corpus) is a treebank of sixteenth-century legal French. Currently it contains one text, trial accounts from the Guernsey Greffe (register Crime I), transcribed directly from the manuscript and manually annotated for parts of speech, lemmas and syntactic functions. The text exhibits dialectal Norman features and forms.

The Guernsey Crime I text (1,269 sentences; 45,101 tokens) contains accounts of fifteen court cases on the island of Guernsey from 1563 to 1569 (witchcraft, piracy, infanticide, etc.). It was first PoS-tagged, lemmatised and automatically parsed as part of the Franco-German MICLE project (2021-2024), led by Professor Pierre Larrivée (University of Caen) and Professor Cecilia Poletto (University of Frankfurt). Earlier versions of the text, annotated with the HT-CRISCO workflow, which incorporates the HOPS parser, can be consulted on the CRISCO Lab's TXM server and via the website.

As part of the AUTOMATED project, the text was reannotated with the BertForDeprel parser and manually corrected using a bootstrapping methodology (Peng et al. 2022) in the ArboratorGrew software.

Set     Sentences   Tokens
Train   811         30,140
Dev     111         4,575
Test    347         10,386
Total   1,269       45,101
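
The counts in the table above could be recomputed from the CoNLL-U files of each split. The following is a minimal sketch, not the procedure the treebank authors used: the helper names `count_conllu` and `row` are hypothetical, and the two-sentence sample is invented for illustration rather than taken from the treebank. Because the table does not say whether "tokens" means surface tokens or syntactic words, the sketch reports both.

```python
def count_conllu(text):
    """Return (sentences, surface_tokens, syntactic_words) for CoNLL-U text."""
    sentences = tokens = words = 0
    covered_until = 0      # last word ID covered by the current multiword token
    in_sentence = False
    for line in text.splitlines():
        line = line.strip()
        if not line:                   # a blank line ends the current sentence
            in_sentence = False
            continue
        if line.startswith("#"):       # sentence-level comment, e.g. "# text = ..."
            continue
        token_id = line.split("\t", 1)[0]
        if not in_sentence:            # first token line of a new sentence
            sentences += 1
            in_sentence = True
            covered_until = 0
        if "-" in token_id:            # multiword token range such as "1-2"
            tokens += 1
            covered_until = int(token_id.split("-")[1])
        elif "." in token_id:          # empty node such as "3.1": not counted
            continue
        else:                          # ordinary syntactic word
            words += 1
            if int(token_id) > covered_until:
                tokens += 1            # word is not part of a multiword token
    return sentences, tokens, words


def row(token_id, form):
    # Fill the eight remaining CoNLL-U columns with "_" placeholders.
    return "\t".join([token_id, form] + ["_"] * 8)


# Invented sample: "du" is a multiword token split into "de" + "le".
SAMPLE = "\n".join([
    "# text = du roi",
    row("1-2", "du"),
    row("1", "de"),
    row("2", "le"),
    row("3", "roi"),
    "",
    "# text = Oui",
    row("1", "Oui"),
    "",
])

print(count_conllu(SAMPLE))  # (2, 3, 4)
```

To apply it to a real split, read that split's `.conllu` file into a string and pass it to `count_conllu`. The per-split results should then sum to the Total row.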

Acknowledgments

This work was made possible thanks to the generous support of the ANR-DFG Franco-German scheme (MICLE project, 2021-2024) and of the Normandy region (AUTOMATED project, 2023-2025).

We would like to thank the staff at the Guernsey Greffe archives and the Guernsey Museum & Art Gallery for giving us access to the manuscript and digital images in 2021 and 2023. We are also grateful to former island archivist Daryl Ogier for his assistance and advice when working with the original source. We are grateful to the team of student transcribers (Agathe Aubert, Lucie Marie-Leblanc, Marie Picart and Valentin Simenel) who helped with the transcription in 2022. We thank Patrice Lajoye and Stéphane Laîné for their assistance with lemmatisation and dialectal features of the text, and Mattis Le Scaer, who helped elucidate the historical context of the document.

Annotation was performed by Natasha Romanova and Rayan Ziane, with technical assistance from Khensa Daoudi.

References

Statistics of UD French ALTS

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, VERB

Features

Definite, ExtPos, Number, NumType, Polarity, PronType, VerbForm

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, conj, cop, csubj, det, discourse, dislocated, expl, fixed, flat, iobj, mark, nmod, nsubj, nummod, obj, obl, orphan, parataxis, punct, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
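
The restriction stated above (verb parent, noun or pronoun child) can be illustrated with a small filter over one CoNLL-U sentence. This is a sketch under stated assumptions, not the script that produces the UD statistics: the function name `verb_nominal_relations` is hypothetical, the sample sentence is invented, and treating "nouns or pronouns" as the UPOS tags NOUN and PRON (without PROPN) is an assumption.

```python
from collections import Counter


def verb_nominal_relations(sentence_lines, child_upos=("NOUN", "PRON")):
    """Count relation labels whose parent is a VERB and whose child has a
    UPOS tag in child_upos (assumed here to be NOUN and PRON)."""
    rows = {}
    for line in sentence_lines:
        cols = line.split("\t")
        # Keep only ordinary word lines: ten columns and an integer ID.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        rows[cols[0]] = cols
    counts = Counter()
    for cols in rows.values():
        head = rows.get(cols[6])       # column 7 (HEAD); "0" marks the root
        if head is not None and head[3] == "VERB" and cols[3] in child_upos:
            counts[cols[7]] += 1       # column 8 (DEPREL)
    return counts


def word(idx, form, lemma, upos, head, deprel):
    # Build one CoNLL-U line; columns not used here are left as "_".
    return "\t".join([idx, form, lemma, upos, "_", "_", head, deprel, "_", "_"])


# Invented sentence, not taken from the treebank.
SENTENCE = [
    "# text = Elle voit le chien",
    word("1", "Elle", "il", "PRON", "2", "nsubj"),
    word("2", "voit", "voir", "VERB", "0", "root"),
    word("3", "le", "le", "DET", "4", "det"),
    word("4", "chien", "chien", "NOUN", "2", "obj"),
]

print(dict(verb_nominal_relations(SENTENCE)))  # {'nsubj': 1, 'obj': 1}
```

The `det` relation is dropped because its parent is a noun, and `root` is dropped because it has no parent word.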

Relations Overview