
This page pertains to UD version 2.

UD French ALTS

Language: French (code: fr)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.16 release.

The following people have contributed to making this treebank part of UD: Natalia Romanova, Rayan Ziane, Khensa Daoudi, Théo Brillet.

Repository: UD_French-ALTS
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.17

License: CC BY-SA 4.0

Genre: legal

Questions, comments? General annotation questions (either French-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [natalia • romanova (æt) unicaen • fr]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

ALTS (AUTOMATED Sixteenth-century corpus) is a treebank of sixteenth-century legal French from Normandy and the Channel Islands.

Currently it contains two texts: 1) trial accounts from the Guernsey Greffe (register Crime I), transcribed directly from the manuscript (1563-1569_Guern), and 2) an extract from Book 9 of Guillaume Terrien’s Commentaires du droict civil tant public que privé observé au pays et duché de Normandie, digitised from the original printed book (1578_Terrien). The text of 1563-1569_Guern presents many dialectal Norman features and forms. The text of 1578_Terrien has some Latin words and expressions.

1563-1569_Guern

This text contains accounts of fifteen court cases on the island of Guernsey from 1563 to 1569 (witchcraft, piracy, infanticide, etc.). The text was transcribed in full from the original manuscript Guernsey Greffe Crime I, and abbreviations were expanded. In the treebank, sentences from this text have the prefix 1563-1569_Guern.

1578_Terrien

This text contains passages authored by Guillaume Terrien himself (not quotations from earlier legal texts) from Book 9, “Style de procédure”, of the sixteenth-century printed book: Guillaume Terrien (1578). Commentaires du droict civil tant public que privé observé au pays et duché de Normandie, 2nd edition, Paris: Jacques du Puy, pp. 339-402. The spelling and word segmentation of the original, including abbreviated words (e.g. “glo.” for “glose”), have been retained. Only abbreviations for “m” and “n” (e.g. “o with a tilde” for “om” or “on”) and “&” for “et” have been expanded. In the treebank, sentences from this text have the prefix 1578_Terrien.

Sentences written completely in Latin were excluded. If Latin words occur in French sentences, the token contains the tag Lang=la and is lemmatised with a Latin lemma.
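As a minimal sketch of how the Latin tokens could be retrieved, the following Python function scans CoNLL-U text for tokens marked Lang=la. It is an illustration only: it assumes the standard ten-column CoNLL-U layout and checks both the FEATS and MISC columns, since this page does not say which column carries the tag. The function names and the sample lines are invented for the example and are not taken from the treebank.

```python
def parse_pairs(column: str) -> dict:
    """Parse a CoNLL-U FEATS or MISC column ("A=1|B=2" or "_") into a dict."""
    if column == "_":
        return {}
    pairs = {}
    for item in column.split("|"):
        key, sep, value = item.partition("=")
        pairs[key] = value if sep else ""
    return pairs


def latin_tokens(conllu_text: str) -> list:
    """Return (form, lemma) for every token marked Lang=la.

    Assumption: the tag may sit in FEATS (column 6) or MISC (column 10),
    so both columns are checked.
    """
    found = []
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        feats, misc = parse_pairs(cols[5]), parse_pairs(cols[9])
        if feats.get("Lang") == "la" or misc.get("Lang") == "la":
            found.append((cols[1], cols[2]))
    return found


# Invented sample lines, for illustration only (not from the treebank).
SAMPLE = "\n".join([
    "# sent_id = example-1",
    "\t".join(["1", "ipso", "ipse", "DET", "_", "_", "2", "det", "_", "Lang=la"]),
    "\t".join(["2", "facto", "factum", "NOUN", "_", "_", "0", "root", "_", "Lang=la"]),
    "\t".join(["3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
])

print(latin_tokens(SAMPLE))
```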

Sentence and token number per text

Text Sentences Tokens
1563-1569_Guern 1,269 45,101
1578_Terrien 757 25,113
Total 2,026 70,114
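Per-text counts like those above could be recomputed by grouping sentences on their sent_id prefix. The sketch below is one hypothetical way to do this in Python. It assumes that each sentence has a `# sent_id = ...` comment beginning with one of the two prefixes, and that "tokens" means lines with a plain integer ID (multiword-token ranges such as `1-2` are skipped). Both assumptions should be checked against the released files. The sample data and the suffix format after the prefix are invented.

```python
from collections import Counter

# Prefixes named on this page; anything else is grouped under "other".
PREFIXES = ("1563-1569_Guern", "1578_Terrien")


def count_per_text(conllu_text: str):
    """Count sentences and integer-ID token lines per sent_id prefix."""
    sentences, words = Counter(), Counter()
    current = None
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
            current = next((p for p in PREFIXES if sent_id.startswith(p)), "other")
            sentences[current] += 1
        elif not line.strip():
            current = None  # a blank line ends the sentence
        elif not line.startswith("#"):
            token_id = line.split("\t", 1)[0]
            # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
            if token_id.isdigit() and current is not None:
                words[current] += 1
    return sentences, words


def row(token_id: str, form: str) -> str:
    """Build a ten-column token line with placeholder values (illustration only)."""
    return "\t".join([token_id, form] + ["_"] * 8)


# Invented sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# sent_id = 1563-1569_Guern-1",
    row("1", "a"), row("2", "b"),
    "",
    "# sent_id = 1578_Terrien-1",
    row("1-2", "du"), row("1", "de"), row("2", "le"), row("3", "roy"),
    "",
])

sentences, words = count_per_text(SAMPLE)
print(dict(sentences), dict(words))
```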

Annotation

Verbs and auxiliaries are annotated for verb form (VerbForm): Inf (infinitive), Fin (conjugated) and Part (participle). In 1578_Terrien, conjugated verbs and auxiliaries are also annotated for Person and Number.

Pronouns are annotated for type (PronType: Dem for demonstrative, Ind for indefinite, Int for interrogative, Prs for personal and Rel for relative). Reflexive and possessive pronouns are also tagged (Reflexive=Yes and Poss=Yes).

Determiners are annotated using the PronType feature (Art for articles, Dem for demonstratives, Ind for indefinites). Possessive determiners are annotated with Poss=Yes.
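The feature annotation described above can be queried by combining the UPOS tag with values from the FEATS column. The following Python sketch shows one possible way to do this. The helper names and the three sample tokens are hypothetical, and the feature strings are illustrative values rather than lines copied from the treebank.

```python
def feats_to_dict(feats: str) -> dict:
    """Turn a FEATS string such as "Poss=Yes|PronType=Prs" into a dict."""
    if feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def select(tokens, upos, **wanted):
    """Return the forms whose UPOS matches and whose features include `wanted`."""
    return [
        form
        for form, tag, feats in tokens
        if tag == upos
        and all(feats_to_dict(feats).get(key) == value for key, value in wanted.items())
    ]


# Invented (form, UPOS, FEATS) triples, for illustration only.
TOKENS = [
    ("le", "DET", "Definite=Def|PronType=Art"),
    ("son", "DET", "Poss=Yes"),
    ("dit", "VERB", "VerbForm=Part"),
]

print(select(TOKENS, "DET", Poss="Yes"))
print(select(TOKENS, "VERB", VerbForm="Part"))
```

Keeping the feature parsing separate from the selection means the same helper can be reused for any feature listed on this page, such as PronType or VerbForm.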

The treebank is lemmatised using modern French lemmata and, wherever appropriate, lemmata from the Dictionnaire du Moyen Français.

Train/Dev/Test split

Set Sentences Tokens
Train 1,202 43,389
Dev 154 6,024
Test 670 20,701
Total 2,026 70,114

Earlier versions of the texts, annotated with the HT-CRISCO workflow, which incorporates the HOPS parser, can be consulted on the CRISCO Lab’s TXM server and via the website.

Please note that the French-ALTS treebank is still under development and will undergo further correction campaigns. The annotation will be revised and expanded. Please do not hesitate to contact us if you have any questions, suggestions or comments.

Acknowledgments

This work was made possible thanks to the generous support of the ANR-DFG Franco-German scheme (MICLE project (2021-2024)) and of the Normandy region AUTOMATED project (2023-2025). The projects were led by Professor Pierre Larrivée at the University of Caen.

1563-1569_Guern

We thank the staff at the Guernsey Greffe archives and the Guernsey Museum & Art Gallery for giving us access to the original manuscript and digital images in 2021 and 2023. We are also grateful to former island archivist Daryl Ogier for his assistance and advice when working with the original source. We are grateful to the team of student transcribers (Agathe Aubert, Lucie Marie-Leblanc, Marie Picart and Valentin Simenel) who helped with the transcription in 2022. We thank Patrice Lajoye and Stéphane Laîné for their assistance with lemmatisation and dialectal features of the text, and Mattis Le Squer, who helped elucidate the historical context of the document. The annotation of 1563-1569_Guern has not been revised since the UD 2.16 release. Annotation was performed by Natasha Romanova and Rayan Ziane, with technical assistance from Khensa Daoudi.

1578_Terrien

The digitisation of Guillaume Terrien’s Commentaires du droict civil tant public que privé observé au pays et duché de Normandie was originally performed by Morgane Pica and Mathieu Goux as part of the ConDE project funded by the Normandy region. PoS annotation and lemmatisation were performed by Natasha Romanova. Annotation of syntactic functions was done by Théo Brillet and Natasha Romanova. Théo Brillet annotated all the sentences with Latin tokens. Khensa Daoudi and Rayan Ziane provided technical assistance.

Statistics of UD French ALTS

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, VERB

Features

Definite, ExtPos, Number, NumType, Person, Polarity, Poss, PronType, Tense, VerbForm

Relations

acl, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, ccomp, conj, cop, csubj, csubj:outer, det, discourse, dislocated, expl, fixed, flat, iobj, mark, nmod, nsubj, nummod, obj, obl, orphan, parataxis, punct, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

Relations Overview