This page pertains to UD version 2.

UD French ParisStories

Language: French (code: fr)
Family: Indo-European, Romance

This treebank has been part of Universal Dependencies since the UD v2.9 release.

The following people have contributed to making this treebank part of UD: Kim Gerdes, Sylvain Kahane, Menel Mahamdi.

License: CC BY-SA 4.0

Genre: spoken

Annotation Source
Lemmas annotated manually in non-UD style, automatically converted to UD
UPOS annotated manually in non-UD style, automatically converted to UD
XPOS not available
Features annotated manually in non-UD style, automatically converted to UD
Relations annotated manually in non-UD style, automatically converted to UD


Paris Stories is a corpus of spoken French collected and transcribed by linguistics students from Sorbonne Nouvelle and corrected by students from the Plurital Master’s Degree in Computational Linguistics (Inalco, Paris Nanterre, Sorbonne Nouvelle) between 2017 and 2021. It contains monologues and dialogues from speakers living in the Paris region.

For an assignment, students had to record a friend or a relative sharing an anecdote on a given theme (meaningful encounters, vacations, interesting stories, etc.). The corpus was created to study contemporary spoken French and to train a syntactic parser for spoken French. All data has been morpho-syntactically annotated following the SUD (Surface Syntactic Universal Dependencies) guidelines.

See the SUD guidelines: https://surfacesyntacticud.github.io/guidelines/u/

The treebank can be found here: http://match.grew.fr/?corpus=SUD_French-ParisStories@latest

The recordings can be downloaded via the URL given in the ‘# sound_url’ metadata of the CoNLL-U files.
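As a minimal sketch of how these URLs could be collected, the Python function below scans the comment lines of a CoNLL-U file and returns each distinct sound_url value. It assumes the metadata follows the usual CoNLL-U comment form "# key = value"; the function name collect_sound_urls and the example.org URLs are hypothetical and do not come from the treebank.

```python
def collect_sound_urls(conllu_text):
    """Return the distinct sound_url values, in order of first appearance.

    Assumes comment lines of the form '# sound_url = <url>'.
    """
    urls = []
    seen = set()
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            continue  # token lines and blank lines carry no metadata
        # Split on the first '=' only, so URLs containing '=' stay intact.
        key, sep, value = line[1:].partition("=")
        if sep and key.strip() == "sound_url":
            url = value.strip()
            if url and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


if __name__ == "__main__":
    # Hypothetical sample; real files would be read from disk, e.g.
    # open("fr_parisstories-ud-train.conllu", encoding="utf-8").read()
    sample = (
        "# sent_id = s1\n"
        "# sound_url = https://example.org/recording_a.wav\n"
        "1\tbonjour\tbonjour\tINTJ\t_\t_\t0\troot\t_\t_\n"
        "\n"
        "# sent_id = s2\n"
        "# sound_url = https://example.org/recording_a.wav\n"
        "1\toui\toui\tINTJ\t_\t_\t0\troot\t_\t_\n"
    )
    for url in collect_sound_urls(sample):
        print(url)
```

The returned list can then be passed to any download tool. Duplicates are dropped because several sentences may point to the same recording.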


– Paris Stories 2019 –

Creation Year : 2017

Annotation Year : 2019

Size :

Topics : travels, funny/unusual stories

– Paris Stories 2020 –

Creation Year : 2018

Annotation Year : 2020

Size :

Topics : vacation stories, funny/unusual stories

– Paris Stories 2021 –

Creation Year : 2020

Annotation Year : 2021

Size :

Topics : first encounters, funny/unusual stories


The corpus is maintained here in the SUD framework and automatically converted into UD using the Grew software with the conversion rules described here.

Data Split

The file fr_parisstories-ud-test.conllu contains the following data:

The file fr_parisstories-ud-train.conllu contains the following data:


Annotation : Sylvain Kahane, Bruno Guillaume, Mariam Nakhlé, Vanessa Gaudray-Bouju, Menel Mahamdi

Annotation tools development : Kim Gerdes, Marine Courtin, Gaël Guibon

Conversion and handling of data validation : Bruno Guillaume

Direction of data collection : Cédric Gendrot, Kim Gerdes, Marine Courtin

We would like to thank all the students who participated in this project.


An article about the annotation of spoken French will soon be released (Kahane et al. 2021).

Statistics of UD French ParisStories

