This page pertains to UD version 2.

UD for Middle French

Tokenization and Word Segmentation

Middle French tokenization is mostly based on whitespace and punctuation. Fused forms such as “dudit” = “de ledit” (ADP+DET) still need work before they are fully analyzed in line with the UD guidelines.
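As a rough illustration only, the sketch below shows how a fused form could be split into a multiword token range followed by its syntactic words, in the style of CoNLL-U ID and FORM columns. The `FUSED_FORMS` table and the function names are hypothetical; only “dudit” = “de” + “ledit” comes from the text above. The sketch does not describe how any existing treebank was produced, and it ignores elision and apostrophes.

```python
import re

# Hypothetical lookup table; only "dudit" is taken from the text above.
FUSED_FORMS = {"dudit": [("de", "ADP"), ("ledit", "DET")]}


def tokenize(text):
    """Split on whitespace and separate punctuation marks from words."""
    return re.findall(r"\w+|[^\w\s]", text)


def to_conllu_rows(tokens):
    """Return (ID, FORM, UPOS) rows; fused forms become multiword tokens."""
    rows = []
    next_id = 1
    for tok in tokens:
        parts = FUSED_FORMS.get(tok.lower())
        if parts:
            # Range line for the surface token, e.g. "3-4 dudit".
            last_id = next_id + len(parts) - 1
            rows.append((f"{next_id}-{last_id}", tok, "_"))
            # One line per syntactic word, e.g. "3 de ADP", "4 ledit DET".
            for form, upos in parts:
                rows.append((str(next_id), form, upos))
                next_id += 1
        else:
            # UPOS is left unspecified ("_") for ordinary tokens.
            rows.append((str(next_id), tok, "_"))
            next_id += 1
    return rows


if __name__ == "__main__":
    for row in to_conllu_rows(tokenize("a cause dudit proces.")):
        print("\t".join(row))
```

A table-driven split is used here only because it is easy to read; a real analysis would need to cover the full inventory of fused forms and their spelling variants.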

There is 1 Middle French UD treebank:

