
This page pertains to UD version 2.

UD for Zaar

Tokenization and Word Segmentation

Because the Universal Dependencies framework takes a lexicalist approach to syntax, in which dependency relations hold between words, the first step in the processing chain is to decide how to tokenize the language. Breaking each sentence down into tokens makes it possible to attach syntactic information to the individual words as they occur in the flow of discourse.
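To make the link between tokens and dependencies concrete, here is a minimal sketch in Python of how a tokenized sentence in the CoNLL-U format used by UD treebanks can be read back as word-to-word relations. The sample sentence is hypothetical: the forms `w1`, `w2`, `w3` are placeholders rather than Zaar words, and the helper names `read_tokens` and `dependencies` are illustrative assumptions, not part of any UD tool.

```python
# Hypothetical three-token sentence in CoNLL-U layout (10 tab-separated columns:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
# The forms w1..w3 are placeholders, not actual Zaar data.
SAMPLE = (
    "# text = w1 w2 w3\n"
    "1\tw1\t_\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tw2\t_\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tw3\t_\tNOUN\t_\t_\t2\tobj\t_\t_\n"
)


def read_tokens(block):
    """Return one dict per syntactic word found in a CoNLL-U sentence block."""
    tokens = []
    for line in block.splitlines():
        # Skip blank lines and sentence-level comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        # Keep only plain integer IDs with an integer HEAD; this skips
        # multiword-token ranges such as "1-2" and empty nodes such as "1.1".
        if not cols[0].isdigit() or not cols[6].isdigit():
            continue
        tokens.append(
            {
                "id": int(cols[0]),
                "form": cols[1],
                "upos": cols[3],
                "head": int(cols[6]),
                "deprel": cols[7],
            }
        )
    return tokens


def dependencies(tokens):
    """Return (head form, relation, dependent form) triples; head 0 is ROOT."""
    forms = {t["id"]: t["form"] for t in tokens}
    forms[0] = "ROOT"
    return [(forms[t["head"]], t["deprel"], t["form"]) for t in tokens]


for head, rel, dep in dependencies(read_tokens(SAMPLE)):
    print(f"{head} --{rel}--> {dep}")
```

Each dependency is stated between tokens, so any change to the segmentation changes the set of words that can serve as heads or dependents. This is why the tokenization decision has to come first.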


This is an overview only. For more detailed discussion and examples, see the list of Zaar POS tags and Zaar features.





There is 1 Zaar UD treebank: