This page still pertains to UD version 1.

Tokenization

The New York University Abu Dhabi Arabic UD Treebank

The NYUAD Arabic UD treebank follows the ATB tokenization used in the Penn Arabic Treebank (PATB).

ATB tokenization splits off all clitics as separate tokens, except for the definite article, and normalizes Alif/Ya.
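
As a rough illustration of the Alif/Ya normalization step only, the following Python sketch maps hamzated and madda Alif forms to bare Alif and Alif Maqsura to Ya. The mapping table and the function name `normalize_alif_ya` are assumptions made for this example; the exact normalization applied in the treebank may differ. Clitic segmentation is not attempted here, since it generally requires morphological analysis.

```python
# Assumed character mapping for Alif/Ya normalization (illustrative only;
# not taken from the treebank's own tooling).
ALIF_YA_MAP = str.maketrans({
    "\u0623": "\u0627",  # Alif with Hamza above -> bare Alif
    "\u0625": "\u0627",  # Alif with Hamza below -> bare Alif
    "\u0622": "\u0627",  # Alif with Madda       -> bare Alif
    "\u0671": "\u0627",  # Alif Wasla            -> bare Alif
    "\u0649": "\u064A",  # Alif Maqsura          -> Ya
})


def normalize_alif_ya(text: str) -> str:
    """Return text with Alif variants and Alif Maqsura normalized."""
    return text.translate(ALIF_YA_MAP)


if __name__ == "__main__":
    # A word starting with Alif with Hamza above, and one ending in Alif Maqsura.
    for word in ["\u0623\u062D\u0645\u062F", "\u0639\u0644\u0649"]:
        print(word, "->", normalize_alif_ya(word))
```

A character-level translation table is enough here because the normalization is context-free: each character is replaced independently of its neighbors.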