UD for Yiddish 
Tokenization and Word Segmentation
- In general, words are delimited by whitespace characters.
- Tokens are written in Hebrew script with transliterations provided as metadata.
- Hyphenated compounds are analyzed as a single word and are not split.
- Contracted words are analyzed as two separate tokens. Examples: tsum = tsu + dem; s’iz = es + iz
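As an illustration of the segmentation rules above, the following Python sketch expands a contraction into a CoNLL-U multiword-token range followed by its parts. It assumes that contractions are encoded as multiword tokens. The `CONTRACTIONS` table and the `segment` function are hypothetical and cover only the two examples cited. Transliterated forms are used for readability, whereas the treebank itself stores tokens in Hebrew script.

```python
# Hypothetical lookup table: only the two contractions cited above.
CONTRACTIONS = {
    "tsum": ["tsu", "dem"],
    "s’iz": ["es", "iz"],
}


def segment(tokens):
    """Return CoNLL-U style (ID, FORM) rows for whitespace-delimited tokens."""
    rows = []
    next_id = 1
    for token in tokens:
        parts = CONTRACTIONS.get(token)
        if parts is None:
            # Ordinary word (including hyphenated compounds): one row.
            rows.append((str(next_id), token))
            next_id += 1
        else:
            # Contraction: a range row for the surface form, then one row per part.
            last_id = next_id + len(parts) - 1
            rows.append((f"{next_id}-{last_id}", token))
            for part in parts:
                rows.append((str(next_id), part))
                next_id += 1
    return rows
```

Tokens that are not in the table, including hyphenated compounds, pass through unchanged as single words.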
Morphology
Tags
- Yiddish uses all universal POS categories, except SYM.
- The following words are particles and are tagged PART: halevay ‘if only’; ni(sh)t ‘not’; sakh, used only in the fixed expressions a sakh ‘many’, aza sakh ‘so many’, and keyn sakh ‘not many’; the question particle tsi; and the infinitive marker tsu ‘to’.
- In general, words that inflect for gender to agree with a modified noun are tagged DET (e.g., der/di/dos ‘the’, yeder ‘each’). The class of possessive adjectives is tagged DET, even though they do not inflect for gender in prenominal position. When used as a noun or in constructions such as mayner a khaver ‘a friend of mine’, they are tagged PRON.
- The Yiddish auxiliaries are:
- zayn ‘be’ for past tense of some verbs and as a copula
- hobn ‘have’ for past tense of most verbs
- veln ‘will’ for future tense
- vern ‘become’ for passive voice
- volt ‘would’ for conditional
- modal verbs: darfn ‘need’, zoln ‘should’, muzn ‘must’, kern ‘ought, might’, torn + nit ‘ought not’, megn ‘may’, kenen ‘can’, veln ‘want’, flegn ‘used to’ with zero-inflection in 3SG
- The verbs hobn, vern, darfn, kenen, and veln can also occur as ordinary (non-auxiliary) verbs. zayn in periphrastic verbs is tagged VERB.
- The (de)verbal forms and their tags are as follows:
- Infinitive, VERB or AUX
- Finite verb, VERB or AUX
- Participle, VERB, AUX, or ADJ
- Verbs used as nouns are tagged VERB, including infinitive forms preceded by a definite article (e.g., dos trinken ‘drinking’) and verbal stems preceded by an indefinite article in the stem construction (e.g., a kuk ton/gebn ‘to look’).
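The closed-class lists above lend themselves to a simple consistency check. The Python sketch below is one possible way to flag tags that conflict with them. The function name `check_upos` and the use of transliterated lemmas as keys are assumptions made for illustration, since the treebank stores Hebrew-script forms. Membership in a list is necessary but not sufficient: hobn may still be tagged VERB, so the check only warns when PART or AUX is used for a lemma that is not listed.

```python
# Transliterated lemmas taken from the lists above; the treebank itself
# stores Hebrew-script forms, so these keys are an assumption.
PART_LEMMAS = {"halevay", "nit", "nisht", "sakh", "tsi", "tsu"}
AUX_LEMMAS = {
    "zayn", "hobn", "veln", "vern", "volt",   # tense, voice, conditional
    "darfn", "zoln", "muzn", "kern", "torn",  # modals
    "megn", "kenen", "flegn",
}


def check_upos(lemma, upos):
    """Return a warning string if the tag conflicts with the lists, else None."""
    if upos == "SYM":
        return "SYM is not used in Yiddish"
    if upos == "PART" and lemma not in PART_LEMMAS:
        return f"{lemma!r} is not a listed particle"
    if upos == "AUX" and lemma not in AUX_LEMMAS:
        return f"{lemma!r} is not a listed auxiliary"
    return None
```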
Features
- Morphological features are not provided at this time.
Syntax
Core and Oblique Arguments
- A nominal subject (nsubj) is a noun phrase in nominative case.
- A nominal object (obj) is a noun phrase in accusative case.
- If a verb licenses two accusative objects, the relation iobj is used for the second one, usually a recipient.
- The object of a preposition (obl) is in dative case.
- An object in dative case without a preposition has the relation obl:arg.
- A clause is labeled csubj when it serves as the subject of its matrix clause.
- Clausal complements with their own subject are labeled ccomp.
- Clausal complements with a subject determined by the next higher clause are labeled xcomp. This also goes for secondary predicates.
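The case-based rules for nominal dependents above can be summarized as a small decision function. This is a minimal sketch, not the annotation procedure itself: the function `nominal_relation` and its parameters are hypothetical, and the case values `Nom`, `Acc`, and `Dat` are assumed inputs, since the treebank does not currently provide morphological features. Passive subtypes such as nsubj:pass are deliberately left out.

```python
def nominal_relation(case, has_preposition=False, second_accusative=False):
    """Map a noun phrase's case to the relation described in the rules above.

    Returns None when the rules above do not cover the combination.
    """
    if case == "Nom":
        return "nsubj"
    if case == "Acc":
        # The second of two accusative objects is the indirect object.
        return "iobj" if second_accusative else "obj"
    if case == "Dat":
        # Dative after a preposition is a plain oblique; a bare dative
        # object gets the obl:arg subtype.
        return "obl" if has_preposition else "obl:arg"
    return None
```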
Non-verbal Clauses
- The copula verb zayn ‘be’ is used in equational, attributional, locative, possessive, benefactory and existential nonverbal clauses.
Relations Overview
- Yiddish uses all universal syntactic relations, except clf.
- The following relation subtypes are also used in Yiddish:
- acl:relcl for adnominal relative clauses
- advcl:relcl for relative clauses whose antecedent is a clause
- aux:pass for passive auxiliaries
- compound:lvc for periphrastic verbs
- compound:prt for separable verb prefixes
- csubj:pass for clausal subjects of passive verbs
- det:poss for possessive adjectives
- expl:pv for reflexive clitics of inherently reflexive verbs
- flat:foreign for foreign expressions
- flat:name for multi-word proper nouns
- nmod:poss for possessive modifier phrases
- nsubj:outer for the subject of a copular clause whose predicate is itself a clause
- nsubj:pass for nominal subjects of passive verbs
- obl:agent for agents of passive verbs
- obl:arg for dative objects
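The inventory above can be used to screen relation labels in annotated data. The following Python sketch, with the hypothetical names `SUBTYPES` and `deprel_allowed`, rejects clf and any subtyped label that is not in the list. It does not verify that an unsubtyped label is a genuine universal relation, so it is only a partial check.

```python
# Relation subtypes listed above for Yiddish.
SUBTYPES = {
    "acl:relcl", "advcl:relcl", "aux:pass", "compound:lvc", "compound:prt",
    "csubj:pass", "det:poss", "expl:pv", "flat:foreign", "flat:name",
    "nmod:poss", "nsubj:outer", "nsubj:pass", "obl:agent", "obl:arg",
}


def deprel_allowed(deprel):
    """Return False for clf and for subtypes missing from the list above."""
    if deprel == "clf" or deprel.startswith("clf:"):
        return False
    if ":" in deprel:
        return deprel in SUBTYPES
    # Plain labels are accepted without checking the universal inventory.
    return True
```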
Treebanks
There is one Yiddish UD treebank: