UD Neapolitan RB
Language: Neapolitan (code: nap)
Family: IE
This treebank has been part of Universal Dependencies since the UD v2.9 release.
The following people have contributed to making this treebank part of UD: Rodolfo Basile.
Repository: UD_Neapolitan-RB
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: grammar-examples
Questions, comments? General annotation questions (either Neapolitan-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [rodolfo • basile (æt) ut • ee]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
This treebank contains example sentences in Neapolitan, translated from Italian by a native speaker.
Since Neapolitan orthography is not standardized, the treebank proposes a new way of writing reduced vowels in order to avoid italianization (Cerruti 2016): reduced vowels are transcribed with a breve diacritic. Neapolitan reduced vowels are hence written /ă/, /ĕ/ and /ŏ/, all of which represent the schwa sound.
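The breve convention can be checked mechanically. The following is a minimal sketch, assuming the breve may appear either as a precomposed character or as a combining mark (U+0306); the function names and the mapping to "ə" are illustrative assumptions, not part of the treebank's tooling, and the placeholder strings in the usage note are not Neapolitan words.

```python
import unicodedata

COMBINING_BREVE = "\u0306"  # Unicode COMBINING BREVE


def is_reduced_vowel(ch: str) -> bool:
    """True if `ch` is a, e or o carrying a breve (precomposed form expected)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return COMBINING_BREVE in decomposed and decomposed[0].lower() in "aeo"


def reduced_vowel_positions(word: str) -> list[int]:
    """Indices, in the NFC form of `word`, of breve-marked vowels."""
    nfc = unicodedata.normalize("NFC", word)
    return [i for i, ch in enumerate(nfc) if is_reduced_vowel(ch)]


def to_schwa(word: str) -> str:
    """Replace every breve-marked a/e/o with 'ə' (illustrative reading only)."""
    nfc = unicodedata.normalize("NFC", word)
    return "".join("ə" if is_reduced_vowel(ch) else ch for ch in nfc)
```

For instance, `to_schwa("xăxĕxŏ")` replaces all three marked vowels, while plain `a`, `e` and `o` are left untouched. Normalizing to NFC first means that input typed with a separate combining breve is handled the same way as precomposed characters.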
Acknowledgments
…
References
Cerruti, Massimo. 2016. L’italianizzazione dei dialetti: una rassegna. Quaderns d’Italià 21, 63–74.
Statistics of UD Neapolitan RB
POS Tags
ADJ – ADP – DET – NOUN – PUNCT – VERB
Features
Relations
amod – case – det – nsubj – obj – obl – punct – root
Tokenization and Word Segmentation
- This corpus contains 1 sentence, 9 tokens and 10 syntactic words.
- This corpus contains 2 tokens (22%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 2 types of words that contain both letters and punctuation. Examples: 'A, 'na
- This corpus contains 1 multi-word token. On average, one multi-word token consists of 2.00 syntactic words.
- There is 1 type of multi-word token. Example: all'.
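The difference between tokens and syntactic words in the counts above comes from multi-word tokens: in CoNLL-U, a range line such as `1-2` is one surface token covering two syntactic words. The sketch below shows one way such counts could be derived from a CoNLL-U file. It is not the script that produced these statistics; `make_row`, `conllu_counts` and the placeholder sentence are hypothetical and not taken from the treebank.

```python
def make_row(tok_id: str, form: str, misc: str = "_") -> str:
    """Build one 10-column CoNLL-U line; unused columns are left as '_'."""
    return "\t".join([tok_id, form, "_", "_", "_", "_", "_", "_", "_", misc])


# Hypothetical placeholder sentence (NOT from the treebank): "ab c."
SAMPLE = "\n".join([
    "# text = ab c.",
    make_row("1-2", "ab"),             # multi-word token covering words 1 and 2
    make_row("1", "a"),
    make_row("2", "b"),
    make_row("3", "c", "SpaceAfter=No"),
    make_row("4", "."),
    "",
])


def conllu_counts(text: str) -> dict[str, int]:
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    counts = {"sentences": 0, "tokens": 0, "words": 0, "mwt": 0, "no_space_after": 0}
    covered_until = 0   # highest word ID covered by the current multi-word token
    in_sentence = False
    for line in text.splitlines():
        if not line.strip():            # blank line ends a sentence
            in_sentence, covered_until = False, 0
            continue
        if line.startswith("#"):        # sentence-level comment
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not in_sentence:
            counts["sentences"] += 1
            in_sentence = True
        tok_id, misc = cols[0], cols[9]
        no_space = int("SpaceAfter=No" in misc.split("|"))
        if "." in tok_id:               # empty node: neither token nor word
            continue
        if "-" in tok_id:               # multi-word token range, e.g. "1-2"
            covered_until = int(tok_id.split("-")[1])
            counts["mwt"] += 1
            counts["tokens"] += 1
            counts["no_space_after"] += no_space
            continue
        counts["words"] += 1
        if int(tok_id) > covered_until:  # word is its own surface token
            counts["tokens"] += 1
            counts["no_space_after"] += no_space
    return counts


print(conllu_counts(SAMPLE))
```

Words covered by a range line are counted as syntactic words but not as separate tokens, which is why a corpus with one two-word multi-word token has exactly one more word than it has tokens.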
Morphology
Tags
- This corpus uses 6 UPOS tags out of 17 possible: ADJ, ADP, DET, NOUN, PUNCT, VERB
- This corpus does not use the following tags: PROPN, PRON, NUM, AUX, ADV, SCONJ, CCONJ, PART, INTJ, SYM, X
- This corpus contains 0 lemmas tagged as pronouns (PRON).
- This corpus contains 1 lemma tagged as a determiner (DET): _
- This corpus contains 0 lemmas tagged as auxiliaries (AUX).
- This corpus does not use the VerbForm feature.
Nominal Features
Degree and Polarity
Verbal Features
Pronouns, Determiners, Quantifiers
Other Features
Syntax
Auxiliary Verbs and Copula
- This corpus does not contain copulas.
- This corpus does not contain auxiliaries.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (1)
- obj
- VERB--NOUN (1)
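The parent-child counts listed above can be reproduced with a simple lookup of each word's head. The following sketch assumes a sentence is already available as `(id, upos, head, deprel)` tuples; the function name, the default relation set and the sample rows are hypothetical illustrations, not the tool or the data behind these statistics.

```python
from collections import Counter

# Hypothetical rows (id, UPOS, head, deprel); NOT the treebank's sentence.
ROWS = [
    (1, "DET", 2, "det"),
    (2, "NOUN", 3, "nsubj"),
    (3, "VERB", 0, "root"),
    (4, "DET", 5, "det"),
    (5, "NOUN", 3, "obj"),
    (6, "PUNCT", 3, "punct"),
]


def verb_argument_pairs(rows, relations=("nsubj", "obj", "iobj", "obl")):
    """Count deprel / 'PARENT--CHILD' pairs with a VERB parent and a NOUN or PRON child."""
    upos_by_id = {word_id: upos for word_id, upos, _, _ in rows}
    pairs = Counter()
    for _, upos, head, deprel in rows:
        parent_upos = upos_by_id.get(head)   # None for the root (head 0)
        if (deprel in relations
                and parent_upos == "VERB"
                and upos in {"NOUN", "PRON"}):
            pairs[(deprel, f"{parent_upos}--{upos}")] += 1
    return pairs


for (deprel, pair), n in sorted(verb_argument_pairs(ROWS).items()):
    print(f"- {deprel}\n  - {pair} ({n})")
```

The `relations` default is an assumption about which dependency types to include; widening it changes which pairs are reported but not how they are counted.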
Relations Overview
- This corpus does not use relation subtypes.
- The following 29 relation types are not used in this corpus at all: iobj, csubj, ccomp, xcomp, vocative, expl, dislocated, advcl, advmod, discourse, aux, cop, mark, nmod, appos, nummod, acl, clf, conj, cc, fixed, flat, compound, list, parataxis, orphan, goeswith, reparandum, dep