
This page pertains to UD version 2.


UD Romanian MolDoRo

Language: Romanian (code: ro)
Family: IE

This treebank has been part of Universal Dependencies since the UD v2.17 release.

The following people have contributed to making this treebank part of UD: Olesea Caftanatov, Atul Kr. Ojha.

Repository: UD_Romanian-MolDoRo
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.18

License: CC BY-SA 4.0

Genre: grammar-examples

Questions, comments? General annotation questions (either Romanian-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [olesea • caftanatov (æt) math • md]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation	Source
Lemmas	annotated manually
UPOS	annotated manually, natively in UD style
XPOS	not available
Features	not available
Relations	annotated manually, natively in UD style

Description

A small treebank of sentences in Moldovan Romanian, using the Cyrillic writing system (as used in Moldova until 1989).

…

Acknowledgments

…

References

(citation)

Statistics of UD Romanian MolDoRo

POS Tags

ADJ – ADP – ADV – AUX – CCONJ – DET – NOUN – PART – PRON – PUNCT – SCONJ – VERB

Features

Relations

acl – advcl – advmod – amod – case – cc – conj – cop – det – expl:pv – fixed – iobj – mark – nmod – nsubj – obj – obl – parataxis – punct – root – xcomp

Tokenization and Word Segmentation

This corpus contains 30 sentences, 239 tokens and 241 syntactic words.

This corpus contains 75 tokens (31%) that are not followed by a space.

This corpus does not contain words with spaces.

This corpus contains 3 types of words that contain both letters and punctuation. Examples: -й, Лас', н'

This corpus contains 2 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
There are 2 types of multi-word tokens. Examples: Лас’сэ, н’ау.
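
The token and word counts above follow from how CoNLL-U, the file format of UD treebanks, numbers its lines: a plain integer ID is a syntactic word, a range ID such as 1-2 is a multi-word token covering those words, and a decimal ID is an empty node. The following is a minimal sketch of that counting, not the script that generated this page. The function name count_units and the sample sentence are hypothetical and exist only for illustration.

```python
def count_units(conllu_text):
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    sentences = tokens = words = mwts = 0
    in_sentence = False
    covered_until = 0  # last word ID covered by the current multi-word token
    for raw in conllu_text.splitlines():
        line = raw.strip()
        if not line:               # a blank line ends the sentence
            in_sentence = False
            continue
        if line.startswith("#"):   # sentence-level comment
            continue
        if not in_sentence:
            in_sentence = True
            sentences += 1
            covered_until = 0
        word_id = line.split("\t")[0]
        if "-" in word_id:         # range ID: one surface token, several words
            mwts += 1
            tokens += 1
            covered_until = int(word_id.split("-")[1])
        elif "." in word_id:       # empty node: neither a token nor a word
            continue
        else:                      # plain integer ID: one syntactic word
            words += 1
            if int(word_id) > covered_until:
                tokens += 1        # not part of a multi-word token
    return {"sentences": sentences, "tokens": tokens,
            "words": words, "multiword_tokens": mwts}


# Made-up sentence (not taken from the treebank): "ab" is split into "a" + "b".
ROWS = [("1-2", "ab"), ("1", "a"), ("2", "b"), ("3", "c"), ("4", ".")]
SAMPLE = "# text = ab c.\n" + "\n".join(
    "\t".join([word_id, form] + ["_"] * 8) for word_id, form in ROWS
) + "\n\n"

print(count_units(SAMPLE))
```

The same arithmetic accounts for the figures reported here: 2 of the 239 tokens are multi-word tokens of 2 words each, so the word count is 239 - 2 + 4 = 241.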

Morphology

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

This corpus uses 1 lemma as a copula (cop). Example: фи.

This corpus does not contain auxiliaries.

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

nsubj
- VERB--NOUN (20)
- VERB--NOUN-ADP(де) (1)
- VERB--NOUN-ADP(дупэ)-ADP(ку) (1)
- VERB--PRON (2)

obj
- VERB--NOUN (7)
- VERB--NOUN-ADP(де) (1)
- VERB--NOUN-ADP(ку) (1)

iobj
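
The patterns listed in this section for nsubj, obj and iobj can be tallied from a CoNLL-U file with a few lines of code. The sketch below rests on one assumption about the notation: that a suffix such as -ADP(де) names the lemma of an adposition attached to the child by a case relation. The helper names and the sample sentence are hypothetical, and this is not the script that generated the page.

```python
from collections import Counter

CORE = {"nsubj", "obj", "iobj"}


def read_sentences(conllu_text):
    """Yield each sentence as a list of 10-column rows (syntactic words only)."""
    sentence = []
    for line in conllu_text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if cols[0].isdigit():      # skip multi-word token ranges and empty nodes
            sentence.append(cols)
    if sentence:
        yield sentence


def argument_patterns(conllu_text):
    """Count (relation, pattern) pairs for VERB parents with NOUN/PRON children."""
    counts = Counter()
    for sent in read_sentences(conllu_text):
        by_id = {cols[0]: cols for cols in sent}
        for cols in sent:
            relation = cols[7].split(":")[0]   # main type, subtype dropped
            head = by_id.get(cols[6])
            if relation not in CORE or head is None:
                continue
            if head[3] != "VERB" or cols[3] not in ("NOUN", "PRON"):
                continue
            # Assumption: adpositions are the child's own `case` dependents.
            adps = [c[2] for c in sent
                    if c[6] == cols[0] and c[3] == "ADP" and c[7] == "case"]
            pattern = "VERB--" + cols[3] + "".join(f"-ADP({a})" for a in adps)
            counts[(relation, pattern)] += 1
    return counts


# Made-up sentence (not from the treebank): ID, FORM, LEMMA, UPOS, HEAD, DEPREL.
ROWS = [("1", "Dogs", "dog", "NOUN", "2", "nsubj"),
        ("2", "chase", "chase", "VERB", "0", "root"),
        ("3", "of", "of", "ADP", "4", "case"),
        ("4", "cats", "cat", "NOUN", "2", "obj"),
        ("5", ".", ".", "PUNCT", "2", "punct")]
SAMPLE = "\n".join(
    "\t".join([i, form, lemma, upos, "_", "_", head, rel, "_", "_"])
    for i, form, lemma, upos, head, rel in ROWS
) + "\n\n"

for (relation, pattern), n in sorted(argument_patterns(SAMPLE).items()):
    print(relation, pattern, n)
```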

Reflexive Verbs

This corpus contains 6 lemmas that occur at least once with an expl:pv child. Examples: фаче се, авя ышь, адуна се, муя се, теме се, ымфла се

Relations Overview

This corpus uses 1 relation subtype: expl:pv
The following main type is never used alone; it is always subtyped: expl
The following 16 relation types are not used in this corpus at all: csubj, ccomp, vocative, dislocated, discourse, aux, appos, nummod, clf, flat, compound, list, orphan, goeswith, reparandum, dep
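
The three statements in this overview can be derived from the list of relation labels found in the corpus and the inventory of 37 universal relations defined by UD version 2. The sketch below does this with the labels listed in the Relations section above. The variable names are hypothetical, and this is an illustration rather than the script that generated the page.

```python
# The 37 universal dependency relations of UD version 2.
UNIVERSAL = [
    "nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp", "obl", "vocative",
    "expl", "dislocated", "advcl", "advmod", "discourse", "aux", "cop",
    "mark", "nmod", "appos", "nummod", "acl", "amod", "det", "clf", "case",
    "conj", "cc", "fixed", "flat", "compound", "list", "parataxis", "orphan",
    "goeswith", "reparandum", "punct", "root", "dep",
]

# Relation labels used in this corpus, as listed in the Relations section.
USED = [
    "acl", "advcl", "advmod", "amod", "case", "cc", "conj", "cop", "det",
    "expl:pv", "fixed", "iobj", "mark", "nmod", "nsubj", "obj", "obl",
    "parataxis", "punct", "root", "xcomp",
]

# Labels with a colon are language-specific subtypes of a universal relation.
subtypes = [rel for rel in USED if ":" in rel]

# Main types that occur only in subtyped form, never as a bare label.
always_subtyped = sorted({rel.split(":")[0] for rel in subtypes} - set(USED))

# Universal relations that occur neither bare nor subtyped.
used_main = {rel.split(":")[0] for rel in USED}
unused = [rel for rel in UNIVERSAL if rel not in used_main]

print(len(subtypes), subtypes)
print(always_subtyped)
print(len(unused), unused)
```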