Features
Animacy
: animacy
Similarly to Gender (and to the African noun classes), animacy is usually a lexical feature of nouns and an inflectional feature of other parts of speech that mark agreement with nouns. It is independent of gender and is therefore encoded separately in some tagsets (e.g. all the Multext-East tagsets).
In the BulTreeBank tagset Animacy is not encoded as a special feature.
The dichotomy that plays a role here is rather: Human - Non-human.
With very few exceptions, these features are not encoded grammatically.
Anim: animate
The following pronouns can be considered explicitly animate:
- the masculine accusative forms of some pronouns: Pre-as-m (relative - когото / kogoto “whom”), Pce-as-m (collective - всекиго / vsekigo “everybody”), Pie-as-m (interrogative - кого / kogo “whom”), Pfe-as-m (indefinite - някого / nyakogo “somebody”), Pne-as-m (negative - никого / nikogo “nobody”)
- some pronouns for quantity of humans: Piy (interrogative - колцина / koltsina “how many”); Pfy# (indefinite - неколцина / nekoltsina “few, some”)
- the 1st and 2nd personal and possessive pronouns: Ppe#1 (аз, ние / az, nie “I, we”), Ppe#2 (ти, вие / ti, vie “you, you”), Pph#2 (Вие / Vie “you-honorific”); Ps#1# (мой / moy “my”), Ps#2# (твой / tvoy “your”)
Nhum: animate but non-human
Non-human animate nouns have the so-called count form, in contrast to nouns denoting humans, but only in the masculine. The count form is a kind of plural that comes after numerals.
- два лъва / dva lava “two lions”
Inan: inanimate
Inanimate nouns also have the so-called count form, in contrast to nouns denoting humans, but only in the masculine. The count form is a kind of plural that comes after numerals.
- три стола / tri stola “three chairs”
Note that the symbol ‘#’, used in the Universal POS section, is a placeholder for an arbitrary number of features that are suppressed in the respective tag because they are irrelevant when the BulTreeBank tagset is mapped to the Universal one.
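The placeholder convention can be made concrete with a small matcher. The following is a minimal sketch, not part of the BulTreeBank tooling: the function name `tag_pattern_to_regex` is hypothetical, it assumes that ‘#’ may stand for zero or more characters, and the full tags used in the checks are illustrative only.

```python
import re

def tag_pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a tag pattern in which '#' stands for any run of characters.

    Assumption: '#' may also match an empty string (zero suppressed features).
    """
    # Escape the literal pieces and join them with '.*' in place of each '#'.
    literal_parts = [re.escape(part) for part in pattern.split("#")]
    return re.compile(".*".join(literal_parts))

def matches(pattern: str, tag: str) -> bool:
    """Return True if the full tag fits the pattern."""
    return tag_pattern_to_regex(pattern).fullmatch(tag) is not None

# Illustrative tags, assumed for the example only.
print(matches("Ppe#1", "Ppe-os1"))  # True
print(matches("Ppe#1", "Ppe-os2"))  # False
```

Because the pattern is matched against the whole tag, a trailing ‘#’ (as in Pfy#) accepts any continuation, including none.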
Aspect
: aspect
Aspect is a feature that specifies duration of the action in time, whether the action has been completed etc.
In Bulgarian aspect is a lexical feature, as in other Slavic languages. It comprises two grammemes: perfective and imperfective.
Imp
: imperfect aspect
The action took / takes / will take some time span and there is no information whether and when it was / will be completed.
Examples
- казвам / kazvam “say”
- намирам / namiram “find”
- разбирам / razbiram “understand”
Perf
: perfect aspect
The action has been / will have been completed. Since there is emphasis on one point on the time scale (the point of completion), this aspect does not work well with the present tense for actual activities.
Examples
- кажа / kazha “say”
- намеря / namerya “find”
- разбера / razbera “understand”
Case
: case
In Bulgarian only some nouns have special vocative forms (v):
Examples
- Иване, приятелю, Родино, Стефке / Ivane, priyatelyu, Rodino, Stefke (Ivan, friend, homeland, Stefka)
The cases are still alive in personal pronouns: nominative (n), accusative (a) and dative (d).
Examples
- нея, тя, му, го / neya, tya, mu, go (her.ACC.LONG, she.NOM, him.DAT.SHORT, him.ACC.SHORT).
Accusative and dative cases are still present in the masculine singular forms of some other pronouns – interrogative, indefinite, collective, relative, negative. Note that the dative forms are analytic, so only the accusative form is marked after the preposition ‘на’.
Examples
- кого, някого, никого / kogo, nyakogo, nikogo (whom, someone.ACC, nobody.ACC)
- на кого, на някого, на никого / na kogo, na nyakogo, na nikogo (to whom, to someone.ACC, to nobody.ACC)
In our tagset another idiosyncratic case has been marked – the so-called ‘dative possessive case’ (s). It refers to situations where the short possessive pronoun comes before its possessor noun and thus – next to the verb.
Examples
- Той ми взе шапката / Toy mi vze shapkata ‘He my.POSS took hat.DEF’ (He took my hat.)
The canonical sentence would be: Той взе шапката ми / Toy vze shapkata mi ‘He took hat.DEF my.POSS’ (He took my hat).
Definite
: definiteness or state
Definiteness is typically a feature of nouns, adjectives and articles. Its value distinguishes whether we are talking about something known and concrete, or something general or unknown. It can be marked on definite and indefinite articles, or directly on nouns, adjectives etc.
In Bulgarian there are definite and indefinite articles. The definite article is part of the word, in postposition (жената / zhenata ‘woman-the’ (the woman)). The indefinite article can be the form един / edin (one) or the zero marker.
However, when added to a nominal phrase, the articles become phrasal affixes, i.e. Bulgarian does not have agreement in definiteness. For example, хубавата висока руса жена / hubavata visoka rusa zhena ‘pretty-the tall blond woman’ (the pretty tall blond woman).
Ind
: indefinite
Examples
- Видях една жена да минава по улицата / Vidyah edna zhena da minava po ulitsata “I saw a woman walking on the street”
- Видях жена да минава по улицата / Vidyah zhena da minava po ulitsata “I saw a woman walking on the street”
Def
: definite
Examples
- Видях жената да минава по улицата / Vidyah zhenata da minava po ulitsata “I saw the woman walking on the street”
Degree
: degree of comparison
Degree of comparison is typically an inflectional feature of some adjectives and adverbs.
In Bulgarian the comparative and superlative forms are created with the help of the particles по / po “more” and най / nay “most”, which are part of the word: they precede it and are attached to it with a hyphen.
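Since the degree particles are attached with a hyphen, the degree of a written form can be guessed from its prefix. The sketch below is only a heuristic under that assumption; the function names `degree_of` and `graded_forms` are hypothetical, and the code does not check whether the word is actually a gradable adjective or adverb.

```python
def degree_of(word: str) -> str:
    """Guess the Degree value (Pos, Cmp, Sup) from the hyphenated particle."""
    if word.startswith("най-"):
        return "Sup"
    if word.startswith("по-"):
        return "Cmp"
    # No degree particle: treat the form as the positive degree.
    return "Pos"

def graded_forms(positive: str) -> dict:
    """Build the three degree forms from a positive form by prefixing."""
    return {
        "Pos": positive,
        "Cmp": "по-" + positive,
        "Sup": "най-" + positive,
    }

print(degree_of("по-удобен"))  # Cmp
print(graded_forms("млад"))
```

The prefix test leaves the rest of the word untouched, so forms that also carry the definite article are classified the same way.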
Pos
: positive, first degree
This is the base form that merely states a quality of something, without comparing it to qualities of others. Note that although this degree is traditionally called “positive”, negative properties can be compared, too.
Examples
- удобен стол / udoben stol “a comfortable chair”
- млад човек / mlad chovek “a young man”
Cmp
: comparative, second degree
The quality of one object is compared to the same quality of another object.
Examples
- Моят стол е по-удобен от твоя / Moyat stol e po-udoben ot tvoya “My chair is more comfortable than yours”
- Брат ми е по-млад от мен / Brat mi e po-mlad ot men “My brother is younger than me”
Sup
: superlative, third degree
The quality of one object is compared to the same quality of all other objects within a set.
Examples
- Този стол е най-удобният от всички / Tozi stol e nay-udobniyat ot vsichki “This chair is the most comfortable of all”
- Той е най-младият учител в училището / Toy e nay-mladiyat uchitel v uchilishteto “He is the youngest teacher in the school”
Gender
: gender
Gender is usually a lexical feature of nouns and inflectional feature of other parts of speech (adjectives, verbs) that mark agreement with nouns. In Bulgarian gender is grammatical.
There are three genders: masculine (m), feminine (f) and neuter (n).
Masc: masculine gender
Nouns denoting male persons are masculine. Other nouns may be also grammatically masculine, without any relation to sex.
Example: [bg] замък / zamak “castle”
Fem: feminine gender
Nouns denoting female persons are feminine. Other nouns may be also grammatically feminine, without any relation to sex.
Example: [bg] маса / masa “table”
Neut: neuter gender
Neither masculine nor feminine (grammatically).
Example: [bg] дете / dete “child”
Mood
: mood
Mood is a feature that expresses modality and subclassifies finite verb forms. In Bulgarian there are three moods: Indicative, Imperative and Conditional.
Ind
: indicative
The indicative can be considered the default mood. A verb in indicative merely states that something happens, has happened or will happen, without adding any attitude of the speaker. Indicative covers all the 9 tenses and their passive forms in Bulgarian. It also covers the evidential forms.
Examples
- Следвам право в университета. / Sledvam pravo v universiteta “I study law at the University”.
- Той беше ходил в САЩ много пъти. / Toy beshe hodil v SASHT mnogo pati “He had been to the USA many times”.
Imp
: imperative
The speaker uses imperative to order or ask the addressee to do the action of the verb. The forms in Bulgarian are synthetic.
Examples
- Купете хляб и сирене! / Kupete hlyab i sirene “Buy some bread and cheese!”
- Подай ми солта, моля! / Poday mi solta, molya “Pass me the salt, please!”
Cnd
: conditional
The conditional mood is used to express actions that might happen under certain circumstances or that would have taken place but they actually did not / do not happen. It usually presupposes volition. The forms in Bulgarian are analytic.
Examples
- Бих дошъл, ако ме поканиш. / Bih doshal, ako me pokanish “I would come if you invite me.”
- Бих дошъл, ако имах възможност. / Bih doshal, ako imah vazmozhnost “I would come if I could.”
- Би трябвало добре да се подготвим за срещата. / Bi tryabvalo dobre da se podgotvim za sreshtata “We should prepare very well for the meeting.”
Negative
: whether the word can be or is negated
Negativeness
Negativeness is typically a feature of verbs, adjectives, sometimes also adverbs and nouns in languages that negate using bound morphemes.
In Bulgarian, nouns, adjectives and attributive participles use the bound morpheme не (with the exception of clearly contrastive contexts). Verbs and transgressives, however, use the clitic не for negation.
The negativeness feature is used to distinguish response interjections yes and no.
Pos
: positive, affirmative
Examples
- човек / chovek “man”
- добър / dobar “good”
- разбралата жена / razbralata zhena “the woman that understood”
- вървя / varvya “I am walking”
- вървейки / varveyki “walking”
Neg: negative
Examples
- нечовек / nechovek “not a man”
- недобър / nedobar “not good”
- неразбралата жена / nerazbralata zhena “the woman that did not understand”
- не вървя / ne varvya “I am not walking”
- не вървейки / ne varveyki “not walking”
NumType
: numeral type
Some languages (especially Slavic) have a complex system of numerals. For example, in the school grammar of Czech, the main part of speech is “numeral”, it includes almost everything where counting is involved and there are various subtypes. It also includes interrogative, relative, indefinite and demonstrative words referring to numbers (words like kolik / how many, tolik / so many, několik / some, a few), so at the same time we may have a non-empty value of PronType. (In English, these words are called quantifiers and they are considered a subgroup of determiners.)
In this respect Bulgarian behaves like Czech.
From the syntactic point of view, some numeral types behave like adjectives and some behave like adverbs. We tag them u-pos/ADJ and u-pos/ADV respectively. Thus the NumType feature applies to several different parts of speech:
- u-pos/NUM: cardinal numerals
- u-pos/DET: quantifiers
- u-pos/ADJ: definite adjectival, e.g. ordinal numerals
- u-pos/ADV: adverbial (e.g. ordinal and multiplicative) numerals, both definite and pronominal
Card
: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word
Note that in some Indo-European languages there is a fuzzy borderline between numerals and nouns for thousand, million and billion.
Examples
- [bg] едно, две, три / edno, dve, tri “one, two, three”; колко / kolko “how many”; няколко / nyakolko “some”; толкова / tolkova “so many”; много / mnogo “many”; малко / malko “few”
Ord
: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word
This is a subtype of adjective.
Examples
- [bg] adjectival: първи / parvi “first”; втори / vtori “second”, трети / treti “third”, etc.
Mult
: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word
This is a subtype of adverb.
Examples
- [bg] веднъж / vednazh “once”; дваж / dvazh “twice”
Frac
: fraction
This is a subtype of cardinal numbers, occasionally distinguished in corpora. It may denote a fraction or just the denominator of the fraction. In Bulgarian the numerator is a cardinal numeral and the denominator is an ordinal numeral.
Examples
- [bg] две трети / dve treti “two thirds”
Number
: number
Number is an inflectional feature of nouns, adjectives, verbs. In the tagset it is encoded as: singular (s), plural (p), count (c), pluralia tantum (l). Singularia tantum is not encoded.
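The correspondence between the tagset codes and the feature values described in this section can be written as a lookup table. This is a minimal sketch: the pairing of codes with values is inferred from the descriptions in this section, and the names `BTB_TO_UD_NUMBER` and `ud_number` are hypothetical.

```python
from typing import Optional

# Assumed pairing of BulTreeBank number codes with Number values.
# Singularia tantum has no code in the tagset, so Coll has no entry here.
BTB_TO_UD_NUMBER = {
    "s": "Sing",   # singular
    "p": "Plur",   # plural
    "c": "Count",  # count form after numerals
    "l": "Ptan",   # pluralia tantum
}

def ud_number(code: str) -> Optional[str]:
    """Return the Number value for a tagset code, or None if unknown."""
    return BTB_TO_UD_NUMBER.get(code)

print(ud_number("c"))  # Count
```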
Sing: singular number
A singular noun denotes one person, animal or thing.
Examples: [bg] молив / moliv (pencil)
Plur: plural number
A plural noun denotes several persons, animals or things.
Examples: [bg] моливи / molivi (pencils)
Count: count plural form
A form that is used as plural for masculine non-person nouns after numerals. This is a remnant of the dual form.
Examples: [bg] 2 молива / (2) moliva (2 pencils-count)
Ptan: plurale tantum
Some nouns appear only in the plural form even though they denote one thing (semantic singular); some tagsets mark this distinction.
Examples: [bg] финанси, дънки / finansi, danki (finances, jeans)
Coll: collective / mass / singulare tantum
Collective or mass or singulare tantum is a special case of singular. It applies to words that use grammatical singular to describe sets of objects, i.e. semantic plural.
Examples: [bg] човечество / chovechestvo (mankind)
Person
: person
Person is typically a feature of personal and possessive pronouns, and of verbs. On verbs it is in fact an agreement feature that marks the person of the verb’s subject. Person marked on verbs makes it unnecessary to always add a personal pronoun as subject and thus subjects are sometimes dropped (pro-drop languages).
Bulgarian is a pro-drop language, like other Slavic languages.
1
: first person
In singular, the first person refers just to the speaker / author. In plural, it must include the speaker and one or more additional persons.
Examples
- аз / az “I”
- идвам / idvam “I am coming”
2
: second person
In singular, the second person refers to the addressee of the utterance / text. In plural, it may mean several addressees and optionally some third persons too.
Examples
- ти / ti “you”
- идваш / idvash “You are coming”
3
: third person
The third person refers to one or more persons that are neither speakers nor addressees.
Examples
- той, тя, то / toy, tya, to “he, she, it”
- идва / idva “He/she/it is coming”
Poss
: possessive
Boolean feature of pronouns, determiners or adjectives. It tells whether the word is possessive.
While many tagsets would have “possessive” as one of the various pronoun types, this feature is intentionally separate from PronType, as it is orthogonal to pronominal types. Several of the pronominal types can be optionally possessive, and adjectives can too.
In the BulTreeBank tagset “possessive” is one of the various pronoun types.
Yes
: it is possessive
Note that there is no No value. If the word is not possessive, the Poss feature will just not be mentioned in the FEAT column. (Which means that the empty value has the No meaning.)
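The "absent means No" convention can be illustrated with a short reader for a feature string. This is a sketch that assumes the CoNLL-U style of the column, with `Name=Value` pairs separated by `|` and `_` for an empty column; the function names `parse_feats` and `is_possessive` are hypothetical, and the feature strings are made up for the example.

```python
def parse_feats(feats: str) -> dict:
    """Split a feature string such as 'Number=Sing|Poss=Yes' into a dict."""
    if feats in ("", "_"):
        # An empty column carries no features at all.
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def is_possessive(feats: str) -> bool:
    """A word is possessive only if Poss=Yes is present; absence means No."""
    return parse_feats(feats).get("Poss") == "Yes"

print(is_possessive("Number=Sing|Poss=Yes|PronType=Prs"))  # True
print(is_possessive("Number=Sing|PronType=Prs"))           # False
```

The same lookup works for any other boolean feature that has only a Yes value.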
Examples
- [bg] possessive adjectives: майчина любов / maychina lyubov “mother’s love”
PronType
: pronominal type
This feature typically applies to pronouns, determiners, pronominal numerals (quantifiers) and pronominal adverbs.
Prs
: personal or possessive personal pronoun or determiner
See also the Poss feature that distinguishes normal personal pronouns from possessives. Note that Prs also includes reflexive personal/possessive pronouns (e.g. [cs] se / svůj; see the Reflex feature).
Examples
- аз, ти, той, тя, то, ние, вие, те, себе си, мой, твой, негов, неин, негов, наш, ваш, техен, свой / az, ti, toy, tya, to, nie, vie, te, sebe si, moy, tvoy, negov, nein, negov, nash, vash, tehen, svoy “I, you, he, she, it, we, you (plural), they, oneself, my, your, his, her, its, our, your (plural), their, one’s own”
Rcp
: reciprocal pronoun
Examples
- един друг / edin drug “one another”
- един на друг / edin na drug “each other”
Int
: interrogative pronoun, determiner, numeral or adverb
Note that possessive interrogative determiners (whose) can be distinguished by the Poss feature.
Examples:
- [bg/en] кой / koy “who”, какво / kakvo “what”, кой / koy “which”, чий / chiy “whose”, колко / kolko “how many, how much”, къде / kade “where”, кога / koga “when”, как / kak “how”, защо / zashto “why”
Rel
: relative pronoun, determiner, numeral or adverb
In Bulgarian this class is distinct from the class of interrogatives.
Examples:
- [bg] който / koyto “which”, “that” (relative but not interrogative pronouns); чийто / chiyto “whose” (possessive relative pronoun)
Dem
: demonstrative pronoun, determiner, numeral or adverb
The BulTreeBank tagset does not differentiate between pronouns for nearness and distance, although Bulgarian makes such a distinction.
Examples
- [bg/en] този / tozi “this”, онзи / onzi “that”, такъв / takav “such”, тук / tuk “here”, там / tam “there”, etc.
Tot
: total (collective) pronoun, determiner or adverb
Examples
- [bg/en] всеки / vseki “every, everybody, everyone, each”, всичко / vsichko “everything” “all”, etc.
Neg
: negative pronoun, determiner or adverb
Examples:
- [bg/en] никой / nikoy “nobody”, нищо / nishto “nothing”, никакъв / nikakav “no”, ничий / nichiy “no one’s” (possessive negative pronoun), etc.
Ind
: indefinite pronoun, determiner, numeral or adverb
Examples
- [bg/en] някой / nyakoy “somebody”, нещо / neshto “something”, някакъв / nyakakav “some”, нечий / nechiy “someone’s” (possessive indefinite pronoun), etc.
- [bg/en] който и да е / koyto i da e “whoever, anybody”, каквото и да е / kakvoto i da e “whatever, anything”, etc.
- [bg/en] еди-кой си / edi-koy si “somebody specific for the speaker, but not for the hearer”
Reflex
: reflexive
Boolean feature, typically of pronouns or determiners. It tells whether the word is reflexive, i.e. refers to the subject of its clause.
In Bulgarian the reflexive feature is not encoded as one of the pronoun types, but as a reference type (similarly to entity, attribute, possession, etc.)
In Bulgarian there are reflexive verbs - both as form and as meaning. They are written separately: събуждам се / sabuzhdam se “to wake up”.
Yes
: it is reflexive
Note that there is no No value. If the word is not reflexive, the Reflex feature will just not be mentioned in the FEAT column. (Which means that the empty value has the No meaning.)
Examples
- [bg] reflexive personal pronouns: се, си, себе си / se, si, sebe si “oneself”; reflexive possessive pronoun: свой / svoy “oneself’s”.
Tense
: tense
Tense is a feature that specifies the time when the action took / takes / will take place, in relation to the current moment or to another action in the utterance. In Bulgarian aspect and tense are separate, although not completely independent of each other.
In Bulgarian there are 9 tenses: 3 synthetic and 6 analytic.
Since the feature Tense is assigned to a single word, i.e. it relates to synthetic forms, in Bulgarian it is applicable to only 3 tenses: Present, Aorist and Imperfect.
Past
: past tense / preterite / aorist
The past tense denotes actions that happened before the current moment. In Bulgarian, this is aorist. It can be used with both imperfective and perfective verbs.
Examples
- Те дойдоха навреме. / Te doydoha navreme “They came on time”.
- Взе ли си изпита? / Vze li si izpita? “Did you take the exam?”
Pres
: present tense
The present tense denotes actions that are happening right now, that are crossing the moment of speaking or that usually happen. In Bulgarian present tense has a lot of usages: for actual activities (where the perfective verbs are blocked); for historical events, for habitual activities, etc.
Examples
- В момента чета. / V momenta cheta “I am reading now”.
- Всеки ден чета. / Vseki den cheta “I read every day”.
Imp
: imperfect
Imperfect is a special case of the past tense. It denotes actions that were happening at some past moment. These actions might continue after the moment of speaking, but also might not; this is indicated by the context rather than by the form itself. Both perfective and imperfective verbs are used in the imperfect tense.
Examples
- Когато се прибрах вкъщи, децата вече спяха. / Kogato se pribrah vkashti, detsata veche spyaha “When I came home, the children were already asleep.”
- Щом дойдеше, веднага запалваше цигара. / Shtom doydeshe, vednaga zapalvashe tsigara “Every time he came, he always lit a cigarette”.
VerbForm
: form of verb or deverbative
Even though the name of the feature seems to suggest that it is used exclusively with verbs, it is not the case. Some verb forms in some languages actually form a gray zone between verbs and other parts of speech (nouns, adjectives and adverbs). For instance, participles may be either classified as verbs or as adjectives, depending on language and context. In both cases VerbForm=Part may be used to separate them from other verb forms or other types of adjectives.
Bulgarian does not have an infinitive. It distinguishes: finite verbs and non-finite verbs (participles and transgressives).
Fin
: finite verb
Rule of thumb: if it has non-empty Mood, it is finite.
This feature is encoded by the following values in the second position of verbal tags: Vp# (personal verb); Vn# (impersonal verb); Vx#, Vy# and Vi# (auxiliary verbs).
Examples
- Аз съм, ти си / Az sam, ti si “I am, you are”
- Трябва да дойдеш /Tryabva da doydesh “You must come”
- Прочетох книгата / Prochetoh knigata “I read the book”
Part
: participle
A participle is a non-finite verb form that shares properties of verbs and adjectives. The participle in Bulgarian is encoded as c in the fifth position of the tag: V#c#.
In Bulgarian there are four types of participles: present active, past perfective active, past imperfective active, past passive. The present active one can be used only adjectivally; the past imperfective one can be used only in evidential verb forms; the others have both usages. The present active can be derived only from imperfective verbs.
Examples
- виждащ / vizhdasht “seeing” (present active). BulTreeBank tag: V#car#
- видял / vidyal “seen” (past perfective active). BulTreeBank tag: V#cao#
- видел / videl “seen” (past imperfective active). BulTreeBank tag: V#cam#
- видян / vidyan “seen” (past passive). BulTreeBank tag: V#cv#
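The four tag patterns above can be read off positionally. The sketch below assumes that the participle marker c sits in the fifth position, as stated above, and that the characters following it are the ones shown in the patterns (ar, ao, am, v). The function name `participle_type` is hypothetical and the full tags in the checks are illustrative, not taken from the treebank.

```python
from typing import Optional

# Characters that follow 'c' in the active participle patterns listed above.
ACTIVE_PARTICIPLES = {
    "ar": "present active",
    "ao": "past perfective active",
    "am": "past imperfective active",
}

def participle_type(tag: str) -> Optional[str]:
    """Return the participle type for a verbal tag, or None if not a participle."""
    # Fifth position (index 4) must be 'c' for a participle.
    if len(tag) < 6 or tag[0] != "V" or tag[4] != "c":
        return None
    if tag[5] == "v":
        return "past passive"
    return ACTIVE_PARTICIPLES.get(tag[5:7])

print(participle_type("Vpitcar-smi"))  # present active
print(participle_type("Vpptcv--smi"))  # past passive
```

Tags that are not participles, or whose continuation is not one of the four listed patterns, yield None rather than a guess.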
Trans
: transgressive
The transgressive, also called adverbial participle, is a non-finite verb form that shares properties of verbs and adverbs. It appears e.g. in Slavic and Indo-Aryan languages.
In Bulgarian it can be derived only from imperfective verbs.
Examples
- Виждайки това, той се разстрои / Vizhdayki tova, toy se razstroi “Having seen this, he became upset”. BulTreeBank tag: V#g
Note that the symbol ‘#’, used in the Universal POS section, is a placeholder for an arbitrary number of features that are suppressed in the respective tag because they are irrelevant when the BulTreeBank tagset is mapped to the Universal one.
Voice
: voice
For Indo-European speakers, voice means mainly the active-passive distinction. In other languages, other shades of verb meaning are categorized as voice.
In Bulgarian linguistics there are various theories of Voice distinctions: a 2-voice one (active vs. passive), a 3-voice one (active vs. passive vs. reflexive) and a 4-voice one (active vs. passive vs. reflexive vs. impersonal).
Here the 2-voice theory is adopted.
Act
: active voice
The subject of the verb is the doer of the action (agent), the object is affected by the action (patient).
Examples
- Нападнахме врага. / Napadnahme vraga “We attacked the enemy”.
- Децата се засмяха. / Detsata se zasmyaha “The children laughed”.
- Децата се измиха. / Detsata se izmiha “The children washed themselves”.
Pass
: passive voice
The subject of the verb is affected by the action (patient). The doer (agent) is either unexpressed or it appears as an object of the verb. In Bulgarian there are two ways of forming passive:
- tenses plus the reflexive particle se
- special participial conjugation
Examples
- Тази книга се чете лесно. / Tazi kniga se chete lesno “This book reads easily”.
- Тази книга беше прочетена по-бързо от другите. / Tazi kniga beshe prochetena po-barzo ot drugite “This book was read faster than the others”.