This page still pertains to UD version 1.

Features

Lexical features: PronType, NumType, +NumForm, Poss, Reflex, +Abbr, +Foreign

Inflectional features (nominal): Gender, Animacy, Number, Case, Definite, Degree, +Gender[psor], +Number[psor], +Variant

Inflectional features (verbal): VerbForm, Mood, Tense, Aspect, Voice, Person, Negative

Abbr: abbreviation

Abbr is a feature of words tagged X (other) that marks abbreviations ending with a dot.

Yes: it is an abbreviation

Conversion from JOS

Currently, all abbreviations are converted to X and assigned the Abbr=Yes feature.

Animacy: animacy

In contrast to some other languages, the Slovenian tagset does not consider Animacy to be a lexical feature, as certain types of inanimate nouns, such as institutions, personified objects, brand names etc., often take on both semantic and grammatical features of animate nouns.

Animacy is thus only marked as an inflectional feature of masculine nouns and proper nouns to distinguish between animate and inanimate word forms in accusative singular, e.g. Odstrigla si je koder. “She cut off a curl.” (inanimate) vs. Videla je kodra. “She saw a poodle.” (animate).

Anim: animate

Animate value is attributed to masculine nouns in accusative singular usually ending in -a:

Note that grammatical animacy can also apply to semantically inanimate nouns, such as car names, personified objects, brand names, card names etc.

Inan: inanimate

Inanimate value is attributed to all other masculine nouns in accusative singular:

Conversion from JOS

All nouns with Animate=yes are converted to Animacy=Anim and all nouns with Animate=no are converted to Animacy=Inan.

Aspect: aspect

Aspect is a lexical feature of verbs that specifies the duration and completion of action in time.

Slovenian grammar distinguishes two aspect values: imperfect and perfect. Pairs of imperfective and perfective verbs exist and are often morphologically related, but the verbs are considered to belong to separate lemmas.

Imp: imperfect aspect

The action took / takes / will take some time span and there is no information whether and when it was / will be completed.

Perf: perfect aspect

The action has been / will have been completed. The emphasis is on one point on the time scale (the point of completion).

Verbs without Aspect

Verbs without Aspect are considered to be biaspectual, i.e. they can either denote duration or completion, but their actual interpretation depends on the context.

Conversion from JOS

Verbs with Aspect=perfective are converted to Aspect=Perf, verbs with Aspect=imperfective are converted to Aspect=Imp, and verbs with Aspect=biaspectual are not assigned the Aspect feature.
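The rule above can be sketched as a small lookup in which the biaspectual value simply produces no feature. This is only an illustration in Python, not the actual conversion script; the function name `convert_aspect` and the dictionary representation of JOS features are assumptions made for the example.

```python
# Hypothetical lookup from JOS Aspect values to UD Aspect values.
JOS_TO_UD_ASPECT = {
    "perfective": "Perf",
    "imperfective": "Imp",
    # "biaspectual" is deliberately absent: such verbs get no Aspect feature.
}


def convert_aspect(jos_feats):
    """Return the UD features derived from the JOS Aspect value, if any."""
    ud_value = JOS_TO_UD_ASPECT.get(jos_feats.get("Aspect"))
    return {"Aspect": ud_value} if ud_value else {}


print(convert_aspect({"Aspect": "perfective"}))   # {'Aspect': 'Perf'}
print(convert_aspect({"Aspect": "biaspectual"}))  # {}
```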

Case: case

Case is an inflectional feature of adjectives, determiners, nouns, numerals, pronouns and proper nouns, as well as a valency (lexical) feature of adpositions.

Slovenian morphology distinguishes six cases: Nom, Gen, Dat, Acc, Loc and Ins.

Nom: nominative

The base form of the noun, also used as citation form (lemma). This is the word form usually used for subjects of clauses and answers the question Kdo ali kaj (je)? “Who or what (is there)?”.

Gen: genitive

In addition to its prototypical meaning of a noun modifying another noun, the genitive in Slovenian is used in certain syntactic positions, such as negation, or with certain prepositions. A word form in the genitive answers the question Koga ali česa (ni)? “Who or what (is not there)?”

Dat: dative

This is the word form often used for indirect objects of verbs. It answers the question Komu ali čemu (dam)? “To whom or what (do I give)?”

Acc: accusative

This is the word form most frequently used for direct objects of verbs. It answers the question Koga ali kaj (vidim)? “Who or what (do I see)?”

Loc: locative

The locative case often expresses location in space or time, but non-locational meanings also exist. The locative word form answers the question O kom ali čem (govorim)? “Of whom or what (do I speak)?” and usually appears with a preposition.

Ins: instrumental

The instrumental case often expresses the instrument, means or accompaniment of an action, but other meanings also exist. The instrumental word form answers the question S kom ali čim (delam)? “With whom or what (do I work)?”

Conversion from JOS

Tokens with feature Case=nominative are converted to Case=Nom, tokens with feature Case=genitive are converted to Case=Gen, tokens with feature Case=dative are converted to Case=Dat, tokens with feature Case=accusative are converted to Case=Acc, tokens with feature Case=locative are converted to Case=Loc, tokens with feature Case=instrumental are converted to Case=Ins.
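The six-way mapping above is easier to scan as a table. The following Python sketch restates it; the names `JOS_TO_UD_CASE` and `convert_case` are hypothetical, and raising an error on unexpected values is a design choice for the example rather than documented behavior of the conversion.

```python
# One entry per Slovenian case, as listed in the conversion rule above.
JOS_TO_UD_CASE = {
    "nominative": "Nom",
    "genitive": "Gen",
    "dative": "Dat",
    "accusative": "Acc",
    "locative": "Loc",
    "instrumental": "Ins",
}


def convert_case(jos_value):
    """Map a JOS Case value to its UD counterpart, rejecting unknown values."""
    try:
        return JOS_TO_UD_CASE[jos_value]
    except KeyError:
        raise ValueError(f"unexpected JOS Case value: {jos_value!r}") from None


print(convert_case("locative"))  # Loc
```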

Definite: definiteness or state

Definiteness in Slovenian is an inflectional feature of masculine word forms in nominative and accusative singular that distinguishes whether we are talking about something known and concrete, or something general or unknown. It is currently marked on some adjectives and some determiners.

Ind: indefinite

Examples

Def: definite

Examples

Conversion from JOS

All adjectives and numerals with feature Definiteness=yes are converted to Definite=Def and all adjectives and numerals with feature Definiteness=no are converted to Definite=Ind.

However, note that definiteness has not been sufficiently resolved within JOS, as it could also be attributed to other POS categories or category types, such as ordinal written numerals (prvi, drugi, tretji) and pronouns (neki). Consequently, some inconsistencies occur, as different JOS categories are merged into one UD category. For example, JOS-adjective slovenski “Slovenian” has feature Definite=Def, whereas JOS-numeral prvi “the first” does not have any Definite feature, although they are both tagged as UD ADJ and display identical grammatical characteristics.

Degree: degree of comparison

Degree of comparison is an inflectional feature of some adjectives and adverbs.

Pos: positive, first degree

This is the base form that merely states a quality of something, without comparing it to qualities of others.

Examples

Cmp: comparative, second degree

The quality of one object is compared to the same quality of another object. For most adverbs, two different word forms can be used interchangeably.

Examples

Sup: superlative, third degree

The quality of one object is compared to the same quality of all other objects within a set. For most adverbs, two different word forms can be used interchangeably.

Examples

Conversion from JOS

All adjectives and adverbs with feature Degree=positive are converted to Degree=Pos, all adjectives and adverbs with feature Degree=comparative are converted to Degree=Cmp, all adjectives and adverbs with Degree=superlative are converted to Degree=Sup.

Foreign: is this a foreign word?

Foreign is a lexical feature of some words belonging to class X: other. It is assigned to intervening foreign words that have not been analyzed grammatically. These can appear as a string or in combination with other Slovenian words. If a word is commonly used in Slovenian and displays Slovenian grammatical behavior, such as inflection, it is considered to be a Slovenian (loan) word, not a foreign word.

Foreign: it is foreign

Examples

Conversion from JOS

All tokens with tag Residual and Type=foreign are converted to X and Foreign=Yes.

Gender: gender

Gender is a lexical feature of nouns and proper nouns, and an inflectional feature of other parts of speech (adjectives, verbs, auxiliaries, pronouns, determiners and numerals) that mark agreement with nouns.

Masc: masculine gender

Examples

Fem: feminine gender

Examples

Neut: neuter gender

Examples

Conversion from JOS

All tokens with feature Gender=masculine are converted to Gender=Masc, all tokens with feature Gender=feminine are converted to Gender=Fem and all tokens with feature Gender=neuter are converted to Gender=Neut.

Gender[psor]: possessor’s gender

Possessive pronouns and determiners may have two different genders: that of the possessed object (in agreement with the modified noun, inflectional feature) and that of the possessor (inherent, lexical feature). The Gender[psor] feature denotes the possessor’s gender.

Masc: masculine possessor

Examples

Fem: feminine possessor

Examples

Neut: neuter possessor

As the possessor can also be a neuter noun, we also distinguish the neuter possessor gender. However, its word forms are identical to those of the masculine possessor gender and can only be disambiguated in context.

Examples

Conversion from JOS

All pronouns with feature Owner_gender=masculine are converted to Gender[psor]=Masc, all pronouns with Owner_gender=feminine are converted to Gender[psor]=Fem and all pronouns with Owner_gender=neuter are converted to Gender[psor]=Neut.

Note that the JOS annotation scheme does not assign the possessor’s gender to possessive adjectives. For example, the possessive adjectival word form sinova (son’s) in sinova mama “son’s mother” is currently annotated with Gender=Fem, whereas it should be annotated with Gender[psor]=Masc|Gender=Fem in the future.
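To make the two-gender distinction concrete, the sketch below converts both the agreement gender and the possessor’s gender and serializes them as a CoNLL-U style FEATS string. It is an illustration only: the function names are hypothetical, the input dictionaries are made-up JOS feature sets, and the output sorts features by name (so Gender precedes Gender[psor]), which is an assumption about the serialization rather than something stated on this page.

```python
# Shared value mapping for Gender and Owner_gender.
GENDER = {"masculine": "Masc", "feminine": "Fem", "neuter": "Neut"}


def convert_genders(jos_feats):
    """Convert JOS Gender and Owner_gender to UD Gender and Gender[psor]."""
    ud = {}
    if "Gender" in jos_feats:
        ud["Gender"] = GENDER[jos_feats["Gender"]]
    if "Owner_gender" in jos_feats:
        ud["Gender[psor]"] = GENDER[jos_feats["Owner_gender"]]
    return ud


def to_feats_string(ud_feats):
    """Join Name=Value pairs with '|', sorted by name; '_' if there are none."""
    pairs = sorted(ud_feats.items(), key=lambda item: item[0].lower())
    return "|".join(f"{name}={value}" for name, value in pairs) or "_"


# A possessive pronoun: feminine possessed noun, masculine possessor.
pronoun = {"Gender": "feminine", "Owner_gender": "masculine"}
print(to_feats_string(convert_genders(pronoun)))    # Gender=Fem|Gender[psor]=Masc

# A possessive adjective: JOS records no Owner_gender, so only Gender survives.
adjective = {"Gender": "feminine"}
print(to_feats_string(convert_genders(adjective)))  # Gender=Fem
```

The second call mirrors the sinova example: because JOS provides no possessor gender for possessive adjectives, nothing is available to convert into Gender[psor].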

Mood: mood

Mood is a feature that expresses modality and subclassifies finite verb forms. It is an inflectional feature of auxiliaries and verbs.

Ind: indicative

The indicative can be considered the default mood. A verb in indicative merely states that something happens, has happened or will happen, without adding any attitude of the speaker.

Examples

Imp: imperative

The speaker uses imperative to order or ask the addressee to do the action of the verb.

Examples

Cnd: conditional

Generally, the conditional mood is used to express actions that would have taken place under some circumstances but they actually did not / do not happen. In Slovenian, present and past conditional are formed using the participle of the content verb and a special conditional form of the auxiliary verb biti “to be”. Thus, only this form is marked as Cnd, regardless of whether it is used to form a conditional or any other type of modality.

Examples

Conversion from JOS

All verbs with VForm=present or VForm=future are converted to Mood=Ind, all verbs with VForm=imperative are converted to Mood=Imp and all verbs with VForm=conditional are converted to Mood=Cnd. The non-finite verb forms (participle, infinitive, supine) do not have any Mood.
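As a rough sketch of this rule, the JOS VForm value alone determines Mood, and non-finite forms yield no value. The names `VFORM_TO_MOOD` and `convert_mood` are hypothetical and only serve the illustration.

```python
# Finite JOS verb forms and the UD Mood each one receives.
VFORM_TO_MOOD = {
    "present": "Ind",
    "future": "Ind",
    "imperative": "Imp",
    "conditional": "Cnd",
}


def convert_mood(vform):
    """Return the UD Mood for a JOS VForm value, or None for non-finite forms."""
    return VFORM_TO_MOOD.get(vform)


for vform in ("present", "conditional", "participle"):
    print(vform, convert_mood(vform))
```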

Negative: whether the word can be or is negated

In Slovenian, negation can be expressed in different ways. Syntactically, it can be marked by using the negation particle ne “not”, as in Tega ne vem “I do not know that.” or Šli smo na ne najbolj zanimivo predavanje. “We attended a not too interesting lecture.”, or by using a special negated verb form, as in To ni ona. “This is not her.” Morphologically, negation is marked by the prefix ne-, as in nepravičen “unfair”, neželen “unwanted” etc.

In the Slovenian UD Treebank, we currently only mark negation as an inflectional feature of a limited set of verbs and auxiliaries: biti “to be”, imeti “to have”, hoteti “to want”.

Neg: negative

Negative is assigned to negated word forms of verbs biti, imeti, hoteti.

Examples

Pos: positive, affirmative

Positive is assigned to non-negated word forms of verbs biti, imeti, hoteti with identical set of other grammatical features.

Examples

Conversion from JOS

All verbs with feature Negative=no are converted to Negative=Pos and all verbs with feature Negative=yes are converted to Negative=Neg.

NumForm: numeral form

NumForm is a lexical feature of numerals that marks whether the number is expressed by digits or letters.

Word: number expressed as word

Examples

Digit: number expressed using digits

Examples

Roman: Roman numeral

Examples

Conversion from JOS

NumForm is assigned to all numerals that are converted to UD NUM. Numerals with Form=digit are converted to NumForm=Digit, numerals with Form=roman are converted to NumForm=Roman and numerals with Form=letter are converted to NumForm=Word. Note, however, that (word) numerals that are converted to UD ADJ do not have any NumForm.
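The dependence on the target part of speech can be sketched as follows. This is an illustrative Python fragment under the assumption that the UD part-of-speech tag has already been decided before features are converted; `convert_numform` is a hypothetical name.

```python
# JOS Form values and the UD NumForm value each one maps to.
FORM_TO_NUMFORM = {"digit": "Digit", "roman": "Roman", "letter": "Word"}


def convert_numform(upos, jos_form):
    """Return NumForm for numerals tagged NUM; other tags get no NumForm."""
    if upos != "NUM":
        return None
    return FORM_TO_NUMFORM.get(jos_form)


print(convert_numform("NUM", "letter"))  # Word
print(convert_numform("ADJ", "letter"))  # None
```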

NumType: numeral type

In the Slovenian UD Treebank, NumType is a lexical feature of numerals and some adjectives that denote counting by numbers.

Card: cardinal number

Examples

Ord: ordinal number

Examples

Sets: number of sets of things

Numerals used to count sets of things or nouns that are pluralia tantum.

Examples

Gen: generic numeral, i.e. a numeral that is neither of the above

Examples

Conversion from JOS

All numerals with Type=cardinal are converted to NumType=Card and all numerals with Type=ordinal are converted to NumType=Ord. Numerals with Type=pronominal are either converted to NumType=Card (lemmas en and eden) or to NumType=Ord (lemma drug). Numerals with Type=special are either converted to NumType=Sets (lemmas not ending in -en) or to NumType=Gen (lemmas ending in -en).

Note that other types of quantifying words have not been explicitly marked in JOS, so assigning these and other NumType values to other words or part-of-speech categories, such as adjectives (enkraten, dvakraten, trikraten), adverbs (enkrat, dvakrat, trikrat; prvič, drugič, tretjič), determiners (veliko, malo, nekaj, koliko) and nouns (tretjina, polovica, četrtina), remains for future work.
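The lemma-dependent conversion rules for numerals described above can be sketched in Python as below. The function name `convert_numtype` is hypothetical, and the lemmas dvoje and dvojen in the usage lines are chosen only to illustrate the -en test; they are not taken from this page.

```python
def convert_numtype(jos_type, lemma):
    """Return the UD NumType for a JOS numeral, or None if no rule applies."""
    if jos_type == "cardinal":
        return "Card"
    if jos_type == "ordinal":
        return "Ord"
    if jos_type == "pronominal":
        # Pronominal numerals are split by lemma.
        if lemma in ("en", "eden"):
            return "Card"
        if lemma == "drug":
            return "Ord"
        return None
    if jos_type == "special":
        # Lemmas ending in -en are generic; all other special numerals count sets.
        return "Gen" if lemma.endswith("en") else "Sets"
    return None


print(convert_numtype("pronominal", "eden"))  # Card
print(convert_numtype("special", "dvoje"))    # Sets
print(convert_numtype("special", "dvojen"))   # Gen
```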

Number: number

In Slovenian, Number is an inflectional feature of nouns and proper nouns, and other parts of speech (adjectives, auxiliaries, determiners, numerals, pronouns, verbs) that mark agreement with nouns.

Slovenian distinguishes three Number values: singular, dual and plural. Pluralia tantum and singularia tantum are not explicitly marked and are tagged as plural or singular, respectively.

Sing: singular number

Examples

Dual: dual number

Examples

Plur: plural number

Examples

Conversion from JOS

All tokens with feature Number=singular are converted to Number=Sing, all tokens with Number=dual are converted to Number=Dual and all tokens with Number=plural are converted to Number=Plur.

Number[psor]: possessor’s number

Possessive pronouns and determiners may have two different numbers: that of the possessed object (in agreement with the modified noun, inflectional feature) and that of the possessor (inherent, lexical feature). The Number[psor] feature denotes the possessor’s number.

Sing: singular possessor

Examples

Dual: dual possessor

Examples

Plur: plural possessor

Examples

Conversion from JOS

All pronouns with feature Owner_number=singular are converted to Number[psor]=Sing, all pronouns with Owner_number=dual are converted to Number[psor]=Dual and all pronouns with Owner_number=plural are converted to Number[psor]=Plur.

Person: person

Person is a lexical feature of personal and possessive pronouns and determiners, and an inflectional feature of auxiliaries and verbs.

1: first person

Examples

2: second person

Examples

3: third person

Examples

Conversion from JOS

All tokens with feature Person=first are converted to Person=1, all tokens with feature Person=second are converted to Person=2 and all tokens with feature Person=third are converted to Person=3.

Poss: possessive

Possessive is a lexical feature of adjectives, determiners and pronouns. It tells whether the word is possessive. Words without the Poss feature are not possessive.

Yes: it is possessive

Examples

Conversion from JOS

All adjectives and pronouns with Type=possessive are converted to Poss=Yes. Additionally, the reflexive pronoun svoj is also converted to Poss=Yes. Note that within the JOS annotation scheme, possessiveness is not explicitly marked on other types of pronouns that denote possession, such as čigav, čigaver, nikogaršnji etc.
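A minimal sketch of this rule, assuming the JOS Type value and the lemma are both available for each token: the feature is added for Type=possessive and, as a lemma-based exception, for svoj. The function name `convert_poss` and the lemmas moj and jaz in the usage lines are illustrative assumptions.

```python
def convert_poss(jos_type, lemma):
    """Return {'Poss': 'Yes'} for possessive words and for the lemma svoj."""
    if jos_type == "possessive" or lemma == "svoj":
        return {"Poss": "Yes"}
    # Words without the Poss feature are not possessive.
    return {}


print(convert_poss("possessive", "moj"))  # {'Poss': 'Yes'}
print(convert_poss("reflexive", "svoj"))  # {'Poss': 'Yes'}
print(convert_poss("personal", "jaz"))    # {}
```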

PronType: pronominal type

This is a lexical feature of pronouns and some determiners.

Prs: personal pronoun or determiner

This value covers nominal personal, possessive and reflexive pronouns or determiners.

Examples

Int: interrogative pronoun or determiner

Examples:

Rel: relative pronoun or determiner

Examples:

Dem: demonstrative pronoun or determiner

Examples

Tot: total (collective) pronoun or determiner

Examples

Neg: negative pronoun, determiner or adverb

Examples

Ind: indefinite pronoun, determiner, numeral or adverb

Examples

Conversion from JOS

All pronouns with feature Type=personal, Type=reflexive and Type=possessive are converted to UD PronType=Prs. All pronouns with Type=interrogative are converted to UD PronType=Int, all pronouns with Type=relative are converted to UD PronType=Rel, all pronouns with Type=demonstrative are converted to UD PronType=Dem, all pronouns with Type=general are converted to UD PronType=Tot, all pronouns with Type=negative are converted to UD PronType=Neg and all pronouns with Type=indefinite are converted to UD PronType=Ind.
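Since three JOS types collapse into a single UD value, the mapping is many-to-one. The Python sketch below restates it as a table and groups the JOS types by their UD value; the names `JOS_TYPE_TO_PRONTYPE` and `group_by_prontype` are hypothetical.

```python
from collections import defaultdict

# JOS pronoun Type values and the UD PronType value each one maps to.
JOS_TYPE_TO_PRONTYPE = {
    "personal": "Prs",
    "reflexive": "Prs",
    "possessive": "Prs",
    "interrogative": "Int",
    "relative": "Rel",
    "demonstrative": "Dem",
    "general": "Tot",
    "negative": "Neg",
    "indefinite": "Ind",
}


def group_by_prontype(mapping):
    """Invert the mapping: UD PronType value -> list of JOS types."""
    grouped = defaultdict(list)
    for jos_type, ud_value in mapping.items():
        grouped[ud_value].append(jos_type)
    return dict(grouped)


grouped = group_by_prontype(JOS_TYPE_TO_PRONTYPE)
print(grouped["Prs"])  # ['personal', 'reflexive', 'possessive']
```

Because the original JOS type is lost under PronType=Prs, the Poss and Reflex features are what distinguish possessive and reflexive pronouns after conversion.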

Note that currently PronType is only assigned to pronouns and determiners, but not to other POS categories, such as adverbs (zakaj “why”, čemu “what for”, kako “how”, tukaj “here”, tam “there”, tolikokrat “this many times” etc.).

Reflex: reflexive

Reflexiveness is a lexical feature of some pronouns and determiners. It tells whether the word is reflexive, i.e. refers to the subject of its clause.

We distinguish three types of reflexive word forms in Slovenian: the reflexive pronoun se “oneself”, bound reflexive pronouns (e.g. zase “for oneself”) and the possessive pronoun/determiner svoj “one’s own”.

Note that reflexive pronoun se can also be used in subjectless passive constructions and as a free morpheme of pseudo-reflexive verbs.

Yes: it is reflexive

Examples

Conversion from JOS

All pronouns with feature Type=reflexive are converted to UD Reflex=Yes.

Tense: tense

Tense is an inflectional feature of verbs and auxiliaries that specifies the time when the action took / takes / will take place, in relation to the current moment or to another action in the utterance. In Slovenian, only the present tense and the future tense can be expressed morphologically, while the past tense is formed syntactically, by a combination of a present-tense form of the auxiliary verb biti “to be” and the past participle (l-participle), e.g. sem šel “I went”.

Pres: present tense

The present tense denotes actions that are happening right now or that usually happen.

Examples

Fut: future tense

The future tense denotes actions that will happen after the current moment. Similarly to the past tense, the future tense of most Slovenian verbs is formed by a combination of the future auxiliary verb biti “to be” and the l-participle (bom hodil “I will walk”). Thus, the Tense=Fut feature is only assigned to forms of biti, which can be used either as a content verb or as an auxiliary verb.

Examples:

Conversion from JOS

All verbs with VForm=present are converted to UD Tense=Pres and all verbs with VForm=future are converted to UD Tense=Fut. We do not assign tense to other verb forms, such as participle, infinitive, supine, conditional and imperative.

Variant: alternative form of word

In Slovenian, the Variant feature is either a lexical or an inflectional feature of some pronouns.

Bound: bound form

This value is assigned as a lexical feature of fused combinations of prepositions and personal pronouns that are currently tokenized as one word form and annotated as personal pronouns.

Examples

Short: clitic form

This value is assigned as an inflectional feature to clitic personal pronouns in genitive, dative and accusative to distinguish them from their longer counterparts with the same lemma and set of features.

Examples

VerbForm: form of verb or deverbative

VerbForm is principally an inflectional feature of verbs and auxiliaries; however, it is also used as a lexical feature of some adjectives and adverbs.

Fin: finite verb

Verbs that have a non-empty Mood are considered finite.

Examples

Inf: infinitive

Infinitive is the citation form of verbs and it appears as the argument of modal and other verbs.

Examples

Part: participle

Participle is a non-finite verb form that shares properties of verbs and adjectives. We distinguish two groups of participles: l-participles that can either be classified as verbs or adjectives, and all other participles (usually ending in , -n, or -t) that are always classified as adjectives, regardless of whether they are used as attributes or predicates.

As verbs, l-participles are used to form the past and future tense, and the conditional mood in present or past tense. As adjectives, both groups of participles can be used either as noun attributes (ukradena denarnica “stolen wallet”), as subject complements (denarnica je ukradena “the wallet is stolen”) or in passive constructions (denarnica je bila ukradena “the wallet has been stolen”).

Examples

Trans: transgressive

The transgressive, also called adverbial participle, is a non-finite verb form that shares properties of verbs and adverbs. In Slovenian, transgressives are always marked as adverbs.

Examples

Conversion from JOS

All verbs with feature VForm=present, VForm=future, VForm=conditional or VForm=imperative are converted to UD VerbForm=Fin. All verbs with VForm=infinitive are converted to UD VerbForm=Inf, all verbs with VForm=supine are converted to UD VerbForm=Sup, and all verbs with VForm=participle are converted to UD VerbForm=Part. Additionally, all adjectives with Type=participle are converted to UD VerbForm=Part and all adverbs with Type=participle are converted to UD VerbForm=Trans.
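Unlike most other features, VerbForm depends on the JOS category as well as on a feature value. The sketch below illustrates that dependence; it is not the actual conversion code. The function name, the category labels "Verb", "Adjective" and "Adverb", and the dictionary input are assumptions, and the supine is written with the UD value name Sup.

```python
# JOS VForm values that count as finite, and the non-finite ones with UD values.
FINITE_VFORMS = {"present", "future", "conditional", "imperative"}
NONFINITE_VFORMS = {"infinitive": "Inf", "supine": "Sup", "participle": "Part"}


def convert_verbform(jos_category, jos_feats):
    """Return the UD VerbForm for a token, or None if it receives none."""
    if jos_category == "Verb":
        vform = jos_feats.get("VForm")
        if vform in FINITE_VFORMS:
            return "Fin"
        return NONFINITE_VFORMS.get(vform)
    if jos_feats.get("Type") == "participle":
        # Participial adjectives are participles; participial adverbs are transgressives.
        if jos_category == "Adjective":
            return "Part"
        if jos_category == "Adverb":
            return "Trans"
    return None


print(convert_verbform("Verb", {"VForm": "conditional"}))    # Fin
print(convert_verbform("Adjective", {"Type": "participle"}))  # Part
print(convert_verbform("Adverb", {"Type": "participle"}))     # Trans
```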

Note that gerunds are currently marked as nouns and do not have a special VerbForm feature to distinguish them from other common nouns.
