This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

DET: determiner

Definition

Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context. That is, a determiner may indicate whether the noun is referring to a definite or indefinite element of a class, to a closer or more distant element, to an element belonging to a specified person or thing, to a particular number or quantity, etc.

The traditional grammar of Slovenian does not define determiners as a separate word class. Instead, words that perform the syntactic function of determiners are categorized either as adverbs (nekaj “some”, veliko “a lot of”, dovolj “enough of” etc.) or as pronouns (ta “this”, ves “all”, moj “my”, vsak “each” etc.), regardless of whether they are used attributively (To.DET besedilo je nerazumljivo. “This text is incomprehensible.”) or substantively (To.PRON sem že slišal. “I have heard this before.”).

Conversion from JOS

Since the JOS morphosyntactic specifications do not distinguish substantive from attributive pronouns, or quantifying adverbs from other adverbs, the conversion relies on syntactic information. Pronouns that modify a noun are marked as DET; all other pronouns are marked as PRON. Similarly, the list of adverbs found modifying a noun was manually validated to define a closed set of quantifying adverbs, which are marked as DET.
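The conversion rule described above can be illustrated with a minimal sketch. This is not the actual conversion script: the function name `convert_tag`, the simplified category labels and the contents of `QUANTIFYING_ADVERBS` are assumptions made for illustration (the example lemmas are taken from this page, but the full validated list is not reproduced here). The fallback to ADV for other adverbs is likewise assumed rather than stated in the text.

```python
# Hypothetical closed set of quantifying adverbs. The members are examples
# mentioned on this page; the real manually validated list is not shown here.
QUANTIFYING_ADVERBS = {"nekaj", "veliko", "dovolj", "malo", "mnogo"}


def convert_tag(lemma, jos_category, head_category):
    """Return a UD tag for a JOS pronoun or adverb.

    jos_category and head_category are simplified labels ("Pronoun",
    "Adverb", "Noun", "Verb", ...) assumed for this sketch; head_category
    is the category of the word's syntactic head.
    """
    modifies_noun = head_category == "Noun"
    if jos_category == "Pronoun":
        # Attributive use (modifying a noun) -> DET, otherwise PRON.
        return "DET" if modifies_noun else "PRON"
    if jos_category == "Adverb":
        # Only adverbs from the closed set become DET when modifying a noun.
        if modifies_noun and lemma in QUANTIFYING_ADVERBS:
            return "DET"
        return "ADV"  # assumed fallback, not stated in the source text
    raise ValueError(f"category not covered by this sketch: {jos_category}")


# "To besedilo je nerazumljivo." -> "to" modifies the noun "besedilo"
print(convert_tag("ta", "Pronoun", "Noun"))
# "To sem že slišal." -> "to" depends on the verb
print(convert_tag("ta", "Pronoun", "Verb"))
```

The two calls mirror the attributive and substantive examples given in the definition and yield DET and PRON respectively.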

Examples


Treebank Statistics (UD_Slovenian)

There are 63 DET lemmas (0%), 289 DET types (1%) and 3310 DET tokens (2%). Out of 16 observed tags, the rank of DET is: 8 in number of lemmas, 7 in number of types and 13 in number of tokens.

The 10 most frequent DET lemmas: ta, svoj, ves, njegov, nekaj, naš, njen, več, njihov, vsak

The 10 most frequent DET types: nekaj, ta, več, svoje, tem, te, svojo, vse, to, vseh

The 10 most frequent ambiguous lemmas: ta (PRON 807, DET 552), svoj (DET 456, PRON 23), ves (DET 289, PRON 198, ADV 1), njegov (DET 244, PRON 6), nekaj (DET 166, PRON 65, ADV 19), naš (DET 141, PRON 3), njen (DET 135, PRON 4), več (DET 125, PART 79, ADV 54), njihov (DET 117, PRON 2), vsak (DET 117, PRON 21)

The 10 most frequent ambiguous types: nekaj (DET 147, PRON 58, ADV 17), ta (DET 85, PRON 31), več (DET 121, PART 79, ADV 50), svoje (DET 107, PRON 12), tem (PRON 177, DET 96, NOUN 1, ADV 1), te (DET 78, PRON 23), svojo (DET 83, PRON 4), vse (PRON 88, DET 74, ADV 38), to (PRON 273, DET 63), vseh (DET 61, PRON 8)

Morphology

The form / lemma ratio of DET is 4.587302 (the average of all parts of speech is 1.894262).
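The reported ratio is consistent with dividing the number of DET types by the number of DET lemmas given in the statistics above (289 types, 63 lemmas). The following sketch only reproduces that arithmetic; the helper name `form_lemma_ratio` is hypothetical, and the exact way the statistics were generated is not documented on this page.

```python
def form_lemma_ratio(num_types, num_lemmas):
    """Average number of distinct word forms per lemma."""
    return num_types / num_lemmas


# Counts for DET in UD_Slovenian as listed on this page.
ratio = form_lemma_ratio(289, 63)
print(f"{ratio:.6f}")  # 4.587302
```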

The 1st highest number of forms (12) was observed with the lemma “tisti”: tist, tista, tiste, tistega, tistem, tistemu, tisti, tistih, tistim, tistimi, tistmu, tisto.

The 2nd highest number of forms (11) was observed with the lemma “naš”: naš, naša, naše, našega, našem, našemu, naši, naših, našim, našimi, našo.

The 3rd highest number of forms (11) was observed with the lemma “njihov”: njihov, njihova, njihove, njihovega, njihovem, njihovemu, njihovi, njihovih, njihovim, njihovimi, njihovo.

DET occurs with 10 features: sl-feat/PronType (3310; 100% instances), sl-feat/Case (2855; 86% instances), sl-feat/Gender (2855; 86% instances), sl-feat/Number (2855; 86% instances), sl-feat/Poss (1260; 38% instances), sl-feat/Number[psor] (804; 24% instances), sl-feat/Person (804; 24% instances), sl-feat/Reflex (456; 14% instances), sl-feat/Degree (455; 14% instances), sl-feat/Gender[psor] (379; 11% instances)

DET occurs with 33 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Degree=Pos, Degree=Sup, Gender=Fem, Gender=Masc, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc, Gender[psor]=Neut, Number=Dual, Number=Plur, Number=Sing, Number[psor]=Dual, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Poss=Yes, PronType=Dem, PronType=Ind, PronType=Int, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot, Reflex=Yes

DET occurs with 384 feature combinations. The most frequent feature combination is Degree=Pos|PronType=Ind (288 tokens). Examples: nekaj, veliko, dovolj, malo, pol, mnogo, par, preveč, dosti, nešteto

Relations

DET nodes are attached to their parents using 1 relation: sl-dep/det (3310; 100% instances)

Parents of DET nodes belong to 2 different parts of speech: NOUN (3289; 99% instances), PROPN (21; 1% instances)

3187 (96%) DET nodes are leaves.

111 (3%) DET nodes have one child.

12 (0%) DET nodes have two children.

The highest child degree of a DET node is 2.
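The child degree of a node is the number of nodes attached directly to it, and a leaf is a node with degree 0. As a rough illustration, the sketch below derives these values from a list of head indices. The function name `child_degrees` is hypothetical, and the head indices for the toy sentence are an assumed analysis, not copied from the treebank.

```python
from collections import Counter


def child_degrees(heads):
    """Return the number of children of each token.

    heads[i] is the 1-based index of the head of token i + 1,
    with 0 marking the root of the sentence.
    """
    counts = Counter(h for h in heads if h != 0)
    return [counts.get(i, 0) for i in range(1, len(heads) + 1)]


# Toy sentence: "To besedilo je nerazumljivo ."
# Assumed heads: To -> besedilo, besedilo -> nerazumljivo,
# je -> nerazumljivo, nerazumljivo = root, "." -> nerazumljivo
heads = [2, 4, 4, 0, 4]
degrees = child_degrees(heads)
print(degrees)          # [0, 1, 0, 3, 0]
print(degrees[0] == 0)  # True: the determiner "To" is a leaf
```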

Children of DET nodes are attached using 6 different relations: sl-dep/advmod (69; 51% instances), sl-dep/mwe (53; 39% instances), sl-dep/cc (5; 4% instances), sl-dep/conj (5; 4% instances), sl-dep/nmod (2; 1% instances), sl-dep/case (1; 1% instances)

Children of DET nodes belong to 7 different parts of speech: SCONJ (48; 36% instances), ADV (46; 34% instances), PART (28; 21% instances), CONJ (5; 4% instances), PRON (5; 4% instances), ADJ (2; 1% instances), ADP (1; 1% instances)


Treebank Statistics (UD_Slovenian-SST)

There are 44 DET lemmas (1%), 171 DET types (3%) and 712 DET tokens (2%). Out of 16 observed tags, the rank of DET is: 11 in number of lemmas, 7 in number of types and 14 in number of tokens.

The 10 most frequent DET lemmas: ta, naš, nek, ves, tak, tisti, moj, kakšen, vsak, svoj

The 10 most frequent DET types: ta, to, te, tem, teh, vse, tega, tej, neko, nekaj

The 10 most frequent ambiguous lemmas: ta (PRON 613, DET 208), naš (DET 47, PRON 6), nek (DET 43, PRON 3), ves (PRON 54, DET 42), tak (DET 33, PRON 16), tisti (PRON 53, DET 32), moj (DET 27, PRON 8), kakšen (DET 26, PRON 20), vsak (DET 22, PRON 7), svoj (DET 21, PRON 4)

The 10 most frequent ambiguous types: ta (DET 79, PRON 28), to (PRON 489, DET 26, X 1), te (PRON 29, DET 21, ADV 21), tem (PRON 29, DET 21), teh (DET 20, PRON 3), vse (PRON 41, DET 17, ADV 7), tega (PRON 31, DET 15), neko (DET 12, PRON 1), nekaj (PRON 29, ADV 23, DET 11), take (DET 11, PRON 4)

Morphology

The form / lemma ratio of DET is 3.886364 (the average of all parts of speech is 1.575031).

The 1st highest number of forms (10) was observed with the lemma “svoj”: svoj, svoja, svoje, svojega, svojem, svojemu, svojih, svojim, svojimi, svojo.

The 2nd highest number of forms (9) was observed with the lemma “ta”: ta, te, tega, teh, tej, tem, temi, ti, to.

The 3rd highest number of forms (9) was observed with the lemma “ves”: ves, vsa, vse, vsega, vseh, vsem, vsemi, vsi, vso.

DET occurs with 9 features: sl-feat/PronType (712; 100% instances), sl-feat/Case (660; 93% instances), sl-feat/Gender (660; 93% instances), sl-feat/Number (660; 93% instances), sl-feat/Number[psor] (113; 16% instances), sl-feat/Person (113; 16% instances), sl-feat/Poss (113; 16% instances), sl-feat/Degree (52; 7% instances), sl-feat/Gender[psor] (9; 1% instances)

DET occurs with 30 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Degree=Pos, Degree=Sup, Gender=Fem, Gender=Masc, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc, Number=Dual, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Poss=Yes, PronType=Dem, PronType=Ind, PronType=Int, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot

DET occurs with 178 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing|PronType=Dem (42 tokens). Examples: ta, tak, tale, tisti, oni, takšen

Relations

DET nodes are attached to their parents using 3 different relations: sl-dep/det (710; 100% instances), sl-dep/dobj (1; 0% instances), sl-dep/reparandum (1; 0% instances)

Parents of DET nodes belong to 9 different parts of speech: NOUN (642; 90% instances), ADJ (30; 4% instances), PROPN (17; 2% instances), PRON (10; 1% instances), NUM (4; 1% instances), X (4; 1% instances), VERB (3; 0% instances), ADV (1; 0% instances), DET (1; 0% instances)

684 (96%) DET nodes are leaves.

23 (3%) DET nodes have one child.

2 (0%) DET nodes have two children.

3 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 3.

Children of DET nodes are attached using 6 different relations: sl-dep/advmod (15; 42% instances), sl-dep/reparandum (15; 42% instances), sl-dep/discourse:filler (3; 8% instances), sl-dep/amod (1; 3% instances), sl-dep/cc (1; 3% instances), sl-dep/conj (1; 3% instances)

Children of DET nodes belong to 9 different parts of speech: ADV (13; 36% instances), X (9; 25% instances), PRON (5; 14% instances), INTJ (3; 8% instances), PART (2; 6% instances), ADJ (1; 3% instances), ADP (1; 3% instances), CONJ (1; 3% instances), DET (1; 3% instances)

