DET: determiner
Definition
Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context. That is, a determiner may indicate whether the noun is referring to a definite or indefinite element of a class, to a closer or more distant element, to an element belonging to a specified person or thing, to a particular number or quantity, etc.
An important point to note is that the traditional grammar of Czech does not
define determiners as a separate word class. Czech does not have articles.
Most determiners are traditionally called pronouns; a UD-conformant
annotation of Czech must therefore distinguish between substantive pronouns (UD tag PRON)
and attributive pronouns (UD tag DET).
Also note that the DET tag includes (pronominal) quantifiers (words
like mnoho, málo “many, few”), which the traditional grammar classifies
as a special subclass of numerals. However,
cardinal numerals in the narrow sense (jeden, pět, sto “one, five, hundred”) are not
tagged DET, even though some authors would include them among
quantifiers. Cardinal numerals have their own tag, NUM.
Conversion from the Prague Dependency Treebank
Since the PDT tagset (like all other Czech tagsets) does not distinguish
substantive and attributive pronouns, morphological tags alone are not enough
to find the correct universal POS tag.
Morphological rules could help, as the inflection patterns of some pronouns
bear similarities to adjectival inflection; nevertheless, there will be other
cases that cannot be solved this way.
We have to examine the dependency tree.
If a pronoun modifies a noun, it should be tagged DET.
Otherwise it is PRON.
As a result, all words that can be tagged DET can also be tagged PRON,
but some words can only be tagged PRON.
(We cannot recognize cases where the pronoun is in fact attributive, but the
modified noun has been elided and is not represented in the tree.)
For instance, tohle “this” is either a pronoun (Tohle jsem viděl včera. “I saw this yesterday.”) or a determiner (Tohle auto jsem viděl včera. “I saw this car yesterday.”).
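The tree-based rule above can be sketched in a few lines of Python. This is only an illustration, not the actual conversion script: the `Token` class, the `retag_pronouns` function, and the head indices in the two example sentences are assumptions made for this sketch, and "modifies a noun" is simplified to "has a NOUN or PROPN parent".

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int    # 1-based position in the sentence
    form: str
    upos: str  # coarse tag before disambiguation; every pronoun starts as "PRON"
    head: int  # id of the parent token, 0 for the root


# Simplifying assumption: a pronoun "modifies a noun" if its parent is nominal.
NOMINAL = {"NOUN", "PROPN"}


def retag_pronouns(sentence):
    """Return one tag per token, turning noun-modifying pronouns into DET."""
    by_id = {tok.id: tok for tok in sentence}
    tags = []
    for tok in sentence:
        tag = tok.upos
        if tok.upos == "PRON":
            parent = by_id.get(tok.head)  # None when the pronoun is the root
            if parent is not None and parent.upos in NOMINAL:
                tag = "DET"
        tags.append(tag)
    return tags


# "Tohle auto jsem viděl včera." (hypothetical tree: tohle depends on auto)
attributive = [
    Token(1, "Tohle", "PRON", 2),
    Token(2, "auto", "NOUN", 4),
    Token(3, "jsem", "AUX", 4),
    Token(4, "viděl", "VERB", 0),
    Token(5, "včera", "ADV", 4),
]

# "Tohle jsem viděl včera." (hypothetical tree: tohle depends on the verb)
substantive = [
    Token(1, "Tohle", "PRON", 3),
    Token(2, "jsem", "AUX", 3),
    Token(3, "viděl", "VERB", 0),
    Token(4, "včera", "ADV", 3),
]

print(retag_pronouns(attributive))
print(retag_pronouns(substantive))
```

As the parenthetical remark above notes, a rule of this kind cannot recover attributive pronouns whose noun has been elided; in the sketch they simply stay PRON.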
Examples
- possessive determiners: můj, tvůj, jeho, její, náš, váš, jejich “my, your, his, her, our, your, their”
- reflexive possessive determiner: svůj “one’s own”
- demonstrative determiners: tohle as in Tohle auto jsem viděl včera. “I saw this car yesterday.”
- interrogative determiners: který as in Které auto se ti líbí? “Which car do you like?”
- relative determiners: který as in Zajímá mě, které auto se ti líbí. “I wonder which car you like.”
- relative possessive determiner: jehož “whose”
- indefinite determiners: nějaký, některý “some”
- total determiners: každý, všechen “every, all”
- negative determiners: žádný as in Nemáme žádná auta. “We have no cars available.”
Treebank Statistics (UD_Czech)
There are 55 DET lemmas (0%), 325 DET types (0%) and 27813 DET tokens (2%).
Out of 17 observed tags, the rank of DET is: 10 in number of lemmas, 8 in number of types and 11 in number of tokens.
The 10 most frequent DET lemmas: tento, jeho, svůj, můj, ten, některý, několik, takový, žádný, jenž
The 10 most frequent DET types: jeho, jejich, své, této, její, tento, tohoto, svou, tato, těchto
The 10 most frequent ambiguous lemmas: tento (DET 6202, PRON 99), jeho (DET 5790, PRON 46), svůj (DET 4767, PRON 113, ADJ 4), můj (DET 2581, PRON 71), ten (PRON 11968, DET 1312), některý (DET 1096, PRON 234), několik (DET 871, PRON 26), takový (DET 866, PRON 169), žádný (DET 744, PRON 87), jenž (PRON 2211, DET 648)
The 10 most frequent ambiguous types: jeho (DET 2456, PRON 33), jejich (DET 1697, PRON 12), své (DET 1366, PRON 40, ADJ 1), této (DET 993, PRON 3), její (DET 711, PRON 8), tento (DET 585, PRON 10), svou (DET 607, PRON 7), tato (DET 377, PRON 7), těchto (DET 581, PRON 8), tyto (DET 432, PRON 1)
- jeho
- jejich
- své
  - DET 1366: A čeho si má při své návštěvě především všímat ?
  - PRON 40: Stále převažující nabídka nad poptávkou dělá své .
  - ADJ 1: Jinak samozřejmě nevíme , co všechno vzalo za své v plamenech žároviště , a nemůžeme proto jednoznačně odpovědět na otázku , zda šlo skutečně o natolik chudé obyvatelstvo , že jedinou výbavou jejich hrobů byly rozbité nádoby .
- této
- její
- tento
- svou
- tato
- těchto
- tyto
Morphology
The form / lemma ratio of DET is 5.909091 (the average of all parts of speech is 2.195970).
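The reported ratio is consistent with dividing the number of DET types by the number of DET lemmas given earlier (325 / 55). The Python sketch below reproduces that arithmetic and shows how the same ratio could be computed from (form, lemma) pairs. The function name `form_lemma_ratio` and the toy pairs are assumptions for illustration, and whether the treebank statistics treat forms case-sensitively is not stated here.

```python
def form_lemma_ratio(pairs):
    """Distinct word forms divided by distinct lemmas.

    `pairs` is an iterable of (form, lemma) tuples for a single POS tag.
    """
    pairs = list(pairs)
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Counts reported on this page: 325 DET types, 55 DET lemmas.
page_ratio = 325 / 55
print(f"{page_ratio:.6f}")

# Toy data (not taken from the treebank): 4 forms over 3 lemmas.
toy_pairs = [
    ("jeho", "jeho"),
    ("své", "svůj"),
    ("svou", "svůj"),
    ("této", "tento"),
]
toy_ratio = form_lemma_ratio(toy_pairs)
print(toy_ratio)
```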
The 1st highest number of forms (27) was observed with the lemma “můj”: Mí, moje, moji, mojí, mou, má, mé, mého, mém, mému, mých, mýho, mým, mými, můj, n, naše, našeho, našem, našemu, naši, našich, našim, našimi, naší, naším, náš
The 2nd highest number of forms (19) was observed with the lemma “jakýkoliv”: jakoukoli, jakoukoliv, jakákoli, jakákoliv, jakéhokoli, jakéhokoliv, jakékoli, jakékoliv, jakémkoli, jakémkoliv, jakémukoli, jakémukoliv, jakýchkoli, jakýchkoliv, jakýkoli, jakýkoliv, jakýmikoliv, jakýmkoli, jakýmkoliv
The 3rd highest number of forms (16) was observed with the lemma “ten”: ta, ten, ti, to, toho, tom, tomu, tou, tu, ty, té, tím, těch, těm, těma, těmi
DET occurs with 16 features: cs-feat/PronType (27813; 100% instances), cs-feat/Case (22385; 80% instances), cs-feat/Number (21264; 76% instances), cs-feat/Gender (17996; 65% instances), cs-feat/Poss (14044; 50% instances), cs-feat/Number[psor] (9276; 33% instances), cs-feat/Person (9276; 33% instances), cs-feat/Reflex (4768; 17% instances), cs-feat/Gender[psor] (4331; 16% instances), cs-feat/Animacy (2621; 9% instances), cs-feat/NumType (1552; 6% instances), cs-feat/Negative (744; 3% instances), cs-feat/Abbr (15; 0% instances), cs-feat/Style (14; 0% instances), cs-feat/Foreign (1; 0% instances), cs-feat/NameType (1; 0% instances)
DET occurs with 40 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Foreign, Gender=Fem, Gender=Fem,Neut, Gender=Masc, Gender=Masc,Neut, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc,Neut, NameType=Oth, Negative=Neg, NumType=Card, NumType=Ord, Number=Dual, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Poss=Yes, PronType=Dem, PronType=Dem,Ind, PronType=Ind, PronType=Int,Rel, PronType=Neg, PronType=Prs, PronType=Rel, Reflex=Yes, Style=Coll
DET occurs with 273 feature combinations.
The most frequent feature combination is Gender[psor]=Masc,Neut|Number[psor]=Sing|Person=3|Poss=Yes|PronType=Prs (2718 tokens).
Examples: jeho
Relations
DET nodes are attached to their parents using 8 different relations: cs-dep/det (26231; 94% instances), cs-dep/det:numgov (978; 4% instances), cs-dep/det:nummod (564; 2% instances), cs-dep/advcl (24; 0% instances), cs-dep/acl (9; 0% instances), cs-dep/nmod (3; 0% instances), cs-dep/ccomp (2; 0% instances), cs-dep/csubj (2; 0% instances)
Parents of DET nodes belong to 6 different parts of speech: NOUN (27483; 99% instances), PROPN (108; 0% instances), ADJ (106; 0% instances), PRON (97; 0% instances), NUM (16; 0% instances), DET (3; 0% instances)
27203 (98%) DET nodes are leaves.
370 (1%) DET nodes have one child.
181 (1%) DET nodes have two children.
59 (0%) DET nodes have three or more children.
The highest child degree of a DET node is 12.
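Child-degree figures like these can be derived from the head indices of a dependency tree. The following Python sketch shows one possible way to count them; the function `child_degree_histogram` and the example tree are assumptions for illustration and are not the tool that produced the statistics on this page.

```python
from collections import Counter


def child_degree_histogram(heads, upos, target="DET"):
    """Map child count -> number of `target` nodes with that many children.

    heads[i] is the 1-based index of the parent of token i + 1 (0 = root);
    upos[i] is the POS tag of token i + 1.
    """
    # Each non-root head index contributes one child to that node.
    n_children = Counter(h for h in heads if h > 0)
    histogram = Counter()
    for idx, tag in enumerate(upos, start=1):
        if tag == target:
            histogram[n_children[idx]] += 1  # leaves are counted under 0
    return histogram


# "Tohle auto jsem viděl včera." with a hypothetical analysis.
heads = [2, 4, 4, 0, 4]
upos = ["DET", "NOUN", "AUX", "VERB", "ADV"]
hist = child_degree_histogram(heads, upos)
print(dict(hist))

# Share of leaves among DET nodes, using the counts reported above.
leaf_share = round(100 * 27203 / 27813)
print(leaf_share)
```

Summing the histogram over a whole treebank and dividing each bucket by the total number of DET tokens would give percentages of the kind listed above.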
Children of DET nodes are attached using 18 different relations: cs-dep/advmod:emph (180; 19% instances), cs-dep/punct (130; 13% instances), cs-dep/conj (117; 12% instances), cs-dep/cc (99; 10% instances), cs-dep/case (93; 10% instances), cs-dep/acl (90; 9% instances), cs-dep/advmod (62; 6% instances), cs-dep/advcl (34; 4% instances), cs-dep/mark (30; 3% instances), cs-dep/nmod (28; 3% instances), cs-dep/amod (26; 3% instances), cs-dep/appos (26; 3% instances), cs-dep/cop (17; 2% instances), cs-dep/dep (12; 1% instances), cs-dep/nsubj (12; 1% instances), cs-dep/xcomp (4; 0% instances), cs-dep/det:nummod (2; 0% instances), cs-dep/neg (1; 0% instances)
Children of DET nodes belong to 13 different parts of speech: ADV (175; 18% instances), PUNCT (130; 13% instances), CONJ (129; 13% instances), VERB (114; 12% instances), ADP (92; 10% instances), ADJ (88; 9% instances), NOUN (73; 8% instances), PRON (67; 7% instances), PART (46; 5% instances), SCONJ (30; 3% instances), PROPN (12; 1% instances), NUM (4; 0% instances), DET (3; 0% instances)