
This page pertains to UD version 2.

Treebank Statistics: UD_German-HDT: POS Tags: DET

There are 48 DET lemmas (0%), 197 DET types (0%) and 497671 DET tokens (14%). Out of 16 observed tags, the rank of DET is: 9 in number of lemmas, 9 in number of types and 2 in number of tokens.

The 10 most frequent DET lemmas: der, ein, dieser, sein, ihr, alle, anderer, kein, mehr, viel

The 10 most frequent DET types: der, die, dem, den, das, des, eine, ein, einen, einer

The 10 most frequent ambiguous lemmas: der (DET 359942, PRON 28690, X 2), ein (DET 69034, ADP 1487), sein (AUX 43408, DET 9257), ihr (DET 7652, PRON 22), alle (DET 7640, ADJ 2), anderer (DET 5696, X 1), mehr (ADV 4015, DET 3382, ADJ 4, X 3, PROPN 2), viel (DET 2862, ADV 373), beide (ADJ 1011, DET 1005), solcher (DET 880, ADJ 6)

The 10 most frequent ambiguous types: der (DET 91438, PRON 4857, X 2), die (DET 77836, PRON 12604, X 2), dem (DET 66367, PRON 1681, X 1), den (DET 37055, PRON 620, ADJ 1, PROPN 1), das (DET 25405, PRON 4240, X 1), des (DET 22379, X 16, PROPN 3), ein (DET 14707, ADP 1487), alle (DET 2463, ADJ 2), mehr (ADV 3905, DET 1474, ADJ 3, X 2), allem (DET 2244, ADJ 2)

Morphology

The form / lemma ratio of DET is 4.104167 (the average of all parts of speech is 2.529657).
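The DET ratio can be reproduced from the counts given at the top of this page (197 DET types, 48 DET lemmas), assuming that "forms" here are the same distinct types counted there. The following is a minimal sketch; the function name `form_lemma_ratio` is illustrative and not part of any UD tool.

```python
def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_forms / n_lemmas


# Counts for DET taken from this page: 197 types, 48 lemmas.
# Assumption: "forms" in the ratio means the distinct types counted above.
det_ratio = form_lemma_ratio(197, 48)
print(f"{det_ratio:.6f}")  # 4.104167
```

A ratio well above the all-POS average of 2.529657 is consistent with German determiners inflecting for case, number and gender.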

The highest number of forms (11) was observed with the lemma “ein”: ‘n, ein, eine, eine(n), einem, einem/er, einen, einer, eines, eins, geeinte.

The 2nd highest number of forms (8) was observed with the lemma “anderer”: a., andere, anderem, anderen, anderer, anderes, andern, anders.

The 3rd highest number of forms (8) was observed with the lemma “derjenige”: dasjenige, demjenigen, denjenigen, derjenige, derjenigen, desjenigen, diejenige, diejenigen.

DET occurs with 13 features: PronType (497671; 100% instances), Number (495401; 100% instances), Case (490512; 99% instances), Definite (428976; 86% instances), Gender (395651; 80% instances), NumType (70039; 14% instances), Person (18373; 4% instances), Poss (18373; 4% instances), Number[psor] (16885; 3% instances), Gender[psor] (15471; 3% instances), Degree (4574; 1% instances), Polite (50; 0% instances), Foreign (17; 0% instances)

DET occurs with 34 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Degree=Cmp, Degree=Pos, Degree=Sup, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Masc,Neut, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc,Neut, NumType=Card, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Polite=Form, Poss=Yes, PronType=Art, PronType=Dem, PronType=Ind, PronType=Int, PronType=Int,Rel, PronType=Neg, PronType=Prs, PronType=Tot

DET occurs with 315 feature combinations. The most frequent feature combination is Case=Dat|Definite=Def|Gender=Masc,Neut|Number=Sing|PronType=Art (47860 tokens). Examples: dem
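A feature combination such as the one above is written in the CoNLL-U FEATS notation: feature=value pairs separated by `|`, with multiple values of one feature separated by commas. The sketch below shows one way to split such a string into a dictionary; `parse_feats` is a hypothetical helper written for this illustration, not a function of any particular library.

```python
from __future__ import annotations


def parse_feats(feats: str) -> dict[str, list[str]]:
    """Split a CoNLL-U FEATS string into {feature: [values]}."""
    if feats == "_":  # CoNLL-U writes "_" for an empty feature set
        return {}
    parsed: dict[str, list[str]] = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        parsed[name] = value.split(",")  # e.g. "Masc,Neut" -> two values
    return parsed


# The most frequent DET feature combination reported on this page.
combo = "Case=Dat|Definite=Def|Gender=Masc,Neut|Number=Sing|PronType=Art"
feats = parse_feats(combo)
print(feats["Gender"])  # ['Masc', 'Neut']
```

Keeping values as lists makes the underspecified value `Gender=Masc,Neut` (as on "dem") explicit instead of treating it as a single opaque string.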

Relations

DET nodes are attached to their parents using 19 different relations: det (485984; 98% instances), nsubj (3946; 1% instances), obj (2216; 0% instances), obl (2027; 0% instances), root (1233; 0% instances), nsubj:pass (574; 0% instances), conj (468; 0% instances), appos (447; 0% instances), nmod (424; 0% instances), advmod (81; 0% instances), xcomp (66; 0% instances), obl:arg (52; 0% instances), advcl (38; 0% instances), parataxis (37; 0% instances), ccomp (32; 0% instances), acl (20; 0% instances), reparandum (13; 0% instances), det:poss (11; 0% instances), csubj (2; 0% instances)

Parents of DET nodes belong to 14 different parts of speech: NOUN (454443; 91% instances), PROPN (22524; 5% instances), VERB (7615; 2% instances), X (7528; 2% instances), ADJ (2434; 0% instances), no part of speech, i.e. the parent is the artificial root (1233; 0% instances), DET (847; 0% instances), AUX (371; 0% instances), NUM (292; 0% instances), ADV (238; 0% instances), PRON (133; 0% instances), ADP (6; 0% instances), SCONJ (6; 0% instances), INTJ (1; 0% instances)

484949 (97%) DET nodes are leaves.

9180 (2%) DET nodes have one child.

2347 (0%) DET nodes have two children.

1195 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 10.
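The four child-degree counts above partition all DET tokens, so they should sum to the 497671 DET tokens reported at the top of the page. The following sketch checks that and recomputes the rounded percentages; the dictionary keys are labels chosen for this illustration.

```python
# Child-degree counts for DET nodes, copied from this page.
child_degree_counts = {
    "0 (leaf)": 484949,
    "1": 9180,
    "2": 2347,
    "3+": 1195,
}

total = sum(child_degree_counts.values())

# Percentages rounded to whole numbers, as displayed on the page.
shares = {
    degree: round(100 * count / total)
    for degree, count in child_degree_counts.items()
}

print(total)   # 497671
print(shares)  # {'0 (leaf)': 97, '1': 2, '2': 0, '3+': 0}
```

The "0%" entries therefore mean "less than half a percent", not "none": 2347 and 1195 nodes are each below 0.5% of the DET tokens.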

Children of DET nodes are attached using 26 different relations: case (5039; 27% instances), advmod (4681; 25% instances), nmod (3051; 16% instances), punct (2209; 12% instances), det (680; 4% instances), obl (570; 3% instances), cop (485; 3% instances), acl (472; 2% instances), nsubj (465; 2% instances), cc (410; 2% instances), conj (330; 2% instances), appos (135; 1% instances), obj (82; 0% instances), mark (63; 0% instances), advcl (54; 0% instances), ccomp (46; 0% instances), parataxis (45; 0% instances), aux (42; 0% instances), amod (13; 0% instances), reparandum (13; 0% instances), csubj (12; 0% instances), xcomp (8; 0% instances), flat:name (5; 0% instances), nummod (5; 0% instances), expl (3; 0% instances), flat (1; 0% instances)

Children of DET nodes belong to 15 different parts of speech: ADP (4863; 26% instances), ADV (3940; 21% instances), NOUN (2952; 16% instances), PUNCT (2209; 12% instances), PROPN (1043; 6% instances), DET (847; 4% instances), ADJ (705; 4% instances), CCONJ (628; 3% instances), VERB (570; 3% instances), AUX (567; 3% instances), PRON (277; 1% instances), PART (165; 1% instances), NUM (83; 0% instances), X (40; 0% instances), SCONJ (30; 0% instances)