
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: DET

There are 55 DET lemmas (0%), 366 DET types (1%) and 5775 DET tokens (2%). Out of 16 observed tags, the rank of DET is: 12 in number of lemmas, 8 in number of types and 13 in number of tokens.
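Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page. It assumes that "types" are case-sensitive word forms (the FORM column) and that multiword-token ranges and empty nodes are skipped. The function name `pos_counts` and the two-sentence sample are invented for illustration.

```python
def pos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank sentence separators and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            lemmas.add(lemma)
            types.add(form)  # assumption: types are case-sensitive forms
    return len(lemmas), len(types), tokens


# Invented sample in the 10-column CoNLL-U layout, not taken from the treebank.
ROWS = [
    ("1", "See", "see", "DET", "_", "_", "2", "det", "_", "_"),
    ("2", "maja", "maja", "NOUN", "_", "_", "4", "nsubj:cop", "_", "_"),
    ("3", "on", "olema", "AUX", "_", "_", "4", "cop", "_", "_"),
    ("4", "suur", "suur", "ADJ", "_", "_", "0", "root", "_", "_"),
    ("5", ".", ".", "PUNCT", "_", "_", "4", "punct", "_", "_"),
    None,  # sentence break
    ("1", "Selle", "see", "DET", "_", "_", "2", "det", "_", "_"),
    ("2", "maja", "maja", "NOUN", "_", "_", "3", "nmod", "_", "_"),
    ("3", "uks", "uks", "NOUN", "_", "_", "5", "nsubj:cop", "_", "_"),
    ("4", "on", "olema", "AUX", "_", "_", "5", "cop", "_", "_"),
    ("5", "lahti", "lahti", "ADV", "_", "_", "0", "root", "_", "_"),
    ("6", ".", ".", "PUNCT", "_", "_", "5", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(r) if r else "" for r in ROWS)

print(pos_counts(SAMPLE, "DET"))  # (1, 2, 2)
```

In the sample, both DET tokens share the lemma "see" but differ in form, so one lemma yields two types.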

The 10 most frequent DET lemmas: see, üks, kõik, teine, iga, selline, mõni, sama, mingi, mitu

The 10 most frequent DET types: see, kõik, selle, kogu, üks, iga, need, seda, sel, ühe

The 10 most frequent ambiguous lemmas: see (PRON 3840, DET 1560, NOUN 6), üks (DET 641, NUM 344, PRON 112), kõik (DET 546, PRON 294, NOUN 5, ADV 1), teine (DET 457, PRON 289, ADJ 189, NUM 1), iga (DET 357, PRON 12, NOUN 10), selline (DET 324, ADJ 63, PRON 17), mõni (DET 322, PRON 34, ADJ 16), sama (DET 272, ADV 38, ADJ 28, PRON 21), mingi (DET 254, PRON 2), mitu (DET 250, PRON 22)

The 10 most frequent ambiguous types: see (PRON 866, DET 239), kõik (DET 218, PRON 149), selle (PRON 373, DET 203, NOUN 5), kogu (DET 209, NOUN 10), üks (DET 180, NUM 118, PRON 41), iga (DET 146, NOUN 2), need (PRON 147, DET 126), seda (PRON 609, DET 132), sel (DET 86, PRON 9), ühe (DET 101, NUM 99, PRON 7)

Morphology

The form / lemma ratio of DET is 6.654545 (the average of all parts of speech is 1.912184).
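The reported ratio is consistent with dividing the 366 DET types by the 55 DET lemmas given above. A short check, with variable names chosen only for illustration:

```python
det_types = 366   # DET types reported on this page
det_lemmas = 55   # DET lemmas reported on this page

ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 6.654545
```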

The highest number of forms (32) was observed with the lemma “see”: Nendeks, Seks, Selleski, need, neid, neidki, neil, neile, neilt, neis, neisse, neist, nende, nendel, nendele, nendes, nendest, seda, see, seegi, sel, selle, selleks, sellel, sellele, sellelt, selleltki, selles, sellesse, sellest, ses, sest.

The 2nd highest number of forms (26) was observed with the lemma “teine”: teine, teinegi, teise, teised, teisedki, teisegi, teiseks, teisel, teisele, teiselt, teises, teisest, teisi, teisigi, teist, teiste, teistegi, teistel, teistele, teistelegi, teistelt, teistes, teisteski, teistesse, teistest, teistestki.

The 3rd highest number of forms (23) was observed with the lemma “mõni”: mõnd, mõnda, mõndagi, mõne, mõned, mõnede, mõnedele, mõnedes, mõnedki, mõnegi, mõneks, mõnel, mõnele, mõnelegi, mõnelgi, mõnelt, mõnes, mõnesid, mõneski, mõnesse, mõnest, mõni, mõnigi.

DET occurs with 7 features: PronType (5775; 100% instances), Case (5527; 96% instances), Number (5527; 96% instances), NumForm (1; 0% instances), Person (1; 0% instances), Poss (1; 0% instances), Reflex (1; 0% instances)

DET occurs with 27 feature-value pairs: Case=Abl, Case=Add, Case=Ade, Case=All, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Case=Tra, NumForm=Letter, Number=Plur, Number=Sing, Person=3, Poss=Yes, PronType=Dem, PronType=Ind, PronType=Int, PronType=Int,Rel, PronType=Prs, PronType=Rcp, PronType=Rel, PronType=Tot, Reflex=Yes

DET occurs with 85 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing|PronType=Dem (577 tokens). Examples: see, selline, teine, sama, esimene, too, niisugune, samasugune, seegi, seesama
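The three feature statistics above (features, feature-value pairs, feature combinations) correspond to three ways of counting the FEATS column of a CoNLL-U file. The sketch below illustrates the distinction under the assumption that a multi-valued entry such as PronType=Int,Rel counts as a single pair, as the list above suggests. The helper names and the sample input are hypothetical. Only the first feature string is quoted from this page.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}  # "_" marks a token without features
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_strings):
    """Count features, feature-value pairs, and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_strings:
        combos[feats] += 1
        for name, value in parse_feats(feats).items():
            features[name] += 1
            pairs[f"{name}={value}"] += 1  # "Int,Rel" stays one value
    return features, pairs, combos


# Hypothetical FEATS values for four DET tokens.
sample = [
    "Case=Nom|Number=Sing|PronType=Dem",
    "Case=Nom|Number=Sing|PronType=Dem",
    "Case=Gen|Number=Sing|PronType=Dem",
    "PronType=Int,Rel",
]
features, pairs, combos = feature_stats(sample)
print(len(features), len(pairs), len(combos))  # 3 5 3
print(combos.most_common(1))
```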

Relations

DET nodes are attached to their parents using 5 different relations: det (5757; 100% instances), conj (15; 0% instances), acl:relcl (1; 0% instances), obl (1; 0% instances), root (1; 0% instances)

Parents of DET nodes belong to 10 different parts of speech: NOUN (5296; 92% instances), PRON (183; 3% instances), NUM (92; 2% instances), ADJ (71; 1% instances), PROPN (56; 1% instances), DET (47; 1% instances), ADV (27; 0% instances), CCONJ (1; 0% instances), no part of speech (1; 0% instances; presumably the artificial root node, matching the single root relation above), VERB (1; 0% instances)

5435 (94%) DET nodes are leaves.

325 (6%) DET nodes have one child.

8 (0%) DET nodes have two children.

7 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 7.
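The child-degree figures above can be derived from the HEAD column: a node's degree is the number of tokens that name it as their head. The following is a minimal sketch that assumes each sentence is given as a list of (id, UPOS, head) tuples. The function names and the sample phrase are invented for illustration.

```python
from collections import Counter


def child_degrees(sentence, upos):
    """Return the number of children of every node tagged `upos`."""
    n_children = Counter(head for _, _, head in sentence)
    return [n_children[tok_id] for tok_id, tag, _ in sentence if tag == upos]


def bucket(degrees):
    """Group degrees into the four classes used on this page."""
    hist = Counter(min(d, 3) for d in degrees)  # 3 stands for "three or more"
    return {"leaves": hist[0], "one": hist[1],
            "two": hist[2], "three_plus": hist[3]}


# Invented phrase "peaaegu kõik need majad": (id, UPOS, head), head 0 = root.
sentence = [
    (1, "ADV", 2),   # peaaegu, advmod of "kõik"
    (2, "DET", 4),   # kõik, det of "majad"
    (3, "DET", 4),   # need, det of "majad"
    (4, "NOUN", 0),  # majad, root
]
degrees = child_degrees(sentence, "DET")
print(degrees)          # [1, 0]
print(bucket(degrees))
```

As a consistency check, the four classes reported on this page (5435, 325, 8 and 7) sum to the 5775 DET tokens.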

Children of DET nodes are attached using 18 different relations: advmod (215; 58% instances), conj (37; 10% instances), det (35; 9% instances), cc (16; 4% instances), acl:relcl (11; 3% instances), punct (11; 3% instances), acl (9; 2% instances), amod (9; 2% instances), nmod (6; 2% instances), obl (4; 1% instances), cop (3; 1% instances), nsubj:cop (3; 1% instances), advcl (2; 1% instances), case (2; 1% instances), fixed (2; 1% instances), mark (2; 1% instances), appos (1; 0% instances), cc:preconj (1; 0% instances)

Children of DET nodes belong to 12 different parts of speech: ADV (218; 59% instances), DET (47; 13% instances), PRON (23; 6% instances), VERB (18; 5% instances), ADJ (16; 4% instances), CCONJ (16; 4% instances), PUNCT (11; 3% instances), NOUN (10; 3% instances), AUX (3; 1% instances), SCONJ (3; 1% instances), ADP (2; 1% instances), PROPN (2; 1% instances)