
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: DET

There are 55 DET lemmas (0%), 379 DET types (0%) and 6796 DET tokens (2%). Out of 16 observed tags, the rank of DET is: 12 in number of lemmas, 8 in number of types and 13 in number of tokens.

The 10 most frequent DET lemmas: see, üks, kõik, teine, iga, mõni, selline, sama, mingi, kogu

The 10 most frequent DET types: see, kõik, kogu, selle, üks, iga, need, seda, sel, ühe

The 10 most frequent ambiguous lemmas: see (PRON 4510, DET 1843, NOUN 6), üks (DET 755, NUM 425, PRON 155), kõik (DET 633, PRON 330, NOUN 6, ADV 2), teine (DET 521, PRON 341, ADJ 217, NUM 2, PROPN 1), iga (DET 423, NOUN 17, PRON 13), mõni (DET 395, PRON 37, ADJ 21), selline (DET 387, ADJ 54, PRON 22), sama (DET 330, ADV 56, PRON 30, ADJ 24), mingi (DET 320, PRON 2), kogu (DET 302, NOUN 25, ADJ 1)
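An ambiguity list of this kind can be derived by grouping tokens by lemma and counting the tags each lemma occurs with. The following is a minimal sketch under that assumption; the `tokens` data and the function name `ambiguous_lemmas` are hypothetical and not taken from the treebank or its tooling.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) pairs for illustration only;
# these are not counts from UD_Estonian-EDT.
tokens = [
    ("see", "PRON"), ("see", "DET"), ("see", "PRON"),
    ("üks", "DET"), ("üks", "NUM"),
    ("mingi", "DET"),
]

def ambiguous_lemmas(pairs, tag):
    """Return lemmas seen with `tag` and at least one other tag,
    each mapped to its (tag, count) list in descending frequency."""
    by_lemma = defaultdict(Counter)
    for lemma, upos in pairs:
        by_lemma[lemma][upos] += 1
    return {
        lemma: counts.most_common()
        for lemma, counts in by_lemma.items()
        if tag in counts and len(counts) > 1
    }

amb = ambiguous_lemmas(tokens, "DET")
for lemma, counts in amb.items():
    print(lemma, counts)
```

A lemma that only ever occurs as DET (here "mingi") is not ambiguous and is left out of the result.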

The 10 most frequent ambiguous types: see (PRON 998, DET 292), kõik (DET 251, PRON 171), kogu (DET 264, NOUN 14), selle (PRON 436, DET 231, NOUN 5), üks (DET 222, NUM 149, PRON 52), iga (DET 183, NOUN 4), need (PRON 170, DET 153), seda (PRON 705, DET 155), sel (DET 107, PRON 10), ühe (DET 122, NUM 117, PRON 8)

Morphology

The form / lemma ratio of DET is 6.890909 (the average of all parts of speech is 1.911857).
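The reported ratio is consistent with dividing the number of DET types (379) by the number of DET lemmas (55), both given at the top of this page. A short check, assuming that this is how the figure is computed (the function name is hypothetical):

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas

# Counts reported on this page: 379 DET types, 55 DET lemmas.
det_ratio = form_lemma_ratio(379, 55)
print(f"{det_ratio:.6f}")  # 6.890909
```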

The highest number of forms (34) was observed with the lemma “see”: Nendeks, Seks, Selgi, Selleski, need, neid, neidki, neil, neile, neilt, neis, neisse, neist, nende, nendel, nendele, nendes, nendesse, nendest, seda, see, seegi, sel, selle, selleks, sellel, sellele, sellelt, selleltki, selles, sellesse, sellest, ses, sest.

The 2nd highest number of forms (26) was observed with the lemma “teine”: teine, teinegi, teise, teised, teisedki, teisegi, teiseks, teisel, teisele, teiselt, teises, teisest, teisi, teisigi, teist, teiste, teistegi, teistel, teistele, teistelegi, teistelt, teistes, teisteski, teistesse, teistest, teistestki.

The 3rd highest number of forms (25) was observed with the lemma “mõni”: mõnd, mõnda, mõndagi, mõne, mõned, mõnede, mõnedel, mõnedele, mõnedes, mõnedest, mõnedki, mõnegi, mõneks, mõnel, mõnele, mõnelegi, mõnelgi, mõnelt, mõnes, mõnesid, mõneski, mõnesse, mõnest, mõni, mõnigi.

DET occurs with 7 features: PronType (6796; 100% instances), Case (6487; 95% instances), Number (6487; 95% instances), NumForm (1; 0% instances), Person (1; 0% instances), Poss (1; 0% instances), Reflex (1; 0% instances)

DET occurs with 27 feature-value pairs: Case=Abl, Case=Add, Case=Ade, Case=All, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Case=Tra, NumForm=Letter, Number=Plur, Number=Sing, Person=3, Poss=Yes, PronType=Dem, PronType=Ind, PronType=Int, PronType=Int,Rel, PronType=Prs, PronType=Rcp, PronType=Rel, PronType=Tot, Reflex=Yes

DET occurs with 85 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing|PronType=Dem (692 tokens). Examples: see, selline, teine, sama, esimene, niisugune, too, samasugune, seegi, seesama
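The three counts above (features, feature-value pairs, feature combinations) correspond to three ways of reading the FEATS column of a CoNLL-U file. The sketch below illustrates the distinction on a few hypothetical FEATS strings; `parse_feats` and the sample list are assumptions for illustration, not the code that produced this page.

```python
from collections import Counter

def parse_feats(feats: str) -> dict:
    """Split a CoNLL-U FEATS string into {feature: value}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Hypothetical FEATS values of four DET tokens (illustration only).
det_feats = [
    "Case=Nom|Number=Sing|PronType=Dem",
    "Case=Nom|Number=Sing|PronType=Dem",
    "Case=Gen|Number=Plur|PronType=Tot",
    "PronType=Int,Rel",
]

# Features: how many tokens carry each feature name.
feature_counts = Counter(name for f in det_feats for name in parse_feats(f))
# Feature-value pairs: a multi-value such as "Int,Rel" is kept as one value.
pair_counts = Counter(f"{k}={v}" for f in det_feats for k, v in parse_feats(f).items())
# Feature combinations: the complete FEATS string of a token.
combo_counts = Counter(det_feats)

print(feature_counts.most_common())
print(combo_counts.most_common(1))
```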

Relations

DET nodes are attached to their parents using 6 different relations: det (6774; 100% instances), conj (17; 0% instances), obl (2; 0% instances), acl:relcl (1; 0% instances), nmod (1; 0% instances), root (1; 0% instances)

Parents of DET nodes belong to 11 different parts of speech: NOUN (6261; 92% instances), PRON (206; 3% instances), NUM (101; 1% instances), ADJ (82; 1% instances), PROPN (67; 1% instances), DET (48; 1% instances), ADV (25; 0% instances), VERB (3; 0% instances), CCONJ (1; 0% instances), (1; 0% instances), SYM (1; 0% instances)
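Relation and parent-POS counts like these can be gathered from the HEAD and DEPREL columns of a CoNLL-U file. The following is a minimal sketch on one hypothetical toy sentence whose annotation was invented for illustration; the function names are assumptions. Note that a node attached to the artificial root (HEAD 0) has no parent POS, which this sketch records as an empty string; that may be how an entry without a tag arises in a parent-POS list.

```python
from collections import Counter

# Hypothetical toy sentence in CoNLL-U column order:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
ROWS = [
    ["1", "see", "see", "DET", "_", "PronType=Dem", "2", "det", "_", "_"],
    ["2", "raamat", "raamat", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "meeldib", "meeldima", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", "kõigile", "kõik", "DET", "_", "PronType=Tot", "5", "det", "_", "_"],
    ["5", "lastele", "laps", "NOUN", "_", "_", "3", "obl", "_", "_"],
]
CONLLU = "\n".join("\t".join(row) for row in ROWS)

def read_tokens(text):
    """Yield (id, upos, head, deprel) for the ordinary word lines of one sentence."""
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip comments, blank lines, multiword ranges, empty nodes
        yield int(cols[0]), cols[3], int(cols[6]), cols[7]

def attachments(text, tag="DET"):
    """Count incoming relations and parent POS tags of nodes tagged `tag`."""
    tokens = list(read_tokens(text))
    upos_by_id = {tid: upos for tid, upos, _, _ in tokens}
    relations, parent_pos = Counter(), Counter()
    for _, upos, head, deprel in tokens:
        if upos == tag:
            relations[deprel] += 1
            # HEAD 0 is the artificial root, which has no POS tag.
            parent_pos[upos_by_id.get(head, "")] += 1
    return relations, parent_pos

relations, parent_pos = attachments(CONLLU)
print(relations.most_common())   # [('det', 2)]
print(parent_pos.most_common())  # [('NOUN', 2)]
```

The sketch handles a single sentence; for a whole treebank the ID-to-POS map would have to be rebuilt at every sentence boundary.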

6413 (94%) DET nodes are leaves.

368 (5%) DET nodes have one child.

9 (0%) DET nodes have two children.

6 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 7.
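The child-degree figures above sum to the DET token total (6413 + 368 + 9 + 6 = 6796). A distribution of this kind can be obtained by counting how often each node ID appears as a HEAD. The sketch below assumes a single sentence given as hypothetical (ID, UPOS, HEAD) triples; the data and function name are illustrative only.

```python
from collections import Counter

# Hypothetical (ID, UPOS, HEAD) triples for one toy sentence.
TOKENS = [
    (1, "ADV", 2),   # peaaegu
    (2, "DET", 3),   # kõik
    (3, "NOUN", 4),  # lapsed
    (4, "VERB", 0),  # loevad
    (5, "DET", 6),   # seda
    (6, "NOUN", 4),  # raamatut
]

def child_degree_histogram(tokens, tag="DET"):
    """Map number of children -> number of `tag` nodes with that many children."""
    children_per_head = Counter(head for _, _, head in tokens)
    # A Counter returns 0 for IDs that never occur as a head, i.e. leaves.
    return Counter(children_per_head[tid] for tid, upos, _ in tokens if upos == tag)

hist = child_degree_histogram(TOKENS)
print(sorted(hist.items()))  # [(0, 1), (1, 1)]

# Consistency check on the figures reported on this page.
reported_total = 6413 + 368 + 9 + 6
print(reported_total)  # 6796
```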

Children of DET nodes are attached using 18 different relations: advmod (260; 63% instances), conj (44; 11% instances), det (34; 8% instances), cc (18; 4% instances), punct (11; 3% instances), acl:relcl (9; 2% instances), acl (8; 2% instances), nmod (8; 2% instances), obl (5; 1% instances), amod (2; 0% instances), cop (2; 0% instances), fixed (2; 0% instances), mark (2; 0% instances), nsubj:cop (2; 0% instances), advcl (1; 0% instances), appos (1; 0% instances), case (1; 0% instances), cc:preconj (1; 0% instances)

Children of DET nodes belong to 12 different parts of speech: ADV (263; 64% instances), DET (48; 12% instances), PRON (27; 7% instances), CCONJ (18; 4% instances), VERB (16; 4% instances), ADJ (11; 3% instances), PUNCT (11; 3% instances), NOUN (9; 2% instances), SCONJ (3; 1% instances), AUX (2; 0% instances), PROPN (2; 0% instances), ADP (1; 0% instances)