This page pertains to UD version 2.

Treebank Statistics: UD_Latvian-LVTB: POS Tags: DET

There are 56 DET lemmas (0%), 260 DET types (0%) and 7338 DET tokens (2%). Out of 17 observed tags, the rank of DET is: 11 in number of lemmas, 8 in number of types and 11 in number of tokens.

The 10 most frequent DET lemmas: šis, šī, sava, tas, savs, viss, tā, kāds, visa, cita

The 10 most frequent DET types: savu, šo, to, šī, šajā, šīs, tā, tās, savas, kādu

The 10 most frequent ambiguous lemmas: šis (DET 798, PRON 94), šī (DET 645, PRON 32), sava (DET 589, PRON 3), tas (PRON 2727, DET 525, ADV 2), savs (DET 437, PRON 6), viss (PRON 448, DET 371), tā (PRON 951, ADV 398, DET 354, SCONJ 46, PART 32, CCONJ 11), kāds (DET 279, PRON 175), visa (DET 272, PRON 14), cita (DET 214, PRON 31)

The 10 most frequent ambiguous types: savu (DET 436, PRON 6), šo (DET 320, PRON 10), to (PRON 958, DET 240, X 4, PART 1), šī (DET 192, PRON 12), šajā (DET 148, PRON 2), šīs (DET 158, PRON 2), tā (PRON 395, ADV 323, DET 162, PART 20, SCONJ 13, CCONJ 10), tās (PRON 231, DET 157), kādu (DET 137, PRON 33), visu (PRON 120, DET 119)

Morphology

The form / lemma ratio of DET is 4.642857 (the average of all parts of speech is 2.328168).
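The ratio above appears to be the number of distinct DET types divided by the number of distinct DET lemmas (260 / 56). A minimal sketch of that arithmetic, where the function name `form_lemma_ratio` is hypothetical and not part of any UD tool:

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Distinct word forms (types) divided by distinct lemmas."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts reported on this page for DET: 260 types, 56 lemmas.
det_ratio = form_lemma_ratio(260, 56)
print(f"{det_ratio:.6f}")  # 4.642857
```

A ratio well above the all-tags average of 2.328168 suggests that DET lemmas are, on average, attested in more distinct inflected forms than lemmas of other parts of speech in this treebank.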

The highest number of forms (13) was observed with the lemma “tas”: t, tai, tais, tajos, tajā, tam, tanī, tas, tie, tiem, to, tos, tā.

The second-highest number of forms (11) was observed with the lemma “šis”: šai, šajos, šajā, šie, šiem, šim, šis, šo, šos, šā, šī.

The third-highest number of forms (11) was observed with the lemma “šāds”: Šādas, šadā, šāda, šādam, šādi, šādiem, šādos, šāds, šādu, šādus, šādā.

DET occurs with 9 features: Case (7338; 100% instances), Gender (7260; 99% instances), Number (7260; 99% instances), PronType (6981; 95% instances), Person (2326; 32% instances), Poss (1314; 18% instances), Definite (357; 5% instances), Degree (357; 5% instances), Typo (11; 0% instances)

DET occurs with 23 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Loc, Case=Nom, Definite=Def, Definite=Ind, Degree=Pos, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Person=2, Person=3, Poss=Yes, PronType=Dem, PronType=Ind, PronType=Ind,Neg, PronType=Int, PronType=Prs, PronType=Rel, PronType=Tot, Typo=Yes

DET occurs with 178 feature combinations. The most frequent feature combination is Case=Gen|Gender=Masc|Number=Sing|Person=3|PronType=Dem (282 tokens). Examples: tā, šī, šā
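Feature counts like those above could be gathered from the FEATS column of a CoNLL-U file. The following is a rough sketch, not the script that generated this page: the function name `feature_stats` is hypothetical, and the two-sentence sample is hand-written for illustration rather than taken from UD_Latvian-LVTB.

```python
from collections import Counter

# Hand-written CoNLL-U fragment for illustration only (not treebank data).
FEATS_SAMPLE = """\
1\tšo\tšis\tDET\t_\tCase=Acc|Gender=Masc|Number=Sing|PronType=Dem\t2\tdet\t_\t_
2\tnamu\tnams\tNOUN\t_\tCase=Acc|Gender=Masc|Number=Sing\t0\troot\t_\t_

1\tsavu\tsavs\tDET\t_\tCase=Acc|Gender=Fem|Number=Sing|Poss=Yes|PronType=Prs\t2\tdet\t_\t_
2\tgrāmatu\tgrāmata\tNOUN\t_\tCase=Acc|Gender=Fem|Number=Sing\t0\troot\t_\t_
"""


def feature_stats(conllu_text, upos):
    """Count features, feature-value pairs and full combinations for one UPOS tag."""
    features = Counter()  # e.g. "Case"
    pairs = Counter()     # e.g. "Case=Acc"
    combos = Counter()    # the whole FEATS string
    total = 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular word lines; skip multiword ranges and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] != upos:
            continue
        total += 1
        feats = cols[5]
        combos[feats] += 1
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return total, features, pairs, combos


total, features, pairs, combos = feature_stats(FEATS_SAMPLE, "DET")
for name, count in features.most_common():
    print(f"{name} ({count}; {100 * count // total}% instances)")
```

A multi-valued entry such as PronType=Ind,Neg is kept as a single feature-value pair here, which matches how it is listed above.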

Relations

DET nodes are attached to their parents using a single relation: det (7338; 100% instances)

Parents of DET nodes belong to 11 different parts of speech: NOUN (6964; 95% instances), ADJ (180; 2% instances), VERB (73; 1% instances), PROPN (55; 1% instances), NUM (33; 0% instances), PRON (14; 0% instances), DET (10; 0% instances), ADV (3; 0% instances), INTJ (3; 0% instances), X (2; 0% instances), SYM (1; 0% instances)

6900 (94%) DET nodes are leaves.

371 (5%) DET nodes have one child.

62 (1%) DET nodes have two children.

5 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 3.
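The child-degree figures above can be reproduced in principle by counting, for each DET node, how many tokens in the same sentence name it as their HEAD. A minimal sketch under that assumption follows; `child_degree_histogram` is a hypothetical helper, and the sample sentences are hand-written for illustration, not drawn from the treebank.

```python
from collections import Counter

# Hand-written CoNLL-U fragment for illustration only (not treebank data).
DEGREE_SAMPLE = """\
1\ttieši\ttieši\tADV\t_\t_\t2\tadvmod\t_\t_
2\tšo\tšis\tDET\t_\tCase=Acc|Gender=Masc|Number=Sing|PronType=Dem\t3\tdet\t_\t_
3\tnamu\tnams\tNOUN\t_\tCase=Acc|Gender=Masc|Number=Sing\t0\troot\t_\t_

1\tsavu\tsavs\tDET\t_\tCase=Acc|Gender=Fem|Number=Sing|Poss=Yes|PronType=Prs\t2\tdet\t_\t_
2\tgrāmatu\tgrāmata\tNOUN\t_\tCase=Acc|Gender=Fem|Number=Sing\t0\troot\t_\t_
"""


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> number of nodes with that UPOS tag."""
    histogram = Counter()
    sentence = []  # (token id, UPOS, head id) for the current sentence

    def flush():
        # Count how often each token id appears as a HEAD within the sentence.
        children = Counter(head for _, _, head in sentence)
        for tok_id, tok_upos, _ in sentence:
            if tok_upos == upos:
                histogram[children[tok_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the sentence
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular word lines; skip multiword ranges and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence.append((cols[0], cols[3], cols[6]))
    flush()  # handle a final sentence without a trailing blank line
    return histogram


det_degrees = child_degree_histogram(DEGREE_SAMPLE, "DET")
print(sorted(det_degrees.items()))  # [(0, 1), (1, 1)]
```

As a consistency check on the reported figures, the four degree classes sum to the DET token count: 6900 + 371 + 62 + 5 = 7338.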

Children of DET nodes are attached using 19 different relations: discourse (141; 28% instances), compound (84; 16% instances), advmod (80; 16% instances), case (63; 12% instances), acl (57; 11% instances), conj (21; 4% instances), fixed (19; 4% instances), ccomp (13; 3% instances), det (10; 2% instances), nmod (5; 1% instances), obl (5; 1% instances), punct (4; 1% instances), dep (2; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), cc (1; 0% instances), goeswith (1; 0% instances), iobj (1; 0% instances), parataxis (1; 0% instances)

Children of DET nodes belong to 13 different parts of speech: PART (141; 28% instances), PRON (99; 19% instances), ADV (82; 16% instances), VERB (64; 13% instances), ADP (63; 12% instances), SCONJ (19; 4% instances), ADJ (13; 3% instances), NOUN (11; 2% instances), DET (10; 2% instances), PUNCT (4; 1% instances), PROPN (2; 0% instances), CCONJ (1; 0% instances), X (1; 0% instances)