
This page pertains to UD version 2.

Treebank Statistics: UD_Icelandic-IcePaHC: POS Tags: DET

There are 154 DET lemmas (0%), 624 DET types (1%) and 44947 DET tokens (5%). Out of 16 observed tags, the rank of DET is: 8 in number of lemmas, 7 in number of types and 9 in number of tokens.

The 10 most frequent DET lemmas: sá, allur, þessi, mikill, einn, hinn, enginn, margur, nokkur, hver

The 10 most frequent DET types: þetta, sá, allt, einn, það, þeim, þessi, þann, allir, þá

The 10 most frequent ambiguous lemmas: sá (DET 8795, PRON 82, VERB 33, NOUN 6, PROPN 6, ADV 5, NUM 2), allur (DET 7021, ADV 87, PRON 18, ADJ 7, NOUN 2, ADP 1), þessi (DET 7014, PRON 6, ADJ 1, NOUN 1), mikill (DET 3640, ADJ 215, ADV 196), einn (DET 3199, ADV 176, ADJ 79, NUM 43, PART 7, PRON 2, PROPN 2, X 1), hinn (DET 3023, PRON 60, NOUN 7, ADJ 1, ADV 1), enginn (DET 2325, ADV 48, NOUN 29, PRON 6, ADJ 4, PROPN 4), margur (DET 1958, ADV 101, ADJ 50, NOUN 1), nokkur (DET 1713, NOUN 23, ADV 15, ADJ 8, NUM 1, PRON 1), hver (PRON 2814, DET 1475, SCONJ 52, ADV 36, NOUN 24, ADJ 1, INTJ 1, NUM 1)

The 10 most frequent ambiguous types: þetta (DET 2011, PRON 6, NOUN 2), sá (DET 1351, VERB 612, PRON 8, ADV 1, NOUN 1), allt (DET 1508, ADV 17), einn (DET 1185, ADV 59, ADJ 44, NUM 3), það (PRON 7461, DET 1152, SCONJ 85, ADV 4, NOUN 2, ADP 1), þeim (PRON 2554, DET 1136, NOUN 1, VERB 1), þessi (DET 900, NOUN 1), þann (DET 1003, PRON 1), allir (DET 913, ADV 24), þá (ADV 6747, PRON 1043, DET 876, VERB 13, ADP 4, NOUN 3)

Morphology

The form / lemma ratio of DET is 4.051948 (the average of all parts of speech is 1.856953).
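The ratio above appears to be the number of distinct DET types divided by the number of DET lemmas, both reported at the top of this page. The following is a small sketch of that arithmetic; the variable names are chosen here for illustration only.

```python
# Counts reported on this page for DET in UD_Icelandic-IcePaHC.
det_types = 624    # distinct word forms tagged DET
det_lemmas = 154   # distinct lemmas tagged DET

# Assumption: the form / lemma ratio is types divided by lemmas.
ratio = det_types / det_lemmas
ratio_text = f"{ratio:.6f}"
print(ratio_text)
```

Under that assumption the result agrees with the reported value to six decimal places.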

The 1st highest number of forms (66) was observed with the lemma “nokkur”: nakkvað, nakkvert, nekkvað, nekkver, nekkverir, nekkverja, nekkverjum, nekkvern, nekkverra, nekkverri, nekkvers, nekkvert, nekkveru, nekkvi, nokkora, nokkorar, nokkorir, nokkoro, nokkorra, nokkra, nokkrar, nokkri, nokkrir, nokkru, nokkrum, nokkur, nokkura, nokkurar, nokkurir, nokkurn, nokkurra, nokkurrar, nokkurri, nokkurs, nokkurt, nokkuru, nokkurum, nokkuð, nokkvað, nökkra, nökkrar, nökkri, nökkru, nökkrum, nökkur, nökkurir, nökkurn, nökkurri, nökkurs, nökkurt, nökkuru, nökkurum, nökkut, nökkuð, nökkvat, nökkvað, nökkver, nökkverir, nökkverja, nökkverjar, nökkverju, nökkverjum, nökkvern, nökkverr, nökkvers, nökkvi.

The 2nd highest number of forms (41) was observed with the lemma “enginn”: einkis, einskis, einugi, ekkert, ekki, enga, engan, engar, engi, engin, enginn, engir, engis, engra, engrar, engri, engu, engum, engva, engvan, engvann, engvir, enkis, enskis, önga, öngan, öngar, öngir, öngra, öngrar, öngri, öngu, öngum, öngva, öngvan, öngvaninn, öngvar, öngvir, öngvu, öngvum, önvu.

The 3rd highest number of forms (39) was observed with the lemma “hinn”: en, ena, enir, enn, ennu, enu, enum, eð, hin, hina, hinar, hinir, hinn, hinna, hinnar, hinni, hins, hinu, hinum, hitt, hið, in, ina, inar, inir, inn, inna, innar, inni, ins, inu, inum, it, ið, na, nir, num, sá, þá.

DET occurs with 13 features: Number (43184; 96% instances), Case (43174; 96% instances), Gender (43064; 96% instances), PronType (31284; 70% instances), Definite (7581; 17% instances), Degree (7287; 16% instances), NumType (1853; 4% instances), VerbForm (184; 0% instances), Voice (184; 0% instances), Person (119; 0% instances), Mood (116; 0% instances), Tense (116; 0% instances), Foreign (34; 0% instances)

DET occurs with 35 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Degree=Cmp, Degree=Pos, Degree=Sup, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Mood=Imp, Mood=Ind, Mood=Sub, NumType=Card, NumType=Ord, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, PronType=Dem, PronType=Ind, PronType=Int, PronType=Prs, Tense=Past, Tense=Pres, VerbForm=Fin, VerbForm=Inf, VerbForm=Part, VerbForm=Sup, Voice=Act, Voice=Mid

DET occurs with 330 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing|PronType=Dem (2197 tokens). Examples: sá, þessi, hinn, sjá, hvílíkur

Relations

DET nodes are attached to their parents using 21 different relations: det (16916; 38% instances), amod (13309; 30% instances), obl (4556; 10% instances), nsubj (4365; 10% instances), obj (2839; 6% instances), appos (610; 1% instances), conj (606; 1% instances), root (350; 1% instances), nmod:poss (343; 1% instances), advcl (292; 1% instances), xcomp (240; 1% instances), ccomp (210; 0% instances), iobj (128; 0% instances), acl:relcl (66; 0% instances), acl (64; 0% instances), dep (22; 0% instances), vocative (12; 0% instances), nmod (9; 0% instances), discourse (4; 0% instances), parataxis (4; 0% instances), advmod (2; 0% instances)
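Tallies like the relation counts above could be gathered from a CoNLL-U file roughly as follows. This is a minimal sketch, assuming the standard 10-column, tab-separated CoNLL-U layout; it is not the script that generated this page. The function name `det_relation_counts` and the three-token sample sentence are invented for illustration and do not come from IcePaHC.

```python
from collections import Counter


def det_relation_counts(conllu_text):
    """Count the dependency relations (DEPREL) of tokens whose UPOS is DET."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        tok_id, upos, deprel = cols[0], cols[3], cols[7]
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not tok_id.isdigit():
            continue
        if upos == "DET":
            counts[deprel] += 1
    return counts


# Hypothetical sample sentence, built column by column to keep the tabs explicit.
rows = [
    ["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "dog", "dog", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "barks", "bark", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = "\n".join("\t".join(row) for row in rows)

relation_counts = det_relation_counts(sample)
print(dict(relation_counts))
```

Dividing each count by the total number of DET tokens would give the percentages shown in the list.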

Parents of DET nodes belong to 16 different parts of speech: NOUN (27033; 60% instances), VERB (10753; 24% instances), ADJ (1874; 4% instances), PRON (1510; 3% instances), PROPN (1158; 3% instances), DET (948; 2% instances), ADV (780; 2% instances), no parent, i.e. root nodes (350; 1% instances), AUX (297; 1% instances), NUM (80; 0% instances), X (65; 0% instances), PART (39; 0% instances), ADP (28; 0% instances), CCONJ (23; 0% instances), SCONJ (6; 0% instances), INTJ (3; 0% instances)

35884 (80%) DET nodes are leaves.

5545 (12%) DET nodes have one child.

2074 (5%) DET nodes have two children.

1444 (3%) DET nodes have three or more children.

The highest child degree of a DET node is 15.
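The four child-count figures above partition the 44947 DET tokens, which can be checked with a few lines of arithmetic. This is only a sketch using the numbers reported on this page; the dictionary keys are labels chosen here for illustration.

```python
# Child-count buckets for DET nodes, as reported on this page.
total_det_tokens = 44947
buckets = {
    "leaf": 35884,
    "one child": 5545,
    "two children": 2074,
    "three or more children": 1444,
}

# The buckets should account for every DET token exactly once.
bucket_sum = sum(buckets.values())

# Rounded percentage of DET tokens in each bucket.
percentages = {
    name: round(100 * count / total_det_tokens)
    for name, count in buckets.items()
}

print(bucket_sum == total_det_tokens)
print(percentages)
```

The rounded shares come out as 80, 12, 5 and 3 percent, in line with the figures given above.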

Children of DET nodes are attached using 30 different relations: punct (2885; 19% instances), case (2087; 13% instances), acl:relcl (1700; 11% instances), amod (1612; 10% instances), obl (1341; 9% instances), advmod (874; 6% instances), cc (827; 5% instances), cop (811; 5% instances), det (700; 5% instances), nsubj (512; 3% instances), mark (455; 3% instances), advcl (396; 3% instances), conj (384; 2% instances), xcomp (264; 2% instances), ccomp (137; 1% instances), obj (88; 1% instances), acl (77; 0% instances), compound:prt (70; 0% instances), nmod:poss (52; 0% instances), aux (48; 0% instances), nummod (45; 0% instances), nmod (43; 0% instances), dep (23; 0% instances), appos (16; 0% instances), discourse (15; 0% instances), flat:foreign (14; 0% instances), vocative (10; 0% instances), expl (4; 0% instances), iobj (3; 0% instances), parataxis (3; 0% instances)

Children of DET nodes belong to 16 different parts of speech: PUNCT (2885; 19% instances), ADP (2203; 14% instances), VERB (2033; 13% instances), NOUN (1429; 9% instances), ADJ (1310; 8% instances), ADV (1103; 7% instances), PRON (1031; 7% instances), AUX (975; 6% instances), DET (948; 6% instances), CCONJ (847; 5% instances), SCONJ (448; 3% instances), PROPN (153; 1% instances), NUM (78; 1% instances), X (22; 0% instances), INTJ (17; 0% instances), PART (14; 0% instances)