
This page pertains to UD version 2.

Treebank Statistics: UD_Icelandic-Modern: POS Tags: VERB

There are 725 VERB lemmas (12%), 1960 VERB types (18%) and 9295 VERB tokens (12%). Out of 16 observed tags, the rank of VERB is: 4 in number of lemmas, 2 in number of types and 3 in number of tokens.
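As a rough illustration of how lemma, type and token counts like these could be derived, the sketch below reads CoNLL-U text and counts distinct lemmas, distinct word forms and total tokens for one UPOS tag. It is not necessarily how this page was generated: the function name `pos_counts` and the toy rows are hypothetical, and the sketch assumes that types are counted as case-sensitive word forms.

```python
def pos_counts(conllu_text, upos="VERB"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Toy rows (hypothetical, not sentences from the treebank). Only the first four
# CoNLL-U columns (ID, FORM, LEMMA, UPOS) are given; the rest are padded with "_".
ROWS = [
    ("1", "kem", "koma", "VERB"),
    ("2", "kemur", "koma", "VERB"),
    ("3", "tek", "taka", "VERB"),
    ("4", "kemur", "koma", "VERB"),
    ("5", "á", "á", "ADP"),
]
SAMPLE = "\n".join("\t".join(r + ("_",) * 6) for r in ROWS)

print(pos_counts(SAMPLE))  # (2, 3, 4)
```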

The 10 most frequent VERB lemmas: koma, eiga, gera, fara, verða, taka, segja, halda, þurfa, sjá

The 10 most frequent VERB types: fara, gera, hringir, held, koma, taka, þakka, kemur, á, segja

The 10 most frequent ambiguous lemmas: koma (VERB 485, NOUN 3), eiga (VERB 363, NOUN 2), verða (VERB 327, AUX 126, ADV 1), taka (VERB 312, ADV 1, NOUN 1, X 1), segja (VERB 295, ADV 4), sjá (VERB 148, X 1), hafa (AUX 858, VERB 140), (VERB 136, AUX 2), vinna (VERB 136, NOUN 30), varða (VERB 113, ADV 6)

The 10 most frequent ambiguous types: á (ADP 1549, VERB 97, ADV 18, X 1), segja (VERB 90, ADV 4), verður (VERB 71, AUX 45, ADV 1), vinna (VERB 70, NOUN 2), (VERB 62, DET 1), tala (VERB 57, NOUN 2), verði (VERB 51, AUX 33, NOUN 3), hefur (AUX 216, VERB 42), langar (VERB 37, ADJ 1), verða (VERB 36, AUX 31)

Morphology

The form / lemma ratio of VERB is 2.703448 (the average of all parts of speech is 1.738114).
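The reported ratio is consistent with dividing the number of VERB types by the number of VERB lemmas given above (1960 and 725). A minimal check, assuming that "form" here means the same thing as "type" (the helper name `form_lemma_ratio` is hypothetical):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(1960, 725)
print(f"{ratio:.6f}")  # 2.703448
```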

The highest number of forms (24) was observed with the lemma “koma”: kem, kemst, kemur, kom, koma, komandi, komast, komi, komin, kominn, komist, komið, komnar, komnir, komst, komu, komum, komumst, komust, kæmi, kæmist, kæmu, kæmum, kæmust.

The 2nd highest number of forms (20) was observed with the lemma “taka”: taka, takast, taki, tek, tekin, tekinn, tekist, tekið, teknar, teknir, tekst, tekur, tæki, tækju, tækjum, tók, tókst, tóku, tókum, tökum.

The 3rd highest number of forms (17) was observed with the lemma “leggja”: lagið, lagst, lagt, lagðar, lagði, lagðist, legg, leggi, leggja, leggjast, leggjum, leggst, leggur, legði, lögð, lögðu, lögðum.

VERB occurs with 10 features: VerbForm (9221; 99% instances), Voice (9059; 97% instances), Number (5443; 59% instances), Tense (4921; 53% instances), Mood (4763; 51% instances), Person (4699; 51% instances), Case (758; 8% instances), Gender (744; 8% instances), Degree (10; 0% instances), Definite (8; 0% instances)

VERB occurs with 26 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Degree=Pos, Gender=Fem, Gender=Masc, Gender=Neut, Mood=Imp, Mood=Ind, Mood=Sub, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Tense=Past, Tense=Pres, VerbForm=Fin, VerbForm=Inf, VerbForm=Part, VerbForm=Sup, Voice=Act, Voice=Mid

VERB occurs with 66 feature combinations. The most frequent feature combination is VerbForm=Inf|Voice=Act (2629 tokens). Examples: gera, fara, taka, koma, segja, vinna, sjá, fá, ræða, spyrja
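A feature combination such as VerbForm=Inf|Voice=Act is the content of the CoNLL-U FEATS column: Name=Value pairs joined by `|`, with `_` standing for "no features". The sketch below shows one way to split such a string and to find the most frequent combination in a list of FEATS values. The helper names and the sample list are hypothetical, and the sample counts are not treebank figures.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string like 'VerbForm=Inf|Voice=Act' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def most_common_combination(feats_column):
    """Return (combination, count) for the most frequent FEATS string."""
    return Counter(feats_column).most_common(1)[0]


# Hypothetical FEATS values for a handful of VERB tokens.
sample = [
    "VerbForm=Inf|Voice=Act",
    "VerbForm=Inf|Voice=Act",
    "VerbForm=Sup|Voice=Act",
]

print(parse_feats(sample[0]))           # {'VerbForm': 'Inf', 'Voice': 'Act'}
print(most_common_combination(sample))  # ('VerbForm=Inf|Voice=Act', 2)
```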

Relations

VERB nodes are attached to their parents using 15 different relations: root (2422; 26% instances), acl (1625; 17% instances), ccomp (1195; 13% instances), conj (1180; 13% instances), acl:relcl (954; 10% instances), advcl (666; 7% instances), obl (578; 6% instances), dep (232; 2% instances), xcomp (210; 2% instances), amod (111; 1% instances), parataxis (106; 1% instances), csubj (10; 0% instances), obj (3; 0% instances), discourse (2; 0% instances), nmod:poss (1; 0% instances)

Parents of VERB nodes belong to 16 different parts of speech: VERB (3546; 38% instances), [no parent: root] (2422; 26% instances), NOUN (1105; 12% instances), PRON (781; 8% instances), ADJ (500; 5% instances), DET (344; 4% instances), ADV (287; 3% instances), AUX (119; 1% instances), PROPN (67; 1% instances), PART (52; 1% instances), SCONJ (22; 0% instances), ADP (21; 0% instances), CCONJ (14; 0% instances), NUM (11; 0% instances), INTJ (2; 0% instances), X (2; 0% instances)

160 (2%) VERB nodes are leaves.

347 (4%) VERB nodes have one child.

1819 (20%) VERB nodes have two children.

6969 (75%) VERB nodes have three or more children.

The highest child degree of a VERB node is 14.
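The child-degree figures above can be cross-checked: the four buckets sum to the 9295 VERB tokens, and the percentages follow from rounding each share. The sketch below does that arithmetic and also shows one way to compute child degrees from the HEAD column of a single sentence. The helper `child_degrees` and the toy head list are hypothetical and are not data from the treebank.

```python
def child_degrees(heads):
    """Number of children per token; heads[i] is the 1-based head of token i+1, 0 = root."""
    degrees = [0] * len(heads)
    for head in heads:
        if head > 0:
            degrees[head - 1] += 1
    return degrees


# Toy sentence: token 2 is the root, tokens 1 and 3 attach to it, token 4 attaches to 3.
print(child_degrees([2, 0, 2, 3]))  # [0, 2, 1, 0]

# Bucket counts reported on this page for VERB nodes.
buckets = {"leaf": 160, "one child": 347, "two children": 1819, "three or more": 6969}
total = sum(buckets.values())
shares = {name: round(100 * count / total) for name, count in buckets.items()}

print(total)   # 9295
print(shares)  # {'leaf': 2, 'one child': 4, 'two children': 20, 'three or more': 75}
```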

Children of VERB nodes are attached using 28 different relations: obl (5857; 17% instances), nsubj (5176; 15% instances), mark (4611; 14% instances), advmod (3713; 11% instances), obj (3008; 9% instances), aux (1347; 4% instances), acl (1289; 4% instances), cc (1270; 4% instances), conj (1183; 4% instances), ccomp (977; 3% instances), cop (975; 3% instances), punct (665; 2% instances), case (615; 2% instances), compound:prt (594; 2% instances), advcl (586; 2% instances), amod (350; 1% instances), xcomp (281; 1% instances), iobj (244; 1% instances), dep (215; 1% instances), expl (214; 1% instances), vocative (203; 1% instances), parataxis (117; 0% instances), acl:relcl (64; 0% instances), discourse (38; 0% instances), nmod (4; 0% instances), appos (2; 0% instances), csubj (1; 0% instances), det (1; 0% instances)

Children of VERB nodes belong to 16 different parts of speech: NOUN (7807; 23% instances), PRON (5067; 15% instances), ADV (3925; 12% instances), VERB (3546; 11% instances), SCONJ (2667; 8% instances), AUX (2462; 7% instances), PART (2118; 6% instances), ADP (1432; 4% instances), CCONJ (1296; 4% instances), DET (874; 3% instances), PROPN (865; 3% instances), ADJ (743; 2% instances), PUNCT (665; 2% instances), NUM (80; 0% instances), INTJ (39; 0% instances), X (14; 0% instances)