
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: VERB

There are 2212 VERB lemmas (6%), 9408 VERB types (13%) and 41091 VERB tokens (11%). Out of 16 observed tags, the rank of VERB is: 4 in number of lemmas, 3 in number of types and 3 in number of tokens.
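As a rough illustration of how the three counts differ, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one tag in CoNLL-U text. The function `count_upos`, the helper `make_row` and the two-sentence sample are hypothetical and are not taken from the treebank. The sketch also assumes that types are compared case-sensitively, which may not match how the statistics on this page were produced.

```python
def make_row(idx, form, lemma, upos):
    # Build one 10-column CoNLL-U token line; unused columns are "_".
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "_", "_", "_", "_"])


# Made-up sample for illustration only (not from UD_Estonian-EDT).
SAMPLE = "\n".join([
    "# text = Ta teeb tööd .",
    make_row(1, "Ta", "tema", "PRON"),
    make_row(2, "teeb", "tegema", "VERB"),
    make_row(3, "tööd", "töö", "NOUN"),
    make_row(4, ".", ".", "PUNCT"),
    "",
    "# text = Nad tegid tööd .",
    make_row(1, "Nad", "tema", "PRON"),
    make_row(2, "tegid", "tegema", "VERB"),
    make_row(3, "tööd", "töö", "NOUN"),
    make_row(4, ".", ".", "PUNCT"),
])


def count_upos(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:
            lemmas.add(cols[2])
            forms.add(cols[1])
            tokens += 1
    return len(lemmas), len(forms), tokens


print(count_upos(SAMPLE, "VERB"))  # (1, 2, 2)
```

In the sample, both verb tokens share the lemma "tegema" but have different forms, so the result is one lemma, two types and two tokens.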

The 10 most frequent VERB lemmas: saama, olema, tulema, tegema, ütlema, minema, võtma, jääma, hakkama, andma

The 10 most frequent VERB types: on, tuleb, teha, ütles, saada, sai, saab, saanud, tuli, jääb

The 10 most frequent ambiguous lemmas: saama (VERB 1114, AUX 560), olema (AUX 12688, VERB 1089), tulema (VERB 1053, AUX 1), minema (VERB 689, PRON 1), nägema (VERB 426, AUX 1), pidama (AUX 764, VERB 374), tunduma (VERB 77, AUX 45), paistma (VERB 75, AUX 8), huvitama (VERB 45, ADJ 1), pruukima (VERB 42, AUX 1)

The 10 most frequent ambiguous types: on (AUX 7162, VERB 427), saada (VERB 194, AUX 1), sai (VERB 177, AUX 30, NOUN 3), saab (AUX 199, VERB 155), saanud (VERB 152, AUX 35, ADJ 22, NOUN 1), tuli (VERB 149, NOUN 6), oli (AUX 1801, VERB 95), pole (AUX 647, VERB 87), jäänud (VERB 98, ADJ 34, NOUN 3), tulnud (VERB 93, ADJ 16)

Morphology

The form / lemma ratio of VERB is 4.253165 (the average of all parts of speech is 1.912184).
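The VERB ratio can be reproduced from the counts given above by dividing the number of VERB types by the number of VERB lemmas. This is a minimal check; the variable names are illustrative only.

```python
# Counts reported on this page for VERB.
verb_types = 9408
verb_lemmas = 2212

# Form / lemma ratio, rounded to six decimals as on this page.
ratio = round(verb_types / verb_lemmas, 6)
print(ratio)  # 4.253165
```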

The highest number of forms (39) was observed with the lemma “tegema”: tee, teeb, teebki, teed, teegi, teeks, teeksime, teeksite, teeme, teen, teengi, teete, teevad, teevadki, tegema, tegemas, tegemast, tegemata, tegi, tegid, tegigi, tegime, tegin, tegingi, tegite, teha, tehakse, tehaksegi, tehes, tehke, tehku, tehta, tehtagi, tehtagu, tehtaks, tehti, tehtud, teinud, teinudki.

The 2nd highest number of forms (38) was observed with the lemma “saama”: saa, saab, saad, saada, saadaks, saadakse, saades, saadi, saadud, saage, saagi, saagu, saaks, saaksid, saaksime, saaksin, saaksite, saakski, saama, saamas, saamata, saame, saamegi, saan, saand, saanud, saanudki, saanuks, saate, saavad, saavat, sai, said, saigi, saime, sain, saingi, saite.

The 3rd highest number of forms (37) was observed with the lemma “olema”: Olge, Olidki, Ons, oldi, oldud, ole, oled, olegi, oleks, oleksid, olema, olemas, olemata, oleme, olen, olete, olevat, olgu, oli, olid, oligi, olime, olin, olla, ollagi, olles, olnud, olnudki, olnuks, on, ongi, pole, polegi, poleks, polekski, polnud, polnudki.

VERB occurs with 11 features: VerbForm (41091; 100% instances), Voice (33605; 82% instances), Tense (31102; 76% instances), Mood (27178; 66% instances), Number (21741; 53% instances), Person (21731; 53% instances), Connegative (2563; 6% instances), Case (2497; 6% instances), Polarity (167; 0% instances), Abbr (40; 0% instances), Hyph (1; 0% instances)

VERB occurs with 29 feature-value pairs: Abbr=Yes, Case=Abe, Case=All, Case=Ela, Case=Ill, Case=Ine, Case=Tra, Connegative=Yes, Hyph=Yes, Mood=Cnd, Mood=Imp, Mood=Ind, Mood=Qot, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Polarity=Neg, Tense=Imp, Tense=Past, Tense=Pres, VerbForm=Conv, VerbForm=Fin, VerbForm=Inf, VerbForm=Part, VerbForm=Sup, Voice=Act, Voice=Pass

VERB occurs with 79 feature combinations. The most frequent feature combination is Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act (7399 tokens). Examples: on, tuleb, saab, jääb, ütleb, läheb, hakkab, teeb, annab, tähendab
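A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, as in the FEATS column of a CoNLL-U file. The sketch below splits such a string into a dictionary; `parse_feats` is a hypothetical helper name, and treating `_` as "no features" follows the CoNLL-U convention.

```python
def parse_feats(feats):
    """Split a FEATS string like 'Mood=Ind|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}  # "_" marks an empty feature set in CoNLL-U
    pairs = {}
    for item in feats.split("|"):
        name, _, value = item.partition("=")
        pairs[name] = value
    return pairs


most_frequent = "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act"
parsed = parse_feats(most_frequent)
print(parsed["Tense"], parsed["Person"])  # Pres 3
```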

Relations

VERB nodes are attached to their parents using 23 different relations: root (18263; 44% instances), conj (6251; 15% instances), advcl (4570; 11% instances), xcomp (3103; 8% instances), acl:relcl (2638; 6% instances), parataxis (2127; 5% instances), ccomp (1608; 4% instances), acl (965; 2% instances), csubj (810; 2% instances), csubj:cop (711; 2% instances), compound (14; 0% instances), amod (7; 0% instances), appos (6; 0% instances), aux (6; 0% instances), dep (2; 0% instances), nmod (2; 0% instances), orphan (2; 0% instances), advmod (1; 0% instances), compound:prt (1; 0% instances), discourse (1; 0% instances), flat (1; 0% instances), obl (1; 0% instances), vocative (1; 0% instances)

Parents of VERB nodes belong to 14 different parts of speech: no parent, i.e. the VERB is the sentence root (18263; 44% instances), VERB (15400; 37% instances), NOUN (4029; 10% instances), ADJ (1813; 4% instances), PRON (831; 2% instances), ADV (389; 1% instances), PROPN (304; 1% instances), NUM (37; 0% instances), DET (18; 0% instances), X (3; 0% instances), ADP (1; 0% instances), AUX (1; 0% instances), INTJ (1; 0% instances), SYM (1; 0% instances)

1119 (3%) VERB nodes are leaves.

3683 (9%) VERB nodes have one child.

4534 (11%) VERB nodes have two children.

31755 (77%) VERB nodes have three or more children.

The highest child degree of a VERB node is 26.
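The four child-count buckets above partition all VERB tokens, which can be checked with a few lines of arithmetic. The dictionary keys below are illustrative labels; the counts are the ones reported on this page.

```python
# Number of VERB nodes by how many children they have.
degree_counts = {
    "leaf": 1119,
    "one child": 3683,
    "two children": 4534,
    "three or more": 31755,
}

total = sum(degree_counts.values())
# Share of each bucket, rounded to whole percent.
shares = {label: round(100 * n / total) for label, n in degree_counts.items()}

print(total)   # 41091
print(shares)  # {'leaf': 3, 'one child': 9, 'two children': 11, 'three or more': 77}
```

The total equals the number of VERB tokens, so every VERB node falls into exactly one bucket.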

Children of VERB nodes are attached using 32 different relations: punct (35479; 22% instances), obl (25660; 16% instances), nsubj (22027; 14% instances), obj (17291; 11% instances), advmod (14300; 9% instances), aux (8853; 6% instances), conj (6276; 4% instances), mark (5840; 4% instances), cc (4820; 3% instances), advcl (4494; 3% instances), xcomp (4370; 3% instances), compound:prt (3793; 2% instances), parataxis (2295; 1% instances), ccomp (2237; 1% instances), csubj (858; 1% instances), amod (534; 0% instances), nummod (389; 0% instances), discourse (161; 0% instances), vocative (89; 0% instances), nsubj:cop (72; 0% instances), cop (41; 0% instances), cc:preconj (33; 0% instances), compound (15; 0% instances), acl:relcl (14; 0% instances), nmod (10; 0% instances), case (6; 0% instances), dep (3; 0% instances), acl (2; 0% instances), appos (2; 0% instances), csubj:cop (2; 0% instances), fixed (2; 0% instances), orphan (2; 0% instances)

Children of VERB nodes belong to 16 different parts of speech: NOUN (49940; 31% instances), PUNCT (35479; 22% instances), ADV (19593; 12% instances), VERB (15400; 10% instances), PRON (11969; 7% instances), AUX (8891; 6% instances), PROPN (5601; 4% instances), CCONJ (4823; 3% instances), SCONJ (4799; 3% instances), ADJ (2388; 1% instances), NUM (884; 1% instances), INTJ (156; 0% instances), SYM (19; 0% instances), X (19; 0% instances), ADP (8; 0% instances), DET (1; 0% instances)