This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PDT: POS Tags: VERB

There are 5664 VERB lemmas (9%), 21008 VERB types (16%) and 130274 VERB tokens (9%). Out of 17 observed tags, the rank of VERB is: 4 in number of lemmas, 4 in number of types and 5 in number of tokens.
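The three counts above can be reproduced from a CoNLL-U file by collecting distinct lemmas, distinct word forms, and the number of occurrences for one UPOS tag. The following is a minimal sketch, not the script that generated this page: the function name `count_upos` and the sample rows are hypothetical, and it assumes that types are compared case-sensitively.

```python
def count_upos(conllu_lines, upos="VERB"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_lines:
        line = line.rstrip("\n")
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        # CoNLL-U columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # assumption: case-sensitive comparison of forms
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


# Hand-made illustrative rows, not taken from UD_Czech-PDT.
rows = [
    ["1", "má", "mít", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "psa", "pes", "NOUN", "_", "_", "1", "obj", "_", "_"],
    ["3", "mají", "mít", "VERB", "_", "_", "1", "conj", "_", "_"],
    ["4", "má", "mít", "VERB", "_", "_", "1", "conj", "_", "_"],
]
sample = ["# text = illustrative"] + ["\t".join(r) for r in rows]
print(count_upos(sample))  # (1, 2, 3)
```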

The 10 most frequent VERB lemmas: mít, moci, muset, stát, chtít, jít, říci, lze, dát, dostat

The 10 most frequent VERB types: má, může, řekl, měl, mají, musí, jde, měla, lze, mít

The 10 most frequent ambiguous lemmas: stát (VERB 1542, NOUN 1446), růst (NOUN 353, VERB 149), vzrůst (VERB 139, NOUN 13), jet (VERB 129, PROPN 6, NOUN 3), hledět (VERB 39, ADP 1), škodit (VERB 18, NOUN 1), rozlišit (VERB 12, NOUN 1), drát (NOUN 25, VERB 4), do (ADP 7414, PROPN 11, VERB 3, NOUN 2), transit (VERB 3, NOUN 2)

The 10 most frequent ambiguous types: (VERB 2171, DET 16), stát (NOUN 301, VERB 226), moci (NOUN 129, VERB 111), pomoci (VERB 94, NOUN 79), vlastní (ADJ 464, VERB 76), myslí (VERB 69, NOUN 4), děje (VERB 39, NOUN 28), praví (VERB 24, ADJ 1), plní (VERB 24, ADJ 1), padla (VERB 19, NOUN 1)
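An ambiguous lemma or type in the two lists above is one that occurs with VERB and with at least one other tag. A minimal sketch of that grouping follows; the helper name `ambiguous_items` and the input pairs are hypothetical, and the ranking used by the page is not reproduced here.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos="VERB"):
    """pairs: iterable of (item, tag), where item is a lemma or a word form.

    Returns the items seen with `upos` and at least one other tag,
    each mapped to its per-tag counts in descending order.
    """
    by_item = defaultdict(Counter)
    for item, tag in pairs:
        by_item[item][tag] += 1
    return {
        item: tags.most_common()
        for item, tags in by_item.items()
        if upos in tags and len(tags) > 1
    }


# Hand-made illustrative input, not taken from UD_Czech-PDT.
pairs = [
    ("stát", "VERB"), ("stát", "NOUN"), ("stát", "VERB"),
    ("mít", "VERB"), ("do", "ADP"),
]
print(ambiguous_items(pairs))  # {'stát': [('VERB', 2), ('NOUN', 1)]}
```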

Morphology

The form / lemma ratio of VERB is 3.709040 (the average of all parts of speech is 2.181221).
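The reported ratio is consistent with dividing the number of VERB types by the number of VERB lemmas given at the top of this page. A short check, with the hypothetical helper name `form_lemma_ratio`:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts taken from this page: 21008 VERB types, 5664 VERB lemmas.
ratio = form_lemma_ratio(21008, 5664)
print(f"{ratio:.6f}")  # 3.709040
```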

The highest number of forms (36) was observed with the lemma “stát”: nestal, nestala, nestali, nestalo, nestaly, nestane, nestanou, nestojí, nestojíme, nestojíte, nestál, nestála, nestáli, nestálo, nestály, stal, stala, stali, stalo, staly, stane, stanete, stanou, stanu, stoje, stojí, stojím, stojíme, stál, stála, stáli, stálo, stály, stát, státi, stůj.

The 2nd highest number of forms (31) was observed with the lemma “jít”: Nejít, Pojď, Pojďme, jde, jdem, jdeme, jdete, jdou, jít, nejde, nejdeme, nejdou, nejdu, nepůjde, nepůjdou, nepůjdu, nešel, nešla, nešli, nešlo, nešly, půjde, půjdeme, půjdete, půjdou, půjdu, šel, šla, šli, šlo, šly.

The 3rd highest number of forms (29) was observed with the lemma “dát”: Dej, Nedejte, Nedám, dají, dal, dala, dali, dalo, daly, dejme, dejte, dá, dám, dáme, dát, dáte, dáti, nedají, nedal, nedala, nedali, nedalo, nedaly, nedat, nedej, nedejme, nedá, nedáme, nedáš.

VERB occurs with 14 features: VerbForm (130274; 100% instances), Polarity (130272; 100% instances), Number (106094; 81% instances), Tense (105235; 81% instances), Voice (105235; 81% instances), Aspect (81688; 63% instances), Mood (56803; 44% instances), Person (56798; 44% instances), Gender (49280; 38% instances), Animacy (12511; 10% instances), Style (145; 0% instances), Foreign (117; 0% instances), NameType (13; 0% instances), Abbr (5; 0% instances)

VERB occurs with 38 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Aspect=Imp, Aspect=Perf, Foreign=Yes, Gender=Fem, Gender=Fem,Masc, Gender=Fem,Neut, Gender=Masc, Gender=Neut, Mood=Cnd, Mood=Imp, Mood=Ind, NameType=Com, NameType=Oth, NameType=Pro, Number=Plur, Number=Plur,Sing, Number=Sing, Person=1, Person=2, Person=3, Polarity=Neg, Polarity=Pos, Style=Arch, Style=Coll, Style=Expr, Style=Rare, Style=Vrnc, Tense=Fut, Tense=Past, Tense=Pres, VerbForm=Conv, VerbForm=Fin, VerbForm=Inf, VerbForm=Part, Voice=Act

VERB occurs with 178 feature combinations. The most frequent feature combination is Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin|Voice=Act (17342 tokens). Examples: říká, patří, znamená, tvrdí, představuje, uvádí, považuje, existuje, začíná, vyplývá
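A feature combination such as the one above is the content of the FEATS column in CoNLL-U: `Feature=Value` pairs joined by `|`, with `_` standing for no features. The sketch below splits such a string into a dictionary; the name `parse_feats` is hypothetical, and multi-valued entries such as `Gender=Fem,Masc` are kept as a single string, matching how they are listed as feature-value pairs on this page.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


combo = ("Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Polarity=Pos"
         "|Tense=Pres|VerbForm=Fin|Voice=Act")
feats = parse_feats(combo)
print(len(feats))         # 8
print(feats["VerbForm"])  # Fin
```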

Relations

VERB nodes are attached to their parents using 21 different relations: root (56856; 44% instances), conj (17389; 13% instances), xcomp (13797; 11% instances), acl:relcl (13496; 10% instances), ccomp (7490; 6% instances), advcl (7316; 6% instances), acl (5381; 4% instances), csubj (5280; 4% instances), parataxis (1499; 1% instances), appos (688; 1% instances), dep (401; 0% instances), csubj:pass (375; 0% instances), orphan (184; 0% instances), flat:foreign (66; 0% instances), case (27; 0% instances), fixed (15; 0% instances), nmod (8; 0% instances), advmod (2; 0% instances), obj (2; 0% instances), amod (1; 0% instances), nsubj (1; 0% instances)

Parents of VERB nodes belong to 16 different parts of speech: none, i.e. the artificial root node (56856; 44% instances), VERB (43429; 33% instances), NOUN (17559; 13% instances), ADJ (6063; 5% instances), DET (2805; 2% instances), PROPN (1320; 1% instances), AUX (1031; 1% instances), ADV (656; 1% instances), PRON (227; 0% instances), NUM (186; 0% instances), PART (79; 0% instances), SYM (26; 0% instances), CCONJ (19; 0% instances), ADP (8; 0% instances), SCONJ (7; 0% instances), INTJ (3; 0% instances)

1348 (1%) VERB nodes are leaves.

9660 (7%) VERB nodes have one child.

15089 (12%) VERB nodes have two children.

104177 (80%) VERB nodes have three or more children.

The highest child degree of a VERB node is 19.
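The child degree of a node is the number of nodes whose HEAD points to it. The leaf, one-child, two-children and three-or-more figures above can be obtained by counting those heads per sentence. This is a minimal sketch under that reading; the function name `child_degree_buckets` and the example tree are hypothetical.

```python
from collections import Counter


def child_degree_buckets(sentence, upos="VERB"):
    """sentence: list of (id, head, upos) tuples for one dependency tree.

    Returns a Counter keyed by 0, 1, 2 or 3, where 3 means "three or more".
    """
    n_children = Counter(head for _, head, _ in sentence)
    buckets = Counter()
    for node_id, _, tag in sentence:
        if tag == upos:
            buckets[min(n_children[node_id], 3)] += 1
    return buckets


# Hand-made illustrative tree for "Petr říká, že přijde." (not from the treebank).
sentence = [
    (1, 2, "PROPN"),  # Petr
    (2, 0, "VERB"),   # říká
    (3, 5, "PUNCT"),  # ,
    (4, 5, "SCONJ"),  # že
    (5, 2, "VERB"),   # přijde
    (6, 2, "PUNCT"),  # .
]
buckets = child_degree_buckets(sentence)
print(dict(buckets))  # {3: 1, 2: 1}
```

On this page, the four buckets sum to the total number of VERB tokens: 1348 + 9660 + 15089 + 104177 = 130274.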

Children of VERB nodes are attached using 35 different relations: punct (111727; 23% instances), nsubj (67871; 14% instances), obl (63220; 13% instances), obj (52096; 11% instances), advmod (38730; 8% instances), obl:arg (28196; 6% instances), conj (18090; 4% instances), mark (17585; 4% instances), cc (17413; 4% instances), expl:pv (16641; 3% instances), xcomp (15123; 3% instances), aux (13389; 3% instances), ccomp (10541; 2% instances), advcl (6937; 1% instances), expl:pass (4897; 1% instances), nsubj:pass (3146; 1% instances), csubj (2614; 1% instances), dep (2297; 0% instances), advmod:emph (1556; 0% instances), parataxis (1095; 0% instances), iobj (1064; 0% instances), csubj:pass (404; 0% instances), nmod (322; 0% instances), appos (288; 0% instances), discourse (279; 0% instances), flat:foreign (63; 0% instances), vocative (60; 0% instances), orphan (51; 0% instances), amod (26; 0% instances), acl (13; 0% instances), det (9; 0% instances), nummod (8; 0% instances), fixed (5; 0% instances), acl:relcl (4; 0% instances), cop (3; 0% instances)

Children of VERB nodes belong to 16 different parts of speech: NOUN (160747; 32% instances), PUNCT (111727; 23% instances), VERB (43429; 9% instances), ADV (41767; 8% instances), PRON (39415; 8% instances), DET (18283; 4% instances), CCONJ (17490; 4% instances), PROPN (17351; 3% instances), SCONJ (16608; 3% instances), AUX (14844; 3% instances), ADJ (8783; 2% instances), NUM (3154; 1% instances), PART (1924; 0% instances), ADP (121; 0% instances), SYM (92; 0% instances), INTJ (28; 0% instances)