
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PDTC: POS Tags: ADV

There are 2825 ADV lemmas (3%), 3188 ADV types (2%) and 165193 ADV tokens (5%). Out of 17 observed tags, the rank of ADV is: 7 in number of lemmas, 7 in number of types and 6 in number of tokens.
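As a rough illustration of how counts like these could be obtained, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag in CoNLL-U text. The function name `count_upos` and the toy rows are hypothetical, and comparing forms case-sensitively is an assumption; the script that generated this page may normalize forms differently.

```python
def count_upos(conllu_text: str, upos: str = "ADV"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, n_tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges and empty nodes
        if cols[3] == upos:
            lemmas.add(cols[2])  # LEMMA column
            types.add(cols[1])   # FORM column (case-sensitive: an assumption)
            n_tokens += 1
    return len(lemmas), len(types), n_tokens


# Hypothetical toy rows (ID, FORM, LEMMA, UPOS, ..., HEAD, DEPREL, ...), not treebank data.
toy_rows = [
    ("1", "daleko", "daleko", "ADV", "_", "_", "0", "root", "_", "_"),
    ("2", "dál", "daleko", "ADV", "_", "_", "1", "conj", "_", "_"),
    ("3", "dál", "daleko", "ADV", "_", "_", "1", "conj", "_", "_"),
]
toy_text = "\n".join("\t".join(row) for row in toy_rows)
toy_counts = count_upos(toy_text)
print(toy_counts)  # (1, 2, 3): one lemma, two types, three tokens
```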

The 10 most frequent ADV lemmas: tam, už, tak, jak, více, kde, pak, kdy, ještě, včera

The 10 most frequent ADV types: tam, už, tak, jak, kde, pak, kdy, více, ještě, včera

The 10 most frequent ambiguous lemmas: (ADV 5406, PART 2200), tak (ADV 5088, PART 4056, CCONJ 489), jak (ADV 4182, SCONJ 2298, CCONJ 341, PROPN 5), ještě (PART 3186, ADV 2368), moc (ADV 1500, NOUN 350), již (ADV 1434, PART 964), dobře (ADV 1321, PART 450), tedy (ADV 1189, CCONJ 572), daleko (ADV 754, NOUN 4), jen (PART 2925, ADV 721, NOUN 423, SCONJ 5)

The 10 most frequent ambiguous types: (ADV 4692, PART 1995), tak (ADV 4790, PART 3727, CCONJ 470), jak (ADV 2596, SCONJ 1577, CCONJ 312, PROPN 5), ještě (PART 2673, ADV 2142), moc (ADV 1273, NOUN 127), již (ADV 1352, PART 880, PRON 28), dobře (ADV 1219, PART 8), tedy (ADV 1158, CCONJ 557), tu (ADV 762, DET 610), jen (PART 2750, ADV 672, NOUN 14, SCONJ 2)

Morphology

The form / lemma ratio of ADV is 1.128496 (the average of all parts of speech is 2.169184).
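The ADV ratio is consistent with dividing the number of ADV types by the number of ADV lemmas reported on this page (3188 / 2825). A minimal sketch of that arithmetic, assuming this is how the figure is derived; the function name `form_lemma_ratio` is hypothetical:

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Type and lemma counts reported for ADV on this page.
adv_ratio = form_lemma_ratio(3188, 2825)
print(f"{adv_ratio:.6f}")  # 1.128496
```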

The 1st highest number of forms (7) was observed with the lemma “snadno”: nejsnadněji, nejsnáze, nesnadno, snadno, snadněji, snáz, snáze.

The 2nd highest number of forms (6) was observed with the lemma “daleko”: daleko, dál, dále, nedaleko, nejdál, nejdále.

The 3rd highest number of forms (5) was observed with the lemma “blízko”: blízko, blíž, blíže, nejblíž, nejblíže.

ADV occurs with 9 features: Degree (50186; 30% instances), Polarity (50186; 30% instances), PronType (43007; 26% instances), NumType (1446; 1% instances), ExtPos (831; 1% instances), Style (817; 0% instances), Abbr (541; 0% instances), Foreign (2; 0% instances), Typo (2; 0% instances)

ADV occurs with 22 feature-value pairs: Abbr=Yes, Degree=Cmp, Degree=Pos, Degree=Sup, ExtPos=ADP, ExtPos=ADV, ExtPos=CCONJ, ExtPos=SCONJ, Foreign=Yes, NumType=Mult, Polarity=Neg, Polarity=Pos, PronType=Dem, PronType=Dem,Ind, PronType=Ind, PronType=Int,Rel, PronType=Neg, PronType=Rel, PronType=Tot, Style=Coll, Style=Expr, Typo=Yes

ADV occurs with 36 feature combinations. The most frequent feature combination is _ (69120 tokens). Examples: už, pak, ještě, včera, potom, dnes, velmi, moc, vždycky, již

Relations

ADV nodes are attached to their parents using 29 different relations: advmod (125408; 76% instances), advmod:emph (15361; 9% instances), root (6476; 4% instances), conj (5472; 3% instances), obj (2742; 2% instances), dep (1554; 1% instances), obl (1149; 1% instances), nsubj (1019; 1% instances), cc (1008; 1% instances), advcl (987; 1% instances), appos (674; 0% instances), case (670; 0% instances), acl:relcl (593; 0% instances), ccomp (581; 0% instances), orphan (399; 0% instances), iobj (264; 0% instances), acl (157; 0% instances), xcomp (141; 0% instances), nsubj:pass (117; 0% instances), parataxis (94; 0% instances), obl:arg (87; 0% instances), csubj (63; 0% instances), fixed (62; 0% instances), advcl:pred (56; 0% instances), nmod (30; 0% instances), discourse (17; 0% instances), csubj:pass (9; 0% instances), mark (2; 0% instances), compound (1; 0% instances)
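The percentages in this list are consistent with dividing each relation count by the total number of ADV tokens (165193) and rounding to a whole number; the 29 relation counts also sum to that total. A small sketch of the calculation for the four most frequent relations, assuming that denominator; `share_percent` is a hypothetical helper:

```python
def share_percent(count: int, total: int) -> int:
    """Share of `total` taken by `count`, as a whole-number percentage."""
    return round(100 * count / total)


ADV_TOKENS = 165193  # total ADV tokens reported on this page
top_relations = {
    "advmod": 125408,
    "advmod:emph": 15361,
    "root": 6476,
    "conj": 5472,
}
shares = {rel: share_percent(n, ADV_TOKENS) for rel, n in top_relations.items()}
print(shares)  # {'advmod': 76, 'advmod:emph': 9, 'root': 4, 'conj': 3}
```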

Parents of ADV nodes belong to 17 different parts of speech: VERB (94404; 57% instances), ADJ (29206; 18% instances), NOUN (13340; 8% instances), ADV (11152; 7% instances), [root, no parent] (6476; 4% instances), NUM (3725; 2% instances), DET (2150; 1% instances), PRON (1332; 1% instances), PROPN (1320; 1% instances), AUX (914; 1% instances), PART (818; 0% instances), CCONJ (184; 0% instances), X (82; 0% instances), SYM (54; 0% instances), INTJ (18; 0% instances), ADP (14; 0% instances), SCONJ (4; 0% instances)

127120 (77%) ADV nodes are leaves.

22150 (13%) ADV nodes have one child.

5204 (3%) ADV nodes have two children.

10719 (6%) ADV nodes have three or more children.

The highest child degree of an ADV node is 12.
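The child-degree figures above can be reproduced in principle by counting, for every ADV node, how many tokens name it as their head. The sketch below does this for CoNLL-U text; the function name `child_degree_histogram` and the toy sentence are hypothetical, and this is not the script that produced this page.

```python
from collections import Counter


def child_degree_histogram(conllu_text: str, upos: str = "ADV") -> Counter:
    """Map number of children -> number of nodes with the given UPOS."""
    histogram = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        tags = {}               # token ID -> UPOS
        n_children = Counter()  # token ID -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            tok_id, tag, head = cols[0], cols[3], cols[6]
            if "-" in tok_id or "." in tok_id:
                continue  # skip multiword token ranges and empty nodes
            tags[tok_id] = tag
            n_children[head] += 1
        for tok_id, tag in tags.items():
            if tag == upos:
                histogram[n_children[tok_id]] += 1
    return histogram


# Hypothetical toy sentence, not taken from the treebank.
toy_sentence = [
    ("1", "Přišel", "přijít", "VERB", "_", "_", "0", "root", "_", "_"),
    ("2", "velmi", "velmi", "ADV", "_", "_", "3", "advmod", "_", "_"),
    ("3", "pozdě", "pozdě", "ADV", "_", "_", "1", "advmod", "_", "_"),
    (".", ) if False else ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
]
toy_conllu = "\n".join("\t".join(row) for row in toy_sentence)
degree_counts = child_degree_histogram(toy_conllu)
print(dict(degree_counts))  # {0: 1, 1: 1}: "velmi" is a leaf, "pozdě" has one child
```

As a consistency check on the figures reported here, the four buckets (127120 + 22150 + 5204 + 10719) add up to 165193, the total number of ADV tokens.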

Children of ADV nodes are attached using 35 different relations: punct (16238; 21% instances), cop (8731; 11% instances), advmod (7575; 10% instances), advmod:emph (6835; 9% instances), nsubj (6294; 8% instances), obl (6249; 8% instances), advcl (5214; 7% instances), conj (4104; 5% instances), cc (3951; 5% instances), nmod (2186; 3% instances), mark (1599; 2% instances), dep (1508; 2% instances), aux (1083; 1% instances), orphan (1057; 1% instances), fixed (829; 1% instances), case (649; 1% instances), appos (571; 1% instances), obl:arg (264; 0% instances), csubj (219; 0% instances), obj (179; 0% instances), parataxis (142; 0% instances), det (126; 0% instances), ccomp (111; 0% instances), discourse (85; 0% instances), advcl:pred (65; 0% instances), acl (49; 0% instances), compound (31; 0% instances), acl:relcl (21; 0% instances), amod (15; 0% instances), nummod (13; 0% instances), vocative (13; 0% instances), xcomp (8; 0% instances), expl:pass (2; 0% instances), iobj (2; 0% instances), flat (1; 0% instances)

Children of ADV nodes belong to 17 different parts of speech: PUNCT (16238; 21% instances), NOUN (13866; 18% instances), ADV (11152; 15% instances), AUX (9968; 13% instances), PART (5830; 8% instances), VERB (4957; 7% instances), CCONJ (4156; 5% instances), DET (2071; 3% instances), SCONJ (1720; 2% instances), PRON (1559; 2% instances), ADP (1309; 2% instances), PROPN (1308; 2% instances), ADJ (978; 1% instances), NUM (809; 1% instances), SYM (52; 0% instances), X (36; 0% instances), INTJ (10; 0% instances)