ADV: adverb
Definition
Bulgarian adverbs are words that typically modify verbs with respect to categories such as time, place, direction or manner. They may also modify adjectives and other adverbs, as in very briefly or arguably wrong. Some adverbs can even modify nouns (NOUN).
In the BulTreeBank tagset the corresponding POS tag is D.
There is a closed subclass of pronominal adverbs that refer to
circumstances in context, rather than naming them directly; similarly
to pronouns, these can be categorized as interrogative, relative,
demonstrative etc. Pronominal adverbs also get the ADV part-of-speech tag,
but they are differentiated by additional features.
In the BulTreeBank tagset the corresponding tags are as follows:
- Pdl, Pdm, Pdq, Pdt (Adverbial demonstrative pronouns for location, manner, quantity and time)
- Prl, Prm, Prq, Prt (Adverbial relative pronouns for location, manner, quantity and time)
- Pcl, Pcm, Pct (Adverbial collective pronouns for location, manner and time)
- Pil, Pim, Piq, Pit (Adverbial interrogative pronouns for location, manner, quantity and time)
- Pfl, Pfm, Pfq, Pft (Adverbial indefinite pronouns for location, manner, quantity and time)
- Pnl, Pnm, Pnq, Pnt (Adverbial negative pronouns for location, manner, quantity and time)
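The tag list above follows a regular pattern: the second letter gives the pronominal type and the third letter gives what the adverb refers to. A minimal Python sketch of that decomposition follows. The function name `map_pronominal_adverb`, the `refers_to` key, and the pairing of each letter with a `PronType` value are assumptions inferred from the tag descriptions and from the PronType values reported in the Morphology section, not the official conversion table.

```python
# Second letter of the BulTreeBank tag -> UD PronType value.
# This pairing is inferred from the tag descriptions, not taken from
# an official mapping table.
PRON_TYPE = {
    "d": "Dem",  # demonstrative
    "r": "Rel",  # relative
    "c": "Tot",  # collective
    "i": "Int",  # interrogative
    "f": "Ind",  # indefinite
    "n": "Neg",  # negative
}

# Third letter of the tag -> what the adverb refers to.
REFERENT = {"l": "location", "m": "manner", "q": "quantity", "t": "time"}

# Only the combinations listed in the text are accepted
# (there is no collective tag for quantity).
LISTED = {"d": "lmqt", "r": "lmqt", "c": "lmt",
          "i": "lmqt", "f": "lmqt", "n": "lmqt"}


def map_pronominal_adverb(tag):
    """Return a UD-style analysis for a listed tag, or None otherwise."""
    if len(tag) != 3 or tag[0] != "P":
        return None
    kind, ref = tag[1], tag[2]
    if ref not in LISTED.get(kind, ""):
        return None
    return {"upos": "ADV",
            "PronType": PRON_TYPE[kind],
            "refers_to": REFERENT[ref]}


print(map_pronominal_adverb("Pil"))
```

The `refers_to` value is kept only for illustration; among the features reported for ADV in this treebank, none encodes the location/manner/quantity/time distinction.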
Examples
- demonstrative adverbs: тук, там, тогава / tuk, tam, togava “here, there, then”
- relative adverbs: когато, където, както, колкото / kogato, kadeto, kakto, kolkoto “when, where, as, as much as”
- collective adverbs: навсякъде, всякога, всякак / navsyakade, vsyakoga, vsyakak “everywhere, always, anyway”
- interrogative adverbs: кога, къде, как, колко / koga, kade, kak, kolko “when, where, how, how many”
- indefinite adverbs: някъде, някога, някак / nyakade, nyakoga, nyakak “somewhere, sometime, somehow”
- negative adverbs: никога, никъде, никак / nikoga, nikade, nikak “never, nowhere, not at all”
Note that some words that are traditionally called numerals in
some languages (e.g. Bulgarian) are treated as adverbs in the
universal tagging scheme. In particular, adverbial ordinal numerals
([bg] първо / parvo “for the first time”) are tagged ADV.
The mapped tag, Monsi, denotes the neuter singular indefinite form of an ordinal numeral.
As a result, such forms are ambiguous with the class of adjectives (ADJ).
Another adverbial numeral tag that maps to ADV is Md#:
Examples
- много / mnogo “very”
- малко / malko “little”
Note that the symbol `#`, used in the Universal POS section, is a placeholder for an arbitrary number of features. These features are suppressed in the respective tag because they are irrelevant when the BulTreeBank tagset is mapped to the universal one.
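One way to read the `#` placeholder is as a trailing wildcard over the feature positions of a tag. The Python sketch below illustrates that reading for the two numeral tags mentioned above. The helper names `matches` and `may_map_to_adv` are hypothetical, and treating `#` as "zero or more trailing characters" is an assumption about the notation.

```python
def matches(pattern, tag):
    """Check a BulTreeBank tag against a pattern with an optional final '#'.

    Assumption: '#' stands for zero or more trailing feature characters.
    """
    if pattern.endswith("#"):
        return tag.startswith(pattern[:-1])
    return tag == pattern


# Numeral tag patterns that the text maps to ADV.
ADV_NUMERAL_PATTERNS = ["Monsi", "Md#"]


def may_map_to_adv(tag):
    """True if the tag matches one of the adverbial numeral patterns.

    For Monsi the tag alone does not settle the ADV/ADJ ambiguity
    noted in the text; this only reports that ADV is a candidate.
    """
    return any(matches(p, tag) for p in ADV_NUMERAL_PATTERNS)
```

For example, `matches("Md#", "Mdxx")` holds for a made-up tag `Mdxx` that starts with `Md`, while `matches("Monsi", "Monsd")` does not, because a pattern without `#` must match exactly.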
Treebank Statistics (UD_Bulgarian)
There are 675 ADV lemmas (4%), 771 ADV types (3%) and 6558 ADV tokens (4%).
Out of 16 observed tags, the rank of ADV is: 5 in number of lemmas, 5 in number of types and 8 in number of tokens.
The 10 most frequent ADV lemmas: много, още, вчера, само, така, когато, вече, там, защото, обаче
The 10 most frequent ADV types: още, много, вчера, само, вече, когато, защото, обаче, сега, как
The 10 most frequent ambiguous lemmas: обаче (ADV 161, CONJ 1), сега (ADV 130, PROPN 3), малко (ADV 80, PROPN 1), следобед (ADV 8, NOUN 1), случайно (ADV 6, ADJ 1), технически (ADJ 15, ADV 4), учудвам-(се) (ADV 4, VERB 1), истински (ADJ 29, ADV 3), политически (ADJ 107, ADV 3), преди (ADP 165, ADV 3)
The 10 most frequent ambiguous types: само (ADV 187, ADJ 1), ясно (ADV 45, ADJ 8), малко (ADV 44, ADJ 2), особено (ADV 27, ADJ 3), достатъчно (ADV 23, ADJ 1), просто (ADV 21, ADJ 1), трудно (ADV 20, ADJ 1), бързо (ADV 24, ADJ 8), възможно (ADV 17, ADJ 10), очевидно (ADV 12, ADJ 1)
Morphology
The form / lemma ratio of ADV is 1.142222 (the average of all parts of speech is 1.728233).
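The reported ratio is consistent with dividing the number of ADV types by the number of ADV lemmas given in the Treebank Statistics section. The short check below assumes that this is how the figure was computed; the variable names are illustrative.

```python
# Counts taken from the Treebank Statistics section.
adv_lemmas = 675
adv_types = 771

# Assumption: form / lemma ratio = distinct forms (types) per lemma.
ratio = adv_types / adv_lemmas
print(f"{ratio:.6f}")  # prints 1.142222
```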
The 1st highest number of forms (9) was observed with the lemma “там”: По-нататък, дотам, дотук, нататък, оттам, оттук, там, тук, тука.
The 2nd highest number of forms (6) was observed with the lemma “къде”: Отде, где, докъде, къде, накъде, откъде.
The 3rd highest number of forms (6) was observed with the lemma “малко”: Най-малкото, малко, малкото, най-малко, по-малко, по-малкото.
ADV occurs with 5 features: bg-feat/Degree (2304; 35% instances), bg-feat/PronType (1216; 19% instances), bg-feat/NumType (516; 8% instances), bg-feat/Definite (396; 6% instances), bg-feat/Number (396; 6% instances)
ADV occurs with 13 feature-value pairs: Definite=Def, Definite=Ind, Degree=Cmp, Degree=Pos, Degree=Sup, NumType=Card, Number=Plur, PronType=Dem, PronType=Ind, PronType=Int, PronType=Neg, PronType=Rel, PronType=Tot
ADV occurs with 22 feature combinations.
The most frequent feature combination is _ (2759 tokens).
Examples: още, вчера, само, защото, обаче, вече, сега, днес, все, пак
Relations
ADV nodes are attached to their parents using 18 different relations: bg-dep/advmod (5695; 87% instances), bg-dep/dobj (269; 4% instances), bg-dep/root (218; 3% instances), bg-dep/conj (97; 1% instances), bg-dep/mark (73; 1% instances), bg-dep/cc (65; 1% instances), bg-dep/ccomp (28; 0% instances), bg-dep/mwe (27; 0% instances), bg-dep/nsubj (25; 0% instances), bg-dep/advcl (17; 0% instances), bg-dep/nmod (14; 0% instances), bg-dep/acl (8; 0% instances), bg-dep/csubj (6; 0% instances), bg-dep/xcomp (6; 0% instances), bg-dep/iobj (5; 0% instances), bg-dep/nsubjpass (3; 0% instances), bg-dep/amod (1; 0% instances), bg-dep/goeswith (1; 0% instances)
Parents of ADV nodes belong to 13 different parts of speech: VERB (4071; 62% instances), NOUN (994; 15% instances), ADJ (590; 9% instances), ADV (452; 7% instances), ROOT (218; 3% instances), ADP (72; 1% instances), PRON (44; 1% instances), DET (41; 1% instances), NUM (37; 1% instances), PROPN (30; 0% instances), CONJ (5; 0% instances), INTJ (2; 0% instances), PART (2; 0% instances)
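Counts like the ones above can be reproduced from a treebank in CoNLL-U format, where each token line has ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The sketch below is not the script that generated this page. The function name `adv_attachment_counts` is hypothetical, and `SAMPLE` is a made-up four-token sentence ("Той дойде вчера.", "He came yesterday."), not an excerpt from UD_Bulgarian.

```python
from collections import Counter

# Made-up toy sentence in CoNLL-U format (not taken from the treebank).
SAMPLE = (
    "1\tТой\tтой\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tдойде\tдойда\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tвчера\tвчера\tADV\t_\t_\t2\tadvmod\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)


def adv_attachment_counts(conllu_text):
    """Count relations and parent UPOS tags for all ADV tokens."""
    relations, parents = Counter(), Counter()
    sentence = []

    def flush():
        # Map token ID -> UPOS so that HEAD values can be resolved.
        upos_by_id = {row[0]: row[3] for row in sentence}
        for row in sentence:
            if row[3] == "ADV":
                relations[row[7]] += 1
                # HEAD 0 has no token line; report it as ROOT.
                parents[upos_by_id.get(row[6], "ROOT")] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()                      # blank line ends a sentence
        elif not line.startswith("#"):   # skip sentence-level comments
            cols = line.split("\t")
            if cols[0].isdigit():        # skip multiword and empty nodes
                sentence.append(cols)
    flush()                              # handle a missing final blank line
    return relations, parents


relations, parents = adv_attachment_counts(SAMPLE)
print(relations, parents)
```

On the toy sentence this yields one `advmod` relation and one `VERB` parent for the single ADV token вчера.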
5461 (83%) ADV nodes are leaves.
612 (9%) ADV nodes have one child.
182 (3%) ADV nodes have two children.
303 (5%) ADV nodes have three or more children.
The highest child degree of an ADV node is 9.
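As a quick consistency check, the four child-count groups above should add up to the 6558 ADV tokens reported in the Treebank Statistics section, and the percentages should follow from the raw counts. The snippet below verifies both; the dictionary keys are illustrative labels.

```python
total_adv_tokens = 6558  # from the Treebank Statistics section

# Number of ADV nodes by number of children.
by_children = {"0": 5461, "1": 612, "2": 182, "3+": 303}

# The groups partition the set of ADV tokens.
assert sum(by_children.values()) == total_adv_tokens

# Percentages rounded to whole numbers, as on this page.
shares = {k: round(100 * v / total_adv_tokens) for k, v in by_children.items()}
print(shares)  # {'0': 83, '1': 9, '2': 3, '3+': 5}
```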
Children of ADV nodes are attached using 20 different relations: bg-dep/punct (468; 21% instances), bg-dep/advmod (361; 16% instances), bg-dep/nmod (292; 13% instances), bg-dep/cop (262; 12% instances), bg-dep/mwe (169; 8% instances), bg-dep/csubj (114; 5% instances), bg-dep/cc (108; 5% instances), bg-dep/conj (106; 5% instances), bg-dep/nsubj (93; 4% instances), bg-dep/neg (63; 3% instances), bg-dep/discourse (39; 2% instances), bg-dep/mark (30; 1% instances), bg-dep/case (27; 1% instances), bg-dep/advcl (24; 1% instances), bg-dep/aux (19; 1% instances), bg-dep/expl (10; 0% instances), bg-dep/iobj (4; 0% instances), bg-dep/dobj (3; 0% instances), bg-dep/acl (2; 0% instances), bg-dep/det (2; 0% instances)
Children of ADV nodes belong to 15 different parts of speech: PUNCT (468; 21% instances), ADV (452; 21% instances), VERB (411; 19% instances), NOUN (327; 15% instances), CONJ (153; 7% instances), SCONJ (108; 5% instances), PRON (72; 3% instances), INTJ (66; 3% instances), PART (57; 3% instances), ADP (28; 1% instances), PROPN (25; 1% instances), ADJ (14; 1% instances), AUX (9; 0% instances), DET (3; 0% instances), NUM (3; 0% instances)