
NumType: numeral type


Some languages (especially Slavic ones) have a complex system of numerals. For example, in the school grammar of Czech, “numeral” is a main part of speech that covers almost everything involving counting and has various subtypes. It also includes interrogative, relative, indefinite and demonstrative words referring to numbers (words like kolik / how many, tolik / so many, několik / some, a few), so such words may have a non-empty value of PronType at the same time. (In English, these words are called quantifiers and are considered a subgroup of determiners.)

In this respect Bulgarian behaves like Czech.

From the syntactic point of view, some numeral types behave like adjectives and some behave like adverbs. We tag them u-pos/ADJ and u-pos/ADV respectively. Thus the NumType feature applies to several different parts of speech; in UD_Bulgarian it occurs with NUM, ADJ and ADV.

Card: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word

Note that in some Indo-European languages there is a fuzzy borderline between numerals and nouns for thousand, million and billion.

Examples

Ord: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adjective.

Examples

Mult: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adverb.

Examples

Frac: fraction

This is a subtype of cardinal numbers, occasionally distinguished in corpora. It may denote a fraction or just the denominator of the fraction. In Bulgarian the numerator is a cardinal numeral and the denominator is an ordinal numeral.

Examples


Treebank Statistics (UD_Bulgarian)

This feature is universal. It occurs with 2 different values: Card, Ord.

3449 tokens (2%) have a non-empty value of NumType. 715 types (3%) occur at least once with a non-empty value of NumType. 548 lemmas (4%) occur at least once with a non-empty value of NumType. The feature is used with 3 part-of-speech tags: bg-pos/NUM (2038; 1% of instances), bg-pos/ADJ (895; 1% of instances), bg-pos/ADV (516; 0% of instances).
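Counts of this kind can be recomputed from a CoNLL-U file. The following is a minimal sketch, assuming the standard 10-column CoNLL-U layout with UPOS in column 4 and FEATS in column 6. The function names and the toy sample are hypothetical (the sample is invented for illustration and is not taken from UD_Bulgarian), and this is not necessarily how the published statistics were produced.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Number=Plur|NumType=Card' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def numtype_counts_by_upos(conllu_text):
    """Count tokens with a non-empty NumType, grouped by UPOS tag."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip malformed lines, multiword ranges and empty nodes
        if "NumType" in parse_feats(cols[5]):
            counts[cols[3]] += 1
    return counts


# Toy sample, invented for illustration only.
SAMPLE = "\n".join([
    "# toy sentence",
    "\t".join(["1", "два", "два", "NUM", "_",
               "Definite=Ind|Number=Plur|NumType=Card", "3", "nummod", "_", "_"]),
    "\t".join(["2", "първи", "първи", "ADJ", "_",
               "Definite=Ind|Number=Sing|NumType=Ord", "3", "amod", "_", "_"]),
    "\t".join(["3", "дни", "ден", "NOUN", "_",
               "Number=Plur", "0", "root", "_", "_"]),
])

for upos, n in sorted(numtype_counts_by_upos(SAMPLE).items()):
    print(upos, n)
```

Running the same function over the text of a real treebank file would give per-tag totals comparable to the figures listed above, provided the file follows the assumed layout.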

NUM

2038 bg-pos/NUM tokens (97% of all NUM tokens) have a non-empty value of NumType.

The most frequent other feature values with which NUM and NumType co-occurred: Definite=Ind (1908; 94%), Number=Plur (1810; 89%), Gender=EMPTY (1544; 76%).

NUM tokens may have the following values of NumType:

NumType seems to be a lexical feature of NUM. 100% of lemmas (407) occur with only one value of NumType.

ADJ

895 bg-pos/ADJ tokens (7% of all ADJ tokens) have a non-empty value of NumType.

The most frequent other feature values with which ADJ and NumType co-occurred: VerbForm=EMPTY (895; 100%), Aspect=EMPTY (895; 100%), Voice=EMPTY (895; 100%), Number=Sing (836; 93%), Degree=EMPTY (712; 80%), Definite=Ind (663; 74%).

ADJ tokens may have the following values of NumType:

NumType seems to be a lexical feature of ADJ. 100% of lemmas (163) occur with only one value of NumType.
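The "lexical feature" observation for NUM and ADJ amounts to checking that each lemma appears with a single NumType value. A minimal sketch of that check follows, assuming tokens have already been reduced to (lemma, UPOS, NumType) triples; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def single_value_lemmas(tokens, upos):
    """Return (lemmas with exactly one NumType value, lemmas with any value)
    for the given UPOS tag. Tokens without NumType carry None and are ignored."""
    values = defaultdict(set)
    for lemma, tag, numtype in tokens:
        if tag == upos and numtype is not None:
            values[lemma].add(numtype)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Toy data, invented for illustration only.
TOKENS = [
    ("два", "NUM", "Card"),
    ("два", "NUM", "Card"),
    ("три", "NUM", "Card"),
    ("първи", "ADJ", "Ord"),
    ("ден", "NOUN", None),
]

for tag in ("NUM", "ADJ"):
    single, total = single_value_lemmas(TOKENS, tag)
    print(f"{tag}: {single} of {total} lemmas have a single NumType value")
```

A feature would look lexical under this check when the two numbers are equal, as reported for NUM and ADJ above.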

ADV

516 bg-pos/ADV tokens (8% of all ADV tokens) have a non-empty value of NumType.

The most frequent other feature values with which ADV and NumType co-occurred: Degree=EMPTY (400; 78%), PronType=EMPTY (396; 77%).

ADV tokens may have the following values of NumType:

Relations with Agreement in NumType

The 10 most frequent relations where parent and child node agree in NumType: NUM –[mwe]–> NUM (52; 100%), NUM –[conj]–> NUM (35; 100%), NUM –[nmod]–> NUM (34; 100%), ADJ –[conj]–> ADJ (13; 87%).
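One plausible reading of these figures is: among parent-child pairs where both nodes carry NumType, count how often the values match, grouped by parent tag, relation and child tag. The sketch below implements that reading as an assumption. The token dictionaries, the function name and the toy sentence are hypothetical, and the exact definition used for the published numbers may differ.

```python
from collections import Counter


def numtype_agreement(sentence):
    """sentence: list of dicts with keys id, head, upos, deprel, numtype.
    Returns two Counters keyed by (parent UPOS, deprel, child UPOS):
    pairs that agree in NumType, and all pairs where both nodes have NumType."""
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])  # None for the root (head 0)
        if parent is None:
            continue
        if child["numtype"] is None or parent["numtype"] is None:
            continue
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["numtype"] == child["numtype"]:
            agree[key] += 1
    return agree, both


# Toy coordination of two cardinals, invented for illustration only.
SENTENCE = [
    {"id": 1, "head": 0, "upos": "NUM", "deprel": "root", "numtype": "Card"},
    {"id": 2, "head": 3, "upos": "CCONJ", "deprel": "cc", "numtype": None},
    {"id": 3, "head": 1, "upos": "NUM", "deprel": "conj", "numtype": "Card"},
]

agree, both = numtype_agreement(SENTENCE)
for key, total in both.items():
    parent_upos, deprel, child_upos = key
    share = 100 * agree[key] // total
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({agree[key]}; {share}%)")
```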


NumType in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]