NUM

This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.


`NUM`: numeral

Definition

A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.

Note that cardinal numerals are covered by NUM whether they are used as determiners or not (as in Windows 7) and whether they are expressed as words (четыре), digits (4) or Roman numerals (IV).
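The point that the surface form does not affect the tag can be illustrated with a small sketch. The function name `numeral_form` and the two regular expressions are assumptions made for this illustration; they are not part of the UD guidelines or of any UD tool. The sketch only describes how a token that is already tagged NUM is written.

```python
import re

# Strict Roman-numeral pattern (hypothetical helper, covers 1..3999).
_ROMAN = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
# Digits with an optional decimal part, e.g. 4, 2014, 3.14159265359.
_DIGITS = re.compile(r"\d+([.,]\d+)?")

def numeral_form(token: str) -> str:
    """Return 'digits', 'roman' or 'word' for a token already tagged NUM."""
    if _DIGITS.fullmatch(token):
        return "digits"
    # The Roman pattern also matches the empty string, so require a non-empty token.
    if token and _ROMAN.fullmatch(token):
        return "roman"
    return "word"

for tok in ["четыре", "4", "IV"]:
    print(tok, "NUM", numeral_form(tok))
```

All three tokens receive the same tag; only the reported surface form differs.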

Russian grammar distinguishes several subclasses of pronominal numerals (quantifiers): interrogative and relative (сколько “how many”); demonstrative (столько “this many”); indefinite (несколько “several”). These words behave similarly to (most) cardinal numbers, e.g. they require that the counted noun phrase be in Genitive. They are not similar to adjectives (unlike their English counterparts).

In addition, several types of (non-pronominal) numerals, such as ordinal numerals and multiplicative numerals, are tagged ADJ or ADV, based on their syntactic and morphological behavior.

Examples

0, 1, 2, 3, 4, 5, 2014, 1000000, 3.14159265359
I, II, III, IV, V, MMXIV
один, два, три, четыре, пять, семьдесят “one, two, three, four, five, seventy”
половина, треть, четверть “one half, one third, one quarter”: denominators of fractions constitute a separate class of cardinal numerals.
двое, трое, четверо, пятеро “two, three, four, five (as a group)”: collective numerals (see specific-syntax on their morphosyntactic behavior).
сколько, столько, предостаточно “how many, this many, more than enough”: pronominal quantifiers of imprecise quantity.

Counterexamples

первый, второй, третий “first, second, third”: adjectival ordinal numerals. They are tagged ADJ, and the ru-feat/NumType feature reveals their semantic relation to numbers.
впервые “for the first time”: adverbial ordinal numerals. They are tagged ADV, and the ru-feat/NumType feature reveals their semantic relation to numbers.
однажды, дважды, трижды “once, twice, three times”: multiplicative numerals. They are tagged ADV, and the ru-feat/NumType feature reveals their semantic relation to numbers.
пара, тройка, четверка “pair, triplet, foursome”: n-tuples (n-tice) are not considered numerals in Russian grammar. They are tagged NOUN.
единица, двойка, тройка, четверка, пятерка “number one, number two, number three, number four, number five”: names of numbers, or of objects identified by the number (e.g. of a bus route). They are not considered numerals and they are tagged NOUN.

Border cases

тысяча, миллион, миллиард, триллион “thousand, million, billion, trillion”: words for large quantities are ambiguous between cardinal numerals (tagged NUM) and nouns. If they inflect as nouns, they are tagged NOUN, but the boundary is fuzzy. For instance, in phrases like тысячи людей вышли на улицы (“thousands of people took to the streets”), тысячи is a noun. In numeric expressions, e.g. 110 тысяч долларов (“110 thousand dollars”), тысяч is a cardinal numeral.
много, мало, немного, немало, несколько, достаточно “many, few, not many, a lot, several, enough”: pronominal quantifiers are ambiguous between cardinal numerals (tagged NUM when they refer to imprecise quantities) and adverbs (tagged ADV when they refer to degree/intensity). As a rule, the latter have verbs, adjectives, and adverbs as their head (e.g. я был несколько груб “I was a bit rude”). Note that the words более, больше, менее, меньше “more than, less than” are considered comparative forms of the numerals много and мало when they are used in constructions with cardinal numerals, e.g. более пяти студентов “more than five students” (see specific-syntax).
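The rule of thumb for pronominal quantifiers (a verb, adjective or adverb as head points to ADV) can be sketched as follows. The name `quantifier_tag` and the set `DEGREE_HEADS` are hypothetical and exist only for this illustration. Actual annotation also depends on whether the word expresses quantity or degree, so this is a heuristic, not the annotation procedure.

```python
# Head tags that, as a rule, indicate the degree/intensity (ADV) reading.
DEGREE_HEADS = {"VERB", "ADJ", "ADV"}

def quantifier_tag(head_upos: str) -> str:
    """Guess NUM vs. ADV for a word like несколько from the tag of its head."""
    return "ADV" if head_upos in DEGREE_HEADS else "NUM"

# я был несколько груб: the head груб is an adjective.
print(quantifier_tag("ADJ"))
# несколько университетов: the head университетов is a noun.
print(quantifier_tag("NOUN"))
```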

Treebank Statistics (UD_Russian)

There are 683 NUM lemmas (4%), 728 NUM types (2%) and 2028 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 9 in number of tokens.
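Counts of this kind can be reproduced from a treebank file in CoNLL-U format, where each token line has ten tab-separated columns (word form in column 2, lemma in column 3, universal POS tag in column 4). The sketch below is a minimal illustration: the function name `count_tag` is hypothetical, the rows are invented rather than taken from the treebank, and it assumes that a "type" is a distinct word form exactly as written.

```python
def count_tag(conllu_text, tag="NUM"):
    """Return (lemmas, types, tokens) for one universal POS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines and multiword ranges such as "1-2".
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == tag:       # column 4: universal POS tag
            tokens += 1
            types.add(cols[1])   # column 2: word form
            lemmas.add(cols[2])  # column 3: lemma
    return len(lemmas), len(types), tokens

# Toy rows, invented for illustration only.
ROWS = [
    ("1", "два", "ДВА", "NUM", "_", "Case=Nom", "2", "nummod:gov", "_", "_"),
    ("2", "года", "ГОД", "NOUN", "_", "Case=Gen", "0", "root", "_", "_"),
    ("3", "двух", "ДВА", "NUM", "_", "Case=Gen", "4", "nummod", "_", "_"),
    ("4", "лет", "ГОД", "NOUN", "_", "Case=Gen", "2", "conj", "_", "_"),
    ("5", "два", "ДВА", "NUM", "_", "Case=Acc", "2", "conj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

print(count_tag(SAMPLE))  # (1, 2, 3): one lemma, two types, three tokens
```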

The 10 most frequent NUM lemmas: ОДИН, ДВА, НЕСКОЛЬКО, ТРИ, 2, 1, 10, ЧЕТЫРЕ, 4, 3

The 10 most frequent NUM types: 2, два, один, несколько, 1, двух, 10, 4, три, 3

The 10 most frequent ambiguous lemmas: ОДИН (NUM 185, ADV 1), НЕСКОЛЬКО (NUM 68, ADV 5), ТРИ (NUM 58, ADV 1), 2 (NUM 55, ADV 23, ADJ 9), 1 (NUM 43, ADJ 33, ADV 19), 10 (NUM 40, ADJ 14, ADV 8), 4 (NUM 35, ADJ 14, ADV 13), 3 (NUM 31, ADV 13, ADJ 8), 5 (NUM 29, ADJ 9, ADV 5), МНОГО (NUM 29, ADV 9)

The 10 most frequent ambiguous types: 2 (NUM 55, ADV 23, ADJ 9), один (NUM 42, ADV 1), несколько (NUM 41, ADV 5), 1 (NUM 43, ADJ 33, ADV 19), 10 (NUM 40, ADJ 14, ADV 8), 4 (NUM 35, ADJ 14, ADV 13), три (NUM 29, ADV 1), 3 (NUM 30, ADV 13, ADJ 8), 5 (NUM 29, ADJ 9, ADV 5), 20 (NUM 24, ADJ 12, ADV 11)

2
- NUM 55: Запирание осуществляется поворотом затвора на 2 боевых упора .
- ADV 23: Население Новогригоровки составляет более 2 - х тысяч человек .
- ADJ 9: Он умер в Каннах 2 февраля 1886 года .
один
- NUM 42: Годиноция – один из самых ранних известных полуобезьян .
- ADV 1: Астольф один вызывает на бой все войско татарского царя 2 миллиона 200 тысяч .
несколько
- NUM 41: В городе имеется несколько университетов , музеев , картинных галерей .
- ADV 5: Естественная реакция на несколько медлительную стратегию черных – 3. е4 .
1
- NUM 43: На расстоянии в 1 км расположено село Поповка .
- ADJ 33: 1 декабря 1923 произведен в лейтенанты 3 - го артиллерийского полка .
- ADV 19: Награждён орденом Святой Анны 1 - й степени .
10
- NUM 40: Готовая фигурка покрывается золотом 999,9 - й пробы толщиной 10 микрон .
- ADJ 14: 10 декабря в Перми прошла несогласованная акция против итогов выборов .
- ADV 8: С 10 июня по 29 ноября 1940 года командовал учебной подлодкой U - 10 .
4
- NUM 35: За сборную Аргентины он провёл 4 матча и забил 2 гола .
- ADJ 14: Первый эпизод вышел 4 августа 2012 .
- ADV 13: В свою очередь , это выражение восходит к тексту 4 - го псалма .
три
- NUM 29: Сериал продержался три сезона и транслировался каналом Sat. 1 .
- ADV 1: Иногда применяется , в неофициальной обстановке , сокращённое название `` кап - три '' .
3
- NUM 30: Мы потратили на это 2 или 3 года .
- ADV 13: 1 декабря 1923 произведен в лейтенанты 3 - го артиллерийского полка .
- ADJ 8: Один из главных организаторов переворота 3 апреля 1984 года , член Военного комитета национального возрождения .
5
- NUM 29: Фильмы на канале KkcTebou TV Channel выделяют 5 направлений :
- ADJ 9: 5 июля германские войска атаковали советские войска в НОВУРе , но успеха не имели .
- ADV 5: Остальные модификации предлагались с 5 - ступенчатыми ручными или 4 - ступенчатыми автоматическими коробками передач .
20
- NUM 24: Вторая глава в сборнике `` Воспоминания '' , 20 страниц .
- ADJ 12: Томас Стюарт Бейкер ( род. 20 января 1934 ) – английский актёр .
- ADV 11: Весна 20 - го года .

Morphology

The form / lemma ratio of NUM is 1.065886 (the average of all parts of speech is 1.591757).
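The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given in the statistics of each treebank. The short check below uses only figures stated on this page; the function name `form_lemma_ratio` is an assumption for the illustration.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas

# UD_Russian: 728 NUM types, 683 NUM lemmas.
print(round(form_lemma_ratio(728, 683), 6))    # 1.065886
# UD_Russian-SynTagRus: 1314 NUM types, 1220 NUM lemmas.
print(round(form_lemma_ratio(1314, 1220), 6))  # 1.077049
```

A value close to 1 means that most NUM lemmas occur in a single form, which fits the large share of numerals written as digits.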

The 1st highest number of forms (10) was observed with the lemma “ОДИН”: один, одна, одним, одних, одно, одного, одной, одном, одному, одну.

The 2nd highest number of forms (5) was observed with the lemma “ДВА”: два, две, двум, двумя, двух.

The 3rd highest number of forms (5) was observed with the lemma “МНОГО”: более, больше, многим, многих, много.

NUM occurs with 5 features: Case (2026; 100% instances), Animacy (1013; 50% instances), Gender (601; 30% instances), Number (316; 16% instances), Degree (2; 0% instances)
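Feature counts like these come from the FEATS column of CoNLL-U, where a token's features are written as `Name=Value` pairs joined by `|`, and `_` marks a token without features. The following sketch shows one way to tally them; `feature_counts` is a hypothetical helper and the input strings are invented examples.

```python
from collections import Counter

def feature_counts(feats_values):
    """Count how many tokens carry each feature name."""
    counts = Counter()
    for feats in feats_values:
        if feats == "_":  # token has no morphological features
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            counts[name] += 1
    return counts

# Toy FEATS strings of four NUM tokens, invented for illustration.
toy = ["Case=Nom", "Animacy=Inan|Case=Acc", "Case=Gen|Gender=Fem", "_"]
counts = feature_counts(toy)
for name, n in counts.most_common():
    print(f"{name} ({n}; {round(100 * n / len(toy))}% instances)")
```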

NUM occurs with 14 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing

NUM occurs with 82 feature combinations. The most frequent feature combination is Case=Nom (470 tokens). Examples: один, 1, 10, 2, 5, два, 0, несколько, 16, 4

Relations

NUM nodes are attached to their parents using 20 different relations: nummod:gov (845; 42% instances), nummod (520; 26% instances), nummod:entity (125; 6% instances), nmod (92; 5% instances), root (83; 4% instances), list (68; 3% instances), appos (64; 3% instances), conj (57; 3% instances), compound (53; 3% instances), amod (26; 1% instances), nsubj (21; 1% instances), dobj (17; 1% instances), parataxis (15; 1% instances), advmod (11; 1% instances), goeswith (10; 0% instances), remnant (9; 0% instances), nsubjpass (5; 0% instances), iobj (4; 0% instances), acl (2; 0% instances), ccomp (1; 0% instances)
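The percentages in this list are shares of all NUM tokens, rounded to whole numbers. This can be checked with the counts stated above: the 20 relation counts add up to the 2028 NUM tokens of the treebank. The variable and function names in the sketch are illustrative assumptions.

```python
# Counts copied from the list of parent relations above.
parent_relations = {
    "nummod:gov": 845, "nummod": 520, "nummod:entity": 125, "nmod": 92,
    "root": 83, "list": 68, "appos": 64, "conj": 57, "compound": 53,
    "amod": 26, "nsubj": 21, "dobj": 17, "parataxis": 15, "advmod": 11,
    "goeswith": 10, "remnant": 9, "nsubjpass": 5, "iobj": 4, "acl": 2,
    "ccomp": 1,
}

total = sum(parent_relations.values())
print(total)  # 2028, the number of NUM tokens

def share(count: int) -> int:
    """Percentage of all NUM tokens, rounded to a whole number."""
    return round(100 * count / total)

print(share(parent_relations["nummod:gov"]))  # 42
print(share(parent_relations["goeswith"]))    # 0
```

This also explains entries shown as 0%: they are non-zero counts whose share rounds down.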

Parents of NUM nodes belong to 11 different parts of speech: NOUN (1562; 77% instances), VERB (102; 5% instances), NUM (92; 5% instances), ROOT (83; 4% instances), SYM (83; 4% instances), PROPN (63; 3% instances), ADJ (28; 1% instances), ADP (5; 0% instances), ADV (4; 0% instances), PRON (3; 0% instances), PUNCT (3; 0% instances)

1593 (79%) NUM nodes are leaves.

200 (10%) NUM nodes have one child.

108 (5%) NUM nodes have two children.

127 (6%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
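The child degree of a node is the number of tokens whose HEAD column points to it; a leaf has degree 0. The four groups above (1593 + 200 + 108 + 127) together cover all 2028 NUM tokens. A minimal sketch of the computation for one sentence follows; `child_degrees` and the toy tree are assumptions made for the illustration.

```python
from collections import Counter

def child_degrees(sentence, tag="NUM"):
    """sentence: list of (id, upos, head) triples for one tree.
    Return the number of children of every node tagged `tag`."""
    n_children = Counter(head for _id, _upos, head in sentence)
    return [n_children[i] for i, upos, _head in sentence if upos == tag]

# Toy tree, invented: node 1 (NUM) has two children, node 4 (NUM) is a leaf.
toy = [(1, "NUM", 0), (2, "NOUN", 1), (3, "PUNCT", 1), (4, "NUM", 2)]
print(child_degrees(toy))  # [2, 0]
```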

Children of NUM nodes are attached using 24 different relations: punct (338; 36% instances), nmod (155; 17% instances), nsubj (82; 9% instances), case (69; 7% instances), conj (57; 6% instances), advmod (51; 5% instances), cc (27; 3% instances), cop (27; 3% instances), discourse (27; 3% instances), goeswith (25; 3% instances), appos (12; 1% instances), neg (12; 1% instances), remnant (11; 1% instances), list (9; 1% instances), parataxis (8; 1% instances), nummod (7; 1% instances), amod (4; 0% instances), compound (4; 0% instances), nummod:gov (3; 0% instances), advcl (2; 0% instances), det (2; 0% instances), acl (1; 0% instances), dobj (1; 0% instances), iobj (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: PUNCT (340; 36% instances), NOUN (223; 24% instances), NUM (92; 10% instances), ADP (71; 8% instances), ADV (66; 7% instances), VERB (35; 4% instances), CONJ (27; 3% instances), PART (27; 3% instances), PROPN (15; 2% instances), PRON (12; 1% instances), SYM (12; 1% instances), ADJ (8; 1% instances), DET (6; 1% instances), AUX (1; 0% instances)

Treebank Statistics (UD_Russian-SynTagRus)

There are 1220 NUM lemmas (3%), 1314 NUM types (1%) and 16014 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 13 in number of tokens.

The 10 most frequent NUM lemmas: один, два, несколько, три, 1, 10, 20, 2, пять, четыре

The 10 most frequent NUM types: один, несколько, два, три, 1, одной, 10, двух, 20, 2

The 10 most frequent ambiguous lemmas: один (NUM 1927, ADJ 575, NOUN 1), несколько (NUM 739, ADV 90), 6 (NUM 145, NOUN 1), пол (NUM 79, NOUN 69, PROPN 16), 2005 (NUM 66, NOUN 1), i (NUM 22, PROPN 5, X 3), 2012 (NUM 21, NOUN 1), x (NUM 9, PUNCT 2, PROPN 2), v (NUM 5, PROPN 1)

The 10 most frequent ambiguous types: один (NUM 493, ADJ 98), несколько (NUM 531, ADV 82), одной (NUM 321, ADJ 79), 10 (NUM 324, ADJ 1), одного (NUM 192, ADJ 55), одна (NUM 154, ADJ 56), одно (NUM 131, ADJ 50), одним (NUM 125, ADJ 22), одну (NUM 109, ADJ 18), одном (NUM 112, ADJ 23)

один
- NUM 493: Соревнования могут проводиться очные и заочные , в один или два тура .
- ADJ 98: ( А вдруг именно он один и был “ к чему “ ? . . )
несколько
- NUM 531: Хотел написать несколько песен о полетах .
- ADV 82: Потом его симпатии несколько сместились .
одной
- NUM 321: Траверс одной вершины не классифицируется .
- ADJ 79: Но ведь коррупция - удел не одной лишь госбюрократии .
10
- NUM 324: Минимум - 10 % от общей стоимости квартиры .
- ADJ 1: Он первым опытным путём измерил плотность воздуха , которую Аристотель считал равной 1 / 10 плотности воды ; эксперимент Галилея дал значение 1 / 400 , что намного ближе к истинному значению ( около 1 / 770 ) .
одного
- NUM 192: - Вы редактор одного из самых авторитетных наших научных журналов .
- ADJ 55: Из-за одного процента мы подозреваем все сто .
одна
- NUM 154: И тут еще одна его неразгаданная тайна .
- ADJ 56: Зачастили одна , другая , пятая следовательские бригады .
одно
- NUM 131: Вот одно из ее писем племяннице мужа :
- ADJ 50: Общеизвестно , что это далеко не одно и то же “ .
одним
- NUM 125: Мальчик очень любил учиться и стал одним из лучших учеников в классе .
- ADJ 22: Но нельзя всех мазать одним миром .
одну
- NUM 109: Мы рассмотрели только одну характеристику драконов .
- ADJ 18: Они были подвижны и катились всегда в одну сторону .
одном
- NUM 112: А его портрет поместили в одном из парадных залов Королевского дворца .
- ADJ 23: Она шла на одном мордените .

Morphology

The form / lemma ratio of NUM is 1.077049 (the average of all parts of speech is 2.665758).

The 1st highest number of forms (11) was observed with the lemma “один”: один, одна, одни, одним, одними, одно, одного, одной, одном, одному, одну.

The 2nd highest number of forms (8) was observed with the lemma “оба”: оба, обе, обеим, обеими, обеих, обоим, обоими, обоих.

The 3rd highest number of forms (6) was observed with the lemma “три”: трем, тремя, трех, три, трём, трёх.

NUM occurs with 3 features: Case (6012; 38% instances), Gender (2875; 18% instances), Animacy (1420; 9% instances)

NUM occurs with 11 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut

NUM occurs with 31 feature combinations. The most frequent feature combination is _ (10002 tokens). Examples: 1, 10, 20, 2, 15, 5, 3, 30, 4, 100

Relations

NUM nodes are attached to their parents using 10 different relations: nummod (11751; 73% instances), nummod:gov (3536; 22% instances), nsubj (346; 2% instances), root (177; 1% instances), conj (138; 1% instances), nsubjpass (42; 0% instances), dep (10; 0% instances), advmod (9; 0% instances), advcl (4; 0% instances), acl (1; 0% instances)

Parents of NUM nodes belong to 13 different parts of speech: NOUN (11519; 72% instances), VERB (1480; 9% instances), NUM (1153; 7% instances), SYM (833; 5% instances), PROPN (564; 4% instances), ADJ (211; 1% instances), ROOT (177; 1% instances), ADV (48; 0% instances), PRON (14; 0% instances), SCONJ (7; 0% instances), X (4; 0% instances), CONJ (2; 0% instances), PART (2; 0% instances)

10887 (68%) NUM nodes are leaves.

2873 (18%) NUM nodes have one child.

1692 (11%) NUM nodes have two children.

562 (4%) NUM nodes have three or more children.

The highest child degree of a NUM node is 18.

Children of NUM nodes are attached using 22 different relations: punct (2443; 30% instances), nmod (1820; 22% instances), advmod (987; 12% instances), nummod (937; 11% instances), case (786; 10% instances), amod (334; 4% instances), conj (197; 2% instances), cc (162; 2% instances), nsubj (132; 2% instances), parataxis (131; 2% instances), nummod:gov (128; 2% instances), neg (49; 1% instances), appos (47; 1% instances), acl:relcl (19; 0% instances), mark (15; 0% instances), foreign (7; 0% instances), advcl (5; 0% instances), acl (2; 0% instances), compound (2; 0% instances), cop (2; 0% instances), mwe (2; 0% instances), iobj (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: PUNCT (2443; 30% instances), NOUN (1878; 23% instances), NUM (1153; 14% instances), ADP (785; 10% instances), ADV (611; 7% instances), PART (454; 6% instances), ADJ (411; 5% instances), CONJ (158; 2% instances), VERB (121; 1% instances), PRON (90; 1% instances), PROPN (64; 1% instances), SCONJ (22; 0% instances), SYM (17; 0% instances), X (1; 0% instances)
