
nummod: numeric modifier

A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun with a quantity.

Jan snědl tři řízky . \n Jan ate three steaks .
nummod(řízky, tři)
nummod(steaks, three)

Agreement and government with Czech quantifiers

The morphological and syntactic behavior of Czech numerals is a complex matter. Small cardinal numerals jeden “one”, dva “two”, tři “three” and čtyři “four” agree with the counted noun in Case (jeden also agrees in Gender and Number; dva also agrees in Gender). They behave as if they modify the counted noun; they are similar to adjectives in this respect. Examples:

In PDT, these numerals are attached to their counted nouns as Atr (attribute). It is straightforward to convert such dependencies to nummod:

Jedno kotě spalo . \n One kitten slept .
nummod(kotě, Jedno)
nsubj(spalo, kotě)
punct(spalo, .-4)
nummod(kitten, One)
nsubj(slept, kitten)
punct(slept, .-9)

Larger cardinals behave differently. They require that the counted noun be in the genitive case; this indicates that they actually govern the noun. Such constructions are parallel to nouns modified by other noun phrases in genitive. The whole phrase (numeral + counted noun) behaves as a noun phrase in neuter gender and singular number (which is important for subject-verb agreement).

In PDT, these numerals are analyzed as heads of the counted nouns, which are attached to the numeral as Atr:

# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1   Pět     pět     NUM     _   Case=Nom                           3   Sb     _   Five
2   mužů    muž     NOUN    _   Case=Gen|Gender=Masc|Number=Plur   1   Atr    _   men
3   hrálo   hrát    VERB    _   Gender=Neut|Number=Sing            0   Pred   _   played
4   karty   karta   NOUN    _   Case=Acc|Gender=Fem|Number=Plur    3   Obj    _   cards
5   .       .       PUNCT   _   _                                  0   AuxK   _   .

There are both advantages and drawbacks to this solution. On the one hand, it reflects the agreement in case, gender and number well. On the other hand, it is confusing to have two different analyses of counted noun constructions, depending on the numeric value.

Moreover, the numeral does not govern the noun in all morphological cases. The following table shows the case of the whole phrase (numeral + noun; first column) and the consequences for the case of the parts (note that these numerals have only two distinct morphological forms, resulting in homonymy).

Phrase Case   Example        Numeral Case   Noun Case
Nom           pět mužů       Nom            Gen
Gen           pěti mužů      Gen            Gen
Dat           pěti mužům     Dat            Dat
Acc           pět mužů       Acc            Gen
Voc           pět mužů       Voc            Gen
Loc           pěti mužích    Loc            Loc
Ins           pěti muži      Ins            Ins

We can say that the noun has the case of the whole phrase if it is dative, locative or instrumental. The numeral then agrees with the noun in case. The numeral forces the noun to the genitive case if the whole phrase is nominative, accusative or vocative (but the vocative usage is rather hypothetical). In genitive, the noun and the numeral agree with each other; but note that the numeral uses its inflected form, as in the other cases where it agrees with the noun.
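The pattern in the table can be written down as a small lookup. This is only an illustrative sketch: the function names `parts_case` and `numeral_form` are hypothetical, and the example covers only the numeral pět shown in the table.

```python
# Cases of the whole phrase in which the numeral forces the noun into genitive.
GOVERNING_CASES = {"Nom", "Acc", "Voc"}
# Cases of the whole phrase in which the numeral agrees with the noun.
AGREEING_CASES = {"Gen", "Dat", "Loc", "Ins"}


def parts_case(phrase_case):
    """Return (numeral_case, noun_case, numeral_governs) for a phrase case."""
    if phrase_case in GOVERNING_CASES:
        return phrase_case, "Gen", True
    if phrase_case in AGREEING_CASES:
        return phrase_case, phrase_case, False
    raise ValueError(f"unknown case: {phrase_case}")


def numeral_form(phrase_case):
    """Pick one of the two forms of 'pět': bare 'pět' or inflected 'pěti'."""
    _, _, governs = parts_case(phrase_case)
    return "pět" if governs else "pěti"


for case in ["Nom", "Gen", "Dat", "Acc", "Voc", "Loc", "Ins"]:
    print(case, numeral_form(case), parts_case(case))
```

The bare form pět coincides with exactly those cases in which the numeral governs the noun, which is why a single boolean is enough to choose the form here.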

In PDT, the genitive, dative, locative and instrumental cases are analyzed in parallel to the low-value numerals, i.e. the noun governs the numeral:

# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1   Hrál      hrát    VERB    _   Gender=Masc|Number=Sing            0   Pred   _   He-played
2   karty     karta   NOUN    _   Case=Acc|Gender=Fem|Number=Plur    1   Obj    _   cards
3   s         s       ADP     _   _                                  1   AuxP   _   with
4   pěti      pět     NUM     _   Case=Ins                           6   Atr    _   five
5   dalšími   další   ADJ     _   Case=Ins|Gender=Masc|Number=Plur   6   Atr    _   other
6   muži      muž     NOUN    _   Case=Ins|Gender=Masc|Number=Plur   3   Obj    _   men
7   .         .       PUNCT   _   _                                  0   AuxK   _   .

High-value numerals where the lowest-order digit is more than zero and less than five (e.g. 21, 22, 23, 24) may behave both ways:

Pronominal quantifiers behave as high-value numerals and govern the quantified nouns:

# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1   Kolik   kolik   NUM     _   Case=Nom                           3   Sb     _   How-many
2   mužů    muž     NOUN    _   Case=Gen|Gender=Masc|Number=Plur   1   Atr    _   men
3   hrálo   hrát    VERB    _   Gender=Neut|Number=Sing            0   Pred   _   played
4   karty   karta   NOUN    _   Case=Acc|Gender=Fem|Number=Plur    3   Obj    _   cards
5   ?       ?       PUNCT   _   _                                  0   AuxK   _   ?

The UD conversion of the PDT data unifies analyses of counted noun phrases and uses a structure that is parallel among all the above cases, and also with universal dependencies in other languages. The counted noun is always the head and the numeral is always attached as its modifier. Nevertheless, we use different relation labels to mark situations where the numeral (or quantifier) actually governs the morphological case of the noun. There are four labels used:

                  Numeric      Pronominal
Noun governs      nummod       det:nummod
Numeral governs   nummod:gov   det:numgov

Tři muži hráli karty . \n Three men played cards .
nummod(muži, Tři)
nsubj(hráli, muži)
dobj(hráli, karty)
punct(hráli, .-5)
nummod(men, Three)
nsubj(played, men)
dobj(played, cards)
punct(played, .-11)

Pět mužů hrálo karty . \n Five men played cards .
nummod:gov(mužů, Pět)
nsubj(hrálo, mužů)
dobj(hrálo, karty)
punct(hrálo, .-5)
nummod:gov(men, Five)
nsubj(played, men)
dobj(played, cards)
punct(played, .-11)

Kolik mužů hrálo karty ? \n How-many men played cards ?
det:numgov(mužů, Kolik)
nsubj(hrálo, mužů)
dobj(hrálo, karty)
punct(hrálo, ?-5)
det:numgov(men, How-many)
nsubj(played, men)
dobj(played, cards)
punct(played, ?-11)

Hrál jsem karty s pěti muži . \n Played I-have cards with five men .
aux(Hrál, jsem)
dobj(Hrál, karty)
iobj(Hrál, muži)
case(muži, s)
nummod(muži, pěti)
punct(Hrál, .-7)
aux(Played, I-have)
dobj(Played, cards)
iobj(Played, men)
case(men, with)
nummod(men, five)
punct(Played, .-15)

Nepamatuji si , s kolika muži jsem hrál karty . \n I-do-not-remember myself , with how-many men I-have played cards .
ccomp(Nepamatuji, hrál)
compound:reflex(Nepamatuji, si)
punct(hrál, ,-3)
aux(hrál, jsem)
dobj(hrál, karty)
iobj(hrál, muži)
case(muži, s)
det:nummod(muži, kolika)
punct(Nepamatuji, .-10)
ccomp(I-do-not-remember, played)
compound:reflex(I-do-not-remember, myself)
punct(played, ,-14)
aux(played, I-have)
dobj(played, cards)
iobj(played, men)
case(men, with)
det:nummod(men, how-many)
punct(I-do-not-remember, .-21)
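The choice among the four labels, and the re-heading of a PDT-style tree in which the numeral governs the noun, might be sketched as follows. This is not the actual conversion procedure, only a minimal illustration under stated assumptions: the `Token` class, the `pronominal` flag, and the functions `ud_label` and `reattach` are hypothetical names; PDT labels such as Sb are left untranslated; and other dependents of the numeral are not moved.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int
    form: str
    upos: str
    case: str
    head: int
    deprel: str
    pronominal: bool = False  # hypothetical flag, True for words like "kolik"


def ud_label(pronominal, numeral_governs):
    """Pick one of the four labels from the table of labels."""
    if pronominal:
        return "det:numgov" if numeral_governs else "det:nummod"
    return "nummod:gov" if numeral_governs else "nummod"


def reattach(tokens):
    """Make the counted noun the head wherever a numeral heads a genitive noun."""
    by_id = {t.id: t for t in tokens}
    for noun in tokens:
        num = by_id.get(noun.head)
        if (noun.upos == "NOUN" and noun.deprel == "Atr" and noun.case == "Gen"
                and num is not None and num.upos == "NUM"):
            # The noun takes over the numeral's attachment to the rest of the tree.
            noun.head, noun.deprel = num.head, num.deprel
            # The numeral becomes a modifier of the noun, marked as governing.
            num.head = noun.id
            num.deprel = ud_label(num.pronominal, numeral_governs=True)
    return tokens


# PDT-style analysis of "Pět mužů hrálo karty ." as given earlier on this page.
sentence = [
    Token(1, "Pět", "NUM", "Nom", 3, "Sb"),
    Token(2, "mužů", "NOUN", "Gen", 1, "Atr"),
    Token(3, "hrálo", "VERB", "", 0, "Pred"),
    Token(4, "karty", "NOUN", "Acc", 3, "Obj"),
    Token(5, ".", "PUNCT", "", 0, "AuxK"),
]
for tok in reattach(sentence):
    print(tok.id, tok.form, tok.head, tok.deprel)
```

After the call, mužů is attached to hrálo and Pět is attached to mužů as nummod:gov, matching the shape of the UD example for the same sentence.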

Additional remarks

In PDT the words milión “million”, miliarda “billion” and higher are usually tagged as nouns, not as numerals. In the typical case, the word for million is in the genitive, is preceded by a smaller number, and is not followed by smaller numerals (unlike in million five hundred thousand). It is followed by the counted noun. Thus the following examples receive parallel analyses:

50 miliónů korun \n 50 millions of-crowns
nummod:gov(miliónů, 50-1)
nummod:gov(millions, 50-5)
nmod(miliónů, korun)
nmod(millions, of-crowns)

50 pytlů bankovek \n 50 sacks of-bills
nummod:gov(pytlů, 50-1)
nummod:gov(sacks, 50-5)
nmod(pytlů, bankovek)
nmod(sacks, of-bills)

On the other hand the word tisíc “thousand” may be a noun (na náměstí byly tisíce lidí “there were thousands of people in the square”) or a numeral:

nanejvýš 50 tisíc korun \n at-most 50 thousand crowns
advmod:emph(korun, nanejvýš)
nummod:gov(korun, tisíc)
compound(tisíc, 50-2)
advmod:emph(crowns, at-most)
nummod:gov(crowns, thousand)
compound(thousand, 50-7)

Note that the two numeral words in the above example are joined using the compound relation. Also note that the intensifier nanejvýš is attached to the head of the phrase (korun) and not to the number. This is in accord both with the UD guidelines and with the original PDT annotation of agreeing numerals (e.g. jen čtyři firmy, jen několik procent).

Similarly, other nodes (such as punctuation) may be attached to the head of the phrase even though they relate to the whole phrase rather than directly to the head noun:

( 9 dní ) \n ( 9 days )
punct(dní, (-1)
nummod:gov(dní, 9-2)
punct(dní, )-4)
punct(days, (-6)
nummod:gov(days, 9-7)
punct(days, )-9)

5 minut včetně seřízení \n 5 minutes including adjustment
nummod:gov(minut, 5-1)
nmod(minut, seřízení)
case(seřízení, včetně)
nummod:gov(minutes, 5-6)
nmod(minutes, adjustment)
case(adjustment, including)

Dates

# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1    Ředitel         ředitel         NOUN    _   _   2   Sb     _   The-director
2    navrhl          navrhnout       VERB    _   _   0   Pred   _   proposed
3    zrušit          zrušit          VERB    _   _   2   Obj    _   to-disband
4    profesionální   profesionální   ADJ     _   _   5   Atr    _   the-professional
5    scénu           scéna           NOUN    _   _   3   Obj    _   scene
6    k               k               ADP     _   _   3   AuxP   _   towards
7    31              31              NUM     _   _   9   Atr    _   the-31
8    .               .               PUNCT   _   _   7   AuxG   _   th
9    12              12              NUM     _   _   6   Adv    _   December
10   .               .               PUNCT   _   _   9   AuxG   _   .

Ředitel navrhl zrušit profesionální scénu k 31 . 12 . \n Director proposed to-disband professional scene towards 31 st December .
advmod(zrušit, 12)
case(12, k)
punct(12, .-10)
nummod(12, 31-7)
punct(31-7, .-8)
advmod(to-disband, December)
case(December, towards)
punct(December, .-21)
nummod(December, 31-18)
punct(31-18, st)

Numerals expressed using digits are labeled nummod even if they represent ordinal numerals, which would otherwise be labeled amod:

# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1    Letošní     letošní      ADJ     _   _   2   Atr    _   This-year's
2    veletrh     veletrh      NOUN    _   _   4   Sb     _   fair
3    se          se           PRON    _   _   4   AuxR   _   itself
4    uskuteční   uskutečnit   VERB    _   _   0   Pred   _   will-take-place
5    od          od           ADP     _   _   4   AuxP   _   from
6    9           9            NUM     _   _   5   ExD    _   9
7    .           .            PUNCT   _   _   6   AuxG   _   th
8    do          do           ADP     _   _   4   AuxP   _   to
9    12          12           NUM     _   _   11  Atr    _   12
10   .           .            PUNCT   _   _   9   AuxG   _   th
11   března      březen       NOUN    _   _   8   Adv    _   March
12   .           .            PUNCT   _   _   0   AuxK   _   .

Letošní veletrh se uskuteční od 9 . do 12 . března . \n This-year's fair itself will-take-place from 9 th to 12 th March .
advmod(uskuteční, března)
case(března, do)
nummod(března, 12-9)
remnant(12-9, 9-6)
remnant(do, od)
advmod(will-take-place, March)
case(March, to)
nummod(March, 12-22)
remnant(12-22, 9-19)
remnant(to, from)
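The rule stated before this example, that digit-written numerals receive nummod even when they are read as ordinals, could be sketched as below. The function name `numeral_relation` and the `is_ordinal` flag are hypothetical, and the sketch ignores the nummod:gov distinction discussed earlier.

```python
def numeral_relation(form, is_ordinal):
    """Return the relation label for a numeral that modifies a noun.

    `is_ordinal` is an assumed input saying whether the numeral is read
    as an ordinal; it is not something this sketch tries to detect.
    """
    if form.isdigit():
        # Written in digits: nummod regardless of the ordinal reading.
        return "nummod"
    # Spelled out: ordinals behave like adjectives, cardinals stay nummod.
    return "amod" if is_ordinal else "nummod"


print(numeral_relation("12", is_ordinal=True))
print(numeral_relation("dvanáctého", is_ordinal=True))
print(numeral_relation("tři", is_ordinal=False))
```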

Numbered objects

A house number in an address is attached as nummod to the name of the street:

v budově Na poříčí 12 \n in the-building Na poříčí 12
nmod(budově, poříčí-4)
case(poříčí-4, Na-3)
nummod(poříčí-4, 12-5)
nmod(the-building, poříčí-10)
case(poříčí-10, Na-9)
nummod(poříčí-10, 12-11)

Treebank Statistics (UD_Czech)

This relation is universal. There is 1 language-specific subtype of nummod: nummod:gov.

19668 nodes (1%) are attached to their parents as nummod.

11411 instances of nummod (58%) are right-to-left (child precedes parent). The average distance between parent and child is 1.58.

The following 11 pairs of parts of speech are connected with nummod: NOUN-NUM (17449; 89% instances), PROPN-NUM (1624; 8% instances), ADJ-NUM (260; 1% instances), SYM-NUM (152; 1% instances), NUM-NUM (99; 1% instances), PRON-NUM (30; 0% instances), CONJ-NUM (28; 0% instances), PUNCT-NUM (10; 0% instances), VERB-NUM (10; 0% instances), ADV-NUM (5; 0% instances), INTJ-NUM (1; 0% instances).
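The percentages above can be re-derived from the raw counts. The short check below is only a sketch: the dictionary and the helper `share` are hypothetical names, and the numbers are copied from this section.

```python
# Counts of parent-child part-of-speech pairs connected with nummod.
pair_counts = {
    "NOUN-NUM": 17449,
    "PROPN-NUM": 1624,
    "ADJ-NUM": 260,
    "SYM-NUM": 152,
    "NUM-NUM": 99,
    "PRON-NUM": 30,
    "CONJ-NUM": 28,
    "PUNCT-NUM": 10,
    "VERB-NUM": 10,
    "ADV-NUM": 5,
    "INTJ-NUM": 1,
}

total = sum(pair_counts.values())  # should match the 19668 nummod nodes
right_to_left = 11411              # instances where the child precedes the parent


def share(count, total):
    """Percentage of `count` in `total`, rounded to a whole number."""
    return round(100 * count / total)


print("total:", total)
print("right-to-left share:", share(right_to_left, total))
for pair, count in pair_counts.items():
    print(pair, count, share(count, total))
```

The pair counts add up to the 19668 nodes reported for the relation, so the 11 pairs cover every nummod instance in the treebank.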

# visual-style 3	bgColor:blue
# visual-style 3	fgColor:white
# visual-style 1	bgColor:blue
# visual-style 1	fgColor:white
# visual-style 1 3 nummod	color:blue
1	Obrázek	obrázek	NOUN	NNIS1-----A----	Animacy=Inan|Case=Nom|Gender=Masc|Negative=Pos|Number=Sing	0	root	_	SpaceAfter=No
2	:	:	PUNCT	Z:-------------	_	3	punct	_	_
3	3	3	NUM	C=-------------	NumForm=Digit|NumType=Card	1	nummod	_	_

# visual-style 9	bgColor:blue
# visual-style 9	fgColor:white
# visual-style 7	bgColor:blue
# visual-style 7	fgColor:white
# visual-style 7 9 nummod	color:blue
1	Výrobce	výrobce	NOUN	NNMS1-----A----	Animacy=Anim|Case=Nom|Gender=Masc|Negative=Pos|Number=Sing	0	root	_	_
2	-	-	PUNCT	Z:-------------	_	1	punct	_	_
3	typ	typ	NOUN	NNIS1-----A----	Animacy=Inan|Case=Nom|Gender=Masc|Negative=Pos|Number=Sing	1	conj	_	SpaceAfter=No
4	:	:	PUNCT	Z:-------------	_	5	punct	_	_
5	PANASONIC	Panasonic	PROPN	NNIS1-----A----	Animacy=Inan|Case=Nom|Gender=Masc|NameType=Com,Pro|Negative=Pos|Number=Sing	3	nmod	_	_
6	PANAFAX	Panafax	PROPN	NNIS1-----A----	Animacy=Inan|Case=Nom|Gender=Masc|NameType=Pro|Negative=Pos|Number=Sing	5	nmod	_	_
7	UF	UF	PROPN	NNXXX-----A---8	Abbr=Yes|NameType=Pro|Negative=Pos	6	nmod	_	SpaceAfter=No|LId=UF-98
8	-	-	PUNCT	Z:-------------	_	9	punct	_	SpaceAfter=No
9	311	311	NUM	C=-------------	NumForm=Digit|NumType=Card	7	nummod	_	_

# visual-style 8	bgColor:blue
# visual-style 8	fgColor:white
# visual-style 9	bgColor:blue
# visual-style 9	fgColor:white
# visual-style 9 8 nummod	color:blue
1	The	the	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|Negative=Pos	11	foreign	_	LId=the-1|LGloss=(obv._souč._anglických_názvů,_urč._člen)
2	Black	Black	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|NameType=Com,Oth|Negative=Pos	11	foreign	_	_
3	Box	Box	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|NameType=Com,Oth|Negative=Pos	11	foreign	_	_
4	Summer	Summer	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|NameType=Oth|Negative=Pos	11	foreign	_	_
5	Festival	Festival	PROPN	NNIXX-----A----	Animacy=Inan|Foreign=Foreign|Gender=Masc|NameType=Oth|Negative=Pos	11	foreign	_	_
6	of	of	ADP	RR--X----------	AdpType=Prep|Foreign=Foreign	11	foreign	_	LId=of-1|LGloss=(obv._souč._anglických_názvů,_předl._2._p.)
7	Czech	Czech	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|Negative=Pos	11	foreign	_	LId=Czech-2
8	20	20	NUM	C=-------------	NumForm=Digit|NumType=Card	9	nummod	_	SpaceAfter=No
9	th	th	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|Negative=Pos	11	foreign	_	LId=th-2
10	Century	Century	ADJ	AAXXX----1A----	Degree=Pos|Foreign=Foreign|NameType=Oth|Negative=Pos	11	foreign	_	_
11	Plays	Plays	PROPN	NNFPX-----A----	Foreign=Foreign|Gender=Fem|NameType=Oth|Negative=Pos|Number=Plur	0	root	_	SpaceAfter=No
12	-	-	PUNCT	Z:-------------	_	11	punct	_	_

