nummod: numeric modifier
A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun with a quantity.
Agreement and government with Czech quantifiers
The morphological and syntactic behavior of Czech numerals is a complex matter. Small cardinal numerals jeden “one”, dva “two”, tři “three” and čtyři “four” agree with the counted noun in Case (jeden also agrees in Gender and Number; dva also agrees in Gender). They behave as if they modify the counted noun; they are similar to adjectives in this respect. Examples:
- Jeden muž spal, dva muži hráli karty. “One man slept, two men played cards.”
- Jedna žena spala, dvě ženy hrály karty. “One woman slept, two women played cards.”
- Jedno kotě spalo, dvě koťata si hrála. “One kitten slept, two kittens played.”
In PDT, these numerals are attached to their counted nouns as Atr (attribute). It is straightforward to convert such dependencies to nummod:
Larger cardinals behave differently. They require that the counted noun be in the genitive case; this indicates that they actually govern the noun. Such constructions are parallel to nouns modified by other noun phrases in genitive. The whole phrase (numeral + counted noun) behaves as a noun phrase in neuter gender and singular number (which is important for subject-verb agreement).
- Pět mužů hrálo karty. “Five men played cards.”
- Skupina mužů hrála karty. “A group of men played cards.”
In PDT, these numerals are analyzed as heads of the counted nouns, which are attached to the numeral as Atr:
There are both advantages and drawbacks to this solution. On the one hand, it accurately reflects the agreement in case, gender and number. On the other hand, it is confusing to have two different analyses of counted noun constructions, depending on the numeric value.
Moreover, the numeral does not govern the noun in all morphological cases. The following table shows the case of the whole phrase (numeral + noun; first column) and the consequences for the case of the parts (note that these numerals have only two distinct morphological forms, resulting in homonymy).
Phrase Case | Example | Numeral Case | Noun Case |
---|---|---|---|
Nom | pět mužů | Nom | Gen |
Gen | pěti mužů | Gen | Gen |
Dat | pěti mužům | Dat | Dat |
Acc | pět mužů | Acc | Gen |
Voc | pět mužů | Voc | Gen |
Loc | pěti mužích | Loc | Loc |
Ins | pěti muži | Ins | Ins |
We can say that the noun has the case of the whole phrase if it is dative, locative or instrumental. The numeral then agrees with the noun in case. The numeral forces the noun to the genitive case if the whole phrase is nominative, accusative or vocative (but the vocative usage is rather hypothetical). In genitive, the noun and the numeral agree with each other; but note that the numeral uses its inflected form, as in the other cases where it agrees with the noun.
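The rule in the preceding paragraph can be expressed as a small lookup. The following is a minimal sketch that only transcribes the table above; the names `PART_CASES`, `numeral_governs` and `numeral_form` are hypothetical and are not part of PDT or of any UD tool.

```python
# Case of each part of a phrase with a high-value numeral (pět and higher),
# keyed by the case of the whole phrase. Each value is
# (numeral case, noun case), transcribed from the table above.
PART_CASES = {
    "Nom": ("Nom", "Gen"),
    "Gen": ("Gen", "Gen"),
    "Dat": ("Dat", "Dat"),
    "Acc": ("Acc", "Gen"),
    "Voc": ("Voc", "Gen"),
    "Loc": ("Loc", "Loc"),
    "Ins": ("Ins", "Ins"),
}


def numeral_governs(phrase_case: str) -> bool:
    """True when the numeral forces the noun into the genitive.

    In the genitive both parts agree, so that case does not count
    as government.
    """
    numeral_case, noun_case = PART_CASES[phrase_case]
    return noun_case == "Gen" and numeral_case != "Gen"


def numeral_form(phrase_case: str) -> str:
    """Pick one of the two distinct forms of pět for the given phrase case."""
    return "pět" if numeral_governs(phrase_case) else "pěti"
```

For example, `numeral_governs("Acc")` is true and `numeral_form("Acc")` gives pět, while `numeral_form("Gen")` gives the inflected form pěti even though the noun is genitive in both cases.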
In PDT, the genitive, dative, locative and instrumental cases are analyzed in parallel to the low-value numerals, i.e. the noun governs the numeral:
High-value numerals where the lowest-order digit is more than zero and less than five (e.g. 21, 22, 23, 24) may behave both ways:
- dvacet dva muži (noun governs numeral)
- dvacet dva mužů (numeral governs noun)
- dvaadvacet mužů (alternative form; it does not end with dva, thus the numeral governs the noun)
- 22 muži (assuming the reader will pronounce 22 as dvacet dva, not dvaadvacet)
- 22 mužů (pronounced either way)
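The examples above suggest that the last word of the spelled-out numeral decides which analyses are possible. The following sketch illustrates that reading; it assumes a nominative or accusative phrase and a non-empty numeral written in words (digits such as 22 are ambiguous, as noted above). The names `SMALL_FORMS` and `possible_analyses` are hypothetical, and the set contains only the nominative forms of the four small numerals.

```python
from __future__ import annotations

# Nominative forms of jeden, dva, tři and čtyři (assumption: other case
# forms are out of scope for this sketch).
SMALL_FORMS = {"jeden", "jedna", "jedno", "dva", "dvě", "tři", "čtyři"}


def possible_analyses(numeral: str) -> set[str]:
    """Return the analyses available for a spelled-out numeral."""
    words = numeral.split()
    if words[-1] not in SMALL_FORMS:
        # pět, dvaadvacet, ...: the numeral governs the noun.
        return {"numeral governs"}
    if len(words) == 1:
        # jeden, dva, tři, čtyři on their own agree with the noun.
        return {"noun governs"}
    # dvacet dva and similar multi-word numerals may behave both ways.
    return {"noun governs", "numeral governs"}
```

With this sketch, `possible_analyses("dvacet dva")` returns both analyses, whereas `possible_analyses("dvaadvacet")` returns only the one in which the numeral governs the noun.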
Pronominal quantifiers behave as high-value numerals and govern the quantified nouns:
- Kolik mužů hrálo karty? “How many men played cards?”
- Několik (mnoho, málo) mužů hrálo karty. “Several (many, few) men played cards.”
- Tolik mužů hrát karty jsem ještě neviděl. “I have never seen so many men playing cards.”
The UD conversion of the PDT data unifies analyses of counted noun phrases and uses a structure that is parallel among all the above cases, and also with universal dependencies in other languages. The counted noun is always the head and the numeral is always attached as its modifier. Nevertheless, we use different relation labels to mark situations where the numeral (or quantifier) actually governs the morphological case of the noun. There are four labels used:
Governing element | Numeric | Pronominal |
---|---|---|
Noun governs | nummod | det:nummod |
Numeral governs | nummod:gov | det:numgov |
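The choice among the four labels can be sketched as a function of two properties: whether the quantifier is pronominal, and whether it governs the case of the noun. This is only an illustration of the table, not the actual conversion code; `ud_relation` and `attach_quantifier` are hypothetical names, and the dependency is represented as a plain (dependent, head, relation) triple.

```python
from __future__ import annotations


def ud_relation(pronominal: bool, numeral_governs: bool) -> str:
    """Select one of the four labels from the table above."""
    if pronominal:
        return "det:numgov" if numeral_governs else "det:nummod"
    return "nummod:gov" if numeral_governs else "nummod"


def attach_quantifier(
    quantifier: str, noun: str, pronominal: bool, numeral_governs: bool
) -> tuple[str, str, str]:
    """Return (dependent, head, relation) with the counted noun as head.

    The noun is the head regardless of which word governs the case;
    only the relation label records the government.
    """
    return (quantifier, noun, ud_relation(pronominal, numeral_governs))
```

Under these assumptions, `attach_quantifier("pět", "mužů", False, True)` yields `("pět", "mužů", "nummod:gov")`, `attach_quantifier("dva", "muži", False, False)` yields `("dva", "muži", "nummod")`, and `attach_quantifier("několik", "mužů", True, True)` yields `("několik", "mužů", "det:numgov")`.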
Additional remarks
In PDT the words milión “million”, miliarda “billion” and higher are usually tagged as nouns, not as numerals. In the typical case, the million is in genitive, it is preceded by a smaller number, and it is not followed by smaller numerals (as it is in million five hundred thousand). It is followed by the counted noun. Thus the following examples receive parallel analyses:
On the other hand, the word tisíc “thousand” may be a noun (na náměstí byly tisíce lidí “there were thousands of people in the square”) or a numeral:
Note that the two numeral words in the above example are joined using the compound relation. Also note that the intensifier nanejvýš is attached to the head of the phrase (korun) and not to the number. This is in accord both with the UD guidelines and with the original PDT annotation of agreeing numerals (e.g. jen čtyři firmy, jen několik procent).
Similarly, other nodes (such as punctuation) may be attached to the head of the phrase even though they relate to the whole phrase rather than directly to the head noun:
Dates
Numerals expressed using digits are labeled nummod even if they represent ordinal numerals, which would otherwise be labeled amod:
Numbered objects
A house number in an address is attached as nummod to the name of the street: