NumType
: numeral type
Czech has a complex system of numerals. In the school grammar of Czech, the main part of speech here is “numeral”; it covers almost every word that involves counting, and it has various subtypes. It also includes interrogative, relative, indefinite and demonstrative quantifiers (words like kolik “how many”, tolik “so many”, několik “several”), so such a word may have a non-empty value of PronType at the same time.
From the syntactic point of view, some numeral types behave like adjectives and some behave like adverbs. We tag them cs-pos/ADJ and cs-pos/ADV respectively. Thus the NumType feature applies to several different parts of speech:
- cs-pos/NUM: cardinal numerals
- cs-pos/DET: quantifiers
- cs-pos/ADJ: adjectival ordinal and some generic numerals
- cs-pos/ADV: adverbial (e.g. ordinal and multiplicative) numerals
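Because NumType can combine with different part-of-speech tags and with PronType, it can help to read the feature straight from the FEATS column. The following is a minimal sketch, assuming the standard CoNLL-U convention of `Name=Value` pairs separated by `|`, with multiple values separated by commas. The helper names `parse_feats` and `num_type` are hypothetical, and the feature strings in `EXAMPLES` are illustrative assumptions rather than quotations from the treebank.

```python
from typing import Dict, List, Optional


def parse_feats(feats: str) -> Dict[str, List[str]]:
    """Split a CoNLL-U FEATS string such as 'A=x|B=y,z' into a dict."""
    if feats in ("", "_"):  # "_" marks an empty FEATS column
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        parsed[name] = value.split(",")  # multiple values are comma-separated
    return parsed


def num_type(feats: str) -> Optional[str]:
    """Return the NumType value, or None if the feature is absent."""
    values = parse_feats(feats).get("NumType")
    return values[0] if values else None


# Illustrative (UPOS, FEATS) pairs; the exact feature strings are assumptions.
EXAMPLES = {
    "tři": ("NUM", "NumForm=Word|NumType=Card"),
    "kolik": ("DET", "NumType=Card|PronType=Int,Rel"),
    "první": ("ADJ", "NumType=Ord"),
    "dvakrát": ("ADV", "NumType=Mult"),
}

for form, (upos, feats) in EXAMPLES.items():
    print(form, upos, num_type(feats), parse_feats(feats).get("PronType"))
```

The sketch keeps the part-of-speech tag separate from the feature value, which mirrors the point made above: the same NumType value may appear under several tags.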
Card
: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word
Examples
- jeden, dva, tři “one, two, three”
- kolik “how many”
- několik “several”, mnoho “many”, málo “few”
- tolik “so many”
Ord
: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word
This is a subtype of adjective or adverb.
Adjectival examples
- první “first”; druhý “second”, třetí “third”
- kolikátý, lit. “how manieth”, i.e. “which rank”
- několikátý “some rank”
- tolikátý “this/that rank”
Adverbial examples
- poprvé “for the first time”; podruhé “for the second time”; potřetí “for the third time”
- pokolikáté “for which time”
- poněkolikáté “for the x-th time”
- potolikáté “it has been so many times”
Mult
: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word
This is a subtype of adverb.
Examples
- jednou “once”; dvakrát “twice”; třikrát “three times”
- kolikrát “how many times”
- několikrát “several times”
- tolikrát “so many times”
Frac
: fraction
This is a subtype of cardinal numbers. It may denote a fraction or just the denominator of the fraction.
Examples
- půl / polovina “half”; třetina “one third”; čtvrt / čtvrtina “quarter”
Sets
: number of sets of things
Morphologically distinct class of numerals used to count sets of things, or nouns that are pluralia tantum.
Examples
- dvoje / troje boty “two / three [pairs of] shoes”; as opposed to normal cardinal numbers: dvě / tři boty “two / three shoes”
Gen
: generic numeral, i.e. a numeral that is none of the above
Czech school grammar distinguishes this subclass, which is why it appears in Czech tagsets. (Note that “generic numerals” in Czech grammar also include the Sets subclass mentioned above.)
Examples
- čtvero, patero, desatero (specific forms of four, five, ten; they are morphologically, syntactically and stylistically distinct from the default forms čtyři, pět, deset)
- dvojí, trojí, čtverý (twofold, threefold, fourfold; these are morphologically and syntactically adjectives)
Treebank Statistics (UD_Czech)
This feature is universal.
It occurs with 6 different values: Card, Frac, Gen, Mult, Ord, Sets.
49212 tokens (3%) have a non-empty value of NumType.
4024 types (3%) occur at least once with a non-empty value of NumType.
3572 lemmas (6%) occur at least once with a non-empty value of NumType.
The feature is used with 5 part-of-speech tags: cs-pos/NUM (41510; 3% of instances), cs-pos/ADJ (4990; 0% of instances), cs-pos/DET (1552; 0% of instances), cs-pos/ADV (741; 0% of instances), cs-pos/PRON (419; 0% of instances).
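Counts of this kind can be approximated from a CoNLL-U file. The sketch below is a minimal illustration, not the script that produced the statistics on this page: it counts, per UPOS tag, how many tokens carry a non-empty NumType. The function `numtype_counts`, the helper `row` and the four-token `SAMPLE` are hypothetical; the sample is made up and does not come from the treebank.

```python
from collections import Counter


def row(idx, form, lemma, upos, feats, head, deprel):
    """Build one 10-column CoNLL-U token line (XPOS, DEPS, MISC left empty)."""
    return "\t".join([str(idx), form, lemma, upos, "_", feats,
                      str(head), deprel, "_", "_"])


# Hypothetical sample sentence, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = hypothetical sample",
    row(1, "tři", "tři", "NUM", "NumForm=Word|NumType=Card", 3, "nummod"),
    row(2, "nové", "nový", "ADJ", "Degree=Pos", 3, "amod"),
    row(3, "knihy", "kniha", "NOUN", "_", 0, "root"),
    row(4, "první", "první", "ADJ", "NumType=Ord", 3, "amod"),
])


def numtype_counts(conllu_text):
    """Return (tokens with NumType per UPOS, all tokens per UPOS)."""
    with_numtype, total = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        upos, feats = cols[3], cols[5]
        total[upos] += 1
        if any(f.startswith("NumType=") for f in feats.split("|")):
            with_numtype[upos] += 1
    return with_numtype, total


with_numtype, total = numtype_counts(SAMPLE)
for upos in sorted(total):
    print(upos, with_numtype[upos], "of", total[upos])
```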
NUM
41510 cs-pos/NUM tokens (100% of all NUM tokens) have a non-empty value of NumType.
The most frequent other feature values with which NUM and NumType co-occurred: Gender=EMPTY (36751; 89%), NumValue=EMPTY (33460; 81%), Case=EMPTY (29887; 72%), Number=EMPTY (29861; 72%), NumForm=Digit (29484; 71%).
NUM tokens may have the following values of NumType:
- Card (41168; 99% of non-empty NumType): 1, 2, 3, dva, tři, 4, jeden, 6, dvě, tisíc
- Frac (342; 1% of non-empty NumType): třetiny, třetinu, třetina, třetině, čtvrtinu, čtvrtina, desetinu, čtvrtiny, pětinu, desetina
NumType seems to be a lexical feature of NUM. 100% of lemmas (3436) occur with only one value of NumType.
ADJ
4990 cs-pos/ADJ tokens (3% of all ADJ tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADJ and NumType co-occurred: Degree=EMPTY (4990; 100%), Negative=EMPTY (4990; 100%), Number=Sing (4215; 84%), Animacy=EMPTY (3246; 65%).
ADJ tokens may have the following values of NumType:
- Gen (62; 1% of non-empty NumType): dvojí, obojí, dvojím, dvojího, obojím, trojí, dvojími, obého
- Ord (4889; 98% of non-empty NumType): první, druhé, prvním, třetí, druhý, druhou, prvních, prvního, druhá, druhém
- Sets (39; 1% of non-empty NumType): jedny, jedni, dvoje, jedněch, jedněm, oboje, jedněmi, obé, troje
- EMPTY (175821): další, české, nové, poslední, státní, dalších, možné, vlastní, jiné, každý
Paradigm dvojí | Sets | Gen |
---|---|---|
Animacy=Inan\|Case=Acc\|Gender=Masc\|Number=Sing | | dvojí |
Case=Acc\|Gender=Fem\|Number=Sing | | dvojí |
Case=Acc\|Gender=Neut\|Number=Sing | | dvojí |
Case=Acc\|Number=Plur | dvoje | |
Case=Gen\|Gender=Fem\|Number=Sing | | dvojí |
Case=Gen\|Gender=Neut\|Number=Sing | | dvojího |
Case=Ins\|Gender=Masc\|Number=Sing | | dvojím |
Case=Ins\|Gender=Fem\|Number=Sing | | dvojí |
Case=Ins\|Gender=Neut\|Number=Sing | | dvojím |
Case=Ins\|Number=Plur | | dvojími |
Case=Loc\|Gender=Neut\|Number=Sing | | dvojím |
Case=Nom\|Number=Sing | | dvojí |
NumType seems to be a lexical feature of ADJ. 96% of lemmas (64) occur with only one value of NumType.
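The "lexical feature" observation can be checked by grouping NumType values by lemma. The sketch below shows one plausible reading of that statistic, assuming that empty values are ignored; how the published figures treat EMPTY is not stated here. The function `single_value_lemmas` is hypothetical, and the input pairs are a toy list built from words mentioned on this page, with dvojí given both Gen and Sets as in its paradigm above.

```python
from collections import defaultdict


def single_value_lemmas(pairs):
    """Count lemmas seen with exactly one non-empty NumType value.

    `pairs` is an iterable of (lemma, numtype) tuples; numtype may be None.
    Returns (lemmas with a single value, lemmas with any non-empty value).
    """
    values_by_lemma = defaultdict(set)
    for lemma, numtype in pairs:
        if numtype:  # ignore tokens without NumType (an assumption)
            values_by_lemma[lemma].add(numtype)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


# Toy (lemma, NumType) observations; not real treebank counts.
PAIRS = [
    ("dva", "Card"), ("dva", "Card"),
    ("třetina", "Frac"),
    ("dvojí", "Gen"), ("dvojí", "Sets"),
    ("nový", None),
]

single, seen = single_value_lemmas(PAIRS)
print(f"{single} of {seen} lemmas occur with only one NumType value")
```

In this toy input, dvojí is the only lemma with two values, which is the kind of case that keeps the ADJ share below 100%.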
DET
1552 cs-pos/DET tokens (6% of all DET tokens) have a non-empty value of NumType.
The most frequent other feature values with which DET and NumType co-occurred: Number[psor]=EMPTY (1552; 100%), Gender[psor]=EMPTY (1552; 100%), Poss=EMPTY (1552; 100%), Reflex=EMPTY (1552; 100%), Person=EMPTY (1552; 100%), Gender=EMPTY (1542; 99%), Number=EMPTY (1542; 99%), PronType=Dem,Ind (1454; 94%).
DET tokens may have the following values of NumType:
- Card (1551; 100% of non-empty NumType): několik, několika, mnoho, mnoha, kolik, málo, tolik, mála, moc, tolika
- Ord (1; 0% of non-empty NumType): několikátý
- EMPTY (26261): jeho, jejich, své, této, její, tento, tohoto, svou, tato, těchto
NumType seems to be a lexical feature of DET. 100% of lemmas (13) occur with only one value of NumType.
ADV
741 cs-pos/ADV tokens (1% of all ADV tokens) have a non-empty value of NumType.
The most frequent other feature values with which ADV and NumType co-occurred: Degree=EMPTY (741; 100%), Negative=EMPTY (741; 100%).
ADV tokens may have the following values of NumType:
- Mult (413; 56% of non-empty NumType): dvakrát, jednou, třikrát, pětkrát, desetkrát, čtyřikrát, nejednou, šestkrát, jedenkrát, sedmkrát
- Ord (328; 44% of non-empty NumType): poprvé, podruhé, potřetí, počtvrté, pošesté, podvanácté, popáté, Pošestnácté, podesáté, podvaadvacáté
- EMPTY (79133): tak, už, také, jak, včera, ještě, již, tedy, dnes, pak
NumType seems to be a lexical feature of ADV. 100% of lemmas (49) occur with only one value of NumType.
PRON
419 cs-pos/PRON tokens (1% of all PRON tokens) have a non-empty value of NumType.
The most frequent other feature values with which PRON and NumType co-occurred: Variant=EMPTY (419; 100%), Reflex=EMPTY (419; 100%), Person=EMPTY (419; 100%), Gender=EMPTY (413; 99%), Number=EMPTY (413; 99%), PronType=Dem,Ind (318; 76%).
PRON tokens may have the following values of NumType:
- Card (295; 70% of non-empty NumType): kolik, mnoho, tolik, málo, moc, několik, několika, mnoha, nejeden, nemálo
- Mult (123; 29% of non-empty NumType): několikrát, mnohokrát, vícekrát, kolikrát, tolikrát, mnohokráte, bezpočtukrát, nesčíslněkrát, několikráte
- Ord (1; 0% of non-empty NumType): několikáté
- EMPTY (72129): se, to, si, které, který, která, co, tím, kteří, tom
NumType seems to be a lexical feature of PRON. 100% of lemmas (19) occur with only one value of NumType.
Relations with Agreement in NumType
The 10 most frequent relations where parent and child node agree in NumType:
NUM –[conj]–> NUM (3378; 100%),
NUM –[compound]–> NUM (2797; 100%),
ADJ –[conj]–> ADJ (75; 56%),
NUM –[dep]–> NUM (52; 100%),
NUM –[det:nummod]–> DET (16; 100%),
DET –[conj]–> PRON (4; 80%),
PRON –[conj]–> PRON (3; 100%),
DET –[appos]–> NUM (3; 100%),
DET –[det:nummod]–> DET (2; 100%),
DET –[dep]–> NUM (1; 100%).
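A list like the one above can be derived by walking every dependency edge and comparing the NumType of parent and child. The sketch below is a rough illustration under stated assumptions: only edges where both nodes have a non-empty NumType are considered, and the percentage is taken to be the share of such edges that agree. The page does not say how its percentages are defined, so this is a guess. The function `agreement_counts` and the toy `SENTENCE` are hypothetical; the tree is not a valid analysis of any real sentence.

```python
from collections import Counter


def agreement_counts(sentence):
    """Count parent-child edges that agree in NumType.

    `sentence` is a list of dicts with keys id, head, upos, deprel, numtype.
    Returns (agreeing edges, edges where both nodes have NumType),
    each keyed by (parent UPOS, relation, child UPOS).
    """
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root has head 0 and no parent token
        if not parent["numtype"] or not child["numtype"]:
            continue  # only compare nodes that both carry NumType
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["numtype"] == child["numtype"]:
            agree[key] += 1
    return agree, both


# Toy tree for illustration only.
SENTENCE = [
    {"id": 1, "head": 0, "upos": "NUM", "deprel": "root", "numtype": "Card"},
    {"id": 2, "head": 1, "upos": "NUM", "deprel": "conj", "numtype": "Card"},
    {"id": 3, "head": 1, "upos": "ADJ", "deprel": "dep", "numtype": "Ord"},
    {"id": 4, "head": 3, "upos": "ADJ", "deprel": "conj", "numtype": "Gen"},
]

agree, both = agreement_counts(SENTENCE)
for (parent, rel, child), n in both.items():
    print(f"{parent} -[{rel}]-> {child}: {agree[(parent, rel, child)]} of {n} agree")
```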