NUM

This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

`NUM`: numeral

Definition

A numeral is a word that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.
Both cardinal and ordinal numerals get the POS tag NUM. Words such as paar “pair”, paarkümmend “about twenty”, paarsada “about two hundred” and tosin “dozen” are also labelled as NUM.

Treebank Statistics (UD_Estonian)

There are 773 NUM lemmas (3%), 963 NUM types (2%) and 4131 NUM tokens (2%). Out of 15 observed tags, NUM ranks 6th in number of lemmas, 6th in number of types and 12th in number of tokens.

The 10 most frequent NUM lemmas: kaks, üks, kolm, miljon, viis, kümme, neli, pool, paar, 000

The 10 most frequent NUM types: kaks, kolm, 000, üks, kahe, miljonit, ühe, paar, viis, neli

The 10 most frequent ambiguous lemmas: üks (PRON 495, NUM 207), viis (NUM 114, NOUN 27), pool (NUM 98, NOUN 49, ADV 20, ADP 17), paar (NUM 97, NOUN 12), 2000 (NUM 48, ADJ 1), seitse (NUM 48, NOUN 1), sada (NUM 45, VERB 4, NOUN 2), kolmandik (NUM 18, NOUN 6), 2001 (NUM 12, ADJ 1), 1990 (NUM 11, ADJ 1)

The 10 most frequent ambiguous types: üks (PRON 152, NUM 70), ühe (PRON 70, NUM 58), paar (NUM 50, NOUN 3), viis (NUM 57, VERB 18, NOUN 5), 2000 (NUM 47, ADJ 1), pool (NUM 37, ADV 20, ADP 16, NOUN 9), poole (ADP 116, NUM 31, NOUN 6, ADV 2), seitse (NUM 28, NOUN 1), kuus (NUM 27, NOUN 27), paari (NUM 17, NOUN 6)

üks
- PRON 152: Just praegu leidis üks võõras rüütel Püha Graali .
- NUM 70: Jälle üks piinarikas öö !
ühe
- PRON 70: ” Kuule , aga teeks enne ühe õlle veel ? “ pakun sõbralikult .
- NUM 58: Nüüd töötas Liis ühe päeva ühes Tallinna keemilises puhastuses .
paar
- NUM 50: Praegu peaks selliseid miljonäre olema juba paar korda rohkem .
- NOUN 3: Lenini ordeneid tilgub paar tükki kuus .
viis
- NUM 57: Pane purkidele kaaned lahtiselt peale ja steriliseeri viis minutit .
- VERB 18: Politsei viis mehe arestimaja kainestuskambrisse välja magama .
- NOUN 5: Odavaim ja kõige turvalisem viis reisida on sõita laevaga mööda meresid .
2000
- NUM 47: Kadikas , 2000 .
- ADJ 1: Kuid ei - kohe jõuab kätte 5. mai 2000 .
pool
- NUM 37: Kala pidi valvest tulema pool kuus .
- ADV 20: Läbi aknaklaasi nägin , et igal pool poodides olid järjekorrad .
- ADP 16: Teisel pool maja sööklas ei ole ühtegi praadi kalast .
- NOUN 9: Vene pool samaga ei vastanud .
poole
- ADP 116: Ta komposteeris pileti ja taarus tagaakna poole .
- NUM 31: Üle poole neist lasti maha pärast Berliini müüri püstitamist .
- NOUN 6: Turnee on poole peal ja mul on tunne , et finisheerimiseks tuleks tellida takso .
- ADV 2: Nuputab isekeskis , kuidas koduse remondiga kiiremini ühele poole saaks .
seitse
- NUM 28: Neid sõlmis paela nii , et saapa pealispinnale jäi seitse musta risti .
- NOUN 1: See seitse on täiesti elektrisinine ja selle valgus särab kaugele öösse .
kuus
- NUM 27: Kala pidi valvest tulema pool kuus .
- NOUN 27: Pealegi teenivad need poisid kuus natuke üle saja dollari .
paari
- NUM 17: Iga paari sammu tagant torkab lati lumme .
- NOUN 6: Põldsamid söödavad oma tiigerpüütonite paari elusate küülikutega .

Morphology

The form / lemma ratio of NUM is 1.245796 (the average of all parts of speech is 1.839644).
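
The reported ratio is consistent with dividing the number of distinct NUM word forms (963 types) by the number of distinct NUM lemmas (773). The following is a minimal sketch under that assumption; the function name `form_lemma_ratio` and the toy pairs are illustrative and not part of any UD tooling.

```python
def form_lemma_ratio(pairs):
    """Return (distinct forms) / (distinct lemmas) for (form, lemma) pairs."""
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Toy sample built from forms mentioned on this page, not the full treebank.
sample = [
    ("üks", "üks"), ("ühe", "üks"), ("üht", "üks"),
    ("kaks", "kaks"), ("kahe", "kaks"),
    ("kolm", "kolm"),
]
sample_ratio = form_lemma_ratio(sample)  # 6 forms / 3 lemmas

# Treebank-level figure recomputed from the counts reported above.
reported_ratio = 963 / 773
print(f"{sample_ratio:.1f} {reported_ratio:.6f}")
```

Forms are compared case-sensitively here, which matches the listing of capitalized forms such as “Kümned” as separate forms.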

The highest number of forms (17) was observed with the lemma “üks”: ühe, ühe-, ühega, üheks, ühel, ühele, ühelgi, ühelt, ühena, ühes, ühest, üht, ühte, ühtegi, ühtki, üks, ükski.

The second highest number of forms (13) was observed with the lemma “kümme”: Kümned, kümme, kümmet, kümne, kümnega, kümneid, kümneks, kümnel, kümnele, kümnest, kümnete, kümnetesse, kümnetest.

The third highest number of forms (12) was observed with the lemma “miljon”: Miljonitel, miljon, miljoneid, miljoni, miljonid, miljoniga, miljonilt, miljonini, miljonist, miljonit, miljonite, miljonitest.

NUM occurs with 6 features: NumType (4131; 100% instances), NumForm (4077; 99% instances), Case (3275; 79% instances), Number (3275; 79% instances), Hyph (6; 0% instances), PronType (1; 0% instances)

NUM occurs with 22 feature-value pairs: Case=Abl, Case=Add, Case=Ade, Case=All, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Case=Ter, Case=Tra, Hyph=Yes, NumForm=Digit, NumForm=Letter, NumType=Card, NumType=Ord, Number=Plur, Number=Sing, PronType=Ind

NUM occurs with 50 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing|NumForm=Digit|NumType=Card (983 tokens). Examples: 000, 2000, 1997, 1999, 15, 1998, 20, 2002, 50, 1
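
A feature combination such as the one above is written in the CoNLL-U FEATS column as `Name=Value` pairs joined by `|`. The sketch below shows one way to split such a string into a dictionary; `parse_feats` is a hypothetical helper name, not a function of any particular library.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict.

    An underscore (or empty string) means the token has no features.
    """
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# The most frequent NUM feature combination reported above.
combo = parse_feats("Case=Nom|Number=Sing|NumForm=Digit|NumType=Card")
print(combo["NumType"], combo["NumForm"])
```

Counting the dictionary keys over all NUM tokens would yield feature totals like those listed above, and counting the `(key, value)` items would yield the feature-value pairs.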

Relations

NUM nodes are attached to their parents using 15 different relations: nummod (3386; 82% instances), compound (342; 8% instances), conj (108; 3% instances), root (98; 2% instances), nsubj (80; 2% instances), dobj (52; 1% instances), parataxis (27; 1% instances), nsubj:cop (19; 0% instances), dep (5; 0% instances), acl:relcl (4; 0% instances), list (3; 0% instances), nmod (3; 0% instances), csubj (2; 0% instances), amod (1; 0% instances), name (1; 0% instances)
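
Counts like these can be obtained by tallying the DEPREL column of every token whose UPOS column is NUM. The following is a minimal sketch assuming the standard ten-column, tab-separated CoNLL-U layout; the helper name `num_relation_counts` and the three-token fragment are illustrative only, and the toy annotation is hand-written rather than copied from the treebank.

```python
from collections import Counter


def num_relation_counts(conllu_text):
    """Count the DEPREL values of all tokens tagged NUM in CoNLL-U text."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip malformed lines and multiword-token ranges
        if cols[3] == "NUM":  # column 4 holds the universal POS tag
            counts[cols[7]] += 1  # column 8 holds the relation to the head
    return counts


# Hypothetical toy fragment: "steriliseeri viis minutit".
TOY = "\n".join([
    "1\tsteriliseeri\tsteriliseerima\tVERB\t_\t_\t0\troot\t_\t_",
    "2\tviis\tviis\tNUM\t_\tNumType=Card\t3\tnummod\t_\t_",
    "3\tminutit\tminut\tNOUN\t_\t_\t1\tnmod\t_\t_",
])
counts = num_relation_counts(TOY)
print(counts.most_common())
```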

Parents of NUM nodes belong to 11 different parts of speech: NOUN (2584; 63% instances), NUM (566; 14% instances), VERB (462; 11% instances), PROPN (279; 7% instances), ROOT (98; 2% instances), ADJ (82; 2% instances), ADV (39; 1% instances), SYM (9; 0% instances), ADP (8; 0% instances), X (3; 0% instances), AUX (1; 0% instances)

2623 (63%) NUM nodes are leaves.

1115 (27%) NUM nodes have one child.

250 (6%) NUM nodes have two children.

143 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 14.
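
The child degree of a node is the number of tokens whose HEAD points to it. As a rough sketch, assuming each sentence is available as a list of `(id, head, upos)` tuples (a hypothetical in-memory representation, not a treebank API), the degrees of NUM nodes could be computed like this:

```python
from collections import Counter


def num_child_degrees(tokens):
    """Map each NUM token id to its number of children within one sentence.

    tokens: list of (id, head, upos) tuples; head 0 marks the root.
    """
    children_per_head = Counter(head for _, head, _ in tokens)
    # Counter returns 0 for ids that never occur as a head, i.e. leaves.
    return {tid: children_per_head[tid] for tid, _, upos in tokens if upos == "NUM"}


# Hypothetical sentence: token 2 is a NUM with one ADV child, token 5 is a NUM leaf.
toy = [(1, 0, "VERB"), (2, 3, "NUM"), (3, 1, "NOUN"), (4, 2, "ADV"), (5, 3, "NUM")]
degrees = num_child_degrees(toy)
print(degrees)
```

Aggregating these values over all sentences gives the leaf, one-child and two-child counts, and their maximum gives the highest child degree.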

Children of NUM nodes are attached using 21 different relations: punct (440; 20% instances), advmod (383; 17% instances), compound (339; 15% instances), nmod (318; 14% instances), nummod (156; 7% instances), conj (124; 6% instances), case (100; 4% instances), amod (97; 4% instances), cc (80; 4% instances), det (53; 2% instances), nsubj:cop (44; 2% instances), cop (41; 2% instances), appos (12; 1% instances), advcl (8; 0% instances), parataxis (8; 0% instances), mark (6; 0% instances), nmod:poss (6; 0% instances), dep (5; 0% instances), advmod:quant (4; 0% instances), nsubj (2; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 13 different parts of speech: NUM (566; 25% instances), PUNCT (440; 20% instances), ADV (396; 18% instances), NOUN (343; 15% instances), ADJ (109; 5% instances), ADP (100; 4% instances), PRON (83; 4% instances), CONJ (79; 4% instances), VERB (54; 2% instances), PROPN (46; 2% instances), SCONJ (6; 0% instances), SYM (4; 0% instances), X (1; 0% instances)

