This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

NUM: numeral

This document is a placeholder for the language-specific documentation for NUM.


Treebank Statistics (UD_Dutch)

There are 921 NUM lemmas (4%), 931 NUM types (3%) and 3859 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 14 in number of tokens.
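The three counts above can be reproduced from any tagged corpus by collecting distinct lemmas, distinct word forms (types) and all occurrences (tokens) for one tag. The following is a minimal sketch, assuming the data has already been read into (FORM, LEMMA, UPOS) triples, as in columns 2 to 4 of a CoNLL-U file. The sample triples and the helper name `tag_counts` are hypothetical, and forms are compared case-sensitively, which is an assumption about how the page counts types.

```python
# Hypothetical (FORM, LEMMA, UPOS) triples; this is illustrative data,
# not an excerpt from UD_Dutch.
tokens = [
    ("twee", "twee", "NUM"),
    ("miljoenen", "miljoen", "NUM"),
    ("miljoen", "miljoen", "NUM"),
    ("twee", "twee", "NUM"),
    ("huizen", "huis", "NOUN"),
]


def tag_counts(tokens, tag):
    """Return (number of lemmas, number of types, number of tokens) for one tag."""
    selected = [(form, lemma) for form, lemma, upos in tokens if upos == tag]
    lemmas = {lemma for _, lemma in selected}  # distinct lemmas
    types = {form for form, _ in selected}     # distinct word forms (case-sensitive)
    return len(lemmas), len(types), len(selected)


num_lemmas, num_types, num_tokens = tag_counts(tokens, "NUM")
print(num_lemmas, num_types, num_tokens)
```

On the sample data this yields 2 lemmas, 3 types and 4 tokens for NUM.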

The 10 most frequent NUM lemmas: twee, drie, vier, miljoen, 1, vijf, tien, beide, zes, 1969

The 10 most frequent NUM types: twee, drie, vier, miljoen, 1, vijf, tien, beide, zes, 1969

The 10 most frequent ambiguous lemmas: twee (NUM 263, X 4, PROPN 1), drie (NUM 182, X 3, NOUN 1, VERB 1), vier (NUM 110, VERB 8, AUX 4, ADJ 2, NOUN 1, PROPN 1), miljoen (NUM 102, NOUN 1), vijf (NUM 66, NOUN 2, X 2), tien (NUM 66, X 2), zes (NUM 59, X 2, ADJ 1, NOUN 1), 2 (NUM 44, NOUN 1), acht (NUM 43, VERB 15, AUX 3), uur (NOUN 89, NUM 41, X 2)

The 10 most frequent ambiguous types: twee (NUM 241, X 4, PROPN 1), drie (NUM 166, X 3), vier (NUM 101, NOUN 1, PROPN 1), vijf (NUM 59, X 2, NOUN 1), tien (NUM 62, X 2), zes (NUM 56, X 2), 2 (NUM 44, NOUN 1), acht (NUM 39, VERB 5), uur (NOUN 74, NUM 41, X 2), 3 (NUM 37, NOUN 1)
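An ambiguous type, as listed above, is a word form that occurs with NUM and with at least one other tag. The sketch below shows one way such a list might be derived. It assumes (FORM, UPOS) pairs as input; the sample pairs and the function name `ambiguous_types` are hypothetical and do not reproduce the treebank counts.

```python
from collections import Counter, defaultdict

# Hypothetical (FORM, UPOS) observations, not real treebank data.
observations = [
    ("acht", "NUM"), ("acht", "NUM"), ("acht", "VERB"),
    ("twee", "NUM"),
    ("uur", "NOUN"), ("uur", "NUM"), ("uur", "NOUN"),
]


def ambiguous_types(observations, tag):
    """Map each form seen with `tag` and some other tag to its tag counts."""
    tags_by_form = defaultdict(Counter)
    for form, upos in observations:
        tags_by_form[form][upos] += 1
    result = {}
    for form, tags in tags_by_form.items():
        # Ambiguous: the form occurs with the requested tag and at least one more.
        if tag in tags and len(tags) > 1:
            result[form] = tags.most_common()  # sorted by descending count
    return result


ambiguous = ambiguous_types(observations, "NUM")
for form, tags in ambiguous.items():
    print(form, tags)
```

The same function applied to (LEMMA, UPOS) pairs would give the ambiguous lemmas instead.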

Morphology

The form / lemma ratio of NUM is 1.010858 (the average of all parts of speech is 1.258498).
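The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier (931 and 921). This reading is inferred from the figures on this page rather than from a documented definition, and the function name below is only illustrative.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Number of distinct word forms per distinct lemma."""
    return n_types / n_lemmas


# Counts taken from the UD_Dutch statistics on this page.
ud_dutch_ratio = form_lemma_ratio(931, 921)
print(f"{ud_dutch_ratio:.6f}")  # prints 1.010858
```

A ratio close to 1 means that most NUM lemmas appear in a single form only.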

The 1st highest number of forms (5) was observed with the lemma “miljoen”: mijoenen, miljoen, miljoenen, mln, mln..

The 2nd highest number of forms (3) was observed with the lemma “duizend”: duizend, duizenden, zeventigduizend.

The 3rd highest number of forms (3) was observed with the lemma “honderd”: honderd, honderden, vijfenvijftighonderd.

NUM occurs with 6 features: NumType (3859; 100% instances), Definite (3303; 86% instances), NumForm (543; 14% instances), Number (266; 7% instances), Degree (9; 0% instances), Case (3; 0% instances)

NUM occurs with 8 feature-value pairs: Case=Gen, Case=Nom, Definite=Def, Degree=Pos, NumForm=Digit, NumType=Card, Number=Plur, Number=Sing

NUM occurs with 10 feature combinations. The most frequent feature combination is Definite=Def|NumType=Card (3031 tokens). Examples: twee, drie, vier, miljoen, vijf, beide, tien, zes, acht, 1969
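Features, feature-value pairs and feature combinations can all be counted from the FEATS column of a CoNLL-U file, where pairs are joined with `|` and an empty feature set is written as `_`. The following sketch assumes the FEATS strings of NUM tokens have already been extracted. The sample values and the helper name `parse_feats` are hypothetical.

```python
from collections import Counter

# Hypothetical FEATS values of NUM tokens, not real treebank data.
feats_column = [
    "Definite=Def|NumType=Card",
    "Definite=Def|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "Case=Gen|Definite=Def|NumType=Card",
]


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


feature_counts = Counter()                       # e.g. NumType
pair_counts = Counter()                          # e.g. NumType=Card
combination_counts = Counter(feats_column)       # whole FEATS strings
for feats in feats_column:
    for name, value in parse_feats(feats).items():
        feature_counts[name] += 1
        pair_counts[f"{name}={value}"] += 1

print(feature_counts.most_common())
print(combination_counts.most_common(1))
```

In the sample, NumType occurs on every token and `Definite=Def|NumType=Card` is the most frequent combination, mirroring the shape of the figures above.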

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (1967; 51% instances), advmod (696; 18% instances), compound (585; 15% instances), appos (155; 4% instances), conj (135; 3% instances), dobj (113; 3% instances), dep (71; 2% instances), root (56; 1% instances), nsubj (55; 1% instances), cc (8; 0% instances), cop (8; 0% instances), advcl (3; 0% instances), nmod (2; 0% instances), acl (1; 0% instances), ccomp (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)

Parents of NUM nodes belong to 15 different parts of speech: NOUN (1953; 51% instances), NUM (536; 14% instances), VERB (509; 13% instances), PROPN (242; 6% instances), X (206; 5% instances), AUX (166; 4% instances), ADJ (73; 2% instances), ROOT (56; 1% instances), ADV (36; 1% instances), PRON (34; 1% instances), DET (33; 1% instances), PUNCT (6; 0% instances), ADP (5; 0% instances), SCONJ (2; 0% instances), SYM (2; 0% instances)
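The relation and parent statistics follow from the HEAD and DEPREL columns: for each NUM token, look up the relation label and the tag of its head, with head 0 standing for the artificial ROOT. The sketch below assumes one sentence given as (ID, FORM, UPOS, HEAD, DEPREL) tuples. The sentence, its analysis and the function names are hypothetical; the percentages appear to be counts divided by the number of NUM tokens and rounded to whole numbers, which is an inference from the figures.

```python
from collections import Counter

# Hypothetical sentence "twee huizen en drie auto's" with an illustrative
# UD v1 style analysis: (ID, FORM, UPOS, HEAD, DEPREL).
sentence = [
    (1, "twee", "NUM", 2, "nummod"),
    (2, "huizen", "NOUN", 0, "root"),
    (3, "en", "CONJ", 2, "cc"),
    (4, "drie", "NUM", 5, "nummod"),
    (5, "auto's", "NOUN", 2, "conj"),
]


def parent_statistics(sentence, tag):
    """Count relations to the parent and parent tags for all nodes with `tag`."""
    upos_by_id = {tok_id: upos for tok_id, _, upos, _, _ in sentence}
    upos_by_id[0] = "ROOT"  # head 0 is the artificial root node
    relations, parent_tags = Counter(), Counter()
    for _, _, upos, head, deprel in sentence:
        if upos == tag:
            relations[deprel] += 1
            parent_tags[upos_by_id[head]] += 1
    return relations, parent_tags


def percent(count, total):
    """Share of `count` in `total`, rounded to a whole percentage."""
    return round(100 * count / total)


relations, parent_tags = parent_statistics(sentence, "NUM")
print(relations, parent_tags)
print(percent(1967, 3859))  # nummod share reported for UD_Dutch: 51
```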

2579 (67%) NUM nodes are leaves.

709 (18%) NUM nodes have one child.

299 (8%) NUM nodes have two children.

272 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
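The child degree of a node is the number of tokens that name it as their head. A minimal sketch of the leaf, one-child, two-children and three-or-more breakdown follows, assuming (ID, UPOS, HEAD) tuples for one sentence. The sample sentence and the function name are hypothetical. The last lines check that the four UD_Dutch figures reported above add up to the 3859 NUM tokens.

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD) tuples, not real treebank data.
sentence = [
    (1, "ADV", 2),   # modifier attached to the first numeral
    (2, "NUM", 3),
    (3, "NOUN", 0),
    (4, "NUM", 3),   # numeral without children
]


def child_degree_buckets(sentence, tag):
    """Bucket nodes with `tag` by how many children they have."""
    children = Counter(head for _, _, head in sentence)
    labels = ("leaf", "one child", "two children")
    buckets = Counter()
    for tok_id, upos, _ in sentence:
        if upos != tag:
            continue
        degree = children[tok_id]  # Counter returns 0 for nodes without children
        buckets[labels[degree] if degree < 3 else "three or more"] += 1
    return buckets


buckets = child_degree_buckets(sentence, "NUM")
print(buckets)

# Consistency check on the UD_Dutch figures reported on this page.
reported = [2579, 709, 299, 272]
print(sum(reported))  # prints 3859
```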

Children of NUM nodes are attached using 21 different relations: case (783; 34% instances), compound (442; 19% instances), nmod (256; 11% instances), advmod (167; 7% instances), cc (144; 6% instances), conj (142; 6% instances), punct (136; 6% instances), det (60; 3% instances), mark (46; 2% instances), cop (29; 1% instances), nsubj (27; 1% instances), advcl (22; 1% instances), dep (14; 1% instances), parataxis (14; 1% instances), aux (9; 0% instances), appos (3; 0% instances), dobj (3; 0% instances), name (2; 0% instances), compound:prt (1; 0% instances), neg (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADP (832; 36% instances), NUM (536; 23% instances), NOUN (205; 9% instances), PUNCT (143; 6% instances), CONJ (92; 4% instances), ADJ (76; 3% instances), X (73; 3% instances), PRON (70; 3% instances), ADV (68; 3% instances), DET (65; 3% instances), AUX (44; 2% instances), SCONJ (44; 2% instances), VERB (33; 1% instances), PROPN (18; 1% instances), SYM (3; 0% instances)


Treebank Statistics (UD_Dutch-LassySmall)

There are 590 NUM lemmas (4%), 665 NUM types (4%) and 3930 NUM tokens (4%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 8 in number of tokens.

The 10 most frequent NUM lemmas: één, twee, 1, 2004, 2006, 2005, drie, 2003, 2, 20

The 10 most frequent NUM types: eerste, twee, 2004, 2006, 1, 2005, 2003, tweede, één, drie

The 10 most frequent ambiguous lemmas: één (NUM 372, PROPN 6), 2 (NUM 62, X 1), 7 (NUM 26, SYM 1), 8 (NUM 23, SYM 1), 1966 (NUM 12, SYM 3), eerst (ADV 19, NUM 8), 1965 (NUM 7, SYM 6), II (PROPN 47, NUM 2), 1/8 (NUM 1, SYM 1), 1999-2004 (NUM 1, SYM 1)

The 10 most frequent ambiguous types: één (NUM 72, PROPN 6), 2 (NUM 56, X 1), een (DET 1598, NUM 45), eerst (NUM 34, ADV 16), 7 (NUM 24, SYM 1), vier (NUM 21, VERB 1), 8 (NUM 23, SYM 1), 1966 (NUM 12, SYM 3), vierde (NUM 11, VERB 2), 1965 (NUM 7, SYM 6)

Morphology

The form / lemma ratio of NUM is 1.127119 (the average of all parts of speech is 1.179900).

The 1st highest number of forms (7) was observed with the lemma “één”: Eén, een, eentje, eerst, eerste, eersten, één.

The 2nd highest number of forms (4) was observed with the lemma “15”: 15, 15de, 15e, XVde.

The 3rd highest number of forms (4) was observed with the lemma “19”: 18-19, 19, 19de, 19e.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 15 different relations: nummod (1374; 35% instances), nmod (1039; 26% instances), root (543; 14% instances), mwe (333; 8% instances), appos (177; 5% instances), parataxis (172; 4% instances), conj (123; 3% instances), acl (68; 2% instances), det (40; 1% instances), nsubj (30; 1% instances), advcl (15; 0% instances), dobj (13; 0% instances), amod (1; 0% instances), cc (1; 0% instances), ccomp (1; 0% instances)

Parents of NUM nodes belong to 15 different parts of speech: NOUN (1649; 42% instances), VERB (679; 17% instances), ROOT (543; 14% instances), PROPN (397; 10% instances), NUM (363; 9% instances), SYM (70; 2% instances), ADJ (68; 2% instances), DET (67; 2% instances), ADP (42; 1% instances), X (31; 1% instances), PRON (9; 0% instances), ADV (5; 0% instances), PUNCT (5; 0% instances), INTJ (1; 0% instances), SCONJ (1; 0% instances)

2066 (53%) NUM nodes are leaves.

870 (22%) NUM nodes have one child.

408 (10%) NUM nodes have two children.

586 (15%) NUM nodes have three or more children.

The highest child degree of a NUM node is 18.

Children of NUM nodes are attached using 19 different relations: punct (1124; 28% instances), case (1006; 25% instances), mwe (614; 15% instances), parataxis (561; 14% instances), nmod (218; 5% instances), conj (115; 3% instances), det (87; 2% instances), cc (71; 2% instances), cop (59; 1% instances), nsubj (54; 1% instances), advmod (36; 1% instances), mark (20; 0% instances), acl (14; 0% instances), appos (13; 0% instances), amod (8; 0% instances), dobj (4; 0% instances), nummod (3; 0% instances), advcl (2; 0% instances), aux (2; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: PUNCT (1125; 28% instances), ADP (1020; 25% instances), PROPN (739; 18% instances), NUM (363; 9% instances), NOUN (345; 9% instances), DET (95; 2% instances), CONJ (67; 2% instances), AUX (61; 2% instances), PRON (51; 1% instances), ADV (41; 1% instances), ADJ (32; 1% instances), VERB (25; 1% instances), SYM (15; 0% instances), X (14; 0% instances), PART (11; 0% instances), SCONJ (7; 0% instances)


NUM in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]