NUM: numeral
This document is a placeholder for the language-specific documentation for NUM.
Treebank Statistics (UD_Dutch)
There are 921 NUM lemmas (4%), 931 NUM types (3%) and 3859 NUM tokens (2%).
Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 14 in number of tokens.
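As a rough illustration of how lemma, type and token counts per tag could be derived, the sketch below reads CoNLL-U text and tallies distinct lemmas, distinct word forms and token occurrences for each UPOS tag. The function name `tag_counts` and the sample sentence annotation are hypothetical, and treating forms case-sensitively is an assumption rather than something this page states.

```python
from collections import defaultdict

# Hypothetical four-token sample in CoNLL-U format (10 tab-separated columns).
SAMPLE = """\
1\thet\thet\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tweegt\twegen\tVERB\t_\t_\t0\troot\t_\t_
3\ttwee\ttwee\tNUM\t_\t_\t4\tnummod\t_\t_
4\tkilo\tkilo\tNOUN\t_\t_\t2\tdobj\t_\t_
"""

def tag_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for CoNLL-U text."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms are compared case-sensitively
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), tokens[t]) for t in tokens}

print(tag_counts(SAMPLE)["NUM"])
```

Ranking a tag then amounts to sorting all tags by one of the three counts and taking the position of NUM in that order.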
The 10 most frequent NUM lemmas: twee, drie, vier, miljoen, 1, vijf, tien, beide, zes, 1969
The 10 most frequent NUM types: twee, drie, vier, miljoen, 1, vijf, tien, beide, zes, 1969
The 10 most frequent ambiguous lemmas: twee (NUM 263, X 4, PROPN 1), drie (NUM 182, X 3, NOUN 1, VERB 1), vier (NUM 110, VERB 8, AUX 4, ADJ 2, NOUN 1, PROPN 1), miljoen (NUM 102, NOUN 1), vijf (NUM 66, NOUN 2, X 2), tien (NUM 66, X 2), zes (NUM 59, X 2, ADJ 1, NOUN 1), 2 (NUM 44, NOUN 1), acht (NUM 43, VERB 15, AUX 3), uur (NOUN 89, NUM 41, X 2)
The 10 most frequent ambiguous types: twee (NUM 241, X 4, PROPN 1), drie (NUM 166, X 3), vier (NUM 101, NOUN 1, PROPN 1), vijf (NUM 59, X 2, NOUN 1), tien (NUM 62, X 2), zes (NUM 56, X 2), 2 (NUM 44, NOUN 1), acht (NUM 39, VERB 5), uur (NOUN 74, NUM 41, X 2), 3 (NUM 37, NOUN 1)
- twee
  - NUM 241: het weegt twee kilo
  - X 4: De trein komt om kwart over twee aan .
  - PROPN 1: Kenmerkend is , dat de VPRO minder bezwaren had dan de TROS , misschien hierom , omdat de VPRO toch al nooit gewend was te freewheelen , terwijl de TROS gemakshalve een maandagavond op Nederland twee nog weleens als een weggevertje beschouwde .
- drie
- vier
  - NUM 101: Het bronzen beeld zal twaalf meter breed en vier meter hoog worden .
  - NOUN 1: De vier van Laga is dezelfde als vorig jaar .
  - PROPN 1: De succesvolle start van de schoolconcerten door het Rijnmond Kamerensemble , geleid door Jan van der Waart vindt dezer dagen zijn vervolg in een nieuwe serie “ Catootjes “ , voorlopig een achttal , en niet minder dan tienmaal het vervolgprogramma dat “ De vier weverkens “ is gaan heten .
- vijf
- tien
- zes
- 2
- acht
- uur
- 3
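The ambiguity lists above pair each lemma or type with the tags it was observed under. A minimal sketch of that grouping, assuming the input is a sequence of (key, tag) pairs where the key is either a lemma or a word form; the function name `ambiguous` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def ambiguous(pairs):
    """Map each key seen with more than one tag to its (tag, count) list."""
    by_key = defaultdict(Counter)
    for key, upos in pairs:
        by_key[key][upos] += 1
    # Keep only keys with at least two distinct tags, most frequent tag first.
    return {k: c.most_common() for k, c in by_key.items() if len(c) > 1}

# Toy data, not taken from the treebank.
pairs = [("twee", "NUM")] * 3 + [("twee", "X"), ("drie", "NUM")]
print(ambiguous(pairs))
```

Here "drie" is dropped because it only ever appears as NUM, while "twee" is kept with its per-tag counts.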
Morphology
The form / lemma ratio of NUM is 1.010858 (the average of all parts of speech is 1.258498).
The 1st highest number of forms (5) was observed with the lemma “miljoen”: mijoenen, miljoen, miljoenen, mln, mln..
The 2nd highest number of forms (3) was observed with the lemma “duizend”: duizend, duizenden, zeventigduizend.
The 3rd highest number of forms (3) was observed with the lemma “honderd”: honderd, honderden, vijfenvijftighonderd.
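The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given in the treebank statistics (931 and 921). The short check below reproduces it; the function name `form_lemma_ratio` is hypothetical, and reading the ratio as types over lemmas is an inference from the numbers rather than a definition stated on this page.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas

# UD_Dutch figures for NUM quoted above: 931 types, 921 lemmas.
print(f"{form_lemma_ratio(931, 921):.6f}")  # prints 1.010858
```

A value this close to 1 means that almost every numeral lemma occurs in a single form, which fits the short list of multi-form lemmas above.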
NUM occurs with 6 features: NumType (3859; 100% instances), Definite (3303; 86% instances), NumForm (543; 14% instances), Number (266; 7% instances), Degree (9; 0% instances), Case (3; 0% instances)
NUM occurs with 8 feature-value pairs: Case=Gen, Case=Nom, Definite=Def, Degree=Pos, NumForm=Digit, NumType=Card, Number=Plur, Number=Sing
NUM occurs with 10 feature combinations.
The most frequent feature combination is Definite=Def|NumType=Card (3031 tokens).
Examples: twee, drie, vier, miljoen, vijf, beide, tien, zes, acht, 1969
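The three feature tallies above (features, feature-value pairs, feature combinations) can all be read off the FEATS column of a CoNLL-U file. The sketch below assumes the FEATS strings for NUM tokens have already been collected into a list; `feature_stats` and the toy input are hypothetical, and whether an empty FEATS value (`_`) counts as a combination in the original statistics is not known, so it is skipped here.

```python
from collections import Counter

def feature_stats(feats_values):
    """Count features, feature=value pairs and whole FEATS combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        if feats == "_":
            continue  # assumption: tokens without features are not counted
        combos[feats] += 1
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos

# Toy FEATS values, not taken from the treebank.
toy = ["Definite=Def|NumType=Card", "Definite=Def|NumType=Card",
       "NumForm=Digit|NumType=Card", "_"]
features, pairs, combos = feature_stats(toy)
print(features.most_common())
print(combos.most_common(1))
```

In the toy input NumType appears on every counted token, mirroring the 100% coverage reported for NumType above.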
Relations
NUM nodes are attached to their parents using 18 different relations: nummod (1967; 51% instances), advmod (696; 18% instances), compound (585; 15% instances), appos (155; 4% instances), conj (135; 3% instances), dobj (113; 3% instances), dep (71; 2% instances), root (56; 1% instances), nsubj (55; 1% instances), cc (8; 0% instances), cop (8; 0% instances), advcl (3; 0% instances), nmod (2; 0% instances), acl (1; 0% instances), ccomp (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)
Parents of NUM nodes belong to 15 different parts of speech: NOUN (1953; 51% instances), NUM (536; 14% instances), VERB (509; 13% instances), PROPN (242; 6% instances), X (206; 5% instances), AUX (166; 4% instances), ADJ (73; 2% instances), ROOT (56; 1% instances), ADV (36; 1% instances), PRON (34; 1% instances), DET (33; 1% instances), PUNCT (6; 0% instances), ADP (5; 0% instances), SCONJ (2; 0% instances), SYM (2; 0% instances)
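The two distributions above depend on following each NUM token's HEAD pointer to its parent. A minimal sketch, assuming each sentence is a list of (id, upos, head, deprel) tuples; the annotation given for "het weegt twee kilo" is an illustrative guess rather than the treebank's own, and `parent_stats` and `as_percentages` are hypothetical helper names.

```python
from collections import Counter

# (id, upos, head, deprel); head 0 marks the root. Annotation is illustrative.
SENTENCE = [
    (1, "PRON", 2, "nsubj"),
    (2, "VERB", 0, "root"),
    (3, "NUM", 4, "nummod"),
    (4, "NOUN", 2, "dobj"),
]

def parent_stats(sentence, upos="NUM"):
    """Count incoming relations and parent tags for tokens tagged `upos`."""
    tag_of = {tid: tag for tid, tag, _, _ in sentence}
    tag_of[0] = "ROOT"  # the artificial root is reported as its own "tag"
    relations, parent_tags = Counter(), Counter()
    for _, tag, head, deprel in sentence:
        if tag == upos:
            relations[deprel] += 1
            parent_tags[tag_of[head]] += 1
    return relations, parent_tags

def as_percentages(counter):
    """Convert raw counts to whole-number percentages of the total."""
    total = sum(counter.values())
    return {key: round(100 * n / total) for key, n in counter.items()}

relations, parent_tags = parent_stats(SENTENCE)
print(relations, parent_tags)
```

Counting the artificial root as a parent explains why ROOT appears among the parent parts of speech with the same count (56) as the root relation.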
2579 (67%) NUM nodes are leaves.
709 (18%) NUM nodes have one child.
299 (8%) NUM nodes have two children.
272 (7%) NUM nodes have three or more children.
The highest child degree of a NUM node is 10.
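The child-degree figures above can be obtained by counting how often each token id occurs as a HEAD value. The sketch below assumes a sentence is a list of (id, upos, head) triples; `child_degrees`, `bucket` and the sample annotation are hypothetical.

```python
from collections import Counter

# (id, upos, head) triples for one sentence; head 0 marks the root.
SENTENCE = [
    (1, "PRON", 2),
    (2, "VERB", 0),
    (3, "NUM", 4),
    (4, "NOUN", 2),
]

def child_degrees(sentence, upos="NUM"):
    """Return the number of children of every token tagged `upos`."""
    n_children = Counter(head for _, _, head in sentence)
    return [n_children[tid] for tid, tag, _ in sentence if tag == upos]

def bucket(degrees):
    """Group degrees into the four classes used in the statistics."""
    labels = {0: "leaves", 1: "one child", 2: "two children"}
    return Counter(labels.get(d, "three or more children") for d in degrees)

print(bucket(child_degrees(SENTENCE)))
```

As a consistency check on the reported figures, the four classes (2579 + 709 + 299 + 272) add up to the 3859 NUM tokens in UD_Dutch.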
Children of NUM nodes are attached using 21 different relations: case (783; 34% instances), compound (442; 19% instances), nmod (256; 11% instances), advmod (167; 7% instances), cc (144; 6% instances), conj (142; 6% instances), punct (136; 6% instances), det (60; 3% instances), mark (46; 2% instances), cop (29; 1% instances), nsubj (27; 1% instances), advcl (22; 1% instances), dep (14; 1% instances), parataxis (14; 1% instances), aux (9; 0% instances), appos (3; 0% instances), dobj (3; 0% instances), name (2; 0% instances), compound:prt (1; 0% instances), neg (1; 0% instances), xcomp (1; 0% instances)
Children of NUM nodes belong to 15 different parts of speech: ADP (832; 36% instances), NUM (536; 23% instances), NOUN (205; 9% instances), PUNCT (143; 6% instances), CONJ (92; 4% instances), ADJ (76; 3% instances), X (73; 3% instances), PRON (70; 3% instances), ADV (68; 3% instances), DET (65; 3% instances), AUX (44; 2% instances), SCONJ (44; 2% instances), VERB (33; 1% instances), PROPN (18; 1% instances), SYM (3; 0% instances)
Treebank Statistics (UD_Dutch-LassySmall)
There are 590 NUM lemmas (4%), 665 NUM types (4%) and 3930 NUM tokens (4%).
Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 8 in number of tokens.
The 10 most frequent NUM lemmas: één, twee, 1, 2004, 2006, 2005, drie, 2003, 2, 20
The 10 most frequent NUM types: eerste, twee, 2004, 2006, 1, 2005, 2003, tweede, één, drie
The 10 most frequent ambiguous lemmas: één (NUM 372, PROPN 6), 2 (NUM 62, X 1), 7 (NUM 26, SYM 1), 8 (NUM 23, SYM 1), 1966 (NUM 12, SYM 3), eerst (ADV 19, NUM 8), 1965 (NUM 7, SYM 6), II (PROPN 47, NUM 2), 1/8 (NUM 1, SYM 1), 1999-2004 (NUM 1, SYM 1)
The 10 most frequent ambiguous types: één (NUM 72, PROPN 6), 2 (NUM 56, X 1), een (DET 1598, NUM 45), eerst (NUM 34, ADV 16), 7 (NUM 24, SYM 1), vier (NUM 21, VERB 1), 8 (NUM 23, SYM 1), 1966 (NUM 12, SYM 3), vierde (NUM 11, VERB 2), 1965 (NUM 7, SYM 6)
- één
- 2
- een
- eerst
- 7
- vier
- 8
- 1966
- vierde
- 1965
Morphology
The form / lemma ratio of NUM is 1.127119 (the average of all parts of speech is 1.179900).
The 1st highest number of forms (7) was observed with the lemma “één”: Eén, een, eentje, eerst, eerste, eersten, één.
The 2nd highest number of forms (4) was observed with the lemma “15”: 15, 15de, 15e, XVde.
The 3rd highest number of forms (4) was observed with the lemma “19”: 18-19, 19, 19de, 19e.
NUM does not occur with any features.
Relations
NUM nodes are attached to their parents using 15 different relations: nummod (1374; 35% instances), nmod (1039; 26% instances), root (543; 14% instances), mwe (333; 8% instances), appos (177; 5% instances), parataxis (172; 4% instances), conj (123; 3% instances), acl (68; 2% instances), det (40; 1% instances), nsubj (30; 1% instances), advcl (15; 0% instances), dobj (13; 0% instances), amod (1; 0% instances), cc (1; 0% instances), ccomp (1; 0% instances)
Parents of NUM nodes belong to 15 different parts of speech: NOUN (1649; 42% instances), VERB (679; 17% instances), ROOT (543; 14% instances), PROPN (397; 10% instances), NUM (363; 9% instances), SYM (70; 2% instances), ADJ (68; 2% instances), DET (67; 2% instances), ADP (42; 1% instances), X (31; 1% instances), PRON (9; 0% instances), ADV (5; 0% instances), PUNCT (5; 0% instances), INTJ (1; 0% instances), SCONJ (1; 0% instances)
2066 (53%) NUM nodes are leaves.
870 (22%) NUM nodes have one child.
408 (10%) NUM nodes have two children.
586 (15%) NUM nodes have three or more children.
The highest child degree of a NUM node is 18.
Children of NUM nodes are attached using 19 different relations: punct (1124; 28% instances), case (1006; 25% instances), mwe (614; 15% instances), parataxis (561; 14% instances), nmod (218; 5% instances), conj (115; 3% instances), det (87; 2% instances), cc (71; 2% instances), cop (59; 1% instances), nsubj (54; 1% instances), advmod (36; 1% instances), mark (20; 0% instances), acl (14; 0% instances), appos (13; 0% instances), amod (8; 0% instances), dobj (4; 0% instances), nummod (3; 0% instances), advcl (2; 0% instances), aux (2; 0% instances)
Children of NUM nodes belong to 16 different parts of speech: PUNCT (1125; 28% instances), ADP (1020; 25% instances), PROPN (739; 18% instances), NUM (363; 9% instances), NOUN (345; 9% instances), DET (95; 2% instances), CONJ (67; 2% instances), AUX (61; 2% instances), PRON (51; 1% instances), ADV (41; 1% instances), ADJ (32; 1% instances), VERB (25; 1% instances), SYM (15; 0% instances), X (14; 0% instances), PART (11; 0% instances), SCONJ (7; 0% instances)