This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

NUM: numeral

The English NUM tag corresponds exactly to the Penn Treebank (PTB) tag CD (cardinal number).


Treebank Statistics (UD_English)

There are 1184 NUM lemmas (6%), 1184 NUM types (5%) and 4496 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 14 in number of tokens.
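As a rough illustration of how lemma, type and token counts like these can be obtained, the sketch below counts them for one tag in a small CoNLL-U-style sample. The sample sentences and the function name `upos_counts` are hypothetical, columns are separated by spaces for readability (real CoNLL-U files use tabs), and treating differently cased forms as distinct types is an assumption.

```python
# Hypothetical sample, not taken from the treebank. Columns follow the
# CoNLL-U order: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
SAMPLE = """\
1 Two two NUM CD NumType=Card 2 nummod _ _
2 dogs dog NOUN NNS Number=Plur 3 nsubj _ _
3 ate eat VERB VBD Tense=Past 0 root _ _
4 2 2 NUM CD NumType=Card 5 nummod _ _
5 bones bone NOUN NNS Number=Plur 3 dobj _ _

1 I I PRON PRP Case=Nom 2 nsubj _ _
2 saw see VERB VBD Tense=Past 0 root _ _
3 two two NUM CD NumType=Card 2 dobj _ _
"""


def upos_counts(conllu, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu.splitlines():
        cols = line.split()
        # Skip blank lines, comments and anything that is not a plain token row.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)    # assumption: forms are compared case-sensitively
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


print(upos_counts(SAMPLE, "NUM"))  # (2, 3, 3)
```

In the sample, "Two" and "two" share one lemma but count as two types, which is why the type count can exceed the lemma count.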

The 10 most frequent NUM lemmas: one, two, 2, 3, 1, 5, 4, 10, three, 20

The 10 most frequent NUM types: one, two, 2, 3, 1, 5, 4, 10, three, 20

The 10 most frequent ambiguous lemmas: one (NUM 446, NOUN 144, PRON 26, VERB 1), 2 (NUM 140, X 30, PROPN 2, ADP 1, PART 1), 3 (NUM 119, X 17, NOUN 1), 1 (NUM 103, X 31), 5 (NUM 103, X 4, PROPN 1), 4 (NUM 95, X 13, ADP 1, SCONJ 1), 10 (NUM 93, X 2), 20 (NUM 63, NOUN 5), 6 (NUM 61, X 2), 12 (NUM 37, X 1)

The 10 most frequent ambiguous types: one (NUM 393, NOUN 104, PRON 22), 2 (NUM 140, X 30, PROPN 2, ADP 1, PART 1), 3 (NUM 119, X 17), 1 (NUM 103, X 31), 5 (NUM 103, X 4, PROPN 1), 4 (NUM 95, X 13, SCONJ 1, ADP 1), 10 (NUM 93, X 2), 20 (NUM 63, NOUN 3), 6 (NUM 61, X 2), 12 (NUM 37, X 1)
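The ambiguity lists above can be read as "lemmas (or forms) that occur with NUM and with at least one other tag". A minimal sketch of that idea follows; the sample rows, the function name `ambiguous` and the ordering by NUM frequency are assumptions made for illustration, not the procedure actually used to generate this page.

```python
from collections import Counter, defaultdict

# Hypothetical flat token list (tree structure is ignored here).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
SAMPLE = """\
1 One one NUM CD NumType=Card 0 root _ _
2 one one NUM CD NumType=Card 0 root _ _
3 one one PRON PRP _ 0 root _ _
4 2 2 NUM CD NumType=Card 0 root _ _
5 2 2 X LS _ 0 root _ _
6 three three NUM CD NumType=Card 0 root _ _
"""


def ambiguous(conllu, upos="NUM", column=2):
    """List (key, tag counts) for keys seen with `upos` and another tag.

    column=2 groups by lemma, column=1 groups by word form (type).
    """
    tags_by_key = defaultdict(Counter)
    for line in conllu.splitlines():
        cols = line.split()
        if len(cols) == 10 and cols[0].isdigit():
            tags_by_key[cols[column]][cols[3]] += 1
    hits = [(key, tags) for key, tags in tags_by_key.items()
            if upos in tags and len(tags) > 1]
    # Assumed ordering: most frequent as `upos` first.
    return sorted(hits, key=lambda item: -item[1][upos])


for key, tags in ambiguous(SAMPLE):
    print(key, dict(tags))
```

Grouping by form instead of lemma (`column=1`) keeps "One" and "one" apart, which is one reason the lemma and type counts for "one" differ on this page.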

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.173735).
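The reported ratios are consistent with dividing the number of distinct NUM forms (types) by the number of distinct NUM lemmas. The sketch below checks that reading against the counts given on this page; the definition is an assumption inferred from the figures, and the function name is hypothetical.

```python
# (types, lemmas) for NUM as reported on this page for each treebank.
REPORTED = {
    "UD_English": (1184, 1184),
    "UD_English-ESL": (1, 1),
    "UD_English-LinES": (125, 1),
}


def form_lemma_ratio(n_types, n_lemmas):
    """Assumed definition: distinct forms divided by distinct lemmas."""
    return n_types / n_lemmas


for treebank, (n_types, n_lemmas) in REPORTED.items():
    print(f"{treebank}: {form_lemma_ratio(n_types, n_lemmas):.6f}")
# UD_English: 1.000000
# UD_English-ESL: 1.000000
# UD_English-LinES: 125.000000
```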

The 1st highest number of forms (1) was observed with the lemma “’02”: ’02.

The 2nd highest number of forms (1) was observed with the lemma “’05”: ’05.

The 3rd highest number of forms (1) was observed with the lemma “’07”: ’07.

NUM occurs with 1 feature: en-feat/NumType (4496; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination. The most frequent feature combination is NumType=Card (4496 tokens). Examples: one, two, 2, 3, 1, 5, 4, 10, three, 20
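For readers working with the underlying files, feature-value pairs such as NumType=Card come from the FEATS column of CoNLL-U, where pairs are joined with `|` and `_` marks an empty value. The following is a minimal sketch of tallying them; the helper names and the three-item input list are hypothetical.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a FEATS string such as 'Number=Plur|Person=3' into a dict."""
    if feats == "_":          # '_' means no features are annotated
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_value_counts(feats_column):
    """Count each Name=Value pair over an iterable of FEATS strings."""
    counts = Counter()
    for feats in feats_column:
        for name, value in parse_feats(feats).items():
            counts[f"{name}={value}"] += 1
    return counts


# Hypothetical FEATS values of three NUM tokens.
num_feats = ["NumType=Card", "NumType=Card", "NumType=Card"]
print(feature_value_counts(num_feats))  # Counter({'NumType=Card': 3})
```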

Relations

NUM nodes are attached to their parents using 26 different relations: en-dep/nummod (2738; 61% instances), en-dep/nmod (500; 11% instances), en-dep/root (272; 6% instances), en-dep/appos (211; 5% instances), en-dep/compound (205; 5% instances), en-dep/list (115; 3% instances), en-dep/dobj (103; 2% instances), en-dep/nsubj (103; 2% instances), en-dep/conj (90; 2% instances), en-dep/nmod:tmod (64; 1% instances), en-dep/parataxis (18; 0% instances), en-dep/amod (13; 0% instances), en-dep/advcl (9; 0% instances), en-dep/nmod:npmod (9; 0% instances), en-dep/xcomp (9; 0% instances), en-dep/remnant (8; 0% instances), en-dep/ccomp (7; 0% instances), en-dep/advmod (6; 0% instances), en-dep/nsubjpass (4; 0% instances), en-dep/acl:relcl (3; 0% instances), en-dep/case (2; 0% instances), en-dep/det (2; 0% instances), en-dep/reparandum (2; 0% instances), en-dep/iobj (1; 0% instances), en-dep/nmod:poss (1; 0% instances), en-dep/vocative (1; 0% instances)

Parents of NUM nodes belong to 13 different parts of speech: NOUN (2310; 51% instances), PROPN (737; 16% instances), VERB (412; 9% instances), NUM (389; 9% instances), SYM (295; 7% instances), ROOT (272; 6% instances), ADJ (39; 1% instances), X (15; 0% instances), ADV (14; 0% instances), PRON (6; 0% instances), DET (5; 0% instances), AUX (1; 0% instances), PUNCT (1; 0% instances)
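The two lists above tally, for every NUM token, the relation to its head and the head's part of speech. A possible way to compute such tallies is sketched below; the sample sentences and the names `sentences`, `parent_stats` and `report` are hypothetical, columns are space-separated for readability, and the rounding used for percentages is an assumption.

```python
from collections import Counter

# Hypothetical sample. Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
SAMPLE = """\
1 Two two NUM CD NumType=Card 2 nummod _ _
2 dogs dog NOUN NNS Number=Plur 3 nsubj _ _
3 ate eat VERB VBD Tense=Past 0 root _ _
4 3 3 NUM CD NumType=Card 5 nummod _ _
5 bones bone NOUN NNS Number=Plur 3 dobj _ _

1 Five five NUM CD NumType=Card 0 root _ _
"""


def sentences(conllu):
    """Yield each sentence as a list of 10-column token rows."""
    sent = []
    for line in conllu.splitlines():
        cols = line.split()
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
        elif not line.strip() and sent:   # a blank line ends a sentence
            yield sent
            sent = []
    if sent:
        yield sent


def parent_stats(conllu, upos="NUM"):
    """Count incoming relations and parent tags of all `upos` tokens."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences(conllu):
        tag_of = {cols[0]: cols[3] for cols in sent}
        tag_of["0"] = "ROOT"              # head 0 is the artificial root
        for cols in sent:
            if cols[3] == upos:
                relations[cols[7]] += 1
                parent_pos[tag_of[cols[6]]] += 1
    return relations, parent_pos


def report(counter):
    """Format counts in the style used on this page."""
    total = sum(counter.values())
    return ", ".join(f"{key} ({n}; {round(100 * n / total)}% instances)"
                     for key, n in counter.most_common())


relations, parent_pos = parent_stats(SAMPLE)
print(report(relations))   # nummod (2; 67% instances), root (1; 33% instances)
print(report(parent_pos))  # NOUN (2; 67% instances), ROOT (1; 33% instances)
```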

3019 (67%) NUM nodes are leaves.

959 (21%) NUM nodes have one child.

269 (6%) NUM nodes have two children.

249 (6%) NUM nodes have three or more children.

The highest child degree of a NUM node is 13.
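The child-degree figures above partition all NUM tokens by their number of dependents, so the four counts should add up to the 4496 NUM tokens. The sketch below shows one way to compute such buckets on a small hypothetical sample and then checks the published sum; the function names and sample sentences are assumptions for illustration.

```python
from collections import Counter

# Hypothetical sample. Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
SAMPLE = """\
1 Two two NUM CD NumType=Card 2 nummod _ _
2 dogs dog NOUN NNS Number=Plur 3 nsubj _ _
3 ate eat VERB VBD Tense=Past 0 root _ _
4 3 3 NUM CD NumType=Card 5 nummod _ _
5 bones bone NOUN NNS Number=Plur 3 dobj _ _

1 One one NUM CD NumType=Card 0 root _ _
2 of of ADP IN _ 3 case _ _
3 them they PRON PRP Case=Acc 1 nmod _ _
"""


def sentences(conllu):
    """Yield each sentence as a list of 10-column token rows."""
    sent = []
    for line in conllu.splitlines():
        cols = line.split()
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
        elif not line.strip() and sent:
            yield sent
            sent = []
    if sent:
        yield sent


def child_degree_buckets(conllu, upos="NUM"):
    """Bucket `upos` tokens by number of children: 0, 1, 2, or 3 (= 3 or more)."""
    buckets, max_degree = Counter(), 0
    for sent in sentences(conllu):
        n_children = Counter(cols[6] for cols in sent)  # HEAD id -> child count
        for cols in sent:
            if cols[3] == upos:
                degree = n_children[cols[0]]
                max_degree = max(max_degree, degree)
                buckets[min(degree, 3)] += 1
    return buckets, max_degree


buckets, max_degree = child_degree_buckets(SAMPLE)
print(dict(buckets), max_degree)  # {0: 2, 1: 1} 1

# Consistency check on the figures reported above for UD_English.
published = {"leaves": 3019, "one child": 959, "two children": 269, "three or more": 249}
assert sum(published.values()) == 4496
```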

Children of NUM nodes are attached using 34 different relations: en-dep/case (517; 21% instances), en-dep/punct (399; 16% instances), en-dep/nmod (355; 14% instances), en-dep/advmod (207; 8% instances), en-dep/nmod:tmod (197; 8% instances), en-dep/compound (115; 5% instances), en-dep/conj (97; 4% instances), en-dep/cop (91; 4% instances), en-dep/cc (89; 4% instances), en-dep/nsubj (88; 4% instances), en-dep/nummod (85; 3% instances), en-dep/det (57; 2% instances), en-dep/parataxis (46; 2% instances), en-dep/amod (24; 1% instances), en-dep/appos (23; 1% instances), en-dep/acl:relcl (20; 1% instances), en-dep/mark (15; 1% instances), en-dep/nmod:npmod (13; 1% instances), en-dep/aux (11; 0% instances), en-dep/advcl (9; 0% instances), en-dep/remnant (7; 0% instances), en-dep/discourse (5; 0% instances), en-dep/neg (5; 0% instances), en-dep/acl (4; 0% instances), en-dep/nmod:poss (2; 0% instances), en-dep/reparandum (2; 0% instances), en-dep/cc:preconj (1; 0% instances), en-dep/ccomp (1; 0% instances), en-dep/csubj (1; 0% instances), en-dep/det:predet (1; 0% instances), en-dep/dobj (1; 0% instances), en-dep/goeswith (1; 0% instances), en-dep/list (1; 0% instances), en-dep/xcomp (1; 0% instances)

Children of NUM nodes belong to 17 different parts of speech: ADP (441; 18% instances), NOUN (420; 17% instances), PUNCT (390; 16% instances), NUM (389; 16% instances), ADV (187; 8% instances), VERB (163; 7% instances), SYM (112; 4% instances), CONJ (86; 3% instances), PRON (81; 3% instances), ADJ (77; 3% instances), DET (69; 3% instances), PROPN (46; 2% instances), AUX (11; 0% instances), SCONJ (8; 0% instances), PART (5; 0% instances), INTJ (3; 0% instances), X (3; 0% instances)


Treebank Statistics (UD_English-ESL)

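There is 1 NUM lemma (6%), 1 NUM type (6%) and 844 NUM tokens (1%). Out of 17 observed tags, the rank of NUM is: 9 in number of lemmas, 9 in number of types and 14 in number of tokens. The only lemma and type is the placeholder _, which appears in place of every word form and lemma in this treebank, so the lemma and type figures below carry no lexical information.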

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: _

The 10 most frequent ambiguous lemmas: _ (NOUN 15635, VERB 15080, PRON 10618, DET 10057, PUNCT 9580, ADP 8546, ADJ 5857, ADV 5704, AUX 4533, PART 3531, CONJ 3198, SCONJ 2516, PROPN 1795, NUM 844, INTJ 80, X 68, SYM 39)

The 10 most frequent ambiguous types: _ (NOUN 15635, VERB 15080, PRON 10618, DET 10057, PUNCT 9580, ADP 8546, ADJ 5857, ADV 5704, AUX 4533, PART 3531, CONJ 3198, SCONJ 2516, PROPN 1795, NUM 844, INTJ 80, X 68, SYM 39)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.000000).

The 1st highest number of forms (1) was observed with the lemma “_”: _.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 19 different relations: en-dep/nummod (525; 62% instances), en-dep/nmod (156; 18% instances), en-dep/root (31; 4% instances), en-dep/nsubj (28; 3% instances), en-dep/conj (27; 3% instances), en-dep/dobj (22; 3% instances), en-dep/compound (13; 2% instances), en-dep/appos (10; 1% instances), en-dep/nmod:tmod (6; 1% instances), en-dep/advcl (5; 1% instances), en-dep/ccomp (4; 0% instances), en-dep/parataxis (4; 0% instances), en-dep/acl:relcl (3; 0% instances), en-dep/xcomp (3; 0% instances), en-dep/goeswith (2; 0% instances), en-dep/nmod:npmod (2; 0% instances), en-dep/amod (1; 0% instances), en-dep/csubjpass (1; 0% instances), en-dep/det (1; 0% instances)

Parents of NUM nodes belong to 10 different parts of speech: NOUN (507; 60% instances), VERB (180; 21% instances), PROPN (38; 5% instances), NUM (36; 4% instances), SYM (35; 4% instances), ROOT (31; 4% instances), ADJ (11; 1% instances), ADV (3; 0% instances), PRON (2; 0% instances), PUNCT (1; 0% instances)

509 (60%) NUM nodes are leaves.

209 (25%) NUM nodes have one child.

52 (6%) NUM nodes have two children.

74 (9%) NUM nodes have three or more children.

The highest child degree of a NUM node is 9.

Children of NUM nodes are attached using 25 different relations: en-dep/case (163; 26% instances), en-dep/nmod (103; 16% instances), en-dep/punct (53; 8% instances), en-dep/cop (51; 8% instances), en-dep/nsubj (48; 8% instances), en-dep/advmod (47; 7% instances), en-dep/conj (32; 5% instances), en-dep/cc (30; 5% instances), en-dep/det (21; 3% instances), en-dep/compound (17; 3% instances), en-dep/amod (15; 2% instances), en-dep/mark (12; 2% instances), en-dep/acl:relcl (7; 1% instances), en-dep/parataxis (6; 1% instances), en-dep/appos (4; 1% instances), en-dep/neg (4; 1% instances), en-dep/advcl (3; 0% instances), en-dep/aux (3; 0% instances), en-dep/goeswith (3; 0% instances), en-dep/acl (2; 0% instances), en-dep/nummod (2; 0% instances), en-dep/csubj (1; 0% instances), en-dep/discourse (1; 0% instances), en-dep/nmod:poss (1; 0% instances), en-dep/xcomp (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: ADP (159; 25% instances), NOUN (99; 16% instances), VERB (80; 13% instances), ADV (54; 9% instances), PUNCT (52; 8% instances), PRON (37; 6% instances), NUM (36; 6% instances), CONJ (30; 5% instances), ADJ (26; 4% instances), DET (23; 4% instances), PROPN (15; 2% instances), SCONJ (8; 1% instances), PART (5; 1% instances), AUX (3; 0% instances), SYM (2; 0% instances), X (1; 0% instances)


Treebank Statistics (UD_English-LinES)

There is 1 NUM lemma (6%), 125 NUM types (1%) and 581 NUM tokens (1%). Out of 17 observed tags, the rank of NUM is: 9 in number of lemmas, 6 in number of types and 14 in number of tokens.

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: one, two, three, 2002, six, five, 2000, 1, 2, ten

The 10 most frequent ambiguous lemmas: _ (NOUN 14939, VERB 11076, PUNCT 10025, ADP 8281, DET 7865, PRON 7793, ADJ 5305, ADV 4610, AUX 3168, PROPN 2792, CONJ 2535, PART 2131, SCONJ 1512, NUM 581, INTJ 159, X 43, SYM 6)

The 10 most frequent ambiguous types: one (PRON 115, NUM 102, DET 10), 1 (NUM 14, ADJ 2), 12 (NUM 7, ADJ 1), 3 (NUM 5, ADJ 1), 5 (NUM 3, ADJ 1), 30 (NUM 2, ADJ 1), U (NUM 2, NOUN 1), 14 (NUM 1, ADJ 1), 22 (ADJ 2, NUM 1)

Morphology

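The form / lemma ratio of NUM is 125.000000 (the average of all parts of speech is 597.705882). Both values are inflated because every token in this treebank has the placeholder lemma “_”, so all forms of a part of speech are counted under a single lemma.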

The 1st highest number of forms (125) was observed with the lemma “_”: 01-Jul-1999, 08-Jul-1999, 1, 1-100, 10, 100, 100c, 101-200, 11.25, 11.30, 111, 12, 12.00, 12:30, 13, 14, 1857, 1875, 1910, 1945, 1947, 1950s, 1952, 1953, 1955, 1972, 1973, 1976, 1996, 1996-1997, 1997, 1998, 1999, 2, 2.6, 2000, 2002, 2005, 22, 23, 25, 3, 30, 31-Dec-1999, 37, 38, 4, 4-5, 40, 43, 4:30, 5, 5.5, 50, 50000, 6, 60, 6500, 7, 7.0, 7.15, 747, 84, 9, 96/23, 96/96/EC, 97, A4-0029/99, A4-0072/97, A4-0090/99, C4-0497/98-98/0126, H-0002/99, H-0045/99, H-0209/99, H-0218/97, H-0237/97, No-12, No-15, No-4, No-44, No-46, No-49, No-59, No-6, No-8, U, billion, eight, eight-and-a-half-by-eleven, eighteen, eleven, fifteen, fifty, five, forty, forty-eight, four, fourteen, half-a-dozen, hundred, million, n, nine, nineteen, nn, one, seven, six, six-forty-one, six-thirty, sixteen, sixty, ten, thirty, thirty-eight, thirty-five, thousand, three, twelve, twenty, twenty-five, twenty-four, twenty-six, twenty-two, two.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 15 different relations: en-dep/nummod (396; 68% instances), en-dep/nmod (57; 10% instances), en-dep/conj (36; 6% instances), en-dep/discourse (24; 4% instances), en-dep/nsubj (16; 3% instances), en-dep/root (15; 3% instances), en-dep/appos (13; 2% instances), en-dep/dobj (10; 2% instances), en-dep/name (4; 1% instances), en-dep/nsubjpass (3; 1% instances), en-dep/xcomp (3; 1% instances), en-dep/advmod (1; 0% instances), en-dep/ccomp (1; 0% instances), en-dep/compound (1; 0% instances), en-dep/dislocated (1; 0% instances)

Parents of NUM nodes belong to 11 different parts of speech: NOUN (360; 62% instances), VERB (95; 16% instances), NUM (52; 9% instances), PROPN (39; 7% instances), ROOT (15; 3% instances), ADJ (5; 1% instances), SYM (5; 1% instances), ADV (4; 1% instances), PRON (4; 1% instances), ADP (1; 0% instances), AUX (1; 0% instances)

354 (61%) NUM nodes are leaves.

114 (20%) NUM nodes have one child.

69 (12%) NUM nodes have two children.

44 (8%) NUM nodes have three or more children.

The highest child degree of a NUM node is 16.

Children of NUM nodes are attached using 19 different relations: en-dep/case (92; 22% instances), en-dep/nmod (60; 14% instances), en-dep/punct (59; 14% instances), en-dep/advmod (43; 10% instances), en-dep/conj (41; 10% instances), en-dep/compound (31; 7% instances), en-dep/cc (25; 6% instances), en-dep/det (16; 4% instances), en-dep/nummod (16; 4% instances), en-dep/mwe (12; 3% instances), en-dep/appos (8; 2% instances), en-dep/cop (6; 1% instances), en-dep/nsubj (6; 1% instances), en-dep/amod (5; 1% instances), en-dep/acl (2; 0% instances), en-dep/acl:relcl (1; 0% instances), en-dep/aux (1; 0% instances), en-dep/mark (1; 0% instances), en-dep/parataxis (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADP (91; 21% instances), NOUN (87; 20% instances), PUNCT (59; 14% instances), NUM (52; 12% instances), ADV (46; 11% instances), CONJ (31; 7% instances), DET (16; 4% instances), ADJ (12; 3% instances), VERB (11; 3% instances), PROPN (9; 2% instances), PRON (7; 2% instances), AUX (2; 0% instances), PART (1; 0% instances), SCONJ (1; 0% instances), X (1; 0% instances)

