
This page pertains to UD version 2.

Treebank Statistics: UD_English-GUM: POS Tags: NUM

There are 51 NUM lemmas (1%), 378 NUM types (3%) and 1458 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 10 in number of lemmas, 6 in number of types and 14 in number of tokens.
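Counts like these can be derived from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page. The function name `count_pos` is hypothetical, and it assumes that types are distinct word forms compared case-sensitively and that multiword-token ranges and empty nodes are skipped.

```python
def count_pos(conllu_text, upos="NUM"):
    """Return (lemma count, type count, token count) for one UPOS tag.

    Assumes the standard 10-column CoNLL-U layout:
    ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
    """
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM, compared as-is (assumption)
            lemmas.add(cols[2])  # LEMMA
    return len(lemmas), len(types), tokens
```

Called on the full text of the treebank, the three returned numbers would correspond to the lemma, type and token counts reported above, provided the counting conventions match.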

The 10 most frequent NUM lemmas: @card@, one, two, 1, 2, 3, 4, four, 5, five

The 10 most frequent NUM types: one, two, 1, 2, 3, 15, 4, four, 10, 5

The 10 most frequent ambiguous lemmas: @card@ (NUM 796, PROPN 3, ADJ 1, X 1), one (NUM 156, NOUN 6, PRON 5), 1 (NUM 64, X 4), 2 (NUM 54, X 4), 3 (NUM 33, X 4), 4 (NUM 21, X 4), 5 (NUM 20, X 2), 6 (NUM 16, DET 1, X 1), million (NUM 14, NOUN 6), 9 (NUM 12, X 1)

The 10 most frequent ambiguous types: one (NUM 156, PRON 5, NOUN 1), 1 (NUM 64, X 4), 2 (NUM 54, X 4), 3 (NUM 33, X 4), 4 (NUM 21, X 4), 5 (NUM 20, X 2), 30 (NUM 16, PROPN 1), 6 (NUM 16, DET 1, X 1), 9 (NUM 13, X 1), 7 (NUM 9, X 1)

Morphology

The form / lemma ratio of NUM is 7.411765 (the average of all parts of speech is 1.227660).
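The ratio appears to be the number of NUM types divided by the number of NUM lemmas, using the counts reported at the top of this page (378 types, 51 lemmas). A short check of that arithmetic follows. The function name `form_lemma_ratio` is hypothetical, and reading "forms" as types is an assumption.

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_forms / n_lemmas

# Counts taken from this page: 378 NUM types, 51 NUM lemmas.
ratio = form_lemma_ratio(378, 51)
print(f"{ratio:.6f}")  # 7.411765
```

The high value is driven almost entirely by the single lemma "@card@", which collects 332 of the 378 forms.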

The 1st highest number of forms (332) was observed with the lemma “@card@”: .2, .4, .5, 0.05, 0.22, 0.3, 03:00, 0590258046, 0590854950, 0590854959, 0590883899, 0590920648, 06:30, 08, 08:00, 08:30, 1,000, 1,426, 1,537,058, 1.12, 1.428,000, 1.75, 1/2, 1/4, 10, 10,000, 10,694, 10.1, 100, 100,000, 1000, 107, 11, 11,000, 11.8, 110, 1100, 115, 12, 120, 1214, 1230, 12:30, 13, 1355, 14, 1423, 14:00, 15, 150, 1596, 1598, 15:11, 16, 160, 1602, 1647, 1656, 1662, 1669, 1696, 17, 1704, 1709, 1710, 1715, 1722, 1754, 1758, 177, 1774, 1776, 1776-77, 1777, 1778, 1779, 1780, 1783, 1787, 1789, 1790, 1791, 18, 180, 1801, 181,000, 1818, 1825, 1835, 1846, 1848, 1849, 185, 1859, 1860, 1865, 1868, 1871, 1872, 1877, 1880, 1881, 1887, 1888, 189, 1890, 1891, 1892, 1893, 1894, 1896, 19, 190, 1900, 1902, 1904, 1905, 1906, 1907, 1909, 1910, 1911, 1912, 1913, 1914, 1920, 1921, 1922, 1924, 1925, 1927, 1930, 1932, 1933, 1934, 1936, 1937, 1939, 1942, 1943, 1945, 1947, 1949, 1950, 1954, 1956, 1958, 1969, 1970, 1972, 1979, 1980, 1981, 1984, 1985, 1986, 1987, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1997, 1998, 1999, 2,360, 2.5, 2/3, 20, 20,000, 20.00, 200, 200,000, 2000, 20000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 201, 201-224-7900, 201-592-4699, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2022, 2050, 20:00, 21, 211, 213, 22, 225, 22:30, 23, 23.6, 238, 24, 240,000, 2470, 25, 25.1, 250, 250,000, 251, 26, 27, 275, 277, 28, 28,000, 29, 2:30, 3.7, 3/4, 30, 30,000, 300, 300,000, 3000, 31, 311, 315, 317, 319, 32, 320, 33, 335, 345, 35, 350, 354, 36, 363-0555, 39, 391,000, 394, 4,000, 4,489,109, 40, 40,000, 42, 42.65, 422,000, 424, 44, 45, 450, 46, 461-1776, 48, 49, 491,667, 5,000, 5,600, 50, 50,000, 50,818, 500, 5000, 50000, 508, 51,516, 529999204044, 529999420000, 53, 54, 542, 543, 56, 58, 58,825, 594, 60, 60,000, 60,760, 60.23, 600, 61, 62, 63, 64.75, 66, 672,000, 7.64, 70, 700, 71, 72, 73, 75, 750,000, 76, 760,000, 776, 8, 8.00, 80, 80,000, 8000, 80119, 84, 84,121, 842, 860, 866, 88, 89, 9, 90, 900, 92, 937,000, 94, 944-3737, 944-6800, 95, 96, 988,000.

The 2nd highest number of forms (1) was observed with the lemma “+1602275-4958”: +1602275-4958.

The 3rd highest number of forms (1) was observed with the lemma “+1918584-4428”: +1918584-4428.

NUM occurs with 1 feature: NumType (1458; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination, NumType=Card (1458 tokens). Examples: one, two, 1, 2, 3, 15, 4, four, 10, 5

Relations

NUM nodes are attached to their parents using 23 different relations: nummod (671; 46% instances), obl (160; 11% instances), nmod:tmod (150; 10% instances), nmod (127; 9% instances), dep (66; 5% instances), appos (59; 4% instances), compound (50; 3% instances), conj (46; 3% instances), root (28; 2% instances), nsubj (26; 2% instances), obj (24; 2% instances), obl:tmod (11; 1% instances), xcomp (10; 1% instances), advcl (9; 1% instances), amod (7; 0% instances), nsubj:pass (4; 0% instances), ccomp (3; 0% instances), parataxis (2; 0% instances), advmod (1; 0% instances), case (1; 0% instances), discourse (1; 0% instances), nmod:npmod (1; 0% instances), nmod:poss (1; 0% instances)
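The percentages above are consistent with each count divided by the 1458 NUM tokens and rounded to the nearest whole percent (for example, 671 / 1458 gives 46%, and 7 / 1458 rounds down to 0%). A minimal sketch of how such a breakdown could be computed from CoNLL-U text follows. The function name `relation_shares` is hypothetical, and this is not the script that generated the page.

```python
from collections import Counter

def relation_shares(conllu_text, upos="NUM"):
    """List (relation, count, rounded percent) for nodes with the given UPOS.

    Reads the DEPREL column (8th of 10) of each regular token line.
    """
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines: 10 columns and an integer ID.
        if len(cols) == 10 and cols[0].isdigit() and cols[3] == upos:
            counts[cols[7]] += 1
    total = sum(counts.values())
    # most_common() sorts by descending count, as in the listing above.
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]
```

Because every share is rounded independently, the printed percentages need not sum to exactly 100.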

Parents of NUM nodes belong to 13 different parts of speech, counting the artificial root node (which has no tag) as one: NOUN (707; 48% instances), VERB (262; 18% instances), PROPN (211; 14% instances), NUM (180; 12% instances), SYM (39; 3% instances), root node with no tag (28; 2% instances), ADJ (14; 1% instances), X (7; 0% instances), ADV (5; 0% instances), ADP (2; 0% instances), CCONJ (1; 0% instances), PART (1; 0% instances), PRON (1; 0% instances)

636 (44%) NUM nodes are leaves.

468 (32%) NUM nodes have one child.

187 (13%) NUM nodes have two children.

167 (11%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
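The child-degree figures above (leaves, one child, two children, three or more, and the maximum) can be recomputed by counting, for each NUM node, how many tokens in the same sentence name it in their HEAD column. The sketch below makes some assumptions: the function name `child_degree_buckets` is hypothetical, sentences are separated by blank lines, and multiword-token ranges and empty nodes are ignored.

```python
from collections import Counter

def child_degree_buckets(conllu_text, upos="NUM"):
    """Return (buckets, max_degree) for nodes with the given UPOS.

    buckets maps 0, 1, 2 to exact child counts and 3 to "three or more".
    """
    buckets = Counter()
    max_degree = 0
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep only regular token lines: 10 columns and an integer ID.
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # HEAD value -> number of tokens attached to that head.
        n_children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] != upos:
                continue
            degree = n_children[r[0]]  # Counter returns 0 for leaves
            buckets[min(degree, 3)] += 1
            max_degree = max(max_degree, degree)
    return buckets, max_degree
```

On this treebank the four buckets should sum to the 1458 NUM tokens, as the published counts do (636 + 468 + 187 + 167).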

Children of NUM nodes are attached using 25 different relations: punct (421; 29% instances), case (289; 20% instances), advmod (130; 9% instances), compound (121; 8% instances), nmod (121; 8% instances), nmod:tmod (82; 6% instances), conj (40; 3% instances), cc (39; 3% instances), cop (34; 2% instances), det (30; 2% instances), nsubj (30; 2% instances), dep (18; 1% instances), amod (14; 1% instances), acl:relcl (12; 1% instances), mark (11; 1% instances), nummod (9; 1% instances), acl (8; 1% instances), advcl (7; 0% instances), appos (6; 0% instances), aux (3; 0% instances), parataxis (3; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), nmod:npmod (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: PUNCT (425; 30% instances), ADP (306; 21% instances), NUM (180; 13% instances), PROPN (119; 8% instances), NOUN (84; 6% instances), ADV (83; 6% instances), ADJ (40; 3% instances), CCONJ (39; 3% instances), AUX (37; 3% instances), PRON (31; 2% instances), DET (30; 2% instances), VERB (24; 2% instances), SCONJ (12; 1% instances), SYM (11; 1% instances), PART (8; 1% instances), X (3; 0% instances)