
This page pertains to UD version 2.

Treebank Statistics: UD_German-PUD: POS Tags: NUM

There is 1 NUM lemma (5%), 206 NUM types (3%) and 352 NUM tokens (2%). Out of 16 observed tags, NUM ranks 8th in number of lemmas, 6th in number of types and 12th in number of tokens.

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: zwei, drei, vier, 3, sechs, zehn, 1, 10, 50, 100

The 10 most frequent ambiguous lemmas: _ (NOUN 4261, PUNCT 2767, DET 2515, VERB 1913, ADP 1715, ADJ 1387, PROPN 1219, PRON 1185, ADV 1139, AUX 950, CCONJ 743, NUM 352, SCONJ 326, PART 144, X 31, SYM 22)

The 10 most frequent ambiguous types: 3 (NUM 7, ADJ 1), 1 (NUM 5, NOUN 1), - (PUNCT 178, NUM 2, CCONJ 1), 16 (NUM 2, ADJ 1), 31 (NOUN 2, NUM 2), 21. (NOUN 1, NUM 1), 4 (ADJ 1, NUM 1), III (NOUN 3, NUM 1)

Morphology

The form / lemma ratio of NUM is 206.000000 (the average of all parts of speech is 307.454545).
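The ratio of 206 follows from the counts above: every NUM token carries the lemma _ (the underscore that CoNLL-U conventionally uses for an unspecified field), so all 206 types are counted as forms of a single lemma. A minimal sketch of that arithmetic is given here, assuming the ratio is the number of distinct forms divided by the number of distinct lemmas. The function name `form_lemma_ratio` and the sample pairs are illustrative and not part of the tool that generated this page.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """Average number of distinct forms per lemma.

    `pairs` is an iterable of (form, lemma) tuples for one part of speech.
    Assumption: forms are compared exactly as written (case-sensitive).
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Illustrative sample: three distinct forms, all with the lemma "_".
sample = [("zwei", "_"), ("drei", "_"), ("zwei", "_"), ("3", "_")]
ratio = form_lemma_ratio(sample)
print(f"{ratio:.6f}")  # 3.000000
```

With the figures reported on this page (206 distinct forms, 1 lemma) the same division gives 206.000000.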

The 1st highest number of forms (206) was observed with the lemma “_”: -, 1, 1,165, 1,335, 1,4, 1,5, 10, 10.000, 100, 100.000, 1000, 103,7, 1072, 1075, 10:00, 11, 12.000, 120, 1200, 125, 1340, 137, 1399, 14, 1415, 1492, 15, 15,5, 15.000, 15.001, 1519, 1530, 1538, 1563, 1566, 16, 16.500, 1600, 1610, 1632, 168.000, 17, 1770, 1777, 1794, 1820, 1832, 1839, 1842, 1856, 1858, 1860, 1879, 1882, 1886, 1887, 1896, 19.999, 1900, 1903, 1904, 1911, 1912, 1913, 1914, 1916, 1917, 1918, 1925, 1926, 1927, 1928, 1933, 1945, 1947, 1948, 1952, 1954, 1955, 1956, 1960, 1961, 1962, 1969, 1973, 1975, 1976, 1977, 1979, 1980, 1981, 1984, 1987, 1988, 1990, 1991, 1992, 1993, 1994, 1996, 1997, 1998, 2, 20, 200, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2013-2014, 2014, 2015, 2015-2016, 2016, 2017, 2019, 2020, 2035, 2050, 21., 221, 23.45, 25.000, 28, 2900, 3, 3.000, 3000, 31, 328, 33, 330, 330.000, 3300, 34, 35.000, 352, 36, 360, 363, 367, 39, 393, 4, 40, 400, 42, 45, 49, 5, 5,7, 5.000, 50, 500, 5000, 511, 512, 53, 550, 56, 6, 6.000, 600.000, 62, 66, 6:30, 7, 7,5, 70, 700, 71, 760, 80, 830, 833, 84, 846, 9, 90, 96, Eins, III, Nine, acht, achtzehn, anderthalb, drei, fünf, neun, sechs, sieben, siebzehn, vier, zehn, zwei, zweitausend.

NUM occurs with 1 feature: NumType (352; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination. The most frequent feature combination is NumType=Card (352 tokens). Examples: zwei, drei, vier, 3, sechs, zehn, 1, 10, 50, 100

Relations

NUM nodes are attached to their parents using 9 different relations: nummod (174; 49% instances), obl:tmod (120; 34% instances), obl (17; 5% instances), nmod (13; 4% instances), conj (12; 3% instances), compound (10; 3% instances), nsubj (4; 1% instances), nsubj:pass (1; 0% instances), obj (1; 0% instances)
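The percentages in the relation list can be checked against the raw counts. The sketch below assumes each percentage is the count divided by the total number of NUM tokens, rounded to the nearest integer. The counts are copied from the list above, while the helper name `percentages` is illustrative.

```python
# Relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 174, "obl:tmod": 120, "obl": 17, "nmod": 13, "conj": 12,
    "compound": 10, "nsubj": 4, "nsubj:pass": 1, "obj": 1,
}


def percentages(counts):
    """Map each label to its share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


total = sum(relation_counts.values())  # 352, the number of NUM tokens
shares = percentages(relation_counts)
for label, share in shares.items():
    print(f"{label}: {relation_counts[label]} ({share}%)")
```

The counts sum to 352, matching the NUM token total, and relations with a single instance round down to 0%.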

Parents of NUM nodes belong to 7 different parts of speech: NOUN (245; 70% instances), VERB (61; 17% instances), SYM (22; 6% instances), NUM (16; 5% instances), ADJ (5; 1% instances), PROPN (2; 1% instances), DET (1; 0% instances)

250 (71%) NUM nodes are leaves.

75 (21%) NUM nodes have one child.

17 (5%) NUM nodes have two children.

10 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
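The child-degree figures above count how many dependents each NUM node has. A minimal sketch of that count follows, assuming each token is given as an (id, head, upos) triple, as in the ID, HEAD and UPOS columns of a CoNLL-U file. The function name `num_child_degrees` and the toy sentence are hypothetical and not taken from the treebank.

```python
from collections import Counter


def num_child_degrees(tokens):
    """Return the number of children of every NUM node in one sentence.

    `tokens` is a list of (id, head, upos) triples; head 0 marks the root.
    """
    children = Counter(head for _, head, _ in tokens)
    return [children[tok_id] for tok_id, _, upos in tokens if upos == "NUM"]


# Hypothetical toy sentence "etwa zwei Jahre": "etwa" depends on "zwei",
# and "zwei" depends on "Jahre".
toy = [(1, 2, "ADV"), (2, 3, "NUM"), (3, 0, "NOUN")]
degrees = num_child_degrees(toy)
print(degrees)  # [1]

# Sanity check on the figures reported above: the four groups
# (leaves, one child, two children, three or more) cover all NUM tokens.
reported = {"0": 250, "1": 75, "2": 17, "3+": 10}
print(sum(reported.values()))  # 352
```

A NUM node with degree 0 is a leaf; in the toy sentence the single NUM node has one child.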

Children of NUM nodes are attached using 13 different relations: advmod (40; 27% instances), case (31; 21% instances), punct (26; 18% instances), cc (11; 8% instances), conj (11; 8% instances), nmod (11; 8% instances), compound (5; 3% instances), det (3; 2% instances), cop (2; 1% instances), nsubj (2; 1% instances), obl:tmod (2; 1% instances), acl:relcl (1; 1% instances), cc:preconj (1; 1% instances)

Children of NUM nodes belong to 11 different parts of speech: ADV (38; 26% instances), ADP (33; 23% instances), PUNCT (26; 18% instances), NUM (16; 11% instances), CCONJ (12; 8% instances), PROPN (8; 5% instances), NOUN (6; 4% instances), DET (3; 2% instances), AUX (2; 1% instances), ADJ (1; 1% instances), VERB (1; 1% instances)