This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-PUD: POS Tags: NUM

There is 1 NUM lemma (7%), and there are 264 NUM types (5%) and 873 NUM tokens (4%). Out of 15 observed tags, NUM ranks 8th in number of lemmas, 6th in number of types and 8th in number of tokens.
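Counts of this kind can be reproduced from a CoNLL-U file by collecting the distinct lemmas and word forms that carry a given UPOS tag. The following is a minimal sketch, not the script that generated this page; the `ROWS` sample, `read_tokens` and `pos_counts` are hypothetical names introduced for illustration, and the sample sentences are hand-made rather than taken from the treebank.

```python
from collections import Counter

# Hypothetical three-sentence sample in CoNLL-U column order
# (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
# It is hand-made for illustration and not copied from the treebank.
ROWS = [
    ["1", "三", "_", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "人", "_", "NOUN", "_", "_", "0", "root", "_", "_"],
    [],  # an empty row becomes a blank line, i.e. a sentence boundary
    ["1", "兩", "_", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "天", "_", "NOUN", "_", "_", "0", "root", "_", "_"],
    [],
    ["1", "三", "_", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "年", "_", "NOUN", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def read_tokens(conllu_text):
    """Yield the ten columns of every basic token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols


def pos_counts(conllu_text, upos):
    """Count distinct lemmas, distinct forms (types) and tokens for one tag."""
    forms = Counter()
    lemmas = Counter()
    for cols in read_tokens(conllu_text):
        if cols[3] == upos:
            forms[cols[1]] += 1
            lemmas[cols[2]] += 1
    return {
        "lemmas": len(lemmas),
        "types": len(forms),
        "tokens": sum(forms.values()),
    }


print(pos_counts(SAMPLE, "NUM"))
```

Because every token in the sample has the placeholder lemma `_`, the lemma count is 1 regardless of how many forms occur, which mirrors the single NUM lemma reported above.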

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: 一、 兩、 很多、 三、 許多、 六、 多、 20、 10、 十

The 10 most frequent ambiguous lemmas: _ (NOUN 5410, VERB 3467, PUNCT 2902, PART 1881, PROPN 1361, ADP 1288, ADV 1283, NUM 873, PRON 710, ADJ 650, AUX 618, DET 355, X 306, CCONJ 283, SCONJ 28)

The 10 most frequent ambiguous types: 多 (NUM 14, ADJ 10, ADV 1), 天 (NOUN 10, NUM 1)
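An ambiguous type, as listed above, is a word form observed with more than one tag. A minimal sketch of how such a list could be derived follows; `ambiguous_types` is a hypothetical helper, and the input pairs are illustrative rather than drawn from the treebank.

```python
from collections import Counter, defaultdict


def ambiguous_types(tagged_tokens, upos):
    """Return forms tagged `upos` at least once and also with another tag.

    tagged_tokens: iterable of (form, upos_tag) pairs.
    """
    tags_by_form = defaultdict(Counter)
    for form, tag in tagged_tokens:
        tags_by_form[form][tag] += 1
    return {
        form: dict(tags)
        for form, tags in tags_by_form.items()
        if upos in tags and len(tags) > 1
    }


# Illustrative input only.
tokens = [("多", "NUM"), ("多", "ADJ"), ("多", "NUM"), ("三", "NUM"), ("天", "NOUN")]
print(ambiguous_types(tokens, "NUM"))
```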

Morphology

The form / lemma ratio of NUM is 264.000000 (the average of all parts of speech is 388.466667).
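The figure is consistent with dividing the number of distinct forms by the number of distinct lemmas: 264 types over 1 lemma gives 264. That reading is inferred from the numbers on this page rather than from the generating script, and `form_lemma_ratio` below is a hypothetical helper sketching it.

```python
def form_lemma_ratio(pairs):
    """Distinct forms divided by distinct lemmas.

    pairs: iterable of (form, lemma) tuples for tokens of one part of speech.
    """
    pairs = list(pairs)
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Illustrative input: three tokens, two distinct forms, one placeholder lemma.
print(form_lemma_ratio([("一", "_"), ("兩", "_"), ("一", "_")]))

# The value reported for NUM, formatted with six decimals as on this page.
print(f"{264 / 1:.6f}")
```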

The 1st highest number of forms (264) was observed with the lemma “_”: $15000, $150萬, $25,000, 1, 1%, 1.5, 10, 10%, 100, 100%, 1000, 103.7億, 1072, 1075, 10億, 10萬, 11, 11%, 1165, 12, 120, 1200, 13, 1335, 1340, 1365, 137, 1399, 14, 1415, 1492, 14億, 15, 150萬, 1519, 1530, 1538, 1550萬, 1563, 1566, 15,001, 16, 16,500, 1600, 160億, 1610, 1632, 16萬8千, 17, 1770, 1777, 1794, 18, 1820, 1832, 1839, 1842, 1856, 1858, 1860, 1867, 1879, 1882, 1886, 1887, 1896, 19, 1900, 1903, 1904, 1911, 1912, 1913, 1914, 1916, 1917, 1918, 1925, 1926, 1927, 1928, 1933, 1945, 1947, 1948, 1952, 1954, 1955, 1960, 1961, 1962, 1969, 1973, 1975, 1976, 1977, 1979, 1980, 1981, 1984, 1987, 1988, 1990, 1991, 1992, 1993, 1994, 1996, 1997, 1998, 19,999, 1萬, 20, 20%, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2013-2014, 2014, 2015, 2015-2016, 2016, 2017, 2019, 2020, 2035, 2050, 20億, 21, 2210億, 23, 24, 25, 27, 28, 2900, 2C, 3, 3%, 30, 3000, 31, 328, 33, 330, 3300, 33萬, 34, 35000分之一, 352, 36, 363, 367, 393, 4, 40%, 400, 4200萬, 45, 49, 5, 5,000, 50, 500, 5000, 511, 512, 53, 53%, 550, 56%, 57億, 6, 6%, 6.30-10am, 60, 6000, 60萬, 62%, 66%, 7, 70, 700, 70%, 71, 760, 8, 80, 80%, 830-846, 833, 84, 9, 90, 90%, £10,000, £12,000, £125, £3,000-£5,000, £360, 一, 一億, 一千萬, 七, 七十, 七百五十萬, 三, 三十, 三十九, 三百萬, 上百, 九, 二, 二十, 二百, 五, 五十, 五十萬, 五百萬, 兩, 兩千, 八, 六, 六十, 十, 十七, 十五, 十四, 十四億, 半, 單, 四, 四十, 四千, 多, 多少, 天, 好幾, 幾, 幾十, 幾十億, 很多, 數, 數十, 數千, 數百億, 數百萬, 新一, 百分之一, 許多, 近幾, 首家, 首次.

NUM occurs with 1 feature: NumType (873; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination. The most frequent feature combination is NumType=Card (873 tokens). Examples: 一、 兩、 很多、 三、 許多、 六、 多、 20、 10、 十
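The three levels counted above (features, feature-value pairs, and whole feature combinations) can be tallied from the FEATS column, where pairs are separated by `|` and `_` marks an empty value. This is a minimal sketch under that assumption; `feature_stats` is a hypothetical name.

```python
from collections import Counter


def feature_stats(feats_values):
    """Tally feature names, feature=value pairs and full combinations.

    feats_values: iterable of FEATS column strings for one part of speech.
    """
    features = Counter()
    pairs = Counter()
    combinations = Counter()
    for feats in feats_values:
        combinations[feats] += 1  # the whole column value is one combination
        if feats == "_":
            continue  # no features on this token
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combinations


# Illustrative input only.
features, pairs, combinations = feature_stats(["NumType=Card", "NumType=Card", "_"])
print(dict(features), dict(pairs), dict(combinations))
```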

Relations

NUM nodes are attached to their parents using 13 different relations: nummod (808; 93% instances), obj (24; 3% instances), conj (7; 1% instances), advmod (6; 1% instances), dep (6; 1% instances), nmod (6; 1% instances), nsubj (4; 0% instances), root (4; 0% instances), case:loc (2; 0% instances), ccomp (2; 0% instances), obl (2; 0% instances), obl:tmod (1; 0% instances), xcomp (1; 0% instances)

Parents of NUM nodes belong to 9 different categories (8 parts of speech, plus the artificial root, which has no part of speech): NOUN (804; 92% instances), VERB (43; 5% instances), NUM (7; 1% instances), ADJ (6; 1% instances), artificial root (4; 0% instances), X (3; 0% instances), ADP (2; 0% instances), PART (2; 0% instances), PROPN (2; 0% instances)
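Both distributions above come from following each NUM token's HEAD pointer: the DEPREL column gives the relation, and the head token's UPOS gives the parent part of speech. A head of 0 denotes the artificial root, which has no tag, so it shows up as an empty category. The sketch below assumes sentences are already parsed into tuples; `attachment_stats` is a hypothetical helper.

```python
from collections import Counter


def attachment_stats(sentences, upos):
    """Count incoming relations and parent tags for tokens tagged `upos`.

    sentences: list of sentences, each a list of
               (token_id, upos_tag, head_id, deprel) tuples.
    """
    relations = Counter()
    parent_pos = Counter()
    for sentence in sentences:
        pos_by_id = {tok_id: tag for tok_id, tag, _, _ in sentence}
        for _tok_id, tag, head, deprel in sentence:
            if tag != upos:
                continue
            relations[deprel] += 1
            # Head 0 is the artificial root: it has no entry, hence "".
            parent_pos[pos_by_id.get(head, "")] += 1
    return relations, parent_pos


# Illustrative input only: one numeral modifying a noun, one numeral as root.
sentences = [
    [(1, "NUM", 2, "nummod"), (2, "NOUN", 0, "root")],
    [(1, "NUM", 0, "root")],
]
relations, parent_pos = attachment_stats(sentences, "NUM")
print(dict(relations), dict(parent_pos))
```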

821 (94%) NUM nodes are leaves.

35 (4%) NUM nodes have one child.

4 (0%) NUM nodes have two children.

13 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
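The child-degree figures above partition all NUM nodes by how many dependents they have (821 + 35 + 4 + 13 = 873). A minimal sketch of such a histogram follows, assuming sentences are given as lists of tuples; `child_degree_histogram` is a hypothetical helper.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map number of children to number of `upos` nodes with that many.

    sentences: list of sentences, each a list of
               (token_id, upos_tag, head_id, deprel) tuples.
    """
    histogram = Counter()
    for sentence in sentences:
        # How many tokens point at each head id within this sentence.
        n_children = Counter(head for _, _, head, _ in sentence)
        for tok_id, tag, _, _ in sentence:
            if tag == upos:
                histogram[n_children[tok_id]] += 1
    return histogram


# Illustrative input only: one leaf numeral, one numeral with two children.
sentences = [
    [(1, "NUM", 2, "nummod"), (2, "NOUN", 0, "root")],
    [(1, "NOUN", 3, "nsubj"), (2, "AUX", 3, "cop"), (3, "NUM", 0, "root")],
]
histogram = child_degree_histogram(sentences, "NUM")
print(dict(histogram), max(histogram))
```

Degree 0 corresponds to leaves, and the largest key corresponds to the highest child degree.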

Children of NUM nodes are attached using 16 different relations: punct (15; 16% instances), nmod (14; 15% instances), cop (11; 12% instances), nsubj (11; 12% instances), case (9; 9% instances), cc (7; 7% instances), conj (7; 7% instances), advmod (6; 6% instances), appos (3; 3% instances), det (3; 3% instances), case:loc (2; 2% instances), compound (2; 2% instances), flat:name (2; 2% instances), dep (1; 1% instances), mark (1; 1% instances), obl:tmod (1; 1% instances)

Children of NUM nodes belong to 13 different parts of speech: NOUN (24; 25% instances), PUNCT (15; 16% instances), AUX (11; 12% instances), CCONJ (7; 7% instances), NUM (7; 7% instances), ADP (6; 6% instances), ADV (6; 6% instances), PART (6; 6% instances), PROPN (4; 4% instances), DET (3; 3% instances), PRON (2; 2% instances), VERB (2; 2% instances), X (2; 2% instances)