
This page pertains to UD version 2.

Treebank Statistics: UD_Hindi-PUD: POS Tags: NUM

There is 1 NUM lemma (6%), and there are 236 NUM types (4%) and 452 NUM tokens (2%). Out of 16 observed tags, NUM ranks 8th in number of lemmas, 5th in number of types and 11th in number of tokens.
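The lemma, type and token counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the toy sentence in `SAMPLE` are invented for illustration. The toy data leaves every lemma as `_`, which mirrors why this treebank reports a single NUM lemma.

```python
from collections import defaultdict

# Toy CoNLL-U fragment, invented for illustration (not taken from UD_Hindi-PUD).
# Lemmas are "_" throughout, so every tag ends up with exactly one "lemma".
SAMPLE = (
    "# text = toy sentence\n"
    "1\tदो\t_\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tकिताबें\t_\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "3\tऔर\t_\tCCONJ\t_\t_\t5\tcc\t_\t_\n"
    "4\tतीन\t_\tNUM\t_\tNumType=Card\t5\tnummod\t_\t_\n"
    "5\tकलम\t_\tNOUN\t_\t_\t2\tconj\t_\t_\n"
)


def pos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or comment line
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


counts = pos_counts(SAMPLE)
print(counts["NUM"])  # (1, 2, 2): one lemma "_", two word forms, two tokens
```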

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: दो, तीन, एक, मिलियन, चार, 1, 10, 3, छह, दस

The 10 most frequent ambiguous lemmas: _ (NOUN 5597, ADP 4849, PUNCT 2297, VERB 2058, ADJ 1995, AUX 1776, PROPN 1358, PRON 1128, DET 876, CCONJ 545, NUM 452, SCONJ 382, PART 316, ADV 159, SYM 30, X 11)

The 10 most frequent ambiguous types: दो (NUM 35, NOUN 1), एक (DET 205, NUM 14, NOUN 8, ADP 1, PRON 1), दोनों (DET 7, NOUN 7, NUM 4, PART 2)

Morphology

The form / lemma ratio of NUM is 236.000000 (the average of all parts of speech is 345.375000).
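The ratio above appears to be the number of distinct NUM forms divided by the number of distinct NUM lemmas. A short sketch of that arithmetic, using the counts reported on this page (the helper name `form_lemma_ratio` is hypothetical, and how the cross-tag average is computed is not stated here, so it is not reproduced):

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Distinct word forms per distinct lemma for one part of speech."""
    return n_forms / n_lemmas


# 236 NUM types and 1 NUM lemma ("_"), as reported on this page.
num_ratio = form_lemma_ratio(236, 1)
print(f"{num_ratio:.6f}")  # 236.000000
```

Because the only lemma is the placeholder `_`, the ratio reflects missing lemma annotation and not morphological richness.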

The 1st highest number of forms (236) was observed with the lemma “_”: $1.4, $1.5, $103.7, $15,000, $221, 1, 1,165, 1,335, 1,365, 1.4, 1.5, 10, 10,000, 100, 100,000, 1000, 1072, 1075, 11, 12, 12,000, 120, 1200, 125, 1340, 1350, 137, 1399, 14, 1415, 1492, 15, 15,001, 15.5, 1519, 1530, 1538, 1563, 1566, 16, 16,500, 1600, 1610, 1632, 168,000, 17, 1770, 1777, 1794, 18, 1820, 1832, 1839, 1842, 1856, 1858, 1860, 1879, 1882, 1886, 1887, 1896, 19, 19,999, 1900, 1903, 1904, 1911, 1912, 1913, 1914, 1916, 1917, 1918, 1925, 1926, 1927, 1928, 1933, 1945, 1947, 1948, 1950, 1952, 1954, 1955, 1960, 1961, 1962, 1969, 1970, 1973, 1975, 1976, 1977, 1979, 1980, 1981, 1984, 1987, 1988, 1990, 1991, 1992, 1993, 1994, 1996, 1997, 1998, 2, 20, 200, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2013-2014, 2014, 2015, 2015-2016, 2016, 2017, 2019, 2020, 2035, 2050, 21, 23.45, 24, 25, 27, 28, 29, 2900, 3, 3,000, 30, 3000, 31, 328, 33, 330, 330,000, 3300, 34, 35,000, 352, 36, 360, 363, 367, 39, 393, 4, 40, 400, 42, 45, 49, 5, 5,000, 5.7, 50, 500, 5000, 512-511, 53, 550, 56, 6, 6,000, 6.30, 60, 600,000, 62, 66, 7, 7.5, 70, 700, 71, 760, 8, 80, 830-846, 833, 84, 9, 90, 96, III, VI, bn, अट्ठाइस, अठारह, आठ, एक, चार, चालीस, छह, जीरो, डेढ़, तीन, तीस, दस, दो, दोनों, नाइन, नौ, पंद्रह, पचास, पांच, फाइव, बिलयन, बिलियन, बीस, मिलयन, मिलियन, लाख, वन, सत्तर, सत्रह, साठ, सात, सौ, हजार.

NUM occurs with 1 feature: NumType (452; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination. The most frequent feature combination is NumType=Card (452 tokens). Examples: दो, तीन, एक, मिलियन, चार, 1, 10, 3, छह, दस

Relations

NUM nodes are attached to their parents using 14 different relations: nummod (277; 61% instances), obl:tmod (74; 16% instances), nmod:poss (40; 9% instances), obl (14; 3% instances), nmod (9; 2% instances), obj (9; 2% instances), appos (8; 2% instances), compound (5; 1% instances), conj (5; 1% instances), nsubj (3; 1% instances), root (3; 1% instances), amod (2; 0% instances), iobj (2; 0% instances), dep (1; 0% instances)
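As a consistency check, the relation counts listed above sum to the 452 NUM tokens, and the percentages follow from dividing each count by that total. The sketch below assumes plain rounding to the nearest whole percent, which matches the figures shown but is not confirmed as the method used to generate this page; the names `NUM_RELATIONS` and `percentages` are illustrative.

```python
# Counts copied from the list of incoming relations of NUM nodes on this page.
NUM_RELATIONS = {
    "nummod": 277, "obl:tmod": 74, "nmod:poss": 40, "obl": 14, "nmod": 9,
    "obj": 9, "appos": 8, "compound": 5, "conj": 5, "nsubj": 3, "root": 3,
    "amod": 2, "iobj": 2, "dep": 1,
}


def percentages(counts):
    """Map each relation to its share of the total, as a whole percent."""
    total = sum(counts.values())
    return {rel: round(100 * n / total) for rel, n in counts.items()}


total = sum(NUM_RELATIONS.values())
shares = percentages(NUM_RELATIONS)
print(total)                                                # 452
print(shares["nummod"], shares["obl:tmod"], shares["dep"])  # 61 16 0
```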

Parents of NUM nodes belong to 8 different parts of speech: NOUN (294; 65% instances), VERB (95; 21% instances), NUM (31; 7% instances), ADJ (17; 4% instances), PROPN (7; 2% instances), [no tag; the artificial root] (3; 1% instances), SYM (3; 1% instances), DET (2; 0% instances)

219 (48%) NUM nodes are leaves.

127 (28%) NUM nodes have one child.

69 (15%) NUM nodes have two children.

37 (8%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
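The leaf, one-child and multi-child figures above amount to a histogram of how many dependents each NUM node has. A minimal sketch of that computation follows, assuming each sentence is given as a list of `(id, upos, head)` tuples; that representation, the function name `child_degree_histogram` and the toy sentence are all invented for illustration.

```python
from collections import Counter


def child_degree_histogram(sentences, upos="NUM"):
    """Count nodes tagged `upos` by their number of children.

    `sentences` is a list of sentences, each a list of (id, upos, head)
    tuples, with head 0 standing for the artificial root.
    """
    hist = Counter()
    for sent in sentences:
        # Number of dependents per head id within this sentence.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1  # missing ids count as 0
    return hist


# Toy sentence: token 1 is a NUM leaf, token 3 is a NUM with two children.
toy = [[(1, "NUM", 2), (2, "NOUN", 0), (3, "NUM", 2),
        (4, "ADP", 3), (5, "PUNCT", 3)]]
hist = child_degree_histogram(toy)
print(sorted(hist.items()))  # [(0, 1), (2, 1)]
print(max(hist))             # 2, the highest child degree in the toy data
```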

Children of NUM nodes are attached using 21 different relations: case (149; 38% instances), punct (60; 15% instances), compound (58; 15% instances), det (32; 8% instances), nummod (24; 6% instances), dep (23; 6% instances), discourse (8; 2% instances), conj (7; 2% instances), cc (6; 2% instances), cop (4; 1% instances), acl:relcl (3; 1% instances), advmod (3; 1% instances), obl:tmod (3; 1% instances), amod (2; 1% instances), aux (2; 1% instances), fixed (2; 1% instances), nmod (2; 1% instances), nmod:poss (2; 1% instances), obj (2; 1% instances), acl (1; 0% instances), nsubj (1; 0% instances)

Children of NUM nodes belong to 14 different parts of speech: ADP (149; 38% instances), NOUN (63; 16% instances), PUNCT (60; 15% instances), DET (32; 8% instances), NUM (31; 8% instances), SYM (27; 7% instances), AUX (6; 2% instances), CCONJ (6; 2% instances), PART (5; 1% instances), ADJ (4; 1% instances), VERB (4; 1% instances), ADV (3; 1% instances), PROPN (2; 1% instances), X (2; 1% instances)