
This page pertains to UD version 2.

Treebank Statistics: UD_Korean: POS Tags: NUM

There is 1 NUM lemma (3%), and there are 218 NUM types (1%) and 532 NUM tokens (1%). Out of 11 observed tags, NUM ranks 8th in number of lemmas, 5th in number of types and 9th in number of tokens.

The 10 most frequent NUM lemmas: _

The 10 most frequent NUM types: 한, 두, 첫, 세, 하나, 1, 다섯, 하나는, 하나의, 네

The 10 most frequent ambiguous lemmas: _ (NOUN 32099, VERB 18517, ADV 11605, ADJ 2715, PUNCT 1972, ADP 835, PRON 677, DET 539, NUM 532, CCONJ 176, X 23)

The 10 most frequent ambiguous types: 한 (NUM 134, VERB 37, NOUN 5, ADV 4), 첫 (NUM 25, NOUN 1), 하나 (NUM 10, ADV 2, VERB 2, NOUN 1), 3 (NUM 4, NOUN 2), 50 (NUM 4, NOUN 2), 수십 (NUM 4, NOUN 1), 10 (NUM 3, NOUN 2), 2010 (NUM 3, NOUN 1), 12 (NUM 2, NOUN 1), 1의 (NUM 2, NOUN 1)

Morphology

The form / lemma ratio of NUM is 218.000000 (the average of all parts of speech is 963.631579).
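The ratio above is consistent with dividing the number of distinct NUM forms (218) by the number of distinct NUM lemmas (1). A minimal sketch of that calculation follows, assuming this is how the figure is derived; the function name `form_lemma_ratio` and the toy token list are hypothetical and not taken from the treebank tooling.

```python
def form_lemma_ratio(tokens, upos):
    """Distinct word forms divided by distinct lemmas for one UPOS tag.

    tokens: iterable of (form, lemma, upos) tuples.
    """
    forms, lemmas = set(), set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    if not lemmas:
        raise ValueError(f"no tokens tagged {upos}")
    return len(forms) / len(lemmas)


# Toy data (hypothetical): every lemma is the placeholder "_",
# mirroring the single NUM lemma reported on this page.
sample = [
    ("한", "_", "NUM"),
    ("두", "_", "NUM"),
    ("한", "_", "NUM"),   # repeated form, counted once
    ("세", "_", "NUM"),
    ("집", "_", "NOUN"),
]

ratio = form_lemma_ratio(sample, "NUM")
print(ratio)  # 3 distinct forms over 1 distinct lemma
```

Because the lemma column holds only the placeholder "_", the ratio for any tag equals its number of distinct forms, which is why the values on this page are so large.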

The 1st highest number of forms (218) was observed with the lemma “_”: “6.2, “한, ‘1, ‘2008, ‘2009, ‘한, 0.89~0.92의, 04, 08-09, 1, 1,800만, 1.1, 10, 10.29, 1000, 100억, 100이, 10~15, 10만, 11만, 11억5천, 11억8천만~11억9천만, 12, 12.12, 120, 125만, 128, 12억, 13, 13-1, 13:00를, 13만, 13은, 14, 140만, 1411, 142여, 144억, 149억, 14억3천만, 1500-1000, 153, 1536, 17, 1735억, 175억, 18-55, 18.5, 1931-32, 1932-33, 1973, 1982, 198억, 1억, 1을, 1의, 1이, 1조1577억, 1조2000억, 1조4천352억, 1조6400억, 1천억, 1천여, 2, 2,700만, 2-1, 20, 2002, 2004-05, 2007, 2007-08, 2008-09, 2009, 2010, 2010-11, 2011, 2012, 2030, 20만, 20억, 216, 24, 243, 24만, 25, 25-400, 2500, 250만, 253, 26, 26조6000억, 27억, 29, 2백만, 2뿐만, 2조3027억, 2조3678억, 3, 3,450만, 3-2, 3.3, 3.3~8.9, 30, 300, 3000, 30만, 30억, 31, 321억, 324, 34억, 34억5천만, 350, 3854억, 388조6000억, 3억, 3의, 3천만, 3천만~4천, 4, 4-2-3-1, 4.27, 4.5, 40,000, 44, 45, 45의, 487억, 48억8천만, 4백만, 4억, 4천, 5, 5.31, 5.6, 50, 500, 50억, 512억, 5157억, 5를, 5만, 5억, 5천, 5천만, 6.2, 600만, 615억, 6만, 6천만, 7, 7-1, 7.3의, 7.4, 719, 763억, 7ㆍ4, 8, 8000억, 804억, 814억, 8500억, 88, 8만, 8억, 8천, 900, 90만, 992,000만의, 994억, 9천, II, III의, 네, 다섯, 두, 두어, 둘, 둘을, 둘이, 둘이서, 만, 반, 사십, 삼, 서너, 서른, 세, 셋이, 수만, 수백, 수백억, 수십, 수십만, 수십억, 수천의, 십, 아홉, 오만, 오천, 이백팔십사를, 이십, 제2, 제3, 천, 첫, 첫째, 칠, 하나, 하나가, 하나는, 하나를, 하나만, 하나만을, 하나의, 하나이기도, 한, 한두.

NUM occurs with 1 feature: NumType (532; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination. The most frequent feature combination is NumType=Card (532 tokens). Examples: 한, 두, 첫, 세, 하나, 1, 다섯, 하나는, 하나의, 네

Relations

NUM nodes are attached to their parents using 14 different relations: nummod (402; 76% instances), flat (48; 9% instances), conj (23; 4% instances), nsubj (16; 3% instances), advmod (14; 3% instances), det:poss (9; 2% instances), obj (8; 2% instances), nmod (4; 1% instances), dep (3; 1% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), appos (1; 0% instances), nsubj:pass (1; 0% instances), root (1; 0% instances)
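A tally like the one above can be reproduced from a CoNLL-U file by counting the DEPREL column for every token whose UPOS is NUM. The sketch below assumes the standard ten-column CoNLL-U layout; the function name `deprel_counts` and the two sample sentences are made up for illustration and do not come from the treebank.

```python
from collections import Counter


def deprel_counts(conllu_text, upos="NUM"):
    """Count DEPREL values (column 8) of tokens whose UPOS (column 4) matches."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts


# Hypothetical sample: two short sentences, one NUM token in each.
sentences = [
    [
        ["1", "두", "_", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
        ["2", "사람", "_", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
        ["3", "왔다", "_", "VERB", "_", "_", "0", "root", "_", "_"],
    ],
    [
        ["1", "하나", "_", "NUM", "_", "NumType=Card", "2", "nsubj", "_", "_"],
        ["2", "있다", "_", "VERB", "_", "_", "0", "root", "_", "_"],
    ],
]
sample_text = "\n\n".join(
    "\n".join("\t".join(row) for row in sent) for sent in sentences
)

counts = deprel_counts(sample_text)
for rel, n in counts.most_common():
    print(rel, n)
```

Run on the full treebank, the counts should sum to the number of NUM tokens (532 on this page).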

Parents of NUM nodes belong to 7 different parts of speech: NOUN (363; 68% instances), ADV (87; 16% instances), VERB (50; 9% instances), NUM (26; 5% instances), ADJ (4; 1% instances), ADP (1; 0% instances), no part of speech, i.e. the parent of the single NUM node attached as root (1; 0% instances)

449 (84%) NUM nodes are leaves.

46 (9%) NUM nodes have one child.

19 (4%) NUM nodes have two children.

18 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
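The four child-degree counts above (449, 46, 19 and 18) sum to 532, the total number of NUM tokens. The following sketch shows one way such a breakdown could be computed from head indices, assuming each sentence is given as a list of (id, upos, head) tuples; the helper names `num_child_degrees` and `bucket` and the sample sentences are hypothetical.

```python
from collections import Counter


def num_child_degrees(sentences, upos="NUM"):
    """Return the number of children of every token tagged `upos`.

    sentences: list of sentences, each a list of (id, upos, head) tuples,
    where head is the id of the parent (0 for the root).
    """
    degrees = []
    for sent in sentences:
        # How many tokens in this sentence point at each head id.
        children = Counter(head for _, _, head in sent)
        degrees.extend(children[tok_id] for tok_id, tag, _ in sent if tag == upos)
    return degrees


def bucket(degrees):
    """Group degrees into leaves, one child, two children, three or more."""
    hist = Counter(min(d, 3) for d in degrees)
    return {
        "leaves": hist[0],
        "one": hist[1],
        "two": hist[2],
        "three_plus": hist[3],
    }


# Hypothetical sample: a leaf NUM, and a NUM root with two children.
sample_sentences = [
    [(1, "NUM", 2), (2, "NOUN", 3), (3, "VERB", 0)],
    [(1, "NOUN", 2), (2, "NUM", 0), (3, "PUNCT", 2)],
]

degrees = num_child_degrees(sample_sentences)
summary = bucket(degrees)
print(summary, "max degree:", max(degrees))
```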

Children of NUM nodes are attached using 14 different relations: flat (62; 39% instances), punct (35; 22% instances), conj (25; 16% instances), det:poss (9; 6% instances), advmod (7; 4% instances), nmod (6; 4% instances), case (4; 3% instances), acl:relcl (2; 1% instances), dep (2; 1% instances), nsubj (2; 1% instances), amod (1; 1% instances), appos (1; 1% instances), cc (1; 1% instances), det (1; 1% instances)

Children of NUM nodes belong to 11 different parts of speech: NOUN (69; 44% instances), PUNCT (35; 22% instances), NUM (26; 16% instances), ADV (14; 9% instances), ADP (4; 3% instances), VERB (4; 3% instances), ADJ (2; 1% instances), CCONJ (1; 1% instances), DET (1; 1% instances), PRON (1; 1% instances), X (1; 1% instances)