
This page pertains to UD version 2.

Treebank Statistics: UD_Korean-Kaist: POS Tags: NUM

There are 1372 NUM lemmas (1%), 1353 NUM types (1%) and 4848 NUM tokens (1%). Out of 17 observed tags, the rank of NUM is: 8 in number of lemmas, 8 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: 한, 두, 1, 하나+의, 세, 2, 3, 10, 5, 하나+는

The 10 most frequent NUM types: 한, 두, 1, 하나의, 세, 2, 3, 10, 5, 하나는

The 10 most frequent ambiguous lemmas: 한 (NUM 578, ADJ 69, NOUN 46, PROPN 32, DET 4), 두 (NUM 375, ADJ 17), 하나+의 (NUM 104, NOUN 71), 세 (NUM 101, NOUN 40, ADJ 7), 하나+는 (NUM 56, NOUN 28), 첫 (NUM 35, ADJ 13), 네 (NUM 32, PRON 28, ADJ 3), 만 (NUM 25, NOUN 8, ADJ 1, ADP 1), 하나 (NUM 23, NOUN 11, CCONJ 1), 3+천 (NUM 21, NOUN 1)

The 10 most frequent ambiguous types: 한 (NUM 577, VERB 173, ADJ 69, NOUN 46, AUX 41, PROPN 32, DET 4, PART 2), 두 (NUM 375, ADJ 17), 1 (NUM 132, NOUN 1), 하나의 (NUM 104, NOUN 71), 세 (NUM 101, NOUN 40, ADJ 7), 2 (NUM 95, NOUN 1), 10 (NUM 67, NOUN 1), 하나는 (NUM 56, NOUN 28, ADV 1), 둘째 (NUM 40, NOUN 2), 첫째 (NUM 36, NOUN 2)

Morphology

The form / lemma ratio of NUM is 0.986152 (the average of all parts of speech is 0.998034).
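The reported ratio is consistent with dividing the number of distinct NUM word forms (1353 types) by the number of distinct NUM lemmas (1372). The following is a minimal sketch of that computation, not the script that generated this page. The function name and the (form, lemma, upos) input shape are assumptions made for illustration.

```python
def form_lemma_ratio(tokens, upos="NUM"):
    """Distinct forms divided by distinct lemmas for one POS tag.

    tokens: iterable of (form, lemma, upos) triples (hypothetical input shape,
    e.g. columns 2, 3 and 4 of a CoNLL-U file).
    """
    forms, lemmas = set(), set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    return len(forms) / len(lemmas) if lemmas else 0.0


# Toy input: the lemma "한" has two forms, the lemma "두" has one;
# the ADJ reading of "한" is ignored.
toy = [
    ("한", "한", "NUM"),
    ("한마디로", "한", "NUM"),
    ("두", "두", "NUM"),
    ("한", "한", "ADJ"),
]
toy_ratio = form_lemma_ratio(toy)

# Check against the counts reported on this page: 1353 types, 1372 lemmas.
page_ratio = round(1353 / 1372, 6)
```

A ratio below 1 means that there are more distinct lemmas than distinct forms, which can happen here because lemmas carry morpheme segmentation (for example, “하나+의”).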

The 1st highest number of forms (2) was observed with the lemma “60+대+가”: 60년대가, 60대가.

The 2nd highest number of forms (2) was observed with the lemma “첫째”: 저체, 첫째.

The 3rd highest number of forms (2) was observed with the lemma “한”: 한, 한마디로.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 13 different relations: nummod (3303; 68% instances), compound (550; 11% instances), nmod (385; 8% instances), dislocated (230; 5% instances), nsubj (130; 3% instances), obj (122; 3% instances), conj (72; 1% instances), obl (23; 0% instances), csubj (16; 0% instances), advcl (6; 0% instances), root (6; 0% instances), dep (4; 0% instances), xcomp (1; 0% instances)

Parents of NUM nodes belong to 13 different parts of speech: NOUN (2506; 52% instances), ADV (824; 17% instances), VERB (596; 12% instances), NUM (364; 8% instances), SYM (246; 5% instances), CCONJ (118; 2% instances), SCONJ (107; 2% instances), ADJ (42; 1% instances), PROPN (23; 0% instances), X (14; 0% instances), [no part of speech; artificial root node] (6; 0% instances), PART (1; 0% instances), PRON (1; 0% instances)
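The percentages in these lists can be reproduced from the raw counts. The sketch below assumes each share is the count divided by the total number of NUM tokens, rounded to the nearest whole percent. The function name is hypothetical, and the counts are copied from the relation list on this page.

```python
def shares(counts):
    """Return (label, count, percent) triples, most frequent first.

    Percentages are rounded to the nearest integer, which is an assumption
    about how this page was produced.
    """
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    return [(label, n, round(100 * n / total)) for label, n in ranked]


# Relation counts for NUM nodes, as listed on this page.
num_relations = {
    "nummod": 3303, "compound": 550, "nmod": 385, "dislocated": 230,
    "nsubj": 130, "obj": 122, "conj": 72, "obl": 23, "csubj": 16,
    "advcl": 6, "root": 6, "dep": 4, "xcomp": 1,
}

relation_shares = shares(num_relations)
total_num_tokens = sum(num_relations.values())
```

The counts sum to 4848, the number of NUM tokens, and the 6 root relations correspond to the 6 parents that have no part of speech.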

3740 (77%) NUM nodes are leaves.

832 (17%) NUM nodes have one child.

218 (4%) NUM nodes have two children.

58 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
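The child-degree figures above count, for every NUM token, how many other tokens name it as their head. A minimal sketch of that count follows. The (id, head, upos) input shape and the function name are assumptions for illustration, and the toy sentence is invented rather than taken from the treebank.

```python
from collections import Counter


def child_degree_histogram(sentences, upos="NUM"):
    """Map number of children -> number of tokens with the given POS tag.

    sentences: list of sentences, each a list of (id, head, upos) triples
    with integer ids and head 0 for the root (hypothetical input shape).
    """
    hist = Counter()
    for sent in sentences:
        # How many tokens point at each head id within this sentence.
        n_children = Counter(head for _, head, _ in sent)
        for tok_id, _, tag in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1
    return hist


# Invented toy sentence: tokens 1 and 4 are NUM leaves, token 5 is a NUM
# with one child (token 4).
toy_sentence = [
    (1, 2, "NUM"), (2, 3, "NOUN"), (3, 0, "VERB"),
    (4, 5, "NUM"), (5, 6, "NUM"), (6, 3, "NOUN"),
]
toy_hist = child_degree_histogram([toy_sentence])

# Consistency check on the page's own figures: the four degree classes
# should add up to the 4848 NUM tokens.
page_total = 3740 + 832 + 218 + 58
```

On this page, the four degree classes do add up to 4848, so every NUM token falls into exactly one of them.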

Children of NUM nodes are attached using 21 different relations: compound (483; 33% instances), punct (228; 16% instances), case (145; 10% instances), nummod (125; 9% instances), nmod (105; 7% instances), amod (93; 6% instances), conj (87; 6% instances), acl (33; 2% instances), advmod (32; 2% instances), obl (28; 2% instances), det (22; 2% instances), dislocated (22; 2% instances), cop (15; 1% instances), advcl (11; 1% instances), nsubj (11; 1% instances), cc (8; 1% instances), appos (4; 0% instances), ccomp (2; 0% instances), obj (2; 0% instances), clf (1; 0% instances), dep (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: NOUN (413; 28% instances), NUM (364; 25% instances), PUNCT (228; 16% instances), ADP (139; 10% instances), ADJ (90; 6% instances), ADV (62; 4% instances), VERB (36; 2% instances), PROPN (34; 2% instances), DET (22; 2% instances), SYM (21; 1% instances), AUX (15; 1% instances), CCONJ (13; 1% instances), X (8; 1% instances), PRON (6; 0% instances), PART (5; 0% instances), SCONJ (2; 0% instances)