
This page pertains to UD version 2.

Treebank Statistics: UD_Korean-PUD: POS Tags: NUM

There are 196 NUM lemmas (7%), 218 NUM types (3%) and 513 NUM tokens (3%). Out of 13 observed tags, the rank of NUM is: 3 in number of lemmas, 5 in number of types and 8 in number of tokens.

The 10 most frequent NUM lemmas: 1, _, 10, 3, 2, 4, 6, 20, 8, 5

The 10 most frequent NUM types: 1, 10, 3, 2, 4, 6, 20, 8, 5, 9

The 10 most frequent ambiguous lemmas: _ (NOUN 4295, VERB 1439, PROPN 1030, ADJ 596, ADV 516, DET 462, CCONJ 125, AUX 104, X 47, NUM 27, PRON 24, PUNCT 1), 만 (PART 5, NUM 1)

The 10 most frequent ambiguous types: 둘 (AUX 1, NUM 1), 만 (DET 12, PART 5, NUM 1), 억 (DET 6, NUM 1), 천 (DET 3, NUM 1)
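Ambiguity lists like the ones above can be reproduced by grouping word forms by the tags they occur with. The following is a minimal sketch, assuming the treebank has already been read into (form, UPOS) pairs. The function name `ambiguous_types` is hypothetical and not part of any UD tool, and the sample input simply mirrors the counts reported above for 억 and 천, plus one unambiguous form.

```python
from collections import Counter, defaultdict

def ambiguous_types(pairs):
    """Return {form: Counter of tags} for forms seen with more than one tag."""
    by_form = defaultdict(Counter)
    for form, upos in pairs:
        by_form[form][upos] += 1
    return {form: tags for form, tags in by_form.items() if len(tags) > 1}

# Illustrative input built from the counts listed above (not read from the treebank).
pairs = (
    [("억", "DET")] * 6 + [("억", "NUM")]
    + [("천", "DET")] * 3 + [("천", "NUM")]
    + [("1", "NUM")] * 2  # occurs with a single tag, so it is filtered out
)

result = ambiguous_types(pairs)
for form, tags in result.items():
    print(form, tags.most_common())
```

The same grouping applied to (lemma, UPOS) pairs would give the ambiguous lemmas instead.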

Morphology

The form / lemma ratio of NUM is 1.112245 (the average of all parts of speech is 3.181543).
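The reported ratio matches the number of NUM types divided by the number of NUM lemmas given earlier on this page. A quick check, assuming that is how the ratio is defined (the page does not state the formula explicitly):

```python
num_types = 218   # distinct NUM word forms, from the summary above
num_lemmas = 196  # distinct NUM lemmas, from the summary above

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")
```

A value close to 1 means that most NUM lemmas appear in only one surface form.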

The 1st highest number of forms (17) was observed with the lemma “_”: I, II, III, IV, V, VI, X, 둘, 만, 백, 억, 제1, 제3, 제45, 천, 첫째, 하나.

The 2nd highest number of forms (4) was observed with the lemma “하나”: 하나는, 하나를, 하나에는, 하나와.

The 3rd highest number of forms (2) was observed with the lemma “1”: 1, 1은.

NUM occurs with 3 features: NumType (513; 100% instances), Polite (11; 2% instances), Case (5; 1% instances)

NUM occurs with 4 feature-value pairs: Case=Acc, Case=Nom, NumType=Card, Polite=Form

NUM occurs with 4 feature combinations. The most frequent feature combination is NumType=Card (502 tokens). Examples: 1, 10, 3, 2, 4, 6, 20, 8, 5, 9

Relations

NUM nodes are attached to their parents using 7 different relations: nummod (487; 95% instances), obl (10; 2% instances), appos (4; 1% instances), compound (4; 1% instances), nsubj (4; 1% instances), obj (2; 0% instances), root (2; 0% instances)
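The percentages above appear to be each relation's share of all 513 NUM tokens, rounded to the nearest whole percent. A small sketch under that assumption, using the counts from the line above:

```python
# Counts of the relations that attach NUM nodes to their parents (from the list above).
relation_counts = {
    "nummod": 487, "obl": 10, "appos": 4, "compound": 4,
    "nsubj": 4, "obj": 2, "root": 2,
}

total = sum(relation_counts.values())  # should equal the number of NUM tokens
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, pct in percentages.items():
    print(f"{rel}: {relation_counts[rel]} ({pct}%)")
```

Under this rounding, obj and root are shown as 0% because 2 out of 513 is roughly 0.4%.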

Parents of NUM nodes belong to 8 different parts of speech: NOUN (461; 90% instances), DET (25; 5% instances), NUM (13; 3% instances), VERB (7; 1% instances), ADJ (3; 1% instances), no part of speech, i.e. the artificial root (2; 0% instances), PART (1; 0% instances), PROPN (1; 0% instances)

470 (92%) NUM nodes are leaves.

33 (6%) NUM nodes have one child.

8 (2%) NUM nodes have two children.

2 (0%) NUM nodes have three or more children.

The highest child degree of a NUM node is 5.
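Child degrees like those above can be derived from the HEAD column of a CoNLL-U sentence by counting how often each token ID is named as a head. The sketch below assumes 1-based token IDs with 0 marking the root. The function name `child_degrees` and the five-token example are hypothetical and only illustrate the counting.

```python
from collections import Counter

def child_degrees(heads):
    """heads[i] is the HEAD of token i+1 (0 = root); return child count per token."""
    counts = Counter(h for h in heads if h != 0)
    return [counts.get(token_id, 0) for token_id in range(1, len(heads) + 1)]

# Hypothetical sentence: token 3 is the root; tokens 1, 2 and 5 attach to it;
# token 4 attaches to token 5.
heads = [3, 3, 0, 5, 3]
degrees = child_degrees(heads)
leaves = sum(1 for d in degrees if d == 0)

print(degrees)       # child count for each token, in order
print(leaves)        # number of leaf tokens
print(max(degrees))  # highest child degree in the sentence
```

Restricting the statistics to NUM nodes would additionally require each token's UPOS tag, so that only the degrees of NUM tokens are kept.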

Children of NUM nodes are attached using 9 different relations: punct (20; 34% instances), compound (17; 29% instances), nummod (13; 22% instances), conj (2; 3% instances), cop (2; 3% instances), advcl (1; 2% instances), advmod (1; 2% instances), det (1; 2% instances), nsubj (1; 2% instances)

Children of NUM nodes belong to 7 different parts of speech: PUNCT (20; 34% instances), NOUN (19; 33% instances), NUM (13; 22% instances), AUX (2; 3% instances), PROPN (2; 3% instances), ADV (1; 2% instances), DET (1; 2% instances)
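The child counts on this page can be cross-checked against the node degrees reported earlier. The sketch below uses only figures from this page: the child relation counts sum to 58, and 33 NUM nodes have one child while 8 have two.

```python
# Child relation counts from the list above.
child_relations = {
    "punct": 20, "compound": 17, "nummod": 13, "conj": 2, "cop": 2,
    "advcl": 1, "advmod": 1, "det": 1, "nsubj": 1,
}
total_children = sum(child_relations.values())

one_child, two_children = 33, 8  # NUM nodes with exactly one / two children
remaining = total_children - (one_child * 1 + two_children * 2)

print(total_children)
print(remaining)  # children left for the 2 nodes with three or more children
```

The 9 remaining children are shared by the two nodes with three or more children. Since the highest child degree is 5, those two nodes must have 5 and 4 children.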