Treebank Statistics: UD_Korean-GSD: POS Tags: PUNCT
There are 103 PUNCT lemmas (0%), 106 PUNCT types (0%) and 10362 PUNCT tokens (13%).
Out of 16 observed tags, the rank of PUNCT is: 10 in number of lemmas, 10 in number of types and 4 in number of tokens.
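The lemma, type and token counts and the per-tag ranks can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page: the sample sentences, the function names `tag_stats` and `rank`, case-sensitive type counting and the tie-breaking order are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy CoNLL-U fragment (hypothetical, NOT taken from UD_Korean-GSD).
SAMPLE = """\
1\t나\t나\tPRON\t_\t_\t2\tnsubj\t_\t_
2\t간다\t가+ㄴ다\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\t왜\t왜\tADV\t_\t_\t2\tadvmod\t_\t_
2\t가\t가\tVERB\t_\t_\t0\troot\t_\t_
3\t?\t?\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""

def tag_stats(conllu_text):
    """Return {upos: (distinct lemmas, distinct forms, tokens)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: types are case-sensitive forms
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), tokens[t]) for t in tokens}

def rank(stats, tag, index):
    """1-based rank of `tag` by column `index` (0 lemmas, 1 types, 2 tokens)."""
    ordered = sorted(stats, key=lambda t: stats[t][index], reverse=True)
    return ordered.index(tag) + 1

stats = tag_stats(SAMPLE)
print(stats["PUNCT"])
print(rank(stats, "PUNCT", 2))
```

Ties are resolved here by first appearance in the file, which may differ from how the reported ranks were computed.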
The 10 most frequent PUNCT lemmas: ., ,, ‘, (, ), “, %, ?, !, •
The 10 most frequent PUNCT types: ., ,, ‘, (, ), “, %, ?, !, •
The 10 most frequent ambiguous lemmas: % (PUNCT 137, SYM 45), ~ (SYM 46, PUNCT 24), 이+다 (PUNCT 19, VERB 2, NOUN 1), ㎡ (PUNCT 12, SYM 5), ㎞ (SYM 7, PUNCT 5), ^ (PUNCT 3, SYM 1), ℓ (PUNCT 3, SYM 1), ㎢ (PUNCT 3, SYM 3), ㈜ (PUNCT 2, SYM 1), ㎝ (PUNCT 2, SYM 1)
The 10 most frequent ambiguous types: % (PUNCT 137, SYM 45), ~ (SYM 46, PUNCT 24), 이다 (PUNCT 14, VERB 2, NOUN 1), ㎡ (PUNCT 12, SYM 5), ㎞ (SYM 7, PUNCT 5), 다 (ADV 46, PUNCT 5, NOUN 3), ^ (PUNCT 3, SYM 1), ℓ (PUNCT 3, SYM 1), ㎢ (PUNCT 3, SYM 3), ㈜ (PUNCT 2, SYM 1)
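An item counts as ambiguous here when it is observed with PUNCT and with at least one other tag. A rough sketch of how such a listing could be derived follows; the observation list and the function name `ambiguous` are hypothetical and serve only to illustrate the grouping.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) observations, for illustration only.
observations = [
    ("%", "PUNCT"), ("%", "PUNCT"), ("%", "SYM"),
    (".", "PUNCT"),
    ("~", "SYM"), ("~", "PUNCT"), ("~", "SYM"),
]

def ambiguous(pairs, tag):
    """Map each item seen with `tag` and another tag to its tag counts."""
    by_item = defaultdict(Counter)
    for item, upos in pairs:
        by_item[item][upos] += 1
    return {
        item: counts.most_common()  # most frequent tag first
        for item, counts in by_item.items()
        if tag in counts and len(counts) > 1
    }

result = ambiguous(observations, "PUNCT")
for item, counts in result.items():
    print(item, counts)
```

The same function works for types if the pairs hold word forms instead of lemmas.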
Morphology
The form / lemma ratio of PUNCT is 1.029126 (the average of all parts of speech is 1.000681).
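The reported ratio is consistent with dividing the number of PUNCT types by the number of PUNCT lemmas given at the top of this page. A short check, assuming that is how the figure is defined (the variable names are illustrative):

```python
# Counts taken from this page: 106 PUNCT types (forms), 103 PUNCT lemmas.
punct_types = 106
punct_lemmas = 103

ratio = punct_types / punct_lemmas
print(f"{ratio:.6f}")  # 1.029126
```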
The 1st highest number of forms (2) was observed with the lemma “<”: <, <.
The 2nd highest number of forms (2) was observed with the lemma “이+다”: 다, 이다.
The 3rd highest number of forms (2) was observed with the lemma “이+었+다”: 였다, 이었다.
PUNCT occurs with 1 feature: NumType (16; 0% instances)
PUNCT occurs with 1 feature-value pair: NumType=Card
PUNCT occurs with 2 feature combinations.
The most frequent feature combination is _ (10346 tokens).
Examples: ., ,, ‘, (, ), “, %, ?, !, •
Relations
PUNCT nodes are attached to their parents using 9 different relations: punct (10299; 99% instances), cop (29; 0% instances), appos (25; 0% instances), flat (4; 0% instances), advcl (1; 0% instances), case (1; 0% instances), conj (1; 0% instances), dep (1; 0% instances), root (1; 0% instances)
Parents of PUNCT nodes belong to 16 different parts of speech: VERB (4863; 47% instances), NOUN (3035; 29% instances), ADJ (768; 7% instances), SYM (416; 4% instances), NUM (398; 4% instances), PROPN (349; 3% instances), PUNCT (275; 3% instances), ADV (171; 2% instances), AUX (32; 0% instances), ADP (26; 0% instances), DET (9; 0% instances), PRON (9; 0% instances), INTJ (5; 0% instances), CCONJ (4; 0% instances), PART (1; 0% instances), no part of speech, which matches the single PUNCT node attached as root (1; 0% instances)
10059 (97%) PUNCT nodes are leaves.
269 (3%) PUNCT nodes have one child.
17 (0%) PUNCT nodes have two children.
17 (0%) PUNCT nodes have three or more children.
The highest child degree of a PUNCT node is 6.
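The child-degree figures above can be derived by counting, for every PUNCT node, how many other nodes name it as their head. The sketch below uses a hypothetical toy sentence and a made-up helper `child_degrees`; it also checks that the four buckets reported on this page add up to the 10362 PUNCT tokens.

```python
from collections import Counter

# Hypothetical toy sentence as (id, upos, head) triples; head 0 is the root.
sentence = [
    (1, "NOUN", 3),
    (2, "PUNCT", 1),
    (3, "VERB", 0),
    (4, "PUNCT", 3),
    (5, "PUNCT", 4),
]

def child_degrees(nodes, tag):
    """Return Counter {number of children: how many `tag` nodes have it}."""
    children = Counter(head for _, _, head in nodes)
    return Counter(children[node_id] for node_id, upos, _ in nodes if upos == tag)

degrees = child_degrees(sentence, "PUNCT")
print(dict(degrees))
print(max(degrees))  # highest child degree among PUNCT nodes

# Consistency check on the figures reported on this page.
reported_total = 10059 + 269 + 17 + 17
print(reported_total == 10362)
```

For a whole treebank the function would be applied sentence by sentence and the resulting counters summed.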
Children of PUNCT nodes are attached using 14 different relations: punct (288; 79% instances), appos (29; 8% instances), case (14; 4% instances), flat (13; 4% instances), conj (5; 1% instances), obj (4; 1% instances), nmod (3; 1% instances), acl:relcl (2; 1% instances), nsubj (2; 1% instances), amod (1; 0% instances), cc (1; 0% instances), cop (1; 0% instances), dep (1; 0% instances), det (1; 0% instances)
Children of PUNCT nodes belong to 11 different parts of speech: PUNCT (275; 75% instances), NOUN (39; 11% instances), ADP (19; 5% instances), SYM (18; 5% instances), NUM (5; 1% instances), PROPN (3; 1% instances), VERB (2; 1% instances), ADJ (1; 0% instances), ADV (1; 0% instances), CCONJ (1; 0% instances), DET (1; 0% instances)