Treebank Statistics: UD_Korean-KSL: POS Tags: VERB
There are 9119 VERB lemmas (29%), 8984 VERB types (28%) and 32254 VERB tokens (21%).
Out of 16 observed tags, the rank of VERB is: 2 in number of lemmas, 2 in number of types and 2 in number of tokens.
The 10 most frequent VERB lemmas: 생각+하+ㄴ다, 하+ㄹ, 있+다, 가+았+습니다, 있+습니다, 하+고, 있+는, 가+ㄹ, 하+는, 되+ㄴ다
The 10 most frequent VERB types: 할, 생각한다, 있다, 갔습니다, 있습니다, 하고, 있는, 갈, 하는, 된다
The 10 most frequent ambiguous lemmas: 있+다 (VERB 417, ADJ 41), 있+습니다 (VERB 269, ADJ 23), 하+고 (VERB 266, ADP 3, CCONJ 1), 있+는 (VERB 253, ADJ 11), 하+는 (VERB 222, ADP 1), 되+ㄴ다 (VERB 220, ADJ 1), 있+으면 (VERB 183, ADJ 5), 있+고 (VERB 118, ADJ 20), 있+어서 (VERB 98, ADJ 11), 없+다 (VERB 96, ADJ 85)
The 10 most frequent ambiguous types: 할 (VERB 537, AUX 29, X 18), 있다 (AUX 892, VERB 419, ADJ 41), 있습니다 (AUX 270, VERB 269, ADJ 23), 하고 (VERB 266, ADP 50, AUX 26, NOUN 8, X 2, CCONJ 1), 있는 (AUX 431, VERB 253, ADJ 10), 갈 (VERB 239, X 1), 하는 (VERB 224, AUX 77, ADP 1), 된다 (VERB 220, ADJ 1), 하면 (VERB 193, AUX 28, X 1), 배워야 (VERB 187, NOUN 1)
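The ambiguity lists above group each form by the tags it was observed with. A minimal sketch of that grouping follows, assuming the input is a flat sequence of (form, UPOS) pairs; the function name `ambiguous_types` and the toy counts are hypothetical and are not taken from the treebank.

```python
from collections import Counter, defaultdict

def ambiguous_types(tagged_tokens, focus="VERB"):
    """Return forms seen with `focus` and at least one other tag.

    tagged_tokens: iterable of (form, upos) pairs.
    The result maps each ambiguous form to its (tag, count) list,
    most frequent tag first.
    """
    by_form = defaultdict(Counter)
    for form, upos in tagged_tokens:
        by_form[form][upos] += 1
    return {
        form: tags.most_common()
        for form, tags in by_form.items()
        if focus in tags and len(tags) > 1
    }

# Toy input (invented counts, for illustration only).
toy_tokens = [
    ("할", "VERB"), ("할", "VERB"), ("할", "AUX"),
    ("갔습니다", "VERB"),
]
result = ambiguous_types(toy_tokens)
print(result)
```

Forms that only ever carry one tag, such as 갔습니다 in the toy input, are left out of the result.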
Morphology
The form / lemma ratio of VERB is 0.985196 (the average of all parts of speech is 1.008073).
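The reported ratio can be reproduced from the counts given at the top of this page, assuming it is simply the number of distinct VERB types divided by the number of distinct VERB lemmas. The helper name `form_lemma_ratio` is hypothetical.

```python
# Counts copied from the summary line at the top of this page.
verb_lemmas = 9119
verb_types = 8984

def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Distinct word forms divided by distinct lemmas."""
    return n_types / n_lemmas

ratio = form_lemma_ratio(verb_types, verb_lemmas)
print(f"{ratio:.6f}")
```

A value below 1 means that VERB has fewer distinct forms than distinct lemmas in this treebank.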
The highest number of forms (5) was observed with the lemma “하”: 하게, 하는, 한다고, 한다면, 해요.
The second highest number of forms (4) was observed with the lemma “되+었+다”: 꼈다, 됐다, 됐었다, 되었다.
The third highest number of forms (3) was observed with the lemma “답+하+았+다”: 답하였다, 답햇다, 답했다.
VERB occurs with 1 feature: Typo (1513; 5% instances)
VERB occurs with 1 feature-value pair: Typo=Yes
VERB occurs with 2 feature combinations.
The most frequent feature combination is _, i.e. no features (30741 tokens).
Examples: 할, 생각한다, 있다, 갔습니다, 있습니다, 하고, 있는, 갈, 하는, 된다
Relations
VERB nodes are attached to their parents using 20 different relations: root (10160; 31% instances), advcl (10113; 31% instances), acl (8470; 26% instances), conj (1191; 4% instances), ccomp (869; 3% instances), obl (414; 1% instances), obj (389; 1% instances), nsubj (237; 1% instances), nmod (170; 1% instances), amod (111; 0% instances), list (30; 0% instances), parataxis (24; 0% instances), compound (22; 0% instances), flat (19; 0% instances), dislocated (17; 0% instances), csubj (9; 0% instances), nmod:poss (5; 0% instances), dep (2; 0% instances), appos (1; 0% instances), xcomp (1; 0% instances)
Parents of VERB nodes belong to 11 different parts of speech: [root, no parent] (10160; 31% instances), VERB (9743; 30% instances), NOUN (6701; 21% instances), ADJ (3826; 12% instances), AUX (1094; 3% instances), ADV (707; 2% instances), PRON (16; 0% instances), ADP (4; 0% instances), DET (1; 0% instances), INTJ (1; 0% instances), NUM (1; 0% instances)
2593 (8%) VERB nodes are leaves.
8721 (27%) VERB nodes have one child.
7781 (24%) VERB nodes have two children.
13159 (41%) VERB nodes have three or more children.
The highest child degree of a VERB node is 10.
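The child-count breakdown above can be sketched as a bucketing over head pointers. The block below assumes each sentence is a list of (token_id, head_id, upos) tuples with head_id 0 marking the root; the function name `child_degree_buckets` and the toy sentence are hypothetical. It also checks that the four published counts add up to the VERB token total.

```python
from collections import Counter

def child_degree_buckets(sentences, upos="VERB"):
    """Bucket nodes with the given UPOS tag by their number of children."""
    buckets = Counter()
    for sent in sentences:
        # Count how many tokens point at each head id.
        n_children = Counter(head for _, head, _ in sent)
        for tok_id, _, tag in sent:
            if tag != upos:
                continue
            degree = n_children[tok_id]
            key = {0: "leaf", 1: "one", 2: "two"}.get(degree, "three_or_more")
            buckets[key] += 1
    return buckets

# Toy sentence (invented, for illustration only).
toy = [[
    (1, 2, "NOUN"),
    (2, 4, "VERB"),   # one child: token 1
    (3, 4, "ADV"),
    (4, 0, "VERB"),   # three children: tokens 2, 3, 5
    (5, 4, "PUNCT"),
]]
result = child_degree_buckets(toy)

# Counts as published on this page.
published = {"leaf": 2593, "one": 8721, "two": 7781, "three_or_more": 13159}
total = sum(published.values())
print(result, total)
```

The published buckets sum to 32254, which matches the number of VERB tokens reported at the top of the page.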
Children of VERB nodes are attached using 31 different relations: obl (12945; 17% instances), obj (12664; 17% instances), nsubj (10972; 15% instances), punct (10262; 14% instances), advcl (8746; 12% instances), advmod (8276; 11% instances), aux (2776; 4% instances), cc (1778; 2% instances), ccomp (1673; 2% instances), conj (1139; 2% instances), mark (955; 1% instances), dislocated (925; 1% instances), case (380; 1% instances), nmod (279; 0% instances), acl (185; 0% instances), goeswith (120; 0% instances), amod (76; 0% instances), nmod:poss (52; 0% instances), compound:lvc (42; 0% instances), list (41; 0% instances), parataxis (39; 0% instances), vocative (29; 0% instances), compound (23; 0% instances), det (14; 0% instances), flat (14; 0% instances), appos (6; 0% instances), csubj (5; 0% instances), discourse (5; 0% instances), nummod (4; 0% instances), dep (1; 0% instances), xcomp (1; 0% instances)
Children of VERB nodes belong to 15 different parts of speech: NOUN (24730; 33% instances), ADV (18027; 24% instances), PUNCT (10262; 14% instances), VERB (9743; 13% instances), AUX (3158; 4% instances), PRON (2816; 4% instances), ADJ (2302; 3% instances), CCONJ (1778; 2% instances), SCONJ (955; 1% instances), ADP (390; 1% instances), X (120; 0% instances), NUM (94; 0% instances), DET (44; 0% instances), INTJ (5; 0% instances), PART (3; 0% instances)