
This page pertains to UD version 2.

Treebank Statistics: UD_Korean-KSL: POS Tags: VERB

There are 8271 VERB lemmas (29%), 8145 VERB types (28%) and 28298 VERB tokens (21%). Out of 14 observed tags, the rank of VERB is: 2 in number of lemmas, 2 in number of types and 2 in number of tokens.

The 10 most frequent VERB lemmas: 생각+하+ㄴ다, 하+ㄹ, 있+다, 가+았+습니다, 하+고, 가+ㄹ, 있+습니다, 있+는, 가+고, 되+ㄴ다

The 10 most frequent VERB types: 생각한다, 할, 있다, 갔습니다, 하고, 갈, 있습니다, 있는, 가고, 된다

The 10 most frequent ambiguous lemmas: 있+다 (VERB 338, ADJ 40), 하+고 (VERB 242, ADP 3, CCONJ 1), 있+습니다 (VERB 236, ADJ 22), 있+는 (VERB 217, ADJ 11), 되+ㄴ다 (VERB 190, ADJ 1), 하+는 (VERB 178, ADP 1), 있+으면 (VERB 169, ADJ 5), 있+고 (VERB 96, ADJ 20), 있+어서 (VERB 91, ADJ 11), 없+다 (VERB 83, ADJ 77)

The 10 most frequent ambiguous types: 할 (VERB 417, AUX 25, X 17), 있다 (AUX 750, VERB 340, ADJ 40), 하고 (VERB 242, ADP 50, AUX 23, NOUN 8, X 2, CCONJ 1), 갈 (VERB 236, X 1), 있습니다 (VERB 236, AUX 205, ADJ 22), 있는 (AUX 344, VERB 217, ADJ 10), 된다 (VERB 190, ADJ 1), 하는 (VERB 180, AUX 55, ADP 1), 있으면 (VERB 169, AUX 50, ADJ 5), 하면 (VERB 154, AUX 26, X 1)

Morphology

The form / lemma ratio of VERB is 0.984766 (the average of all parts of speech is 1.007876).
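As a quick check, the ratio above can be reproduced from the counts reported at the top of this page, assuming that "forms" means the distinct VERB types (8145) and "lemmas" the distinct VERB lemmas (8271). The variable names below are illustrative only and are not taken from the tool that generated this page.

```python
# Counts copied from the summary at the top of this page.
verb_types = 8145   # distinct VERB word forms (assumed to be the "forms" in the ratio)
verb_lemmas = 8271  # distinct VERB lemmas

# Form / lemma ratio: average number of distinct forms per lemma.
ratio = verb_types / verb_lemmas
print(f"{ratio:.6f}")  # prints 0.984766
```

A ratio below 1 means there are fewer distinct forms than lemmas. That is possible here because lemmas in this treebank are morpheme-segmented strings such as 있+다, so a single surface form can correspond to more than one lemma.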

The highest number of forms (5) was observed with the lemma “하”: 하게, 하는, 한다고, 한다면, 해요.

The 2nd highest number of forms (3) was observed with the lemma “답+하+았+다”: 답하였다, 답햇다, 답했다.

The 3rd highest number of forms (3) was observed with the lemma “되+었+다”: 꼈다, 됐다, 되었다.

VERB occurs with 1 feature: Typo (1380; 5% instances)

VERB occurs with 1 feature-value pair: Typo=Yes

VERB occurs with 2 feature combinations. The most frequent feature combination is _ (26918 tokens). Examples: 생각한다, 할, 있다, 갔습니다, 하고, 갈, 있습니다, 있는, 가고, 된다

Relations

VERB nodes are attached to their parents using 20 different relations: root (9281; 33% instances), advcl (8869; 31% instances), acl (7158; 25% instances), conj (1104; 4% instances), ccomp (642; 2% instances), obl (360; 1% instances), obj (342; 1% instances), nsubj (183; 1% instances), nmod (137; 0% instances), amod (99; 0% instances), list (30; 0% instances), parataxis (22; 0% instances), compound (21; 0% instances), flat (19; 0% instances), dislocated (15; 0% instances), csubj (8; 0% instances), nmod:poss (4; 0% instances), dep (2; 0% instances), appos (1; 0% instances), xcomp (1; 0% instances)
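The percentages in this list appear to be each relation's count divided by the total number of VERB tokens (28298), rounded to the nearest whole percent. The sketch below reproduces the four most frequent relations under that assumption. The dictionary and variable names are illustrative.

```python
# Counts copied from the list above (four most frequent relations only).
relation_counts = {"root": 9281, "advcl": 8869, "acl": 7158, "conj": 1104}
verb_tokens = 28298  # total VERB tokens reported at the top of this page

# Share of each relation among all VERB tokens, as a whole percent.
shares = {rel: round(100 * n / verb_tokens) for rel, n in relation_counts.items()}
print(shares)  # prints {'root': 33, 'advcl': 31, 'acl': 25, 'conj': 4}
```

Under this rounding, any relation with fewer than about 141 instances is shown as 0%, which is why the tail of the list reads "0% instances" despite non-zero counts.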

Parents of VERB nodes belong to 11 different parts of speech: [no parent: root] (9281; 33% instances), VERB (8622; 30% instances), NOUN (5548; 20% instances), ADJ (3369; 12% instances), AUX (846; 3% instances), ADV (610; 2% instances), PRON (15; 0% instances), ADP (4; 0% instances), DET (1; 0% instances), INTJ (1; 0% instances), NUM (1; 0% instances)

2285 (8%) VERB nodes are leaves.

7451 (26%) VERB nodes have one child.

6705 (24%) VERB nodes have two children.

11857 (42%) VERB nodes have three or more children.

The highest child degree of a VERB node is 10.
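Child-degree figures like those above can be derived from a CoNLL-U file by counting, for each VERB node, how many tokens name it as their HEAD. The following is a minimal sketch that assumes the standard 10-column CoNLL-U layout. The function name and the toy English sentence are made up for illustration and are not part of UD_Korean-KSL or of the script that produced this page.

```python
from collections import Counter

def verb_child_degrees(conllu_text):
    """Return a Counter mapping child count -> number of VERB nodes."""
    degrees = Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        upos = {}             # token ID -> UPOS tag
        children = Counter()  # head ID -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue      # skip blank lines and sentence-level comments
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:
                continue      # skip multiword-token ranges and empty nodes
            upos[cols[0]] = cols[3]   # column 4 is UPOS
            children[cols[6]] += 1    # column 7 is HEAD
        for node_id, tag in upos.items():
            if tag == "VERB":
                degrees[children[node_id]] += 1
    return degrees

# Toy sentence (not from the treebank): one VERB with two children.
sample = (
    "# text = I saw it\n"
    "1\tI\tI\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tsaw\tsee\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tit\tit\tPRON\t_\t_\t2\tobj\t_\t_\n"
)
print(verb_child_degrees(sample))  # prints Counter({2: 1})
```

On this page, the four degree classes (2285 + 7451 + 6705 + 11857) sum to 28298, the total number of VERB tokens, so every VERB node falls into exactly one class.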

Children of VERB nodes are attached using 31 different relations: obl (11740; 18% instances), obj (10996; 17% instances), nsubj (9628; 15% instances), punct (9388; 14% instances), advcl (7780; 12% instances), advmod (7357; 11% instances), aux (2393; 4% instances), cc (1622; 2% instances), ccomp (1299; 2% instances), conj (1053; 2% instances), mark (886; 1% instances), dislocated (805; 1% instances), case (339; 1% instances), nmod (272; 0% instances), acl (170; 0% instances), goeswith (112; 0% instances), amod (72; 0% instances), nmod:poss (50; 0% instances), list (41; 0% instances), compound:lvc (40; 0% instances), parataxis (36; 0% instances), vocative (29; 0% instances), compound (22; 0% instances), flat (13; 0% instances), det (12; 0% instances), appos (6; 0% instances), discourse (5; 0% instances), csubj (4; 0% instances), nummod (2; 0% instances), dep (1; 0% instances), xcomp (1; 0% instances)

Children of VERB nodes belong to 14 different parts of speech: NOUN (21624; 33% instances), ADV (16269; 25% instances), PUNCT (9388; 14% instances), VERB (8622; 13% instances), AUX (2708; 4% instances), PRON (2495; 4% instances), ADJ (1981; 3% instances), CCONJ (1622; 2% instances), SCONJ (886; 1% instances), ADP (349; 1% instances), X (112; 0% instances), NUM (84; 0% instances), DET (29; 0% instances), INTJ (5; 0% instances)