
This page pertains to UD version 2.

Treebank Statistics: UD_Korean-PUD: POS Tags: DET

There are 3 DET lemmas (0%), 55 DET types (1%) and 465 DET tokens (3%). Out of 13 observed tags, the rank of DET is: 9 in number of lemmas, 9 in number of types and 9 in number of tokens.
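As a rough illustration of how lemma, type and token counts like these could be derived, the sketch below counts distinct lemmas, distinct word forms and occurrences for one tag in CoNLL-U text. The function name `pos_counts`, the helper `row` and the three-sentence sample are hypothetical and are not taken from UD_Korean-PUD; the script that actually produced this page may differ (for example in how it normalizes forms).

```python
def pos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID
        # (this skips multiword ranges like "1-2" and empty nodes like "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)
            tokens += 1
    return len(lemmas), len(types), tokens


def row(idx, form, upos, head, deprel):
    # Build one CoNLL-U line; "_" marks an unspecified field (lemma included).
    return "\t".join([str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"])


# Hypothetical three-sentence sample, not taken from UD_Korean-PUD.
SAMPLE = "\n".join([
    row(1, "그", "DET", 2, "det"), row(2, "사람", "NOUN", 0, "root"), "",
    row(1, "이", "DET", 2, "det"), row(2, "책", "NOUN", 0, "root"), "",
    row(1, "그", "DET", 2, "det"), row(2, "집", "NOUN", 0, "root"), "",
])

print(pos_counts(SAMPLE, "DET"))  # (1, 2, 3)
```

In the sample, every lemma is the placeholder "_", so the three DET tokens collapse into a single lemma while still giving two types.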

The most frequent DET lemmas (only 3 are observed): _, 있는가, 총

The 10 most frequent DET types: 그, 이, 두, 한, 다른, 여러, 모든, 만, 몇, 세

The 10 most frequent ambiguous lemmas: _ (NOUN 4325, VERB 1625, PROPN 1035, ADJ 609, ADV 517, DET 463, AUX 458, CCONJ 125, NUM 27, PRON 24, X 3, PUNCT 1)

The 10 most frequent ambiguous types: 그 (DET 127, NOUN 4), 이 (DET 52, PART 16, NOUN 3, PRON 1, PROPN 1), 한 (DET 35, VERB 11, NOUN 2, ADJ 1, ADV 1), 다른 (DET 20, ADJ 6), 만 (DET 12, PART 5, NUM 1), 세 (NOUN 12, DET 11), 전 (NOUN 11, DET 9), 억 (DET 6, NUM 1), 천 (DET 3, NUM 1), 수 (NOUN 121, DET 2)

Morphology

The form / lemma ratio of DET is 18.333333 (the average of all parts of speech is 3.165468).
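The ratio reported for DET appears consistent with dividing the number of DET types by the number of DET lemmas. The short check below reproduces it from the per-lemma form counts listed on this page (53, 1 and 1); treating the ratio as exactly types over lemmas is an assumption, and the variable names are illustrative only.

```python
# Number of distinct forms observed per DET lemma, as listed on this page.
forms_per_lemma = {"_": 53, "있는가": 1, "총": 1}

det_types = sum(forms_per_lemma.values())   # 55 DET types
det_lemmas = len(forms_per_lemma)           # 3 DET lemmas

# Assumed definition: form / lemma ratio = types / lemmas.
ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 18.333333
```

The high value is driven almost entirely by the placeholder lemma "_", which covers 53 of the 55 forms.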

The 1st highest number of forms (53) was observed with the lemma “_”: 각, 구, 그, 그런, 네, 다른, 다섯, 단, 두, 만, 몇, 몇몇, 모든, 새, 서른, 성, 세, 수, 수백만, 수백억, 수십, 수십억, 수천, 십, 십억, 아닌가, 아무, 아홉, 약, 어느, 어떠한가, 어떤, 어떤가, 억, 여덟, 여러, 여섯, 열, 옛, 오랜, 온, 이, 이런, 일곱, 있겠는가, 있는가, 전, 천, 천만, 첫, 총, 한, 현.

The 2nd highest number of forms (1) was observed with the lemma “있는가”: 있는가를.

The 3rd highest number of forms (1) was observed with the lemma “총”: 총으로.

DET occurs with 4 features: PronType (7; 2% instances), VerbForm (7; 2% instances), Case (2; 0% instances), Polite (2; 0% instances)

DET occurs with 5 feature-value pairs: Case=Acc, Case=Advb, Polite=Form, PronType=Int, VerbForm=Fin

DET occurs with 4 feature combinations. The most frequent feature combination is _ (457 tokens). Examples: 그, 이, 두, 한, 다른, 여러, 모든, 만, 몇, 세

Relations

DET nodes are attached to their parents using 7 different relations: det (338; 73% instances), nummod (118; 25% instances), root (5; 1% instances), advcl (1; 0% instances), advmod (1; 0% instances), ccomp (1; 0% instances), goeswith (1; 0% instances)
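The percentages above can be re-derived from the raw counts. The sketch below assumes each share is the relation count divided by the 465 DET tokens, rounded to the nearest whole percent; the variable names are illustrative.

```python
# Incoming relation counts for DET nodes, copied from the list above.
relation_counts = {
    "det": 338, "nummod": 118, "root": 5,
    "advcl": 1, "advmod": 1, "ccomp": 1, "goeswith": 1,
}

total = sum(relation_counts.values())  # should equal the 465 DET tokens

# Assumed rounding: nearest whole percent of all DET tokens.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, pct in shares.items():
    print(f"{rel}: {relation_counts[rel]} ({pct}%)")
```

Relations that occur only once round down to 0%, which is why four of the seven relations are shown with "0% instances".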

Parents of DET nodes belong to 8 different parts of speech: NOUN (445; 96% instances), PROPN (6; 1% instances), no parent part of speech, i.e. the DET node is the root (5; 1% instances), VERB (3; 1% instances), DET (2; 0% instances), PRON (2; 0% instances), ADJ (1; 0% instances), NUM (1; 0% instances)

431 (93%) DET nodes are leaves.

26 (6%) DET nodes have one child.

5 (1%) DET nodes have two children.

3 (1%) DET nodes have three or more children.

The highest child degree of a DET node is 4.
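For readers who want to compute a child-degree distribution like the one above, here is a minimal sketch. It assumes a sentence is given as `(id, upos, head)` tuples taken from the ID, UPOS and HEAD columns of CoNLL-U; the function name and the four-node sample tree are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentence, upos):
    """Map child count -> number of nodes with that tag having that many children.

    sentence: list of (id, upos, head) tuples for one tree; head 0 is the root.
    """
    # Count how many nodes point to each head ID.
    children = Counter(head for _, _, head in sentence)
    # A Counter returns 0 for missing keys, so leaves get degree 0.
    return Counter(children[node_id] for node_id, tag, _ in sentence if tag == upos)


# Hypothetical tree: NUM(1) -> DET(2) -> NOUN(3) <- DET(4); NOUN(3) is the root.
sample = [(1, "NUM", 2), (2, "DET", 3), (3, "NOUN", 0), (4, "DET", 3)]

print(child_degree_histogram(sample, "DET"))
```

In the sample, one DET node has a single NUM child and the other is a leaf, mirroring the two most common cases reported above.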

Children of DET nodes are attached using 8 different relations: nummod (25; 54% instances), nsubj (7; 15% instances), punct (5; 11% instances), advmod (4; 9% instances), advcl (2; 4% instances), compound:lvc (1; 2% instances), det (1; 2% instances), goeswith (1; 2% instances)

Children of DET nodes belong to 7 different parts of speech: NUM (25; 54% instances), NOUN (7; 15% instances), PUNCT (5; 11% instances), ADJ (3; 7% instances), ADV (3; 7% instances), DET (2; 4% instances), PRON (1; 2% instances)
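The child statistics can be cross-checked against the degree distribution given earlier on this page. The sketch below uses only numbers reported here; the variable names are illustrative, and the reading of "three or more children" as the remainder is an inference rather than something the page states.

```python
# Outgoing relation counts and child POS counts, copied from the two lists above.
child_relations = {
    "nummod": 25, "nsubj": 7, "punct": 5, "advmod": 4,
    "advcl": 2, "compound:lvc": 1, "det": 1, "goeswith": 1,
}
child_pos = {"NUM": 25, "NOUN": 7, "PUNCT": 5, "ADJ": 3, "ADV": 3, "DET": 2, "PRON": 1}

total_children = sum(child_relations.values())

# Degree distribution reported for DET nodes: 26 with one child, 5 with two,
# 3 with three or more (and 431 leaves).
one_child, two_children, three_plus = 26, 5, 3

# Children not accounted for by the one-child and two-child nodes must belong
# to the nodes with three or more children.
remaining = total_children - (one_child * 1 + two_children * 2)

print(total_children, sum(child_pos.values()), remaining)
```

Both lists sum to the same number of children, and the remainder is compatible with three nodes having between 3 and 4 children each, in line with the reported maximum child degree of 4.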