
This page pertains to UD version 2.

Treebank Statistics: UD_Indonesian-GSD: POS Tags: VERB

There are 1635 VERB lemmas (8%), 2789 VERB types (13%) and 12471 VERB tokens (10%). Out of 17 observed tags, the rank of VERB is: 3 in number of lemmas, 3 in number of types and 4 in number of tokens.
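As a rough illustration of how the three counts above differ, the sketch below tallies distinct lemmas, distinct word forms (types) and running tokens for one tag in CoNLL-U text. The sample sentence is invented for the example and is not taken from UD_Indonesian-GSD, and `count_upos` is a hypothetical helper. Whether the official statistics script normalizes case before counting types is not stated on this page, so the sketch compares forms exactly as written.

```python
# Tiny hand-made sample in CoNLL-U layout: ID, FORM, LEMMA, UPOS, then six more columns.
# Illustrative only; these lines are not real treebank data.
SAMPLE = "\n".join([
    "# text = illustrative sample",
    "1\tmenjadi\tjadi\tVERB\t_\t_\t0\troot\t_\t_",
    "2\tdikenal\tkenal\tVERB\t_\t_\t1\tconj\t_\t_",
    "3\tmengenal\tkenal\tVERB\t_\t_\t1\tconj\t_\t_",
    "4\tmenjadi\tjadi\tVERB\t_\t_\t1\tconj\t_\t_",
    "5\trumah\trumah\tNOUN\t_\t_\t1\tobj\t_\t_",
])


def count_upos(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, forms = set(), set()
    tokens = 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            forms.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(forms), tokens


counts = count_upos(SAMPLE, "VERB")
print(counts)
```

On the sample, two lemmas (jadi, kenal) surface as three forms across four VERB tokens, which is the same lemma / type / token distinction the figures above report for the whole treebank.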

The 10 most frequent VERB lemmas: jadi, ada, merupakan, milik, guna, dapat, kenal, mulai, buat, diri

The 10 most frequent VERB types: menjadi, merupakan, memiliki, ada, terletak, digunakan, berada, menggunakan, dikenal, terjadi

The 10 most frequent ambiguous lemmas: jadi (VERB 431, NOUN 15, ADV 8, SCONJ 5, ADP 3, CCONJ 1, PART 1, PROPN 1), ada (VERB 396, NOUN 20, ADV 5, PROPN 2, SCONJ 2), milik (VERB 312, NOUN 26), guna (VERB 249, NOUN 64, ADP 7), dapat (AUX 254, VERB 169, NOUN 11, ADV 4, PROPN 1), kenal (VERB 156, NOUN 2, ADJ 1), mulai (VERB 149, ADV 30, ADP 8, PROPN 1), buat (VERB 144, NOUN 21, ADP 2, PROPN 2), diri (VERB 142, PRON 85, NOUN 22, PROPN 1), laku (VERB 142, NOUN 16)

The 10 most frequent ambiguous types: menjadi (VERB 372, ADP 3), ada (VERB 224, ADV 3), mulai (VERB 76, ADV 29, ADP 7, X 1), termasuk (VERB 72, ADP 2), kembali (VERB 63, ADV 43, NOUN 1), masuk (VERB 57, NOUN 2), berbeda (VERB 40, ADJ 5), datang (VERB 40, NOUN 1), muncul (VERB 34, NOUN 3), bekerja (VERB 33, NOUN 1)

Morphology

The form / lemma ratio of VERB is 1.705810 (the average of all parts of speech is 1.120254).
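The reported ratio is consistent with dividing the number of VERB types by the number of VERB lemmas given at the top of this page. The short check below assumes that this is how the figure is derived; the page itself does not spell out the formula.

```python
# Counts copied from the summary at the top of this page.
verb_types = 2789
verb_lemmas = 1635

# Assumption: form / lemma ratio = distinct forms (types) / distinct lemmas.
ratio = verb_types / verb_lemmas
formatted = f"{ratio:.6f}"
print(formatted)
```

In other words, each VERB lemma appears in about 1.7 distinct word forms on average.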

The 1st highest number of forms (9) was observed with the lemma “kenal”: dikenal, dikenali, diperkenalkan, kenal, memperkenalkan, mengenal, mengenali, mengenalkan, terkenal.

The 2nd highest number of forms (9) was observed with the lemma “nama”: bernama, dinamai, dinamakan, menamai, menamakan, nama, namai, namakan, senama.

The 3rd highest number of forms (8) was observed with the lemma “dapat”: berpendapat, dapat, didapat, didapatkan, mendapat, mendapati, mendapatkan, terdapat.

VERB occurs with 3 features: Voice (10812; 87% instances), Mood (10755; 86% instances), Typo (19; 0% instances)

VERB occurs with 5 feature-value pairs: Mood=Imp, Mood=Ind, Typo=Yes, Voice=Act, Voice=Pass

VERB occurs with 9 feature combinations. The most frequent feature combination is Mood=Ind|Voice=Act (7364 tokens). Examples: menjadi, memiliki, berada, menggunakan, membuat, bermain, mulai, melakukan, kembali, berasal

Relations

VERB nodes are attached to their parents using 23 different relations: root (4196; 34% instances), acl:relcl (2246; 18% instances), conj (1251; 10% instances), xcomp (1220; 10% instances), advcl (1215; 10% instances), acl (864; 7% instances), ccomp (392; 3% instances), dep (353; 3% instances), parataxis (313; 3% instances), amod (140; 1% instances), appos (116; 1% instances), fixed (52; 0% instances), csubj (25; 0% instances), obj (21; 0% instances), flat (20; 0% instances), compound (15; 0% instances), nmod (8; 0% instances), nsubj (8; 0% instances), csubj:pass (7; 0% instances), obl (4; 0% instances), nsubj:pass (3; 0% instances), iobj (1; 0% instances), list (1; 0% instances)

Parents of VERB nodes fall into 14 different categories (13 parts of speech, plus root VERB nodes, which have no parent and therefore no parent tag): VERB (4355; 35% instances), no parent, i.e. root (4196; 34% instances), NOUN (3014; 24% instances), PROPN (515; 4% instances), ADJ (168; 1% instances), PRON (163; 1% instances), ADV (29; 0% instances), NUM (11; 0% instances), ADP (8; 0% instances), AUX (4; 0% instances), DET (3; 0% instances), PART (2; 0% instances), X (2; 0% instances), CCONJ (1; 0% instances)
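The 4196 entry in the list above matches the count of the root relation in the previous list, which suggests it covers VERB nodes attached to the artificial root rather than to a tagged word. The following check, using only numbers from this page, confirms that the parent counts add up to the total number of VERB tokens. The dictionary key `"(root)"` is a label chosen for this sketch.

```python
# Parent counts copied from the list above; "(root)" is this sketch's label
# for the entry that carries no part-of-speech tag.
parent_counts = {
    "VERB": 4355, "(root)": 4196, "NOUN": 3014, "PROPN": 515,
    "ADJ": 168, "PRON": 163, "ADV": 29, "NUM": 11, "ADP": 8,
    "AUX": 4, "DET": 3, "PART": 2, "X": 2, "CCONJ": 1,
}
root_relation_count = 4196   # "root" in the list of incoming relations
total_verb_tokens = 12471    # from the summary at the top of this page

total_parents = sum(parent_counts.values())
print(total_parents == total_verb_tokens)
print(parent_counts["(root)"] == root_relation_count)
```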

407 (3%) VERB nodes are leaves.

1175 (9%) VERB nodes have one child.

2733 (22%) VERB nodes have two children.

8156 (65%) VERB nodes have three or more children.

The highest child degree of a VERB node is 11.
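The four child-count figures above partition the VERB tokens, so they should sum to the token total and reproduce the rounded percentages. A minimal sketch of that check, using only numbers from this page (the variable names are illustrative):

```python
total_verb_tokens = 12471  # from the summary at the top of this page

# Number of VERB nodes by number of children, copied from the lines above.
degree_counts = {
    "leaf": 407,
    "one child": 1175,
    "two children": 2733,
    "three or more children": 8156,
}

covered = sum(degree_counts.values())
shares = {name: round(100 * n / total_verb_tokens) for name, n in degree_counts.items()}

print(covered)
print(shares)
```

Roughly two thirds of VERB nodes have three or more children, in line with VERB most often serving as the root or head of a clause.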

Children of VERB nodes are attached using 38 different relations: punct (6779; 17% instances), obl (5960; 15% instances), obj (5685; 14% instances), nsubj (5431; 13% instances), advmod (2788; 7% instances), nsubj:pass (2192; 5% instances), mark (1964; 5% instances), xcomp (1320; 3% instances), conj (1232; 3% instances), cc (1162; 3% instances), aux (1159; 3% instances), advcl (1095; 3% instances), obl:tmod (646; 2% instances), amod (480; 1% instances), ccomp (351; 1% instances), case (323; 1% instances), dep (321; 1% instances), parataxis (317; 1% instances), acl (282; 1% instances), det (213; 1% instances), compound (183; 0% instances), nummod (130; 0% instances), advmod:emph (101; 0% instances), appos (73; 0% instances), obl:agent (65; 0% instances), nmod (64; 0% instances), fixed (63; 0% instances), case:adv (43; 0% instances), cop (40; 0% instances), flat (35; 0% instances), iobj (20; 0% instances), csubj (19; 0% instances), goeswith (18; 0% instances), vocative (8; 0% instances), csubj:pass (7; 0% instances), discourse (3; 0% instances), acl:relcl (2; 0% instances), flat:name (1; 0% instances)

Children of VERB nodes belong to 17 different parts of speech: NOUN (12112; 30% instances), PUNCT (6779; 17% instances), PROPN (4586; 11% instances), VERB (4355; 11% instances), PRON (3877; 10% instances), ADV (2309; 6% instances), SCONJ (1884; 5% instances), AUX (1210; 3% instances), CCONJ (1168; 3% instances), ADJ (757; 2% instances), PART (529; 1% instances), ADP (443; 1% instances), NUM (308; 1% instances), DET (211; 1% instances), X (27; 0% instances), SYM (18; 0% instances), INTJ (2; 0% instances)