Treebank Statistics: UD_Indonesian-PUD: POS Tags: NUM
There are 211 NUM lemmas (5%), 222 NUM types (5%) and 501 NUM tokens (3%).
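Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that "types" means distinct word forms as written and that only regular token lines (integer IDs) are counted; the `SAMPLE` fragment and the function names are hypothetical and not taken from UD_Indonesian-PUD.

```python
# Hypothetical CoNLL-U fragment for illustration only (not from the treebank).
SAMPLE = (
    "1\tdua\tdua\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tbuku\tbuku\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tkedua\tdua\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tanak\tanak\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "3\tsatu\tsatu\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
)


def read_tokens(conllu_text):
    """Yield (form, lemma, upos) for every regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        # Skip multiword ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3]


def pos_counts(tokens, upos):
    """Return (number of lemmas, number of types, number of tokens) for one tag."""
    lemmas, types, n_tokens = set(), set(), 0
    for form, lemma, tag in tokens:
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)
            n_tokens += 1
    return len(lemmas), len(types), n_tokens


print(pos_counts(read_tokens(SAMPLE), "NUM"))
```

On the sample, the two forms dua and kedua share the lemma dua, so the lemma count is lower than the type count.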
Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 11 in number of tokens.
The 10 most frequent NUM lemmas: satu, dua, tiga, juta, puluh, empat, 1, 10, miliar, 3
The 10 most frequent NUM types: satu, dua, kedua, tiga, juta, empat, 1, 10, 3, puluh
The 10 most frequent ambiguous lemmas: satu (NUM 51, VERB 1), dua (NUM 49, ADJ 8), tiga (NUM 17, ADJ 7), empat (NUM 10, ADJ 1), 3 (NUM 7, ADJ 2), enam (NUM 6, ADJ 1), 20 (NUM 5, ADJ 1), 8 (NUM 3, ADJ 1), 14 (NUM 2, ADJ 1), 16 (ADJ 2, NUM 2)
The 10 most frequent ambiguous types: kedua (NUM 14, ADJ 6), III (PROPN 4, NUM 1), ketiga (ADJ 7, NUM 1)
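An ambiguous lemma is one that occurs with more than one tag. The sketch below shows one way such a list could be computed; the `PAIRS` data and the function name are hypothetical, and sorting by the count under the tag of interest is an assumption suggested by the order of the list above.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) pairs standing in for tokens read from a treebank.
PAIRS = [
    ("dua", "NUM"), ("dua", "NUM"), ("dua", "ADJ"),
    ("satu", "NUM"), ("buku", "NOUN"),
]


def ambiguous_lemmas(pairs, upos):
    """Return lemmas tagged `upos` at least once that also carry another tag."""
    tags_by_lemma = defaultdict(Counter)
    for lemma, tag in pairs:
        tags_by_lemma[lemma][tag] += 1
    ambiguous = [
        (lemma, tags)
        for lemma, tags in tags_by_lemma.items()
        if upos in tags and len(tags) > 1
    ]
    # Assumed ordering: most frequent under `upos` first.
    return sorted(ambiguous, key=lambda item: -item[1][upos])


for lemma, tags in ambiguous_lemmas(PAIRS, "NUM"):
    print(lemma, dict(tags))
```

The same function applies to ambiguous types if word forms are passed in place of lemmas.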
Morphology
The form / lemma ratio of NUM is 1.052133 (the average of all parts of speech is 1.137428).
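The reported ratio is consistent with dividing the number of NUM types (222) by the number of NUM lemmas (211). A short check, assuming that this is how the figure is derived (the function name is illustrative):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(222, 211)
print(f"{ratio:.6f}")  # 1.052133
```

A value close to 1 means that most NUM lemmas occur in a single form.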
The 1st highest number of forms (3) was observed with the lemma “dua”: berdua, dua, kedua.
The 2nd highest number of forms (3) was observed with the lemma “puluh”: puluh, puluhan, sepuluh.
The 3rd highest number of forms (2) was observed with the lemma “enam”: enam, keenam.
NUM occurs with 1 feature: NumType (501; 100% instances)
NUM occurs with 1 feature-value pair: NumType=Card
NUM occurs with 1 feature combination.
The most frequent feature combination is NumType=Card (501 tokens).
Examples: satu, dua, kedua, tiga, juta, empat, 1, 10, 3, puluh
Relations
NUM nodes are attached to their parents using 12 different relations: nummod (358; 71% instances), flat (76; 15% instances), obl:tmod (16; 3% instances), conj (11; 2% instances), nsubj (10; 2% instances), nmod (8; 2% instances), appos (6; 1% instances), nmod:tmod (5; 1% instances), obl (5; 1% instances), nsubj:pass (2; 0% instances), obj (2; 0% instances), root (2; 0% instances)
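The percentages in this list can be checked against the counts. The sketch below transcribes the counts from the list above and assumes that each share is the count divided by the total number of NUM tokens, rounded to the nearest integer; the variable and function names are illustrative.

```python
# Counts transcribed from the relation list above.
NUM_RELATIONS = {
    "nummod": 358, "flat": 76, "obl:tmod": 16, "conj": 11, "nsubj": 10,
    "nmod": 8, "appos": 6, "nmod:tmod": 5, "obl": 5,
    "nsubj:pass": 2, "obj": 2, "root": 2,
}


def with_percentages(counts):
    """Map each relation to (count, share of the total in whole percent)."""
    total = sum(counts.values())
    return {rel: (n, round(100 * n / total)) for rel, n in counts.items()}


shares = with_percentages(NUM_RELATIONS)
print(sum(NUM_RELATIONS.values()))  # 501
print(shares["nummod"])             # (358, 71)
```

The counts add up to 501, the number of NUM tokens, since every node has exactly one incoming relation.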
Parents of NUM nodes belong to 8 different parts of speech: NOUN (332; 66% instances), NUM (62; 12% instances), SYM (38; 8% instances), VERB (37; 7% instances), PROPN (25; 5% instances), ADJ (3; 1% instances), PRON (2; 0% instances), no part of speech, i.e. the artificial root (2; 0% instances)
360 (72%) NUM nodes are leaves.
90 (18%) NUM nodes have one child.
28 (6%) NUM nodes have two children.
23 (5%) NUM nodes have three or more children.
The highest child degree of a NUM node is 7.
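The child degree of a node is the number of tokens whose HEAD column points to it. A minimal sketch of how the distribution above could be computed from CoNLL-U text follows; the `SAMPLE` fragment, its annotation and the function name are hypothetical, and degrees of three or more are grouped into one bucket to mirror the breakdown above.

```python
from collections import Counter

# Hypothetical CoNLL-U fragment for illustration only (not from the treebank).
SAMPLE = (
    "1\tdua\tdua\tNUM\t_\tNumType=Card\t3\tnummod\t_\t_\n"
    "2\tpuluh\tpuluh\tNUM\t_\tNumType=Card\t1\tflat\t_\t_\n"
    "3\tbuku\tbuku\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tsatu\tsatu\tNUM\t_\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tanak\tanak\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def child_degrees(conllu_text, upos):
    """Count nodes tagged `upos` by number of children (0, 1, 2, 3 = three or more)."""
    degrees = Counter()
    sentence = []  # (id, head, upos) for the tokens of the current sentence

    def flush():
        n_children = Counter(head for _, head, _ in sentence)
        for tok_id, _, tag in sentence:
            if tag == upos:
                degrees[min(n_children[tok_id], 3)] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the sentence
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword ranges and empty nodes, which have non-integer IDs.
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append((int(cols[0]), int(cols[6]), cols[3]))
    flush()  # handle a final sentence without a trailing blank line
    return degrees


print(dict(child_degrees(SAMPLE, "NUM")))
```

In the sample, dua has one child (puluh), while puluh and satu are leaves.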
Children of NUM nodes are attached using 16 different relations: flat (76; 33% instances), advmod (31; 14% instances), punct (30; 13% instances), case (25; 11% instances), nmod (21; 9% instances), cc (11; 5% instances), conj (8; 3% instances), det (8; 3% instances), nsubj (5; 2% instances), cop (4; 2% instances), acl:relcl (2; 1% instances), advmod:emph (2; 1% instances), nmod:lmod (2; 1% instances), nmod:tmod (2; 1% instances), advcl (1; 0% instances), csubj (1; 0% instances)
Children of NUM nodes belong to 12 different parts of speech: NUM (62; 27% instances), PUNCT (30; 13% instances), PROPN (26; 11% instances), ADP (25; 11% instances), NOUN (25; 11% instances), ADJ (16; 7% instances), ADV (15; 7% instances), CCONJ (11; 5% instances), DET (8; 3% instances), AUX (4; 2% instances), VERB (4; 2% instances), PART (3; 1% instances)