Treebank Statistics: UD_Japanese-PUD: POS Tags: NUM
There are 205 NUM lemmas (4%), 208 NUM types (4%) and 655 NUM tokens (2%).
Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 8 in number of tokens.
The 10 most frequent NUM lemmas: 1, 2, 3, 10, 一, 4, 万, 6, 5, 何
The 10 most frequent NUM types: 1, 2, 3, 10, 一, 4, 万, 6, 5, 何
The 10 most frequent ambiguous lemmas: 1 (NUM 50, NOUN 1), 何 (NUM 12, PRON 9), 9 (NUM 8, NOUN 1), 数 (NOUN 10, NUM 7)
The 10 most frequent ambiguous types: 1 (NUM 50, NOUN 1), 何 (NUM 12, PRON 8), 9 (NUM 8, NOUN 1), 数 (NOUN 10, NUM 7), いく (NUM 6, VERB 1)
Morphology
The form / lemma ratio of NUM is 1.014634 (the average of all parts of speech is 1.068686).
The highest number of forms per lemma is 2, observed with three lemmas: “一” (一, 1), “三” (Ⅲ, 三) and “二” (二, 2).
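The reported form / lemma ratio can be reproduced from the counts given at the top of this page. The sketch below assumes that the ratio is the number of NUM types divided by the number of NUM lemmas; the page does not state the formula, but this assumption matches the reported figure, and the three lemmas with two forms each account for the difference between 205 lemmas and 208 types. The function name is hypothetical.

```python
def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma (assumed definition)."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_forms / n_lemmas


# Counts reported for NUM on this page.
NUM_TYPES = 208
NUM_LEMMAS = 205

ratio = form_lemma_ratio(NUM_TYPES, NUM_LEMMAS)
formatted = f"{ratio:.6f}"
print(formatted)  # 1.014634
```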
NUM does not occur with any features.
Relations
NUM nodes are attached to their parents using 7 different relations: nummod (432; 66% instances), compound (202; 31% instances), nmod (10; 2% instances), obl (6; 1% instances), nsubj (3; 0% instances), obj (1; 0% instances), root (1; 0% instances)
Parents of NUM nodes belong to 7 different parts of speech: NOUN (621; 95% instances), ADV (11; 2% instances), VERB (9; 1% instances), NUM (7; 1% instances), PROPN (5; 1% instances), ADJ (1; 0% instances), no tag (1; 0% instances; the one NUM node attached as root has no parent token)
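Counts of this kind can be gathered from a CoNLL-U file by looking up the head of every NUM token. The following is a minimal sketch, not the script that generated this page: the toy sentence is hypothetical and not taken from UD_Japanese-PUD, the function names are made up for illustration, and percentages are assumed to be rounded to the nearest integer. A NUM node whose head is 0 has no parent token, so its parent tag is recorded as an empty string.

```python
from collections import Counter

# Hypothetical toy sentence in CoNLL-U column order (not from the treebank).
ROWS = [
    ("1", "3", "3", "NUM", "_", "_", "2", "nummod", "_", "_"),
    ("2", "冊", "冊", "NOUN", "_", "_", "4", "nmod", "_", "_"),
    ("3", "の", "の", "ADP", "_", "_", "2", "case", "_", "_"),
    ("4", "本", "本", "NOUN", "_", "_", "0", "root", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS) + "\n"


def read_sentences(text):
    """Yield sentences as lists of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


def parent_stats(text, upos="NUM"):
    """Count incoming relations and parent tags for tokens tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    for sent in read_sentences(text):
        tag_of = {tid: tag for tid, tag, _, _ in sent}
        for _, tag, head, deprel in sent:
            if tag != upos:
                continue
            relations[deprel] += 1
            parent_tags[tag_of.get(head, "")] += 1  # head 0 gives ""
    return relations, parent_tags


def describe(counter):
    """Format counts in the 'label (count; pct% instances)' style."""
    total = sum(counter.values())
    return ", ".join(
        f"{label} ({n}; {round(100 * n / total)}% instances)"
        for label, n in counter.most_common()
    )


relations, parent_tags = parent_stats(SAMPLE)
print(describe(relations))
print(describe(parent_tags))
```

Applied to the relation counts reported above (432, 202, 10, 6, 3, 1, 1 out of 655), `describe` yields the same rounded percentages as the page.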
632 (96%) NUM nodes are leaves.
3 (0%) NUM nodes have one child.
10 (2%) NUM nodes have two children.
10 (2%) NUM nodes have three or more children.
The highest child degree of a NUM node is 9.
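The child-degree breakdown above can be sketched in the same spirit: count how many tokens name each NUM node as their head, then sort the nodes into the four buckets used on this page. This is an illustrative sketch under stated assumptions; the toy sentences are hypothetical, tokens are simplified to (id, upos, head) tuples, and the function and bucket names are made up. The last lines only check that the four reported bucket sizes add up to the 655 NUM tokens.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="NUM"):
    """Bucket nodes tagged `upos` by number of children; also return the max."""
    buckets = Counter()
    max_degree = 0
    for sent in sentences:
        # Number of tokens pointing at each head id within this sentence.
        n_children = Counter(head for _, _, head in sent)
        for tid, tag, _ in sent:
            if tag != upos:
                continue
            degree = n_children[tid]
            max_degree = max(max_degree, degree)
            if degree == 0:
                buckets["leaf"] += 1
            elif degree == 1:
                buckets["one"] += 1
            elif degree == 2:
                buckets["two"] += 1
            else:
                buckets["three_or_more"] += 1
    return buckets, max_degree


# Hypothetical toy data: each token is (id, upos, head).
TOY = [
    [(1, "NUM", 2), (2, "NOUN", 0)],  # NUM is a leaf
    [(1, "NUM", 0), (2, "ADP", 1)],   # NUM has one child
]
buckets, max_degree = child_degree_buckets(TOY)
print(dict(buckets), max_degree)

# Bucket sizes reported on this page; they should cover all NUM tokens.
REPORTED = {"leaf": 632, "one": 3, "two": 10, "three_or_more": 10}
reported_total = sum(REPORTED.values())
print(reported_total)  # 655
```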
Children of NUM nodes are attached using 7 different relations: compound (28; 44% instances), case (27; 42% instances), punct (4; 6% instances), nummod (2; 3% instances), acl (1; 2% instances), advmod (1; 2% instances), obl (1; 2% instances)
Children of NUM nodes belong to 8 different parts of speech: ADP (27; 42% instances), NOUN (20; 31% instances), NUM (7; 11% instances), PUNCT (4; 6% instances), PROPN (2; 3% instances), SYM (2; 3% instances), ADV (1; 2% instances), VERB (1; 2% instances)