
This page pertains to UD version 2.

Treebank Statistics: UD_Japanese-PUD: POS Tags: NUM

There are 205 NUM lemmas (4%), 208 NUM types (4%) and 655 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 8 in number of tokens.

The 10 most frequent NUM lemmas: 1, 2, 3, 10, 一, 4, 万, 6, 5, 何

The 10 most frequent NUM types: 1, 2, 3, 10, 一, 4, 万, 6, 5, 何

The most frequent ambiguous lemmas: 1 (NUM 50, NOUN 1), 何 (NUM 12, PRON 9), 9 (NUM 8, NOUN 1), 数 (NOUN 10, NUM 7)

The most frequent ambiguous types: 1 (NUM 50, NOUN 1), 何 (NUM 12, PRON 8), 9 (NUM 8, NOUN 1), 数 (NOUN 10, NUM 7), いく (NUM 6, VERB 1)

Morphology

The form / lemma ratio of NUM is 1.014634 (the average of all parts of speech is 1.068660).
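The reported ratio matches the number of NUM types divided by the number of NUM lemmas given earlier on this page (208 and 205). That reading is inferred from the figures, not from a documented definition, and the variable names below are illustrative only:

```python
# Counts taken from the summary line of this page.
num_types = 208   # distinct NUM word forms
num_lemmas = 205  # distinct NUM lemmas

# Assumption: "form / lemma ratio" means types divided by lemmas.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.014634
```

A value this close to 1 means that almost every NUM lemma appears in a single written form.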

The 1st highest number of forms (2) was observed with the lemma “一”: 一, 1.

The 2nd highest number of forms (2) was observed with the lemma “三”: Ⅲ, 三.

The 3rd highest number of forms (2) was observed with the lemma “二”: 二, 2.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 8 different relations: nummod (432; 66% of instances), compound (199; 30% of instances), nmod (10; 2% of instances), obl (6; 1% of instances), appos (3; 0% of instances), nsubj (3; 0% of instances), obj (1; 0% of instances), root (1; 0% of instances)
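The percentages in this list can be reproduced from the raw counts. The sketch below assumes that each percentage is the count divided by the total number of NUM tokens, rounded to the nearest integer; the dictionary name is illustrative. It also shows why relations with only a few instances are reported as 0%.

```python
# Relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 432, "compound": 199, "nmod": 10, "obl": 6,
    "appos": 3, "nsubj": 3, "obj": 1, "root": 1,
}

total = sum(relation_counts.values())  # should equal the 655 NUM tokens

# Assumption: percentages are rounded to the nearest whole number.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                  # 655
print(percentages["nummod"])  # 66
print(percentages["appos"])   # 0 (3 of 655 is about 0.46%)
```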

Parents of NUM nodes belong to 7 different parts of speech, counting the empty tag of the artificial root as one: NOUN (622; 95% of instances), ADV (11; 2% of instances), VERB (9; 1% of instances), NUM (7; 1% of instances), PROPN (4; 1% of instances), ADJ (1; 0% of instances), empty tag (1; 0% of instances)

629 (96%) NUM nodes are leaves.

3 (0%) NUM nodes have one child.

13 (2%) NUM nodes have two children.

10 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 9.
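Child-degree figures like these can be derived from a CoNLL-U file by counting, for each NUM token, how many tokens name it as their head. The following is a minimal sketch, not the script that generated this page. The two toy sentences are invented for illustration and are not taken from UD_Japanese-PUD, and the names `TOY`, `to_conllu`, `sentences` and `num_child_degrees` are hypothetical.

```python
from collections import Counter

# Invented toy data, written space-separated for readability.
# Columns follow CoNLL-U: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
TOY = """
1 3 3 NUM _ _ 2 nummod _ _
2 人 人 NOUN _ _ 4 nsubj _ _
3 が が ADP _ _ 2 case _ _
4 来た 来る VERB _ _ 0 root _ _

1 10 10 NUM _ _ 2 compound _ _
2 万 万 NUM _ _ 3 nummod _ _
3 円 円 NOUN _ _ 0 root _ _
"""

def to_conllu(text):
    """Turn the space-separated toy rows into tab-separated CoNLL-U lines."""
    return "\n".join("\t".join(line.split()) for line in text.strip().split("\n"))

def sentences(conllu):
    """Yield each sentence as a list of 10-column token rows."""
    for block in conllu.split("\n\n"):
        rows = [l.split("\t") for l in block.split("\n") if l and not l.startswith("#")]
        # Keep plain word lines only (skip multiword ranges and empty nodes).
        rows = [r for r in rows if r[0].isdigit()]
        if rows:
            yield rows

def num_child_degrees(conllu, upos="NUM"):
    """Map child degree -> number of tokens with the given UPOS tag."""
    degrees = Counter()
    for rows in sentences(conllu):
        children = Counter(r[6] for r in rows)  # HEAD id -> number of dependents
        for r in rows:
            if r[3] == upos:
                degrees[children[r[0]]] += 1
    return degrees

degrees = num_child_degrees(to_conllu(TOY))
print(dict(degrees))  # {0: 2, 1: 1}
```

In the toy data two NUM tokens are leaves and one has a single child; running the same function over the full treebank file would be expected to yield the leaf and child counts listed above.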

Children of NUM nodes are attached using 7 different relations: compound (28; 40% of instances), case (27; 39% of instances), punct (10; 14% of instances), nummod (2; 3% of instances), acl (1; 1% of instances), advmod (1; 1% of instances), obl (1; 1% of instances)

Children of NUM nodes belong to 8 different parts of speech: ADP (27; 39% of instances), NOUN (20; 29% of instances), PUNCT (10; 14% of instances), NUM (7; 10% of instances), PROPN (2; 3% of instances), SYM (2; 3% of instances), ADV (1; 1% of instances), VERB (1; 1% of instances)