Treebank Statistics: UD_Japanese-PUD: POS Tags: NOUN
There are 2775 NOUN lemmas (54%), 2804 NOUN types (51%) and 7424 NOUN tokens (26%).
Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
The 10 most frequent NOUN lemmas: 事, 年, 人, 者, 月, 等, 達, 後, 為, 日
The 10 most frequent NOUN types: こと, 年, 人, 者, 月, ら, ため, 後, 日, つ
The 10 most frequent ambiguous lemmas: 為 (NOUN 49, SCONJ 19), 日 (NOUN 47, ADV 2), 物 (NOUN 40, SCONJ 3), 前 (NOUN 33, ADV 1), 所 (NOUN 22, CCONJ 1), 様 (AUX 42, NOUN 22), 全て (NOUN 19, ADV 2), 以上 (NOUN 14, ADV 3), 必要 (ADJ 14, NOUN 14), 投票 (NOUN 14, VERB 3)
The 10 most frequent ambiguous types: ため (NOUN 48, SCONJ 19), 日 (NOUN 47, ADV 2), 多く (NOUN 36, ADJ 1), もの (NOUN 34, SCONJ 3), 前 (NOUN 33, ADV 1), よう (AUX 42, NOUN 22), 以上 (NOUN 14, ADV 3), 必要 (ADJ 14, NOUN 14), 投票 (NOUN 14, VERB 3), 投資 (NOUN 14, VERB 4)
Morphology
The form / lemma ratio of NOUN is 1.010450 (the average of all parts of speech is 1.068660).
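The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The sketch below reproduces that arithmetic. The function name `form_lemma_ratio` is hypothetical, and the assumption that the page computes the ratio this way is inferred from the numbers, not stated by the source.

```python
# Counts copied from the summary at the top of this page.
noun_types = 2804   # distinct NOUN word forms
noun_lemmas = 2775  # distinct NOUN lemmas


def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return n_forms / n_lemmas


ratio = form_lemma_ratio(noun_types, noun_lemmas)
print(f"{ratio:.6f}")
```

A value this close to 1 means that almost every NOUN lemma appears in a single written form. The exceptions listed below are mostly spelling variants.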
The 1st highest number of forms (3) was observed with the lemma “一人”: 1人, 一人, 1人.
The 2nd highest number of forms (3) was observed with the lemma “二人”: 2人, 二人, 2人.
The 3rd highest number of forms (3) was observed with the lemma “後”: あと, のち, 後.
NOUN occurs with 1 feature: Polarity (30; 0% instances)
NOUN occurs with 1 feature-value pair: Polarity=Neg
NOUN occurs with 2 feature combinations.
The most frequent feature combination is _ (7394 tokens).
Examples: こと, 年, 人, 者, 月, ら, ため, 後, 日, つ
Relations
NOUN nodes are attached to their parents using 12 different relations: compound (2031; 27% instances), nmod (1748; 24% instances), obl (1437; 19% instances), nsubj (1000; 13% instances), obj (765; 10% instances), acl (123; 2% instances), root (119; 2% instances), advcl (106; 1% instances), nsubj:outer (53; 1% instances), appos (33; 0% instances), ccomp (5; 0% instances), case (4; 0% instances)
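As a sanity check on the list above, the relation counts should add up to the 7424 NOUN tokens, and each percentage should be the count divided by that total, rounded to a whole number. The following sketch checks this using only the figures quoted above. The variable names are illustrative, and the rounding rule is an assumption inferred from the numbers.

```python
# Incoming relation counts for NOUN nodes, copied from the list above.
relation_counts = {
    "compound": 2031, "nmod": 1748, "obl": 1437, "nsubj": 1000,
    "obj": 765, "acl": 123, "root": 119, "advcl": 106,
    "nsubj:outer": 53, "appos": 33, "ccomp": 5, "case": 4,
}

total = sum(relation_counts.values())

# Share of each relation, rounded to the nearest whole percent (assumed rule).
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```

Relations shown with 0% are not absent; they simply account for fewer than 0.5% of NOUN tokens.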
Parents of NOUN nodes belong to 9 different parts of speech: NOUN (3700; 50% instances), VERB (3135; 42% instances), PROPN (203; 3% instances), ADJ (190; 3% instances), no parent, i.e. the NOUN is the root (119; 2% instances), ADV (38; 1% instances), NUM (20; 0% instances), PRON (15; 0% instances), AUX (4; 0% instances)
1965 (26%) NOUN nodes are leaves.
1032 (14%) NOUN nodes have one child.
2297 (31%) NOUN nodes have two children.
2130 (29%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 13.
Children of NOUN nodes are attached using 22 different relations: case (4969; 36% instances), compound (2551; 19% instances), nmod (2092; 15% instances), punct (1219; 9% instances), acl (1047; 8% instances), nummod (429; 3% instances), cop (335; 2% instances), det (204; 1% instances), fixed (150; 1% instances), mark (146; 1% instances), nsubj (146; 1% instances), aux (97; 1% instances), amod (81; 1% instances), advmod (64; 0% instances), obl (50; 0% instances), cc (47; 0% instances), appos (30; 0% instances), obj (21; 0% instances), advcl (10; 0% instances), dep (7; 0% instances), csubj (4; 0% instances), nsubj:outer (3; 0% instances)
Children of NOUN nodes belong to 15 different parts of speech: ADP (5056; 37% instances), NOUN (3700; 27% instances), PUNCT (1219; 9% instances), VERB (759; 6% instances), PROPN (758; 6% instances), NUM (622; 5% instances), AUX (474; 3% instances), ADJ (364; 3% instances), DET (204; 1% instances), PRON (183; 1% instances), PART (117; 1% instances), SYM (105; 1% instances), ADV (65; 0% instances), CCONJ (47; 0% instances), SCONJ (29; 0% instances)