
This page pertains to UD version 2.

Treebank Statistics: UD_Japanese-GSDLUW: POS Tags: NOUN

There are 18540 NOUN lemmas (64%), 18790 NOUN types (60%) and 35040 NOUN tokens (23%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 2 in number of tokens.

The 10 most frequent NOUN lemmas: 事, 物, 為, 後, 他, 様, 中, 人, 時, 場合

The 10 most frequent NOUN types: こと, ため, もの, 後, よう, 人, 中, 他, 場合, お店

The 10 most frequent ambiguous lemmas: 後 (NOUN 177, ADV 11), 様 (AUX 257, NOUN 137), 中 (NOUN 127, ADV 1), 現在 (NOUN 76, ADV 27), 所 (NOUN 60, ADV 1), 前 (NOUN 46, ADV 1), 一部 (NOUN 44, ADV 2), 必要 (NOUN 40, ADJ 35), 結果 (NOUN 40, ADV 4), 全て (NOUN 36, ADV 22)

The 10 most frequent ambiguous types: 後 (NOUN 177, ADV 1), よう (AUX 256, NOUN 128), 中 (NOUN 115, ADV 1), 現在 (NOUN 76, ADV 27), 多く (ADJ 54, NOUN 51), 感じ (NOUN 51, VERB 33), 一部 (NOUN 44, ADV 2), 前 (NOUN 44, ADV 1), 必要 (NOUN 40, ADJ 35), 結果 (NOUN 40, ADV 4)

Morphology

The form / lemma ratio of NOUN is 1.013484 (the average of all parts of speech is 1.095294).
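The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. A minimal sketch, assuming that is how the figure is derived (the function name `form_lemma_ratio` is hypothetical):

```python
# Counts copied from the statistics on this page.
noun_types = 18790   # distinct NOUN word forms (types)
noun_lemmas = 18540  # distinct NOUN lemmas


def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(noun_types, noun_lemmas)
print(f"{ratio:.6f}")  # prints 1.013484
```

A value this close to 1 means that almost every NOUN lemma appears in a single written form.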

The highest number of forms (6) was observed with the lemma “_”: かっちゃ, セインツ, ドゥーナダン, ポストペイ, リアドロ, レーベンズ.

The second highest number of forms (4) was observed with the lemma “出し”: だし, ダシ, 出し, 出汁.

The third highest number of forms (also 4) was observed with the lemma “子供達”: 子どもたち, 子ども達, 子供たち, 子供達.

NOUN occurs with 1 feature: Polarity (3; 0% instances)

NOUN occurs with 1 feature-value pair: Polarity=Neg

NOUN occurs with 2 feature combinations. The most frequent feature combination is _ (35037 tokens). Examples: こと, ため, もの, 後, よう, 人, 中, 他, 場合, お店

Relations

NOUN nodes are attached to their parents using 12 different relations: obl (9924; 28% instances), nmod (9293; 27% instances), nsubj (6544; 19% instances), obj (4828; 14% instances), root (2090; 6% instances), compound (825; 2% instances), advcl (654; 2% instances), nsubj:outer (424; 1% instances), acl (419; 1% instances), ccomp (35; 0% instances), csubj (3; 0% instances), csubj:outer (1; 0% instances)

Parents of NOUN nodes belong to 14 different parts of speech (counting the empty tag of the artificial root node): VERB (18875; 54% instances), NOUN (10870; 31% instances), no tag, i.e. root attachment (2090; 6% instances), ADJ (1754; 5% instances), NUM (762; 2% instances), PROPN (554; 2% instances), ADV (87; 0% instances), PRON (31; 0% instances), AUX (8; 0% instances), INTJ (3; 0% instances), SCONJ (2; 0% instances), SYM (2; 0% instances), DET (1; 0% instances), X (1; 0% instances)
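Counts like the two lists above can be gathered from a CoNLL-U file by tallying the DEPREL column and the UPOS of each head. The following is only a sketch, not the script that produced this page: the function names are hypothetical, and `SAMPLE` is a toy sentence written for illustration, not taken from the treebank. Tokens whose head is 0 (the root) have no parent token, so the sketch records an empty POS for them.

```python
from collections import Counter

# Toy CoNLL-U sentence for illustration only (not from UD_Japanese-GSDLUW).
SAMPLE = (
    "# text = 犬が走る\n"
    "1\t犬\t犬\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "2\tが\tが\tADP\t_\t_\t1\tcase\t_\t_\n"
    "3\t走る\t走る\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    sentence = []
    for line in text.splitlines():
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        sentence.append(cols)
    if sentence:
        yield sentence


def attachment_stats(text, upos="NOUN"):
    """Count relations and parent POS tags for tokens with the given UPOS."""
    deprels = Counter()
    parent_pos = Counter()
    for sent in read_sentences(text):
        pos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            if cols[3] != upos:
                continue
            deprels[cols[7]] += 1
            # HEAD 0 is the artificial root, which has no POS tag.
            parent_pos[pos_by_id.get(cols[6], "")] += 1
    return deprels, parent_pos


deprels, parent_pos = attachment_stats(SAMPLE)
print(deprels, parent_pos)
```

Running the same function over the full treebank files would be expected to yield per-relation and per-POS totals of the kind listed above, provided the files follow the standard 10-column layout.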

1089 (3%) NOUN nodes are leaves.

14466 (41%) NOUN nodes have one child.

12956 (37%) NOUN nodes have two children.

6529 (19%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 16.
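As a quick consistency check, the four child-degree groups above should add up to the 35040 NOUN tokens reported at the top of the page, and the rounded percentages should follow from the counts. A small sketch using only the figures given here (the variable names are illustrative):

```python
# Child-degree counts copied from the statistics above.
child_degree_counts = {
    "leaves": 1089,
    "one child": 14466,
    "two children": 12956,
    "three or more children": 6529,
}

total = sum(child_degree_counts.values())

# Percentage of NOUN nodes in each group, rounded to whole percent.
shares = {
    label: round(100 * count / total)
    for label, count in child_degree_counts.items()
}

print(total)   # prints 35040
print(shares)
```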

Children of NOUN nodes are attached using 23 different relations: case (31522; 50% instances), nmod (10708; 17% instances), punct (7231; 11% instances), acl (6513; 10% instances), aux (1203; 2% instances), cop (1158; 2% instances), nsubj (1079; 2% instances), det (960; 2% instances), compound (792; 1% instances), obl (775; 1% instances), advmod (336; 1% instances), mark (335; 1% instances), obj (290; 0% instances), cc (185; 0% instances), amod (161; 0% instances), nummod (115; 0% instances), csubj (99; 0% instances), advcl (63; 0% instances), nsubj:outer (54; 0% instances), dep (26; 0% instances), discourse (4; 0% instances), csubj:outer (3; 0% instances), ccomp (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: ADP (31522; 50% instances), NOUN (10870; 17% instances), PUNCT (7231; 11% instances), VERB (5040; 8% instances), AUX (2361; 4% instances), ADJ (1733; 3% instances), PROPN (1723; 3% instances), DET (960; 2% instances), NUM (921; 1% instances), ADV (338; 1% instances), PRON (332; 1% instances), SCONJ (271; 0% instances), CCONJ (185; 0% instances), PART (65; 0% instances), SYM (55; 0% instances), INTJ (4; 0% instances), X (2; 0% instances)