
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-GSD: POS Tags: X

There are 849 X lemmas (4%), 849 X types (4%) and 1209 X tokens (1%). Out of 15 observed tags, the rank of X is: 6 in number of lemmas, 6 in number of types and 14 in number of tokens.
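As a rough illustration of how figures of this kind can be derived, the sketch below counts distinct lemmas, distinct word forms (types) and tokens for one tag in CoNLL-U text. The sample fragment and the function name `tag_counts` are hypothetical, and the sketch assumes that "types" means distinct word forms; it is not the script that generated this page.

```python
# Hypothetical three-token CoNLL-U fragment (not taken from UD_Chinese-GSD).
SAMPLE = (
    "# text = the NBA the\n"
    "1\tthe\tthe\tX\t_\t_\t2\tflat:foreign\t_\t_\n"
    "2\tNBA\tNBA\tX\t_\t_\t0\troot\t_\t_\n"
    "3\tthe\tthe\tX\t_\t_\t2\tflat:foreign\t_\t_\n"
)

def tag_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens

counts = tag_counts(SAMPLE, "X")
```

For the sample above, the repeated token "the" counts once as a lemma and once as a type, but twice as a token.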

The 10 most frequent X lemmas: 的、 了、 the、 A、 的話、 NBA、 JR、 of、 B、 Google

The 10 most frequent X types: 的、 了、 the、 A、 的話、 NBA、 JR、 of、 B、 Google

The 10 most frequent ambiguous lemmas: 的 (PART 5503, X 134), 了 (PART 765, X 43, VERB 2), , (PUNCT 7694, X 4, ADV 1, NUM 1), ( (PUNCT 559, X 3), ) (PUNCT 559, X 3), Casey (X 3, PROPN 1), F (X 2, PROPN 1), James (X 2, PROPN 1), °C (X 2, NOUN 1), 之 (PART 247, PRON 23, X 1)

The 10 most frequent ambiguous types: 的 (PART 5503, X 134), 了 (PART 765, X 43, VERB 2), , (PUNCT 7694, X 4, ADV 1, NUM 1), ( (PUNCT 559, X 3), ) (PUNCT 559, X 3), Casey (X 3, PROPN 1), F (X 2, PROPN 1), James (X 2, PROPN 1), °C (X 2, NOUN 1), 之 (PART 247, PRON 23, X 1)

Morphology

The form / lemma ratio of X is 1.000000 (the average of all parts of speech is 1.000266).
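The ratio above is consistent with dividing the number of distinct forms by the number of distinct lemmas (849 / 849 = 1.000000). A minimal sketch under that assumption follows; the helper `form_lemma_ratio` and the sample pairs are hypothetical.

```python
def form_lemma_ratio(pairs):
    """pairs: iterable of (form, lemma) tuples for one part of speech."""
    pairs = list(pairs)
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)

# Hypothetical tokens: every form is identical to its lemma.
PAIRS = [("the", "the"), ("NBA", "NBA"), ("the", "the")]
ratio = form_lemma_ratio(PAIRS)

# The figures reported on this page: 849 X types and 849 X lemmas.
reported = "%.6f" % (849 / 849)
```

A ratio of exactly 1 means no lemma was observed with more than one form, which matches the three "highest number of forms (1)" entries listed below.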

The 1st highest number of forms (1) was observed with the lemma “#A”: #A.

The 2nd highest number of forms (1) was observed with the lemma “#B”: #B.

The 3rd highest number of forms (1) was observed with the lemma “#C”: #C.

X occurs with 2 features: Mood (8; 1% instances), Aspect (1; 0% instances)

X occurs with 2 feature-value pairs: Aspect=Perf, Mood=Inter

X occurs with 3 feature combinations. The most frequent feature combination is _ (1200 tokens). Examples: 的、 了、 the、 A、 的話、 NBA、 JR、 of、 B、 Google

Relations

X nodes are attached to their parents using 17 different relations: flat:foreign (322; 27% instances), appos (246; 20% instances), discourse (194; 16% instances), nmod (120; 10% instances), nsubj (80; 7% instances), obj (56; 5% instances), conj (45; 4% instances), case:suff (36; 3% instances), dep (34; 3% instances), det (29; 2% instances), obl (26; 2% instances), root (8; 1% instances), nummod (5; 0% instances), acl (3; 0% instances), advmod (2; 0% instances), amod (2; 0% instances), ccomp (1; 0% instances)
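The percentages above can be checked against the raw counts. The sketch below copies the 17 counts from the list and rounds each share to a whole percent; the variable names are illustrative, and rounding to the nearest integer is an assumption about how the page was produced.

```python
# Counts copied from the relation list above.
RELATION_COUNTS = {
    "flat:foreign": 322, "appos": 246, "discourse": 194, "nmod": 120,
    "nsubj": 80, "obj": 56, "conj": 45, "case:suff": 36, "dep": 34,
    "det": 29, "obl": 26, "root": 8, "nummod": 5, "acl": 3,
    "advmod": 2, "amod": 2, "ccomp": 1,
}

total = sum(RELATION_COUNTS.values())  # should equal the number of X tokens

# Share of each relation, rounded to a whole percent.
shares = {rel: round(100 * n / total) for rel, n in RELATION_COUNTS.items()}
```

The counts sum to 1209, the number of X tokens reported earlier, so every X node is covered by exactly one incoming relation.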

Parents of X nodes belong to 10 different parts of speech: X (366; 30% instances), VERB (333; 28% instances), NOUN (244; 20% instances), PART (118; 10% instances), PROPN (81; 7% instances), ADJ (44; 4% instances), NUM (12; 1% instances), root, i.e. no parent word (8; 1% instances), ADP (2; 0% instances), PRON (1; 0% instances)

725 (60%) X nodes are leaves.

117 (10%) X nodes have one child.

151 (12%) X nodes have two children.

216 (18%) X nodes have three or more children.

The highest child degree of an X node is 10.
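The leaf and child-count figures above come from counting, for each X node, how many other nodes name it as their head. A minimal sketch of that bucketing follows, using a hypothetical four-node tree given as (ID, HEAD, UPOS) tuples; the function name `degree_buckets` is an assumption for illustration.

```python
from collections import Counter

def degree_buckets(sentence, upos):
    """sentence: list of (id, head, upos) tuples for one dependency tree.

    Returns a Counter keyed by child count, where key 3 means
    "three or more children".
    """
    n_children = Counter(head for _, head, _ in sentence)
    buckets = Counter()
    for node_id, _, tag in sentence:
        if tag != upos:
            continue
        buckets[min(n_children[node_id], 3)] += 1
    return buckets

# Hypothetical tree: node 2 is the root and heads nodes 1, 3 and 4.
SENT = [(1, 2, "X"), (2, 0, "X"), (3, 2, "X"), (4, 2, "PUNCT")]
buckets = degree_buckets(SENT, "X")
```

In this sample, nodes 1 and 3 are leaves and node 2 falls in the "three or more" bucket. On this page the four buckets (725 + 117 + 151 + 216) add up to the 1209 X tokens.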

Children of X nodes are attached using 21 different relations: punct (576; 46% instances), flat:foreign (319; 25% instances), appos (59; 5% instances), nmod (53; 4% instances), conj (46; 4% instances), case (29; 2% instances), case:dec (29; 2% instances), cc (27; 2% instances), cop (17; 1% instances), det (17; 1% instances), nsubj (16; 1% instances), dep (14; 1% instances), nummod (14; 1% instances), acl (13; 1% instances), acl:relcl (10; 1% instances), amod (7; 1% instances), dislocated (2; 0% instances), nmod:tmod (2; 0% instances), advmod (1; 0% instances), csubj (1; 0% instances), obj (1; 0% instances)

Children of X nodes belong to 15 different parts of speech: PUNCT (572; 46% instances), X (366; 29% instances), NOUN (101; 8% instances), PART (57; 5% instances), ADP (38; 3% instances), CCONJ (26; 2% instances), NUM (23; 2% instances), VERB (20; 2% instances), AUX (17; 1% instances), PROPN (17; 1% instances), ADJ (8; 1% instances), SYM (4; 0% instances), PRON (2; 0% instances), ADV (1; 0% instances), DET (1; 0% instances)