
This page pertains to UD version 2.

Treebank Statistics: UD_English-EWT: POS Tags: X

There are 641 X lemmas (3%), 644 X types (3%) and 987 X tokens (0%). Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 15 in number of tokens.
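As a rough illustration of how figures like these could be derived, the sketch below counts distinct lemmas, distinct word forms (types) and tokens for one UPOS tag in CoNLL-U text. The `SAMPLE` fragment and the `tag_counts` helper are hypothetical and not taken from UD_English-EWT or from the script that generated this page. Treating types as case-sensitive word forms is an assumption, based on the type list on this page showing capitalized forms such as "Access".

```python
# Hand-made CoNLL-U fragment (hypothetical, not real treebank data).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = """\
1\tReports\treport\tNOUN\t_\t_\t0\troot\t_\t_
2\t,\t,\tPUNCT\t_\t_\t3\tpunct\t_\t_
3\tetc\tetc.\tX\t_\t_\t1\tconj\t_\t_
4\tetc.\tetc.\tX\t_\t_\t1\tconj\t_\t_

1\tAccess\taccess\tX\t_\t_\t0\troot\t_\t_
"""


def tag_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # word form, case-sensitive (assumption)
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


print(tag_counts(SAMPLE, "X"))
```

On the sample, the three X tokens share two lemmas (etc., access) but have three distinct forms.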

The 10 most frequent X lemmas: etc, etc., .doc, carol.st.clair@enron.com, -, (, ), a, access, analysis_0712

The 10 most frequent X types: etc, etc., .doc, carol.st.clair@enron.com, -, (, ), Access, Analysis_0712, COMMUNICATIONS

The 10 most frequent ambiguous lemmas: etc. (X 20, ADJ 1, NOUN 1), - (PUNCT 1649, SYM 119, X 8), ( (PUNCT 1030, X 7), ) (PUNCT 1067, X 7), a (DET 5354, NOUN 14, NUM 7, SYM 6, X 6, ADV 3, ADJ 1, PROPN 1), access (NOUN 32, VERB 6, X 6), and (CCONJ 6111, X 6), pricing (NOUN 13, X 6), transmission (X 6, NOUN 5), for (ADP 2094, SCONJ 189, CCONJ 6, X 5, VERB 1)

The 10 most frequent ambiguous types: etc. (X 18, ADJ 1, NOUN 1), - (PUNCT 1628, SYM 119, X 8), ( (PUNCT 1030, X 7), ) (PUNCT 1067, X 7), Oct (PROPN 8, X 6), Pricing (X 6, VERB 1), Transmission (X 6, PROPN 2, NOUN 1), a (DET 4542, ADP 6, NUM 6, NOUN 4, ADV 3, X 2, ADJ 1, AUX 1, CCONJ 1, PART 1), and (CCONJ 5915, X 6, DET 5, ADP 2), for (ADP 2025, SCONJ 187, CCONJ 5, X 5)

Morphology

The form / lemma ratio of X is 1.004680 (the average of all parts of speech is 1.220397).
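The reported ratio is consistent with dividing the number of X types by the number of X lemmas given at the top of this page. The short check below assumes that this is how the figure is computed; the variable names are illustrative.

```python
# Counts reported at the top of this page for the X tag.
x_types = 644
x_lemmas = 641

# Assumed definition: distinct word forms divided by distinct lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")
```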

The 1st highest number of forms (3) was observed with the lemma “etc.”: ect., etc, etc..

The 2nd highest number of forms (2) was observed with the lemma “et”: et, et..

The 3rd highest number of forms (2) was observed with the lemma “space.com”: SPACE.com, Space.com.

X occurs with 9 features: Number (54; 5% instances), Foreign (42; 4% instances), Degree (13; 1% instances), VerbForm (6; 1% instances), PronType (3; 0% instances), Typo (3; 0% instances), Mood (2; 0% instances), Abbr (1; 0% instances), NumType (1; 0% instances)

X occurs with 10 feature-value pairs: Abbr=Yes, Degree=Pos, Foreign=Yes, Mood=Imp, NumType=Card, Number=Sing, PronType=Int, Typo=Yes, VerbForm=Fin, VerbForm=Inf

X occurs with 12 feature combinations. The most frequent feature combination is _ (866 tokens). Examples: etc, etc., carol.st.clair@enron.com, -, (, ), Access, Analysis_0712, COMMUNICATIONS, Oct

Relations

X nodes are attached to their parents using 23 different relations: goeswith (259; 26% instances), root (239; 24% instances), appos (89; 9% instances), list (89; 9% instances), conj (73; 7% instances), compound (60; 6% instances), flat (52; 5% instances), obl (33; 3% instances), flat:foreign (32; 3% instances), nmod (12; 1% instances), parataxis (12; 1% instances), obj (10; 1% instances), amod (7; 1% instances), case (5; 1% instances), cc (4; 0% instances), dep (2; 0% instances), nsubj (2; 0% instances), obl:npmod (2; 0% instances), advcl (1; 0% instances), discourse (1; 0% instances), nmod:tmod (1; 0% instances), reparandum (1; 0% instances), xcomp (1; 0% instances)

Parents of X nodes belong to 13 different parts of speech: NOUN (342; 35% instances), [root, no part of speech] (239; 24% instances), X (149; 15% instances), PROPN (104; 11% instances), VERB (83; 8% instances), ADJ (29; 3% instances), ADV (17; 2% instances), ADP (7; 1% instances), PRON (6; 1% instances), NUM (3; 0% instances), PUNCT (3; 0% instances), SCONJ (3; 0% instances), AUX (2; 0% instances)

706 (72%) X nodes are leaves.

140 (14%) X nodes have one child.

97 (10%) X nodes have two children.

44 (4%) X nodes have three or more children.

The highest child degree of an X node is 19.
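The four child-degree groups above should partition all 987 X tokens. The sketch below checks that and recomputes the rounded percentages; the dictionary keys are illustrative labels, and the rounding to whole percents is an assumption about how the page formats its figures.

```python
# Child-degree counts for X nodes as reported on this page.
degree_counts = {
    "leaf": 706,
    "one child": 140,
    "two children": 97,
    "three or more children": 44,
}

total = sum(degree_counts.values())

# Share of each group, rounded to a whole percent (assumed formatting).
shares = {name: round(100 * n / total) for name, n in degree_counts.items()}

print(total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```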

Children of X nodes are attached using 19 different relations: punct (262; 50% instances), goeswith (60; 12% instances), case (45; 9% instances), flat:foreign (32; 6% instances), compound (28; 5% instances), conj (14; 3% instances), appos (13; 3% instances), list (13; 3% instances), obl (12; 2% instances), cc (9; 2% instances), parataxis (9; 2% instances), cop (5; 1% instances), nsubj (5; 1% instances), nmod (3; 1% instances), nmod:tmod (3; 1% instances), nummod (3; 1% instances), det (2; 0% instances), amod (1; 0% instances), flat (1; 0% instances)

Children of X nodes belong to 12 different parts of speech: PUNCT (262; 50% instances), X (149; 29% instances), ADP (44; 8% instances), NOUN (19; 4% instances), NUM (17; 3% instances), VERB (9; 2% instances), CCONJ (7; 1% instances), AUX (5; 1% instances), PROPN (4; 1% instances), DET (2; 0% instances), ADJ (1; 0% instances), PRON (1; 0% instances)
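The two breakdowns of children (by relation and by part of speech) describe the same set of nodes, so their totals should agree. A minimal consistency check, using only the counts listed above, might look like this; the variable names are illustrative.

```python
# Children of X nodes by dependency relation, as listed on this page.
child_relations = {
    "punct": 262, "goeswith": 60, "case": 45, "flat:foreign": 32,
    "compound": 28, "conj": 14, "appos": 13, "list": 13, "obl": 12,
    "cc": 9, "parataxis": 9, "cop": 5, "nsubj": 5, "nmod": 3,
    "nmod:tmod": 3, "nummod": 3, "det": 2, "amod": 1, "flat": 1,
}

# Children of X nodes by part of speech, as listed on this page.
child_upos = {
    "PUNCT": 262, "X": 149, "ADP": 44, "NOUN": 19, "NUM": 17, "VERB": 9,
    "CCONJ": 7, "AUX": 5, "PROPN": 4, "DET": 2, "ADJ": 1, "PRON": 1,
}

relation_total = sum(child_relations.values())
upos_total = sum(child_upos.values())

# Share of punctuation among all children, as a whole percent.
punct_share = round(100 * child_relations["punct"] / relation_total)

print(relation_total, upos_total, punct_share)
```

Both totals come to the same number of child nodes, and the 149 children tagged X match the 149 X nodes whose parent is also X.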