Treebank Statistics: UD_English-EWT: POS Tags: X
There are 556 X lemmas (3%), 641 X types (3%) and 918 X tokens (0%).
Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 16 in number of tokens.
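The lemma, type and token counts above can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that a type is a case-sensitive word form and that multiword-token ranges and empty nodes are skipped; `SAMPLE` and `tag_counts` are hypothetical names, and the sample sentence is hand-made rather than taken from the treebank.

```python
# Tiny hand-made CoNLL-U fragment (illustrative only, not treebank data).
SAMPLE = """\
# text = Access and access .
1\tAccess\taccess\tX\t_\t_\t0\troot\t_\t_
2\tand\tand\tX\t_\t_\t1\tcc\t_\t_
3\taccess\taccess\tX\t_\t_\t1\tconj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def tag_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines and sentence-level comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            types.add(cols[1])   # column 2 is FORM (case-sensitive here)
            lemmas.add(cols[2])  # column 3 is LEMMA
            tokens += 1
    return len(lemmas), len(types), tokens


print(tag_counts(SAMPLE, "X"))
```

On the sample, "Access" and "access" count as two types but share one lemma, which is why the type count can exceed the lemma count.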
The 10 most frequent X lemmas: _, .doc, carol.st.clair@enron.com, (, ), -, Analysis_0712, access, and, ekrapels@esaibos.com
The 10 most frequent X types: .doc, carol.st.clair@enron.com, -, (, ), Access, Analysis_0712, COMMUNICATIONS, Oct, Pricing
The 10 most frequent ambiguous lemmas: _ (X 155, PUNCT 5), ( (PUNCT 1030, X 7), ) (PUNCT 1067, X 7), - (PUNCT 1647, SYM 119, X 6), access (NOUN 32, VERB 6, X 6), and (CCONJ 6111, X 6), pricing (NOUN 13, X 6), transmission (X 6, NOUN 5), mid (X 5, ADJ 1, ADV 1), & (CCONJ 139, X 4)
The 10 most frequent ambiguous types: - (PUNCT 1626, SYM 119, X 8), ( (PUNCT 1030, X 7), ) (PUNCT 1067, X 7), Oct (PROPN 8, X 6), Pricing (X 6, VERB 1), Transmission (X 6, PROPN 2, NOUN 1), a (DET 4542, ADP 7, NUM 6, NOUN 4, ADV 2, X 2, ADJ 1, AUX 1, CCONJ 1, PART 1), and (CCONJ 5915, X 6, DET 5, ADP 2), for (ADP 2029, SCONJ 183, CCONJ 5, X 5), mid (X 4, NOUN 2, ADJ 1, ADV 1)
- -
- (
- )
- Oct
- Pricing
- Transmission
- a
  - DET 4542: Read the entire article ; there ‘s a punchline , too .
  - ADP 7: Big deal kind a stuff .
  - NUM 6: 2 ) I would like to say on a island with an a ) all inclusive resort ( if possible ) , and a beach front room
  - NOUN 4: Top range of bike , cheap prices , excellent a +++
  - ADV 2: Also , any tour recommendations would be very helpful a well .
  - X 2: A la guerre c’est comme a la guerre !
  - ADJ 1: there will be talent and opportunity a plenty on the market soon .
  - AUX 1: yea i guess but rabbits a easily escape a pen or another rabbit could get in there and that rabbit could be the opposite gender .
  - CCONJ 1: But word of advice if you ‘re get your girlfriend a laptop make sure it s a good brand a not something like DELL , Acer , Asus , eMachines etc .
  - PART 1: I feel X - BOX is a very smooth system i own it like 3 years , it s very compatible to previous versions and mostly important i was very comfortable with the User Interface and the JOYSTICK …. coz you do nt wan a hold a joystick that gives you discomfort .
- and
  - CCONJ 5915: Right now that seems to be the US , EU , and IAEA .
  - X 6: « Alberta Transmission Access and Pricing Analysis_0712 .doc »
  - DET 5: it s your cat you can pick and name you want
  - ADP 2: The people there attempt to come across and professional and nice , but I was disappointed with their customer service .
- for
  - ADP 2029: Yet we did n’t charge them for the evacuation .
  - SCONJ 183: Thanks for thinking of me to send it to .
  - CCONJ 5: Neither was this day less fortunate to his father Philip ; for on the same day he took Potidea ; » - JOHN AUBREY , F.R.S.
  - X 5: So they see the pictures flicker slower and there for it seems choppy to them .
- mid
  - X 4: Otherwise , I will be sending it to Peoples as our final revision by mid morning .
  - NOUN 2: Otherwise , I will be sending it to Peoples as our final revision by mid morning .
  - ADJ 1: We did not have big percent of Chinese migration until the mid 90s .
  - ADV 1: So I kept reading and then I saw the dates , it was from mid day Friday and arriving home mid day monday . :(
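An ambiguous type, as listed above, is a word form that occurs with X and with at least one other tag. A minimal sketch of that tally, assuming (form, tag) pairs have already been read from the treebank; `ambiguous_types` and the sample pairs are hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict


def ambiguous_types(pairs, upos):
    """Map each form seen with `upos` and some other tag to its tag counts."""
    by_form = defaultdict(Counter)
    for form, tag in pairs:
        by_form[form][tag] += 1
    return {
        form: tags.most_common()  # most frequent tag first
        for form, tags in by_form.items()
        if upos in tags and len(tags) > 1
    }


# Hand-made pairs, not real treebank frequencies.
pairs = [
    ("mid", "X"), ("mid", "X"), ("mid", "NOUN"),
    ("Oct", "PROPN"),
    (".doc", "X"),
]
print(ambiguous_types(pairs, "X"))
```

Forms tagged only X (".doc") or never tagged X ("Oct" in this sample) are excluded, so only "mid" is reported.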
Morphology
The form / lemma ratio of X is 1.152878 (the average of all parts of speech is 1.234270).
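The reported ratio is consistent with dividing the number of X types by the number of X lemmas given at the top of this page. A quick check, assuming that is how the figure is derived (the variable names are illustrative):

```python
x_types = 641   # distinct X word forms reported on this page
x_lemmas = 556  # distinct X lemmas reported on this page

ratio = x_types / x_lemmas
print(f"{ratio:.6f}")
```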
The 1st highest number of forms (99) was observed with the lemma “_”: -, 20, 3-5290, @, A, Abramo@ENRON, Akin@ECT, Alatorre@ENRON, Bertone@ENRON_DEVELOPMENT, Bryngelson@AZURIX, C, COMMUNICATIONS, Castagnola@ENRON_DEVELOPMENT, Delainey@ECT, Diebner@ECT, Dorsey@ENRON_DEVELOPMENT, E, ECT, Forster@ENRON, Garcia@ENRON, Griffith@ENRON, Hansen@ENRON, Hopkinson@ENRON_DEVELOPMENT, Horton@ENRON, Huble@ENRON, J, Jacoby@ECT, Johnson@ENRON, Kaminski@ECT, Kaufman@ECT, Khan@TRANSREDES, Kindall@ENRON, Leibman@ENRON, Leigh, Mann@ENRON, Martinez@ENRON, McConnell@ECT, Oct, P, Patel@ENRON, Perry@ENRON_DEVELOPMENT, Rice@ENRON, Schwartzenburg@ENRON_DEVELOPMENT, Shackleton@ECT, Sullivan@ENRON, W, Ward, Warner@ENRON, Williams@ENRON_DEVELOPMENT, back, cent, cooked, d, day, deed, donald, dramatic, educated, ever, expose, fall, for, full, get, going, h, hill, ible, in, informed, ive, line, mail, mentioned, morning, night, notebook.url, o, one, order, out, plenty, power, priced, r, respect, s, self, ship, side, standing, t, time, to, together, u, way, were, where.
The 2nd highest number of forms (2) was observed with the lemma “et”: et, et..
The 3rd highest number of forms (2) was observed with the lemma “space.com”: SPACE.com, Space.com.
X occurs with 3 features: Typo (44; 5% instances), Foreign (42; 5% instances), Number (34; 4% instances)
X occurs with 3 feature-value pairs: Foreign=Yes, Number=Sing, Typo=Yes
X occurs with 4 feature combinations.
The most frequent feature combination is _ (798 tokens).
Examples: carol.st.clair@enron.com, -, (, ), Access, Analysis_0712, COMMUNICATIONS, Oct, Pricing, Transmission
Relations
X nodes are attached to their parents using 23 different relations: root (238; 26% instances), flat (160; 17% instances), goeswith (155; 17% instances), appos (90; 10% instances), list (82; 9% instances), compound (59; 6% instances), obl (33; 4% instances), flat:foreign (32; 3% instances), conj (13; 1% instances), nmod (12; 1% instances), parataxis (11; 1% instances), obj (9; 1% instances), amod (7; 1% instances), case (4; 0% instances), cc (2; 0% instances), dep (2; 0% instances), nsubj (2; 0% instances), obl:npmod (2; 0% instances), advcl (1; 0% instances), discourse (1; 0% instances), iobj (1; 0% instances), nmod:tmod (1; 0% instances), reparandum (1; 0% instances)
Parents of X nodes belong to 13 different parts of speech: NOUN (296; 32% instances), [root, no part of speech] (238; 26% instances), X (149; 16% instances), PROPN (103; 11% instances), VERB (69; 8% instances), ADJ (24; 3% instances), ADV (17; 2% instances), PRON (8; 1% instances), ADP (7; 1% instances), PUNCT (3; 0% instances), AUX (2; 0% instances), NUM (1; 0% instances), SCONJ (1; 0% instances)
673 (73%) X nodes are leaves.
105 (11%) X nodes have one child.
97 (11%) X nodes have two children.
43 (5%) X nodes have three or more children.
The highest child degree of an X node is 19.
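The four child-degree counts add up to the 918 X tokens, and the percentages follow from rounding each share to a whole percent. A small check, assuming plain rounding of `100 * count / total` (the dictionary keys are illustrative labels):

```python
degree_counts = {
    "leaves": 673,
    "one child": 105,
    "two children": 97,
    "three or more children": 43,
}

total = sum(degree_counts.values())  # should equal the number of X tokens
shares = {label: round(100 * n / total) for label, n in degree_counts.items()}

print(total)
print(shares)
```

Because each share is rounded separately, the percentages need not sum to exactly 100.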
Children of X nodes are attached using 19 different relations: punct (226; 47% instances), goeswith (62; 13% instances), case (45; 9% instances), flat:foreign (32; 7% instances), compound (26; 5% instances), conj (14; 3% instances), appos (13; 3% instances), list (12; 2% instances), obl (12; 2% instances), parataxis (9; 2% instances), cc (8; 2% instances), cop (5; 1% instances), nsubj (5; 1% instances), nmod (3; 1% instances), nmod:tmod (3; 1% instances), nummod (3; 1% instances), det (2; 0% instances), amod (1; 0% instances), flat (1; 0% instances)
Children of X nodes belong to 12 different parts of speech: PUNCT (226; 47% instances), X (149; 31% instances), ADP (44; 9% instances), NOUN (19; 4% instances), NUM (17; 4% instances), VERB (9; 2% instances), CCONJ (6; 1% instances), AUX (5; 1% instances), PROPN (3; 1% instances), DET (2; 0% instances), ADJ (1; 0% instances), PRON (1; 0% instances)