
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EWT: POS Tags: X

There are 132 X lemmas (1%), 194 X types (1%) and 291 X tokens (0%). Out of 16 observed tags, the rank of X is: 8 in number of lemmas, 9 in number of types and 16 in number of tokens.
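The three counts above can be reproduced from a CoNLL-U file with a short script. The sketch below is only an illustration and not the tool that generated this page. It assumes that a "type" is a distinct word form compared case-sensitively, and that multiword-token ranges and empty nodes are skipped. The function name `count_upos` and the toy `SAMPLE` data are hypothetical and are not taken from the treebank.

```python
def count_upos(conllu_text, upos="X"):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment line
        cols = line.split("\t")
        # Keep only ordinary word lines: 10 columns and a plain integer ID
        # (this skips ranges such as 1-2 and empty nodes such as 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Toy data invented for illustration (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
ROWS = [
    ("1", "in", "in", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    ("2", "my", "my", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("3", "opinion", "opinion", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
    None,  # sentence boundary
    ("1", "smth", "smth", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    ("2", "smth.", "smth", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
]
SAMPLE = "\n".join("\t".join(r) if r else "" for r in ROWS)

print(count_upos(SAMPLE))  # (4, 5, 5)
```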

The 10 most frequent X lemmas: _, to, my, of, or, smth, the, from, opinion, Nooot

The 10 most frequent X types: to, 000, s, a, my, of, or, the, u, from

The 10 most frequent ambiguous lemmas: _ (X 91, PUNCT 3), a (NOUN 23, ADV 8, CCONJ 4, SCONJ 4, X 2, PROPN 1), la (X 2, ADV 1), need (X 2, PRON 1), nõu (NOUN 9, X 2), u (ADV 3, NOUN 2, X 2), I (ADJ 3, X 1), IT (NOUN 4, X 1), NB (PROPN 1, X 1), South (PROPN 2, X 1)

The 10 most frequent ambiguous types: 000 (X 13, NUM 1), a (NOUN 17, ADV 8, CCONJ 4, X 4, PROPN 1), u (X 4, NOUN 2, ADV 1), la (X 2, ADV 1), need (PRON 74, DET 68, X 2), olla (AUX 89, VERB 18, X 2, ADV 1), st (ADV 8, X 2), tehas (X 2, NOUN 1), - (PUNCT 325, X 1), 3 (NUM 57, ADJ 1, X 1)

Morphology

The form / lemma ratio of X is 1.469697 (the average of all parts of speech is 1.732282).
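The reported ratio is consistent with dividing the number of X types by the number of X lemmas given earlier on this page (194 and 132). The snippet below checks that arithmetic; treating the ratio as types divided by lemmas is an assumption inferred from those figures, not a documented definition.

```python
x_types = 194   # X types reported on this page
x_lemmas = 132  # X lemmas reported on this page

ratio = x_types / x_lemmas
print(f"{ratio:.6f}")  # 1.469697
```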

The 1st highest number of forms (64) was observed with the lemma “_”: +++, -, -1, -dega, -e, -ga, 000, 02, 3, 300, 472, AT, a, aastasele, aegaset, arvuti, de, desid, eestist, füüsikat, ga, gravitatsioonist, homme, hot.ee, itaaliast, karantiin, keemikud, kingades, konkurent, kord, korealane, koreas, kraadise, kõik, meeskonnal, mõistusele, n, ne, olla, osa, panek, parameeter, pealt, refereri, relva, s, sama, sele, seni, sest, sinane, st, sõbralik, tasandil, teadus, tehas, tehnoloogia, tulesid, täht, u, valdkonnas, versiooni, vähem, üks.

The 2nd highest number of forms (2) was observed with the lemma “smth”: smth, smth..

The 3rd highest number of forms (1) was observed with the lemma “**”: **.

X occurs with 5 features: Foreign (153; 53% instances), Abbr (6; 2% instances), Case (1; 0% instances), Number (1; 0% instances), Typo (1; 0% instances)

X occurs with 5 feature-value pairs: Abbr=Yes, Case=Nom, Foreign=Yes, Number=Sing, Typo=Yes

X occurs with 6 feature combinations. The most frequent feature combination is Foreign=Yes (152 tokens). Examples: to, my, the, from, of, opinion, or, smth, Suchen, You

Relations

X nodes are attached to their parents using 16 different relations: goeswith (91; 31% instances), flat:foreign (62; 21% instances), flat (47; 16% instances), dep (26; 9% instances), parataxis (17; 6% instances), root (15; 5% instances), nmod (7; 2% instances), appos (5; 2% instances), obj (5; 2% instances), conj (4; 1% instances), nsubj (4; 1% instances), discourse (3; 1% instances), obl (2; 1% instances), ccomp (1; 0% instances), list (1; 0% instances), nsubj:cop (1; 0% instances)
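A distribution like the one above could be computed by counting the DEPREL column for every X token. The following is a minimal sketch under the assumption that percentages are rounded to the nearest integer; `deprel_distribution` and the toy rows are hypothetical and only illustrate the idea.

```python
from collections import Counter


def deprel_distribution(conllu_text, upos="X"):
    """Return [(relation, count, rounded percentage)] for tokens with the given UPOS."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Ordinary word lines have 10 columns and an integer ID.
        if len(cols) == 10 and cols[0].isdigit() and cols[3] == upos:
            counts[cols[7]] += 1  # DEPREL column
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Toy sentence invented for illustration.
ROWS = [
    ("1", "in", "in", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    ("2", "my", "my", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("3", "opinion", "opinion", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(r) for r in ROWS)

print(deprel_distribution(SAMPLE))  # [('flat:foreign', 2, 67), ('root', 1, 33)]
```

With the figures on this page, the same rounding gives 31% for goeswith (91 of 291 X tokens).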

Parents of X nodes belong to 12 different parts of speech: X (100; 34% instances), NOUN (69; 24% instances), VERB (33; 11% instances), NUM (23; 8% instances), PROPN (22; 8% instances), the artificial root, which has no part of speech (15; 5% instances), ADV (11; 4% instances), ADJ (10; 3% instances), PRON (4; 1% instances), INTJ (2; 1% instances), CCONJ (1; 0% instances), DET (1; 0% instances)

223 (77%) X nodes are leaves.

23 (8%) X nodes have one child.

13 (4%) X nodes have two children.

32 (11%) X nodes have three or more children.

The highest child degree of an X node is 13.
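The leaf and child-degree counts above can be derived by counting, per sentence, how many tokens name each node as their HEAD. The sketch below is an assumption about how such figures might be computed, not the script behind this page. It expects `\n` line endings and blank lines between sentences; `child_degree_buckets` and the toy sentence are hypothetical.

```python
from collections import Counter


def child_degree_buckets(conllu_text, upos="X"):
    """Count nodes of the given UPOS by number of children: keys 0, 1, 2, and 3 (= three or more)."""
    buckets = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [l.split("\t") for l in block.splitlines() if l and not l.startswith("#")]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        n_children = Counter(r[6] for r in rows)  # HEAD id -> number of dependents
        for r in rows:
            if r[3] == upos:
                buckets[min(n_children[r[0]], 3)] += 1
    return buckets


# Toy sentence invented for illustration: "in" heads three tokens, the others are leaves.
ROWS = [
    ("1", "in", "in", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    ("2", "my", "my", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("3", "opinion", "opinion", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(r) for r in ROWS)

buckets = child_degree_buckets(SAMPLE)
print(buckets[0], buckets[3])  # 2 1
```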

Children of X nodes are attached using 15 different relations: flat (63; 32% instances), flat:foreign (56; 28% instances), punct (53; 27% instances), conj (7; 4% instances), advmod (3; 2% instances), cop (2; 1% instances), det (2; 1% instances), discourse (2; 1% instances), mark (2; 1% instances), nsubj:cop (2; 1% instances), amod (1; 1% instances), case (1; 1% instances), cc (1; 1% instances), dep (1; 1% instances), parataxis (1; 1% instances)

Children of X nodes belong to 14 different parts of speech: X (100; 51% instances), PUNCT (53; 27% instances), NOUN (17; 9% instances), PROPN (10; 5% instances), ADV (3; 2% instances), AUX (2; 1% instances), DET (2; 1% instances), SCONJ (2; 1% instances), SYM (2; 1% instances), VERB (2; 1% instances), ADJ (1; 1% instances), ADP (1; 1% instances), CCONJ (1; 1% instances), INTJ (1; 1% instances)