
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EWT: POS Tags: X

There are 147 X lemmas (1%), 209 X types (1%) and 313 X tokens (0%). Out of 16 observed tags, the rank of X is: 8 in number of lemmas, 9 in number of types and 16 in number of tokens.
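The three counts above can be sketched in Python as follows. This is a minimal illustration, not the script that generated this page. It assumes the standard 10-column CoNLL-U layout, treats a "type" as a distinct word form exactly as written (case-sensitive), and skips multiword-token ranges and empty nodes. The function name `count_tag` and the sample fragment are hypothetical; the sample is made up and not taken from the treebank.

```python
def count_tag(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Made-up fragment for illustration only (assumption, not treebank data).
SAMPLE = (
    "1\tto\tto\tX\t_\tForeign=Yes\t0\troot\t_\t_\n"
    "2\tmy\tmy\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_\n"
    "3\tmy\tmy\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)

print(count_tag(SAMPLE, "X"))
```

On the made-up fragment, the repeated form "my" counts as two tokens but only one type and one lemma.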

The 10 most frequent X lemmas: _, to, my, of, or, smth, the, from, no, opinion

The 10 most frequent X types: to, 000, s, a, my, of, or, the, u, from

The 10 most frequent ambiguous lemmas: _ (X 91, PUNCT 3), no (INTJ 62, X 3, ADV 1), a (NOUN 23, ADV 8, CCONJ 4, SCONJ 4, X 2, PROPN 1), i (NOUN 1, SYM 1, X 1), imo (ADV 6, X 2), la (X 2, ADV 1), need (X 2, PRON 1), nõu (NOUN 9, X 2), u (ADV 3, NOUN 2, X 2), COVID-19 (PROPN 2, NOUN 1, X 1)

The 10 most frequent ambiguous types: 000 (X 13, NUM 1), a (NOUN 17, ADV 8, CCONJ 4, X 4, PROPN 1), u (X 4, NOUN 2, ADV 1), no (INTJ 33, X 3, ADV 1), I (ADJ 3, X 2), imo (ADV 6, X 2), la (X 2, ADV 1), need (PRON 74, DET 68, X 2), olla (AUX 89, VERB 18, X 2, ADV 1), st (ADV 8, X 2)

Morphology

The form / lemma ratio of X is 1.421769 (the average of all parts of speech is 1.732342).
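As a quick check, the ratio for X can be reproduced from the counts reported earlier on this page, assuming it is simply the number of X types divided by the number of X lemmas (the page does not state the formula; this assumption happens to match the reported figure).

```python
x_types = 209   # X types reported on this page
x_lemmas = 147  # X lemmas reported on this page

# Assumed definition: forms per lemma = types / lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")  # prints 1.421769
```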

The 1st highest number of forms (64) was observed with the lemma “_”: +++, -, -1, -dega, -e, -ga, 000, 02, 3, 300, 472, AT, a, aastasele, aegaset, arvuti, de, desid, eestist, füüsikat, ga, gravitatsioonist, homme, hot.ee, itaaliast, karantiin, keemikud, kingades, konkurent, kord, korealane, koreas, kraadise, kõik, meeskonnal, mõistusele, n, ne, olla, osa, panek, parameeter, pealt, refereri, relva, s, sama, sele, seni, sest, sinane, st, sõbralik, tasandil, teadus, tehas, tehnoloogia, tulesid, täht, u, valdkonnas, versiooni, vähem, üks.

The 2nd highest number of forms (2) was observed with the lemma “smth”: smth, smth..

The 3rd highest number of forms (1) was observed with the lemma “**”: **.

X occurs with 3 features: Foreign (177; 57% instances), Abbr (7; 2% instances), Typo (1; 0% instances)

X occurs with 3 feature-value pairs: Abbr=Yes, Foreign=Yes, Typo=Yes

X occurs with 5 feature combinations. The most frequent feature combination is Foreign=Yes (176 tokens). Examples: to, my, or, the, from, opinion, smth, I, Suchen, You

Relations

X nodes are attached to their parents using 16 different relations: goeswith (91; 29% instances), flat:foreign (71; 23% instances), flat (47; 15% instances), dep (26; 8% instances), parataxis (19; 6% instances), root (17; 5% instances), nmod (8; 3% instances), appos (7; 2% instances), discourse (7; 2% instances), obj (6; 2% instances), conj (5; 2% instances), nsubj (4; 1% instances), obl (2; 1% instances), ccomp (1; 0% instances), list (1; 0% instances), nsubj:cop (1; 0% instances)
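The percentages in the relation list above appear to be each count divided by the total number of X tokens, rounded to a whole percent. A minimal sketch under that assumption, using the counts from this page (the helper name `percentages` is hypothetical):

```python
# Relation counts for X nodes, copied from the list above.
relation_counts = {
    "goeswith": 91, "flat:foreign": 71, "flat": 47, "dep": 26,
    "parataxis": 19, "root": 17, "nmod": 8, "appos": 7,
    "discourse": 7, "obj": 6, "conj": 5, "nsubj": 4,
    "obl": 2, "ccomp": 1, "list": 1, "nsubj:cop": 1,
}


def percentages(counts):
    """Map each key to its share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {key: round(100 * n / total) for key, n in counts.items()}


total = sum(relation_counts.values())
shares = percentages(relation_counts)
print(total)               # the counts add up to the 313 X tokens
print(shares["goeswith"])  # share of goeswith attachments
```

The counts sum to 313, the number of X tokens, so every X node is covered by exactly one incoming relation.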

Parents of X nodes belong to 11 different categories (10 parts of speech plus the artificial root, which has no tag): X (116; 37% instances), NOUN (70; 22% instances), VERB (37; 12% instances), NUM (24; 8% instances), PROPN (22; 7% instances), [root] (17; 5% instances), ADV (11; 4% instances), ADJ (10; 3% instances), PRON (4; 1% instances), CCONJ (1; 0% instances), DET (1; 0% instances)

235 (75%) X nodes are leaves.

25 (8%) X nodes have one child.

17 (5%) X nodes have two children.

36 (12%) X nodes have three or more children.

The highest child degree of an X node is 13.
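The child-degree breakdown above can be derived from the HEAD column of a dependency tree. The sketch below is an illustration under stated assumptions, not the generator's code: `degree_buckets` is a hypothetical helper, heads are 1-based with 0 marking the root, and the five-token example tree is made up.

```python
from collections import Counter


def degree_buckets(heads, is_target):
    """Bucket target tokens by their number of children.

    heads[i] is the 1-based head of token i+1 (0 means root);
    is_target[i] says whether token i+1 should be reported.
    """
    children = Counter(h for h in heads if h != 0)
    buckets = Counter()
    for token_id, flag in enumerate(is_target, start=1):
        if not flag:
            continue
        n = children[token_id]
        if n == 0:
            buckets["leaf"] += 1
        elif n == 1:
            buckets["one"] += 1
        elif n == 2:
            buckets["two"] += 1
        else:
            buckets["three+"] += 1
    return buckets


# Made-up tree: token 1 is the root with three children, token 2 has one child.
example = degree_buckets([0, 1, 1, 1, 2], [True] * 5)
print(dict(example))

# The four buckets reported on this page cover all 313 X tokens.
reported = {"leaf": 235, "one": 25, "two": 17, "three+": 36}
print(sum(reported.values()))
```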

Children of X nodes are attached using 19 different relations: flat:foreign (71; 32% instances), punct (62; 28% instances), flat (60; 27% instances), conj (8; 4% instances), advmod (3; 1% instances), case (2; 1% instances), cc (2; 1% instances), cop (2; 1% instances), det (2; 1% instances), mark (2; 1% instances), nsubj:cop (2; 1% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), dep (1; 0% instances), discourse (1; 0% instances), nmod (1; 0% instances), parataxis (1; 0% instances), vocative (1; 0% instances)

Children of X nodes belong to 14 different parts of speech: X (116; 52% instances), PUNCT (62; 28% instances), NOUN (16; 7% instances), PROPN (10; 4% instances), ADV (3; 1% instances), VERB (3; 1% instances), ADJ (2; 1% instances), ADP (2; 1% instances), AUX (2; 1% instances), CCONJ (2; 1% instances), DET (2; 1% instances), SCONJ (2; 1% instances), INTJ (1; 0% instances), SYM (1; 0% instances)