Treebank Statistics: UD_Estonian-EWT: POS Tags: X
There are 147 X lemmas (1%), 209 X types (1%) and 313 X tokens (0%).
Out of 16 observed tags, the rank of X is: 8 in number of lemmas, 9 in number of types and 16 in number of tokens.
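The three counts above can be reproduced from a CoNLL-U file with a few lines of Python. The sketch below is only an illustration: the function name `count_upos` and the three-token sample are invented here, and it assumes that types are compared case-sensitively and that multiword-token and empty-node lines are skipped, which may differ from how the published figures were produced.

```python
def count_upos(conllu_text, upos="X"):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID
        # (this skips multiword ranges such as 1-2 and empty nodes such as 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


def row(*cols):
    """Build one tab-separated CoNLL-U line (helper for the toy sample)."""
    return "\t".join(cols)


# Invented sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = smth smth. to .",
    row("1", "smth", "smth", "X", "_", "Foreign=Yes", "0", "root", "_", "_"),
    row("2", "smth.", "smth", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    row("3", "to", "to", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"),
    row("4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"),
    "",
])

print(count_upos(SAMPLE))
```

On the sample, the two forms `smth` and `smth.` share one lemma, so the function reports two lemmas, three types and three tokens.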
The 10 most frequent X lemmas: _, to, my, of, or, smth, the, from, no, opinion
The 10 most frequent X types: to, 000, s, a, my, of, or, the, u, from
The 10 most frequent ambiguous lemmas: _ (X 91, PUNCT 3), no (INTJ 62, X 3, ADV 1), a (NOUN 23, ADV 8, CCONJ 4, SCONJ 4, X 2, PROPN 1), i (NOUN 1, SYM 1, X 1), imo (ADV 6, X 2), la (X 2, ADV 1), need (X 2, PRON 1), nõu (NOUN 9, X 2), u (ADV 3, NOUN 2, X 2), COVID-19 (PROPN 2, NOUN 1, X 1)
The 10 most frequent ambiguous types: 000 (X 13, NUM 1), a (NOUN 17, ADV 8, CCONJ 4, X 4, PROPN 1), u (X 4, NOUN 2, ADV 1), no (INTJ 33, X 3, ADV 1), I (ADJ 3, X 2), imo (ADV 6, X 2), la (X 2, ADV 1), need (PRON 74, DET 68, X 2), olla (AUX 89, VERB 18, X 2, ADV 1), st (ADV 8, X 2)
- 000
- a
  - NOUN 17: a võibolla on sul õõigus gasoline ..
  - ADV 8: a tegelt päris omapärane ju
  - CCONJ 4: pilku on normaalne a premium parem
  - X 4: Sakul on olemas väiksed pudelid , a la ice .
  - PROPN 1: a le coq oli nagu mingi etapp vms , mida läbida , kõikide sõpradega nii olnud ja keegi ei tarbi regulaarselt tartu tooteid , kuigi paljud on nüüd ülikooli raames ka lõunasse kolinud .
- u
  - X 4: selle tädi tarkus siis kui viirus hakkas maailmas levima , siis rääkis hoopis teist jutt u
  - NOUN 2: Üks selline roomikute kpl peaks kestma 1000 km ja eks tank rüüpab samuti u 3 ltr/km .
  - ADV 1: Kokku sõitnud u 10 aastat talvel põhiliselt naastude ja 7 aastat lamellidega , “ libeduse “ ( loe : oma rumaluse ) põhjustatud õnnetusi siiani pole .
- no
- I
- imo
- la
- need
- olla
  - AUX 89: 14-korruse katusel , kell kaks öösel on meeletult hea olla .
  - VERB 18: Võiks olla midagi India või Nepali stiilis .
  - X 2: Võib olla oleks ikkagi kõige targem osta sakslastelt Leopard 1 või 2 de .
  - ADV 1: Võib olla kukutatakse ka Arani vähem , tulevad meelde tema tulemused 2-l järjestikusel bashol .
- st
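An ambiguous type in the lists above is a word form that occurs with more than one POS tag. A minimal sketch of how such per-form tag profiles could be collected is shown below. The function name `tag_profiles` and the input pairs are made up for illustration, and the sketch does not attempt to reproduce the ordering used in the lists above.

```python
from collections import Counter, defaultdict


def tag_profiles(pairs):
    """Map each form seen with more than one UPOS tag to its tag counts."""
    profiles = defaultdict(Counter)
    for form, upos in pairs:
        profiles[form][upos] += 1
    # A form is ambiguous only if at least two different tags were observed.
    return {form: tags for form, tags in profiles.items() if len(tags) > 1}


# Invented (form, UPOS) pairs, not taken from the treebank.
pairs = [("u", "X"), ("u", "NOUN"), ("u", "X"), ("st", "ADV"), ("to", "X")]

ambiguous = tag_profiles(pairs)
for form, tags in ambiguous.items():
    summary = ", ".join(f"{tag} {n}" for tag, n in tags.most_common())
    print(f"{form} ({summary})")
```

The same function works for lemmas if the pairs are built from the LEMMA column instead of the FORM column.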
Morphology
The form / lemma ratio of X is 1.421769 (the average of all parts of speech is 1.732342).
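The reported ratio is consistent with dividing the number of X types by the number of X lemmas given at the top of this page. The check below assumes that this is how the figure is defined; the variable names are illustrative.

```python
x_types = 209   # distinct X word forms, from the summary at the top of the page
x_lemmas = 147  # distinct X lemmas, from the same summary

form_lemma_ratio = x_types / x_lemmas
print(f"{form_lemma_ratio:.6f}")  # prints 1.421769
```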
The 1st highest number of forms (64) was observed with the lemma “_”: +++, -, -1, -dega, -e, -ga, 000, 02, 3, 300, 472, AT, a, aastasele, aegaset, arvuti, de, desid, eestist, füüsikat, ga, gravitatsioonist, homme, hot.ee, itaaliast, karantiin, keemikud, kingades, konkurent, kord, korealane, koreas, kraadise, kõik, meeskonnal, mõistusele, n, ne, olla, osa, panek, parameeter, pealt, refereri, relva, s, sama, sele, seni, sest, sinane, st, sõbralik, tasandil, teadus, tehas, tehnoloogia, tulesid, täht, u, valdkonnas, versiooni, vähem, üks.
The 2nd highest number of forms (2) was observed with the lemma “smth”: smth, smth..
The 3rd highest number of forms (1) was observed with the lemma “**”: **.
X occurs with 3 features: Foreign (177; 57% instances), Abbr (7; 2% instances), Typo (1; 0% instances)
X occurs with 3 feature-value pairs: Abbr=Yes, Foreign=Yes, Typo=Yes
X occurs with 5 feature combinations.
The most frequent feature combination is Foreign=Yes (176 tokens).
Examples: to, my, or, the, from, opinion, smth, I, Suchen, You
Relations
X nodes are attached to their parents using 16 different relations: goeswith (91; 29% instances), flat:foreign (71; 23% instances), flat (47; 15% instances), dep (26; 8% instances), parataxis (19; 6% instances), root (17; 5% instances), nmod (8; 3% instances), appos (7; 2% instances), discourse (7; 2% instances), obj (6; 2% instances), conj (5; 2% instances), nsubj (4; 1% instances), obl (2; 1% instances), ccomp (1; 0% instances), list (1; 0% instances), nsubj:cop (1; 0% instances)
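The percentages in the relation list can be recomputed from the raw counts. The sketch below assumes each share is the count divided by the total number of X tokens, rounded to a whole percent; the dictionary simply restates the counts given above.

```python
# Counts of incoming relations for X nodes, copied from the list above.
relation_counts = {
    "goeswith": 91, "flat:foreign": 71, "flat": 47, "dep": 26,
    "parataxis": 19, "root": 17, "nmod": 8, "appos": 7, "discourse": 7,
    "obj": 6, "conj": 5, "nsubj": 4, "obl": 2, "ccomp": 1, "list": 1,
    "nsubj:cop": 1,
}

# Every X token has exactly one incoming relation, so the counts
# should add up to the number of X tokens (313).
total = sum(relation_counts.values())

# Share of each relation as a whole-number percentage.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, n in relation_counts.items():
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```

Because the counts sum to 313, the same total as the number of X tokens, relations with a single instance round down to 0%.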
Parents of X nodes belong to 11 different parts of speech: X (116; 37% instances), NOUN (70; 22% instances), VERB (37; 12% instances), NUM (24; 8% instances), PROPN (22; 7% instances), (17; 5% instances), ADV (11; 4% instances), ADJ (10; 3% instances), PRON (4; 1% instances), CCONJ (1; 0% instances), DET (1; 0% instances)
235 (75%) X nodes are leaves.
25 (8%) X nodes have one child.
17 (5%) X nodes have two children.
36 (12%) X nodes have three or more children.
The highest child degree of an X node is 13.
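The child degree of a node is the number of tokens whose HEAD points to it. A minimal sketch of how the leaf and child counts above could be derived for one sentence follows. The helper names `child_degrees` and `bucket` and the five-token tree are invented for illustration, and the input is assumed to be already reduced to (ID, HEAD, UPOS) triples.

```python
from collections import Counter


def child_degrees(sentence, upos="X"):
    """Return the number of children of every node tagged `upos`.

    `sentence` is a list of (token_id, head_id, upos) triples for one tree.
    """
    n_children = Counter(head for _, head, _ in sentence)
    return [n_children[tid] for tid, _, tag in sentence if tag == upos]


def bucket(degrees):
    """Group child degrees into the four classes used in the statistics."""
    names = {0: "leaf", 1: "one", 2: "two"}
    return dict(Counter(names.get(d, "three_or_more") for d in degrees))


# Invented tree: token 1 is the root and has three children,
# token 3 has one child, tokens 2 and 5 are leaves.
sentence = [
    (1, 0, "X"),
    (2, 1, "X"),
    (3, 1, "X"),
    (4, 1, "PUNCT"),
    (5, 3, "X"),
]

degrees = child_degrees(sentence)
print(degrees, bucket(degrees), max(degrees))
```

As a consistency check on the figures above, the four classes add up to the total number of X tokens: 235 + 25 + 17 + 36 = 313.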
Children of X nodes are attached using 19 different relations: flat:foreign (71; 32% instances), punct (62; 28% instances), flat (60; 27% instances), conj (8; 4% instances), advmod (3; 1% instances), case (2; 1% instances), cc (2; 1% instances), cop (2; 1% instances), det (2; 1% instances), mark (2; 1% instances), nsubj:cop (2; 1% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), dep (1; 0% instances), discourse (1; 0% instances), nmod (1; 0% instances), parataxis (1; 0% instances), vocative (1; 0% instances)
Children of X nodes belong to 14 different parts of speech: X (116; 52% instances), PUNCT (62; 28% instances), NOUN (16; 7% instances), PROPN (10; 4% instances), ADV (3; 1% instances), VERB (3; 1% instances), ADJ (2; 1% instances), ADP (2; 1% instances), AUX (2; 1% instances), CCONJ (2; 1% instances), DET (2; 1% instances), SCONJ (2; 1% instances), INTJ (1; 0% instances), SYM (1; 0% instances)