
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: X

There are 305 X lemmas (1%), 306 X types (0%) and 658 X tokens (0%). Out of 16 observed tags, the rank of X is: 7 in number of lemmas, 9 in number of types and 15 in number of tokens.

The 10 most frequent X lemmas: et, al, of, in, drive, the, i, de, key, to

The 10 most frequent X types: al., et, of, in, drive, the, I, de, key, to

The 10 most frequent ambiguous lemmas: et (SCONJ 4199, X 95), al (X 94, NOUN 1), the (NOUN 7, X 3), i (SYM 5, NOUN 3, X 1), de (PROPN 27, X 5), International (PROPN 8, X 4), pruritus (X 4, NOUN 1), World (PROPN 5, X 3), a (NOUN 231, X 2, ADV 1), and (NOUN 7, X 3, PROPN 2, CCONJ 1)

The 10 most frequent ambiguous types: et (SCONJ 4030, X 94), the (NOUN 7, X 3), I (ADJ 23, X 5, NUM 1), de (PROPN 27, X 5), to (X 5, DET 1), International (PROPN 7, X 4), World (PROPN 3, X 3), a (NOUN 104, X 2, ADV 1), and (NOUN 6, X 3, PROPN 2, CCONJ 1), for (NOUN 4, X 3)

Morphology

The form / lemma ratio of X is 1.003279 (the average of all parts of speech is 1.911857).
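The reported ratio for X is consistent with dividing the number of X types (306) by the number of X lemmas (305), both taken from the counts given earlier on this page. The sketch below reproduces that arithmetic; the function name `form_lemma_ratio` is a hypothetical helper for illustration, not part of any UD tool.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Counts reported on this page for the X tag.
x_ratio = form_lemma_ratio(306, 305)
print(f"{x_ratio:.6f}")  # 1.003279
```

A ratio this close to 1 means that almost every X lemma appears in a single form, which fits a tag used mostly for uninflected foreign words and abbreviations.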

The 1st highest number of forms (2) was observed with the lemma “al”: al, al..

The 2nd highest number of forms (2) was observed with the lemma “et”: et, et..

The 3rd highest number of forms (2) was observed with the lemma “is”: ‘is, is.

X occurs with 6 features: Abbr (202; 31% instances), Foreign (76; 12% instances), Case (16; 2% instances), Number (16; 2% instances), NumForm (1; 0% instances), NumType (1; 0% instances)

X occurs with 7 feature-value pairs: Abbr=Yes, Case=Gen, Case=Nom, Foreign=Yes, NumForm=Roman, NumType=Ord, Number=Sing

X occurs with 6 feature combinations. The most frequent feature combination is _ (363 tokens). Examples: et, drive, the, key, de, out, I, International, Marsa, World
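The three feature counts above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of a CoNLL-U file. The following is a minimal sketch, assuming FEATS strings in the standard `Name=Value|Name=Value` form with `_` for an empty set; the sample values and the helper names `split_feats` and `count_features` are hypothetical and are not taken from the treebank or from the script that generated this page.

```python
from collections import Counter


def split_feats(feats):
    """Split a CoNLL-U FEATS string into (name, value) pairs."""
    if feats == "_":
        return []
    return [tuple(pair.split("=", 1)) for pair in feats.split("|")]


def count_features(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the whole string is one combination
        for name, value in split_feats(feats):
            features[name] += 1
            pairs[f"{name}={value}"] += 1
    return features, pairs, combos


# Hypothetical FEATS values for five X tokens.
sample = ["_", "Abbr=Yes", "Foreign=Yes", "Case=Nom|Number=Sing", "_"]
features, pairs, combos = count_features(sample)
print(len(features), len(pairs), len(combos))
print(combos.most_common(1))
```

In this toy input the empty combination `_` is the most frequent one, mirroring the situation reported for X above.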

Relations

X nodes are attached to their parents using 15 different relations: flat:foreign (269; 41% instances), flat (206; 31% instances), appos (55; 8% instances), conj (32; 5% instances), parataxis (25; 4% instances), root (23; 3% instances), nmod (21; 3% instances), obl (10; 2% instances), goeswith (5; 1% instances), advcl (3; 0% instances), nsubj (3; 0% instances), advmod (2; 0% instances), nsubj:cop (2; 0% instances), obj (1; 0% instances), orphan (1; 0% instances)

Parents of X nodes belong to 10 different parts of speech: PROPN (267; 41% instances), X (211; 32% instances), NOUN (108; 16% instances), VERB (26; 4% instances), no parent, i.e. root (23; 3% instances), ADJ (6; 1% instances), INTJ (6; 1% instances), NUM (5; 1% instances), SYM (5; 1% instances), ADV (1; 0% instances)
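Relation and parent-tag counts of this kind can be gathered by reading the HEAD and DEPREL columns of each X token and looking up the UPOS of its head. The sketch below assumes the standard 10-column, tab-separated CoNLL-U layout; the toy sentence and its annotation are invented for illustration and are not from UD_Estonian-EDT, and `parse_sentence` and `attachment_stats` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical toy sentence; columns follow the CoNLL-U order
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = "\n".join([
    "# text = Smith et al. kirjutasid",
    "\t".join(["1", "Smith", "Smith", "PROPN", "_", "_", "4", "nsubj", "_", "_"]),
    "\t".join(["2", "et", "et", "X", "_", "Foreign=Yes", "1", "flat:foreign", "_", "_"]),
    "\t".join(["3", "al.", "al", "X", "_", "Abbr=Yes", "2", "flat", "_", "_"]),
    "\t".join(["4", "kirjutasid", "kirjutama", "VERB", "_", "_", "0", "root", "_", "_"]),
])


def parse_sentence(block):
    """Return {id: token} for the basic (integer-ID) tokens of one sentence."""
    tokens = {}
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tokens[int(cols[0])] = {
            "form": cols[1],
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        }
    return tokens


def attachment_stats(tokens, upos="X"):
    """Count incoming relations and parent tags for nodes with the given UPOS."""
    relations, parent_tags = Counter(), Counter()
    for tok in tokens.values():
        if tok["upos"] != upos:
            continue
        relations[tok["deprel"]] += 1
        parent = tokens.get(tok["head"])  # None when HEAD is 0 (root)
        parent_tags[parent["upos"] if parent else "(root)"] += 1
    return relations, parent_tags


relations, parent_tags = attachment_stats(parse_sentence(SAMPLE))
print(relations)
print(parent_tags)
```

Nodes whose HEAD is 0 have no parent token, so the sketch records them under a separate `(root)` label rather than under a part of speech.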

499 (76%) X nodes are leaves.

26 (4%) X nodes have one child.

35 (5%) X nodes have two children.

98 (15%) X nodes have three or more children.

The highest child degree of an X node is 13.
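The child-degree figures above can be read as a histogram over the number of dependents each X node has. The sketch below shows one way to build such a histogram from `(id, upos, head)` triples and then checks that the four reported counts add up to the 658 X tokens and yield the stated percentages. The toy triples and the name `child_degree_histogram` are hypothetical; only the `reported` counts come from this page.

```python
from collections import Counter


def child_degree_histogram(nodes, upos="X"):
    """nodes: (id, upos, head) triples of one sentence.

    Returns a Counter keyed by child count, where key 3 means
    "three or more children".
    """
    children_per_head = Counter(head for _, _, head in nodes)
    histogram = Counter()
    for node_id, tag, _ in nodes:
        if tag == upos:
            histogram[min(children_per_head[node_id], 3)] += 1
    return histogram


# Hypothetical toy sentence: node 2 (X) has one child, node 3 (X) is a leaf.
toy_nodes = [(1, "PROPN", 4), (2, "X", 1), (3, "X", 2), (4, "VERB", 0)]
toy_histogram = child_degree_histogram(toy_nodes)

# Counts reported on this page.
reported = {"leaves": 499, "one child": 26, "two children": 35, "three or more": 98}
total = sum(reported.values())
shares = {label: round(100 * count / total) for label, count in reported.items()}
print(total, shares)
```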

Children of X nodes are attached using 19 different relations: punct (256; 47% instances), flat:foreign (162; 30% instances), flat (55; 10% instances), conj (12; 2% instances), parataxis (11; 2% instances), nmod (10; 2% instances), cc (7; 1% instances), advmod (5; 1% instances), appos (5; 1% instances), cop (4; 1% instances), nsubj:cop (3; 1% instances), nummod (2; 0% instances), obl (2; 0% instances), acl (1; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), cc:preconj (1; 0% instances), csubj:cop (1; 0% instances), obj (1; 0% instances)

Children of X nodes belong to 12 different parts of speech: PUNCT (256; 47% instances), X (211; 39% instances), NOUN (27; 5% instances), NUM (15; 3% instances), ADV (8; 1% instances), CCONJ (7; 1% instances), PROPN (5; 1% instances), AUX (4; 1% instances), VERB (3; 1% instances), ADJ (2; 0% instances), PRON (1; 0% instances), SYM (1; 0% instances)