
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: X

There are 226 X lemmas (1%), 302 X types (0%) and 765 X tokens (0%). Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 9 in number of types and 14 in number of tokens.

The 10 most frequent X lemmas: _, al, et, of, in, ceteris, de, paribus, F, a

The 10 most frequent X types: 000, al., et, of, in, 900, 500, 600, 700, ceteris

The 10 most frequent ambiguous lemmas: et (SCONJ 4256, X 95), of (ADP 24, X 12, ADV 1), in (X 7, ADP 6), de (PROPN 26, X 5, ADP 1), a (NOUN 231, X 2, ADV 1), b (NOUN 16, X 1), no (INTJ 43, X 3), out (X 3, ADP 1), C (NOUN 5, X 2, PROPN 1), XX (ADJ 4, X 2)

The 10 most frequent ambiguous types: et (SCONJ 4086, X 94), of (ADP 22, X 12, ADV 1), in (X 7, ADP 6), 900 (NUM 8, X 7), 500 (NUM 22, X 5, ADJ 1), 600 (NUM 15, X 5), 700 (NUM 8, X 5), de (PROPN 26, X 5, ADP 1), a (NOUN 104, X 2, ADV 1), 400 (NUM 19, X 3)

Morphology

The form / lemma ratio of X is 1.336283 (the average of all parts of speech is 1.912964).
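The ratio above is consistent with dividing the number of distinct X word forms (302 types) by the number of distinct X lemmas (226). The following is a minimal Python sketch of that arithmetic. The function name `form_lemma_ratio` and the `(form, lemma, upos)` triple layout are assumptions made for illustration, not the code that generated this page.

```python
def form_lemma_ratio(tokens, upos="X"):
    """Return (distinct forms) / (distinct lemmas) for one UPOS tag.

    `tokens` is an iterable of (form, lemma, upos) triples; this layout is
    a hypothetical convenience, not a format defined by the treebank.
    """
    forms, lemmas = set(), set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    return len(forms) / len(lemmas) if lemmas else 0.0


# Toy data: two forms of the lemma "al", one form of "et" tagged X,
# plus one SCONJ token that must be ignored.
toy = [
    ("al", "al", "X"),
    ("al.", "al", "X"),
    ("et", "et", "X"),
    ("et", "et", "SCONJ"),
]
toy_ratio = form_lemma_ratio(toy)  # 3 forms / 2 lemmas

# The counts reported on this page: 302 X types and 226 X lemmas.
page_ratio = 302 / 226
```

Rounded to six decimal places, `page_ratio` matches the value 1.336283 reported above.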

The 1st highest number of forms (76) was observed with the lemma “_”: ‘is, -1,5, -2, -3-rasvhapete, -45,8, -5, -6,5, -9,9, -aastased, -aastaselt, -kilobaidine, /1995, 000, 000-100, 000-l, 000-naelasele, 000-objektise, 000kroonine, 000ni, 000st, 02, 04, 083*1012, 090, 100, 150, 17, 17.00, 2, 20, 203, 257, 357, 371, 400, 402, 44, 479, 496, 500, 50aastased, 522, 547, 60, 600, 690, 692, 700, 756, 780, 782, 800, 83, 890, 892, 90, 900, 914, 930, 950, 950-kroonise, 951, 981, 996, Angeles-klassi, Angeles-klassile, aastased, arhitektuurides, e, keelsessegi, kroonine, mehelises, o, rian, trulli, °C.

The 2nd highest number of forms (2) was observed with the lemma “al”: al, al..

The 3rd highest number of forms (2) was observed with the lemma “et”: et, et..

X occurs with 2 features: Foreign (424; 55% instances), Abbr (30; 4% instances)

X occurs with 2 feature-value pairs: Abbr=Yes, Foreign=Yes

X occurs with 3 feature combinations. The most frequent feature combination is Foreign=Yes (424 tokens). Examples: al., et, ceteris, de, paribus, in, tõ, Helicobacter, Marsa, khorji

Relations

X nodes are attached to their parents using 18 different relations: goeswith (271; 35% instances), flat:foreign (211; 28% instances), flat (134; 18% instances), parataxis (39; 5% instances), root (23; 3% instances), nmod (21; 3% instances), appos (18; 2% instances), conj (16; 2% instances), ccomp (8; 1% instances), dep (4; 1% instances), nsubj (4; 1% instances), nsubj:cop (4; 1% instances), obj (4; 1% instances), obl (3; 0% instances), amod (2; 0% instances), discourse (1; 0% instances), fixed (1; 0% instances), orphan (1; 0% instances)
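Counts of this kind can be tallied directly from CoNLL-U data. The sketch below assumes the standard ten tab-separated CoNLL-U columns (UPOS in column 4, DEPREL in column 8). The helper names `deprel_counts` and `as_percentages` and the three-token sample are invented for illustration, and the percentage rounding is an assumption about how the figures above were produced.

```python
from collections import Counter


def deprel_counts(conllu_text, upos="X"):
    """Count incoming relations (DEPREL) of all tokens with the given UPOS."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts


def as_percentages(counts):
    """Map each relation to its share of the total, rounded to whole percent."""
    total = sum(counts.values())
    return {rel: round(100 * n / total) for rel, n in counts.most_common()}


# Made-up sample rows (not taken from the treebank): three X tokens.
rows = [
    ["1", "Smith", "Smith", "PROPN", "_", "_", "0", "root", "_", "_"],
    ["2", "et", "et", "X", "_", "_", "1", "flat:foreign", "_", "_"],
    ["3", "al.", "al", "X", "_", "_", "1", "flat:foreign", "_", "_"],
    ["4", "a", "a", "X", "_", "_", "1", "flat", "_", "_"],
]
sample = "\n".join("\t".join(r) for r in rows)

counts = deprel_counts(sample)
shares = as_percentages(counts)
```

Applied to the figures above, the same rounding gives 271 / 765 ≈ 35% for goeswith and 211 / 765 ≈ 28% for flat:foreign.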

Parents of X nodes belong to 10 different parts of speech: NUM (254; 33% instances), X (239; 31% instances), PROPN (116; 15% instances), NOUN (76; 10% instances), VERB (32; 4% instances), no part of speech, i.e. the artificial root (23; 3% instances), ADJ (22; 3% instances), ADV (1; 0% instances), PRON (1; 0% instances), SYM (1; 0% instances)

535 (70%) X nodes are leaves.

129 (17%) X nodes have one child.

28 (4%) X nodes have two children.

73 (10%) X nodes have three or more children.

The highest child degree of an X node is 9.
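The leaf and child-count figures above partition all 765 X tokens (535 + 129 + 28 + 73). A minimal sketch of how such a breakdown could be computed is shown here. The function name `child_degree_buckets` and the `(id, head, upos)` triple layout are hypothetical, and the toy sentence is invented.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="X"):
    """Bucket nodes of one UPOS by number of children: 0, 1, 2, or 3 (= three or more).

    `sentences` is a list of sentences, each a list of (id, head, upos)
    triples with head 0 for the root (a hypothetical layout).
    """
    buckets = Counter()
    for sent in sentences:
        # Number of children per node id within this sentence.
        n_children = Counter(head for _, head, _ in sent)
        for node_id, _, tag in sent:
            if tag == upos:
                buckets[min(n_children[node_id], 3)] += 1
    return buckets


# Invented toy sentence: node 1 has three children, node 4 has one,
# nodes 2 and 5 are leaves; node 3 is PUNCT and is not counted.
toy = [[(1, 0, "X"), (2, 1, "X"), (3, 1, "PUNCT"), (4, 1, "X"), (5, 4, "X")]]
buckets = child_degree_buckets(toy)

# Consistency check on the figures reported on this page.
page_total = 535 + 129 + 28 + 73
```

The four rounded percentages (70, 17, 4, 10) add up to 101 rather than 100 only because each is rounded separately.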

Children of X nodes are attached using 24 different relations: flat:foreign (211; 43% instances), punct (179; 37% instances), flat (22; 5% instances), conj (20; 4% instances), advmod (7; 1% instances), parataxis (7; 1% instances), cc (5; 1% instances), cop (5; 1% instances), nummod (5; 1% instances), appos (4; 1% instances), nmod (3; 1% instances), nsubj:cop (3; 1% instances), obl (3; 1% instances), advcl (2; 0% instances), acl (1; 0% instances), acl:relcl (1; 0% instances), amod (1; 0% instances), case (1; 0% instances), cc:preconj (1; 0% instances), csubj:cop (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances), obj (1; 0% instances), orphan (1; 0% instances)

Children of X nodes belong to 12 different parts of speech: X (239; 49% instances), PUNCT (179; 37% instances), NOUN (25; 5% instances), NUM (11; 2% instances), ADV (9; 2% instances), AUX (5; 1% instances), CCONJ (5; 1% instances), VERB (5; 1% instances), ADJ (3; 1% instances), PROPN (2; 0% instances), SYM (2; 0% instances), ADP (1; 0% instances)