This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: SYM

There are 88 SYM lemmas (0%), 104 SYM types (0%) and 682 SYM tokens (0%). Out of 16 observed tags, the rank of SYM is: 10 in number of lemmas, 12 in number of types and 14 in number of tokens.
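
Counts of this kind could be reproduced from the treebank's CoNLL-U files roughly as follows. This is a minimal sketch, not the script that generated this page: the function name `count_upos` and the two-token sample are invented for illustration, and it assumes the standard ten-column CoNLL-U layout with "types" taken to mean distinct word forms.

```python
def count_upos(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Invented toy sentence (not taken from UD_Estonian-EDT).
SAMPLE = "\n".join([
    "# text = 5 % ja 10 %-ni",
    "1\t5\t5\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\t%\t%\tSYM\t_\t_\t0\troot\t_\t_",
    "3\tja\tja\tCCONJ\t_\t_\t5\tcc\t_\t_",
    "4\t10\t10\tNUM\t_\t_\t5\tnummod\t_\t_",
    "5\t%-ni\t%\tSYM\t_\t_\t2\tconj\t_\t_",
    "",
])

print(count_upos(SAMPLE, "SYM"))  # (1, 2, 2)
```

In the toy sample the two SYM tokens share the lemma "%" but have different forms, which is why the number of types exceeds the number of lemmas.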

The 10 most frequent SYM lemmas: %, &, =, *, -, ?, A, i, §, +

The 10 most frequent SYM types: %, %-l, =, %-ni, &, *, -, %-lt, &, ?

The 10 most frequent ambiguous lemmas: % (SYM 511, ADJ 2), & (SYM 22, CCONJ 8), = (SYM 17, PUNCT 1), * (SYM 12, PUNCT 1), - (PUNCT 1456, SYM 9), ? (PUNCT 1289, SYM 6, X 1), A (SYM 5, NOUN 3, PROPN 3, X 1), i (SYM 5, NOUN 3, X 1), s (SYM 3, X 1), B (NOUN 3, SYM 2, PROPN 1, X 1)

The 10 most frequent ambiguous types: = (SYM 17, PUNCT 1), * (SYM 12, PUNCT 1), - (PUNCT 1420, SYM 9), & (CCONJ 8, SYM 6), ? (PUNCT 1289, SYM 6, X 1), ‘i (SYM 3, X 1), ! (PUNCT 960, SYM 1), 1/2 (NUM 5, SYM 1), > (PUNCT 22, SYM 1)

Morphology

The form / lemma ratio of SYM is 1.181818 (the average of all parts of speech is 1.911857).
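
The reported ratio is consistent with dividing the number of SYM types by the number of SYM lemmas given at the top of this page. The check below assumes that this is how the figure was computed; the variable names are illustrative only.

```python
n_lemmas = 88   # SYM lemmas reported on this page
n_types = 104   # SYM types (distinct word forms) reported on this page

ratio = n_types / n_lemmas
print(f"{ratio:.6f}")  # 1.181818
```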

The highest number of forms (9) was observed with the lemma “%”: %, %-ga, %-l, %-le, %-lise, %-lt, %-ni, %l, %ni.

The second-highest number of forms (4) was observed with the lemma “A”: -A, A-, A-ga, A-ks.

The third-highest number of forms (3) was observed with the lemma “i”: ‘i, i-ga, i-le.

SYM occurs with 6 features: Abbr (124; 18% instances), NumForm (90; 13% instances), NumType (90; 13% instances), Case (72; 11% instances), Number (72; 11% instances), Hyph (3; 0% instances)

SYM occurs with 14 feature-value pairs: Abbr=Yes, Case=Abl, Case=All, Case=Com, Case=Ela, Case=Gen, Case=Nom, Case=Par, Case=Ter, Case=Tra, Hyph=Yes, NumForm=Digit, NumType=Card, Number=Sing

SYM occurs with 14 feature combinations. The most frequent feature combination is _ (468 tokens). Examples: %, %-l, &, %-ni, -, &, ?, *, %-lise, %-lt

Relations

SYM nodes are attached to their parents using 19 different relations: advmod (168; 25% instances), parataxis (105; 15% instances), conj (75; 11% instances), nmod (55; 8% instances), nsubj (43; 6% instances), obl (43; 6% instances), flat (38; 6% instances), root (35; 5% instances), orphan (30; 4% instances), goeswith (26; 4% instances), obj (21; 3% instances), appos (17; 2% instances), nsubj:cop (11; 2% instances), acl:relcl (5; 1% instances), advcl (4; 1% instances), amod (2; 0% instances), ccomp (2; 0% instances), compound (1; 0% instances), list (1; 0% instances)
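
The percentages in the list above appear to be each count's share of all 682 SYM tokens, rounded to the nearest whole percent. The sketch below recomputes them from the counts given in the list; the helper name `shares` is hypothetical, and the rounding rule is an assumption that happens to match the figures shown.

```python
def shares(counts):
    """Return (label, count, rounded percent) rows, most frequent first."""
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda kv: -kv[1])
    return [(label, n, round(100 * n / total)) for label, n in rows]


# Counts copied from the list of relations above.
RELATIONS = {
    "advmod": 168, "parataxis": 105, "conj": 75, "nmod": 55, "nsubj": 43,
    "obl": 43, "flat": 38, "root": 35, "orphan": 30, "goeswith": 26,
    "obj": 21, "appos": 17, "nsubj:cop": 11, "acl:relcl": 5, "advcl": 4,
    "amod": 2, "ccomp": 2, "compound": 1, "list": 1,
}

for label, n, pct in shares(RELATIONS)[:3]:
    print(f"{label} ({n}; {pct}% instances)")
```

The counts sum to 682, the total number of SYM tokens, so every SYM node is covered by exactly one incoming relation.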

Parents of SYM nodes belong to 10 different parts of speech: NOUN (297; 44% instances), VERB (211; 31% instances), NUM (39; 6% instances), PROPN (38; 6% instances), no parent, i.e. the SYM node is the root (35; 5% instances), ADJ (24; 4% instances), SYM (21; 3% instances), ADV (15; 2% instances), PRON (1; 0% instances), X (1; 0% instances)

132 (19%) SYM nodes are leaves.

268 (39%) SYM nodes have one child.

154 (23%) SYM nodes have two children.

128 (19%) SYM nodes have three or more children.

The highest child degree of a SYM node is 9.
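
The child-degree figures above (leaves, one child, two children, and so on) could be derived by counting, for each SYM node, how many tokens in the same sentence name it as their head. The following is a minimal sketch under that assumption; `child_degrees` and the sample sentences are invented for illustration and are not taken from the treebank or its tooling.

```python
from collections import Counter


def child_degrees(conllu_text, upos):
    """Map each child count to the number of `upos` nodes with that many children."""
    histogram = Counter()
    sentence = []  # (id, upos, head) triples of the current sentence

    def flush():
        # Count how often each token ID occurs as a HEAD in this sentence.
        children = Counter(head for _, _, head in sentence)
        for node_id, tag, _ in sentence:
            if tag == upos:
                histogram[children[node_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the sentence
            continue
        cols = line.split("\t")
        # Skip comments, multiword-token ranges and empty nodes.
        if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence.append((cols[0], cols[3], cols[6]))
    flush()  # handle a final sentence without a trailing blank line
    return histogram


# Two invented toy sentences: "%" has one child, "=" is a leaf.
SAMPLE = "\n".join([
    "1\t5\t5\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\t%\t%\tSYM\t_\t_\t0\troot\t_\t_",
    "",
    "1\tx\tx\tX\t_\t_\t0\troot\t_\t_",
    "2\t=\t=\tSYM\t_\t_\t1\tflat\t_\t_",
    "",
])

print(dict(child_degrees(SAMPLE, "SYM")))  # {1: 1, 0: 1}
```

On this page the four reported groups (132 + 268 + 154 + 128) add up to 682, the total number of SYM tokens.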

Children of SYM nodes are attached using 22 different relations: nummod (438; 41% instances), punct (271; 25% instances), nmod (137; 13% instances), compound (39; 4% instances), conj (38; 4% instances), nsubj:cop (25; 2% instances), cop (22; 2% instances), cc (18; 2% instances), obl (17; 2% instances), advmod (16; 1% instances), case (14; 1% instances), orphan (13; 1% instances), parataxis (13; 1% instances), mark (4; 0% instances), acl:relcl (2; 0% instances), aux (2; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), appos (1; 0% instances), cc:preconj (1; 0% instances), det (1; 0% instances), list (1; 0% instances)

Children of SYM nodes belong to 15 different parts of speech: NUM (480; 45% instances), PUNCT (271; 25% instances), NOUN (185; 17% instances), AUX (24; 2% instances), SYM (21; 2% instances), ADV (18; 2% instances), CCONJ (18; 2% instances), ADP (14; 1% instances), PROPN (12; 1% instances), ADJ (9; 1% instances), PRON (7; 1% instances), VERB (6; 1% instances), X (5; 0% instances), SCONJ (4; 0% instances), DET (1; 0% instances)