
This page pertains to UD version 2.

Treebank Statistics: UD_Estonian-EDT: POS Tags: SYM

There are 117 SYM lemmas (0%), 133 SYM types (0%) and 740 SYM tokens (0%). Out of 17 observed tags, the rank of SYM is: 9 in number of lemmas, 12 in number of types and 15 in number of tokens.

The 10 most frequent SYM lemmas: %, &, =, ω-3, *, ?, ω-6, 1-, A, §

The 10 most frequent SYM types: %, %-l, =, %-ni, &, ω-3-, *, &, %-lt, ?

The 10 most frequent ambiguous lemmas: % (SYM 518, ADJ 2, NOUN 1), & (SYM 22, CCONJ 8), = (SYM 17, PUNCT 1), * (SYM 12, PUNCT 1), ? (PUNCT 1298, SYM 6, X 1), 1- (SYM 5, NUM 2), A (SYM 5, NOUN 2, PROPN 2, X 2, DET 1), i (NOUN 3, SYM 3, X 1), 2- (SYM 2, NUM 1), 3- (SYM 2, NUM 1)

The 10 most frequent ambiguous types: %-l (SYM 24, NOUN 1), = (SYM 17, PUNCT 1), & (SYM 15, CCONJ 8), * (SYM 12, PUNCT 1), ? (PUNCT 1298, SYM 6, X 1), 1- (SYM 5, NUM 2), 2- (SYM 2, NUM 1), 3- (SYM 2, NUM 1), ! (PUNCT 960, SYM 1), ‘i (SYM 1, X 1)

Morphology

The form / lemma ratio of SYM is 1.136752 (the average of all parts of speech is 1.912964).
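The ratio above is consistent with dividing the 133 SYM types by the 117 SYM lemmas. The sketch below shows that calculation on a toy token list. The token data, the function name `form_lemma_ratio`, and the choice to compare forms case-sensitively are assumptions made for illustration, not the script that generated this page.

```python
# Toy (form, lemma, upos) triples; invented for illustration, not treebank data.
TOKENS = [
    ("%", "%", "SYM"),
    ("%-l", "%", "SYM"),
    ("%-ni", "%", "SYM"),
    ("&", "&", "SYM"),
    ("=", "=", "SYM"),
    ("ja", "ja", "CCONJ"),
]


def form_lemma_ratio(tokens, upos):
    """Return (distinct forms) / (distinct lemmas) for one UPOS tag."""
    forms = {form for form, _, tag in tokens if tag == upos}
    lemmas = {lemma for _, lemma, tag in tokens if tag == upos}
    if not lemmas:
        raise ValueError(f"no tokens tagged {upos}")
    return len(forms) / len(lemmas)


# Toy data: 5 distinct SYM forms over 3 distinct SYM lemmas.
toy_ratio = form_lemma_ratio(TOKENS, "SYM")

# Figures reported on this page: 133 types and 117 lemmas.
reported_ratio = round(133 / 117, 6)
```

With the page's own counts, `reported_ratio` reproduces the 1.136752 figure.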

The highest number of forms (8) was observed with the lemma “%”: %, %-ga, %-l, %-le, %-lt, %-ni, %l, %ni.

The second highest number of forms (4) was observed with the lemma “A”: -A, A-, A-ga, A-ks.

The third highest number of forms (3) was observed with the lemma “i”: ‘i, i-ga, i-le.

SYM occurs with 6 features: Abbr (105; 14% instances), Case (59; 8% instances), Number (58; 8% instances), Hyph (18; 2% instances), NumType (7; 1% instances), NumForm (6; 1% instances)

SYM occurs with 11 feature-value pairs: Abbr=Yes, Case=Abl, Case=Ade, Case=All, Case=Com, Case=Ter, Case=Tra, Hyph=Yes, NumForm=Digit, NumType=Card, Number=Sing

SYM occurs with 16 feature combinations. The most frequent feature combination is _ (562 tokens). Examples: %, &, =, &, ?, 1-, *, 7-, 9qh+, ω-3
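The three feature counts above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of a CoNLL-U file, where pairs are separated by `|` and `_` marks an empty feature set. The following is a minimal sketch under that assumption. The sample values and the function name `feature_stats` are hypothetical.

```python
from collections import Counter

# Hypothetical FEATS values for a handful of SYM tokens (toy data).
FEATS = [
    "_",
    "Case=Ade|Number=Sing",
    "Abbr=Yes",
    "_",
    "Case=Ade|Number=Sing",
    "Hyph=Yes",
    "_",
]


def feature_stats(feats_column):
    """Count feature names, feature=value pairs, and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1          # the full string is one combination
        if feats == "_":            # "_" means no features on this token
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


features, pairs, combos = feature_stats(FEATS)
```

Here `len(features)`, `len(pairs)` and `len(combos)` correspond to the three totals reported above, and `combos["_"]` corresponds to the count of tokens with the empty combination.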

Relations

SYM nodes are attached to their parents using 19 different relations: parataxis (142; 19% instances), obl (130; 18% instances), nmod (127; 17% instances), conj (87; 12% instances), flat (58; 8% instances), nsubj (49; 7% instances), root (39; 5% instances), orphan (33; 4% instances), obj (27; 4% instances), appos (12; 2% instances), nsubj:cop (12; 2% instances), advcl (6; 1% instances), nummod (5; 1% instances), acl:relcl (4; 1% instances), advmod (4; 1% instances), ccomp (2; 0% instances), acl (1; 0% instances), list (1; 0% instances), xcomp (1; 0% instances)
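As a consistency check, the relation counts listed above sum to the 740 SYM tokens, and the percentages follow from dividing each count by that total. The sketch below reproduces them. The half-up rounding rule and the helper name `percent_half_up` are assumptions, since the page does not state how its percentages are rounded.

```python
# Counts copied from the relation list above.
RELATION_COUNTS = {
    "parataxis": 142, "obl": 130, "nmod": 127, "conj": 87, "flat": 58,
    "nsubj": 49, "root": 39, "orphan": 33, "obj": 27, "appos": 12,
    "nsubj:cop": 12, "advcl": 6, "nummod": 5, "acl:relcl": 4, "advmod": 4,
    "ccomp": 2, "acl": 1, "list": 1, "xcomp": 1,
}


def percent_half_up(count, total):
    """Whole-number percentage, rounding halves up (assumed rounding rule)."""
    # Integer form of floor(100 * count / total + 0.5).
    return (200 * count + total) // (2 * total)


total = sum(RELATION_COUNTS.values())
percentages = {
    rel: percent_half_up(count, total)
    for rel, count in RELATION_COUNTS.items()
}
```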

Parents of SYM nodes belong to 10 different parts of speech: NOUN (331; 45% instances), VERB (231; 31% instances), SYM (46; 6% instances), [no parent; root node] (39; 5% instances), PROPN (37; 5% instances), ADJ (22; 3% instances), ADV (16; 2% instances), NUM (14; 2% instances), PRON (2; 0% instances), X (2; 0% instances)

133 (18%) SYM nodes are leaves.

259 (35%) SYM nodes have one child.

176 (24%) SYM nodes have two children.

172 (23%) SYM nodes have three or more children.

The highest child degree of a SYM node is 9.
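The child-degree figures above count, for each SYM node, how many other nodes name it as their head. A minimal sketch of that count, assuming CoNLL-U style token ids and HEAD values where 0 marks the root, is given below. The sentence rows and the function names are invented for illustration.

```python
from collections import Counter

# Toy sentence as (id, upos, head) rows; head 0 marks the root. Invented data.
ROWS = [
    (1, "NUM", 2),
    (2, "SYM", 4),
    (3, "NOUN", 2),
    (4, "VERB", 0),
    (5, "SYM", 4),
    (6, "PUNCT", 4),
]


def child_degrees(rows, upos):
    """Map each node id with the given UPOS tag to its number of children."""
    children = Counter(head for _, _, head in rows if head != 0)
    return {tid: children.get(tid, 0) for tid, tag, _ in rows if tag == upos}


def bucket(degrees):
    """Group degrees into 0, 1, 2 and 3-or-more, as in the figures above."""
    buckets = Counter()
    for degree in degrees.values():
        buckets[min(degree, 3)] += 1
    return buckets


sym_degrees = child_degrees(ROWS, "SYM")
sym_buckets = bucket(sym_degrees)
highest_degree = max(sym_degrees.values())
```

In the toy sentence, token 2 has two children and token 5 is a leaf. In a real treebank the counts would be accumulated per sentence, since token ids restart in each sentence.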

Children of SYM nodes are attached using 23 different relations: nummod (475; 37% instances), punct (321; 25% instances), nmod (156; 12% instances), conj (72; 6% instances), cc (37; 3% instances), nsubj:cop (34; 3% instances), cop (28; 2% instances), orphan (24; 2% instances), obl (23; 2% instances), advmod (21; 2% instances), parataxis (21; 2% instances), flat (20; 2% instances), case (13; 1% instances), compound (7; 1% instances), acl:relcl (5; 0% instances), aux (4; 0% instances), mark (4; 0% instances), advcl (2; 0% instances), det (2; 0% instances), acl (1; 0% instances), appos (1; 0% instances), cc:preconj (1; 0% instances), list (1; 0% instances)

Children of SYM nodes belong to 15 different parts of speech: NUM (505; 40% instances), PUNCT (321; 25% instances), NOUN (242; 19% instances), SYM (46; 4% instances), CCONJ (37; 3% instances), AUX (32; 3% instances), ADV (26; 2% instances), PROPN (17; 1% instances), ADP (13; 1% instances), PRON (10; 1% instances), ADJ (9; 1% instances), VERB (8; 1% instances), SCONJ (4; 0% instances), DET (2; 0% instances), X (1; 0% instances)