
This page pertains to UD version 2.

Treebank Statistics: UD_Belarusian-HSE: POS Tags: SYM

There are 561 SYM lemmas (2%), 562 SYM types (1%) and 2606 SYM tokens (1%). Out of 17 observed tags, SYM ranks 8th in number of lemmas, 8th in number of types and 15th in number of tokens.
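Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page. It assumes that a "type" is a distinct word form (column FORM, compared case-sensitively) and a "lemma" is a distinct value of column LEMMA. The function name `count_upos` and the three-sentence sample are made up for illustration and are not taken from UD_Belarusian-HSE.

```python
# Made-up three-sentence CoNLL-U sample (not taken from UD_Belarusian-HSE).
SAMPLE = (
    "1\t5\t5\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    "2\t%\t%\tSYM\t_\t_\t0\troot\t_\t_\n"
    "3\t+\t+\tSYM\t_\t_\t2\tconj\t_\t_\n"
    "\n"
    "1\t20\t20\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    "2\t\u00b0C\t\u00b0C\tSYM\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\t3\t3\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    # Same lemma as above, but the form ends in Cyrillic U+0421 instead of Latin C.
    "2\t\u00b0\u0421\t\u00b0C\tSYM\t_\t_\t0\troot\t_\t_\n"
)


def count_upos(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID.
        # This skips comments, blank lines, ranges like 1-2 and empty nodes like 1.1.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            forms.add(cols[1])
            lemmas.add(cols[2])
            tokens += 1
    return len(lemmas), len(forms), tokens


n_lemmas, n_types, n_tokens = count_upos(SAMPLE, "SYM")
print(n_lemmas, n_types, n_tokens)  # 3 4 4
```

In the sample, the two spellings of the degree sign plus letter share one lemma, so the number of types exceeds the number of lemmas by one.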

The 10 most frequent SYM lemmas: %, 📌, >, </a>, ⚡, +, 🔥, ⚡️, №, 👉

The 10 most frequent SYM types: %, 📌, >, </a>, ⚡, +, 🔥, ⚡️, №, 👉

The 10 most frequent ambiguous lemmas: 📌 (SYM 95, PUNCT 3), > (SYM 88, PUNCT 1), </a> (X 2126, SYM 83), 🔹 (SYM 43, ADP 3), 🎧 (SYM 38, X 1), / (PUNCT 225, SYM 30), ❗️ (SYM 17, PUNCT 1), °C (SYM 13, X 2), $ (SYM 10, NUM 2), (SYM 10, ADP 1)

The 10 most frequent ambiguous types: 📌 (SYM 95, PUNCT 3), > (SYM 88, PUNCT 1), </a> (X 2126, SYM 83), 🔹 (SYM 43, ADP 3), 🎧 (SYM 38, X 1), / (PUNCT 225, SYM 30), ❗️ (SYM 17, PUNCT 1), $ (SYM 10, NUM 2), (SYM 10, ADP 1), | (PUNCT 10, SYM 9)

Morphology

The form / lemma ratio of SYM is 1.001783 (the average of all parts of speech is 1.753662).
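The reported ratio is consistent with dividing the number of SYM types by the number of SYM lemmas given at the top of this page. That this is how the figure was produced is an assumption; the page does not state the formula. A quick check of the arithmetic:

```python
# Figures quoted on this page for SYM.
sym_types = 562
sym_lemmas = 561

# Assumed definition: distinct forms divided by distinct lemmas.
ratio = sym_types / sym_lemmas
print(f"{ratio:.6f}")  # 1.001783
```

A value this close to 1 means that almost every SYM lemma appears in a single form.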

The highest number of forms (2) was observed with the lemma “))))”: ))), )))).

The 2nd highest number of forms (2) was observed with the lemma “°C”: °C, °С.

The 3rd highest number of forms (1) was observed with the lemma “#”: #.

SYM occurs with 5 features: Animacy (8; 0% instances), Case (8; 0% instances), Gender (8; 0% instances), Number (8; 0% instances), Foreign (1; 0% instances)

SYM occurs with 5 feature-value pairs: Animacy=Anim, Case=Gen, Foreign=Yes, Gender=Masc, Number=Sing

SYM occurs with 3 feature combinations. The most frequent feature combination is _ (2597 tokens). Examples: %, 📌, >, </a>, ⚡, +, 🔥, ⚡️, №, 👉

Relations

SYM nodes are attached to their parents using 23 different relations: parataxis (1604; 62% instances), discourse (239; 9% instances), dep (215; 8% instances), root (116; 4% instances), list (115; 4% instances), compound (79; 3% instances), nsubj (41; 2% instances), case (38; 1% instances), obj (32; 1% instances), cc (25; 1% instances), nmod (24; 1% instances), obl (24; 1% instances), conj (20; 1% instances), flat:foreign (8; 0% instances), appos (7; 0% instances), advmod (5; 0% instances), amod (4; 0% instances), nsubj:pass (4; 0% instances), flat (2; 0% instances), dislocated (1; 0% instances), flat:name (1; 0% instances), reparandum (1; 0% instances), xcomp (1; 0% instances)

Parents of SYM nodes fall into 14 categories (13 parts of speech plus the artificial root, which has no part of speech): VERB (915; 35% instances), NOUN (593; 23% instances), X (396; 15% instances), PROPN (147; 6% instances), NUM (134; 5% instances), ADJ (116; 4% instances), root, no part of speech (116; 4% instances), SYM (84; 3% instances), ADV (63; 2% instances), PRON (21; 1% instances), DET (8; 0% instances), PART (8; 0% instances), AUX (3; 0% instances), INTJ (2; 0% instances)
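The relation and parent statistics above can be gathered in one pass over a CoNLL-U file. This is a hedged sketch under the assumption that the parent of a node is the word whose ID equals the node's HEAD column, and that HEAD 0 (the artificial root) is counted with an empty part of speech. The helper name `parent_stats` and the sample sentences are hypothetical.

```python
from collections import Counter

# Made-up two-sentence CoNLL-U sample (not taken from UD_Belarusian-HSE).
SAMPLE = (
    "1\t5\t5\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    "2\t%\t%\tSYM\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tread\tread\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t>\t>\tSYM\t_\t_\t1\tparataxis\t_\t_\n"
    "3\t+\t+\tSYM\t_\t_\t1\tparataxis\t_\t_\n"
)


def parent_stats(conllu_text, upos):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`."""
    relations, parent_pos = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()]
        # Index regular word lines by ID; comments and ranges are dropped.
        sent = {r[0]: r for r in rows if len(r) == 10 and r[0].isdigit()}
        for cols in sent.values():
            if cols[3] != upos:
                continue
            relations[cols[7]] += 1
            head = cols[6]
            # HEAD 0 is the artificial root: record an empty part of speech.
            parent_pos[sent[head][3] if head in sent else ""] += 1
    return relations, parent_pos


relations, parent_pos = parent_stats(SAMPLE, "SYM")
print(relations.most_common())
print(parent_pos.most_common())
```

Under this convention, the number of nodes with an empty parent tag equals the number of nodes attached with the root relation.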

2139 (82%) SYM nodes are leaves.

216 (8%) SYM nodes have one child.

139 (5%) SYM nodes have two children.

112 (4%) SYM nodes have three or more children.

The highest child degree of a SYM node is 9.
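The child-degree figures above (leaves, one child, two children, three or more, maximum) amount to a histogram of how many dependents each SYM node has. The following is a minimal sketch of that computation, assuming dependents are found by counting HEAD values within each sentence. The function name `child_degree_histogram` and the sample are illustrative only.

```python
from collections import Counter

# Made-up two-sentence CoNLL-U sample (not taken from UD_Belarusian-HSE).
SAMPLE = (
    "1\t5\t5\tNUM\t_\t_\t2\tnummod\t_\t_\n"
    "2\t%\t%\tSYM\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
    "1\tread\tread\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t>\t>\tSYM\t_\t_\t1\tparataxis\t_\t_\n"
)


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> how many nodes tagged `upos` have that many."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # HEAD id -> number of dependents within this sentence.
        n_children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[n_children[r[0]]] += 1  # a missing key counts as 0 (leaf)
    return hist


hist = child_degree_histogram(SAMPLE, "SYM")
print("leaves:", hist[0])
print("highest child degree:", max(hist))
```

As a consistency check on the figures reported on this page, the four child-degree counts (2139, 216, 139 and 112) add up to the 2606 SYM tokens.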

Children of SYM nodes are attached using 30 different relations: nummod:gov (186; 21% instances), punct (135; 15% instances), nmod (104; 12% instances), parataxis (97; 11% instances), nummod (65; 7% instances), case (63; 7% instances), dep (46; 5% instances), nsubj (42; 5% instances), list (25; 3% instances), appos (20; 2% instances), amod (19; 2% instances), advmod (15; 2% instances), cc (14; 2% instances), conj (14; 2% instances), orphan (10; 1% instances), flat:foreign (9; 1% instances), mark (5; 1% instances), compound (4; 0% instances), cop (4; 0% instances), det (4; 0% instances), acl (3; 0% instances), acl:relcl (3; 0% instances), flat (3; 0% instances), discourse (2; 0% instances), xcomp (2; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), iobj (1; 0% instances), obj (1; 0% instances), obl (1; 0% instances)

Children of SYM nodes belong to 16 different parts of speech: NUM (282; 31% instances), NOUN (137; 15% instances), PUNCT (135; 15% instances), SYM (84; 9% instances), X (62; 7% instances), ADP (61; 7% instances), PROPN (34; 4% instances), ADJ (33; 4% instances), VERB (16; 2% instances), ADV (14; 2% instances), CCONJ (14; 2% instances), PRON (10; 1% instances), DET (5; 1% instances), SCONJ (5; 1% instances), AUX (4; 0% instances), PART (3; 0% instances)