SYM: symbol
Definition
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
Many symbols are, or contain, special non-alphanumeric characters, much like punctuation. What distinguishes them from punctuation is that they can be replaced by ordinary words. This includes all currency symbols: for example, $ 75 is equivalent to seventy-five dollars.
Mathematical operators form another group of symbols.
Another group of symbols is emoticons and emoji.
Strings that consist entirely of alphanumeric characters are not symbols, but they may be proper nouns: 130XE, DC10. Others may be tagged PROPN (rather than SYM) even if they contain special characters: DC-10.
Similarly, abbreviations for single words are not symbols; they are assigned the part of speech of the full form. For example, п. (пан or пані, "Mr." or "Mrs."), кг (кілограм, "kilogram"), км (кілометр, "kilometre") and проф. (професор, "professor") should be tagged as nouns. Acronyms for proper names, such as OSN and NATO, should be tagged as proper nouns.
Characters used as bullets in itemized lists (•, ‣) are not symbols; they are punctuation.
Examples
- $, %, §, ©
- +, −, ×, ÷, =, <, >
- :), ♥‿♥, 😝
- john.doe@universal.org, http://universaldependencies.org/, 1-800-COMPANY
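The distinctions drawn in the definition can be illustrated with a rough heuristic. The sketch below is only an illustration: the function name `guess_category` and the lookup tables are hypothetical and built solely from the examples on this page, so it is not a real tagger and not part of any UD tool. Anything it cannot decide from surface form alone (abbreviations, DC-10, e-mail addresses) is returned as "undecided".

```python
# Hypothetical lookup tables, filled only with the examples listed on this page.
BULLETS = {"•", "‣"}
SIGNS = {"$", "%", "§", "©"}
MATH_OPERATORS = {"+", "−", "×", "÷", "=", "<", ">"}
EMOTICONS = {":)", "♥‿♥", "😝"}


def guess_category(token: str) -> str:
    """Return a rough surface-form guess: 'PUNCT', 'SYM', 'not SYM' or 'undecided'."""
    if token in BULLETS:
        # List bullets are punctuation, not symbols.
        return "PUNCT"
    if token.isalnum():
        # Purely alphanumeric strings (130XE, DC10, кг) are never SYM;
        # their actual tag (e.g. PROPN or NOUN) needs more than surface form.
        return "not SYM"
    if token in SIGNS | MATH_OPERATORS | EMOTICONS:
        return "SYM"
    # Abbreviations with a period, DC-10, e-mail addresses and so on
    # cannot be decided by this simple check.
    return "undecided"


for example in ["$", "•", "DC10", "DC-10", ":)", "кг"]:
    print(example, "->", guess_category(example))
```

A real annotation decision also depends on context and on the full form behind an abbreviation, which a surface check like this cannot see.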
Treebank Statistics (UD_Ukrainian)
There are 4 SYM lemmas (1%), 4 SYM types (1%) and 4 SYM tokens (0%).
Out of 16 observed tags, the rank of SYM is: 14 in number of lemmas, 14 in number of types and 14 in number of tokens.
The 10 most frequent SYM lemmas: 27А15676, 890-365-345, jarko@gmail.com, С
The 10 most frequent SYM types: 27А15676-С, 890-365-345, jarko@gmail.com, С
The 10 most frequent ambiguous lemmas:
The 10 most frequent ambiguous types:
Morphology
The form / lemma ratio of SYM is 1.000000 (the average of all parts of speech is 1.172859).
The 1st highest number of forms (1) was observed with the lemma “27А15676”: 27А15676-С.
The 2nd highest number of forms (1) was observed with the lemma “890-365-345”: 890-365-345.
The 3rd highest number of forms (1) was observed with the lemma “jarko@gmail.com”: jarko@gmail.com.
SYM does not occur with any features.
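Counts of this kind can be reproduced from a treebank in CoNLL-U format, where each token line has ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch, not the script that generated this page. It assumes that the form / lemma ratio is the number of distinct forms divided by the number of distinct lemmas, which matches the value 1.0 reported above but is not confirmed here. The helper name `upos_stats` is hypothetical, and the sample rows reuse the four form and lemma pairs listed above with placeholder values in all other columns.

```python
def upos_stats(conllu_text: str, upos: str = "SYM") -> dict:
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag."""
    lemmas, types = set(), set()
    tokens = 0
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)
            lemmas.add(lemma)
    # Assumed definition of the ratio: distinct forms per distinct lemma.
    ratio = len(types) / len(lemmas) if lemmas else 0.0
    return {
        "lemmas": len(lemmas),
        "types": len(types),
        "tokens": tokens,
        "form_lemma_ratio": ratio,
    }


# Placeholder rows: only FORM, LEMMA and UPOS come from the lists above.
STATS_SAMPLE = "\n".join([
    "1\t27А15676-С\t27А15676\tSYM\t_\t_\t0\troot\t_\t_",
    "",
    "1\t890-365-345\t890-365-345\tSYM\t_\t_\t0\troot\t_\t_",
    "",
    "1\tjarko@gmail.com\tjarko@gmail.com\tSYM\t_\t_\t0\troot\t_\t_",
    "",
    "1\tС\tС\tSYM\t_\t_\t0\troot\t_\t_",
])

print(upos_stats(STATS_SAMPLE))
```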
Relations
SYM nodes are attached to their parents using 3 different relations: appos (2; 50% instances), name (1; 25% instances), nmod (1; 25% instances)
Parents of SYM nodes belong to 2 different parts of speech: NOUN (3; 75% instances), NUM (1; 25% instances)
4 (100%) SYM nodes are leaves.
The highest child degree of a SYM node is 0.
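The relation figures above can be derived from the HEAD and DEPREL columns of a CoNLL-U file. The sketch below is one possible way to do it, under stated assumptions: the names `read_sentences` and `relation_stats` are hypothetical, and the two sample sentences are invented for illustration and are not taken from UD_Ukrainian. A node counts as a leaf when no other token in its sentence names it as HEAD.

```python
from collections import Counter


def read_sentences(conllu_text: str):
    """Yield each sentence as a list of 10-column token rows."""
    sentence = []
    for line in conllu_text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append(cols)  # regular tokens only
    if sentence:
        yield sentence


def relation_stats(conllu_text: str, upos: str = "SYM"):
    """Return (relation counts, parent UPOS counts, node count, leaf count)."""
    deprels, parent_tags = Counter(), Counter()
    nodes = leaves = 0
    for sentence in read_sentences(conllu_text):
        tag_by_id = {cols[0]: cols[3] for cols in sentence}
        heads = {cols[6] for cols in sentence}  # IDs that have a child
        for cols in sentence:
            if cols[3] != upos:
                continue
            nodes += 1
            deprels[cols[7]] += 1
            # HEAD 0 is the artificial root, which has no UPOS tag.
            parent_tags[tag_by_id.get(cols[6], "ROOT")] += 1
            if cols[0] not in heads:
                leaves += 1
    return deprels, parent_tags, nodes, leaves


# Invented sentences, for illustration only.
RELATION_SAMPLE = "\n".join([
    "1\tномер\tномер\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\t890-365-345\t890-365-345\tSYM\t_\t_\t1\tappos\t_\t_",
    "",
    "1\tпошта\tпошта\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tjarko@gmail.com\tjarko@gmail.com\tSYM\t_\t_\t1\tnmod\t_\t_",
])

deprels, parent_tags, nodes, leaves = relation_stats(RELATION_SAMPLE)
print(deprels.most_common(), parent_tags.most_common(), nodes, leaves)
```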