SYM: symbol
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
We recognize as symbols:
- currency symbols: $
- mathematical operators: µg / m3
- ’/’ used as a separator: 2001 / 923 / CE
- emoticons and emoji: :-)
- URLs and email addresses
The following are not symbols:
- Proper nouns with numbers and special characters: 130XE, DC10, DC-10 are tagged PROPN.
- Acronyms for proper nouns: UN, NATO are tagged as PROPN.
- Abbreviated words: Sig. (signore), kg (chilogrammo), km (chilometro), dott (dottore) are tagged NOUN.
- Characters used as bullets in itemized lists (*, •, ‣) are PUNCT.
Examples:
- $, %, §, ©
- +, −, ×, ÷, =, <, >
- :), ♥‿♥, 😝
Treebank Statistics (UD_Italian)
There are 9 SYM lemmas (0%), 9 SYM types (0%) and 194 SYM tokens (0%).
Out of 17 observed tags, the rank of SYM is: 16 in number of lemmas, 16 in number of types and 14 in number of tokens.
The 10 most frequent SYM lemmas: /, %, &, =, +, -,,, x
The 10 most frequent SYM types: /, %, &, =, +, -,,, x
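Counts like the ones above can be reproduced from a treebank file. The following is a minimal sketch, assuming the data is in the standard CoNLL-U layout (ten tab-separated columns, with FORM, LEMMA and UPOS in columns 2 to 4). The names `read_tokens`, `tag_counts`, `row` and the `SAMPLE` sentences are hypothetical and only serve the illustration; they are not part of the UD tooling that generated this page.

```python
from collections import Counter


def read_tokens(conllu_text):
    """Yield (form, lemma, upos) for each regular token line of CoNLL-U text."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence separator or comment line
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # multiword-token range (1-2) or empty node (1.1)
        yield cols[1], cols[2], cols[3]


def tag_counts(conllu_text, tag="SYM"):
    """Count lemmas and word forms (types) of all tokens carrying `tag`."""
    lemmas, types = Counter(), Counter()
    for form, lemma, upos in read_tokens(conllu_text):
        if upos == tag:
            lemmas[lemma] += 1
            types[form] += 1
    return lemmas, types


def row(*cols):
    """Build one tab-separated CoNLL-U line (hypothetical helper for the demo)."""
    return "\t".join(cols)


# Hypothetical two-sentence sample, not taken from UD_Italian.
SAMPLE = "\n".join([
    row("1", "5", "5", "NUM", "_", "_", "2", "nummod", "_", "_"),
    row("2", "%", "%", "SYM", "_", "_", "0", "root", "_", "_"),
    "",
    row("1", "2001", "2001", "NUM", "_", "_", "0", "root", "_", "_"),
    row("2", "/", "/", "SYM", "_", "_", "1", "punct", "_", "_"),
    row("3", "923", "923", "NUM", "_", "_", "1", "nmod", "_", "_"),
    row("4", "/", "/", "SYM", "_", "_", "1", "punct", "_", "_"),
    row("5", "CE", "CE", "PROPN", "_", "_", "1", "nmod", "_", "_"),
])

lemmas, types = tag_counts(SAMPLE)
print(len(lemmas), len(types), sum(types.values()))  # 2 2 3
print(lemmas.most_common(10))  # [('/', 2), ('%', 1)]
```

Running `tag_counts` once per tag and sorting the tags by the resulting totals would give a rank comparable to the one reported above.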
The 10 most frequent ambiguous lemmas: / (SYM 94, PUNCT 2), & (SYM 6, PROPN 1), - (PUNCT 765, SYM 2)
The 10 most frequent ambiguous types: / (SYM 94, PUNCT 2), & (SYM 6, PROPN 1), - (PUNCT 766, SYM 2), x (SYM 1, ADJ 1)
The form / lemma ratio of SYM is 1.000000 (the average of all parts of speech is 1.491677).
The 1st highest number of forms (1) was observed with the lemma “%”: %.
The 2nd highest number of forms (1) was observed with the lemma “&”: &.
The 3rd highest number of forms (1) was observed with the lemma “+”: +.
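The ratio and the per-lemma ranking can be illustrated with a short sketch. It assumes that the form / lemma ratio is the number of distinct word forms divided by the number of distinct lemmas, which is consistent with 9 types over 9 lemmas giving 1.0 for SYM; the exact definition used by the statistics generator is not stated on this page. The function name `form_lemma_stats` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def form_lemma_stats(pairs):
    """Return (ratio, ranking) for (form, lemma) pairs of one part of speech.

    ratio   : distinct forms / distinct lemmas (assumed definition)
    ranking : lemmas sorted by how many distinct forms they have
    """
    forms_by_lemma = defaultdict(set)
    all_forms = set()
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
        all_forms.add(form)
    if not forms_by_lemma:
        return 0.0, []
    ratio = len(all_forms) / len(forms_by_lemma)
    ranking = sorted(
        ((lemma, sorted(forms)) for lemma, forms in forms_by_lemma.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )
    return ratio, ranking


# Symbols do not inflect, so every lemma has exactly one form.
sym_pairs = [("%", "%"), ("/", "/"), ("/", "/"), ("&", "&")]
print(form_lemma_stats(sym_pairs)[0])  # 1.0

# Hypothetical verb tokens for contrast: inflection raises the ratio.
verb_pairs = [("è", "essere"), ("sono", "essere"), ("era", "essere"), ("ha", "avere")]
ratio, ranking = form_lemma_stats(verb_pairs)
print(ratio)          # 2.0
print(ranking[0][0])  # essere
```

A ratio of exactly 1.0 simply reflects that each symbol is its own lemma, which is why every lemma in the ranking above has a single form.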
SYM does not occur with any features.
SYM nodes are attached to their parents using 11 different relations: punct (82; 42% instances), nmod (60; 31% instances), dobj (13; 7% instances), name (11; 6% instances), cc (9; 5% instances), conj (5; 3% instances), nsubj (5; 3% instances), xcomp (3; 2% instances), mwe (2; 1% instances), nsubjpass (2; 1% instances), nummod (2; 1% instances)
Parents of SYM nodes belong to 12 different parts of speech: NUM (60; 31% instances), VERB (52; 27% instances), NOUN (35; 18% instances), PROPN (25; 13% instances), ADJ (12; 6% instances), ADP (2; 1% instances), SYM (2; 1% instances), X (2; 1% instances), ADV (1; 1% instances), CONJ (1; 1% instances), PRON (1; 1% instances), PUNCT (1; 1% instances)
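The two distributions above (incoming relation and part of speech of the parent) can be tallied in one pass over the trees. This is a minimal sketch, assuming each sentence is already available as a list of `(id, upos, head, deprel)` tuples with `head == 0` marking the root; `attachment_stats`, `with_percentages` and the sample sentences are hypothetical names and data, and rounding to whole percentages is an assumption about how the figures were produced.

```python
from collections import Counter


def attachment_stats(sentences, tag="SYM"):
    """Count incoming relations and parent tags for all nodes tagged `tag`."""
    relations, parent_tags = Counter(), Counter()
    for sentence in sentences:
        upos_by_id = {tok_id: upos for tok_id, upos, _, _ in sentence}
        for tok_id, upos, head, deprel in sentence:
            if upos != tag:
                continue
            relations[deprel] += 1
            if head != 0:  # the root has no parent token
                parent_tags[upos_by_id[head]] += 1
    return relations, parent_tags


def with_percentages(counter):
    """Turn a Counter into (key, count, rounded percentage) rows."""
    total = sum(counter.values())
    return [(key, n, round(100 * n / total)) for key, n in counter.most_common()]


# Hypothetical sample trees, not taken from UD_Italian.
sentences = [
    [(1, "NUM", 0, "root"), (2, "SYM", 1, "punct"), (3, "NUM", 1, "nmod")],
    [(1, "VERB", 0, "root"), (2, "NUM", 3, "nummod"), (3, "SYM", 1, "dobj")],
    [(1, "NUM", 0, "root"), (2, "SYM", 1, "punct"), (3, "NUM", 1, "nmod")],
]

relations, parent_tags = attachment_stats(sentences)
print(with_percentages(relations))    # [('punct', 2, 67), ('dobj', 1, 33)]
print(with_percentages(parent_tags))  # [('NUM', 2, 67), ('VERB', 1, 33)]
```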
108 (56%) SYM nodes are leaves.
2 (1%) SYM nodes have one child.
10 (5%) SYM nodes have two children.
74 (38%) SYM nodes have three or more children.
The highest child degree of a SYM node is 5.
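The child-degree breakdown above amounts to counting, for every SYM node, how many tokens name it as their head. A minimal sketch follows, again assuming sentences given as lists of `(id, upos, head, deprel)` tuples; `child_degree_histogram` and the sample trees are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentences, tag="SYM"):
    """Map number of children -> number of `tag` nodes with that many children."""
    histogram = Counter()
    for sentence in sentences:
        # How many tokens point at each head id within this sentence.
        n_children = Counter(head for _, _, head, _ in sentence)
        for tok_id, upos, _, _ in sentence:
            if upos == tag:
                histogram[n_children[tok_id]] += 1
    return histogram


# Hypothetical sample trees, not taken from UD_Italian.
sentences = [
    # "il 5 %": the symbol heads a determiner and a numeral.
    [(1, "DET", 3, "det"), (2, "NUM", 3, "nummod"), (3, "SYM", 0, "root")],
    # "2001 / 923": the slash is a leaf.
    [(1, "NUM", 0, "root"), (2, "SYM", 1, "punct"), (3, "NUM", 1, "nmod")],
]

histogram = child_degree_histogram(sentences)
leaves = histogram[0]
three_or_more = sum(n for degree, n in histogram.items() if degree >= 3)
print(leaves, histogram[1], histogram[2], three_or_more)  # 1 0 1 0
print(max(histogram))  # 2
```

The split seen in the treebank, with most SYM nodes either leaves or heads of three or more children, fits the two uses described earlier: separators such as / attach as leaves, while symbols such as % typically head a numeral, a determiner and a preposition.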
Children of SYM nodes are attached using 11 different relations: nummod (84; 30% instances), det (82; 29% instances), case (54; 19% instances), nmod (39; 14% instances), advmod (8; 3% instances), amod (4; 1% instances), cc (2; 1% instances), conj (2; 1% instances), punct (2; 1% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances)
Children of SYM nodes belong to 11 different parts of speech: NUM (84; 30% instances), DET (82; 29% instances), ADP (54; 19% instances), NOUN (35; 13% instances), ADV (8; 3% instances), ADJ (4; 1% instances), PROPN (4; 1% instances), CONJ (2; 1% instances), PUNCT (2; 1% instances), SYM (2; 1% instances), VERB (2; 1% instances)