SYM: symbol
Definition
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
Many symbols are or contain special non-alphanumeric characters, much like punctuation. What sets them apart from punctuation is that they can be replaced by ordinary words. This applies to all currency symbols: for example, $ 75 can be read as seventy-five dollars.
Mathematical operators form another group of symbols.
Another group of symbols is emoticons and emoji.
Strings that consist entirely of alphanumeric characters are not symbols, but they may be proper nouns: 130XE, DC10. Others may be tagged PROPN (rather than SYM) even if they contain special characters: DC-10. Similarly, abbreviations for single words are not symbols but are assigned the part of speech of the full form. For example, Sr. (senhor), kg (kilograma), km (quilômetro), Dr (doutor) should be tagged as nouns. Acronyms for proper names such as PT and IBM should be tagged as proper nouns.
Characters used as bullets in itemized lists (•, ‣) and parentheses are not symbols; they are punctuation.
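The criteria above can be roughed out as a first-pass filter. This is only an illustrative sketch: the function name `rough_category` and its three labels are invented here, and the filter deliberately over-generates. Strings such as DC-10 or Sr. come out as candidates even though the guidelines assign them PROPN or NOUN, so a real tagger would still need context and a lexicon.

```python
# Characters the guidelines explicitly treat as punctuation, not symbols.
PUNCT_LIKE = {"•", "‣", "(", ")"}

def rough_category(token: str) -> str:
    """Rough first-pass guess only; not a substitute for real tagging."""
    if token in PUNCT_LIKE:
        return "PUNCT"
    if token.isalnum():
        # Purely alphanumeric strings (130XE, DC10, kg) are never SYM.
        return "not SYM"
    # Contains a special character: may be SYM, but could still be
    # PROPN (DC-10) or an abbreviation tagged by its full form (Sr.).
    return "SYM candidate"

for tok in ["$", "DC10", "DC-10", "•"]:
    print(tok, "->", rough_category(tok))
```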
Examples
- $, %, §, ©
- +, −, ×, ÷, =, <, >
- :), ♥‿♥, 😝
- john.doe@universal.org, http://universaldependencies.org/, 1-800-COMPANY
Treebank Statistics (UD_Portuguese)
There are 6 SYM lemmas (0%), 7 SYM types (0%) and 450 SYM tokens (0%).
Out of 17 observed tags, the rank of SYM is: 16 in number of lemmas, 15 in number of types and 13 in number of tokens.
The 10 most frequent SYM lemmas: %, US$, R$, /, CR$, U$
The 10 most frequent SYM types: %, US$, R$, /, CR$, $%, U$
The 10 most frequent ambiguous lemmas: / (SYM 34, PUNCT 9, CONJ 1)
The 10 most frequent ambiguous types: / (SYM 34, PUNCT 9, CONJ 1)
- /
  - SYM 34: Média é de 4 passageiros / viagem
  - PUNCT 9: Um friso inenarrável de personagens marginais e / ou marginalizadas , com o pós-guerra em pano de fundo .
  - CONJ 1: Apesar de o grotesco de a situação , qualquer caloiro que procurasse saber de as diligências que necessita de efectuar para se inscrever em Ciências , deparava com uma longa lista de preceitos , intitulada « Aviso » e que explicava que todos os colocados em a faculdade « em o ano lectivo de 1994 / 95 ( 1º ano / 1ª vez ) farão a sua matrícula por via postal ( correio registado ) » , a o que se seguia uma listagem de os documentos a enviar .
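As a small illustration of what "ambiguous" means here, the sketch below treats a form as ambiguous when it was observed with more than one tag, and rebuilds the summary line for "/" from the counts given above. The helper name `describe` is made up for this example, and the definition of ambiguity is an assumption inferred from the listing.

```python
from collections import Counter

def describe(form: str, tag_counts: Counter) -> str:
    """Format a form with its tags, most frequent tag first."""
    parts = ", ".join(f"{tag} {n}" for tag, n in tag_counts.most_common())
    return f"{form} ({parts})"

# Tag counts for "/" as reported for UD_Portuguese.
slash = Counter({"SYM": 34, "PUNCT": 9, "CONJ": 1})

# Assumed definition: ambiguous = seen with more than one tag.
is_ambiguous = len(slash) > 1
line = describe("/", slash)
print(is_ambiguous, line)
```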
Morphology
The form / lemma ratio of SYM is 1.166667 (the average of all parts of speech is 1.432674).
The 1st highest number of forms (2) was observed with the lemma “%”: $%, %.
The 2nd highest number of forms (1) was observed with the lemma “/”: /.
The 3rd highest number of forms (1) was observed with the lemma “CR$”: CR$.
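The ratio reported above is consistent with dividing the 7 SYM types by the 6 SYM lemmas. The following sketch reproduces that arithmetic under two assumptions: that the ratio is computed as distinct forms over distinct lemmas, and that US$, R$ and U$ each have themselves as lemma (the text only confirms the pairings for %, / and CR$).

```python
from collections import defaultdict

# (form, lemma) pairs for SYM in UD_Portuguese. "$%" -> "%" is stated in
# the text; the self-lemmas of US$, R$ and U$ are assumed.
pairs = [
    ("%", "%"), ("$%", "%"),
    ("US$", "US$"), ("R$", "R$"), ("/", "/"),
    ("CR$", "CR$"), ("U$", "U$"),
]

forms_by_lemma = defaultdict(set)
for form, lemma in pairs:
    forms_by_lemma[lemma].add(form)

n_forms = sum(len(forms) for forms in forms_by_lemma.values())
n_lemmas = len(forms_by_lemma)
ratio = n_forms / n_lemmas
print(f"{n_forms} forms / {n_lemmas} lemmas = {ratio:.6f}")
```

With these pairs the lemma "%" is the only one with two forms, which matches the ranking listed above.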
SYM occurs with 2 features: Gender (416; 92% instances), Number (416; 92% instances)
SYM occurs with 3 feature-value pairs: Gender=Masc, Number=Plur, Number=Sing
SYM occurs with 3 feature combinations.
The most frequent feature combination is Gender=Masc|Number=Plur (408 tokens).
Examples: %, US$, R$, CR$, $%, U$
Relations
SYM nodes are attached to their parents using 12 different relations: nmod (165; 37% instances), dobj (123; 27% instances), advmod (54; 12% instances), cc (24; 5% instances), conj (24; 5% instances), root (23; 5% instances), nsubj (13; 3% instances), case (11; 2% instances), acl (4; 1% instances), nsubjpass (4; 1% instances), ccomp (3; 1% instances), iobj (2; 0% instances)
Parents of SYM nodes belong to 11 different parts of speech: VERB (191; 42% instances), NOUN (149; 33% instances), SYM (53; 12% instances), ROOT (23; 5% instances), PROPN (12; 3% instances), ADJ (11; 2% instances), ADP (3; 1% instances), NUM (3; 1% instances), ADV (2; 0% instances), DET (2; 0% instances), PRON (1; 0% instances)
38 (8%) SYM nodes are leaves.
123 (27%) SYM nodes have one child.
155 (34%) SYM nodes have two children.
134 (30%) SYM nodes have three or more children.
The highest child degree of a SYM node is 12.
Children of SYM nodes are attached using 16 different relations: nummod (318; 31% instances), nmod (247; 24% instances), case (215; 21% instances), punct (91; 9% instances), cop (28; 3% instances), nsubj (24; 2% instances), conj (22; 2% instances), advmod (21; 2% instances), cc (18; 2% instances), amod (13; 1% instances), acl (12; 1% instances), det (12; 1% instances), advcl (1; 0% instances), dobj (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances)
Children of SYM nodes belong to 13 different parts of speech: NUM (321; 31% instances), NOUN (219; 21% instances), ADP (203; 20% instances), PUNCT (91; 9% instances), SYM (53; 5% instances), VERB (41; 4% instances), ADV (28; 3% instances), CONJ (17; 2% instances), PRON (15; 1% instances), ADJ (14; 1% instances), DET (12; 1% instances), PROPN (10; 1% instances), SCONJ (1; 0% instances)
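The percentages in the parent-relation listing for UD_Portuguese can be cross-checked against the token count. The sketch below assumes each percentage is the relation's count divided by the 450 SYM tokens, rounded to the nearest integer; that rule is inferred from the numbers, not stated by the source.

```python
# Parent-relation counts for SYM nodes in UD_Portuguese, as listed above.
parent_relations = {
    "nmod": 165, "dobj": 123, "advmod": 54, "cc": 24, "conj": 24,
    "root": 23, "nsubj": 13, "case": 11, "acl": 4, "nsubjpass": 4,
    "ccomp": 3, "iobj": 2,
}

# Every SYM token has exactly one parent relation, so the counts
# should add up to the number of SYM tokens.
total = sum(parent_relations.values())

# Assumed rounding rule: nearest whole percent.
percent = {rel: round(100 * n / total) for rel, n in parent_relations.items()}

for rel, n in parent_relations.items():
    print(f"{rel} ({n}; {percent[rel]}% instances)")
```

The counts sum to 450, matching the token total, which is why a relation with only 2 instances is reported as 0%.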
Treebank Statistics (UD_Portuguese-Bosque)
There is 1 SYM lemma (0%), 1 SYM type (0%) and 203 SYM tokens (0%).
Out of 17 observed tags, the rank of SYM is: 16 in number of lemmas, 16 in number of types and 14 in number of tokens.
The 10 most frequent SYM lemmas: %
The 10 most frequent SYM types: %
The 10 most frequent ambiguous lemmas:
The 10 most frequent ambiguous types:
Morphology
The form / lemma ratio of SYM is 1.000000 (the average of all parts of speech is 1.449059).
The 1st highest number of forms (1) was observed with the lemma “%”: %.
SYM occurs with 2 features: Gender (203; 100% instances), Number (203; 100% instances)
SYM occurs with 2 feature-value pairs: Gender=Masc, Number=Plur
SYM occurs with 1 feature combination.
The most frequent feature combination is Gender=Masc|Number=Plur (203 tokens).
Examples: %
Relations
SYM nodes are attached to their parents using 11 different relations: nmod (102; 50% instances), dobj (40; 20% instances), appos (23; 11% instances), nsubj (11; 5% instances), root (11; 5% instances), conj (8; 4% instances), ccomp (3; 1% instances), pt-dep/nmod:npmod (2; 1% instances), parataxis (1; 0% instances), remnant (1; 0% instances), xcomp (1; 0% instances)
Parents of SYM nodes belong to 8 different parts of speech: VERB (92; 45% instances), NOUN (64; 32% instances), SYM (23; 11% instances), ROOT (11; 5% instances), PROPN (6; 3% instances), ADJ (5; 2% instances), ADV (1; 0% instances), NUM (1; 0% instances)
1 (0%) SYM node is a leaf.
41 (20%) SYM nodes have one child.
69 (34%) SYM nodes have two children.
92 (45%) SYM nodes have three or more children.
The highest child degree of a SYM node is 10.
Children of SYM nodes are attached using 18 different relations: nummod (202; 38% instances), case (109; 21% instances), nmod (80; 15% instances), punct (61; 11% instances), cop (14; 3% instances), nsubj (12; 2% instances), conj (11; 2% instances), advmod (10; 2% instances), cc (6; 1% instances), det (6; 1% instances), amod (5; 1% instances), acl (4; 1% instances), appos (4; 1% instances), advcl (2; 0% instances), parataxis (2; 0% instances), mark (1; 0% instances), pt-dep/nmod:npmod (1; 0% instances), remnant (1; 0% instances)
Children of SYM nodes belong to 13 different parts of speech: NUM (204; 38% instances), ADP (109; 21% instances), NOUN (72; 14% instances), PUNCT (61; 11% instances), SYM (23; 4% instances), VERB (20; 4% instances), ADV (10; 2% instances), ADJ (8; 2% instances), DET (7; 1% instances), PROPN (7; 1% instances), CONJ (6; 1% instances), PRON (3; 1% instances), SCONJ (1; 0% instances)