PUNCT: punctuation
Definition
Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.
Punctuation is not taken to include logograms such as $, %, and §, which are instead tagged as SYM.
Examples
- Period: .
- Comma: ,
- Parentheses: ()
Treebank Statistics (UD_Turkish)
There are 13 PUNCT lemmas (0%), 13 PUNCT types (0%) and 10424 PUNCT tokens (18%).
Out of 14 observed tags, the rank of PUNCT is: 12 in number of lemmas, 13 in number of types and 3 in number of tokens.
The 10 most frequent PUNCT lemmas: ., ,, “, …, ?, :, -, ;, !, )
The 10 most frequent PUNCT types: ., ,, “, …, ?, :, -, ;, !, )
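Counts like the ones above can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` text is a hypothetical toy annotation (not taken from UD_Turkish), the names `read_tokens` and `tag_stats` are illustrative, and it assumes that "types" means distinct case-sensitive word forms.

```python
from collections import Counter

# Hypothetical toy sample in CoNLL-U format (10 tab-separated columns);
# the annotation is illustrative only and not taken from the treebank.
SAMPLE = (
    "1\tGeldi\tgel\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
    "1\tEvet\tevet\tINTJ\t_\t_\t3\tdiscourse\t_\t_\n"
    "2\t,\t,\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "3\tgeldi\tgel\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)


def read_tokens(conllu_text):
    """Yield (form, lemma, upos) for every ordinary word line."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        yield cols[1], cols[2], cols[3]


def tag_stats(conllu_text, tag, n=10):
    """Count lemmas, types and tokens for one UPOS tag, plus the n most frequent."""
    tokens = list(read_tokens(conllu_text))
    forms = Counter(form for form, _, upos in tokens if upos == tag)
    lemmas = Counter(lemma for _, lemma, upos in tokens if upos == tag)
    n_tagged = sum(forms.values())
    return {
        "lemmas": len(lemmas),
        "types": len(forms),
        "tokens": n_tagged,
        "token_share": n_tagged / len(tokens) if tokens else 0.0,
        "top_lemmas": [lemma for lemma, _ in lemmas.most_common(n)],
        "top_types": [form for form, _ in forms.most_common(n)],
    }


stats = tag_stats(SAMPLE, "PUNCT")
print(stats["lemmas"], stats["types"], stats["tokens"], stats["token_share"])
```

On the toy sample, three of the six tokens are PUNCT, so the token share is 0.5; on the real treebank the same computation would be expected to give the 18% reported above, provided the counting conventions match.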
The 10 most frequent ambiguous lemmas:
The 10 most frequent ambiguous types: ? (PUNCT 231, PRON 3, PROPN 2)
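An ambiguous type is a word form that occurs with more than one part-of-speech tag, as "?" does here. A minimal sketch of how such forms could be found follows; the `SAMPLE` text is a hypothetical toy annotation and `ambiguous_types` is an illustrative name, not part of any UD tool.

```python
from collections import Counter, defaultdict

# Hypothetical toy sample in CoNLL-U format; the second sentence tags "?" as
# PRON purely to create an ambiguity for illustration.
SAMPLE = (
    "1\tNe\tne\tPRON\t_\t_\t0\troot\t_\t_\n"
    "2\t?\t?\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
    "1\t?\t?\tPRON\t_\t_\t0\troot\t_\t_\n"
    "2\t?\t?\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)


def ambiguous_types(conllu_text):
    """Map each form seen with two or more UPOS tags to its per-tag counts."""
    tags_by_form = defaultdict(Counter)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tags_by_form[cols[1]][cols[3]] += 1
    return {
        form: dict(tags)
        for form, tags in tags_by_form.items()
        if len(tags) > 1
    }


for form, tags in ambiguous_types(SAMPLE).items():
    print(form, tags)
```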
Morphology
The form / lemma ratio of PUNCT is 1.000000 (the average of all parts of speech is 2.815350).
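The ratio above can be illustrated with a short sketch. It assumes that the ratio is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas for the tag, which is consistent with 13 types over 13 lemmas giving 1.0 but is not confirmed by this page. The `SAMPLE` text and the name `form_lemma_ratio` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy sample in CoNLL-U format; annotation is illustrative only.
SAMPLE = (
    "1\tgeldi\tgel\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
    "1\tgelir\tgel\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t!\t!\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)


def form_lemma_ratio(conllu_text, tag):
    """Distinct (lemma, form) pairs divided by distinct lemmas for one tag."""
    forms_by_lemma = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == tag:
            forms_by_lemma[cols[2]].add(cols[1])
    if not forms_by_lemma:
        raise ValueError(f"tag {tag!r} does not occur in the input")
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma)


print(form_lemma_ratio(SAMPLE, "PUNCT"))
print(form_lemma_ratio(SAMPLE, "VERB"))
```

In the toy sample every punctuation lemma has exactly one form, while the verb lemma "gel" has two, which mirrors why PUNCT sits at the minimum of 1.0 while inflecting parts of speech pull the average up.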
The 1st highest number of forms (1) was observed with the lemma “!”: !.
The 2nd highest number of forms (1) was observed with the lemma “””: ”.
The 3rd highest number of forms (1) was observed with the lemma “’”: ’.
PUNCT does not occur with any features.
Relations
PUNCT nodes are attached to their parents using 8 different relations: punct (10231; 98% instances), conj (136; 1% instances), root (39; 0% instances), cc (12; 0% instances), dobj (3; 0% instances), advmod:emph (1; 0% instances), nmod:poss (1; 0% instances), nsubj (1; 0% instances)
Parents of PUNCT nodes belong to 15 different parts of speech: VERB (8508; 82% instances), NOUN (927; 9% instances), ADJ (588; 6% instances), ADV (118; 1% instances), PRON (88; 1% instances), PROPN (55; 1% instances), ROOT (39; 0% instances), PUNCT (26; 0% instances), INTJ (19; 0% instances), CONJ (17; 0% instances), NUM (15; 0% instances), ADP (11; 0% instances), DET (9; 0% instances), AUX (2; 0% instances), X (2; 0% instances)
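The relation and parent distributions above could be tallied along the following lines. This is a sketch under stated assumptions: the `SAMPLE` text is a hypothetical toy annotation, `sentences` and `attachment_stats` are illustrative names, and a head index of 0 is reported as the pseudo-tag ROOT, matching how ROOT appears in the list of parents.

```python
from collections import Counter

# Hypothetical toy sample in CoNLL-U format; annotation is illustrative only.
SAMPLE = (
    "1\tEvet\tevet\tINTJ\t_\t_\t3\tdiscourse\t_\t_\n"
    "2\t,\t,\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "3\tgeldi\tgel\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)


def sentences(conllu_text):
    """Yield each sentence as a list of column lists (ordinary words only)."""
    rows = []
    for line in conllu_text.splitlines():
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            if rows:  # a blank line closes the current sentence
                yield rows
                rows = []
            continue
        cols = line.split("\t")
        if cols[0].isdigit():  # skip multiword-token ranges and empty nodes
            rows.append(cols)
    if rows:
        yield rows


def attachment_stats(conllu_text, tag):
    """Count DEPREL labels and parent UPOS tags for nodes carrying `tag`."""
    relations, parent_tags = Counter(), Counter()
    for rows in sentences(conllu_text):
        upos_by_id = {cols[0]: cols[3] for cols in rows}
        for cols in rows:
            if cols[3] != tag:
                continue
            relations[cols[7]] += 1
            # HEAD 0 has no word line, so it falls back to the pseudo-tag ROOT.
            parent_tags[upos_by_id.get(cols[6], "ROOT")] += 1
    return relations, parent_tags


relations, parent_tags = attachment_stats(SAMPLE, "PUNCT")
print(relations.most_common())
print(parent_tags.most_common())
```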
10344 (99%) PUNCT nodes are leaves.
45 (0%) PUNCT nodes have one child.
20 (0%) PUNCT nodes have two children.
15 (0%) PUNCT nodes have three or more children.
The highest child degree of a PUNCT node is 6.
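The child degree of a node is the number of tokens whose HEAD points at it; a leaf has degree 0. A minimal sketch of that count follows. The `SAMPLE` text is a hypothetical toy annotation in which a comma heads a conjunct purely for illustration, and `sentences` and `child_degrees` are illustrative names.

```python
from collections import Counter

# Hypothetical toy sample in CoNLL-U format; the comma is given a child
# only to illustrate a non-leaf PUNCT node.
SAMPLE = (
    "1\tgeldi\tgel\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\t,\t,\tPUNCT\t_\t_\t1\tconj\t_\t_\n"
    "3\tgitti\tgit\tVERB\t_\t_\t2\tconj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)


def sentences(conllu_text):
    """Yield each sentence as a list of column lists (ordinary words only)."""
    rows = []
    for line in conllu_text.splitlines():
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            if rows:  # a blank line closes the current sentence
                yield rows
                rows = []
            continue
        cols = line.split("\t")
        if cols[0].isdigit():  # skip multiword-token ranges and empty nodes
            rows.append(cols)
    if rows:
        yield rows


def child_degrees(conllu_text, tag):
    """Return the number of children of every node carrying `tag`."""
    degrees = []
    for rows in sentences(conllu_text):
        # Each HEAD value counts one child for the node with that ID.
        n_children = Counter(cols[6] for cols in rows)
        degrees.extend(n_children[cols[0]] for cols in rows if cols[3] == tag)
    return degrees


degrees = child_degrees(SAMPLE, "PUNCT")
leaves = sum(1 for d in degrees if d == 0)
print(leaves, len(degrees), max(degrees))
```

Grouping the returned degrees into 0, 1, 2, and 3 or more would reproduce the breakdown format used above.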
Children of PUNCT nodes are attached using 12 different relations: conj (49; 36% instances), nsubj (19; 14% instances), punct (18; 13% instances), nmod (17; 12% instances), dobj (9; 7% instances), advmod (7; 5% instances), amod (6; 4% instances), cc (3; 2% instances), discourse (3; 2% instances), acl (2; 1% instances), advmod:emph (2; 1% instances), csubj (2; 1% instances)
Children of PUNCT nodes belong to 11 different parts of speech: VERB (38; 28% instances), NOUN (27; 20% instances), PUNCT (26; 19% instances), ADV (12; 9% instances), ADJ (10; 7% instances), PRON (9; 7% instances), CONJ (6; 4% instances), INTJ (3; 2% instances), NUM (3; 2% instances), PROPN (2; 1% instances), DET (1; 1% instances)
PUNCT in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]