PUNCT

This page pertains to UD version 2.

`PUNCT`: punctuation

Definition

Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.

Punctuation is not taken to include logograms such as $, %, and §, which are instead tagged as SYM.
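The PUNCT/SYM distinction above can be illustrated with a small heuristic. This is only a sketch, not part of the UD guidelines: the function name `coarse_tag` and the exception set `SYM_CHARS` are hypothetical, and the set holds just the three logograms named above. An explicit list is used because Unicode general categories alone do not reproduce the distinction (for example, % and § are in category Po, the same category as the comma).

```python
import unicodedata
from typing import Optional

# Hypothetical exception list: only the logograms named in the text above.
SYM_CHARS = {"$", "%", "§"}

def coarse_tag(token: str) -> Optional[str]:
    """Return "SYM", "PUNCT", or None for a token (illustrative heuristic only)."""
    if not token:
        return None
    # Logograms are checked first, because some of them are Unicode punctuation.
    if all(ch in SYM_CHARS for ch in token):
        return "SYM"
    # Every character belongs to a Unicode punctuation category (Pc, Pd, Po, ...).
    if all(unicodedata.category(ch).startswith("P") for ch in token):
        return "PUNCT"
    # Anything else is left for the rest of the tagger to decide.
    return None
```

For instance, `coarse_tag("%")` yields `"SYM"` while `coarse_tag(",")` yields `"PUNCT"`; a real tagger would need a much fuller symbol inventory and context.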

Note that some punctuation marks are infixed, that is, written inside the word: the exclamation, emphasis, and question marks. Such cases are treated as multiword tokens. For example, ինչո՞ւ “why?” is split into two tokens, ինչու and ՞ (for more details see the tokenization page).
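Splitting a word with an infixed mark into two tokens can be sketched as follows. This is a minimal illustration under stated assumptions, not the tokenizer used for any treebank: the name `split_infix` and the set `INFIX_MARKS` are hypothetical, and the set contains only the three infixed marks mentioned above (exclamation U+055C, emphasis U+055B, question U+055E).

```python
from typing import List

# Assumed inventory of infixed marks: exclamation, emphasis, and question mark.
INFIX_MARKS = {"\u055C", "\u055B", "\u055E"}

def split_infix(surface: str) -> List[str]:
    """Split a surface token into the bare word followed by each infixed mark."""
    word = "".join(ch for ch in surface if ch not in INFIX_MARKS)
    marks = [ch for ch in surface if ch in INFIX_MARKS]
    # Leave the token untouched if it has no mark, or consists of marks only.
    if not word or not marks:
        return [surface]
    return [word] + marks
```

Applied to ինչո՞ւ, the sketch returns the word ինչու followed by the mark ՞, matching the two tokens described above. The mark would then receive the PUNCT tag, while the word is tagged on its own terms.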

Examples

Period: ։
Comma: ,
Parentheses: ()
Quotation mark: «»
Exclamation mark: ՜
Question mark: ՞
Emphasis mark (acute accent): ՛

PUNCT in other languages: [bej] [bg] [ca] [cs] [cy] [da] [el] [en] [es] [et] [fi] [fr] [ga] [grc] [hbo] [hy] [hyw] [it] [ja] [ka] [kk] [kpv] [ky] [myv] [naq] [no] [oge] [pt] [ru] [sl] [sv] [tr] [tt] [uk] [u] [urj] [xcl] [xmf] [yue] [zh]
