This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

PUNCT: punctuation

Definition

Punctuation marks in Ancient Greek texts have generally been added by modern editors. Four main punctuation marks are found in modern editions: the comma (COMMA “U+002C”), the period (FULL STOP “U+002E”), the point above the line (MIDDLE DOT “U+00B7”, corresponding to an English colon or semicolon), and the question mark (SEMICOLON “U+003B”).

Elision (Smyth 1920: 23-24) is marked by the apostrophe (COMBINING COMMA ABOVE “U+0313”). Crasis (Smyth 1920: 22-23) and aphaeresis (Smyth 1920: 24) are signaled by a smooth breathing (also COMBINING COMMA ABOVE “U+0313”), which either stands on the vowel or diphthong resulting from crasis or, in aphaeresis, takes the place of an elided ε at the beginning of a word.
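
The code points named above can be checked against the Unicode character database. The following is a minimal sketch using Python's standard `unicodedata` module; the `MARKS` dictionary and the helper names `describe` and `is_combining` are illustrative and not part of the treebank tooling. The last two lines rely on the fact that the dedicated Greek characters U+037E (GREEK QUESTION MARK) and U+0387 (GREEK ANO TELEIA) are canonically equivalent to U+003B and U+00B7, so normalized text ends up with the code points listed in the definition.

```python
import unicodedata

# Code points named in the Definition section (labels are illustrative).
MARKS = {
    "comma": "\u002C",
    "period": "\u002E",
    "point above the line": "\u00B7",
    "question mark": "\u003B",
    "elision / crasis / aphaeresis mark": "\u0313",
}

def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def is_combining(ch):
    """True if the character is a combining mark (non-zero combining class)."""
    return unicodedata.combining(ch) != 0

for role, ch in MARKS.items():
    print(f"{role}: {describe(ch)} (combining: {is_combining(ch)})")

# The dedicated Greek characters normalize to the generic ones.
greek_question_mark = unicodedata.normalize("NFC", "\u037E")
greek_ano_teleia = unicodedata.normalize("NFC", "\u0387")
```

Because U+0313 is a combining character, it attaches visually to whatever precedes it, which is worth keeping in mind when reading the frequency lists further down.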

References

Smyth, Herbert Weir. 1920. A Greek Grammar for Colleges. New York: American Book Company (Perseus Digital Library; Internet Archive).


Treebank Statistics (UD_Ancient_Greek)

There are 16 PUNCT lemmas (0%), 16 PUNCT types (0%) and 30470 PUNCT tokens (12%). Out of 13 observed tags, the rank of PUNCT is: 10 in number of lemmas, 12 in number of types and 4 in number of tokens.

The 10 most frequent PUNCT lemmas: ,, ., ·, ;, “, ̓, -, ], ;”, [

The 10 most frequent PUNCT types: ,, ., ·, ;, “, ̓, -, ], ;”, [

The 10 most frequent ambiguous lemmas: (PUNCT 625, X 6)

The 10 most frequent ambiguous types: (PUNCT 625, X 6)

Morphology

The form / lemma ratio of PUNCT is 1.000000 (the average of all parts of speech is 3.041201).
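
As an illustration of how such counts can be obtained, here is a minimal sketch that reads CoNLL-U text and counts lemmas, types and tokens for one part of speech. It assumes that the form / lemma ratio is the number of distinct forms divided by the number of distinct lemmas, which is consistent with the figures above (16 types / 16 lemmas = 1.0) but is not confirmed by this page. The function name `upos_stats` and the sample rows are hypothetical toy data, not taken from the treebank.

```python
def upos_stats(conllu_text, upos="PUNCT"):
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag."""
    forms, lemmas, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip malformed lines and multiword-token ranges
        if cols[3] == upos:  # column 4 holds the universal POS tag
            tokens += 1
            forms.add(cols[1])
            lemmas.add(cols[2])
    ratio = len(forms) / len(lemmas) if lemmas else 0.0
    return {"lemmas": len(lemmas), "types": len(forms),
            "tokens": tokens, "ratio": ratio}

# Toy input (hypothetical): ID, FORM, LEMMA, UPOSTAG, then the remaining columns.
rows = [
    ["1", "λέγει", "λέγω", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", ",", ",", "PUNCT", "_", "_", "1", "punct", "_", "_"],
    ["3", "ἀκούει", "ἀκούω", "VERB", "_", "_", "1", "conj", "_", "_"],
    ["4", ",", ",", "PUNCT", "_", "_", "1", "punct", "_", "_"],
    ["5", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in rows)

punct = upos_stats(SAMPLE)
verb = upos_stats(SAMPLE, upos="VERB")
```

A ratio of exactly 1 means that every PUNCT lemma appears in a single form, which is expected for a class without inflection.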

The 1st highest number of forms (1) was observed with the lemma “””: .

The 2nd highest number of forms (1) was observed with the lemma “,”: ,.

The 3rd highest number of forms (1) was observed with the lemma “-”: -.

PUNCT does not occur with any features.

Relations

PUNCT nodes are attached to their parents using 8 different relations: punct (29938; 98% instances), cc (464; 2% instances), advmod (45; 0% instances), appos (11; 0% instances), mark (9; 0% instances), amod (1; 0% instances), conj (1; 0% instances), root (1; 0% instances)
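
The relation counts above can be cross-checked against the token total reported earlier (30470). The sketch below only re-uses numbers from this page; the rounding to whole percentages is an assumption about how the displayed shares were produced.

```python
# Counts of incoming relations for PUNCT nodes, copied from the list above.
relations = {
    "punct": 29938, "cc": 464, "advmod": 45, "appos": 11,
    "mark": 9, "amod": 1, "conj": 1, "root": 1,
}

total = sum(relations.values())

# Share of each relation, rounded to a whole percentage.
shares = {rel: round(100 * n / total) for rel, n in relations.items()}

for rel, n in relations.items():
    print(f"{rel}: {n} ({shares[rel]}%)")
```

The counts sum to the number of PUNCT tokens, so every PUNCT node is covered by exactly one of the eight relations.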

Parents of PUNCT nodes belong to 13 different parts of speech: VERB (19439; 64% instances), NOUN (6234; 20% instances), ADJ (2861; 9% instances), ADV (839; 3% instances), PRON (740; 2% instances), ADP (93; 0% instances), PUNCT (92; 0% instances), CONJ (73; 0% instances), SCONJ (49; 0% instances), NUM (19; 0% instances), DET (18; 0% instances), INTJ (12; 0% instances), ROOT (1; 0% instances)

30129 (99%) PUNCT nodes are leaves.

23 (0%) PUNCT nodes have one child.

183 (1%) PUNCT nodes have two children.

135 (0%) PUNCT nodes have three or more children.

The highest child degree of a PUNCT node is 10.
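
The child degree of a node is the number of tokens whose HEAD points to it. A minimal sketch of how the distribution above could be computed for one sentence follows; the function names and the toy sentence are hypothetical, and the tuple layout `(token_id, head_id, upos)` is chosen only for this example.

```python
from collections import Counter

def punct_child_degrees(sentence):
    """Map each PUNCT token id to its number of children.

    sentence: list of (token_id, head_id, upos) tuples for one tree.
    """
    n_children = Counter(head for _, head, _ in sentence)
    return {tid: n_children.get(tid, 0)
            for tid, _, upos in sentence if upos == "PUNCT"}

def bucket(degrees):
    """Group child degrees into the four classes used in the statistics."""
    labels = {0: "leaf", 1: "one child", 2: "two children"}
    return Counter(labels.get(d, "three or more children") for d in degrees)

# Toy tree (hypothetical): token 3 is a punctuation mark with two children,
# token 5 is a punctuation mark without children.
toy = [
    (1, 0, "VERB"),
    (2, 3, "NOUN"),
    (3, 1, "PUNCT"),
    (4, 3, "NOUN"),
    (5, 1, "PUNCT"),
]

degrees = punct_child_degrees(toy)
summary = bucket(degrees.values())

# The four classes reported above add up to the PUNCT token total.
reported_total = 30129 + 23 + 183 + 135
```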

Children of PUNCT nodes are attached using 15 different relations: nmod (132; 15% instances), case (126; 15% instances), advmod (112; 13% instances), nsubj (107; 12% instances), mark (101; 12% instances), punct (91; 10% instances), xcomp (68; 8% instances), dobj (54; 6% instances), advcl (26; 3% instances), neg (16; 2% instances), iobj (13; 1% instances), amod (7; 1% instances), cc (6; 1% instances), conj (5; 1% instances), acl (3; 0% instances)

Children of PUNCT nodes belong to 10 different parts of speech: NOUN (238; 27% instances), ADP (113; 13% instances), ADV (113; 13% instances), PUNCT (92; 11% instances), SCONJ (83; 10% instances), ADJ (82; 9% instances), VERB (82; 9% instances), PRON (45; 5% instances), CONJ (18; 2% instances), DET (1; 0% instances)


PUNCT in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]