
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic: POS Tags: PUNCT

There are 21 PUNCT lemmas (0%), 21 PUNCT types (0%) and 22450 PUNCT tokens (8%). Out of 16 observed tags, the rank of PUNCT is: 11 in number of lemmas, 14 in number of types and 5 in number of tokens.

The 10 most frequent PUNCT lemmas: ., ،, “, -, ), (, /, », «, :

The 10 most frequent PUNCT types: ., ،, “, -, ), (, /, », «, :

The 10 most frequent ambiguous lemmas: / (PUNCT 754, SYM 14), ر (PUNCT 22, X 20)

The 10 most frequent ambiguous types: / (PUNCT 754, SYM 14), ر (PUNCT 22, X 20)
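
The ambiguity counts above can be reproduced by tallying, for each lemma or type, how often it occurs with each tag, and keeping the items seen with PUNCT and at least one other tag. The following is a minimal sketch under assumptions: the input is a plain list of (item, tag) pairs, and the helper name ambiguous_items and the toy data are hypothetical, not part of the UD tooling or taken from the treebank.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, tag="PUNCT"):
    """Return items seen with `tag` and at least one other tag.

    `pairs` is an iterable of (item, upos) tuples, where item is a lemma
    or a word form. The result is sorted by the item's count under `tag`,
    most frequent first.
    """
    tags_by_item = defaultdict(Counter)
    for item, upos in pairs:
        tags_by_item[item][upos] += 1
    ambiguous = {
        item: dict(counts)
        for item, counts in tags_by_item.items()
        if tag in counts and len(counts) > 1
    }
    return sorted(ambiguous.items(), key=lambda kv: -kv[1][tag])


# Toy data for illustration only (hypothetical counts).
pairs = [("/", "PUNCT")] * 3 + [("/", "SYM")] + [(".", "PUNCT")] * 2
result = ambiguous_items(pairs)
print(result)
```

With the toy data, only "/" is reported, because "." occurs with a single tag.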

Morphology

The form / lemma ratio of PUNCT is 1.000000 (the average of all parts of speech is 1.685281).
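
As a rough illustration of the ratio, the sketch below assumes it is the number of distinct forms summed over lemmas, divided by the number of distinct lemmas; this reading is consistent with the 21 types and 21 lemmas reported for PUNCT, but the exact definition used by the statistics generator (for example, whether forms are case-folded) is an assumption. The function name form_lemma_ratio and the sample data are hypothetical.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """Average number of distinct forms per lemma.

    `pairs` is an iterable of (form, lemma) tuples for one part of speech.
    Returns 0.0 when there are no lemmas.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


# Toy data for illustration only: every lemma has exactly one form,
# as is the case for PUNCT in this treebank.
punct = [(".", "."), (".", "."), ("!", "!"), ("(", "(")]
print(form_lemma_ratio(punct))
```

A ratio of exactly 1 means that no lemma appears with more than one form, which is expected for punctuation.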

The 1st highest number of forms (1) was observed with the lemma “!”: !.

The 2nd highest number of forms (1) was observed with the lemma “””: “.

The 3rd highest number of forms (1) was observed with the lemma “(”: (.

PUNCT does not occur with any features.

Relations

PUNCT nodes are attached to their parents using 3 different relations: punct (22444; 100% instances), root (5; 0% instances), cop (1; 0% instances)

Parents of PUNCT nodes belong to 16 different parts of speech: NOUN (5913; 26% instances), CCONJ (4808; 21% instances), VERB (3528; 16% instances), X (3431; 15% instances), ADJ (2227; 10% instances), NUM (1558; 7% instances), PRON (352; 2% instances), PUNCT (351; 2% instances), DET (104; 0% instances), ADV (87; 0% instances), PART (60; 0% instances), ADP (17; 0% instances), the artificial root node, which has no tag (5; 0% instances), PROPN (4; 0% instances), INTJ (3; 0% instances), AUX (2; 0% instances)
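
Counts like the two above can be gathered from a CoNLL-U file by reading the UPOS, HEAD and DEPREL columns of every PUNCT token and looking up the tag of its head. The sketch below is a simplified reader, not the script that produced this page: it skips comment lines, multiword token ranges and empty nodes, and the function name punct_attachment_stats, the "<root>" label and the placeholder sentence are assumptions made for illustration.

```python
from collections import Counter


def punct_attachment_stats(conllu_text):
    """Count relations and parent tags of PUNCT nodes in CoNLL-U text."""
    deprels = Counter()
    parent_upos = Counter()
    # Sentences are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        rows = {}
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip comments such as "# text = ..."
            cols = line.split("\t")
            # Keep only regular tokens: IDs like "3-4" (multiword tokens)
            # and "3.1" (empty nodes) are not plain integers.
            if len(cols) != 10 or not cols[0].isdigit():
                continue
            rows[cols[0]] = cols
        for cols in rows.values():
            if cols[3] != "PUNCT":  # column 4 is UPOS
                continue
            deprels[cols[7]] += 1  # column 8 is DEPREL
            head = cols[6]  # column 7 is HEAD; "0" marks the root
            parent = rows[head][3] if head in rows else "<root>"
            parent_upos[parent] += 1
    return deprels, parent_upos


# Placeholder sentence for illustration only (w1, w2 are dummy words).
sample = "\n".join("\t".join(row) for row in [
    ["1", "w1", "w1", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "w2", "w2", "NOUN", "_", "_", "1", "obj", "_", "_"],
    ["3", ",", ",", "PUNCT", "_", "_", "2", "punct", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
])
deprels, parent_upos = punct_attachment_stats(sample)
print(deprels, parent_upos)
```

A PUNCT token whose HEAD is 0 has no parent token, so the sketch files it under the "<root>" label.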

22100 (98%) PUNCT nodes are leaves.

334 (1%) PUNCT nodes have one child.

7 (0%) PUNCT nodes have two children.

9 (0%) PUNCT nodes have three or more children.

The highest child degree of a PUNCT node is 4.
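
The four child-count figures above add up to all 22450 PUNCT tokens (22100 + 334 + 7 + 9). A distribution of this kind can be computed by counting, per sentence, how many tokens name each node as their head. The sketch below assumes each sentence is already available as a list of (id, upos, head) tuples; the function name punct_child_degrees and the toy sentence are hypothetical.

```python
from collections import Counter


def punct_child_degrees(sentences):
    """Map child count -> number of PUNCT nodes with that many children.

    Each sentence is a list of (node_id, upos, head_id) tuples, with
    head_id 0 for the root.
    """
    histogram = Counter()
    for sentence in sentences:
        # Number of children of each node = how often it appears as a head.
        children = Counter(head for _, _, head in sentence)
        for node_id, upos, _ in sentence:
            if upos == "PUNCT":
                histogram[children[node_id]] += 1
    return histogram


# Toy sentence for illustration only: node 2 is a PUNCT with two children.
toy = [[
    (1, "VERB", 0),
    (2, "PUNCT", 1),
    (3, "PUNCT", 2),
    (4, "NOUN", 2),
    (5, "PUNCT", 1),
]]
degrees = punct_child_degrees(toy)
print(dict(degrees), "highest degree:", max(degrees))
```

Leaves are the nodes counted under degree 0, and the highest child degree is the largest key in the histogram.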

Children of PUNCT nodes are attached using 9 different relations: punct (351; 93% instances), obl:arg (6; 2% instances), dep (5; 1% instances), obl (5; 1% instances), nmod (4; 1% instances), parataxis (3; 1% instances), amod (2; 1% instances), advcl (1; 0% instances), nsubj (1; 0% instances)

Children of PUNCT nodes belong to 7 different parts of speech: PUNCT (351; 93% instances), NOUN (12; 3% instances), NUM (8; 2% instances), VERB (3; 1% instances), ADJ (2; 1% instances), ADP (1; 0% instances), X (1; 0% instances)