This page pertains to UD version 2.

Treebank Statistics: UD_Persian-PerDT: POS Tags: PUNCT

There are 26 PUNCT lemmas (0%), 25 PUNCT types (0%) and 44336 PUNCT tokens (9%). Out of 16 observed tags, the rank of PUNCT is: 12 in number of lemmas, 13 in number of types and 4 in number of tokens.
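The three counts above can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that a "type" is a distinct word form and that multiword-token ranges and empty nodes are skipped; the sample data and the names `read_words` and `tag_counts` are hypothetical and are not part of the tool that generated this page.

```python
# Hypothetical two-sentence CoNLL-U sample (not taken from UD_Persian-PerDT).
SAMPLE = """\
1\tسلام\tسلام\tINTJ\t_\t_\t0\troot\t_\t_
2\t!\t!\tPUNCT\t_\t_\t1\tpunct\t_\t_

1\tآمد\tآمد\tVERB\t_\t_\t0\troot\t_\t_
2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def read_words(conllu_text):
    """Yield (form, lemma, upos) for each regular word line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        yield cols[1], cols[2], cols[3]


def tag_counts(conllu_text, upos):
    """Count distinct lemmas, distinct forms (types) and tokens for one UPOS tag."""
    words = [w for w in read_words(conllu_text) if w[2] == upos]
    return {
        "lemmas": len({lemma for _, lemma, _ in words}),
        "types": len({form for form, _, _ in words}),
        "tokens": len(words),
    }


punct_counts = tag_counts(SAMPLE, "PUNCT")
print(punct_counts)
```

Ranking PUNCT among the 16 observed tags would then amount to computing the same counts for every tag and sorting.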

The 10 most frequent PUNCT lemmas: .، ،، ؟، (، )، “، !، «، »، :

The 10 most frequent PUNCT types: .، ،، ؟، (، )، “، !، «، »، :

The most frequent ambiguous lemmas: ، (PUNCT 10574, CCONJ 3, NOUN 1), - (PUNCT 136, CCONJ 3), … (PUNCT 16, NOUN 1), ٬ (PUNCT 4, NOUN 1), % (PUNCT 3, NOUN 2), و (CCONJ 20347, NOUN 5, PUNCT 2, ADJ 1, ADP 1), _ (VERB 3, NOUN 2, SCONJ 2, CCONJ 1, PRON 1, PUNCT 1), سطح (NOUN 139, PUNCT 1), چیز (NOUN 388, PUNCT 1)

The most frequent ambiguous types: ، (PUNCT 10576, CCONJ 3, NOUN 1), - (PUNCT 136, CCONJ 3), … (PUNCT 17, NOUN 1), ٬ (PUNCT 4, NOUN 1), % (PUNCT 3, NOUN 2), و (CCONJ 20348, NOUN 5, PRON 4, PUNCT 2, ADJ 1, ADP 1), سطح (NOUN 117, PUNCT 1), چیزی (NOUN 220, PUNCT 1)
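An item is ambiguous here when it occurs with PUNCT and with at least one other tag. The sketch below shows one way such a list could be built, assuming the ordering is by frequency with the tag of interest (which is consistent with the lists above); the observations and the name `ambiguous_items` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (item, UPOS) observations; real input would come from a CoNLL-U file.
observations = [
    ("،", "PUNCT"), ("،", "PUNCT"), ("،", "CCONJ"),
    (".", "PUNCT"),
    ("-", "PUNCT"), ("-", "CCONJ"),
]


def ambiguous_items(pairs, upos):
    """Return items seen with `upos` and at least one other tag, most frequent first."""
    per_item = defaultdict(Counter)
    for item, tag in pairs:
        per_item[item][tag] += 1
    ambiguous = {
        item: tags
        for item, tags in per_item.items()
        if upos in tags and len(tags) > 1
    }
    # Assumption: order by how often the item carries the tag of interest.
    return sorted(ambiguous.items(), key=lambda kv: -kv[1][upos])


result = ambiguous_items(observations, "PUNCT")
for item, tags in result:
    print(item, dict(tags))
```

The same function works for lemmas or for word forms, depending on which column is passed as the item.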

Morphology

The form / lemma ratio of PUNCT is 0.961538 (the average of all parts of speech is 1.486663).
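The reported ratio is consistent with dividing the number of PUNCT types by the number of PUNCT lemmas given earlier on this page (25 and 26). A quick check, assuming that this is how the figure is derived:

```python
# Counts taken from this page; the derivation itself is an assumption.
punct_types = 25
punct_lemmas = 26

ratio = punct_types / punct_lemmas
print(f"{ratio:.6f}")
```

A value below 1 means there are fewer distinct forms than distinct lemmas, which can happen when the same form is annotated with more than one lemma.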

The highest number of forms (3) was observed with the lemma “.”: ., ،, ….

The second-highest number of forms (2) was observed with the lemma “؛”: ،, ؛.

The third-highest number of forms (1) was observed with the lemma “!”: !.

PUNCT occurs with 1 feature: Number (2 tokens; 0% of instances)

PUNCT occurs with 1 feature-value pair: Number=Sing

PUNCT occurs with 2 feature combinations. The most frequent feature combination is _ (44334 tokens). Examples: .، ،، ؟، (، )، “، !، «، »، :

Relations

PUNCT nodes are attached to their parents using 1 relation: punct (44336; 100% of instances)

Parents of PUNCT nodes belong to 14 different parts of speech: VERB (30540; 69% of instances), NOUN (7399; 17% of instances), ADJ (2260; 5% of instances), PROPN (1896; 4% of instances), AUX (982; 2% of instances), PRON (641; 1% of instances), INTJ (281; 1% of instances), ADV (99; 0% of instances), SCONJ (77; 0% of instances), ADP (73; 0% of instances), CCONJ (49; 0% of instances), NUM (30; 0% of instances), DET (6; 0% of instances), PART (3; 0% of instances)
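The percentages follow from the raw counts, which add up to the 44336 PUNCT tokens. A small sketch, assuming that the shares are rounded to the nearest whole percent:

```python
# Parent part-of-speech counts as listed on this page.
parent_counts = {
    "VERB": 30540, "NOUN": 7399, "ADJ": 2260, "PROPN": 1896,
    "AUX": 982, "PRON": 641, "INTJ": 281, "ADV": 99,
    "SCONJ": 77, "ADP": 73, "CCONJ": 49, "NUM": 30,
    "DET": 6, "PART": 3,
}

total = sum(parent_counts.values())

# Assumption: each share is rounded to the nearest integer percent.
shares = {pos: round(100 * n / total) for pos, n in parent_counts.items()}

for pos, n in parent_counts.items():
    print(f"{pos}: {n} ({shares[pos]}%)")
```

Because of the rounding, parts of speech with fewer than about 222 attachments are shown as 0%.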

44336 (100%) PUNCT nodes are leaves.

The highest child degree of a PUNCT node is 0.
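A node is a leaf when no other node in the sentence names it as its head, and its child degree is the number of nodes that do. The sketch below illustrates the computation on one hypothetical sentence; the tuple layout and the name `child_degrees` are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical sentence as (id, upos, head) triples; head 0 marks the root.
sentence = [
    (1, "NOUN", 2),
    (2, "VERB", 0),
    (3, "PUNCT", 2),
]


def child_degrees(nodes, upos):
    """Return the number of children of every node tagged `upos`."""
    children = Counter(head for _, _, head in nodes)
    return [children[node_id] for node_id, tag, _ in nodes if tag == upos]


degrees = child_degrees(sentence, "PUNCT")
leaf_count = sum(1 for d in degrees if d == 0)
max_degree = max(degrees, default=0)
print(leaf_count, max_degree)
```

When every PUNCT node is a leaf, as reported above, the maximum child degree is necessarily 0.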