
This page pertains to UD version 2.

Treebank Statistics: UD_Polish-PDB: POS Tags: PUNCT

There are 32 PUNCT lemmas (0%), 33 PUNCT types (0%) and 57873 PUNCT tokens (17%). Out of 17 observed tags, the rank of PUNCT is: 14 in number of lemmas, 16 in number of types and 2 in number of tokens.

The 10 most frequent PUNCT lemmas: ., ,, -, “, ?, :, ), (, –, !

The 10 most frequent PUNCT types: ., ,, -, “, ?, :, ), (, –, !

The 10 most frequent ambiguous lemmas: - (PUNCT 5393, SYM 5), (PUNCT 517, ADV 1), / (PUNCT 31, ADV 1), * (SYM 2, PUNCT 1)

The 10 most frequent ambiguous types: - (PUNCT 5393, SYM 5), (PUNCT 517, ADV 1), / (PUNCT 31, ADV 1), * (SYM 2, PUNCT 1)

Morphology

The form / lemma ratio of PUNCT is 1.031250 (the average of all parts of speech is 1.966055).
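The reported ratio is consistent with dividing the 33 PUNCT types by the 32 PUNCT lemmas listed at the top of this page. The page does not state the formula, so the sketch below treats that derivation as an assumption and only checks the arithmetic.

```python
# Counts taken from the statistics at the top of this page.
punct_types = 33
punct_lemmas = 32

# Assumption: the form/lemma ratio is the number of types divided by
# the number of lemmas; the page itself does not state the formula.
form_lemma_ratio = punct_types / punct_lemmas

print(f"{form_lemma_ratio:.6f}")  # 1.031250
```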

The 1st highest number of forms (3) was observed with the lemma “””: ”, ‘’, ’’.

The 2nd highest number of forms (2) was observed with the lemma “,”: ,, ,,.

The 3rd highest number of forms (1) was observed with the lemma “!”: !.

PUNCT occurs with 2 features: PunctType (57870; 100% instances), PunctSide (1925; 3% instances)

PUNCT occurs with 13 feature-value pairs: PunctSide=Fin, PunctSide=Ini, PunctType=Brck, PunctType=Colo, PunctType=Comm, PunctType=Dash, PunctType=Elip, PunctType=Excl, PunctType=Peri, PunctType=Qest, PunctType=Quot, PunctType=Semi, PunctType=Slsh

PUNCT occurs with 15 feature combinations. The most frequent feature combination is PunctType=Peri (22866 tokens). Examples: .

Relations

PUNCT nodes are attached to their parents using 2 different relations: punct (57872; 100% instances), root (1; 0% instances)
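Relation counts like these can be recomputed from a treebank's CoNLL-U files. The following is a minimal sketch using only the standard library, not the script that generated this page; the function name `punct_relation_counts` and the two-token sample sentence are hypothetical. It relies on the standard CoNLL-U column order, where UPOS is the fourth column and DEPREL the eighth.

```python
from collections import Counter


def punct_relation_counts(conllu_text):
    """Count the DEPREL values of all tokens whose UPOS is PUNCT."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, upos, deprel = cols[0], cols[3], cols[7]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword-token ranges and empty nodes
        if upos == "PUNCT":
            counts[deprel] += 1
    return counts


# Hypothetical sample for illustration; not taken from UD_Polish-PDB.
sample = "\n".join([
    "# text = Tak.",
    "1\tTak\ttak\tPART\t_\t_\t0\troot\t_\tSpaceAfter=No",
    "2\t.\t.\tPUNCT\t_\tPunctType=Peri\t1\tpunct\t_\t_",
])

print(punct_relation_counts(sample))
```

Run over the whole treebank, a count of this kind should yield the `punct` and `root` totals reported above, assuming the same data release.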

Parents of PUNCT nodes belong to 17 different parts of speech: VERB (36576; 63% instances), NOUN (10539; 18% instances), ADJ (5193; 9% instances), PROPN (2042; 4% instances), ADV (1015; 2% instances), PUNCT (881; 2% instances), X (519; 1% instances), PRON (336; 1% instances), PART (229; 0% instances), DET (224; 0% instances), INTJ (128; 0% instances), NUM (105; 0% instances), ADP (64; 0% instances), SYM (19; 0% instances), CCONJ (1; 0% instances), an entry with no tag (1; 0% instances), SCONJ (1; 0% instances)
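The percentages in this list are the raw counts divided by the total number of PUNCT tokens, rounded to whole percent. A short sketch that reproduces them from the counts above follows; the dictionary name is illustrative, and the empty-string key stands for the one entry listed without a tag.

```python
# Parent part-of-speech counts copied from the list above.
# The "" key is the entry that appears without a tag.
parent_counts = {
    "VERB": 36576, "NOUN": 10539, "ADJ": 5193, "PROPN": 2042,
    "ADV": 1015, "PUNCT": 881, "X": 519, "PRON": 336, "PART": 229,
    "DET": 224, "INTJ": 128, "NUM": 105, "ADP": 64, "SYM": 19,
    "CCONJ": 1, "": 1, "SCONJ": 1,
}

total = sum(parent_counts.values())

# Share of each parent tag, rounded to a whole percent.
shares = {tag: round(100 * n / total) for tag, n in parent_counts.items()}

print(total)            # 57873
print(shares["VERB"])   # 63
print(shares["PROPN"])  # 4
```

The total matches the 57873 PUNCT tokens reported at the top of the page, which is why shares below 0.5% appear as 0%.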

57000 (98%) PUNCT nodes are leaves.

866 (1%) PUNCT nodes have one child.

4 (0%) PUNCT nodes have two children.

3 (0%) PUNCT nodes have three or more children.

The highest child degree of a PUNCT node is 3.

Children of PUNCT nodes are attached using 3 different relations: punct (881; 100% instances), obj (1; 0% instances), obl (1; 0% instances)

Children of PUNCT nodes belong to 3 different parts of speech: PUNCT (881; 100% instances), ADJ (1; 0% instances), PRON (1; 0% instances)
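The child-degree figures in this section can be cross-checked against each other. Because the highest child degree is 3, the "three or more children" group contains only nodes with exactly three children, so the total number of children follows directly. The sketch below uses only numbers from this page; the variable names are illustrative.

```python
# Child-degree distribution of PUNCT nodes, as reported above.
leaves = 57000
one_child = 866
two_children = 4
three_children = 3  # highest degree is 3, so "three or more" means exactly 3

# Every PUNCT node falls into exactly one degree group.
total_nodes = leaves + one_child + two_children + three_children

# Total number of children implied by the degree distribution.
total_children = 1 * one_child + 2 * two_children + 3 * three_children

# Children counted by relation: punct, obj, obl.
children_by_relation = 881 + 1 + 1

print(total_nodes)                             # 57873
print(total_children)                          # 883
print(total_children == children_by_relation)  # True
```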