
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PDTC: POS Tags: PART

There are 148 PART lemmas (0%), 153 PART types (0%) and 64790 PART tokens (2%). Out of 17 observed tags, the rank of PART is: 8 in number of lemmas, 10 in number of types and 14 in number of tokens.

The 10 most frequent PART lemmas: i, tak, asi, také, ještě, jen, až, taky, ne, už

The 10 most frequent PART types: i, tak, asi, také, ještě, jen, až, taky, ne, už

The 10 most frequent ambiguous lemmas: i (CCONJ 6678, PART 6206, NOUN 23, X 3, ADJ 1), tak (ADV 5088, PART 4056, CCONJ 489), ještě (PART 3186, ADV 2368), jen (PART 2925, ADV 721, NOUN 423, SCONJ 5), až (PART 2708, CCONJ 1402, SCONJ 445), ne (PART 2285, ADJ 1), už (ADV 5406, PART 2200), ani (PART 1881, CCONJ 1013), například (PART 1864, ADV 120), jenom (PART 1420, ADV 62)

The 10 most frequent ambiguous types: i (PART 6200, CCONJ 5716, NOUN 23, X 3, ADJ 1), tak (ADV 4790, PART 3727, CCONJ 470), ještě (PART 2673, ADV 2142), jen (PART 2750, ADV 672, NOUN 14, SCONJ 2), až (PART 2529, CCONJ 1400, SCONJ 360), ne (PART 1609, ADJ 1), už (ADV 4692, PART 1995), ani (PART 1547, CCONJ 963), například (PART 1093, ADV 116), jenom (PART 1313, ADV 58)
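
The ambiguity lists above pair each lemma (or type) with every tag it was observed under. A minimal sketch of how such a tally could be produced from (lemma, UPOS) pairs is given below. The function name `ambiguous_lemmas` and the toy input are illustrative assumptions; this is not the script that generated the page.

```python
from collections import Counter, defaultdict

def ambiguous_lemmas(pairs, tag="PART"):
    """Return lemmas seen with `tag` and with at least one other tag.

    pairs: iterable of (lemma, upos) tuples, one per token.
    Result: list of (lemma, Counter of tags), sorted by the count of
    `tag` in descending order.
    """
    tags_by_lemma = defaultdict(Counter)
    for lemma, upos in pairs:
        tags_by_lemma[lemma][upos] += 1
    hits = [(lemma, tags) for lemma, tags in tags_by_lemma.items()
            if tag in tags and len(tags) > 1]
    return sorted(hits, key=lambda item: item[1][tag], reverse=True)

# Toy input invented for illustration only (not treebank data).
toy = [("i", "CCONJ"), ("i", "PART"), ("i", "PART"),
       ("ani", "PART"), ("ani", "CCONJ"), ("asi", "PART")]
result = ambiguous_lemmas(toy)
```

In the toy input, "asi" is excluded because it occurs with a single tag only. Passing word forms instead of lemmas would give the corresponding list of ambiguous types.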

Morphology

The form / lemma ratio of PART is 1.033784 (the average of all parts of speech is 2.169184).
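
As a quick check, the ratio can be reproduced from the counts given at the top of the page, assuming it is the number of PART types (153) divided by the number of PART lemmas (148). The helper name `form_lemma_ratio` is hypothetical.

```python
def form_lemma_ratio(n_forms, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_forms / n_lemmas

# Counts reported on this page: 153 PART types, 148 PART lemmas.
ratio = form_lemma_ratio(153, 148)
print(f"{ratio:.6f}")  # prints 1.033784
```

A ratio this close to 1 reflects that particles are almost never inflected; the few lemmas with two forms are abbreviations or spelling variants.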

The 1st highest number of forms (2) was observed with the lemma “cirka”: cca, cirka.

The 2nd highest number of forms (2) was observed with the lemma “například”: např, například.

The 3rd highest number of forms (2) was observed with the lemma “nejspíš”: nejspíš, nejspíše.

PART occurs with 3 features: Abbr (458; 1% instances), Style (387; 1% instances), ExtPos (172; 0% instances)

PART occurs with 4 feature-value pairs: Abbr=Yes, ExtPos=ADV, Style=Coll, Style=Vulg

PART occurs with 5 feature combinations. The most frequent feature combination is _ (63773 tokens). Examples: i, tak, asi, také, ještě, jen, až, taky, ne, už

Relations

PART nodes are attached to their parents using 24 different relations: advmod:emph (54352; 84% instances), root (3135; 5% instances), advmod (2404; 4% instances), nmod (2134; 3% instances), conj (1270; 2% instances), fixed (820; 1% instances), cc (307; 0% instances), appos (94; 0% instances), dep (52; 0% instances), ccomp (45; 0% instances), discourse (39; 0% instances), obj (38; 0% instances), advcl (37; 0% instances), orphan (13; 0% instances), acl (10; 0% instances), acl:relcl (7; 0% instances), parataxis (7; 0% instances), mark (6; 0% instances), nsubj (6; 0% instances), xcomp (5; 0% instances), csubj (4; 0% instances), obl:arg (3; 0% instances), advcl:pred (1; 0% instances), iobj (1; 0% instances)
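
The relation counts above could be tallied from a CoNLL-U file along the following lines. This is a sketch under stated assumptions: it relies only on the standard ten-column CoNLL-U layout (UPOS in column 4, DEPREL in column 8), the function name `deprel_counts` is hypothetical, and the sample sentence with its annotation is invented for illustration.

```python
from collections import Counter

def deprel_counts(conllu_text, upos="PART"):
    """Count incoming dependency relations of tokens with the given UPOS."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts

# Invented sample sentence, not taken from the treebank.
rows = [
    ["1", "Přišel", "přijít", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "i", "i", "PART", "_", "_", "3", "advmod:emph", "_", "_"],
    ["3", "Petr", "Petr", "PROPN", "_", "_", "1", "nsubj", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
]
sample = "# text = Přišel i Petr.\n" + "\n".join("\t".join(r) for r in rows) + "\n"

counts = deprel_counts(sample)
total = sum(counts.values())
# Percentages rounded to whole numbers, as on this page.
shares = {rel: round(100 * n / total) for rel, n in counts.items()}
```

Applied to the reported figures, the same rounding gives 54352 / 64790 ≈ 84% for advmod:emph, which is why relations with fewer than about 324 instances show up as 0%.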

Parents of PART nodes belong to 17 different parts of speech: VERB (18912; 29% instances), NOUN (17057; 26% instances), ADJ (7297; 11% instances), ADV (5830; 9% instances), NUM (4434; 7% instances), none, i.e. attached to the artificial root (3135; 5% instances), DET (3099; 5% instances), PROPN (1694; 3% instances), PRON (1660; 3% instances), CCONJ (739; 1% instances), PART (711; 1% instances), X (82; 0% instances), AUX (80; 0% instances), SYM (44; 0% instances), INTJ (10; 0% instances), ADP (4; 0% instances), SCONJ (2; 0% instances)

56616 (87%) PART nodes are leaves.

5592 (9%) PART nodes have one child.

1031 (2%) PART nodes have two children.

1551 (2%) PART nodes have three or more children.

The highest child degree of a PART node is 10.
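
The child-degree figures above (leaves, one child, two children, three or more) can be derived from the HEAD column alone. The sketch below assumes each sentence is already reduced to (id, upos, head) tuples; the function name `child_degree_buckets` and the toy sentences are illustrative assumptions.

```python
from collections import Counter

def child_degree_buckets(sentences, upos="PART"):
    """Bucket nodes of the given UPOS by their number of children.

    sentences: list of sentences, each a list of (id, upos, head) tuples.
    Returns a Counter with keys "0", "1", "2" and "3+".
    """
    buckets = Counter()
    for sent in sentences:
        # Number of children per node id = how often the id occurs as a head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tok_upos, _ in sent:
            if tok_upos == upos:
                degree = n_children[tok_id]
                buckets["3+" if degree >= 3 else str(degree)] += 1
    return buckets

# Toy sentences invented for illustration only.
toy = [
    [(1, "VERB", 0), (2, "PART", 3), (3, "PROPN", 1), (4, "PUNCT", 1)],
    [(1, "PART", 0), (2, "PUNCT", 1)],
]
buckets = child_degree_buckets(toy)

# The four buckets reported on this page add up to all 64790 PART tokens.
reported = [56616, 5592, 1031, 1551]
assert sum(reported) == 64790
```

In the toy data the first PART node is a leaf and the second, a root particle with attached punctuation, has one child.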

Children of PART nodes are attached using 30 different relations: punct (8128; 57% instances), conj (933; 7% instances), cop (907; 6% instances), dep (756; 5% instances), nsubj (576; 4% instances), cc (492; 3% instances), advmod (429; 3% instances), obl (412; 3% instances), advmod:emph (402; 3% instances), advcl (397; 3% instances), mark (276; 2% instances), fixed (172; 1% instances), aux (113; 1% instances), obj (69; 0% instances), csubj (67; 0% instances), obl:arg (55; 0% instances), orphan (43; 0% instances), appos (25; 0% instances), ccomp (20; 0% instances), parataxis (18; 0% instances), amod (16; 0% instances), det (10; 0% instances), advcl:pred (9; 0% instances), case (9; 0% instances), discourse (8; 0% instances), nmod (4; 0% instances), vocative (2; 0% instances), xcomp (2; 0% instances), expl:pass (1; 0% instances), iobj (1; 0% instances)

Children of PART nodes belong to 17 different parts of speech: PUNCT (8128; 57% instances), NOUN (1254; 9% instances), AUX (1039; 7% instances), VERB (833; 6% instances), ADV (818; 6% instances), PART (711; 5% instances), CCONJ (485; 3% instances), DET (299; 2% instances), SCONJ (242; 2% instances), ADJ (199; 1% instances), PRON (158; 1% instances), PROPN (124; 1% instances), NUM (39; 0% instances), ADP (10; 0% instances), INTJ (5; 0% instances), X (5; 0% instances), SYM (3; 0% instances)