
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PDTC: POS Tags: X

There are 7972 X lemmas (9%), 8257 X types (4%) and 47618 X tokens (1%). Among the 17 observed tags, X ranks 4th in number of lemmas, 5th in number of types and 15th in number of tokens.

The 10 most frequent X lemmas: Corp, Inc, co, New, s, Wall, street, San, International, de

The 10 most frequent X types: Corp, Inc, co, New, s, Wall, street, San, International, de

The 10 most frequent ambiguous lemmas: co (PRON 6414, SCONJ 808, ADV 382, PART 26, X 3), s (ADP 25778, X 607, NOUN 34), Wall (X 357, PROPN 15), American (X 327, PROPN 7), bank (NOUN 1, X 1), York (PROPN 492, X 298), National (X 229, PROPN 3), Los (X 181, PROPN 1), Jersey (X 161, PROPN 16), Lynch (X 149, PROPN 17)

The 10 most frequent ambiguous types: co (PRON 3565, SCONJ 794, ADV 368, PART 14, X 3), s (ADP 20009, X 607, NOUN 368, ADJ 7), Wall (X 356, PROPN 9), American (X 323, PROPN 7), bank (NOUN 289, X 1), York (X 293, PROPN 22), National (X 224, PROPN 3), Los (X 180, PROPN 1), Jersey (X 160, PROPN 16), Lynch (X 136, PROPN 15)

Morphology

The form / lemma ratio of X is 1.035750 (the average of all parts of speech is 2.169184).

Ranked by number of forms, the top three X lemmas are “Grid” (3 forms: GRiD, GRid, Grid), “Macmillan” (3 forms: MACMILLAN, MacMillan, Macmillan) and “Ace” (2 forms: ACE, Ace).
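The form / lemma ratio reported in this section can be illustrated with a small sketch. The page does not state the exact formula, so the code below assumes the ratio is the number of distinct forms per lemma, summed over lemmas and divided by the number of lemmas. The function name `form_lemma_ratio` and the toy input are hypothetical; the forms themselves are the ones listed above for “Grid” and “Ace”.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """Return (ratio, forms_by_lemma) for an iterable of (form, lemma) pairs.

    Assumption: ratio = total distinct forms per lemma / number of lemmas.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma), forms_by_lemma


# Toy input: one (form, lemma) pair per token; repeated tokens are harmless.
pairs = [
    ("GRiD", "Grid"), ("GRid", "Grid"), ("Grid", "Grid"), ("Grid", "Grid"),
    ("ACE", "Ace"), ("Ace", "Ace"),
]
ratio, forms = form_lemma_ratio(pairs)
print(ratio)  # 5 distinct forms over 2 lemmas

# The published counts for X (8257 types, 7972 lemmas) are consistent
# with the same division.
published = 8257 / 7972
print(f"{published:.6f}")
```

Under this assumption, the low ratio for X compared with the average over all parts of speech reflects that foreign-language tokens are not inflected in Czech text; most of the variation seen here is in capitalization.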

X occurs with 2 features: Foreign (47618; 100% instances), ExtPos (56; 0% instances)

X occurs with 3 feature-value pairs: ExtPos=ADP, ExtPos=ADV, Foreign=Yes

X occurs with 3 feature combinations. The most frequent feature combination is Foreign=Yes (47562 tokens). Examples: Corp, Inc, co, New, s, Wall, street, San, International, of

Relations

X nodes are attached to their parents using 27 different relations: flat (26175; 55% instances), nmod (16853; 35% instances), conj (1335; 3% instances), nsubj (846; 2% instances), obl (501; 1% instances), appos (458; 1% instances), root (432; 1% instances), obj (313; 1% instances), parataxis (214; 0% instances), dep (106; 0% instances), cc (76; 0% instances), obl:arg (62; 0% instances), fixed (56; 0% instances), advmod (50; 0% instances), orphan (30; 0% instances), advcl:pred (28; 0% instances), nsubj:pass (27; 0% instances), advcl (23; 0% instances), iobj (14; 0% instances), case (5; 0% instances), ccomp (5; 0% instances), acl (2; 0% instances), acl:relcl (2; 0% instances), xcomp (2; 0% instances), advmod:emph (1; 0% instances), amod (1; 0% instances), vocative (1; 0% instances)

Parents of X nodes belong to 14 different parts of speech, counting the artificial root (which has no tag, shown here as [root]) as one: X (27201; 57% instances), NOUN (15748; 33% instances), PROPN (1946; 4% instances), VERB (1497; 3% instances), [root] (432; 1% instances), ADJ (346; 1% instances), NUM (343; 1% instances), ADV (36; 0% instances), AUX (32; 0% instances), DET (19; 0% instances), INTJ (5; 0% instances), PART (5; 0% instances), PRON (5; 0% instances), SYM (3; 0% instances)
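Counts like the relation and parent-POS tallies above can be reproduced from a CoNLL-U file with a few lines of code. The following is a minimal sketch, not the script that generated this page: the helper names `sentences` and `x_attachment_stats` are hypothetical, the parser skips multiword-token ranges and empty nodes, and the sample sentence with its annotation is invented for illustration rather than taken from the treebank.

```python
from collections import Counter

# Invented toy sentence in CoNLL-U format (10 tab-separated columns).
SAMPLE = """\
# text = Akcie Wall Street klesly .
1\tAkcie\takcie\tNOUN\t_\t_\t4\tnsubj\t_\t_
2\tWall\tWall\tX\t_\tForeign=Yes\t1\tnmod\t_\t_
3\tStreet\tstreet\tX\t_\tForeign=Yes\t2\tflat\t_\t_
4\tklesly\tklesnout\tVERB\t_\t_\t0\troot\t_\t_
5\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_
"""


def sentences(conllu):
    """Yield each sentence as a list of column lists (basic tokens only)."""
    sent = []
    for line in conllu.splitlines():
        if not line.strip():  # blank line ends a sentence
            if sent:
                yield sent
                sent = []
            continue
        if line.startswith("#"):  # comment line
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword ranges ("1-2"), empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sent.append(cols)
    if sent:
        yield sent


def x_attachment_stats(conllu, tag="X"):
    """Count incoming relations and parent UPOS tags for nodes with `tag`."""
    relations = Counter()
    parent_pos = Counter()
    for sent in sentences(conllu):
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            if cols[3] != tag:
                continue
            relations[cols[7]] += 1
            # HEAD "0" has no token, so the parent is the artificial root.
            parent_pos[upos_by_id.get(cols[6], "[root]")] += 1
    return relations, parent_pos


relations, parent_pos = x_attachment_stats(SAMPLE)
print(relations.most_common())
print(parent_pos.most_common())
```

In the toy sentence, the first X token attaches to a NOUN via nmod and the second attaches to the first via flat, which mirrors the two dominant patterns in the statistics above.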

28156 (59%) X nodes are leaves.

6841 (14%) X nodes have one child.

5444 (11%) X nodes have two children.

7177 (15%) X nodes have three or more children.

The highest child degree of an X node is 28.
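The child-degree breakdown above can be computed by counting, for each node, how many other nodes name it as their head. The sketch below assumes each sentence is available as a list of `(id, head, upos)` tuples; the function name `child_degree_buckets` and the toy sentence are hypothetical. It also checks that the four reported counts add up to the X token total and reproduce the rounded percentages.

```python
from collections import Counter


def child_degree_buckets(sentence, tag="X"):
    """Bucket nodes with `tag` by number of children: 0, 1, 2, or 3 (= 3+)."""
    n_children = Counter(head for _, head, _ in sentence)
    buckets = Counter()
    for tok_id, _, upos in sentence:
        if upos == tag:
            buckets[min(n_children[tok_id], 3)] += 1
    return buckets


# Invented toy sentence: node 2 (X) governs node 3 (X); node 3 is a leaf.
toy = [(1, 4, "NOUN"), (2, 1, "X"), (3, 2, "X"), (4, 0, "VERB"), (5, 4, "PUNCT")]
buckets = child_degree_buckets(toy)
print(dict(buckets))

# Consistency check on the figures reported above.
reported = {
    "leaves": 28156,
    "one child": 6841,
    "two children": 5444,
    "three or more": 7177,
}
total = sum(reported.values())
shares = {name: round(100 * count / total) for name, count in reported.items()}
print(total, shares)
```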

Children of X nodes are attached using 30 different relations: flat (26169; 57% instances), punct (10753; 24% instances), nmod (2511; 6% instances), case (1654; 4% instances), conj (1355; 3% instances), cc (814; 2% instances), appos (726; 2% instances), nummod (403; 1% instances), amod (382; 1% instances), parataxis (212; 0% instances), advmod:emph (104; 0% instances), acl:relcl (99; 0% instances), mark (98; 0% instances), cop (60; 0% instances), dep (56; 0% instances), fixed (56; 0% instances), nsubj (46; 0% instances), det (42; 0% instances), advmod (29; 0% instances), obl (10; 0% instances), orphan (10; 0% instances), advcl (8; 0% instances), acl (7; 0% instances), aux (5; 0% instances), obj (3; 0% instances), advcl:pred (2; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), obl:arg (1; 0% instances), vocative (1; 0% instances)

Children of X nodes belong to 17 different parts of speech: X (27201; 60% instances), PUNCT (10753; 24% instances), NOUN (2933; 6% instances), ADP (1655; 4% instances), PROPN (825; 2% instances), CCONJ (784; 2% instances), NUM (506; 1% instances), ADJ (418; 1% instances), VERB (157; 0% instances), ADV (82; 0% instances), PART (82; 0% instances), DET (70; 0% instances), SCONJ (68; 0% instances), AUX (66; 0% instances), SYM (8; 0% instances), PRON (7; 0% instances), INTJ (3; 0% instances)