
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-PDT: POS Tags: PROPN

There are 4725 PROPN lemmas (17%), 6531 PROPN types (12%) and 15741 PROPN tokens (5%). Out of 17 observed tags, the rank of PROPN is: 3 in number of lemmas, 4 in number of types and 7 in number of tokens.

The 10 most frequent PROPN lemmas: Praha, ČR, Německo, ODS, Evropa, LN, Jan, Jiří, Brno, Slovensko

The 10 most frequent PROPN types: Praha, ČR, ODS, Praze, LN, USA, Jiří, Jan, OSN, Václav

The 10 most frequent ambiguous lemmas: Washington (PROPN 24, X 1), Fischer (PROPN 16, X 1), York (X 20, PROPN 16), Bohemia (PROPN 15, X 2), Brod (PROPN 9, X 1), Panton (PROPN 9, X 1), Benetton (PROPN 8, X 1), Inkatha (PROPN 8, X 1), Albert (PROPN 7, X 1), Ford (PROPN 7, X 1)

The 10 most frequent ambiguous types: Plzeň (PROPN 22, NOUN 2), Nováček (PROPN 15, NOUN 1), Maďarsko (PROPN 14, ADJ 1), Bohemia (PROPN 13, X 2), C (NOUN 23, PROPN 12), Fischer (PROPN 11, X 1), Plzni (PROPN 9, NOUN 1), Škoda (PROPN 9, NOUN 4), Albert (PROPN 7, X 1), Benetton (PROPN 6, X 1)

Morphology

The form / lemma ratio of PROPN is 1.382222 (the average of all parts of speech is 1.964432).
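The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given in the summary above. The following is a minimal sketch under that assumption; the function name `form_lemma_ratio` and the toy pairs are hypothetical and are not part of the tooling that produced this page.

```python
def form_lemma_ratio(pairs):
    """Return (# distinct forms) / (# distinct lemmas) for (form, lemma) pairs."""
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Toy input (hypothetical): three forms of one lemma, one form of another.
toy_pairs = [
    ("Praha", "Praha"),
    ("Praze", "Praha"),
    ("Prahy", "Praha"),
    ("Brno", "Brno"),
]
toy_ratio = form_lemma_ratio(toy_pairs)  # 4 forms / 2 lemmas

# Counts copied from the summary above (UD_Czech-PDT, PROPN).
propn_types = 6531
propn_lemmas = 4725
reported = f"{propn_types / propn_lemmas:.6f}"
print(toy_ratio, reported)
```

Forms are compared case-sensitively here, which matches the listing of “KANADA” and “Kanada” as separate forms.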

The highest number of forms (8) was observed with the lemma “Američan”: Američan, Američana, Američanem, Američani, Američany, Američané, Američanů, Američanům.

The same number of forms (8) was observed with the lemma “Čech”: Čech, Čecha, Čechem, Čechy, Čechů, Čechům, Češi, Češích.

The next highest number of forms (7) was observed with the lemma “Kanada”: KANADA, Kan, Kanada, Kanadou, Kanadu, Kanady, Kanadě.

PROPN occurs with 9 features: NameType (15741; 100% instances), Polarity (15741; 100% instances), Gender (14282; 91% instances), Case (13840; 88% instances), Number (13840; 88% instances), Animacy (9109; 58% instances), Abbr (1457; 9% instances), Typo (14; 0% instances), Style (12; 0% instances)

PROPN occurs with 27 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Gender=Neut, NameType=Geo, NameType=Geo,Giv, NameType=Geo,Giv,Oth, NameType=Geo,Oth, NameType=Giv, NameType=Giv,Nat, NameType=Giv,Oth, NameType=Nat, NameType=Oth, Number=Plur, Number=Sing, Polarity=Pos, Style=Coll, Typo=Yes

PROPN occurs with 169 feature combinations. The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv|Number=Sing|Polarity=Pos (4546 tokens). Examples: Jiří, Jan, Václav, Vladimír, Klaus, Petr, Pavel, Josef, John, Havel
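A feature combination such as the one above is written in the CoNLL-U FEATS notation: pairs are separated by `|`, each pair is `Name=Value`, and several values of one feature are joined with commas (as in `NameType=Geo,Giv`). The following sketch shows one way to split such a string; `parse_feats` is a hypothetical helper, not a function of any UD tool.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into a dict {feature: [values]}."""
    if feats == "_":  # "_" marks an empty feature set in CoNLL-U
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        parsed[name] = value.split(",")  # comma-joined values become a list
    return parsed


most_frequent = parse_feats(
    "Animacy=Anim|Case=Nom|Gender=Masc|NameType=Giv|Number=Sing|Polarity=Pos"
)
multi_valued = parse_feats("NameType=Geo,Giv|Polarity=Pos")
print(most_frequent["Case"], multi_valued["NameType"])
```

Keeping every value as a list makes single-valued and multi-valued features uniform to process, at the cost of an extra index when only one value is expected.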

Relations

PROPN nodes are attached to their parents using 20 different relations: nmod (4423; 28% instances), flat (4004; 25% instances), nsubj (2089; 13% instances), conj (1447; 9% instances), obl (1312; 8% instances), root (1015; 6% instances), dep (576; 4% instances), obl:arg (257; 2% instances), obj (211; 1% instances), appos (176; 1% instances), orphan (101; 1% instances), nsubj:pass (48; 0% instances), iobj (47; 0% instances), advcl (19; 0% instances), xcomp (7; 0% instances), vocative (5; 0% instances), acl:relcl (1; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), parataxis (1; 0% instances)

Parents of PROPN nodes belong to 15 different parts of speech (counting the empty value for root nodes, which have no parent): NOUN (6848; 44% instances), PROPN (3704; 24% instances), VERB (3376; 21% instances), [no parent] (1015; 6% instances), ADJ (422; 3% instances), AUX (102; 1% instances), NUM (82; 1% instances), X (69; 0% instances), ADV (68; 0% instances), DET (33; 0% instances), PRON (16; 0% instances), PART (3; 0% instances), ADP (1; 0% instances), CCONJ (1; 0% instances), SYM (1; 0% instances)
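Counts like these can be gathered by tallying, for every PROPN token, its dependency relation and the part of speech of its head. The sketch below illustrates the idea on two made-up sentences; the data, the tuple layout and the name `attachment_stats` are assumptions for illustration and do not come from the treebank or its tooling. A token whose head is 0 is the root and has no parent, so its parent tag is recorded as an empty string.

```python
from collections import Counter

# Hypothetical toy sentences, not taken from the treebank.
# Each token: (id, form, upos, head, deprel); head 0 marks the root.
SENTENCES = [
    [
        (1, "Václav", "PROPN", 3, "nsubj"),
        (2, "Havel", "PROPN", 1, "flat"),
        (3, "navštívil", "VERB", 0, "root"),
        (4, "Prahu", "PROPN", 3, "obj"),
        (5, ".", "PUNCT", 3, "punct"),
    ],
    [(1, "Praha", "PROPN", 0, "root")],
]


def attachment_stats(sentences, upos="PROPN"):
    """Count incoming relations and parent tags for tokens tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    for sentence in sentences:
        tag_by_id = {tid: tag for tid, _, tag, _, _ in sentence}
        for _, _, tag, head, deprel in sentence:
            if tag != upos:
                continue
            relations[deprel] += 1
            # Root tokens (head 0) have no parent: use "" as their parent tag.
            parent_tags[tag_by_id.get(head, "")] += 1
    return relations, parent_tags


relations, parent_tags = attachment_stats(SENTENCES)
print(relations.most_common())
print(parent_tags.most_common())
```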

7462 (47%) PROPN nodes are leaves.

5099 (32%) PROPN nodes have one child.

1811 (12%) PROPN nodes have two children.

1369 (9%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 29.
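The leaf, one-child and two-children figures above form a histogram of child degree, that is, the number of tokens whose head is a given PROPN node. A minimal sketch of that computation follows, using a made-up sentence; the data and the name `child_degree_histogram` are hypothetical and only illustrate the counting.

```python
from collections import Counter

# Hypothetical toy sentence, not taken from the treebank.
# Each token: (id, form, upos, head, deprel); head 0 marks the root.
SENTENCE = [
    (1, "Václav", "PROPN", 3, "nsubj"),
    (2, "Havel", "PROPN", 1, "flat"),
    (3, "navštívil", "VERB", 0, "root"),
    (4, "Prahu", "PROPN", 3, "obj"),
    (5, ".", "PUNCT", 3, "punct"),
]


def child_degree_histogram(sentence, upos="PROPN"):
    """Map child degree -> number of `upos` tokens with that many children."""
    # How many tokens point at each head id.
    children = Counter(head for _, _, _, head, _ in sentence)
    # A Counter returns 0 for ids that never occur as a head (leaves).
    return Counter(children[tid] for tid, _, tag, _, _ in sentence if tag == upos)


hist = child_degree_histogram(SENTENCE)
max_degree = max(hist)
print(dict(hist), max_degree)
```

In this toy sentence “Havel” and “Prahu” are leaves and “Václav” has one child, so the histogram has two entries.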

Children of PROPN nodes are attached using 27 different relations: case (3218; 23% instances), punct (3008; 21% instances), flat (1742; 12% instances), conj (1560; 11% instances), nmod (1289; 9% instances), amod (802; 6% instances), cc (711; 5% instances), dep (589; 4% instances), appos (249; 2% instances), advmod:emph (237; 2% instances), acl:relcl (232; 2% instances), nummod (179; 1% instances), orphan (83; 1% instances), xcomp (75; 1% instances), mark (46; 0% instances), parataxis (21; 0% instances), det (18; 0% instances), nsubj (13; 0% instances), obl (13; 0% instances), cop (11; 0% instances), advmod (9; 0% instances), acl (7; 0% instances), nummod:gov (5; 0% instances), aux (2; 0% instances), det:numgov (2; 0% instances), ccomp (1; 0% instances), expl:pv (1; 0% instances)

Children of PROPN nodes belong to 16 different parts of speech: PROPN (3704; 26% instances), ADP (3200; 23% instances), PUNCT (3008; 21% instances), NOUN (1409; 10% instances), ADJ (866; 6% instances), CCONJ (770; 5% instances), NUM (363; 3% instances), VERB (303; 2% instances), X (148; 1% instances), ADV (135; 1% instances), PART (88; 1% instances), SCONJ (46; 0% instances), DET (41; 0% instances), AUX (28; 0% instances), PRON (8; 0% instances), SYM (6; 0% instances)