This page pertains to UD version 2.

Treebank Statistics: UD_German-HDT: POS Tags: PROPN

There are 27321 PROPN lemmas (34%), 27316 PROPN types (14%) and 193940 PROPN tokens (6%). Out of 16 observed tags, the rank of PROPN is: 1 in number of lemmas, 2 in number of types and 7 in number of tokens.

The 10 most frequent PROPN lemmas: Microsoft, Telekom, Deutschland, Intel, USA, AOL, ibm, c’t, Europa, AMD

The 10 most frequent PROPN types: Microsoft, Telekom, Deutschland, Intel, USA, AOL, ibm, c’t, Europa, AMD

The 10 most frequent ambiguous lemmas: Telekom (PROPN 3831, NOUN 1, X 1), Intel (PROPN 2657, X 6), USA (PROPN 2566, X 2), c’t (PROPN 1250, X 2), AMD (PROPN 1143, X 1), Apple (PROPN 1071, X 6), Windows (PROPN 1043, NOUN 852, X 108), online (PROPN 806, ADJ 647, X 7), Sun (PROPN 841, X 4), Microsofts (PROPN 745, X 3)

The 10 most frequent ambiguous types: Telekom (PROPN 3831, NOUN 1, X 1), Intel (PROPN 2657, X 6), USA (PROPN 2566, X 2), c’t (PROPN 1250, X 2), AMD (PROPN 1143, X 1), Apple (PROPN 1071, X 6, NOUN 3), Windows (PROPN 1043, NOUN 852, X 108), online (PROPN 806, ADJ 640, X 7), Sun (PROPN 841, X 4), Microsofts (PROPN 745, X 3)

Morphology

The form / lemma ratio of PROPN is 0.999817 (the average of all parts of speech is 2.529657).
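The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. The following minimal sketch reproduces that arithmetic; treating "forms" as the type count is an assumption inferred from the numbers, not something the page states.

```python
# Counts copied from the PROPN summary on this page.
propn_types = 27316   # distinct PROPN word forms (types)
propn_lemmas = 27321  # distinct PROPN lemmas

# Assumption: the form / lemma ratio is simply types divided by lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # prints 0.999817
```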

The highest number of forms (3) was observed with the lemma “PowerBook”: G3-PowerBook, PowerBook, PowerBooks.

The second-highest number of forms (2) was observed with the lemma “Exchange”: Exchange, SoundExchange.

The third-highest number of forms (2) was observed with the lemma “Festival”: KurzFilmFestival, bitfilmFestival.

PROPN occurs with 3 features: Number (128855; 66% instances), Case (61547; 32% instances), Gender (27734; 14% instances)

PROPN occurs with 9 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing

PROPN occurs with 34 feature combinations. The most frequent feature combination is _, i.e. no features at all (62383 tokens). Examples: telepolis, ICANN, CeBIT, NT, UMTS, Mobilcom, RegTP, Street, IFA, Transmeta
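Feature, feature-value and feature-combination counts of this kind can be derived from the FEATS column of a CoNLL-U file. The sketch below shows one way to do it. The `parse_feats` helper and the sample values are hypothetical, chosen only for illustration; they are not taken from the treebank or from the script that generated this page.

```python
from collections import Counter

def parse_feats(feats: str) -> dict[str, str]:
    """Split a CoNLL-U FEATS string into a feature -> value mapping."""
    if feats == "_":  # "_" marks a token without any features
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Hypothetical FEATS values of four PROPN tokens.
sample = [
    "Case=Nom|Gender=Masc|Number=Sing",
    "_",
    "Number=Sing",
    "Case=Dat|Number=Sing",
]

feature_counts = Counter()      # how many tokens carry each feature
combination_counts = Counter()  # how often each full FEATS string occurs
for feats in sample:
    combination_counts[feats] += 1
    feature_counts.update(parse_feats(feats).keys())

print(feature_counts.most_common())
print(len(combination_counts), "feature combinations")
```

Because a token may carry several features at once, the per-feature percentages do not need to sum to 100%.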

Relations

PROPN nodes are attached to their parents using 20 different relations: flat:name (55475; 29% instances), nsubj (43648; 23% instances), nmod (37652; 19% instances), obl (21365; 11% instances), conj (14054; 7% instances), appos (9051; 5% instances), obj (5272; 3% instances), root (3852; 2% instances), nsubj:pass (1844; 1% instances), xcomp (668; 0% instances), obl:arg (574; 0% instances), parataxis (437; 0% instances), ccomp (15; 0% instances), acl (11; 0% instances), advcl (9; 0% instances), amod (4; 0% instances), flat (3; 0% instances), vocative (3; 0% instances), csubj (2; 0% instances), nmod:poss (1; 0% instances)

Parents of PROPN nodes belong to 12 different parts of speech: NOUN (81507; 42% instances), VERB (64383; 33% instances), PROPN (30376; 16% instances), X (5732; 3% instances), ADJ (4301; 2% instances), no parent, i.e. the PROPN node is the root (3852; 2% instances), AUX (2113; 1% instances), DET (1043; 1% instances), NUM (296; 0% instances), ADV (171; 0% instances), PRON (162; 0% instances), ADP (4; 0% instances)
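As an illustration of how relation and parent part-of-speech counts like these could be collected, here is a minimal sketch that reads the ten tab-separated CoNLL-U columns directly. The sample sentence and the function name `propn_attachments` are hypothetical; the actual statistics were produced by other tooling.

```python
from collections import Counter

# Hypothetical CoNLL-U sentence; columns are
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
SAMPLE = "\n".join([
    "1\tMicrosoft\tMicrosoft\tPROPN\t_\t_\t2\tnsubj\t_\t_",
    "2\tkauft\tkaufen\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tAOL\tAOL\tPROPN\t_\t_\t2\tobj\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])

def propn_attachments(conllu: str):
    """Count DEPREL and parent UPOS for every PROPN token of one sentence."""
    tokens = [line.split("\t") for line in conllu.splitlines()
              if line and not line.startswith("#")]
    # Keep plain word lines; skip multiword ranges ("1-2") and empty nodes ("1.1").
    tokens = [t for t in tokens if t[0].isdigit()]
    upos_by_id = {t[0]: t[3] for t in tokens}
    relations, parent_pos = Counter(), Counter()
    for t in tokens:
        if t[3] == "PROPN":
            relations[t[7]] += 1
            # HEAD 0 is the artificial root, which has no part of speech.
            parent_pos[upos_by_id.get(t[6], "")] += 1
    return relations, parent_pos

relations, parent_pos = propn_attachments(SAMPLE)
print(relations)
print(parent_pos)
```

A PROPN token whose HEAD is 0 is counted under an empty parent tag, which matches the 3852 root attachments listed among the relations.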

93138 (48%) PROPN nodes are leaves.

53007 (27%) PROPN nodes have one child.

30485 (16%) PROPN nodes have two children.

17310 (9%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 43.
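The leaf counts and child degrees above come from counting, for each PROPN node, how many other nodes name it as their head. A minimal sketch of that computation follows, using hypothetical (id, UPOS, head) triples for a single sentence rather than real treebank data.

```python
from collections import Counter

# Hypothetical (id, upos, head) triples for one sentence; head 0 is the root.
tokens = [
    (1, "ADP", 3),
    (2, "DET", 3),
    (3, "PROPN", 4),
    (4, "VERB", 0),
    (5, "PROPN", 4),
    (6, "PUNCT", 4),
]

# Number of children per node id: count how often each id appears as a head.
children = Counter(head for _, _, head in tokens)

# Child degree of every PROPN node (Counter returns 0 for ids never used as head).
degrees = [children[tok_id] for tok_id, upos, _ in tokens if upos == "PROPN"]

leaves = sum(1 for d in degrees if d == 0)
max_degree = max(degrees)
print(degrees, leaves, max_degree)  # prints [2, 0] 1 2
```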

Children of PROPN nodes are attached using 26 different relations: case (50492; 29% instances), punct (27458; 16% instances), det (22326; 13% instances), flat:name (17180; 10% instances), conj (13508; 8% instances), flat (9576; 5% instances), appos (9558; 5% instances), cc (8923; 5% instances), advmod (5432; 3% instances), nmod (5044; 3% instances), amod (4321; 2% instances), acl (1752; 1% instances), obl (232; 0% instances), nummod (205; 0% instances), nsubj (190; 0% instances), cop (159; 0% instances), parataxis (46; 0% instances), mark (27; 0% instances), aux (19; 0% instances), advcl (14; 0% instances), ccomp (7; 0% instances), csubj (3; 0% instances), orphan (2; 0% instances), det:poss (1; 0% instances), expl (1; 0% instances), xcomp (1; 0% instances)

Children of PROPN nodes belong to 15 different parts of speech: ADP (48179; 27% instances), PROPN (30376; 17% instances), PUNCT (27458; 16% instances), DET (22524; 13% instances), NOUN (12910; 7% instances), CCONJ (11242; 6% instances), X (7763; 4% instances), ADJ (4920; 3% instances), ADV (4837; 3% instances), NUM (4094; 2% instances), VERB (1661; 1% instances), AUX (227; 0% instances), PART (171; 0% instances), PRON (88; 0% instances), SCONJ (27; 0% instances)