
This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-Alpino: POS Tags: PROPN

There are 6197 PROPN lemmas (26%), 6288 PROPN types (22%) and 14294 PROPN tokens (7%). Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 6 in number of tokens.
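As a minimal sketch of how the three counts above could be obtained, the following assumes that lemmas, types and tokens are counted from the LEMMA, FORM and UPOS columns of a CoNLL-U file, and that types are case-sensitive (the page lists AMSTERDAM and Amsterdam as separate forms). The sample rows and the name `tag_counts` are hypothetical, not part of the tool that generated this page.

```python
# Hypothetical miniature sample: (FORM, LEMMA, UPOS) triples as they would be
# read from the FORM, LEMMA and UPOS columns of a CoNLL-U file.
SAMPLE = [
    ("Jan", "Jan", "PROPN"),
    ("woont", "wonen", "VERB"),
    ("in", "in", "ADP"),
    ("Amsterdam", "Amsterdam", "PROPN"),
    ("AMSTERDAM", "Amsterdam", "PROPN"),
]


def tag_counts(rows, upos):
    """Return (distinct lemmas, distinct word forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in rows:
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # case-sensitive: AMSTERDAM and Amsterdam differ
            tokens += 1
    return len(lemmas), len(types), tokens


print(tag_counts(SAMPLE, "PROPN"))  # (2, 3, 3)
```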

The 10 most frequent PROPN lemmas: van, de, Nederland, Amsterdam, J., zaterdag, den, Jan, Rotterdam, Groningen

The 10 most frequent PROPN types: van, de, Nederland, J., Amsterdam, zaterdag, den, Jan, Rotterdam, Groningen

The 10 most frequent ambiguous lemmas: van (ADP 6083, PROPN 175), de (DET 12907, PROPN 138, X 1), Nederland (PROPN 133, ADJ 1), den (PROPN 30, DET 5), der (PROPN 66, DET 1), mr. (PROPN 49, NOUN 1, X 1), dr. (PROPN 60, NOUN 1), H. (PROPN 54, SYM 1), W. (PROPN 40, SYM 1), C. (PROPN 32, SYM 10)

The 10 most frequent ambiguous types: van (ADP 5985, PROPN 175), de (DET 11159, PROPN 137, X 1), den (PROPN 30, DET 6), der (DET 78, PROPN 67), mr. (PROPN 49, NOUN 1, X 1), dr. (PROPN 49, NOUN 1), H. (PROPN 54, SYM 1), W. (PROPN 40, SYM 1), Tweede (PROPN 35, ADJ 1), C. (PROPN 32, SYM 10)
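A lemma or type counts as ambiguous here when it occurs with more than one tag. The sketch below shows one way to collect such lemmas together with their per-tag counts. The sample rows and the name `ambiguous_lemmas` are hypothetical; replacing the lemma with the word form as the key gives the same listing for types.

```python
from collections import Counter, defaultdict

# Hypothetical sample of (LEMMA, UPOS) pairs.
ROWS = [
    ("van", "ADP"),
    ("van", "ADP"),
    ("van", "PROPN"),
    ("Jan", "PROPN"),
    ("de", "DET"),
]


def ambiguous_lemmas(rows, upos):
    """Map each lemma seen with `upos` and at least one other tag to its tag counts."""
    by_lemma = defaultdict(Counter)
    for lemma, tag in rows:
        by_lemma[lemma][tag] += 1
    return {
        lemma: tags
        for lemma, tags in by_lemma.items()
        if upos in tags and len(tags) > 1
    }


for lemma, tags in ambiguous_lemmas(ROWS, "PROPN").items():
    print(lemma, tags.most_common())  # van [('ADP', 2), ('PROPN', 1)]
```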

Morphology

The form / lemma ratio of PROPN is 1.014685 (the average of all parts of speech is 1.214322).
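The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. Whether the generating tool computes it exactly this way is an assumption. A quick check:

```python
# Counts taken from this page: 6288 PROPN types and 6197 PROPN lemmas.
propn_types = 6288
propn_lemmas = 6197

# Assumption: form / lemma ratio = distinct word forms / distinct lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # 1.014685
```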

The 1st highest number of forms (4) was observed with the lemma “Nederland”: Nederland, Nederlanden, Nederlands, Neerlands.

The 2nd highest number of forms (4) was observed with the lemma “Sovjet-Unie”: Sovjet-Unie, SovjetUnie, Sowjet-Unie, Sowjetunie.

The 3rd highest number of forms (3) was observed with the lemma “Amsterdam”: AMSTERDAM, Amsterdam, Amsterdams.

PROPN occurs with 2 features: Number (6655; 47% instances), Gender (6443; 45% instances)

PROPN occurs with 5 feature-value pairs: Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing

PROPN occurs with 6 feature combinations. The most frequent feature combination is _ (7639 tokens). Examples: van, de, J., den, der, mr., dr., Jan, H., Haag
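The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of a CoNLL-U file. In the sketch below, the FEATS strings are hypothetical examples rather than values taken from the treebank, and `parse_feats` is an illustrative helper name. A value such as `Com,Neut` is kept as one multi-value string, matching the pair `Gender=Com,Neut` listed above.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":  # "_" means the token has no features
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS values for five PROPN tokens.
FEATS = [
    "_",
    "Gender=Neut|Number=Sing",
    "_",
    "Gender=Com,Neut|Number=Sing",
    "Number=Plur",
]

feature_counts = Counter()  # how many tokens carry each feature
pair_counts = Counter()     # how many tokens carry each feature=value pair
combo_counts = Counter()    # how many tokens carry each full combination
for feats in FEATS:
    combo_counts[feats] += 1
    for name, value in parse_feats(feats).items():
        feature_counts[name] += 1
        pair_counts[f"{name}={value}"] += 1

print(feature_counts.most_common())  # [('Number', 3), ('Gender', 2)]
print(combo_counts.most_common(1))   # [('_', 2)]
```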

Relations

PROPN nodes are attached to their parents using 22 different relations: flat:name (4654; 33% instances), nmod (2615; 18% instances), nsubj (2089; 15% instances), appos (1554; 11% instances), obl (1542; 11% instances), conj (810; 6% instances), obj (288; 2% instances), nsubj:pass (155; 1% instances), root (146; 1% instances), parataxis (114; 1% instances), nmod:poss (100; 1% instances), obl:agent (76; 1% instances), iobj (51; 0% instances), advcl (41; 0% instances), xcomp (16; 0% instances), orphan (13; 0% instances), acl:relcl (10; 0% instances), acl (7; 0% instances), amod (4; 0% instances), compound:prt (4; 0% instances), ccomp (3; 0% instances), advmod (2; 0% instances)

Parents of PROPN nodes belong to 13 different parts of speech: PROPN (5153; 36% instances), NOUN (4203; 29% instances), VERB (3780; 26% instances), ADJ (351; 2% instances), NUM (205; 1% instances), PRON (197; 1% instances), no tag, i.e. the artificial root node (146; 1% instances), SYM (62; 0% instances), ADV (60; 0% instances), DET (56; 0% instances), X (51; 0% instances), ADP (25; 0% instances), INTJ (5; 0% instances)
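The two distributions above can be sketched from the HEAD and DEPREL columns of a CoNLL-U file. The sample sentence and the name `incoming` are hypothetical. In CoNLL-U a HEAD of 0 denotes the artificial root, which carries no UPOS tag; this sketch records it as an empty string, which would explain why 146 parents (the same number as the root relations) have no part of speech.

```python
from collections import Counter

# Hypothetical sentence: (ID, FORM, UPOS, HEAD, DEPREL).
SENT = [
    (1, "Jan", "PROPN", 3, "nsubj"),
    (2, "Jansen", "PROPN", 1, "flat:name"),
    (3, "woont", "VERB", 0, "root"),
    (4, "in", "ADP", 5, "case"),
    (5, "Amsterdam", "PROPN", 3, "obl"),
]


def incoming(sentence, upos):
    """Count incoming relations and parent tags of all nodes tagged `upos`."""
    tag_of = {tid: tag for tid, _form, tag, _head, _rel in sentence}
    relations, parent_tags = Counter(), Counter()
    for _tid, _form, tag, head, deprel in sentence:
        if tag == upos:
            relations[deprel] += 1
            # HEAD 0 is the artificial root and has no tag: use "".
            parent_tags[tag_of.get(head, "")] += 1
    return relations, parent_tags


relations, parent_tags = incoming(SENT, "PROPN")
print(parent_tags.most_common())  # [('VERB', 2), ('PROPN', 1)]
```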

6637 (46%) PROPN nodes are leaves.

3851 (27%) PROPN nodes have one child.

1969 (14%) PROPN nodes have two children.

1837 (13%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 12.
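The four child-count groups above add up to the PROPN token total (6637 + 3851 + 1969 + 1837 = 14294). A minimal sketch of how the child degree of each node could be computed from the HEAD column follows; the sample sentence and the name `child_degrees` are hypothetical.

```python
from collections import Counter

# Hypothetical sentence: (ID, FORM, UPOS, HEAD).
SENT = [
    (1, "Jan", "PROPN", 3),
    (2, "Jansen", "PROPN", 1),
    (3, "woont", "VERB", 0),
    (4, "in", "ADP", 5),
    (5, "Amsterdam", "PROPN", 3),
]


def child_degrees(sentence, upos):
    """Return {node ID: number of children} for every node tagged `upos`."""
    children_of = Counter(head for _tid, _form, _tag, head in sentence)
    return {
        tid: children_of.get(tid, 0)
        for tid, _form, tag, _head in sentence
        if tag == upos
    }


degrees = child_degrees(SENT, "PROPN")
histogram = Counter(degrees.values())  # child degree -> number of nodes
print(degrees)                 # {1: 1, 2: 0, 5: 1}
print(max(degrees.values()))   # 1

# Consistency check on the figures reported on this page.
assert 6637 + 3851 + 1969 + 1837 == 14294
```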

Children of PROPN nodes are attached using 23 different relations: flat:name (4518; 31% instances), case (3993; 27% instances), det (1746; 12% instances), punct (1324; 9% instances), conj (837; 6% instances), nmod (524; 4% instances), cc (520; 4% instances), amod (396; 3% instances), appos (231; 2% instances), acl:relcl (167; 1% instances), nummod (84; 1% instances), mark (81; 1% instances), acl (60; 0% instances), cop (58; 0% instances), parataxis (58; 0% instances), nsubj (55; 0% instances), advmod (21; 0% instances), nmod:poss (17; 0% instances), orphan (17; 0% instances), obl (11; 0% instances), aux (3; 0% instances), advcl (2; 0% instances), xcomp (1; 0% instances)

Children of PROPN nodes belong to 16 different parts of speech: PROPN (5153; 35% instances), ADP (4021; 27% instances), DET (1773; 12% instances), PUNCT (1324; 9% instances), NOUN (622; 4% instances), CCONJ (536; 4% instances), ADJ (324; 2% instances), VERB (255; 2% instances), NUM (211; 1% instances), ADV (174; 1% instances), X (90; 1% instances), SCONJ (69; 0% instances), SYM (62; 0% instances), AUX (61; 0% instances), PRON (47; 0% instances), INTJ (2; 0% instances)