This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

PROPN: proper noun

This document is a placeholder for the language-specific documentation for PROPN.


Treebank Statistics (UD_Dutch)

There are 6602 PROPN lemmas (28%), 6669 PROPN types (22%) and 14691 PROPN tokens (7%). Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 7 in number of tokens.
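The page does not say how these counts were produced. As a rough sketch, they could be recomputed from a treebank file in the 10-column CoNLL-U format (ID, FORM, LEMMA, UPOSTAG, ...). The sample sentences, the function names and the case-sensitive treatment of forms and lemmas below are assumptions for illustration, not the script behind this page.

```python
# Tiny hand-made CoNLL-U fragment (hypothetical, not taken from UD_Dutch).
SAMPLE = (
    "1\tJan\tJan\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\twoont\twonen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tin\tin\tADP\t_\t_\t4\tcase\t_\t_\n"
    "4\tAmsterdam\tAmsterdam\tPROPN\t_\t_\t2\tnmod\t_\t_\n"
    "5\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
    "1\tAmsterdam\tAmsterdam\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tgroeit\tgroeien\tVERB\t_\t_\t0\troot\t_\t_\n"
)


def read_tokens(conllu_text):
    """Yield the 10 columns of every ordinary token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            yield cols  # multiword ranges such as "1-2" are skipped


def tag_summary(conllu_text, tag):
    """Count distinct lemmas, distinct forms (types) and tokens for one tag."""
    lemmas, types = set(), set()
    tag_tokens = all_tokens = 0
    for cols in read_tokens(conllu_text):
        all_tokens += 1
        if cols[3] == tag:  # column 4 holds the universal POS tag
            tag_tokens += 1
            types.add(cols[1])   # FORM, compared case-sensitively (assumption)
            lemmas.add(cols[2])  # LEMMA
    return {
        "lemmas": len(lemmas),
        "types": len(types),
        "tokens": tag_tokens,
        "all_tokens": all_tokens,
    }


print(tag_summary(SAMPLE, "PROPN"))
# {'lemmas': 2, 'types': 2, 'tokens': 3, 'all_tokens': 7}
```

The token share reported above would then be `tokens / all_tokens`; how the lemma and type shares were normalised is not stated on this page.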

The 10 most frequent PROPN lemmas: van, Nederland, J., Amsterdam, zaterdag, den, der, Groningen, Rotterdam, jan

The 10 most frequent PROPN types: van, Nederland, J., Amsterdam, zaterdag, den, der, Jan, Rotterdam, Groningen

The 10 most frequent ambiguous lemmas: van (ADP 5616, X 384, PROPN 200, ADV 88), Nederland (PROPN 129, X 1), Amsterdam (PROPN 110, X 5), den (PROPN 29, X 6, DET 1), der (PROPN 72, DET 63, X 7), Rotterdam (PROPN 71, X 1), jan (PROPN 4, NOUN 1), dr. (PROPN 51, NOUN 1), Europa (PROPN 54, X 5), W. (PROPN 40, X 1)

The 10 most frequent ambiguous types: van (ADP 5516, X 384, PROPN 199, ADV 87), Nederland (PROPN 127, X 1), Amsterdam (PROPN 108, X 5), den (PROPN 29, X 6, DET 1), der (PROPN 72, DET 66, X 7), Jan (PROPN 71, X 2), Rotterdam (PROPN 71, X 1), dr. (PROPN 51, NOUN 1), Europa (PROPN 48, X 5), W. (PROPN 40, X 1)
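The ambiguity lists above appear to rank items by their PROPN count and to list each item's tags by descending frequency. A minimal sketch of that ranking follows, assuming the input is a stream of (lemma, tag) pairs. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, tag, top=10):
    """Return items seen with `tag` and at least one other tag.

    Items are ranked by their count under `tag`; each item's tags are
    listed from most to least frequent.
    """
    by_item = defaultdict(Counter)
    for item, upos in pairs:
        by_item[item][upos] += 1
    ambiguous = [
        (item, counts)
        for item, counts in by_item.items()
        if tag in counts and len(counts) > 1
    ]
    ambiguous.sort(key=lambda entry: entry[1][tag], reverse=True)
    return [(item, counts.most_common()) for item, counts in ambiguous[:top]]


# Toy data (hypothetical counts, not the treebank's).
toy = (
    [("van", "ADP")] * 5
    + [("van", "PROPN")] * 2
    + [("der", "PROPN")] * 3
    + [("der", "DET")]
    + [("Jan", "PROPN")] * 4  # only one tag, so not ambiguous
)
print(ambiguous_items(toy, "PROPN"))
# [('der', [('PROPN', 3), ('DET', 1)]), ('van', [('ADP', 5), ('PROPN', 2)])]
```

Passing (form, tag) pairs instead of (lemma, tag) pairs would give the corresponding list of ambiguous types.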

Morphology

The form / lemma ratio of PROPN is 1.010148 (the average of all parts of speech is 1.258498).
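The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas. The short check below uses the counts given on this page for both Dutch treebanks; the function name is an assumption made for illustration.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


# Counts taken from the two "Treebank Statistics" sections of this page.
print(f"UD_Dutch:            {form_lemma_ratio(6669, 6602):.6f}")
print(f"UD_Dutch-LassySmall: {form_lemma_ratio(4125, 4088):.6f}")
# UD_Dutch:            1.010148
# UD_Dutch-LassySmall: 1.009051
```

A ratio this close to 1 means that almost every proper-noun lemma occurs in a single surface form.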

The highest number of forms (9) was observed with the lemma “straat”: Bessemoerstraat, Ebbingestraat, Herestraat, Kanaalstraat, Kerkstraat, Oosterstraat, Oranje-Vrijstraat, Schuddebeursstraat, Sophiastraat.

The second-highest number of forms (6) was observed with the lemma “land”: Duitsland, Finland, Griekenland, Ierland, Rusland, Westland.

The third-highest number of forms (5) was observed with the lemma “Duitsland”: Duitsland, Oost-Duitsland, West-Duitsland, Zuid-Duitsland, zuid-Duitsland.

PROPN occurs with 2 features: Number (6749; 46% instances), Case (90; 1% instances)

PROPN occurs with 3 feature-value pairs: Case=Gen, Number=Plur, Number=Sing

PROPN occurs with 4 feature combinations. The most frequent feature combination is _ (7942 tokens). Examples: van, J., den, der, mr., dr., H., Jan, a., Haag
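The three figures above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of a CoNLL-U file, where features are separated by `|` and `_` marks a token with no features. The sketch below assumes the FEATS strings of the PROPN tokens have already been collected; the function name and the toy input are hypothetical.

```python
from collections import Counter


def feature_stats(feats_values):
    """Tally features, feature=value pairs and whole FEATS combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1  # the full string, including "_", is one combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Toy FEATS values (hypothetical, not the treebank's distribution).
toy_feats = ["_", "_", "Number=Sing", "Case=Gen|Number=Sing", "Number=Plur"]
features, pairs, combos = feature_stats(toy_feats)
print(len(features), len(pairs), len(combos))  # 2 3 4
print(combos.most_common(1))                   # [('_', 2)]
```

A multi-valued feature such as `Gender=Com,Neut` is counted here as one feature-value pair, which matches how such values are listed elsewhere on this page.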

Relations

PROPN nodes are attached to their parents using 23 different relations: name (4505; 31% instances), nmod (3952; 27% instances), nsubj (2115; 14% instances), appos (1650; 11% instances), conj (839; 6% instances), dobj (711; 5% instances), advmod (407; 3% instances), root (230; 2% instances), dep (129; 1% instances), cop (36; 0% instances), iobj (35; 0% instances), cc (33; 0% instances), parataxis (15; 0% instances), aux (9; 0% instances), mark (6; 0% instances), xcomp (6; 0% instances), acl (4; 0% instances), case (3; 0% instances), ccomp (2; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), csubj (1; 0% instances), mwe (1; 0% instances)

Parents of PROPN nodes belong to 16 different parts of speech: PROPN (5667; 39% instances), NOUN (4390; 30% instances), VERB (2821; 19% instances), AUX (870; 6% instances), ADJ (262; 2% instances), ROOT (230; 2% instances), X (139; 1% instances), ADV (120; 1% instances), PRON (92; 1% instances), ADP (28; 0% instances), DET (23; 0% instances), SCONJ (23; 0% instances), NUM (18; 0% instances), CONJ (4; 0% instances), SYM (3; 0% instances), INTJ (1; 0% instances)
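Both tallies above depend only on the HEAD and DEPREL columns of each PROPN token, plus the tag of the token its HEAD points to. The following is a minimal sketch under the assumption that a HEAD of 0 is reported as the pseudo-tag ROOT, which would explain why the ROOT parent count equals the root relation count. The sample sentence and function names are hypothetical.

```python
from collections import Counter

# Hand-made sentence in CoNLL-U format (hypothetical, UD v1 style "name" relation).
SENTENCE = (
    "1\tJan\tJan\tPROPN\t_\t_\t4\tnsubj\t_\t_\n"
    "2\tvan\tvan\tPROPN\t_\t_\t1\tname\t_\t_\n"
    "3\tDijk\tDijk\tPROPN\t_\t_\t1\tname\t_\t_\n"
    "4\twoont\twonen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "5\tin\tin\tADP\t_\t_\t6\tcase\t_\t_\n"
    "6\tAmsterdam\tAmsterdam\tPROPN\t_\t_\t4\tnmod\t_\t_\n"
    "7\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_\n"
)


def read_sentences(conllu_text):
    """Yield each sentence as a list of 10-column token rows."""
    sent = []
    for line in conllu_text.splitlines():
        if not line.strip():
            if sent:
                yield sent
                sent = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            if cols[0].isdigit():  # skip multiword token ranges
                sent.append(cols)
    if sent:
        yield sent


def parent_stats(conllu_text, tag):
    """Count incoming relations and parent tags for tokens carrying `tag`."""
    relations, parent_tags = Counter(), Counter()
    for sent in read_sentences(conllu_text):
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            if cols[3] != tag:
                continue
            relations[cols[7]] += 1                            # DEPREL
            parent_tags[upos_by_id.get(cols[6], "ROOT")] += 1  # HEAD 0 -> ROOT
    return relations, parent_tags


relations, parent_tags = parent_stats(SENTENCE, "PROPN")
print(dict(relations))    # {'nsubj': 1, 'name': 2, 'nmod': 1}
print(dict(parent_tags))  # {'VERB': 2, 'PROPN': 2}
```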

7063 (48%) PROPN nodes are leaves.

3965 (27%) PROPN nodes have one child.

1710 (12%) PROPN nodes have two children.

1953 (13%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 29.
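The child-degree breakdown above (leaves, one child, two children, three or more) can be sketched by counting how often each token ID appears as a HEAD within its sentence. The input format here, sentences as lists of (id, head, tag) tuples, and the function name are assumptions chosen to keep the example short.

```python
from collections import Counter


def child_degree_buckets(sentences, tag):
    """Bucket tokens carrying `tag` by number of children: 0, 1, 2, or 3 (= three or more)."""
    buckets = Counter()
    for sent in sentences:
        # Number of dependents per head ID within this sentence.
        n_children = Counter(head for _, head, _ in sent)
        for token_id, _, upos in sent:
            if upos == tag:
                buckets[min(n_children[token_id], 3)] += 1
    return buckets


# One hypothetical sentence: (token id, head id, universal POS tag).
toy_sentence = [
    (1, 4, "PROPN"),  # Jan, head of the name "Jan van Dijk"
    (2, 1, "PROPN"),  # van
    (3, 1, "PROPN"),  # Dijk
    (4, 0, "VERB"),   # woont
    (5, 6, "ADP"),    # in
    (6, 4, "PROPN"),  # Amsterdam
    (7, 4, "PUNCT"),  # .
]
buckets = child_degree_buckets([toy_sentence], "PROPN")
print(sorted(buckets.items()))  # [(0, 2), (1, 1), (2, 1)]
```

On the figures reported above, the four buckets sum to the PROPN token count: 7063 + 3965 + 1710 + 1953 = 14691.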

Children of PROPN nodes are attached using 25 different relations: case (4525; 29% instances), name (4503; 29% instances), det (1671; 11% instances), punct (1278; 8% instances), conj (848; 5% instances), nmod (623; 4% instances), cc (603; 4% instances), advmod (285; 2% instances), appos (198; 1% instances), amod (158; 1% instances), nummod (138; 1% instances), acl (123; 1% instances), dep (120; 1% instances), advcl (117; 1% instances), dobj (70; 0% instances), cop (69; 0% instances), mark (56; 0% instances), parataxis (41; 0% instances), nsubj (40; 0% instances), aux (23; 0% instances), compound:prt (22; 0% instances), compound (6; 0% instances), ccomp (2; 0% instances), expl (2; 0% instances), neg (2; 0% instances)

Children of PROPN nodes belong to 16 different parts of speech: PROPN (5667; 37% instances), ADP (4381; 28% instances), DET (1683; 11% instances), PUNCT (1279; 8% instances), CONJ (567; 4% instances), NOUN (558; 4% instances), VERB (339; 2% instances), ADJ (300; 2% instances), NUM (242; 2% instances), AUX (149; 1% instances), ADV (140; 1% instances), PRON (85; 1% instances), SCONJ (59; 0% instances), X (47; 0% instances), SYM (24; 0% instances), INTJ (3; 0% instances)


Treebank Statistics (UD_Dutch-LassySmall)

There are 4088 PROPN lemmas (30%), 4125 PROPN types (26%) and 12240 PROPN tokens (12%). Out of 17 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 3 in number of tokens.

The 10 most frequent PROPN lemmas: België, van, de, Brussel, Antwerpen, Vlaanderen, Nederland, Hasselt, Wiske, vandersteen

The 10 most frequent PROPN types: België, van, de, Brussel, Antwerpen, Vlaanderen, Hasselt, Wiske, Suske, Jan

The 10 most frequent ambiguous lemmas: van (ADP 3140, PROPN 101), de (DET 5884, PROPN 73, X 6), Frans (ADJ 84, PROPN 53), II (PROPN 47, NUM 2), Vlaams (ADJ 286, PROPN 45), VLD (PROPN 38, X 2), Mechelen (PROPN 35, X 1), der (PROPN 34, DET 3, ADV 2), ! (PROPN 31, PUNCT 18), groen (ADJ 65, PROPN 20)

The 10 most frequent ambiguous types: van (ADP 3101, PROPN 101), de (DET 4905, PROPN 73, X 6), Frans (PROPN 53, ADJ 10), II (PROPN 47, NUM 2), Vlaams (ADJ 77, PROPN 45), VLD (PROPN 38, X 2), Mechelen (PROPN 35, X 1), der (DET 52, PROPN 34, ADV 2), ! (PROPN 31, PUNCT 18), Groen (PROPN 31, ADJ 2)

Morphology

The form / lemma ratio of PROPN is 1.009051 (the average of all parts of speech is 1.179900).

The highest number of forms (3) was observed with the lemma “België”: BELGIË, België, Belgiës.

The same number of forms (3) was observed with the lemma “Vandersteen”: Vandersteen, Vandersteen’s, Vandersteens.

The same number of forms (3) was also observed with the lemma “Vlaanderen”: VLaanderen, Vlaanderen, Vlaanderens.

PROPN occurs with 2 features: Number (6266; 51% instances), Gender (5984; 49% instances)

PROPN occurs with 5 feature-value pairs: Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing

PROPN occurs with 6 feature combinations. The most frequent feature combination is _ (5974 tokens). Examples: van, de, Jan, II, Vlaams, Kim, I, Clijsters, der, !

Relations

PROPN nodes are attached to their parents using 15 different relations: name (3433; 28% instances), nmod (2730; 22% instances), root (1229; 10% instances), nsubj (1109; 9% instances), conj (982; 8% instances), mwe (842; 7% instances), appos (832; 7% instances), parataxis (658; 5% instances), dobj (222; 2% instances), acl (74; 1% instances), det (47; 0% instances), advcl (32; 0% instances), iobj (27; 0% instances), amod (19; 0% instances), ccomp (4; 0% instances)

Parents of PROPN nodes belong to 15 different parts of speech: PROPN (4742; 39% instances), NOUN (3040; 25% instances), VERB (1849; 15% instances), ROOT (1229; 10% instances), NUM (739; 6% instances), ADJ (307; 3% instances), DET (123; 1% instances), X (116; 1% instances), ADP (25; 0% instances), ADV (25; 0% instances), PRON (25; 0% instances), SYM (12; 0% instances), PUNCT (4; 0% instances), AUX (3; 0% instances), SCONJ (1; 0% instances)

6542 (53%) PROPN nodes are leaves.

2504 (20%) PROPN nodes have one child.

1403 (11%) PROPN nodes have two children.

1791 (15%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 43.

Children of PROPN nodes are attached using 24 different relations: name (3179; 24% instances), case (2376; 18% instances), punct (1998; 15% instances), conj (1056; 8% instances), det (1016; 8% instances), nmod (831; 6% instances), mwe (693; 5% instances), cc (482; 4% instances), parataxis (375; 3% instances), appos (352; 3% instances), amod (213; 2% instances), acl (188; 1% instances), nummod (160; 1% instances), advmod (120; 1% instances), cop (96; 1% instances), nsubj (89; 1% instances), mark (82; 1% instances), dobj (14; 0% instances), advcl (4; 0% instances), aux (3; 0% instances), compound (3; 0% instances), auxpass (1; 0% instances), ccomp (1; 0% instances), neg (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: PROPN (4742; 36% instances), ADP (2473; 19% instances), PUNCT (2063; 15% instances), DET (1087; 8% instances), NOUN (912; 7% instances), CONJ (534; 4% instances), NUM (397; 3% instances), ADJ (295; 2% instances), SYM (164; 1% instances), X (158; 1% instances), VERB (154; 1% instances), ADV (142; 1% instances), AUX (100; 1% instances), SCONJ (49; 0% instances), PART (39; 0% instances), PRON (23; 0% instances), INTJ (1; 0% instances)


PROPN in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]