
This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-LassySmall: POS Tags: PROPN

There are 4243 PROPN lemmas (31%), 4283 PROPN types (27%) and 13395 PROPN tokens (14%). Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 3 in number of tokens.

The 10 most frequent PROPN lemmas: België, van, de, Brussel, Antwerpen, Vlaanderen, Hasselt, Vlaams, Nederland, en

The 10 most frequent PROPN types: België, van, de, Brussel, Antwerpen, Vlaanderen, Vlaams, Hasselt, en, Wiske

The 10 most frequent ambiguous lemmas: van (ADP 3071, PROPN 168), de (DET 5869, PROPN 93, X 8), Vlaams (ADJ 221, PROPN 80), en (CCONJ 2337, PROPN 69), Ensor (PROPN 63, X 1), Frans (ADJ 73, PROPN 53), II (PROPN 47, NUM 1, SYM 1), der (PROPN 46, ADV 2, DET 1), Brussels (ADJ 48, PROPN 44, X 1), Mechelen (PROPN 35, X 1)

The 10 most frequent ambiguous types: van (ADP 3034, PROPN 168), de (DET 4883, PROPN 93, X 8), Vlaams (PROPN 80, ADJ 42), en (CCONJ 2315, PROPN 69), Gewest (PROPN 64, NOUN 7), Ensor (PROPN 57, X 1), Frans (PROPN 53, ADJ 10), Gemeenschap (PROPN 48, NOUN 8), II (PROPN 47, NUM 1, SYM 1), der (PROPN 46, DET 40, ADV 2)

Morphology

The form / lemma ratio of PROPN is 1.009427 (the average of all parts of speech is 1.174887).
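The reported ratio can be reproduced from the counts given earlier on this page, assuming it is simply the number of distinct PROPN types divided by the number of distinct PROPN lemmas. The function name below is illustrative and not part of any UD tool.

```python
# Counts taken from the summary line of this page.
propn_types = 4283   # distinct PROPN word forms (types)
propn_lemmas = 4243  # distinct PROPN lemmas


def form_lemma_ratio(n_forms: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return n_forms / n_lemmas


ratio = form_lemma_ratio(propn_types, propn_lemmas)
print(f"{ratio:.6f}")  # 1.009427
```

A value this close to 1 means that almost every PROPN lemma appears in a single form, which fits the low level of inflection on proper nouns.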

The highest number of forms (3) was observed with the lemma “België”: BELGIË, België, Belgiës.

The lemma “Vandersteen” also has 3 forms: Vandersteen, Vandersteen’s, Vandersteens.

The lemma “Vlaanderen” also has 3 forms: VLaanderen, Vlaanderen, Vlaanderens.

PROPN occurs with 2 features: Number (6175; 46% instances), Gender (5874; 44% instances)

PROPN occurs with 5 feature-value pairs: Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing

PROPN occurs with 6 feature combinations. The most frequent feature combination is _ (7220 tokens). Examples: van, de, Vlaams, en, Gewest, Jan, Suske, Gemeenschap, II, Wereldoorlog

Relations

PROPN nodes are attached to their parents using 22 different relations: flat:name (4860; 36% instances), nmod (2196; 16% instances), root (1302; 10% instances), conj (1055; 8% instances), nsubj (1045; 8% instances), obl (883; 7% instances), appos (797; 6% instances), parataxis (674; 5% instances), obj (174; 1% instances), nsubj:pass (136; 1% instances), obl:agent (66; 0% instances), nmod:poss (51; 0% instances), acl (36; 0% instances), advcl (36; 0% instances), xcomp (27; 0% instances), iobj (18; 0% instances), acl:relcl (13; 0% instances), orphan (11; 0% instances), amod (9; 0% instances), ccomp (4; 0% instances), csubj (1; 0% instances), det (1; 0% instances)
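The percentages in the list above appear to be each count's share of all 13395 PROPN tokens, rounded to a whole number, which is why rare relations show as 0%. A minimal sketch of that arithmetic, assuming rounding to the nearest integer (the helper name is hypothetical):

```python
TOTAL_PROPN_TOKENS = 13395  # PROPN token count from the summary line

# A few relation counts copied from the list above.
counts = {"flat:name": 4860, "nmod": 2196, "root": 1302, "obl:agent": 66}


def percent(count: int, total: int) -> int:
    """Share of `total`, rounded to the nearest whole percent (assumed)."""
    return round(100 * count / total)


shares = {rel: percent(n, TOTAL_PROPN_TOKENS) for rel, n in counts.items()}
print(shares)  # {'flat:name': 36, 'nmod': 16, 'root': 10, 'obl:agent': 0}
```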

Parents of PROPN nodes belong to 13 different parts of speech: PROPN (5782; 43% instances), NOUN (3085; 23% instances), VERB (2019; 15% instances), [root, no part of speech] (1302; 10% instances), NUM (747; 6% instances), ADJ (254; 2% instances), DET (104; 1% instances), PRON (27; 0% instances), ADV (24; 0% instances), X (22; 0% instances), ADP (20; 0% instances), SYM (7; 0% instances), AUX (2; 0% instances)

6168 (46%) PROPN nodes are leaves.

2668 (20%) PROPN nodes have one child.

2336 (17%) PROPN nodes have two children.

2223 (17%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 30.
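Child-degree figures like the ones above can be derived from a CoNLL-U file by counting, for each PROPN token, how many other tokens name it as their head. The sketch below is not the script that generated this page; the function name and the sample sentence are made up for illustration, and the sample is not taken from the treebank.

```python
from collections import Counter


def propn_child_degrees(conllu_text: str) -> Counter:
    """Map child count -> number of PROPN nodes with that many children."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        upos = {}             # token id -> UPOS tag
        children = Counter()  # token id -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip sentence-level comments
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword ranges (1-2) and empty nodes (1.1)
            upos[cols[0]] = cols[3]   # column 4 is UPOS
            children[cols[6]] += 1    # column 7 is HEAD
        for tok_id, tag in upos.items():
            if tag == "PROPN":
                degrees[children[tok_id]] += 1
    return degrees


# Hypothetical sentence; whitespace is converted to tabs for brevity.
rows = [
    "1 Jan Jan PROPN _ _ 3 nsubj _ _",
    "2 Peeters Peeters PROPN _ _ 1 flat:name _ _",
    "3 woont wonen VERB _ _ 0 root _ _",
    "4 in in ADP _ _ 5 case _ _",
    "5 Brussel Brussel PROPN _ _ 3 obl _ _",
    "6 . . PUNCT _ _ 3 punct _ _",
]
sample = "\n".join("\t".join(r.split()) for r in rows)

print(propn_child_degrees(sample))  # Counter({1: 2, 0: 1})
```

In the sample, "Jan" and "Brussel" each have one child and "Peeters" is a leaf. Run over the whole treebank, the same tally would be expected to yield the leaf, one-child, two-children and three-or-more buckets.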

Children of PROPN nodes are attached using 23 different relations: flat:name (4460; 28% instances), punct (3388; 21% instances), case (2616; 16% instances), det (1329; 8% instances), conj (1115; 7% instances), nmod (906; 6% instances), cc (490; 3% instances), parataxis (411; 3% instances), amod (370; 2% instances), appos (315; 2% instances), nummod (169; 1% instances), acl (100; 1% instances), cop (94; 1% instances), acl:relcl (90; 1% instances), mark (90; 1% instances), nsubj (89; 1% instances), advmod (26; 0% instances), obl (20; 0% instances), orphan (18; 0% instances), nmod:poss (12; 0% instances), advcl (4; 0% instances), xcomp (4; 0% instances), aux (3; 0% instances)

Children of PROPN nodes belong to 15 different parts of speech: PROPN (5782; 36% instances), PUNCT (3388; 21% instances), ADP (2712; 17% instances), DET (1381; 9% instances), NOUN (914; 6% instances), CCONJ (493; 3% instances), NUM (392; 2% instances), ADJ (302; 2% instances), SYM (206; 1% instances), VERB (169; 1% instances), ADV (154; 1% instances), AUX (97; 1% instances), SCONJ (54; 0% instances), X (42; 0% instances), PRON (33; 0% instances)