This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-LassySmall: POS Tags: PROPN

There are 7842 PROPN lemmas (28%), 8001 PROPN types (24%) and 30336 PROPN tokens (10%). Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 5 in number of tokens.

The 10 most frequent PROPN lemmas: van, de, België, Brussel, Duitsland, Vlaanderen, Wereldoorlog, Antwerpen, juni, Nederland

The 10 most frequent PROPN types: van, de, België, Brussel, Duitsland, Wereldoorlog, Vlaanderen, Antwerpen, juni, Frankrijk

The 10 most frequent ambiguous lemmas: van (ADP 9309, PROPN 340), de (DET 18975, PROPN 197, X 41), Nederland (PROPN 153, X 30), II (PROPN 131, NUM 2, X 2), Prince (PROPN 120, X 1), Dylan (PROPN 113, X 1), Vlaams (ADJ 244, PROPN 98), Tweede (PROPN 96, ADJ 1), november (PROPN 94, X 3), Eerste (PROPN 82, ADJ 3)

The 10 most frequent ambiguous types: van (ADP 9213, PROPN 340), de (DET 16356, PROPN 197, X 41), Wereldoorlog (PROPN 165, NOUN 1), II (PROPN 131, NUM 2, X 2), Verenigde (PROPN 120, VERB 2), Prince (PROPN 116, X 1), staten (NOUN 64, PROPN 1), Vlaams (PROPN 98, ADJ 44), Tweede (PROPN 96, ADJ 12), Dylan (PROPN 89, X 1)

Morphology

The form / lemma ratio of PROPN is 1.020275 (the average of all parts of speech is 1.223065).
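The ratio above is consistent with dividing the number of PROPN types by the number of PROPN lemmas reported at the top of this page. The short check below is only a sketch of that arithmetic; the variable names are illustrative and the assumption that "forms" here means distinct word types is inferred from the numbers, not stated by the page.

```python
# Counts copied from the summary at the top of this page.
propn_types = 8001    # distinct PROPN word forms
propn_lemmas = 7842   # distinct PROPN lemmas

# Assumed definition: form / lemma ratio = types / lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # prints 1.020275
```

A ratio this close to 1 means that most proper nouns in the treebank appear in a single form.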

The three lemmas listed with the highest number of forms have 3 forms each:

“Belga”: Belgae, belga, belga’s.

“België”: BELGIË, België, Belgiës.

“Bernini”: Berini, Bernini, Bernini’s.

PROPN occurs with 3 features: Number (15881; 52% instances), Gender (14502; 48% instances), ExtPos (195; 1% instances)

PROPN occurs with 6 feature-value pairs: ExtPos=PROPN, Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing

PROPN occurs with 11 feature combinations. The most frequent feature combination is _ (14455 tokens). Examples: van, de, Wereldoorlog, II, Verenigde, staten, Tweede, Vlaams, Eerste, I

Relations

PROPN nodes are attached to their parents using 23 different relations: flat (9729; 32% instances), nmod (5428; 18% instances), nsubj (3444; 11% instances), obl (2468; 8% instances), conj (2222; 7% instances), appos (2023; 7% instances), root (1573; 5% instances), parataxis (856; 3% instances), obj (722; 2% instances), obl:arg (455; 1% instances), nsubj:pass (441; 1% instances), nmod:poss (282; 1% instances), obl:agent (253; 1% instances), xcomp (116; 0% instances), advcl (107; 0% instances), iobj (100; 0% instances), acl (68; 0% instances), acl:relcl (20; 0% instances), orphan (17; 0% instances), ccomp (7; 0% instances), amod (2; 0% instances), nsubj:outer (2; 0% instances), csubj (1; 0% instances)
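Counts like these can in principle be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `propn_relation_counts` and the helper `row` are hypothetical, and the five-token sample sentence is invented for illustration. It relies only on the standard CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).

```python
from collections import Counter

def propn_relation_counts(conllu_text):
    """Count the DEPREL values of all tokens whose UPOS is PROPN."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence separators) and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular tokens: 10 columns and a plain integer ID
        # (this skips multiword ranges like 1-2 and empty nodes like 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == "PROPN":      # column 4: UPOS
            counts[cols[7]] += 1    # column 8: DEPREL
    return counts

def row(*fields):
    """Join fields with tabs to build one CoNLL-U token line."""
    return "\t".join(fields)

# Invented sample sentence, for illustration only.
SAMPLE = "\n".join([
    "# text = Jan woont in Den Haag",
    row("1", "Jan", "Jan", "PROPN", "_", "_", "2", "nsubj", "_", "_"),
    row("2", "woont", "wonen", "VERB", "_", "_", "0", "root", "_", "_"),
    row("3", "in", "in", "ADP", "_", "_", "4", "case", "_", "_"),
    row("4", "Den", "Den", "PROPN", "_", "_", "2", "obl", "_", "_"),
    row("5", "Haag", "Haag", "PROPN", "_", "_", "4", "flat", "_", "_"),
])

print(propn_relation_counts(SAMPLE).most_common())
```

Run over the full treebank, the same loop would yield one count per relation, which can then be turned into the percentages shown above by dividing by the total number of PROPN tokens.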

Parents of PROPN nodes belong to 12 different parts of speech: PROPN (11502; 38% instances), NOUN (7555; 25% instances), VERB (7453; 25% instances), no parent [root] (1573; 5% instances), NUM (1374; 5% instances), ADJ (441; 1% instances), DET (107; 0% instances), X (96; 0% instances), PRON (94; 0% instances), ADV (85; 0% instances), ADP (37; 0% instances), SYM (19; 0% instances)

13588 (45%) PROPN nodes are leaves.

6965 (23%) PROPN nodes have one child.

5184 (17%) PROPN nodes have two children.

4599 (15%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 30.
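The child-degree figures above (leaves, one child, two children, three or more) amount to a histogram over the number of dependents of each PROPN node. The sketch below shows one way such a histogram could be computed from CoNLL-U text. It is an illustration under assumptions rather than the tool behind this page: `propn_child_degrees` and `row` are hypothetical names, and the sample sentence is invented.

```python
from collections import Counter

def propn_child_degrees(conllu_text):
    """Map child count -> number of PROPN nodes with that many children."""
    histogram = Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        upos = {}               # token ID -> UPOS tag
        n_children = Counter()  # head ID -> number of dependents
        for line in block.splitlines():
            if line.startswith("#"):
                continue
            cols = line.split("\t")
            # Regular tokens only: 10 columns and a plain integer ID.
            if len(cols) != 10 or not cols[0].isdigit():
                continue
            upos[cols[0]] = cols[3]       # column 4: UPOS
            n_children[cols[6]] += 1      # column 7: HEAD
        for tok_id, tag in upos.items():
            if tag == "PROPN":
                # A Counter returns 0 for tokens that head nothing (leaves).
                histogram[n_children[tok_id]] += 1
    return histogram

def row(*fields):
    """Join fields with tabs to build one CoNLL-U token line."""
    return "\t".join(fields)

# Invented sample sentence, for illustration only.
SAMPLE = "\n".join([
    "# text = Jan woont in Den Haag",
    row("1", "Jan", "Jan", "PROPN", "_", "_", "2", "nsubj", "_", "_"),
    row("2", "woont", "wonen", "VERB", "_", "_", "0", "root", "_", "_"),
    row("3", "in", "in", "ADP", "_", "_", "4", "case", "_", "_"),
    row("4", "Den", "Den", "PROPN", "_", "_", "2", "obl", "_", "_"),
    row("5", "Haag", "Haag", "PROPN", "_", "_", "4", "flat", "_", "_"),
])

print(sorted(propn_child_degrees(SAMPLE).items()))  # prints [(0, 2), (2, 1)]
```

In the sample, "Jan" and "Haag" are leaves, while "Den" has two children ("in" and "Haag"). On this page the four reported categories sum to 13588 + 6965 + 5184 + 4599 = 30336, which matches the total number of PROPN tokens.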

Children of PROPN nodes are attached using 24 different relations: flat (8921; 25% instances), case (7714; 22% instances), punct (5146; 15% instances), det (4432; 13% instances), conj (2320; 7% instances), nmod (1642; 5% instances), cc (1284; 4% instances), amod (660; 2% instances), parataxis (532; 2% instances), appos (515; 1% instances), nummod (393; 1% instances), acl:relcl (331; 1% instances), advmod (296; 1% instances), mark (262; 1% instances), acl (240; 1% instances), nsubj (123; 0% instances), cop (122; 0% instances), nmod:poss (34; 0% instances), orphan (32; 0% instances), cc:preconj (26; 0% instances), obl (14; 0% instances), advcl (6; 0% instances), aux (5; 0% instances), ccomp (1; 0% instances)

Children of PROPN nodes belong to 15 different parts of speech: PROPN (11502; 33% instances), ADP (7776; 22% instances), PUNCT (5146; 15% instances), DET (4502; 13% instances), NOUN (1554; 4% instances), CCONJ (1348; 4% instances), ADJ (717; 2% instances), NUM (703; 2% instances), VERB (552; 2% instances), ADV (359; 1% instances), SYM (315; 1% instances), SCONJ (263; 1% instances), AUX (127; 0% instances), X (107; 0% instances), PRON (80; 0% instances)