
This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-LassySmall: POS Tags: NOUN

There are 10350 NOUN lemmas (38%), 12397 NOUN types (37%) and 49462 NOUN tokens (17%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
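
Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that types are distinct word forms compared case-sensitively (the page does not say whether forms are lowercased). `SAMPLE`, `iter_words` and `pos_counts` are hypothetical names, and the toy sentence is invented for illustration.

```python
# Toy CoNLL-U sentence, invented for illustration (not taken from the treebank).
SAMPLE = (
    "# text = Het jaar begint na twee jaren\n"
    "1\tHet\thet\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tjaar\tjaar\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbegint\tbeginnen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\tna\tna\tADP\t_\t_\t6\tcase\t_\t_\n"
    "5\ttwee\ttwee\tNUM\t_\t_\t6\tnummod\t_\t_\n"
    "6\tjaren\tjaar\tNOUN\t_\t_\t3\tobl\t_\t_\n"
)


def iter_words(conllu_text):
    """Yield the 10 columns of every regular word line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skips comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for cols in iter_words(conllu_text):
        if cols[3] == upos:  # column 4 is UPOS
            types.add(cols[1])   # column 2 is FORM (assumed case-sensitive)
            lemmas.add(cols[2])  # column 3 is LEMMA
            tokens += 1
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "NOUN"))  # (1, 2, 2)
```

In the toy sentence, jaar and jaren are two types of the single lemma jaar, so the result is one lemma, two types and two tokens.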

The 10 most frequent NOUN lemmas: jaar, land, partij, stad, tijd, oorlog, naam, deel, eeuw, plaats

The 10 most frequent NOUN types: jaar, oorlog, jaren, tijd, eeuw, stad, partij, deel, koning, naam

The 10 most frequent ambiguous lemmas: jaar (NOUN 661, PROPN 1), oorlog (NOUN 272, X 1), album (NOUN 164, X 1), tank (NOUN 150, X 1), leven (NOUN 109, VERB 53), dood (NOUN 80, ADJ 25), nummer (NOUN 80, X 25), rijk (NOUN 76, ADJ 37), weg (NOUN 76, ADV 36), uur (NOUN 67, X 6)

The 10 most frequent ambiguous types: jaar (NOUN 400, PROPN 1), oorlog (NOUN 240, X 1), begin (NOUN 132, VERB 2), landen (NOUN 127, VERB 3), album (NOUN 124, X 1), staat (NOUN 114, VERB 90), leven (NOUN 104, VERB 20), leden (NOUN 76, VERB 5), dood (NOUN 78, ADJ 8), rijk (NOUN 55, ADJ 10)
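
An item is ambiguous here when it occurs with NOUN and with at least one other tag. The sketch below shows one way such lists could be derived, assuming items are ranked by their NOUN frequency (the page does not state the ranking rule). `ambiguous_items` and the toy sentences are hypothetical.

```python
from collections import Counter, defaultdict

# Toy CoNLL-U sentences, invented for illustration (not taken from the treebank).
SAMPLE = (
    "1\tHet\thet\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tleven\tleven\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tWij\twij\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tleven\tleven\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tEen\teen\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tjaar\tjaar\tNOUN\t_\t_\t0\troot\t_\t_\n"
)

FORM, LEMMA, UPOS = 1, 2, 3  # zero-based CoNLL-U column indices


def ambiguous_items(conllu_text, upos, column=LEMMA):
    """List items seen with `upos` and at least one other tag."""
    tags_by_item = defaultdict(Counter)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            tags_by_item[cols[column]][cols[UPOS]] += 1
    ambiguous = [
        (item, dict(tags))
        for item, tags in tags_by_item.items()
        if upos in tags and len(tags) > 1
    ]
    # Assumed ranking: most frequent with `upos` first, ties broken alphabetically.
    ambiguous.sort(key=lambda pair: (-pair[1][upos], pair[0]))
    return ambiguous


print(ambiguous_items(SAMPLE, "NOUN"))  # [('leven', {'NOUN': 1, 'VERB': 1})]
```

Passing `column=FORM` gives the per-type variant of the same list.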

Morphology

The form / lemma ratio of NOUN is 1.197778 (the average of all parts of speech is 1.223407).
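
The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. That the page computes it exactly this way is an assumption, and `form_lemma_ratio` is a hypothetical helper.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


# Counts reported on this page for NOUN.
ratio = form_lemma_ratio(12397, 10350)
print(f"{ratio:.6f}")  # 1.197778
```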

The highest number of forms (5) was observed with the lemma “been”: been, beenderen, beentje, beentjes, benen.

The second-ranked lemma, “land”, also has 5 forms: land, lande, landen, landje, lands.

The third-ranked lemma, “stuk”, also has 5 forms: stuk, stukje, stukjes, stukken, stuks.

NOUN occurs with 2 features: Number (49462; 100% instances), Gender (36680; 74% instances)

NOUN occurs with 5 feature-value pairs: Gender=Com; Gender=Com,Neut; Gender=Neut; Number=Plur; Number=Sing

NOUN occurs with 5 feature combinations. The most frequent feature combination is Gender=Com|Number=Sing (24994 tokens). Examples: oorlog, tijd, eeuw, stad, partij, koning, naam, plaats, film, regering

Relations

NOUN nodes are attached to their parents using 26 different relations: nmod (9455; 19% instances), obl (8522; 17% instances), nsubj (7280; 15% instances), obj (6251; 13% instances), conj (3865; 8% instances), root (2927; 6% instances), obl:arg (2546; 5% instances), nsubj:pass (2022; 4% instances), fixed (1766; 4% instances), appos (959; 2% instances), xcomp (917; 2% instances), parataxis (848; 2% instances), advcl (653; 1% instances), obl:agent (496; 1% instances), compound:prt (222; 0% instances), flat (205; 0% instances), iobj (105; 0% instances), amod (89; 0% instances), ccomp (79; 0% instances), acl:relcl (78; 0% instances), acl (66; 0% instances), orphan (56; 0% instances), case (27; 0% instances), csubj (20; 0% instances), nsubj:outer (6; 0% instances), nmod:poss (2; 0% instances)

Parents of NOUN nodes belong to 15 different parts of speech (counting the artificial root, which has no part of speech, as one): VERB (27349; 55% instances), NOUN (12922; 26% instances), no parent, i.e. the root (2927; 6% instances), ADJ (1873; 4% instances), PROPN (1539; 3% instances), ADP (1049; 2% instances), NUM (541; 1% instances), DET (343; 1% instances), PRON (294; 1% instances), ADV (249; 1% instances), X (242; 0% instances), SYM (108; 0% instances), SCONJ (22; 0% instances), INTJ (3; 0% instances), CCONJ (1; 0% instances)
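
Both distributions above (incoming relation and parent part of speech) can be collected in one pass over a CoNLL-U file. This is a minimal sketch under stated assumptions: `attachment_profile` is a hypothetical helper, the label "ROOT" for nodes whose HEAD is 0 is chosen for this sketch only, and the toy sentences are invented.

```python
from collections import Counter

# Toy CoNLL-U sentences, invented for illustration (not taken from the treebank).
SAMPLE = (
    "1\tHet\thet\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tjaar\tjaar\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbegint\tbeginnen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\tna\tna\tADP\t_\t_\t6\tcase\t_\t_\n"
    "5\ttwee\ttwee\tNUM\t_\t_\t6\tnummod\t_\t_\n"
    "6\tjaren\tjaar\tNOUN\t_\t_\t3\tobl\t_\t_\n"
    "\n"
    "1\tEen\teen\tDET\t_\t_\t3\tdet\t_\t_\n"
    "2\tmooi\tmooi\tADJ\t_\t_\t3\tamod\t_\t_\n"
    "3\tjaar\tjaar\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def attachment_profile(conllu_text, upos="NOUN"):
    """Count DEPREL values and parent UPOS tags for nodes tagged `upos`."""
    deprels, parent_pos = Counter(), Counter()
    # Word IDs restart in every sentence, so process one sentence at a time.
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        pos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] != upos:
                continue
            deprels[r[7]] += 1  # column 8 is DEPREL
            # HEAD 0 marks the sentence root, which has no parent word.
            parent_pos["ROOT" if r[6] == "0" else pos_by_id[r[6]]] += 1
    return deprels, parent_pos


rels, parents = attachment_profile(SAMPLE)
print(dict(rels))     # {'nsubj': 1, 'obl': 1, 'root': 1}
print(dict(parents))  # {'VERB': 2, 'ROOT': 1}
```

The number of nodes without a parent equals the number of nodes attached with the root relation, which matches the figures on this page (2927 in both lists).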

4183 (8%) NOUN nodes are leaves.

11967 (24%) NOUN nodes have one child.

15336 (31%) NOUN nodes have two children.

17976 (36%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 17.
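
The child-degree figures above can be obtained by counting, per sentence, how many words name each node as their HEAD. The sketch below assumes the same four buckets as this page (0, 1, 2, 3 or more). `child_degree_histogram` is a hypothetical helper and the toy sentence is invented.

```python
from collections import Counter

# Toy CoNLL-U sentence, invented for illustration (not taken from the treebank).
SAMPLE = (
    "1\tHet\thet\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tjaar\tjaar\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbegint\tbeginnen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\tna\tna\tADP\t_\t_\t6\tcase\t_\t_\n"
    "5\ttwee\ttwee\tNUM\t_\t_\t6\tnummod\t_\t_\n"
    "6\tjaren\tjaar\tNOUN\t_\t_\t3\tobl\t_\t_\n"
)


def child_degree_histogram(conllu_text, upos="NOUN"):
    """Return (histogram over '0', '1', '2', '3+', highest child degree)."""
    hist = Counter()
    max_degree = 0
    # Word IDs restart in every sentence, so process one sentence at a time.
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        n_children = Counter(r[6] for r in rows)  # column 7 is HEAD
        for r in rows:
            if r[3] != upos:
                continue
            degree = n_children[r[0]]
            max_degree = max(max_degree, degree)
            hist["3+" if degree >= 3 else str(degree)] += 1
    return hist, max_degree


hist, max_degree = child_degree_histogram(SAMPLE)
print(dict(hist), max_degree)  # {'1': 1, '2': 1} 2
```

As a consistency check on the reported figures, the four buckets sum to the NOUN token count: 4183 + 11967 + 15336 + 17976 = 49462.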

Children of NOUN nodes are attached using 34 different relations: det (28286; 26% instances), case (19953; 18% instances), amod (15198; 14% instances), nmod (14280; 13% instances), punct (7576; 7% instances), conj (3639; 3% instances), appos (3000; 3% instances), cc (2944; 3% instances), nmod:poss (2628; 2% instances), acl:relcl (2056; 2% instances), nummod (1804; 2% instances), cop (1546; 1% instances), nsubj (1420; 1% instances), mark (1346; 1% instances), acl (1300; 1% instances), parataxis (551; 1% instances), advmod (507; 0% instances), fixed (442; 0% instances), obl (433; 0% instances), flat (412; 0% instances), csubj (113; 0% instances), advcl (102; 0% instances), orphan (79; 0% instances), aux (62; 0% instances), expl (45; 0% instances), cc:preconj (37; 0% instances), obl:arg (19; 0% instances), ccomp (14; 0% instances), obj (5; 0% instances), iobj (2; 0% instances), aux:pass (1; 0% instances), compound:prt (1; 0% instances), nsubj:outer (1; 0% instances), xcomp (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: DET (28425; 26% instances), ADP (20293; 18% instances), ADJ (13617; 12% instances), NOUN (12922; 12% instances), PUNCT (7576; 7% instances), PROPN (7535; 7% instances), VERB (4962; 5% instances), PRON (3112; 3% instances), CCONJ (2974; 3% instances), NUM (2836; 3% instances), ADV (1846; 2% instances), AUX (1609; 1% instances), SCONJ (1266; 1% instances), X (548; 0% instances), SYM (280; 0% instances), INTJ (2; 0% instances)