
This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-LassySmall: POS Tags: NOUN

There are 10351 NOUN lemmas (38%), 12396 NOUN types (37%) and 49460 NOUN tokens (17%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
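As a rough illustration of how lemma, type and token counts like these could be reproduced, the sketch below tallies them for one UPOS tag from CoNLL-U text. It assumes that a "type" is a distinct word form compared case-sensitively, which this page does not state. The function name `count_upos` and the sample rows are invented for the example and are not taken from the treebank.

```python
# Hypothetical helper, not the script that generated this page.
# Assumption: a "type" is a distinct word form, compared case-sensitively.
def count_upos(conllu_text, upos):
    """Return (tokens, types, lemmas) for one UPOS tag in CoNLL-U text."""
    tokens = 0
    types = set()
    lemmas = set()
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        token_id, form, lemma, tag = cols[0], cols[1], cols[2], cols[3]
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in token_id or "." in token_id:
            continue
        if tag == upos:
            tokens += 1
            types.add(form)
            lemmas.add(lemma)
    return tokens, len(types), len(lemmas)


# Made-up rows in the 10-column CoNLL-U layout (illustrative only).
SAMPLE_ROWS = [
    ["1", "jaren", "jaar", "NOUN", "_", "Number=Plur", "0", "root", "_", "_"],
    ["2", "na", "na", "ADP", "_", "_", "4", "case", "_", "_"],
    ["3", "het", "het", "DET", "_", "_", "4", "det", "_", "_"],
    ["4", "jaar", "jaar", "NOUN", "_", "Gender=Neut|Number=Sing", "1", "nmod", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

print(count_upos(SAMPLE, "NOUN"))  # (2, 2, 1)
```

In the sample, the two NOUN tokens have two distinct forms (jaren, jaar) but share a single lemma (jaar).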

The 10 most frequent NOUN lemmas: jaar, land, partij, stad, tijd, oorlog, naam, deel, eeuw, plaats

The 10 most frequent NOUN types: jaar, oorlog, jaren, tijd, eeuw, stad, partij, deel, koning, naam

The 10 most frequent ambiguous lemmas: jaar (NOUN 661, PROPN 1), oorlog (NOUN 272, X 1), album (NOUN 164, X 1), tank (NOUN 150, X 1), leven (NOUN 109, VERB 53), nummer (NOUN 80, X 25), dood (NOUN 79, ADJ 26), rijk (NOUN 76, ADJ 37), weg (NOUN 75, ADV 36, ADJ 1), uur (NOUN 67, X 6)

The 10 most frequent ambiguous types: jaar (NOUN 400, PROPN 1), oorlog (NOUN 240, X 1), begin (NOUN 132, VERB 2), landen (NOUN 127, VERB 3), album (NOUN 124, X 1), staat (NOUN 114, VERB 90), leven (NOUN 104, VERB 20), leden (NOUN 76, VERB 5), dood (NOUN 77, ADJ 9), rijk (NOUN 55, ADJ 10)

Morphology

The form / lemma ratio of NOUN is 1.197565 (the average of all parts of speech is 1.223065).
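The reported ratio matches the number of NOUN types divided by the number of NOUN lemmas given at the top of this page. This reading is inferred from the figures and is not stated by the source. A quick check under that assumption:

```python
# Counts copied from the summary at the top of this page.
noun_types = 12396
noun_lemmas = 10351

# Assumption: form / lemma ratio = distinct forms (types) / distinct lemmas.
form_lemma_ratio = noun_types / noun_lemmas
print(round(form_lemma_ratio, 6))  # 1.197565
```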

The 1st highest number of forms (5) was observed with the lemma “been”: been, beenderen, beentje, beentjes, benen.

The 2nd highest number of forms (5) was observed with the lemma “land”: land, lande, landen, landje, lands.

The 3rd highest number of forms (5) was observed with the lemma “stuk”: stuk, stukje, stukjes, stukken, stuks.

NOUN occurs with 3 features: Number (49460; 100% instances), Gender (36679; 74% instances), ExtPos (331; 1% instances)

NOUN occurs with 8 feature-value pairs: ExtPos=ADP, ExtPos=ADV, ExtPos=PROPN, Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing

NOUN occurs with 12 feature combinations. The most frequent feature combination is Gender=Com|Number=Sing (24816 tokens). Examples: oorlog, tijd, eeuw, stad, partij, naam, koning, plaats, film, regering
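Feature combinations such as Gender=Com|Number=Sing follow the CoNLL-U FEATS layout: pairs are separated by `|`, and a feature with several values lists them separated by commas, as in Gender=Com,Neut. A minimal sketch of splitting such a string follows; the helper name `parse_feats` is hypothetical.

```python
# Hypothetical helper for splitting a CoNLL-U FEATS string.
def parse_feats(feats):
    """Map each feature name to its list of values; "_" means no features."""
    if feats == "_":
        return {}
    result = {}
    for pair in feats.split("|"):
        name, _, values = pair.partition("=")
        # A feature may carry several comma-separated values.
        result[name] = values.split(",")
    return result


print(parse_feats("Gender=Com|Number=Sing"))
# {'Gender': ['Com'], 'Number': ['Sing']}
print(parse_feats("Gender=Com,Neut|Number=Sing"))
# {'Gender': ['Com', 'Neut'], 'Number': ['Sing']}
```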

Relations

NOUN nodes are attached to their parents using 26 different relations: nmod (9486; 19% instances), obl (8532; 17% instances), nsubj (7276; 15% instances), obj (6238; 13% instances), conj (3833; 8% instances), root (2925; 6% instances), obl:arg (2531; 5% instances), nsubj:pass (2031; 4% instances), fixed (1193; 2% instances), appos (956; 2% instances), xcomp (919; 2% instances), parataxis (854; 2% instances), flat (780; 2% instances), advcl (654; 1% instances), obl:agent (510; 1% instances), compound:prt (219; 0% instances), iobj (107; 0% instances), amod (83; 0% instances), acl:relcl (78; 0% instances), ccomp (78; 0% instances), acl (65; 0% instances), orphan (57; 0% instances), case (27; 0% instances), csubj (20; 0% instances), nsubj:outer (6; 0% instances), nmod:poss (2; 0% instances)

Parents of NOUN nodes fall into 15 categories, counting the artificial root (which has no part of speech) as one: VERB (27359; 55% instances), NOUN (12912; 26% instances), root, no part of speech (2925; 6% instances), ADJ (1879; 4% instances), PROPN (1556; 3% instances), ADP (1045; 2% instances), NUM (526; 1% instances), DET (340; 1% instances), PRON (293; 1% instances), ADV (253; 1% instances), X (240; 0% instances), SYM (106; 0% instances), SCONJ (22; 0% instances), INTJ (3; 0% instances), CCONJ (1; 0% instances)

4178 (8%) NOUN nodes are leaves.

11977 (24%) NOUN nodes have one child.

15315 (31%) NOUN nodes have two children.

17990 (36%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 17.
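The four child-count figures above partition all NOUN tokens, so they should sum to the token total, and each percentage is the rounded share of that total. The check below uses only numbers from this page; the variable names are illustrative.

```python
# Child-degree counts for NOUN nodes, copied from this page.
degree_counts = {"0": 4178, "1": 11977, "2": 15315, "3+": 17990}

total = sum(degree_counts.values())
print(total)  # 49460, the number of NOUN tokens

# Share of each degree bucket, rounded to whole percent.
shares = {k: round(100 * v / total) for k, v in degree_counts.items()}
print(shares)  # {'0': 8, '1': 24, '2': 31, '3+': 36}
```

Because each share is rounded independently, the percentages add up to 99 rather than 100.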

Children of NOUN nodes are attached using 34 different relations: det (28274; 26% instances), case (19953; 18% instances), amod (15181; 14% instances), nmod (14286; 13% instances), punct (7583; 7% instances), conj (3623; 3% instances), appos (2984; 3% instances), cc (2948; 3% instances), nmod:poss (2628; 2% instances), acl:relcl (2052; 2% instances), nummod (1801; 2% instances), cop (1548; 1% instances), nsubj (1418; 1% instances), mark (1349; 1% instances), acl (1300; 1% instances), flat (747; 1% instances), parataxis (571; 1% instances), advmod (514; 0% instances), obl (435; 0% instances), csubj (116; 0% instances), fixed (116; 0% instances), advcl (106; 0% instances), orphan (82; 0% instances), aux (63; 0% instances), expl (45; 0% instances), cc:preconj (37; 0% instances), obl:arg (19; 0% instances), ccomp (14; 0% instances), obj (4; 0% instances), iobj (2; 0% instances), aux:pass (1; 0% instances), compound:prt (1; 0% instances), nsubj:outer (1; 0% instances), xcomp (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: DET (28414; 26% instances), ADP (20294; 18% instances), ADJ (13612; 12% instances), NOUN (12912; 12% instances), PUNCT (7583; 7% instances), PROPN (7539; 7% instances), VERB (4963; 5% instances), PRON (3112; 3% instances), CCONJ (2976; 3% instances), NUM (2844; 3% instances), ADV (1843; 2% instances), AUX (1612; 1% instances), SCONJ (1268; 1% instances), X (548; 0% instances), SYM (281; 0% instances), INTJ (2; 0% instances)