
This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-LassySmall: POS Tags: NOUN

There are 10350 NOUN lemmas (38%), 12396 NOUN types (37%) and 49469 NOUN tokens (17%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
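Counts of this kind can be reproduced from the treebank's CoNLL-U files. The sketch below is a minimal illustration, not the script that generated this page. It assumes the standard 10-column CoNLL-U layout and treats types as case-sensitive word forms (an assumption). The sample data is made up, and the function name `count_upos` is hypothetical.

```python
from collections import defaultdict

# Tiny made-up CoNLL-U sample (not taken from UD_Dutch-LassySmall).
SAMPLE = """\
1\tjaren\tjaar\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tkomen\tkomen\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\tHet\thet\tDET\t_\t_\t2\tdet\t_\t_
2\tjaar\tjaar\tNOUN\t_\tNumber=Sing\t0\troot\t_\t_
"""


def count_upos(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for basic word lines."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: types are case-sensitive forms
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


stats = count_upos(SAMPLE)
print(stats["NOUN"])  # (1, 2, 2): one lemma, two types, two tokens
```

Ranking the tags by any of the three numbers then gives the ranks quoted above.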

The 10 most frequent NOUN lemmas: jaar, land, partij, stad, tijd, oorlog, naam, deel, eeuw, plaats

The 10 most frequent NOUN types: jaar, oorlog, jaren, tijd, eeuw, stad, partij, deel, koning, naam

The 10 most frequent ambiguous lemmas: jaar (NOUN 661, PROPN 1), oorlog (NOUN 272, X 1), album (NOUN 164, X 1), tank (NOUN 150, X 1), leven (NOUN 109, VERB 53), nummer (NOUN 80, X 25), dood (NOUN 79, ADJ 26), rijk (NOUN 76, ADJ 37), weg (NOUN 75, ADV 36, ADJ 1), uur (NOUN 67, X 6)

The 10 most frequent ambiguous types: jaar (NOUN 400, PROPN 1), oorlog (NOUN 240, X 1), begin (NOUN 132, VERB 2), landen (NOUN 127, VERB 3), album (NOUN 124, X 1), staat (NOUN 114, VERB 90), leven (NOUN 104, VERB 20), leden (NOUN 76, VERB 5), dood (NOUN 77, ADJ 9), rijk (NOUN 55, ADJ 10)

Morphology

The form / lemma ratio of NOUN is 1.197681 (the average of all parts of speech is 1.223065).
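The NOUN figure is consistent with dividing the number of NOUN types by the number of NOUN lemmas reported above. That reading is inferred from the numbers and is not stated on the page. A quick check:

```python
noun_types = 12396   # NOUN types reported on this page
noun_lemmas = 10350  # NOUN lemmas reported on this page

# Assumed definition: forms per lemma = distinct forms / distinct lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.197681
```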

The highest number of forms observed for a single NOUN lemma is 5. The three top-ranked lemmas each have 5 forms:

“been”: been, beenderen, beentje, beentjes, benen.

“land”: land, lande, landen, landje, lands.

“stuk”: stuk, stukje, stukjes, stukken, stuks.

NOUN occurs with 3 features: Number (49469; 100% instances), Gender (36690; 74% instances), ExtPos (342; 1% instances)

NOUN occurs with 8 feature-value pairs: ExtPos=ADP, ExtPos=ADV, ExtPos=PROPN, Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing

NOUN occurs with 12 feature combinations. The most frequent feature combination is Gender=Com|Number=Sing (24801 tokens). Examples: oorlog, tijd, eeuw, stad, partij, naam, koning, plaats, film, regering
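The three feature counts above (features, feature-value pairs, feature combinations) can be derived from the FEATS column. The following is a minimal sketch on made-up rows, not the generator used for this page. The helper name `feature_stats` is hypothetical, and a multi-valued pair such as Gender=Com,Neut is assumed to count as a single pair, as in the list above.

```python
from collections import Counter

# Made-up (UPOS, FEATS) pairs for illustration; not real treebank rows.
ROWS = [
    ("NOUN", "Gender=Com|Number=Sing"),
    ("NOUN", "Gender=Com|Number=Sing"),
    ("NOUN", "Gender=Neut|Number=Plur"),
    ("NOUN", "Number=Plur"),
    ("VERB", "VerbForm=Inf"),
]


def feature_stats(rows, upos):
    """Count features, feature-value pairs and full combinations for one UPOS."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for tag, feats in rows:
        if tag != upos:
            continue
        combos[feats] += 1        # the whole FEATS string is one combination
        if feats == "_":          # "_" marks an empty FEATS column
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1      # Gender=Com,Neut would stay one pair
    return features, pairs, combos


features, pairs, combos = feature_stats(ROWS, "NOUN")
print(features["Number"], features["Gender"])  # 4 3
print(len(pairs), len(combos))                 # 4 3
print(combos.most_common(1))                   # [('Gender=Com|Number=Sing', 2)]
```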

Relations

NOUN nodes are attached to their parents using 26 different relations: nmod (9736; 20% instances), obl (8401; 17% instances), nsubj (7280; 15% instances), obj (6248; 13% instances), conj (3811; 8% instances), root (2929; 6% instances), obl:arg (2524; 5% instances), nsubj:pass (2031; 4% instances), fixed (1241; 3% instances), appos (953; 2% instances), xcomp (921; 2% instances), parataxis (870; 2% instances), flat (717; 1% instances), advcl (641; 1% instances), obl:agent (510; 1% instances), compound:prt (219; 0% instances), iobj (109; 0% instances), ccomp (83; 0% instances), acl:relcl (78; 0% instances), acl (63; 0% instances), orphan (45; 0% instances), case (27; 0% instances), csubj (20; 0% instances), nsubj:outer (6; 0% instances), amod (4; 0% instances), nmod:poss (2; 0% instances)

Parents of NOUN nodes belong to 14 different parts of speech: VERB (27372; 55% instances), NOUN (12900; 26% instances), no parent, i.e. sentence root (2929; 6% instances), ADJ (1887; 4% instances), PROPN (1554; 3% instances), ADP (1039; 2% instances), NUM (528; 1% instances), DET (330; 1% instances), PRON (302; 1% instances), ADV (257; 1% instances), X (240; 0% instances), SYM (106; 0% instances), SCONJ (22; 0% instances), INTJ (3; 0% instances)

4164 (8%) NOUN nodes are leaves.

11988 (24%) NOUN nodes have one child.

15326 (31%) NOUN nodes have two children.

17991 (36%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 17.
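The four child-count groups above partition the NOUN tokens: 4164 + 11988 + 15326 + 17991 = 49469. The sketch below shows one way to compute such a breakdown from the HEAD column. It uses a made-up sentence, and the function name `child_degree_histogram` is hypothetical; it is not the script behind this page.

```python
from collections import Counter

# Made-up sentence as (ID, UPOS, HEAD) triples; HEAD 0 is the artificial root.
SENTENCE = [
    (1, "DET", 3),
    (2, "ADJ", 3),
    (3, "NOUN", 4),
    (4, "VERB", 0),
    (5, "NOUN", 4),
]


def child_degree_histogram(sentence, upos):
    """Bucket nodes of one UPOS by number of children: 0, 1, 2, or 3+."""
    # Each occurrence of an ID in the HEAD column is one child of that node.
    n_children = Counter(head for _id, _tag, head in sentence)
    buckets = Counter()
    for node_id, tag, _head in sentence:
        if tag == upos:
            buckets[min(n_children[node_id], 3)] += 1  # key 3 means "3 or more"
    return buckets


hist = child_degree_histogram(SENTENCE, "NOUN")
print(hist[0], hist[1], hist[2], hist[3])  # 1 0 1 0

# Sanity check on the page's own figures: the buckets cover all NOUN tokens.
assert 4164 + 11988 + 15326 + 17991 == 49469
```

On a full treebank the function would be applied per sentence and the histograms summed, since node IDs restart in every sentence.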

Children of NOUN nodes are attached using 34 different relations: det (28287; 26% instances), case (19963; 18% instances), amod (14193; 13% instances), nmod (14179; 13% instances), punct (7586; 7% instances), conj (3605; 3% instances), appos (2983; 3% instances), cc (2947; 3% instances), nmod:poss (2629; 2% instances), acl:relcl (2047; 2% instances), nummod (1800; 2% instances), advmod (1718; 2% instances), cop (1553; 1% instances), nsubj (1414; 1% instances), mark (1341; 1% instances), acl (1303; 1% instances), flat (797; 1% instances), parataxis (573; 1% instances), obl (318; 0% instances), fixed (134; 0% instances), csubj (124; 0% instances), advcl (99; 0% instances), orphan (68; 0% instances), aux (62; 0% instances), expl (44; 0% instances), cc:preconj (38; 0% instances), obl:arg (20; 0% instances), ccomp (14; 0% instances), obj (3; 0% instances), aux:pass (2; 0% instances), iobj (2; 0% instances), xcomp (2; 0% instances), compound:prt (1; 0% instances), nsubj:outer (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: DET (28424; 26% instances), ADP (20332; 19% instances), ADJ (13615; 12% instances), NOUN (12900; 12% instances), PUNCT (7586; 7% instances), PROPN (7555; 7% instances), VERB (4961; 5% instances), PRON (3111; 3% instances), CCONJ (2933; 3% instances), NUM (2845; 3% instances), ADV (1881; 2% instances), AUX (1617; 1% instances), SCONJ (1260; 1% instances), X (548; 0% instances), SYM (280; 0% instances), INTJ (2; 0% instances)