
This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-LassySmall: POS Tags: X

There are 1592 X lemmas (6%), 1609 X types (5%) and 3282 X tokens (1%). Out of 16 observed tags, the rank of X is: 5 in number of lemmas, 5 in number of types and 14 in number of tokens.

The 10 most frequent X lemmas: the, of, de, Star, Trek, onder ander, Army, les, Nederland, circa

The 10 most frequent X types: the, of, de, Star, Trek, o.a., Army, les, la, in

The 10 most frequent ambiguous lemmas: the (X 72, PROPN 2), of (CCONJ 472, X 74, SCONJ 34, PROPN 22), de (DET 18976, PROPN 197, X 41), Star (X 40, PROPN 7), les (NOUN 6, X 6), Nederland (PROPN 153, X 30), circa (X 29, ADV 11, DET 3), la (PROPN 16, X 15), in (ADP 7144, X 15, PROPN 5), nummer (NOUN 80, X 25)

The 10 most frequent ambiguous types: the (X 72, PROPN 2), of (CCONJ 462, X 74, SCONJ 31, PROPN 22), de (DET 16356, PROPN 197, X 41), Star (X 40, PROPN 7), les (X 6, NOUN 3), la (PROPN 16, X 15), in (ADP 5997, X 15, PROPN 5), ca. (X 24, DET 2), Potomac (X 19, PROPN 7), and (X 11, PROPN 3)

Morphology

The form / lemma ratio of X is 1.010678 (the average of all parts of speech is 1.223065).
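The page does not state how this ratio is derived, but the figure is consistent with dividing the number of distinct X types by the number of distinct X lemmas reported above: 1609 / 1592 ≈ 1.010678. A minimal sketch of that calculation follows, assuming tokens are available as (form, lemma, UPOS) triples. The function name and the sample data are hypothetical, not part of the treebank tooling.

```python
def form_lemma_ratio(tokens, upos):
    """Return distinct forms / distinct lemmas for one part of speech.

    tokens: iterable of (form, lemma, upos) triples (assumed input shape).
    """
    forms = set()
    lemmas = set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    if not lemmas:
        raise ValueError(f"no tokens tagged {upos}")
    return len(forms) / len(lemmas)


# Hypothetical toy data: 4 distinct X forms over 3 distinct X lemmas.
sample = [
    ("b.v.", "bijvoorbeeld", "X"),
    ("bijv.", "bijvoorbeeld", "X"),
    ("nr.", "nummer", "X"),
    ("the", "the", "X"),
    ("de", "de", "DET"),
]
ratio = form_lemma_ratio(sample, "X")

# The counts reported on this page reproduce the published ratio.
page_ratio = round(1609 / 1592, 6)
```

Whether the original statistics treat forms as case-sensitive is not stated on this page, so the sketch compares forms exactly as given.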

The 1st highest number of forms (5) was observed with the lemma “bijvoorbeeld”: b.v., bijv, bijv., bv, bv..

The 2nd highest number of forms (3) was observed with the lemma “nummer”: nr, nr., nrs.

The 3rd highest number of forms (3) was observed with the lemma “onder ander”: o.a., oa, oa..

X occurs with 3 features: Foreign (2846; 87% instances), ExtPos (634; 19% instances), Abbr (259; 8% instances)

X occurs with 7 feature-value pairs: Abbr=Yes, ExtPos=ADP, ExtPos=ADV, ExtPos=CCONJ, ExtPos=PRON, ExtPos=PROPN, Foreign=Yes

X occurs with 10 feature combinations. The most frequent feature combination is Foreign=Yes (2217 tokens). Examples: the, of, de, Trek, in, Army, les, Star, Potomac, la

Relations

X nodes are attached to their parents using 24 different relations: flat (1796; 55% instances), nmod (373; 11% instances), appos (174; 5% instances), conj (146; 4% instances), fixed (130; 4% instances), root (121; 4% instances), nsubj (120; 4% instances), obl (104; 3% instances), obj (73; 2% instances), parataxis (67; 2% instances), obl:arg (36; 1% instances), nsubj:pass (29; 1% instances), acl (27; 1% instances), mark (18; 1% instances), xcomp (18; 1% instances), advcl (15; 0% instances), case (14; 0% instances), obl:agent (7; 0% instances), amod (6; 0% instances), acl:relcl (3; 0% instances), ccomp (2; 0% instances), csubj (1; 0% instances), iobj (1; 0% instances), orphan (1; 0% instances)

Parents of X nodes belong to 12 different parts of speech: X (1912; 58% instances), NOUN (548; 17% instances), VERB (391; 12% instances), no part of speech, i.e. the artificial root (121; 4% instances), PROPN (108; 3% instances), SYM (83; 3% instances), NUM (70; 2% instances), ADJ (31; 1% instances), DET (9; 0% instances), ADV (4; 0% instances), PRON (4; 0% instances), ADP (1; 0% instances)
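The counts and percentages in the two lists above can be reproduced with a simple tally. The sketch below assumes each token is reduced to a (UPOS, relation) pair and that percentages are rounded to the nearest integer. The function name and sample data are hypothetical, and the rounding rule used by the original statistics is an assumption.

```python
from collections import Counter


def relation_shares(tokens, upos):
    """Tally incoming relations for one part of speech.

    tokens: iterable of (upos, deprel) pairs (assumed input shape).
    Returns a list of (deprel, count, percent) sorted by descending count.
    """
    counts = Counter(deprel for tag, deprel in tokens if tag == upos)
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Hypothetical toy data: four X tokens, one DET token.
sample = [
    ("X", "flat"),
    ("X", "flat"),
    ("X", "nmod"),
    ("DET", "det"),
    ("X", "flat"),
]
shares = relation_shares(sample, "X")

# Spot check against figures on this page: flat is 1796 of 3282 X tokens.
flat_percent = round(100 * 1796 / 3282)
```

The same tally works for the parent part-of-speech list if the second element of each pair is the parent's UPOS instead of the relation.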

2113 (64%) X nodes are leaves.

184 (6%) X nodes have one child.

231 (7%) X nodes have two children.

754 (23%) X nodes have three or more children.

The highest child degree of an X node is 59.
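The four child-count categories above partition all X tokens: 2113 + 184 + 231 + 754 = 3282. A minimal sketch of how such a breakdown could be computed for one sentence follows, assuming tokens are given as (id, head, UPOS) triples with head 0 marking the root. The function name, bucket labels, and sample sentence are hypothetical.

```python
from collections import Counter


def child_degree_buckets(sentence, upos):
    """Bucket nodes of one part of speech by their number of children.

    sentence: list of (id, head, upos) triples; head 0 means the root
    (assumed input shape, mirroring the ID and HEAD columns of CoNLL-U).
    """
    # Number of children per node = how often its id appears as a head.
    degree = Counter(head for _id, head, _tag in sentence)
    buckets = Counter()
    for tok_id, _head, tag in sentence:
        if tag != upos:
            continue
        n = degree[tok_id]
        if n == 0:
            buckets["leaf"] += 1
        elif n == 1:
            buckets["one"] += 1
        elif n == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets


# Hypothetical sentence: node 1 has three children, node 4 has one.
sentence = [
    (1, 0, "X"),
    (2, 1, "X"),
    (3, 1, "PUNCT"),
    (4, 1, "X"),
    (5, 4, "X"),
]
buckets = child_degree_buckets(sentence, "X")
```

Summing the buckets over every sentence in a treebank would give totals comparable to the four figures listed above.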

Children of X nodes are attached using 27 different relations: flat (1833; 42% instances), punct (974; 22% instances), case (300; 7% instances), det (279; 6% instances), conj (242; 6% instances), nmod (131; 3% instances), amod (103; 2% instances), cc (80; 2% instances), fixed (79; 2% instances), parataxis (75; 2% instances), appos (60; 1% instances), acl (36; 1% instances), mark (32; 1% instances), cop (29; 1% instances), nsubj (28; 1% instances), nummod (28; 1% instances), acl:relcl (27; 1% instances), nmod:poss (24; 1% instances), advmod (10; 0% instances), cc:preconj (6; 0% instances), advcl (4; 0% instances), obl (3; 0% instances), orphan (2; 0% instances), aux:pass (1; 0% instances), ccomp (1; 0% instances), nsubj:pass (1; 0% instances), obj (1; 0% instances)

Children of X nodes belong to 15 different parts of speech: X (1912; 44% instances), PUNCT (974; 22% instances), ADP (303; 7% instances), DET (282; 6% instances), NOUN (240; 5% instances), SYM (131; 3% instances), ADJ (103; 2% instances), PROPN (96; 2% instances), CCONJ (95; 2% instances), VERB (74; 2% instances), NUM (66; 2% instances), ADV (30; 1% instances), AUX (30; 1% instances), SCONJ (29; 1% instances), PRON (24; 1% instances)