
This page pertains to UD version 2.

Treebank Statistics: UD_German-HDT: POS Tags: NUM

There are 6500 NUM lemmas (8%), 6500 NUM types (3%) and 71231 NUM tokens (2%). Out of 16 observed tags, NUM ranks 5th in number of lemmas, 6th in number of types and 12th in number of tokens.
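Counts of this kind can be reproduced from a treebank's CoNLL-U files. The following is a minimal sketch, assuming that "types" means distinct word forms exactly as written and that multiword-token ranges and empty nodes are skipped; the helper names (`parse_conllu`, `pos_counts`, `row`) and the toy sentence are hypothetical and are not part of the UD tooling or of UD_German-HDT.

```python
def parse_conllu(text):
    """Yield one dict per basic token; skips comments, blank lines,
    multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1)."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if cols[0].isdigit():
            yield {"form": cols[1], "lemma": cols[2], "upos": cols[3]}


def pos_counts(text, upos):
    """Return (distinct lemmas, distinct forms, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for tok in parse_conllu(text):
        if tok["upos"] == upos:
            lemmas.add(tok["lemma"])
            types.add(tok["form"])
            tokens += 1
    return len(lemmas), len(types), tokens


def row(idx, form, lemma, upos, head, deprel, feats="_"):
    """Build one 10-column CoNLL-U line (XPOS, DEPS and MISC left as '_')."""
    return "\t".join(
        [str(idx), form, lemma, upos, "_", feats, str(head), deprel, "_", "_"]
    )


# Toy sentence invented for illustration; not taken from UD_German-HDT.
SAMPLE = "\n".join([
    "# text = zwei Server und drei Clients",
    row(1, "zwei", "zwei", "NUM", 2, "nummod"),
    row(2, "Server", "Server", "NOUN", 0, "root"),
    row(3, "und", "und", "CCONJ", 5, "cc"),
    row(4, "drei", "drei", "NUM", 5, "nummod"),
    row(5, "Clients", "Client", "NOUN", 2, "conj"),
    "",
])

print(pos_counts(SAMPLE, "NUM"))  # (lemmas, types, tokens)
```

Running the same function for every tag and sorting the results would give the per-tag ranks quoted above.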

The 10 most frequent NUM lemmas: zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30

The 10 most frequent NUM types: zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30

The 10 most frequent ambiguous lemmas: 20 (NUM 1030, X 1), 2002 (NUM 477, X 1), sieben (NUM 403, VERB 1), II (NUM 219, PROPN 1, X 1), x (X 104, NUM 13, NOUN 3), elf (NUM 113, X 1), 512 (NUM 84, X 1), i (NUM 9, NOUN 2, X 1), V (NOUN 39, NUM 34), 9x (NUM 26, X 20)

The 10 most frequent ambiguous types: 20 (NUM 1030, X 1), 2002 (NUM 477, X 1), sieben (NUM 403, VERB 1), II (NUM 219, PROPN 1, X 1), x (X 104, NUM 13), elf (NUM 113, X 1), eins (NUM 83, DET 26), 512 (NUM 84, X 1), i (NUM 9, NOUN 2, X 1), V (NOUN 41, NUM 34)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 2.529657).

The 1st highest number of forms (1) was observed with the lemma “'68”: '68.

The 2nd highest number of forms (1) was observed with the lemma “'95”: '95.

The 3rd highest number of forms (1) was observed with the lemma “'96”: '96.
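The ratio and the per-lemma ranking above can be sketched as follows. This assumes the ratio is the number of distinct forms divided by the number of distinct lemmas, which is consistent with 6500 types / 6500 lemmas = 1.000000 for NUM but is not confirmed by the page itself. The function names and the toy token list are hypothetical, and breaking ties alphabetically by lemma is likewise an assumption.

```python
from collections import defaultdict


def forms_per_lemma(tokens, upos):
    """Map each lemma of the given UPOS tag to the set of forms seen with it."""
    forms = defaultdict(set)
    for form, lemma, tag in tokens:
        if tag == upos:
            forms[lemma].add(form)
    return dict(forms)


def form_lemma_ratio(forms):
    """Distinct forms divided by distinct lemmas (assumed definition)."""
    if not forms:
        return 0.0
    distinct_forms = set().union(*forms.values())
    return len(distinct_forms) / len(forms)


def top_lemmas(forms, n=3):
    """Lemmas with the most forms; ties are broken alphabetically (assumption)."""
    return sorted(forms, key=lambda lemma: (-len(forms[lemma]), lemma))[:n]


# Toy (form, lemma, upos) triples invented for illustration.
TOKENS = [
    ("zwei", "zwei", "NUM"),
    ("2000", "2000", "NUM"),
    ("zwei", "zwei", "NUM"),
    ("Haus", "Haus", "NOUN"),
    ("Häuser", "Haus", "NOUN"),
    ("Hauses", "Haus", "NOUN"),
]

num_forms = forms_per_lemma(TOKENS, "NUM")
noun_forms = forms_per_lemma(TOKENS, "NOUN")
print(f"{form_lemma_ratio(num_forms):.6f}")
print(f"{form_lemma_ratio(noun_forms):.6f}")
print(top_lemmas(num_forms))
```

A ratio of exactly 1 means that no lemma was observed with more than one form, which is why every lemma in the ranking above has a single form.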

NUM occurs with 3 features: NumType (71230; 100% instances), Number (71225; 100% instances), Case (22; 0% instances)

NUM occurs with 6 feature-value pairs: Case=Dat, Case=Gen, Case=Nom, NumType=Card, Number=Plur, Number=Sing

NUM occurs with 7 feature combinations. The most frequent feature combination is Number=Plur|NumType=Card (70456 tokens). Examples: zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30
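The three feature summaries above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of CoNLL-U. Below is a minimal sketch; `feature_stats` is a hypothetical helper, the input strings are invented, and treating an empty FEATS value (`_`) as a combination of its own is an assumption about how the page counts.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count features, feature=value pairs and whole FEATS strings.

    feats_column: iterable of FEATS values such as "Number=Plur|NumType=Card",
    with "_" standing for "no features".
    """
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # "_" is counted as its own combination (assumption)
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Toy FEATS values invented for illustration.
FEATS = [
    "Number=Plur|NumType=Card",
    "Number=Plur|NumType=Card",
    "Number=Sing|NumType=Card",
    "Case=Dat|Number=Plur|NumType=Card",
]

features, pairs, combos = feature_stats(FEATS)
print(len(features), "features;", len(pairs), "feature-value pairs;",
      len(combos), "combinations")
print("most frequent combination:", combos.most_common(1)[0])
```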

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (48658; 68% instances), flat (8506; 12% instances), nmod (7819; 11% instances), obl (3572; 5% instances), conj (852; 1% instances), appos (526; 1% instances), obj (500; 1% instances), nsubj (463; 1% instances), nsubj:pass (154; 0% instances), root (88; 0% instances), xcomp (54; 0% instances), parataxis (18; 0% instances), obl:arg (10; 0% instances), advcl (4; 0% instances), reparandum (3; 0% instances), orphan (2; 0% instances), acl (1; 0% instances), ccomp (1; 0% instances)

Parents of NUM nodes belong to 11 different parts of speech: NOUN (60233; 85% instances), VERB (4142; 6% instances), PROPN (4094; 6% instances), X (1144; 2% instances), NUM (687; 1% instances), ADJ (554; 1% instances), AUX (103; 0% instances), no parent, i.e. root nodes (88; 0% instances), DET (83; 0% instances), ADV (82; 0% instances), PRON (21; 0% instances)
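The two distributions above can be sketched as follows, assuming percentages are rounded to whole numbers (so small counts appear as 0%) and that a node with HEAD 0 has no parent and is tallied under an empty part-of-speech label. The helper names and the toy sentences are hypothetical.

```python
from collections import Counter


def parent_stats(sentences, upos):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`."""
    rels, parent_pos = Counter(), Counter()
    for sent in sentences:
        by_id = {tok["id"]: tok for tok in sent}
        for tok in sent:
            if tok["upos"] != upos:
                continue
            rels[tok["deprel"]] += 1
            parent = by_id.get(tok["head"])  # None when head == 0 (root)
            parent_pos[parent["upos"] if parent else ""] += 1
    return rels, parent_pos


def with_percent(counter):
    """List (key, count, whole-number percentage), most frequent first."""
    total = sum(counter.values())
    return [(key, n, round(100 * n / total)) for key, n in counter.most_common()]


def tok(idx, upos, head, deprel):
    """Minimal token record; only the fields needed here."""
    return {"id": idx, "upos": upos, "head": head, "deprel": deprel}


# Toy sentences invented for illustration.
SENTENCES = [
    [tok(1, "NUM", 2, "nummod"), tok(2, "NOUN", 0, "root")],
    [tok(1, "NUM", 0, "root")],
    [tok(1, "NUM", 2, "nummod"), tok(2, "NOUN", 3, "obj"), tok(3, "VERB", 0, "root")],
]

rels, parent_pos = parent_stats(SENTENCES, "NUM")
print(with_percent(rels))
print(with_percent(parent_pos))

# With the figures from this page: 88 root nodes out of 71231 NUM tokens
# round to 0%, and 48658 nummod attachments round to 68%.
print(round(100 * 88 / 71231), round(100 * 48658 / 71231))
```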

51735 (73%) NUM nodes are leaves.

15464 (22%) NUM nodes have one child.

3047 (4%) NUM nodes have two children.

985 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
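The child-degree figures above amount to a histogram of how many dependents each NUM node has. A minimal sketch, assuming each sentence is given as a list of hypothetical `(id, upos, head)` triples (toy data, not from the treebank):

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map number of children -> number of nodes tagged `upos` with that many."""
    hist = Counter()
    for sent in sentences:
        # How many tokens in this sentence point at each head id.
        n_children = Counter(head for _id, _tag, head in sent)
        for tok_id, tag, _head in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1  # missing key counts as 0
    return hist


# Toy sentences invented for illustration: (id, upos, head).
SENTENCES = [
    [(1, "ADV", 3), (2, "ADP", 3), (3, "NUM", 4), (4, "NOUN", 0)],
    [(1, "NUM", 2), (2, "NOUN", 0)],
]

hist = child_degree_histogram(SENTENCES, "NUM")
print("leaves:", hist[0])
print("one child:", hist[1])
print("two children:", hist[2])
print("highest child degree:", max(hist))
```

As a consistency check on the page's own figures, the four buckets add up to the NUM token count: 51735 + 15464 + 3047 + 985 = 71231.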

Children of NUM nodes are attached using 23 different relations: advmod (12907; 51% instances), case (4243; 17% instances), nmod (3486; 14% instances), conj (1476; 6% instances), punct (1445; 6% instances), cc (542; 2% instances), appos (298; 1% instances), det (275; 1% instances), amod (125; 0% instances), nsubj (92; 0% instances), cop (89; 0% instances), flat:name (32; 0% instances), aux (24; 0% instances), flat (21; 0% instances), acl (9; 0% instances), ccomp (7; 0% instances), mark (5; 0% instances), nummod (4; 0% instances), advcl (3; 0% instances), parataxis (3; 0% instances), reparandum (3; 0% instances), xcomp (3; 0% instances), obl (2; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADV (11386; 45% instances), ADP (5302; 21% instances), NOUN (2937; 12% instances), ADJ (1648; 7% instances), PUNCT (1445; 6% instances), NUM (687; 3% instances), CCONJ (620; 2% instances), PROPN (296; 1% instances), DET (292; 1% instances), X (233; 1% instances), AUX (115; 0% instances), PRON (90; 0% instances), PART (19; 0% instances), VERB (19; 0% instances), SCONJ (5; 0% instances)