
This page pertains to UD version 2.

Treebank Statistics: UD_German: POS Tags: NUM

There are 1550 NUM lemmas (3%), 1558 NUM types (3%) and 7461 NUM tokens (3%). Out of 15 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: zwei, drei, vier, 2007, ein, 2006, fünf, 1, 2009, 2010

The 10 most frequent NUM types: zwei, drei, vier, 2007, 2006, fünf, 1, 2009, 2010, sechs

The 10 most frequent ambiguous lemmas: zwei (NUM 339, PROPN 3, NOUN 2), drei (NUM 170, PROPN 4, NOUN 2), ein (DET 5199, PRON 176, ADV 141, NUM 76, NOUN 4, PROPN 4), 1 (NUM 72, PROPN 25), 2009 (NUM 71, PROPN 1), sechs (NUM 70, NOUN 1), 2 (NUM 68, PROPN 15), 2008 (NUM 67, PROPN 1), 3 (NUM 62, PROPN 13, ADJ 1), 100 (NUM 58, PROPN 3)

The 10 most frequent ambiguous types: zwei (NUM 313, PROPN 2, NOUN 1), drei (NUM 159, NOUN 2, PROPN 2), 1 (NUM 72, PROPN 25), 2009 (NUM 71, PROPN 1), sechs (NUM 68, NOUN 1), 2 (NUM 68, PROPN 15), 2008 (NUM 67, PROPN 1), 3 (NUM 62, PROPN 13, ADJ 1), 100 (NUM 58, PROPN 3), 20 (NUM 58, PROPN 2)

Morphology

The form / lemma ratio of NUM is 1.005161 (the average of all parts of speech is 1.186689).
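The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page. A minimal sketch of that arithmetic follows, assuming "types" means distinct word forms; the variable names are illustrative and not part of any UD tooling.

```python
# Counts reported on this page for NUM in UD_German.
num_types = 1558   # distinct NUM word forms (assumed meaning of "types")
num_lemmas = 1550  # distinct NUM lemmas

# Form / lemma ratio: average number of distinct forms per lemma.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.005161
```

A value this close to 1 reflects that almost every NUM lemma appears in a single form; only a few lemmas such as "ein" inflect.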

The highest number of forms (6) was observed with the lemma “ein”: ein, eine, einem, einen, einer, eines.

The 2nd highest number of forms (2) was observed with the lemma “Milliarde”: Milliarde, Milliarden.

The 3rd highest number of forms (2) was observed with the lemma “Million”: Million, Millionen.

NUM occurs with 4 features: NumType (7461; 100% instances), Case (35; 0% instances), Number (35; 0% instances), Gender (29; 0% instances)

NUM occurs with 9 feature-value pairs: Case=Acc, Case=Dat, Case=Nom, Gender=Fem, Gender=Masc, Gender=Masc,Neut, NumType=Card, Number=Plur, Number=Sing

NUM occurs with 10 feature combinations. The most frequent feature combination is NumType=Card (7426 tokens). Examples: zwei, drei, vier, 2007, 2006, fünf, 2009, 2010, 1, sechs

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (3006; 40% instances), nmod (1906; 26% instances), obl (1442; 19% instances), appos (447; 6% instances), conj (344; 5% instances), dep (91; 1% instances), compound (76; 1% instances), amod (61; 1% instances), root (23; 0% instances), nsubj (22; 0% instances), obj (13; 0% instances), flat (11; 0% instances), cop (8; 0% instances), nsubj:pass (6; 0% instances), advcl (2; 0% instances), acl (1; 0% instances), det (1; 0% instances), fixed (1; 0% instances)
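The percentages above appear to be each relation's count divided by the total number of NUM tokens (7461), rounded to the nearest whole percent. The sketch below rechecks that under this assumption, using only the counts listed above; the dictionary name is illustrative.

```python
# Relation counts for NUM nodes, copied from the list above.
relation_counts = {
    "nummod": 3006, "nmod": 1906, "obl": 1442, "appos": 447, "conj": 344,
    "dep": 91, "compound": 76, "amod": 61, "root": 23, "nsubj": 22,
    "obj": 13, "flat": 11, "cop": 8, "nsubj:pass": 6, "advcl": 2,
    "acl": 1, "det": 1, "fixed": 1,
}

# Every NUM token has exactly one incoming relation, so the counts
# should add up to the number of NUM tokens.
total = sum(relation_counts.values())

# Share of each relation, rounded to a whole percent (assumed rounding rule).
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
for rel, pct in percentages.items():
    print(f"{rel}: {pct}%")
```

Relations shown as 0% are not absent; they simply account for less than half a percent of the NUM tokens.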

Parents of NUM nodes belong to 14 different parts of speech: NOUN (4165; 56% instances), VERB (1518; 20% instances), PROPN (884; 12% instances), NUM (483; 6% instances), ADJ (221; 3% instances), ADP (85; 1% instances), X (48; 1% instances), no parent, i.e. the NUM node is the root (23; 0% instances), ADV (15; 0% instances), PRON (10; 0% instances), AUX (5; 0% instances), CCONJ (2; 0% instances), DET (1; 0% instances), PUNCT (1; 0% instances)

4678 (63%) NUM nodes are leaves.

2168 (29%) NUM nodes have one child.

442 (6%) NUM nodes have two children.

173 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
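The child-degree figures above count, for every NUM node, how many tokens name it as their head. The following is a minimal sketch of that count on a toy sentence; the sentence, the tuple layout, and the function name `num_child_degrees` are hypothetical and are not taken from the treebank or from the script that generated this page.

```python
from collections import Counter

# Toy dependency tree (hypothetical): (id, form, upos, head, deprel).
# Head 0 marks the root, following the CoNLL-U convention.
sentence = [
    (1, "Etwa", "ADV", 2, "advmod"),
    (2, "zwei", "NUM", 3, "nummod"),
    (3, "Jahre", "NOUN", 0, "root"),
    (4, "und", "CCONJ", 6, "cc"),
    (5, "drei", "NUM", 6, "nummod"),
    (6, "Monate", "NOUN", 3, "conj"),
]

def num_child_degrees(tokens):
    """Map child count -> number of NUM nodes having that many children."""
    # Count how many tokens point at each head id.
    children = Counter(head for _, _, _, head, _ in tokens)
    # Look up the child count of every NUM node (0 if nothing points at it).
    return Counter(children[tid] for tid, _, upos, _, _ in tokens if upos == "NUM")

degrees = num_child_degrees(sentence)
print(degrees[0], degrees[1])  # NUM leaves, NUM nodes with one child
```

In this toy tree "drei" is a leaf and "zwei" has one child ("Etwa"), which mirrors the most common configurations reported above.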

Children of NUM nodes are attached using 23 different relations: case (1215; 33% instances), punct (811; 22% instances), advmod (639; 17% instances), conj (334; 9% instances), nmod (182; 5% instances), cc (179; 5% instances), det (77; 2% instances), nummod (58; 2% instances), compound (37; 1% instances), appos (24; 1% instances), amod (19; 1% instances), cop (19; 1% instances), nsubj (19; 1% instances), dep (11; 0% instances), obj (9; 0% instances), acl (6; 0% instances), advcl (5; 0% instances), fixed (4; 0% instances), mark (2; 0% instances), compound:prt (1; 0% instances), det:poss (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADP (1235; 34% instances), PUNCT (803; 22% instances), ADV (591; 16% instances), NUM (483; 13% instances), CCONJ (180; 5% instances), NOUN (119; 3% instances), DET (64; 2% instances), PROPN (50; 1% instances), ADJ (44; 1% instances), X (32; 1% instances), AUX (19; 1% instances), PRON (15; 0% instances), VERB (13; 0% instances), PART (4; 0% instances), SCONJ (2; 0% instances)