
This page pertains to UD version 2.

Treebank Statistics: UD_German-GSD: POS Tags: NUM

There are 1550 NUM lemmas (3%), 1554 NUM types (3%) and 7374 NUM tokens (3%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: zwei, drei, vier, 2007, fünf, 2006, 2009, sechs, 2010, 1

The 10 most frequent NUM types: zwei, drei, vier, 2007, fünf, 2006, 2009, sechs, 2010, 1

The 10 most frequent ambiguous lemmas: zwei (NUM 342, PROPN 3, NOUN 2), drei (NUM 170, PROPN 4, NOUN 2), 2009 (NUM 71, PROPN 1), sechs (NUM 71, NOUN 1), 1 (NUM 68, PROPN 25, ADJ 4), 2 (NUM 67, PROPN 15, ADJ 1), 2008 (NUM 67, PROPN 1), 3 (NUM 61, PROPN 13, ADJ 2), 100 (NUM 58, PROPN 3), 20 (NUM 58, PROPN 2)

The 10 most frequent ambiguous types: zwei (NUM 313, PROPN 2, NOUN 1), drei (NUM 159, NOUN 2, PROPN 2), 2009 (NUM 71, PROPN 1), sechs (NUM 68, NOUN 1), 1 (NUM 68, PROPN 25), 2 (NUM 67, PROPN 15), 2008 (NUM 67, PROPN 1), 3 (NUM 61, PROPN 13, ADJ 1), 100 (NUM 58, PROPN 3), 20 (NUM 58, PROPN 2)

Morphology

The form / lemma ratio of NUM is 1.002581 (the average of all parts of speech is 1.185142).
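The ratio above can be reproduced from the counts reported at the top of this page. The following is a minimal sketch, assuming the ratio is simply the number of distinct NUM types divided by the number of distinct NUM lemmas; the page does not state the formula, and the function name `form_lemma_ratio` is made up for this illustration.

```python
# Counts reported on this page for NUM in UD_German-GSD.
num_types = 1554   # distinct word forms tagged NUM
num_lemmas = 1550  # distinct lemmas tagged NUM


def form_lemma_ratio(types: int, lemmas: int) -> float:
    """Return distinct forms per distinct lemma (assumed definition)."""
    if lemmas <= 0:
        raise ValueError("lemma count must be positive")
    return types / lemmas


ratio = form_lemma_ratio(num_types, num_lemmas)
print(f"{ratio:.6f}")  # prints 1.002581
```

A value this close to 1 means that almost every NUM lemma appears in exactly one form, which is consistent with the maximum of 2 forms per lemma reported below.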

The highest number of forms observed for a single NUM lemma is 2. The first three lemmas with this count are:

“Milliarde”: Milliarde, Milliarden.

“Million”: Million, Millionen.

“zwei”: zwei, zweien.

NUM occurs with 5 features: NumType (7374; 100% instances), Case (137; 2% instances), Number (137; 2% instances), Gender (131; 2% instances), VerbForm (1; 0% instances)

NUM occurs with 11 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, Number=Plur, Number=Sing, VerbForm=Part

NUM occurs with 28 feature combinations. The most frequent feature combination is NumType=Card (7236 tokens). Examples: zwei, drei, vier, 2007, fünf, 2006, 2009, sechs, 2010, 1
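Feature and feature-combination counts of this kind can be derived from the FEATS column of a CoNLL-U file. The sketch below is illustrative only: the sample sentences are toy data invented for this example (not taken from UD_German-GSD), and the helper name `num_feature_stats` is hypothetical rather than part of any UD tool.

```python
from collections import Counter

# Toy CoNLL-U fragment invented for illustration; not taken from the treebank.
SAMPLE = (
    "# text = zwei Häuser und 2007\n"
    "1\tzwei\tzwei\tNUM\tCARD\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tHäuser\tHaus\tNOUN\tNN\tCase=Nom|Gender=Neut|Number=Plur\t0\troot\t_\t_\n"
    "3\tund\tund\tCCONJ\tKON\t_\t4\tcc\t_\t_\n"
    "4\t2007\t2007\tNUM\tCARD\tNumType=Card\t2\tconj\t_\t_\n"
    "\n"
    "# text = zweien\n"
    "1\tzweien\tzwei\tNUM\tCARD\tCase=Dat|NumType=Card|Number=Plur\t0\troot\t_\t_\n"
)


def num_feature_stats(conllu_text, upos="NUM"):
    """Count feature names and full FEATS combinations for one UPOS tag."""
    features = Counter()      # feature name -> number of tokens carrying it
    combinations = Counter()  # full FEATS string -> number of tokens
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] != upos:
            continue
        feats = cols[5]
        combinations[feats] += 1
        if feats != "_":
            for pair in feats.split("|"):
                name, _, _value = pair.partition("=")
                features[name] += 1
    return features, combinations


features, combinations = num_feature_stats(SAMPLE)
print(features.most_common())
print(combinations.most_common(1))
```

On the toy sample, NumType is counted on all three NUM tokens and NumType=Card is the most frequent combination, mirroring the pattern reported above on a much smaller scale.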

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (2934; 40% instances), nmod (1904; 26% instances), obl (1442; 20% instances), appos (447; 6% instances), conj (344; 5% instances), dep (100; 1% instances), compound (77; 1% instances), amod (42; 1% instances), root (23; 0% instances), nsubj (20; 0% instances), obj (13; 0% instances), flat (11; 0% instances), cop (7; 0% instances), nsubj:pass (5; 0% instances), advcl (2; 0% instances), acl (1; 0% instances), det (1; 0% instances), fixed (1; 0% instances)

Parents of NUM nodes belong to 14 different parts of speech: NOUN (4094; 56% instances), VERB (1516; 21% instances), PROPN (872; 12% instances), NUM (476; 6% instances), ADJ (216; 3% instances), ADP (85; 1% instances), X (48; 1% instances), no parent, i.e. the NUM node is the root (23; 0% instances), ADV (15; 0% instances), PRON (13; 0% instances), AUX (5; 0% instances), DET (5; 0% instances), SYM (4; 0% instances), CCONJ (2; 0% instances)
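The relation and parent statistics above come from the HEAD and DEPREL columns of the treebank. As a rough sketch of how such counts could be gathered, the code below walks a CoNLL-U string and tallies, for each NUM token, its relation and the part of speech of its head. The sample is invented toy data, the function names are hypothetical, and the label `(root)` for tokens with HEAD 0 is a naming choice made for this sketch. It also assumes every basic token has an integer HEAD.

```python
from collections import Counter

# Toy CoNLL-U fragment invented for illustration; not taken from the treebank.
SAMPLE = (
    "# text = zwei Häuser und 2007\n"
    "1\tzwei\tzwei\tNUM\tCARD\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tHäuser\tHaus\tNOUN\tNN\tCase=Nom|Gender=Neut|Number=Plur\t0\troot\t_\t_\n"
    "3\tund\tund\tCCONJ\tKON\t_\t4\tcc\t_\t_\n"
    "4\t2007\t2007\tNUM\tCARD\tNumType=Card\t2\tconj\t_\t_\n"
    "\n"
    "# text = zweien\n"
    "1\tzweien\tzwei\tNUM\tCARD\tCase=Dat|NumType=Card|Number=Plur\t0\troot\t_\t_\n"
)


def parse_sentences(conllu_text):
    """Yield each sentence as a dict: token ID -> (upos, head, deprel)."""
    sentence = {}
    for line in conllu_text.splitlines():
        if not line.strip():
            if sentence:  # a blank line closes the current sentence
                yield sentence
                sentence = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep basic tokens only; skip ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            sentence[int(cols[0])] = (cols[3], int(cols[6]), cols[7])
    if sentence:
        yield sentence


def attachment_stats(conllu_text, upos="NUM"):
    """Count relations and parent UPOS tags for tokens with the given UPOS."""
    relations = Counter()
    parent_pos = Counter()
    for sentence in parse_sentences(conllu_text):
        for tok_upos, head, deprel in sentence.values():
            if tok_upos != upos:
                continue
            relations[deprel] += 1
            # HEAD 0 means the token has no parent in the tree.
            parent_pos[sentence[head][0] if head != 0 else "(root)"] += 1
    return relations, parent_pos


relations, parent_pos = attachment_stats(SAMPLE)
print(relations.most_common())
print(parent_pos.most_common())
```

Because every NUM token has exactly one incoming relation and at most one parent, both counters sum to the number of NUM tokens; the same holds for the figures on this page, where each list adds up to 7374.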

4481 (61%) NUM nodes are leaves.

2276 (31%) NUM nodes have one child.

448 (6%) NUM nodes have two children.

169 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
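The child-degree figures above can be sanity-checked with simple arithmetic: the four groups should cover every NUM token, and the percentages should follow from the raw counts. The short check below uses only numbers from this page; the variable names are chosen for this illustration.

```python
# Counts reported on this page for NUM nodes in UD_German-GSD.
total_num_tokens = 7374
degree_counts = {
    "leaves": 4481,
    "one child": 2276,
    "two children": 448,
    "three or more children": 169,
}

# The four groups partition the NUM tokens, so they must sum to the total.
covered = sum(degree_counts.values())
print(covered == total_num_tokens)  # prints True

# Percentages rounded to whole numbers, as shown on the page.
shares = {
    label: round(100 * count / total_num_tokens)
    for label, count in degree_counts.items()
}
print(shares)
```

The rounded shares come out as 61, 31, 6 and 2 percent, matching the values listed above.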

Children of NUM nodes are attached using 23 different relations: case (1212; 32% instances), punct (920; 24% instances), advmod (611; 16% instances), conj (333; 9% instances), nmod (181; 5% instances), cc (177; 5% instances), det (77; 2% instances), nummod (57; 2% instances), dep (40; 1% instances), compound (38; 1% instances), appos (24; 1% instances), amod (19; 1% instances), cop (19; 1% instances), nsubj (19; 1% instances), obj (9; 0% instances), acl (6; 0% instances), advcl (5; 0% instances), fixed (4; 0% instances), mark (2; 0% instances), compound:prt (1; 0% instances), det:poss (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADP (1238; 33% instances), PUNCT (912; 24% instances), ADV (575; 15% instances), NUM (476; 13% instances), CCONJ (178; 5% instances), NOUN (119; 3% instances), DET (73; 2% instances), ADJ (52; 1% instances), PROPN (49; 1% instances), X (32; 1% instances), AUX (19; 1% instances), PRON (15; 0% instances), VERB (13; 0% instances), PART (4; 0% instances), SCONJ (2; 0% instances)