
This page pertains to UD version 2.

Treebank Statistics: UD_German-GSD: POS Tags: NUM

There are 1538 NUM lemmas (3%), 1543 NUM types (3%) and 7336 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: zwei, drei, vier, 2007, fünf, 2006, 2009, sechs, 2010, 2008

The 10 most frequent NUM types: zwei, drei, vier, 2007, fünf, 2006, 2009, sechs, 2010, 2008

The 10 most frequent ambiguous lemmas: zwei (NUM 342, PROPN 3, NOUN 2), drei (NUM 170, PROPN 4, NOUN 2), 2009 (NUM 71, PROPN 1), sechs (NUM 71, NOUN 1), 2008 (NUM 67, PROPN 1), 1 (NUM 66, PROPN 25), 2 (NUM 66, PROPN 15), 3 (NUM 60, PROPN 13, ADJ 1), 100 (NUM 58, PROPN 3), 20 (NUM 58, PROPN 2)

The 10 most frequent ambiguous types: zwei (NUM 313, PROPN 2, NOUN 1), drei (NUM 159, NOUN 2, PROPN 2), 2009 (NUM 71, PROPN 1), sechs (NUM 68, NOUN 1), 2008 (NUM 67, PROPN 1), 1 (NUM 66, PROPN 25), 2 (NUM 66, PROPN 15), 3 (NUM 60, PROPN 13, ADJ 1), 100 (NUM 58, PROPN 3), 20 (NUM 58, PROPN 2)

Morphology

The form / lemma ratio of NUM is 1.003251 (the average of all parts of speech is 1.187855).
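The ratio above is consistent with dividing the number of NUM types by the number of NUM lemmas reported at the top of this page. The short sketch below reproduces that arithmetic; the function name is illustrative, and the assumption that the ratio is defined as types divided by lemmas is inferred from the figures, not stated on the page.

```python
# Counts copied from the NUM summary on this page.
num_types = 1543   # distinct NUM word forms (types)
num_lemmas = 1538  # distinct NUM lemmas


def form_lemma_ratio(types: int, lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return types / lemmas


ratio = form_lemma_ratio(num_types, num_lemmas)
print(f"{ratio:.6f}")  # prints 1.003251
```

A value this close to 1 means that almost every NUM lemma occurs in a single form, which matches the observation that the highest number of forms for any lemma is 2.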

The 1st highest number of forms (2) was observed with the lemma “Milliarde”: Milliarde, Milliarden.

The 2nd highest number of forms (2) was observed with the lemma “Million”: Million, Millionen.

The 3rd highest number of forms (2) was observed with the lemma “Tausend”: T, Tausend.

NUM occurs with 5 features: NumType (7336; 100% instances), Case (112; 2% instances), Number (108; 1% instances), Gender (102; 1% instances), Abbr (1; 0% instances)

NUM occurs with 12 feature-value pairs: Abbr=Yes, Case=Acc, Case=Dat, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, NumType=Ord, Number=Plur, Number=Sing

NUM occurs with 30 feature combinations. The most frequent feature combination is NumType=Card (7223 tokens). Examples: zwei, drei, vier, 2007, fünf, 2006, 2009, 2010, sechs, 2008
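Feature counts like the ones above can be derived from the FEATS column of a CoNLL-U file, where feature-value pairs are joined by "|" and "_" marks an empty feature set. The following is a minimal sketch of that tallying; the helper name and the three example strings are illustrative and are not taken from the treebank.

```python
from collections import Counter


def parse_feats(feats: str) -> dict:
    """Split a CoNLL-U FEATS string such as 'Case=Nom|NumType=Card' into a dict."""
    if feats == "_":  # '_' marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS values built from the feature-value pairs listed above.
examples = [
    "NumType=Card",
    "Case=Nom|Gender=Masc|Number=Sing|NumType=Card",
    "NumType=Ord",
]

feature_counts = Counter()  # how many tokens carry each feature
combo_counts = Counter()    # how many tokens carry each full combination
for feats in examples:
    combo_counts[feats] += 1
    feature_counts.update(parse_feats(feats).keys())

print(feature_counts["NumType"], feature_counts["Case"], len(combo_counts))
```

Counting whole FEATS strings gives the feature combinations, while counting the individual keys gives the per-feature totals.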

Relations

NUM nodes are attached to their parents using 19 different relations: nummod (2917; 40% instances), nmod (1892; 26% instances), obl (1497; 20% instances), appos (436; 6% instances), conj (365; 5% instances), compound (78; 1% instances), dep (48; 1% instances), nsubj (21; 0% instances), obl:tmod (17; 0% instances), root (16; 0% instances), flat (11; 0% instances), obj (11; 0% instances), xcomp (9; 0% instances), amod (7; 0% instances), nsubj:pass (5; 0% instances), advcl (2; 0% instances), orphan (2; 0% instances), fixed (1; 0% instances), obl:agent (1; 0% instances)

Parents of NUM nodes belong to 14 different parts of speech: NOUN (4104; 56% instances), VERB (1551; 21% instances), PROPN (841; 11% instances), NUM (492; 7% instances), ADJ (218; 3% instances), X (39; 1% instances), ADV (20; 0% instances), SYM (17; 0% instances), no parent, i.e. root (16; 0% instances), AUX (10; 0% instances), DET (9; 0% instances), PRON (9; 0% instances), ADP (8; 0% instances), CCONJ (2; 0% instances)

4430 (60%) NUM nodes are leaves.

2253 (31%) NUM nodes have one child.

481 (7%) NUM nodes have two children.

172 (2%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
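The child-degree figures above (leaves, one child, two children, and so on) can be computed by counting, for each NUM node, how many other nodes name it as their head. The sketch below does this for CoNLL-U text; the function name and the toy sentence are invented for illustration and do not come from UD_German-GSD.

```python
from collections import Counter


def child_degree_histogram(conllu_text: str, upos: str = "NUM") -> Counter:
    """Map child count -> number of nodes with the given UPOS tag."""
    histogram = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        tags = {}           # node ID -> UPOS
        degree = Counter()  # head ID -> number of children
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:
                continue  # skip multiword token ranges and empty nodes
            tags[cols[0]] = cols[3]  # column 4 is UPOS
            degree[cols[6]] += 1     # column 7 is HEAD
        for node_id, tag in tags.items():
            if tag == upos:
                histogram[degree[node_id]] += 1
    return histogram


# Toy sentence (hypothetical): "Ich kaufe etwa zwei Bücher"
rows = [
    ["1", "Ich", "ich", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "kaufe", "kaufen", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "etwa", "etwa", "ADV", "_", "_", "4", "advmod", "_", "_"],
    ["4", "zwei", "zwei", "NUM", "_", "NumType=Card", "5", "nummod", "_", "_"],
    ["5", "Bücher", "Buch", "NOUN", "_", "_", "2", "obj", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in rows)

print(child_degree_histogram(SAMPLE))
```

In the toy sentence the only NUM node, "zwei", has a single child ("etwa"), so the histogram holds one entry for degree 1. A key of 0 in the histogram corresponds to the leaves reported above.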

Children of NUM nodes are attached using 24 different relations: case (1293; 34% instances), punct (886; 23% instances), advmod (599; 16% instances), conj (357; 9% instances), nmod (198; 5% instances), cc (188; 5% instances), det (76; 2% instances), nummod (52; 1% instances), compound (36; 1% instances), dep (30; 1% instances), appos (24; 1% instances), amod (20; 1% instances), nsubj (11; 0% instances), cop (10; 0% instances), obl (8; 0% instances), acl (7; 0% instances), advcl (5; 0% instances), fixed (3; 0% instances), aux (1; 0% instances), compound:prt (1; 0% instances), det:poss (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: ADP (1309; 34% instances), PUNCT (886; 23% instances), ADV (572; 15% instances), NUM (492; 13% instances), CCONJ (178; 5% instances), NOUN (116; 3% instances), DET (87; 2% instances), ADJ (56; 1% instances), PROPN (49; 1% instances), X (17; 0% instances), VERB (14; 0% instances), AUX (11; 0% instances), SYM (11; 0% instances), PRON (7; 0% instances), PART (3; 0% instances), SCONJ (1; 0% instances)