
This page pertains to UD version 2.

Treebank Statistics: UD_English: POS Tags: NUM

There are 1241 NUM lemmas (6%), 1243 NUM types (5%) and 4912 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 13 in number of tokens.
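Counts of this kind can be reproduced from a CoNLL-U file with a few lines of code. The following is a minimal sketch, not the script that generated this page: the sample sentences are hand-made rather than taken from UD_English, the helper names `iter_words` and `pos_counts` are hypothetical, and a "type" is assumed to be a distinct word form exactly as written (the actual statistics may normalize forms differently).

```python
# Hand-made CoNLL-U fragment (hypothetical data, not from UD_English).
SAMPLE = (
    "# text = two dogs chased 2 cats\n"
    "1\ttwo\ttwo\tNUM\tCD\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tdogs\tdog\tNOUN\tNNS\tNumber=Plur\t3\tnsubj\t_\t_\n"
    "3\tchased\tchase\tVERB\tVBD\t_\t0\troot\t_\t_\n"
    "4\t2\t2\tNUM\tCD\tNumType=Card\t5\tnummod\t_\t_\n"
    "5\tcats\tcat\tNOUN\tNNS\tNumber=Plur\t3\tobj\t_\t_\n"
    "\n"
    "# text = two\n"
    "1\ttwo\ttwo\tNUM\tCD\tNumType=Card\t0\troot\t_\t_\n"
)


def iter_words(conllu_text):
    """Yield the 10 columns of every syntactic word in a CoNLL-U string."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        yield cols


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for cols in iter_words(conllu_text):
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


n_lemmas, n_types, n_tokens = pos_counts(SAMPLE, "NUM")
print(n_lemmas, n_types, n_tokens)
```

On the sample, the three NUM tokens (two, 2, two) share two lemmas and two forms.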

The 10 most frequent NUM lemmas: one, two, 2, 3, 5, 1, 10, 4, three, 20

The 10 most frequent NUM types: one, two, 2, 3, 5, 1, 10, 4, three, 20

The 10 most frequent ambiguous lemmas: one (NUM 450, NOUN 147, PRON 26, VERB 1), 2 (NUM 145, X 30, PROPN 2, ADP 1, PART 1), 3 (NUM 122, X 17, NOUN 1), 5 (NUM 112, X 4, PROPN 1), 1 (NUM 111, X 31), 10 (NUM 99, X 2), 4 (NUM 97, X 13, ADP 1, SCONJ 1), 20 (NUM 66, NOUN 5), 6 (NUM 64, X 2), m (NUM 46, NOUN 17, PROPN 3)

The 10 most frequent ambiguous types: one (NUM 397, NOUN 106, PRON 22), 2 (NUM 145, X 30, PROPN 2, ADP 1, PART 1), 3 (NUM 122, X 17), 5 (NUM 112, X 4, PROPN 1), 1 (NUM 111, X 31), 10 (NUM 99, X 2), 4 (NUM 97, X 13, ADP 1, SCONJ 1), 20 (NUM 66, NOUN 3), 6 (NUM 64, X 2), m (NUM 41, AUX 21, NOUN 11, PROPN 3, VERB 1)

Morphology

The form / lemma ratio of NUM is 1.001612 (the average of all parts of speech is 1.176027).
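The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas (1243 / 1241). The sketch below illustrates that reading under the assumption that the ratio is the total number of distinct forms divided by the number of lemmas; the function names and the sample pairs are hypothetical, apart from the two spellings of the lemma ’72 quoted on this page.

```python
from collections import defaultdict

# (form, lemma) pairs for a handful of NUM tokens (illustrative data).
PAIRS = [
    ("‘72", "’72"),  # two spellings of the same lemma, as on this page
    ("’72", "’72"),
    ("one", "one"),
    ("one", "one"),  # repeated tokens do not add new forms
    ("2", "2"),
]


def forms_per_lemma(pairs):
    """Map each lemma to the set of distinct forms observed with it."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return forms


def form_lemma_ratio(pairs):
    """Total number of distinct forms divided by the number of lemmas."""
    forms = forms_per_lemma(pairs)
    return sum(len(f) for f in forms.values()) / len(forms)


print(form_lemma_ratio(PAIRS))   # 4 forms over 3 lemmas
print(round(1243 / 1241, 6))     # arithmetic on the counts reported above
```

Under this reading the page's figures fit together: if two of the 1241 lemmas have two forms each and every other lemma has one, the total is 1241 + 2 = 1243 forms.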

The 1st highest number of forms (2) was observed with the lemma “’72”: ‘72, ’72.

The 2nd highest number of forms (2) was observed with the lemma “’73”: ‘73, ’73.

The 3rd highest number of forms (1) was observed with the lemma “’02”: ‘02.

NUM occurs with 2 features: NumType (4911; 100% instances), Number (1; 0% instances)

NUM occurs with 2 feature-value pairs: NumType=Card, Number=Sing

NUM occurs with 2 feature combinations. The most frequent feature combination is NumType=Card (4911 tokens). Examples: one, two, 2, 3, 5, 1, 10, 4, three, 20
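The three feature statistics above (features, feature-value pairs, feature combinations) differ only in what is counted per token. A minimal sketch, assuming the input is the FEATS column of the NUM tokens in CoNLL-U notation (`Name=Value` pairs joined by `|`, or `_` when empty); the function names and the three-token sample are hypothetical.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a {name: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the whole FEATS string is one combination
        for name, value in parse_feats(feats).items():
            features[name] += 1
            pairs[f"{name}={value}"] += 1
    return features, pairs, combos


# Illustrative FEATS values for three NUM tokens (not real treebank data).
features, pairs, combos = feature_stats(
    ["NumType=Card", "NumType=Card", "Number=Sing"]
)
print(features, pairs, combos, sep="\n")
```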

Relations

NUM nodes are attached to their parents using 29 different relations: nummod (2895; 59% instances), root (423; 9% instances), nmod (291; 6% instances), compound (254; 5% instances), obl (248; 5% instances), appos (212; 4% instances), list (115; 2% instances), nsubj (105; 2% instances), obj (105; 2% instances), conj (95; 2% instances), nmod:tmod (53; 1% instances), amod (19; 0% instances), parataxis (18; 0% instances), obl:tmod (15; 0% instances), advcl (9; 0% instances), advmod (9; 0% instances), xcomp (9; 0% instances), ccomp (7; 0% instances), obl:npmod (7; 0% instances), nmod:npmod (6; 0% instances), nsubj:pass (4; 0% instances), acl:relcl (3; 0% instances), case (2; 0% instances), det (2; 0% instances), reparandum (2; 0% instances), iobj (1; 0% instances), nmod:poss (1; 0% instances), orphan (1; 0% instances), vocative (1; 0% instances)

Parents of NUM nodes belong to 13 different categories (12 parts of speech plus the artificial root, which has no tag): NOUN (2379; 48% instances), PROPN (791; 16% instances), NUM (440; 9% instances), VERB (437; 9% instances), no parent, i.e. attached as root (423; 9% instances), SYM (354; 7% instances), ADJ (41; 1% instances), ADV (18; 0% instances), X (15; 0% instances), PRON (7; 0% instances), DET (5; 0% instances), AUX (1; 0% instances), PUNCT (1; 0% instances)
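The relation and parent statistics above can be gathered in one pass over each sentence. The following is a sketch under stated assumptions: the sample is hand-made, the helper names are hypothetical, and a node whose HEAD is 0 is recorded with an empty parent tag, which is one way to account for the 423 NUM nodes attached as root.

```python
from collections import Counter

# Hand-made CoNLL-U fragment (hypothetical data, not from UD_English).
SAMPLE = (
    "# text = two dogs chased 2 cats\n"
    "1\ttwo\ttwo\tNUM\tCD\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tdogs\tdog\tNOUN\tNNS\tNumber=Plur\t3\tnsubj\t_\t_\n"
    "3\tchased\tchase\tVERB\tVBD\t_\t0\troot\t_\t_\n"
    "4\t2\t2\tNUM\tCD\tNumType=Card\t5\tnummod\t_\t_\n"
    "5\tcats\tcat\tNOUN\tNNS\tNumber=Plur\t3\tobj\t_\t_\n"
    "\n"
    "# text = two\n"
    "1\ttwo\ttwo\tNUM\tCD\tNumType=Card\t0\troot\t_\t_\n"
)


def sentences(conllu_text):
    """Yield each sentence as a list of column lists (plain words only)."""
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Integer IDs only: drops multiword-token ranges and empty nodes.
        yield [r for r in rows if r[0].isdigit()]


def attachment_stats(conllu_text, upos):
    """Count DEPREL labels and parent UPOS tags for nodes tagged `upos`."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences(conllu_text):
        pos_by_id = {r[0]: r[3] for r in sent}
        for r in sent:
            if r[3] == upos:
                relations[r[7]] += 1  # column 8 is DEPREL
                # HEAD "0" has no entry, so root nodes get the tag "".
                parent_pos[pos_by_id.get(r[6], "")] += 1
    return relations, parent_pos


relations, parent_pos = attachment_stats(SAMPLE, "NUM")
print(relations.most_common())
print(parent_pos.most_common())
```

The percentages on this page are each count divided by the 4912 NUM tokens; for example, 2895 / 4912 rounds to 59% for nummod.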

3140 (64%) NUM nodes are leaves.

1106 (23%) NUM nodes have one child.

281 (6%) NUM nodes have two children.

385 (8%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
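The child-degree figures above follow from counting, for every NUM node, how many other nodes in the same sentence name it as their HEAD. A minimal sketch with hand-made data and hypothetical helper names:

```python
from collections import Counter

# Hand-made CoNLL-U fragment (hypothetical data, not from UD_English).
SAMPLE = (
    "# text = about 5 miles\n"
    "1\tabout\tabout\tADV\tRB\t_\t2\tadvmod\t_\t_\n"
    "2\t5\t5\tNUM\tCD\tNumType=Card\t3\tnummod\t_\t_\n"
    "3\tmiles\tmile\tNOUN\tNNS\tNumber=Plur\t0\troot\t_\t_\n"
    "\n"
    "# text = 2 cats\n"
    "1\t2\t2\tNUM\tCD\tNumType=Card\t2\tnummod\t_\t_\n"
    "2\tcats\tcat\tNOUN\tNNS\tNumber=Plur\t0\troot\t_\t_\n"
)


def sentences(conllu_text):
    """Yield each sentence as a list of column lists (plain words only)."""
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        yield [r for r in rows if r[0].isdigit()]


def child_degrees(conllu_text, upos):
    """Map number of children -> how many `upos` nodes have that many."""
    degrees = Counter()
    for sent in sentences(conllu_text):
        # Column 7 is HEAD: counting it gives each node's child count.
        n_children = Counter(r[6] for r in sent)
        for r in sent:
            if r[3] == upos:
                degrees[n_children[r[0]]] += 1  # missing key counts as 0
    return degrees


degrees = child_degrees(SAMPLE, "NUM")
print(sorted(degrees.items()))
print(max(degrees))  # highest child degree in the sample
```

In the sample, "5" has one child ("about") and "2" is a leaf. On this page, the four degree classes add up to the full token count: 3140 + 1106 + 281 + 385 = 4912.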

Children of NUM nodes are attached using 35 different relations: punct (711; 23% instances), case (559; 18% instances), nmod (349; 11% instances), advmod (217; 7% instances), nmod:tmod (197; 6% instances), appos (174; 6% instances), compound (157; 5% instances), conj (102; 3% instances), cop (92; 3% instances), nummod (91; 3% instances), cc (88; 3% instances), nsubj (88; 3% instances), det (60; 2% instances), parataxis (47; 2% instances), amod (29; 1% instances), acl:relcl (19; 1% instances), mark (15; 0% instances), nmod:npmod (12; 0% instances), aux (11; 0% instances), obl (11; 0% instances), advcl (9; 0% instances), discourse (5; 0% instances), acl (4; 0% instances), _ (3; 0% instances), nmod:poss (2; 0% instances), reparandum (2; 0% instances), cc:preconj (1; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), det:predet (1; 0% instances), goeswith (1; 0% instances), list (1; 0% instances), obj (1; 0% instances), vocative (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 17 different parts of speech: PUNCT (701; 23% instances), NOUN (572; 19% instances), ADP (483; 16% instances), NUM (437; 14% instances), ADV (189; 6% instances), SYM (113; 4% instances), AUX (103; 3% instances), ADJ (87; 3% instances), CCONJ (86; 3% instances), PRON (82; 3% instances), VERB (75; 2% instances), DET (69; 2% instances), PROPN (47; 2% instances), SCONJ (8; 0% instances), PART (5; 0% instances), INTJ (3; 0% instances), X (3; 0% instances)