
This page pertains to UD version 2.

Treebank Statistics: UD_English-LittlePrince: POS Tags: NUM

There are 34 NUM lemmas (3%), 34 NUM types (3%) and 142 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 13 in number of tokens.

The 10 most frequent NUM lemmas: one, hundred, five, two, twenty, four, three, million, seven, six

The 10 most frequent NUM types: one, hundred, five, two, twenty, four, three, million, seven, six

The 10 most frequent ambiguous lemmas: one (NUM 38, PRON 8, NOUN 7), million (NOUN 8, NUM 5)

The 10 most frequent ambiguous types: one (NUM 33, NOUN 7, PRON 6)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.151341).
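A ratio like the one above can be reproduced with a short script. The following is a minimal sketch, not the tool that generated this page: the function name `form_lemma_ratio`, the `(form, lemma, upos)` tuple layout, and the choice to lowercase forms before counting are all assumptions made for illustration.

```python
from collections import defaultdict


def form_lemma_ratio(tokens, upos):
    """Average number of distinct word forms per lemma for one UPOS tag.

    tokens: iterable of (form, lemma, upos) tuples (hypothetical layout).
    Forms are lowercased before counting; this is an assumption, the
    page does not say how case is handled.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma, tag in tokens:
        if tag == upos:
            forms_by_lemma[lemma].add(form.lower())
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)
```

A ratio of exactly 1.0, as reported for NUM, means that every lemma was seen with a single form under whatever counting convention is used.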

The 1st highest number of forms (1) was observed with the lemma “10”: 10.

The 2nd highest number of forms (1) was observed with the lemma “11”: 11.

The 3rd highest number of forms (1) was observed with the lemma “12”: 12.

NUM occurs with 1 feature: NumType (142; 100% instances)

NUM occurs with 1 feature-value pair: NumType=Card

NUM occurs with 1 feature combination. The most frequent feature combination is NumType=Card (142 tokens). Examples: one, hundred, five, two, twenty, four, three, million, seven, six

Relations

NUM nodes are attached to their parents using 14 different relations: nummod (55; 39% instances), compound (26; 18% instances), conj (19; 13% instances), nsubj (10; 7% instances), flat (9; 6% instances), obj (8; 6% instances), nmod (6; 4% instances), obl (3; 2% instances), advcl (1; 1% instances), appos (1; 1% instances), ccomp (1; 1% instances), obl:unmarked (1; 1% instances), parataxis (1; 1% instances), root (1; 1% instances)
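Relation counts of this kind can be tallied directly from a CoNLL-U file. The sketch below assumes the standard 10-column, tab-separated CoNLL-U layout (UPOS in column 4, DEPREL in column 8); the function name `deprel_counts` is hypothetical and this is not the script behind this page.

```python
from collections import Counter


def deprel_counts(conllu_text, upos):
    """Count the relations that attach nodes with the given UPOS tag.

    Comment lines, multiword token ranges (IDs such as "3-4") and empty
    nodes (IDs such as "5.1") are skipped.
    """
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Only plain integer IDs are regular syntactic words.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts
```

Dividing each count by the total number of matching tokens and rounding gives percentages like those listed above, for example 55 of 142 for nummod, which rounds to 39%.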

Parents of NUM nodes belong to 7 different parts of speech: NUM (62; 44% instances), NOUN (44; 31% instances), VERB (24; 17% instances), PROPN (9; 6% instances), ADJ (1; 1% instances), PRON (1; 1% instances), no parent tag (1; 1% instances; the single NUM node attached as root)

66 (46%) NUM nodes are leaves.

32 (23%) NUM nodes have one child.

29 (20%) NUM nodes have two children.

15 (11%) NUM nodes have three or more children.

The highest child degree of a NUM node is 7.
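The child-degree figures above amount to a histogram of how many dependents each NUM node has. A minimal sketch follows, assuming each sentence is already available as a list of `(id, upos, head)` tuples; that layout and the name `child_degree_histogram` are illustrative choices, not part of the treebank tooling.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map child count -> number of nodes with that UPOS tag.

    sentences: list of sentences, each a list of (id, upos, head)
    tuples with integer IDs and head 0 for the root (assumed layout).
    """
    hist = Counter()
    for sent in sentences:
        # Number of dependents per head ID within this sentence.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1
    return hist
```

In such a histogram, key 0 counts the leaves, and the largest key is the highest child degree.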

Children of NUM nodes are attached using 10 different relations: punct (52; 35% instances), compound (27; 18% instances), cc (17; 11% instances), conj (17; 11% instances), nummod (14; 9% instances), nmod (8; 5% instances), case (7; 5% instances), advmod (3; 2% instances), det (3; 2% instances), appos (1; 1% instances)

Children of NUM nodes belong to 10 different parts of speech: NUM (62; 42% instances), PUNCT (52; 35% instances), CCONJ (17; 11% instances), SYM (5; 3% instances), ADP (3; 2% instances), ADV (3; 2% instances), DET (3; 2% instances), ADJ (2; 1% instances), NOUN (1; 1% instances), PRON (1; 1% instances)