Treebank Statistics: UD_English-PUD: POS Tags: NUM
There are 214 NUM lemmas (4%), 216 NUM types (4%) and 464 NUM tokens (2%).
Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 12 in number of tokens.
The 10 most frequent NUM lemmas: one, two, three, million, 10, four, 1, six, 3, I
The 10 most frequent NUM types: one, two, three, million, 10, four, 1, six, 3, I
The 10 most frequent ambiguous lemmas: one (NUM 39, NOUN 7, PRON 1), million (NUM 13, NOUN 1), I (PRON 53, NUM 6), billion (NUM 6, NOUN 2), five (NUM 4, ADJ 1), ten (NUM 4, NOUN 1), thousand (NOUN 1, NUM 1)
The 10 most frequent ambiguous types: one (NUM 36, NOUN 4), I (PRON 48, NUM 6), five (NUM 3, ADJ 1)
Morphology
The form / lemma ratio of NUM is 1.009346 (the average of all parts of speech is 1.149901).
The 1st highest number of forms (2) was observed with the lemma “3000”: 3,000, 3000.
The 2nd highest number of forms (2) was observed with the lemma “billion”: billion, bn.
The 3rd highest number of forms (1) was observed with the lemma “1”: 1.
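The reported form / lemma ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page. The check below is only a sketch under that assumption; the function name form_lemma_ratio is hypothetical.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# 216 NUM types and 214 NUM lemmas, as reported above.
ratio = form_lemma_ratio(216, 214)
print(f"{ratio:.6f}")  # 1.009346
```

A ratio this close to 1 matches the list above: only the lemmas "3000" and "billion" have more than one form.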
NUM occurs with 3 features: NumForm (464; 100% instances), NumType (464; 100% instances), Abbr (4; 1% instances)
NUM occurs with 6 feature-value pairs: Abbr=Yes, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac
NUM occurs with 5 feature combinations.
The most frequent feature combination is NumForm=Digit|NumType=Card (288 tokens).
Examples: 10, 1, 3, 2014, 2015, 100, 1492, 20, 2010, 2012
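The three kinds of counts above (features, feature-value pairs, feature combinations) can all be read off the FEATS column of a CoNLL-U file. The following is a minimal sketch, assuming each FEATS value is a string of Name=Value pairs joined by "|", with "_" for an empty column; the function name feature_stats and the input list are hypothetical and do not reproduce the treebank's data.

```python
from collections import Counter


def feature_stats(feats_values):
    """Count features, feature-value pairs and full combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1  # the whole FEATS string is one combination
        if feats == "_":
            continue  # "_" marks an empty FEATS column
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Invented FEATS values for four NUM tokens (illustration only).
FEATS = [
    "NumForm=Digit|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "NumForm=Word|NumType=Card",
    "Abbr=Yes|NumForm=Word|NumType=Card",
]

features, pairs, combos = feature_stats(FEATS)
print(len(features), len(pairs), len(combos))  # 3 4 3
print(combos.most_common(1))  # [('NumForm=Digit|NumType=Card', 2)]
```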
Relations
NUM nodes are attached to their parents using 16 different relations: nummod (254; 55% instances), obl (79; 17% instances), compound (31; 7% instances), nmod (31; 7% instances), flat (14; 3% instances), conj (12; 3% instances), nmod:unmarked (10; 2% instances), nsubj (10; 2% instances), obj (7; 2% instances), appos (6; 1% instances), root (3; 1% instances), nsubj:pass (2; 0% instances), orphan (2; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), xcomp (1; 0% instances)
Parents of NUM nodes belong to 9 different parts of speech: NOUN (209; 45% instances), VERB (93; 20% instances), PROPN (76; 16% instances), SYM (38; 8% instances), NUM (35; 8% instances), ADJ (5; 1% instances), ADV (4; 1% instances), no part of speech, i.e. the artificial root (3; 1% instances), PRON (1; 0% instances)
259 (56%) NUM nodes are leaves.
148 (32%) NUM nodes have one child.
36 (8%) NUM nodes have two children.
21 (5%) NUM nodes have three or more children.
The highest child degree of a NUM node is 8.
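The child-degree figures above amount to a histogram of how many dependents each NUM node has. A minimal sketch of that computation for a single tree, assuming each token is given as an (id, upos, head) tuple; the function name child_degree_histogram and the toy tree are hypothetical. The last lines only re-check that the reported counts add up to the 464 NUM tokens and round to the reported percentages.

```python
from collections import Counter


def child_degree_histogram(sentence, upos):
    """Map number of children -> number of nodes with the given UPOS tag.

    sentence: list of (id, upos, head) tuples for one dependency tree.
    """
    n_children = Counter(head for _id, _upos, head in sentence)
    hist = Counter()
    for node_id, node_upos, _head in sentence:
        if node_upos == upos:
            hist[n_children[node_id]] += 1  # a missing key counts as 0 children
    return hist


# Toy tree (illustration only): "about 3,000 people , 10"
TREE = [
    (1, "ADV", 2),    # "about" depends on the first NUM
    (2, "NUM", 3),    # first NUM has one child
    (3, "NOUN", 0),   # root
    (4, "PUNCT", 3),
    (5, "NUM", 3),    # second NUM is a leaf
]
hist = child_degree_histogram(TREE, "NUM")
print(dict(hist))  # {1: 1, 0: 1}

# Consistency check on the figures reported above.
reported = {"0": 259, "1": 148, "2": 36, "3+": 21}
total = sum(reported.values())
print(total)  # 464
print({k: round(100 * v / total) for k, v in reported.items()})
# {'0': 56, '1': 32, '2': 8, '3+': 5}
```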
Children of NUM nodes are attached using 19 different relations: case (116; 38% instances), advmod (46; 15% instances), nmod (31; 10% instances), punct (31; 10% instances), compound (18; 6% instances), cc (11; 4% instances), conj (8; 3% instances), cop (7; 2% instances), nsubj (7; 2% instances), det (6; 2% instances), nmod:unmarked (6; 2% instances), nummod (6; 2% instances), amod (3; 1% instances), acl (1; 0% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), flat (1; 0% instances), orphan (1; 0% instances), parataxis (1; 0% instances)
Children of NUM nodes belong to 12 different parts of speech: ADP (113; 37% instances), ADV (41; 14% instances), NUM (35; 12% instances), NOUN (34; 11% instances), PUNCT (31; 10% instances), ADJ (13; 4% instances), CCONJ (11; 4% instances), AUX (7; 2% instances), DET (7; 2% instances), PROPN (4; 1% instances), VERB (4; 1% instances), SYM (2; 1% instances)