
This page pertains to UD version 2.

Treebank Statistics: UD_Spanish-PUD: POS Tags: NUM

There are 228 NUM lemmas (5%), 230 NUM types (4%) and 435 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: dos, tres, 1, 10, cuatro, 3, mil, seis, 70, 100

The 10 most frequent NUM types: dos, tres, 1, 10, cuatro, 3, mil, seis, 70, 100

The 10 most frequent ambiguous lemmas: mil (NUM 7, NOUN 3), iii (ADJ 4, NUM 3), (NOUN 1, NUM 1), ciento (NOUN 1, NUM 1), uno (DET 456, NOUN 19, PRON 2, NUM 1), v (ADJ 1, NUM 1)

The 10 most frequent ambiguous types: III (ADJ 4, NUM 3), un (DET 242, NUM 3), (NOUN 1, NUM 1), V (ADJ 1, NUM 1), una (DET 172, PRON 2, NOUN 1, NUM 1)

Morphology

The form / lemma ratio of NUM is 1.008772 (the average of all parts of speech is 1.314341).
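The ratio can be reproduced from the counts in the summary at the top of this page. The sketch below assumes that the ratio is the number of distinct NUM types divided by the number of distinct NUM lemmas; the page does not define it explicitly, and the variable names are illustrative only.

```python
# Counts taken from the summary at the top of this page.
num_types = 230   # distinct NUM word forms (types)
num_lemmas = 228  # distinct NUM lemmas

# Assumption: the form / lemma ratio is types divided by lemmas.
form_lemma_ratio = num_types / num_lemmas

# Print with six decimal places, the precision used on this page.
print(f"{form_lemma_ratio:.6f}")
```

This agrees with the form counts reported in this section: two lemmas have two forms each and the remaining 226 have one, so 226 + 2 × 2 = 230 forms.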

The highest number of forms (2) was observed with two lemmas: “3000” (3.000, 3000) and “5000” (5.000, 5000).

All other lemmas have a single form, for example “1”: 1.

NUM occurs with 4 features: NumForm (435; 100% instances), NumType (435; 100% instances), Gender (430; 99% instances), Foreign (2; 0% instances)

NUM occurs with 7 feature-value pairs: Foreign=Yes, Gender=Fem, Gender=Masc, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card

NUM occurs with 7 feature combinations. The most frequent feature combination is Gender=Masc|NumForm=Digit|NumType=Card (290 tokens). Examples: 1, 10, 3, 70, 100, 1492, 20, 2010, 2014, 2015
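A feature combination such as the one above is a pipe-separated list of Feature=Value pairs, as in the FEATS column of a CoNLL-U file. The following is a minimal sketch of how such a string could be split into its pairs; `parse_feats` is a hypothetical helper written for this illustration, not part of any UD tool.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a dict; "_" means no features."""
    if feats == "_":
        return {}
    # Each pair has the form Feature=Value; pairs are separated by "|".
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# The most frequent NUM feature combination reported on this page.
combo = "Gender=Masc|NumForm=Digit|NumType=Card"
features = parse_feats(combo)
print(features)
```

Counting the distinct strings gives the number of feature combinations, while counting the distinct keys and key-value items of the parsed dictionaries gives the number of features and feature-value pairs.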

Relations

NUM nodes are attached to their parents using 12 different relations: nummod (191; 44% instances), obl (80; 18% instances), nmod (75; 17% instances), appos (47; 11% instances), obl:tmod (16; 4% instances), conj (12; 3% instances), nsubj (6; 1% instances), obj (4; 1% instances), compound (1; 0% instances), flat:name (1; 0% instances), nsubj:pass (1; 0% instances), root (1; 0% instances)
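The percentages above can be checked against the raw counts. The sketch below copies the counts from the paragraph and assumes that each share is rounded to the nearest whole percent, which is why relations with a single instance are shown as 0%. The helper name `share` is illustrative.

```python
from collections import Counter

# Relation counts for NUM nodes, copied from the paragraph above.
relation_counts = Counter({
    "nummod": 191, "obl": 80, "nmod": 75, "appos": 47, "obl:tmod": 16,
    "conj": 12, "nsubj": 6, "obj": 4, "compound": 1, "flat:name": 1,
    "nsubj:pass": 1, "root": 1,
})

# The counts should add up to the number of NUM tokens.
total = sum(relation_counts.values())

def share(count, total):
    """Percentage rounded to the nearest whole number (assumed rounding)."""
    return round(100 * count / total)

for relation, count in relation_counts.most_common():
    print(f"{relation} ({count}; {share(count, total)}% instances)")
```

The counts sum to 435, the number of NUM tokens, so every NUM node is accounted for exactly once.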

Parents of NUM nodes belong to 11 different parts of speech: NOUN (282; 65% instances), VERB (102; 23% instances), SYM (21; 5% instances), NUM (20; 5% instances), PROPN (3; 1% instances), DET (2; 0% instances), ADJ (1; 0% instances), ADP (1; 0% instances), ADV (1; 0% instances), PRON (1; 0% instances), no tag, which corresponds to the artificial root above the single NUM node attached as root (1; 0% instances)

216 (50%) NUM nodes are leaves.

128 (29%) NUM nodes have one child.

62 (14%) NUM nodes have two children.

29 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
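The child degree of a node is the number of tokens that name it as their head. The sketch below shows one way to bucket NUM nodes by child degree; the toy sentence is hypothetical data made up for illustration, not taken from the treebank, and `bucket` is an illustrative helper.

```python
from collections import Counter

# Hypothetical toy sentence as (ID, UPOS, HEAD) triples in CoNLL-U style.
# HEAD 0 marks the root. This is illustrative data, not treebank data.
tokens = [
    (1, "ADP", 3),
    (2, "DET", 3),
    (3, "NUM", 4),
    (4, "VERB", 0),
    (5, "NUM", 6),
    (6, "NOUN", 4),
]

# Count how many tokens name each ID as their head.
child_counts = Counter(head for _, _, head in tokens)

def bucket(n_children):
    """Map a child count to the buckets used on this page."""
    if n_children == 0:
        return "leaf"
    if n_children == 1:
        return "one child"
    if n_children == 2:
        return "two children"
    return "three or more children"

# Bucket only the NUM nodes; IDs missing from the counter have 0 children.
degree_buckets = Counter(
    bucket(child_counts[tok_id]) for tok_id, upos, _ in tokens if upos == "NUM"
)
print(degree_buckets)
```

On this page the four buckets sum to 216 + 128 + 62 + 29 = 435, the total number of NUM tokens.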

Children of NUM nodes are attached using 16 different relations: case (158; 46% instances), det (51; 15% instances), punct (46; 13% instances), nmod (34; 10% instances), advmod (19; 5% instances), cc (12; 3% instances), conj (12; 3% instances), nummod (6; 2% instances), acl:relcl (2; 1% instances), acl (1; 0% instances), amod (1; 0% instances), cop (1; 0% instances), nsubj (1; 0% instances), obl (1; 0% instances), orphan (1; 0% instances), parataxis (1; 0% instances)

Children of NUM nodes belong to 11 different parts of speech: ADP (159; 46% instances), DET (51; 15% instances), PUNCT (46; 13% instances), NOUN (33; 10% instances), NUM (20; 6% instances), ADV (19; 5% instances), CCONJ (12; 3% instances), VERB (4; 1% instances), ADJ (1; 0% instances), AUX (1; 0% instances), PROPN (1; 0% instances)