
This page pertains to UD version 2.

Treebank Statistics: UD_Spanish-GSD: POS Tags: NUM

There are 2304 NUM lemmas (6%), 2411 NUM types (5%) and 11062 NUM tokens (3%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: dos, tres, 2010, 0, 3, cuatro, 1, 2, 10, 4

The 10 most frequent NUM types: dos, tres, 2010, 0, cuatro, 3, 1, 2, 10, 4
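Counts like the ones above can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `upos_profile` and the sample rows are hypothetical, and it assumes that "types" means distinct word forms as written (case-sensitive), which may differ from the normalization used for the figures reported here.

```python
from collections import Counter

def upos_profile(conllu_text, upos="NUM"):
    """Return (lemma_freq, form_freq) Counters for tokens tagged with `upos`."""
    lemma_freq, form_freq = Counter(), Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines: 10 columns and an integer ID.
        # This skips comments, blank lines, multiword ranges (1-2)
        # and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:          # column 4 is UPOS
            form_freq[cols[1]] += 1  # column 2 is FORM
            lemma_freq[cols[2]] += 1 # column 3 is LEMMA
    return lemma_freq, form_freq

# Illustrative rows only; they are not taken from UD_Spanish-GSD.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "dos", "dos", "NUM", "_", "NumForm=Word|NumType=Card", "2", "nummod", "_", "_"),
    ("2", "casas", "casa", "NOUN", "_", "Gender=Fem|Number=Plur", "0", "root", "_", "_"),
    ("3", "y", "y", "CCONJ", "_", "_", "4", "cc", "_", "_"),
    ("4", "Dos", "dos", "NUM", "_", "NumForm=Word|NumType=Card", "2", "conj", "_", "_"),
    ("5", "de", "de", "ADP", "_", "_", "6", "case", "_", "_"),
    ("6", "2010", "2010", "NUM", "_", "NumForm=Digit|NumType=Card", "4", "nmod", "_", "_"),
    ("7", "dos", "dos", "NUM", "_", "NumForm=Word|NumType=Card", "6", "nummod", "_", "_"),
])

lemma_freq, form_freq = upos_profile(SAMPLE)
print(len(lemma_freq), len(form_freq), sum(form_freq.values()))
print(lemma_freq.most_common(10))
```

The lengths of the two counters give the lemma and type counts, the sum of either counter gives the token count, and `most_common(10)` yields a ranking comparable to the lists above.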

The 10 most frequent ambiguous lemmas: dos (NUM 594, NOUN 5, PROPN 5, ADP 2, X 2), tres (NUM 232, PROPN 3, NOUN 2), cuatro (NUM 164, NOUN 2, PROPN 1), uno (DET 7651, PRON 539, NUM 109, ADJ 2, NOUN 2, PROPN 2, X 1), 2000 (NUM 102, NOUN 1), 6 (NUM 96, NOUN 1), cinco (NUM 93, NOUN 2, PROPN 1), ii (NUM 70, ADJ 9), i (NUM 53, CCONJ 20, X 15, PROPN 13, ADJ 4, PRON 1), mil (NUM 47, NOUN 34, PROPN 3)

The 10 most frequent ambiguous types: dos (NUM 569, NOUN 5, X 2, ADP 1), tres (NUM 224, NOUN 1), 0 (NUM 195, CCONJ 1), cuatro (NUM 157, NOUN 1), 2000 (NUM 94, NOUN 1), cinco (NUM 87, NOUN 2), II (NUM 70, ADJ 9), un (DET 3886, NUM 53), I (NUM 53, PROPN 12, X 9, PRON 5, ADJ 4), siete (NUM 36, NOUN 1, PROPN 1)

Morphology

The form / lemma ratio of NUM is 1.046441 (the average of all parts of speech is 1.287914).
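The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page. A short check, assuming that this is how the ratio is defined:

```python
# Counts reported for NUM in UD_Spanish-GSD on this page.
num_types = 2411
num_lemmas = 2304

# Assumed definition: distinct forms (types) per distinct lemma.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.046441
```

A value close to 1 means that most numeral lemmas appear in only one form.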

The highest number of forms (4) was observed with the lemma “2000”: 2,000, 2.000, 2000, 25000.

The second-highest number of forms (also 4) was observed with the lemma “3”: 3, 33, 36, 37.

The third-highest number of forms (3) was observed with the lemma “1100”: 1,100, 1.100, 1100.

NUM occurs with 6 features: NumType (11061; 100% instances), NumForm (11023; 100% instances), Number (1628; 15% instances), Gender (209; 2% instances), Foreign (4; 0% instances), Typo (2; 0% instances)

NUM occurs with 10 feature-value pairs: Foreign=Yes, Gender=Fem, Gender=Masc, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, Number=Plur, Number=Sing, Typo=Yes

NUM occurs with 26 feature combinations. The most frequent feature combination is NumForm=Digit|NumType=Card (8948 tokens). Examples: 2010, 0, 3, 1, 2, 10, 4, 5, 20, 2011
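Feature statistics of this kind come from the FEATS column of CoNLL-U, where pairs are written as `Name=Value` and joined with `|`, and `_` marks an empty set. The sketch below shows one way to tally features and feature combinations. The helper names `parse_feats` and `feature_usage` are hypothetical and the input values are illustrative, not drawn from the treebank.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a FEATS string such as 'NumForm=Digit|NumType=Card' into a dict."""
    if feats == "_":  # '_' means the token has no features
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def feature_usage(feats_column):
    """Count how often each feature and each full combination occurs."""
    per_feature, per_combo = Counter(), Counter()
    for feats in feats_column:
        per_combo[feats] += 1
        per_feature.update(parse_feats(feats).keys())
    return per_feature, per_combo

# Illustrative FEATS values for three NUM tokens.
feats_column = [
    "NumForm=Digit|NumType=Card",
    "NumForm=Digit|NumType=Card",
    "Gender=Masc|Number=Plur|NumForm=Word|NumType=Card",
]

per_feature, per_combo = feature_usage(feats_column)
print(per_feature)
print(per_combo.most_common(1))
```

The number of keys in `per_combo` corresponds to the count of feature combinations, and `per_feature` gives the per-feature instance counts.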

Relations

NUM nodes are attached to their parents using 17 different relations: nummod (6061; 55% instances), nmod (2129; 19% instances), obl (1718; 16% instances), appos (453; 4% instances), conj (438; 4% instances), dep (91; 1% instances), nsubj (70; 1% instances), obj (46; 0% instances), root (19; 0% instances), flat (11; 0% instances), xcomp (7; 0% instances), parataxis (6; 0% instances), nsubj:pass (5; 0% instances), compound (4; 0% instances), amod (2; 0% instances), obl:agent (1; 0% instances), obl:arg (1; 0% instances)

Parents of NUM nodes belong to 14 different parts of speech: NOUN (5115; 46% instances), PROPN (1986; 18% instances), VERB (1884; 17% instances), SYM (1076; 10% instances), NUM (634; 6% instances), X (224; 2% instances), ADJ (65; 1% instances), PRON (31; 0% instances), ADV (20; 0% instances), no part of speech, i.e. the artificial root node (19; 0% instances), DET (3; 0% instances), ADP (2; 0% instances), CCONJ (2; 0% instances), PART (1; 0% instances)

5413 (49%) NUM nodes are leaves.

3705 (33%) NUM nodes have one child.

1357 (12%) NUM nodes have two children.

587 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 16.
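The child-degree figures above can be derived from the HEAD column: a node's degree is the number of tokens whose head is that node. The following sketch uses a hypothetical helper `child_degree_histogram` and a toy sentence given as `(id, head, upos)` tuples. It also checks that the four reported bucket sizes add up to the 11062 NUM tokens and reproduce the rounded percentages.

```python
from collections import Counter

def child_degree_histogram(sentence, upos="NUM"):
    """sentence: list of (id, head, upos). Returns Counter: degree -> node count."""
    # Number of children per node ID, counted from the HEAD column.
    children = Counter(head for _, head, _ in sentence)
    # A Counter returns 0 for IDs that never occur as a head (leaves).
    return Counter(children[node_id] for node_id, _, tag in sentence if tag == upos)

# Toy sentence: node 3 (NUM) has children 1, 2 and 5; node 4 (NUM) is a leaf.
toy = [(1, 3, "ADP"), (2, 3, "DET"), (3, 0, "NUM"), (4, 5, "NUM"), (5, 3, "NOUN")]
print(child_degree_histogram(toy))

# Bucket sizes reported on this page: leaves, one child, two children, three or more.
reported = [5413, 3705, 1357, 587]
total = sum(reported)
shares = [round(100 * n / total) for n in reported]
print(total, shares)  # 11062 [49, 33, 12, 5]
```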

Children of NUM nodes are attached using 22 different relations: case (3130; 37% instances), punct (1484; 18% instances), nmod (1181; 14% instances), det (966; 11% instances), conj (458; 5% instances), advmod (367; 4% instances), cc (312; 4% instances), dep (155; 2% instances), nummod (116; 1% instances), obl (71; 1% instances), appos (62; 1% instances), amod (35; 0% instances), acl:relcl (20; 0% instances), cop (17; 0% instances), nsubj (15; 0% instances), acl (11; 0% instances), advcl (9; 0% instances), compound (7; 0% instances), flat (3; 0% instances), parataxis (3; 0% instances), aux (1; 0% instances), mark (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: ADP (3129; 37% instances), PUNCT (1484; 18% instances), DET (1034; 12% instances), PROPN (945; 11% instances), NUM (634; 8% instances), ADV (423; 5% instances), CCONJ (313; 4% instances), NOUN (209; 2% instances), SYM (83; 1% instances), VERB (48; 1% instances), ADJ (45; 1% instances), X (42; 0% instances), AUX (18; 0% instances), PRON (15; 0% instances), PART (1; 0% instances), SCONJ (1; 0% instances)