This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

NUM: numeral

This document is a placeholder for the language-specific documentation for NUM.


Treebank Statistics (UD_Arabic)

There are 993 NUM lemmas (6%), 1083 NUM types (4%) and 7756 NUM tokens (3%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 9 in number of tokens.
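Counts of this kind can be reproduced from a treebank in CoNLL-U format. The following is a minimal sketch, not the script that generated this page. The function name `count_pos` and the sample sentence are hypothetical, and it assumes that a "type" is a distinct word form (column 2), that the lemma is in column 3 and that the universal POS tag is in column 4.

```python
def count_pos(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one universal POS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges such as "1-2"; they carry no POS tag.
        if len(cols) < 4 or "-" in cols[0]:
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


# Hypothetical sample, truncated to the first four columns.
sample = "\n".join([
    "# sent_id = demo",
    "1\tthree\tthree\tNUM",
    "2\tbooks\tbook\tNOUN",
    "3\tThree\tthree\tNUM",
])
print(count_pos(sample, "NUM"))  # (1, 2, 2)
```

In the sample, the two NUM tokens share one lemma but differ in spelling, so they count as one lemma, two types and two tokens.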

The 10 most frequent NUM lemmas: مِليُون، أَلف، 15، 3، ثَلَاثَة، مِليَار، 6، 2، 8، 7

The 10 most frequent NUM types: مليون، 15، 3، 6، 2، 8، 7، مليار، ألف، 4

The most frequent ambiguous lemmas: اِثنَان (NOUN 58, NUM 44), وَاحِد (ADJ 100, NUM 31), أَحَد (NOUN 194, NUM 1)

The 10 most frequent ambiguous types: مليون (NUM 485, X 48), مليار (NUM 153, X 28), ألف (NUM 143, X 2, VERB 1), بليون (NUM 71, X 1), الف (NUM 62, X 4), عشرة (NUM 49, X 2), عشرين (NUM 31, X 1), اثنين (NUM 29, NOUN 1), الاف (NUM 26, X 4), خمس (NUM 24, X 1)

Morphology

The form / lemma ratio of NUM is 1.090634 (the average of all parts of speech is 1.685612).
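The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given in the treebank statistics above (1083 / 993). A small check, assuming that is how the figure was derived; `form_lemma_ratio` is a hypothetical helper name:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    if n_lemmas == 0:
        raise ValueError("no lemmas observed")
    return n_types / n_lemmas


# Counts for NUM taken from the treebank statistics on this page.
ratio = form_lemma_ratio(1083, 993)
print(f"{ratio:.6f}")  # 1.090634
```

Since every lemma has at least one form, a value this close to 1 means that most NUM lemmas occur in only one form.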

The lemma with the most forms (16) is “أَلف”: آلاف, آلافا, ألف, ألفا, ألفاً, ألفي, ألفين, الآلاف, الألف, الاف, الالاف, الف, الفا, الفى, الفي, الفين.

The lemma with the second most forms (9) is “أَربَعَة”: أربع, أربعاً, أربعة, اربع, اربعة, الأربع, الأربعة, الاربع, الاربعة.

The lemma with the third most forms (9) is “مِليَار”: المليار, المليارات, مليار, مليارا, مليارات, ملياراً, مليارى, ملياري, مليارين.

NUM occurs with 7 features: NumForm (7756; 100% instances), Case (2206; 28% instances), Definite (2205; 28% instances), Number (1442; 19% instances), Gender (700; 9% instances), NumValue (580; 7% instances), Negative (1; 0% instances)

NUM occurs with 18 feature-value pairs: Case=Acc, Case=Gen, Case=Nom, Definite=Com, Definite=Def, Definite=Ind, Definite=Red, Gender=Fem, Gender=Masc, Negative=Neg, NumForm=Digit, NumForm=Word, NumValue=1, NumValue=2, NumValue=3, Number=Dual, Number=Plur, Number=Sing

NUM occurs with 77 feature combinations. The most frequent feature combination is NumForm=Digit (5521 tokens). Examples: 15، 3، 6، 2، 8، 7، 4، 11، 10، 12

Relations

NUM nodes are attached to their parents using 20 different relations: nummod (3706; 48% instances), conj (1035; 13% instances), dobj (1015; 13% instances), advmod (674; 9% instances), dep (641; 8% instances), nsubj (288; 4% instances), appos (126; 2% instances), root (117; 2% instances), nsubjpass (55; 1% instances), iobj (36; 0% instances), cop (26; 0% instances), parataxis (11; 0% instances), nmod (8; 0% instances), cc (7; 0% instances), acl (2; 0% instances), advcl (2; 0% instances), case (2; 0% instances), ccomp (2; 0% instances), xcomp (2; 0% instances), aux (1; 0% instances)
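The percentages in this list appear to be rounded to whole numbers, which is why relations with only a handful of occurrences show up as 0%. A short sketch of that formatting, using a few of the counts listed above. The helper name `format_distribution` is hypothetical and the rounding rule is an assumption, although it matches the figures shown.

```python
def format_distribution(counts, total):
    """Format label -> count pairs as 'label (count; N% instances)'."""
    # Most frequent first; ties broken alphabetically by label.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ", ".join(
        f"{label} ({n}; {round(100 * n / total)}% instances)"
        for label, n in ranked
    )


# A subset of the parent relations of NUM, out of 7756 NUM tokens.
subset = {"nummod": 3706, "conj": 1035, "iobj": 36, "aux": 1}
print(format_distribution(subset, 7756))
# nummod (3706; 48% instances), conj (1035; 13% instances), iobj (36; 0% instances), aux (1; 0% instances)
```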

Parents of NUM nodes belong to 14 different parts of speech: NUM (2531; 33% instances), NOUN (2340; 30% instances), VERB (1739; 22% instances), X (547; 7% instances), ADJ (296; 4% instances), ROOT (117; 2% instances), PRON (114; 1% instances), CONJ (20; 0% instances), DET (17; 0% instances), ADV (13; 0% instances), ADP (8; 0% instances), PUNCT (8; 0% instances), PART (4; 0% instances), PROPN (2; 0% instances)

1576 (20%) NUM nodes are leaves.

2953 (38%) NUM nodes have one child.

1956 (25%) NUM nodes have two children.

1271 (16%) NUM nodes have three or more children.

The highest child degree of a NUM node is 22.
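Child-degree figures like these can be derived from the HEAD column of a CoNLL-U file by counting, for every node, how many other nodes in the same sentence point to it. A minimal sketch, assuming the ID is in column 1, the universal POS tag in column 4 and HEAD in column 7; `child_degree_buckets` and the sample sentence are hypothetical.

```python
from collections import Counter


def child_degree_buckets(conllu_text, upos):
    """Count nodes tagged `upos` by number of children: 0, 1, 2, or 3+."""
    buckets = Counter()
    # Sentences are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in block.splitlines()
            if line and not line.startswith("#")
        ]
        # Keep ordinary word lines only (skip ranges such as "1-2").
        rows = [r for r in rows if len(r) >= 7 and r[0].isdigit()]
        n_children = Counter(r[6] for r in rows)  # HEAD id -> child count
        for r in rows:
            if r[3] == upos:
                # Key 3 collects every node with three or more children.
                buckets[min(n_children[r[0]], 3)] += 1
    return {k: buckets[k] for k in range(4)}


# Hypothetical sentence: "about three books five" with made-up attachments.
sample = "\n".join([
    "1\tabout\tabout\tADV\t_\t_\t2\tadvmod\t_\t_",
    "2\tthree\tthree\tNUM\t_\t_\t3\tnummod\t_\t_",
    "3\tbooks\tbook\tNOUN\t_\t_\t0\troot\t_\t_",
    "4\tfive\tfive\tNUM\t_\t_\t3\tconj\t_\t_",
])
print(child_degree_buckets(sample, "NUM"))  # {0: 1, 1: 1, 2: 0, 3: 0}
```

Dividing each bucket by the total number of NUM nodes gives percentages comparable to those reported above.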

Children of NUM nodes are attached using 21 different relations: nmod (4128; 35% instances), case (1810; 16% instances), punct (1559; 13% instances), nummod (1493; 13% instances), conj (1015; 9% instances), cc (692; 6% instances), amod (335; 3% instances), nsubj (121; 1% instances), appos (112; 1% instances), acl (101; 1% instances), dep (64; 1% instances), advmod:emph (52; 0% instances), parataxis (49; 0% instances), advmod (34; 0% instances), cop (25; 0% instances), dobj (24; 0% instances), mark (17; 0% instances), xcomp (9; 0% instances), aux (6; 0% instances), advcl (5; 0% instances), neg (1; 0% instances)

Children of NUM nodes belong to 12 different parts of speech: NOUN (3699; 32% instances), NUM (2531; 22% instances), ADP (1832; 16% instances), PUNCT (1559; 13% instances), CONJ (527; 5% instances), X (387; 3% instances), ADJ (374; 3% instances), SYM (346; 3% instances), VERB (175; 2% instances), PRON (133; 1% instances), PART (49; 0% instances), ADV (40; 0% instances)


NUM in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]