This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-NYUAD: POS Tags: NUM

There are 10 NUM lemmas (0%), 1 NUM type (6%) and 15147 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 15 in number of lemmas, 9 in number of types and 11 in number of tokens.
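As a rough illustration of how counts like these could be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag in CoNLL-U text. It is not the script that generated this page; the function name `upos_counts` and the `SAMPLE` sentence are hypothetical, and the sample uses `_` as the word form to mirror the underscores listed on this page.

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or "-" in cols[0] or "." in cols[0]:
            continue
        if cols[3] == upos:          # column 4 is UPOS
            types.add(cols[1])       # column 2 is FORM
            lemmas.add(cols[2])      # column 3 is LEMMA
            tokens += 1
    return len(lemmas), len(types), tokens


# Hypothetical four-token sentence; the data are made up for illustration.
ROWS = [
    ("1", "_", "w", "CCONJ", "_", "_", "3", "cc", "_", "_"),
    ("2", "_", "_", "NUM", "_", "NumForm=Digit", "3", "nummod", "_", "_"),
    ("3", "_", "_", "NOUN", "_", "_", "0", "root", "_", "_"),
    ("4", "_", "b", "NUM", "_", "_", "3", "nummod", "_", "_"),
]
SAMPLE = "# sent_id = hypothetical-1\n" + "\n".join("\t".join(r) for r in ROWS)

print(upos_counts(SAMPLE, "NUM"))
```

Because every word form in the sample is `_`, the two NUM tokens collapse into a single type while keeping two distinct lemmas.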

The 10 most frequent NUM lemmas: _، w، h، hA، f، :، b، hmA، l، mA

The 10 most frequent NUM types: _

The 10 most frequent ambiguous lemmas: _ (NOUN 216429, PUNCT 72574, ADJ 66760, ADP 62646, VERB 54473, PROPN 48965, ADV 26129, SCONJ 23987, NUM 15122, AUX 6581, DET 6330, PART 5856, CCONJ 5168, PRON 2460, INTJ 54, X 32), w (CCONJ 43321, NOUN 190, PUNCT 136, ADP 120, ADV 117, PROPN 78, VERB 71, SCONJ 69, ADJ 55, PRON 33, PART 10, DET 9, NUM 8, AUX 5, X 3), h (PRON 12201, SCONJ 390, AUX 107, NOUN 62, ADP 36, PUNCT 14, CCONJ 12, ADJ 7, NUM 7, VERB 7, PROPN 2, PART 1, X 1), hA (PRON 10321, SCONJ 313, AUX 69, NOUN 56, ADP 25, CCONJ 19, ADJ 18, PUNCT 17, PROPN 9, VERB 9, ADV 4, NUM 3, PART 3, DET 1), f (CCONJ 1247, AUX 459, PART 441, ADV 18, NOUN 12, SCONJ 8, PUNCT 7, VERB 4, ADP 3, ADJ 2, NUM 2, PRON 2), : (PUNCT 2339, VERB 3, CCONJ 2, ADJ 1, NOUN 1, NUM 1, PROPN 1, SCONJ 1), b (ADP 12204, NOUN 65, VERB 17, ADJ 16, PUNCT 15, PRON 12, CCONJ 10, SCONJ 7, PROPN 6, ADV 5, AUX 2, PART 2, X 2, DET 1, NUM 1), hmA (PRON 594, SCONJ 17, AUX 4, ADJ 1, ADP 1, NOUN 1, NUM 1, VERB 1), l (ADP 15449, PART 123, NOUN 98, AUX 67, CCONJ 33, ADJ 30, PUNCT 19, VERB 9, SCONJ 8, PROPN 7, ADV 6, PRON 5, DET 2, INTJ 2, NUM 1, X 1), mA (SCONJ 971, NOUN 5, PART 5, PRON 3, ADP 2, PUNCT 2, VERB 2, ADV 1, NUM 1, PROPN 1)

The 10 most frequent ambiguous types: _ (NOUN 218254, ADP 91694, PUNCT 75148, ADJ 67604, PROPN 58325, VERB 55215, CCONJ 50032, PRON 31239, ADV 26527, SCONJ 26034, NUM 15147, PART 8612, AUX 7723, DET 6362, X 917, INTJ 56)

Morphology

The form / lemma ratio of NUM is 0.100000 (the average of all parts of speech is 0.002933).
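The reported ratio is consistent with dividing the number of NUM types (1) by the number of NUM lemmas (10). The sketch below reproduces that arithmetic; treating the ratio as types divided by lemmas is an assumption inferred from the figures on this page, and `form_lemma_ratio` is a hypothetical helper name.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Assumed definition: distinct word forms divided by distinct lemmas."""
    return n_types / n_lemmas


# Figures taken from this page: 1 NUM type, 10 NUM lemmas.
num_ratio = form_lemma_ratio(1, 10)
print(f"{num_ratio:.6f}")  # 0.100000
```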

The 1st highest number of forms (1) was observed with the lemma “:”: _.

The 2nd highest number of forms (1) was observed with the lemma “_”: _.

The 3rd highest number of forms (1) was observed with the lemma “b”: _.

NUM occurs with 8 features: NumForm (14868; 98% instances), Gender (3454; 23% instances), Number (3454; 23% instances), Definite (3442; 23% instances), Case (3282; 22% instances), Person (23; 0% instances), Mood (12; 0% instances), Voice (12; 0% instances)

NUM occurs with 17 feature-value pairs: Case=Acc, Case=Gen, Case=Nom, Definite=Com, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Mood=Ind, Mood=Sub, NumForm=Digit, NumForm=Word, Number=Dual, Number=Plur, Number=Sing, Person=3, Voice=Act

NUM occurs with 82 feature combinations. The most frequent feature combination is NumForm=Digit (11538 tokens). Examples: _

Relations

NUM nodes are attached to their parents using 15 different relations: nummod (11137; 74% instances), nmod:poss (1212; 8% instances), obj (950; 6% instances), compound (725; 5% instances), conj (492; 3% instances), root (211; 1% instances), flat (179; 1% instances), parataxis (115; 1% instances), nsubj (87; 1% instances), nsubj:pass (22; 0% instances), iobj (10; 0% instances), dep (3; 0% instances), aux (2; 0% instances), ccomp (1; 0% instances), mark (1; 0% instances)
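The percentages above can be checked against the raw counts. The sketch below copies the counts from this page, confirms that they add up to the 15147 NUM tokens, and recomputes each share rounded to a whole percent. Rounding to the nearest integer is an assumption about how the page was generated, and the variable names are illustrative.

```python
# Counts copied from the relation list above.
NUM_RELATIONS = {
    "nummod": 11137, "nmod:poss": 1212, "obj": 950, "compound": 725,
    "conj": 492, "root": 211, "flat": 179, "parataxis": 115,
    "nsubj": 87, "nsubj:pass": 22, "iobj": 10, "dep": 3,
    "aux": 2, "ccomp": 1, "mark": 1,
}

total = sum(NUM_RELATIONS.values())

# Share of each relation, rounded to a whole percent (assumed rounding rule).
shares = {rel: round(100 * n / total) for rel, n in NUM_RELATIONS.items()}

print(total)
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```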

Parents of NUM nodes belong to 14 different parts of speech (counting the empty category for root-attached nodes, which have no parent): NOUN (7319; 48% instances), PROPN (2859; 19% instances), NUM (2575; 17% instances), VERB (993; 7% instances), ADJ (600; 4% instances), PUNCT (265; 2% instances), ADV (255; 2% instances), no parent (211; 1% instances), CCONJ (29; 0% instances), PRON (26; 0% instances), DET (8; 0% instances), X (5; 0% instances), PART (1; 0% instances), SCONJ (1; 0% instances)

10681 (71%) NUM nodes are leaves.

2293 (15%) NUM nodes have one child.

1390 (9%) NUM nodes have two children.

783 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 14.
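The child-degree figures above partition the NUM tokens: 10681 + 2293 + 1390 + 783 = 15147. A minimal sketch of how such a histogram could be computed from the HEAD column follows. The function `child_degree_histogram` and the toy sentence are hypothetical, and each token is reduced to an `(id, head, upos)` triple for brevity.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map number of children -> number of nodes with that UPOS tag."""
    hist = Counter()
    for sent in sentences:
        # Count how many tokens name each ID as their head.
        n_children = Counter(head for _tok_id, head, _tag in sent)
        for tok_id, _head, tag in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1
    return hist


# Hypothetical sentence: token 3 (NUM) governs tokens 4 and 5.
TOY = [[(1, 2, "NUM"), (2, 0, "NOUN"), (3, 2, "NUM"),
        (4, 3, "PUNCT"), (5, 3, "NUM")]]

hist = child_degree_histogram(TOY, "NUM")
print(hist[0], hist[1], hist[2])

# Consistency check on the figures reported on this page.
page_buckets = [10681, 2293, 1390, 783]
print(sum(page_buckets))
```

In the toy sentence two NUM nodes are leaves and one has two children; summing the page's four buckets gives the total NUM token count.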

Children of NUM nodes are attached using 20 different relations: punct (2713; 33% instances), nummod (1225; 15% instances), case (816; 10% instances), compound (724; 9% instances), nmod (673; 8% instances), cc (570; 7% instances), conj (495; 6% instances), parataxis (351; 4% instances), obj (232; 3% instances), det (150; 2% instances), amod (91; 1% instances), ccomp (64; 1% instances), advmod (52; 1% instances), mark (50; 1% instances), flat (37; 0% instances), nsubj (21; 0% instances), xcomp (17; 0% instances), dep (13; 0% instances), cop (12; 0% instances), nmod:poss (2; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: PUNCT (2713; 33% instances), NUM (2575; 31% instances), ADP (816; 10% instances), NOUN (635; 8% instances), CCONJ (571; 7% instances), VERB (360; 4% instances), DET (152; 2% instances), ADJ (126; 2% instances), PROPN (111; 1% instances), PRON (75; 1% instances), ADV (64; 1% instances), SCONJ (49; 1% instances), X (31; 0% instances), AUX (15; 0% instances), PART (15; 0% instances)