
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-PADT: POS Tags: ADP

There are 60 ADP lemmas (0%), 144 ADP types (0%) and 41879 ADP tokens (15%). Out of 16 observed tags, the rank of ADP is: 6 in number of lemmas, 6 in number of types and 2 in number of tokens.

The 10 most frequent ADP lemmas: فِي، لِ، بِ، مِن، عَلَى، إِلَى، عَن، مَعَ، بَينَ، خِلَالَ

The 10 most frequent ADP types: في، ل، ب، من، على، الى، إلى، عن، فى، مع

The 10 most frequent ambiguous lemmas: فِي (ADP 8751, X 4), لِ (ADP 6661, CCONJ 202, PART 1), حَتَّى (ADP 176, ADV 65, CCONJ 50), سِوَى (ADP 43, NOUN 1)

The 10 most frequent ambiguous types: في (ADP 7587, X 3), ل (ADP 6520, CCONJ 202, PART 23, X 2), ب (ADP 5831, X 205), من (ADP 5381, DET 109), على (ADP 3345, X 6, NOUN 3), فى (ADP 1162, X 1), بين (ADP 945, NOUN 2, VERB 2, X 1), بعد (ADP 575, NOUN 26), علي (ADP 283, X 69, NOUN 5), قبل (ADP 232, NOUN 44, VERB 3)

Morphology

The form / lemma ratio of ADP is 2.400000 (the average of all parts of speech is 1.685281).
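The reported ratio is consistent with dividing the number of ADP types by the number of ADP lemmas given at the top of this page. A minimal sketch of that arithmetic, assuming this is how the figure is derived (the variable names are illustrative):

```python
# Counts taken from the summary line at the top of this page.
adp_types = 144   # distinct ADP word forms
adp_lemmas = 60   # distinct ADP lemmas

# Assumption: the form / lemma ratio is simply types divided by lemmas.
form_lemma_ratio = adp_types / adp_lemmas

print(f"{form_lemma_ratio:.6f}")
```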

The 1st highest number of forms (35) was observed with the lemma “لِ”: ل, لإيران, لاراضيه, لاندونيسيا, لسوريا, لطهران, لـ, للأردن, للاعلام, للاوقاف, للبترول, للبضائع, للتضامن, للجدل, للحادث, للداخلية, للسلام, للسودانيين, للصحة, للصناعة, للطن, للعدل, للعراق, للعلاج, للغزو, للفرعونية, للكهرباء, للمحمول, للمقاولات, للمنازل, للنقال, لمساعدتنا, لمصر, لموسكو, لهم.

The 2nd highest number of forms (30) was observed with the lemma “بِ”: ب, بأنفسهم, بالأحزاب, بالإلغاء, بالإيدز, بالبحيرة, بالتوقف, بالجزائر, بالجنيه, بالسعودية, بالسودان, بالشلل, بالعراق, بالقدس, بالكهرباء, بالمطارات, بالنجاح, بالنقب, بالهند, بخير, بزيادة, بسامرّاء, بضمانها, بغزة, بـ, بفقدانها, بليبيا, بمصر, بهم, بهويتها.

The 3rd highest number of forms (5) was observed with the lemma “إِلَى”: إلى, إلي, إليها, الى, الي.

ADP occurs with 2 features: AdpType (41879; 100% instances), Case (5971; 14% instances)

ADP occurs with 4 feature-value pairs: AdpType=Prep, Case=Acc, Case=Gen, Case=Nom

ADP occurs with 4 feature combinations. The most frequent feature combination is AdpType=Prep (35908 tokens). Examples: في، ل، ب، من، على، الى، إلى، عن، فى، مع
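The feature counts above can be cross-checked against each other. The sketch below assumes that every ADP token without a Case feature carries AdpType=Prep as its only feature, which matches the reported count of the most frequent combination; the variable names are illustrative.

```python
# Counts reported on this page.
total_adp = 41879   # all ADP tokens (every one has AdpType)
with_case = 5971    # ADP tokens that also have a Case feature

# Assumption: tokens without Case have AdpType=Prep as their only feature.
prep_only = total_adp - with_case

# Share of ADP tokens carrying Case, rounded to a whole percent.
case_share = round(100 * with_case / total_adp)

print(prep_only, case_share)
```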

Relations

ADP nodes are attached to their parents using 17 different relations: case (39432; 94% instances), mark (1115; 3% instances), fixed (1016; 2% instances), cc (139; 0% instances), advmod (45; 0% instances), advmod:emph (20; 0% instances), conj (19; 0% instances), nsubj (19; 0% instances), nmod (16; 0% instances), cop (14; 0% instances), xcomp (14; 0% instances), dep (11; 0% instances), root (6; 0% instances), orphan (5; 0% instances), appos (4; 0% instances), parataxis (3; 0% instances), advcl (1; 0% instances)
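Counts of this kind can be reproduced from a treebank file in CoNLL-U format. The following is a minimal sketch, not the script that generated this page: the function name `adp_relation_counts` is hypothetical, and the sample sentence is invented for illustration rather than taken from UD_Arabic-PADT.

```python
from collections import Counter

# Tiny invented CoNLL-U fragment, for illustration only.
SAMPLE = """\
# text = in house to school
1\tin\tin\tADP\t_\tAdpType=Prep\t2\tcase\t_\t_
2\thouse\thouse\tNOUN\t_\t_\t0\troot\t_\t_
3\tto\tto\tADP\t_\tAdpType=Prep\t4\tcase\t_\t_
4\tschool\tschool\tNOUN\t_\t_\t2\tnmod\t_\t_
"""


def adp_relation_counts(conllu_text, upos="ADP"):
    """Count the DEPREL values of all nodes whose UPOS equals `upos`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence boundaries) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:       # column 4 is UPOS
            counts[cols[7]] += 1  # column 8 is DEPREL
    return counts


print(adp_relation_counts(SAMPLE).most_common())
```

Skipping the range lines matters for a treebank like this one, where clitic prepositions are written as part of a longer surface token and appear as separate syntactic words.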

Parents of ADP nodes belong to 15 different parts of speech: NOUN (31876; 76% instances), PRON (2191; 5% instances), X (2168; 5% instances), NUM (1805; 4% instances), ADJ (1287; 3% instances), VERB (1177; 3% instances), DET (583; 1% instances), ADP (565; 1% instances), CCONJ (108; 0% instances), ADV (74; 0% instances), PART (19; 0% instances), AUX (10; 0% instances), PROPN (9; 0% instances), no part of speech, i.e. the artificial root (6; 0% instances), PUNCT (1; 0% instances)

40175 (96%) ADP nodes are leaves.

1466 (4%) ADP nodes have one child.

183 (0%) ADP nodes have two children.

55 (0%) ADP nodes have three or more children.

The highest child degree of an ADP node is 7.
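The child-degree figures above partition the ADP tokens, so they should add up to the token total reported at the top of this page. A small sketch of that check and of the rounded percentages (the dictionary keys are illustrative labels):

```python
# Number of ADP nodes by number of children, as reported on this page.
degree_counts = {"0": 40175, "1": 1466, "2": 183, "3+": 55}

total = sum(degree_counts.values())

# Whole-number percentage of ADP nodes in each degree class.
shares = {label: round(100 * n / total) for label, n in degree_counts.items()}

print(total, shares)
```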

Children of ADP nodes are attached using 20 different relations: fixed (1781; 88% instances), case (53; 3% instances), nmod (30; 1% instances), obl (29; 1% instances), nsubj (28; 1% instances), cc (24; 1% instances), punct (18; 1% instances), obl:arg (14; 1% instances), conj (12; 1% instances), mark (8; 0% instances), dep (5; 0% instances), orphan (5; 0% instances), appos (3; 0% instances), parataxis (3; 0% instances), advmod (2; 0% instances), csubj (2; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), det (1; 0% instances), nummod (1; 0% instances)

Children of ADP nodes belong to 12 different parts of speech: NOUN (1117; 55% instances), ADP (565; 28% instances), CCONJ (99; 5% instances), DET (66; 3% instances), X (55; 3% instances), PRON (50; 2% instances), ADJ (19; 1% instances), PUNCT (18; 1% instances), PART (13; 1% instances), VERB (9; 0% instances), NUM (8; 0% instances), ADV (2; 0% instances)