
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-PADT: POS Tags: ADP

There are 60 ADP lemmas (0%), 144 ADP types (1%) and 42555 ADP tokens (15%). Out of 16 observed tags, the rank of ADP is: 7 in number of lemmas, 6 in number of types and 2 in number of tokens.
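The three counts above can be reproduced from a CoNLL-U file with a short script. The following is a minimal sketch, not the tool that generated this page: it assumes that "type" means a distinct word form and that multiword-token ranges and empty nodes are skipped. The function name `upos_counts` and the four-token sample are hypothetical; only the forms of إِلَى are taken from this page.

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
            tokens += 1
    return len(lemmas), len(types), tokens


# Hypothetical sample rows (ID, FORM, LEMMA, UPOS, then unused columns).
SAMPLE = "\n".join(
    "\t".join(cols)
    for cols in [
        ("1", "إلى", "إِلَى", "ADP", "_", "_", "0", "root", "_", "_"),
        ("2", "الى", "إِلَى", "ADP", "_", "_", "1", "dep", "_", "_"),
        ("3", "إلى", "إِلَى", "ADP", "_", "_", "1", "dep", "_", "_"),
        ("4", "من", "مِن", "ADP", "_", "_", "1", "dep", "_", "_"),
    ]
)

print(upos_counts(SAMPLE, "ADP"))
```

On the sample, two lemmas account for three distinct forms across four tokens, which mirrors on a small scale how 60 lemmas can account for 144 types and 42555 tokens.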

The 10 most frequent ADP lemmas: فِي، لِ، بِ، مِن، عَلَى، إِلَى، عَن، مَعَ، بَينَ، خِلَالَ

The 10 most frequent ADP types: في، ل، ب، من، على، الى، إلى، عن، فى، مع

The most frequent ambiguous lemmas (only 4 were observed): فِي (ADP 8766, X 4), لِ (ADP 6946, CCONJ 210, PART 1), حَتَّى (ADP 176, ADV 65, CCONJ 50), سِوَى (ADP 44, NOUN 1)

The 10 most frequent ambiguous types: في (ADP 7602, X 3), ل (ADP 6805, CCONJ 210, PART 24, X 2), ب (ADP 6079, X 208), من (ADP 5398, DET 109), على (ADP 3345, X 6, NOUN 3), فى (ADP 1162, X 1), بين (ADP 968, NOUN 2, VERB 2, X 1), بعد (ADP 575, NOUN 26), علي (ADP 313, X 69, NOUN 7), قبل (ADP 232, NOUN 44, VERB 3)

Morphology

The form / lemma ratio of ADP is 2.400000 (the average of all parts of speech is 1.761701).
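The ratio appears to be the number of ADP types divided by the number of ADP lemmas reported earlier on this page (144 and 60). This reading is an assumption, but it reproduces the published figure:

```python
# Counts taken from the summary line of this page.
adp_types = 144
adp_lemmas = 60

# Assumption: "form / lemma ratio" = distinct forms per distinct lemma.
ratio = adp_types / adp_lemmas
print(f"{ratio:.6f}")  # 2.400000
```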

The highest number of forms (35) was observed with the lemma “لِ”: ل, لإيران, لاراضيه, لاندونيسيا, لسوريا, لطهران, لـ, للأردن, للاعلام, للاوقاف, للبترول, للبضائع, للتضامن, للجدل, للحادث, للداخلية, للسلام, للسودانيين, للصحة, للصناعة, للطن, للعدل, للعراق, للعلاج, للغزو, للفرعونية, للكهرباء, للمحمول, للمقاولات, للمنازل, للنقال, لمساعدتنا, لمصر, لموسكو, لهم.

The 2nd highest number of forms (30) was observed with the lemma “بِ”: ب, بأنفسهم, بالأحزاب, بالإلغاء, بالإيدز, بالبحيرة, بالتوقف, بالجزائر, بالجنيه, بالسعودية, بالسودان, بالشلل, بالعراق, بالقدس, بالكهرباء, بالمطارات, بالنجاح, بالنقب, بالهند, بخير, بزيادة, بسامرّاء, بضمانها, بغزة, بـ, بفقدانها, بليبيا, بمصر, بهم, بهويتها.

The 3rd highest number of forms (5) was observed with the lemma “إِلَى”: إلى, إلي, إليها, الى, الي.

ADP occurs with 2 features: AdpType (42555; 100% instances), Case (6005; 14% instances)

ADP occurs with 4 feature-value pairs: AdpType=Prep, Case=Acc, Case=Gen, Case=Nom

ADP occurs with 4 feature combinations. The most frequent feature combination is AdpType=Prep (36550 tokens). Examples: في، ل، ب، من، على، الى، إلى، عن، فى، مع
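The figures are mutually consistent: 42555 ADP tokens minus the 6005 that carry Case leaves 36550 tokens whose only feature is AdpType=Prep. A feature-combination count of this kind can be sketched by tallying the whole FEATS string per token. The helper name `feature_combinations` and the sample rows below are hypothetical and are not drawn from the treebank.

```python
from collections import Counter


def feature_combinations(rows, upos):
    """Count complete FEATS strings among (upos, feats) pairs for one tag."""
    return Counter(feats for tag, feats in rows if tag == upos)


# Hypothetical (UPOS, FEATS) pairs as they would appear in CoNLL-U columns 4 and 6.
rows = [
    ("ADP", "AdpType=Prep"),
    ("ADP", "AdpType=Prep"),
    ("ADP", "AdpType=Prep|Case=Gen"),
    ("NOUN", "Case=Gen|Number=Sing"),
]

combos = feature_combinations(rows, "ADP")
print(combos.most_common(1))

# Consistency check on the published totals.
print(42555 - 6005)  # 36550
```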

Relations

ADP nodes are attached to their parents using 17 different relations: case (40072; 94% instances), mark (1130; 3% instances), fixed (1037; 2% instances), cc (139; 0% instances), advmod (45; 0% instances), advmod:emph (20; 0% instances), conj (19; 0% instances), nsubj (19; 0% instances), nmod (16; 0% instances), cop (14; 0% instances), xcomp (14; 0% instances), dep (11; 0% instances), root (6; 0% instances), orphan (5; 0% instances), appos (4; 0% instances), parataxis (3; 0% instances), advcl (1; 0% instances)
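The percentages are shares of the 42555 ADP tokens. Rounding each share to the nearest whole percent is an assumption about how the page was generated, but it matches the values shown, including the 0% entries for rare relations. A short check on the four most frequent relations:

```python
total_adp = 42555

# Counts copied from the relation list above (four most frequent relations).
counts = {"case": 40072, "mark": 1130, "fixed": 1037, "cc": 139}

# Assumption: shares are rounded to the nearest integer percent.
shares = {rel: round(100 * n / total_adp) for rel, n in counts.items()}
print(shares)
```

Under this rounding, any relation with fewer than about 213 instances (0.5% of 42555) is reported as 0%.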

Parents of ADP nodes belong to 15 different parts of speech: NOUN (32455; 76% instances), PRON (2325; 5% instances), X (2029; 5% instances), NUM (1806; 4% instances), ADJ (1306; 3% instances), VERB (1190; 3% instances), DET (598; 1% instances), ADP (575; 1% instances), CCONJ (138; 0% instances), ADV (74; 0% instances), PROPN (22; 0% instances), PART (20; 0% instances), AUX (10; 0% instances), artificial root node without a part of speech (6; 0% instances), PUNCT (1; 0% instances)

40838 (96%) ADP nodes are leaves.

1470 (3%) ADP nodes have one child.

192 (0%) ADP nodes have two children.

55 (0%) ADP nodes have three or more children.

The highest child degree of an ADP node is 7.
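The four child-count groups add up to the full token count (40838 + 1470 + 192 + 55 = 42555). A child-degree distribution like this can be derived from the HEAD column alone. The sketch below assumes each sentence is given as a list of (id, upos, head) tuples; the function name `child_degree_histogram` and the toy tree are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentence, upos):
    """Map number of children -> how many nodes with the given tag have it.

    sentence: list of (node_id, upos, head_id) tuples for one tree,
    where head_id 0 denotes the artificial root.
    """
    # Count how many nodes point to each head.
    n_children = Counter(head for _, _, head in sentence)
    # A Counter returns 0 for nodes that never occur as a head (leaves).
    return Counter(n_children[node_id] for node_id, tag, _ in sentence if tag == upos)


# Toy tree: node 3 is the root, node 1 attaches to 3, node 2 attaches to 1.
toy = [(1, "ADP", 3), (2, "ADP", 1), (3, "NOUN", 0)]
hist = child_degree_histogram(toy, "ADP")
print(hist)

# Consistency check on the published totals.
print(40838 + 1470 + 192 + 55)  # 42555
```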

Children of ADP nodes are attached using 20 different relations: fixed (1802; 88% instances), case (54; 3% instances), nmod (30; 1% instances), obl (29; 1% instances), nsubj (28; 1% instances), cc (24; 1% instances), punct (18; 1% instances), obl:arg (14; 1% instances), conj (12; 1% instances), mark (8; 0% instances), dep (5; 0% instances), orphan (5; 0% instances), appos (3; 0% instances), parataxis (3; 0% instances), advmod (2; 0% instances), csubj (2; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), det (1; 0% instances), nummod (1; 0% instances)

Children of ADP nodes belong to 12 different parts of speech: NOUN (1134; 56% instances), ADP (575; 28% instances), CCONJ (103; 5% instances), DET (68; 3% instances), PRON (51; 2% instances), X (42; 2% instances), ADJ (19; 1% instances), PUNCT (18; 1% instances), PART (13; 1% instances), VERB (10; 0% instances), NUM (8; 0% instances), ADV (2; 0% instances)