This page pertains to UD version 2.

Treebank Statistics: UD_Arabic: POS Tags: CCONJ

There are 48 CCONJ lemmas (0%), 102 CCONJ types (0%) and 23968 CCONJ tokens (8%). Out of 16 observed tags, the rank of CCONJ is: 8 in number of lemmas, 7 in number of types and 4 in number of tokens.

The 10 most frequent CCONJ lemmas: وَ، أَنَّ، أَن، إِنَّ، فَ، أَو، كَمَا، حَيثُ، لٰكِنَّ، لِ

The 10 most frequent CCONJ types: و، أن، ان، ف، إن، كما، أو، حيث، ل، او

The 10 most frequent ambiguous lemmas: إِنَّ (CCONJ 934, PART 200), لِ (ADP 6661, CCONJ 202, PART 1), حَتَّى (ADP 176, ADV 65, CCONJ 50), أَي (CCONJ 38, X 13), إِن (CCONJ 20, X 13)

The 10 most frequent ambiguous types: و (CCONJ 15052, X 3), أن (CCONJ 2896, X 3, VERB 1), ان (CCONJ 1956, PART 30, X 11, VERB 1), ف (CCONJ 580, X 48), إن (CCONJ 571, PART 170, X 3), أو (CCONJ 300, X 3), ل (ADP 6520, CCONJ 202, PART 23, X 2), او (CCONJ 157, X 1), إذا (CCONJ 122, ADV 1), لكن (CCONJ 104, X 25)

Morphology

The form / lemma ratio of CCONJ is 2.125000 (the average of all parts of speech is 1.685281).
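The reported ratio is consistent with dividing the number of CCONJ types (102) by the number of CCONJ lemmas (48), both taken from the counts above. A minimal sketch of that arithmetic, assuming this is how the figure is derived; the function name `form_lemma_ratio` is hypothetical and not part of any UD tool:

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the average number of distinct word forms per lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts taken from this page: 102 CCONJ types, 48 CCONJ lemmas.
cconj_ratio = form_lemma_ratio(102, 48)
print(f"{cconj_ratio:.6f}")  # 2.125000
```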

The 1st highest number of forms (42) was observed with the lemma “وَ”: و, وأسلم, وأفريقيا, وأوروبا, وإسرائيل, وإيطاليا, واسرائيل, واعتدال, والأردن, والاستخبارات, والامارات, والاميركية, والبرازيل, والبورصة, والتجارة, والتضامن, والتوجيه, والجودة, والسعودية, والصحة, والعمل, والغاز, والفاحشة, واللحوم, والمتوسط, والمتوسطة, والمجر, والمحلي, والنحاس, والنسيج, والهند, والهوية, وبوش, وجونز, وسامراء, وغربه, وقرغيزستان, ولبنان, ومصر, ومنوعة, ونيجيريا, وهي.

The 2nd highest number of forms (3) was observed with the lemma “أَي”: أي, اى, اي.

The 3rd highest number of forms (3) was observed with the lemma “إِنَّ”: أن, إن, ان.

CCONJ does not occur with any features.

Relations

CCONJ nodes are attached to their parents using 19 different relations: cc (12859; 54% instances), mark (6187; 26% instances), root (4117; 17% instances), advmod (304; 1% instances), advmod:emph (244; 1% instances), fixed (155; 1% instances), case (24; 0% instances), conj (15; 0% instances), dep (15; 0% instances), cop (12; 0% instances), nmod (8; 0% instances), obj (7; 0% instances), aux (6; 0% instances), iobj (4; 0% instances), obl:arg (3; 0% instances), orphan (3; 0% instances), nsubj (2; 0% instances), parataxis (2; 0% instances), punct (1; 0% instances)
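A relation distribution like the one above can be tallied directly from CoNLL-U data by reading the UPOS and DEPREL columns. The following is a rough sketch only, not the script that generated this page: the function name `deprel_counts` and the toy sentence in `SAMPLE` are invented for illustration, and the parser handles just the basic 10-column format.

```python
from collections import Counter


def deprel_counts(conllu_text: str, upos: str = "CCONJ") -> Counter:
    """Count DEPREL values of all syntactic words tagged with the given UPOS."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges and empty nodes
        if cols[3] == upos:  # column 4 is UPOS
            counts[cols[7]] += 1  # column 8 is DEPREL
    return counts


# Invented toy sentence, not taken from UD_Arabic.
SAMPLE = "\n".join([
    "# text = toy example (invented)",
    "1\tA\ta\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tand\tand\tCCONJ\t_\t_\t3\tcc\t_\t_",
    "3\tB\tb\tNOUN\t_\t_\t1\tconj\t_\t_",
    "4\tthat\tthat\tCCONJ\t_\t_\t5\tmark\t_\t_",
    "5\tC\tc\tVERB\t_\t_\t1\tacl\t_\t_",
    "6\tor\tor\tCCONJ\t_\t_\t7\tcc\t_\t_",
    "7\tD\td\tNOUN\t_\t_\t3\tconj\t_\t_",
])

counts = deprel_counts(SAMPLE)
total = sum(counts.values())
for rel, n in counts.most_common():
    print(f"{rel} ({n}; {round(100 * n / total)}% instances)")
```

To apply it to a real treebank, read the contents of a `.conllu` file into a string and pass that string in place of `SAMPLE`.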

Parents of CCONJ nodes fall into 15 categories (14 parts of speech, plus the case where the CCONJ node is the root and therefore has no parent): VERB (8485; 35% instances), NOUN (7063; 29% instances), no parent, i.e. root (4117; 17% instances), ADJ (1635; 7% instances), X (871; 4% instances), CCONJ (556; 2% instances), NUM (531; 2% instances), DET (182; 1% instances), ADV (154; 1% instances), PRON (145; 1% instances), PART (141; 1% instances), ADP (72; 0% instances), AUX (12; 0% instances), INTJ (2; 0% instances), PROPN (2; 0% instances)

19162 (80%) CCONJ nodes are leaves.

544 (2%) CCONJ nodes have one child.

3694 (15%) CCONJ nodes have two children.

568 (2%) CCONJ nodes have three or more children.

The highest child degree of a CCONJ node is 26.
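The four child-degree buckets above partition the full set of CCONJ tokens, so their counts should add up to the 23968 tokens reported at the top of the page. A small consistency check, using only figures from this page and assuming the percentages are rounded to the nearest integer (the variable names are illustrative):

```python
# Figures copied from this page.
total_cconj = 23968
buckets = {
    "leaves": 19162,
    "one child": 544,
    "two children": 3694,
    "three or more children": 568,
}

# The buckets should cover every CCONJ node exactly once.
assert sum(buckets.values()) == total_cconj

# Recompute the percentage share of each bucket.
shares = {name: round(100 * n / total_cconj) for name, n in buckets.items()}
for name, pct in shares.items():
    print(f"{name}: {buckets[name]} ({pct}%)")
```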

Children of CCONJ nodes are attached using 22 different relations: punct (4809; 44% instances), parataxis (4524; 42% instances), cc (601; 6% instances), fixed (421; 4% instances), nsubj (186; 2% instances), advcl (58; 1% instances), dep (58; 1% instances), ccomp (36; 0% instances), case (16; 0% instances), obl:arg (16; 0% instances), appos (15; 0% instances), obl (15; 0% instances), nmod (14; 0% instances), advmod:emph (10; 0% instances), conj (9; 0% instances), obj (7; 0% instances), acl (6; 0% instances), mark (6; 0% instances), advmod (4; 0% instances), orphan (3; 0% instances), csubj (2; 0% instances), aux (1; 0% instances)

Children of CCONJ nodes belong to 14 different parts of speech: PUNCT (4808; 44% instances), VERB (4311; 40% instances), CCONJ (556; 5% instances), PRON (387; 4% instances), NOUN (290; 3% instances), ADJ (162; 1% instances), X (116; 1% instances), PART (62; 1% instances), DET (57; 1% instances), ADV (24; 0% instances), ADP (22; 0% instances), NUM (20; 0% instances), INTJ (1; 0% instances), PROPN (1; 0% instances)