
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-PADT: POS Tags: CCONJ

There are 51 CCONJ lemmas (0%), 104 CCONJ types (0%) and 25241 CCONJ tokens (9%). Out of 16 observed tags, the rank of CCONJ is: 8 in number of lemmas, 7 in number of types and 4 in number of tokens.

The 10 most frequent CCONJ lemmas: وَ، أَنَّ، أَن، إِنَّ، فَ، أَو، كَمَا، حَيثُ، لٰكِنَّ، لِ

The 10 most frequent CCONJ types: و، أن، ان، ف، إن، كما، أو، حيث، ل، لٰكن

The 10 most frequent ambiguous lemmas: إِنَّ (CCONJ 945, PART 212), لِ (ADP 6946, CCONJ 210, PART 1), حَتَّى (ADP 176, ADV 65, CCONJ 50), أَي (CCONJ 40, X 13), إِن (CCONJ 20, X 13)

The 10 most frequent ambiguous types: و (CCONJ 16175, X 3), أن (CCONJ 2923, X 3, VERB 1), ان (CCONJ 1956, PART 30, X 11, VERB 1), ف (CCONJ 637, X 48), إن (CCONJ 582, PART 182, X 3), كما (CCONJ 385, PRON 2), أو (CCONJ 300, X 3), ل (ADP 6805, CCONJ 210, PART 24, X 2), او (CCONJ 157, X 1), إذا (CCONJ 122, ADV 1)
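As an illustration of what "ambiguous" means here, the sketch below groups tokens by word form and keeps the forms that occur with CCONJ and at least one other tag. It is a minimal sketch, not the script that generated this page; the function name `ambiguous_types` and the transliterated sample tokens are hypothetical.

```python
from collections import Counter, defaultdict

def ambiguous_types(tagged_tokens, upos="CCONJ"):
    """Return forms seen with `upos` and at least one other tag, with per-tag counts."""
    by_form = defaultdict(Counter)
    for form, tag in tagged_tokens:
        by_form[form][tag] += 1
    return {
        form: dict(tags)
        for form, tags in by_form.items()
        if upos in tags and len(tags) > 1
    }

# Hypothetical (form, UPOS) pairs, transliterated; not real treebank data.
TOKENS = [("wa", "CCONJ"), ("wa", "CCONJ"), ("wa", "X"), ("aw", "CCONJ")]
result = ambiguous_types(TOKENS)
print(result)
```

In this toy input only "wa" qualifies, because "aw" is seen with a single tag.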

Morphology

The form / lemma ratio of CCONJ is 2.039216 (the average of all parts of speech is 1.762014).
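The reported ratio matches the counts given at the top of this page: 104 CCONJ types divided by 51 CCONJ lemmas. A minimal check, assuming the ratio is computed exactly this way (the variable names are illustrative):

```python
# Counts taken from the summary line of this page.
cconj_types = 104   # distinct word forms tagged CCONJ
cconj_lemmas = 51   # distinct lemmas tagged CCONJ

form_lemma_ratio = cconj_types / cconj_lemmas
ratio_text = f"{form_lemma_ratio:.6f}"  # six decimals, as printed on this page
print(ratio_text)
```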

The 1st highest number of forms (42) was observed with the lemma “وَ”: و, وأسلم, وأفريقيا, وأوروبا, وإسرائيل, وإيطاليا, واسرائيل, واعتدال, والأردن, والاستخبارات, والامارات, والاميركية, والبرازيل, والبورصة, والتجارة, والتضامن, والتوجيه, والجودة, والسعودية, والصحة, والعمل, والغاز, والفاحشة, واللحوم, والمتوسط, والمتوسطة, والمجر, والمحلي, والنحاس, والنسيج, والهند, والهوية, وبوش, وجونز, وسامراء, وغربه, وقرغيزستان, ولبنان, ومصر, ومنوعة, ونيجيريا, وهي.

The 2nd highest number of forms (3) was observed with the lemma “أَي”: أي, اى, اي.

The 3rd highest number of forms (3) was observed with the lemma “إِنَّ”: أن, إن, ان.

CCONJ does not occur with any features.

Relations

CCONJ nodes are attached to their parents using 16 different relations: cc (13855; 55% instances), mark (6246; 25% instances), root (4148; 16% instances), advmod (328; 1% instances), advmod:emph (255; 1% instances), fixed (184; 1% instances), case (155; 1% instances), conj (17; 0% instances), dep (16; 0% instances), nmod (14; 0% instances), obj (6; 0% instances), iobj (4; 0% instances), nsubj (4; 0% instances), orphan (4; 0% instances), obl:arg (3; 0% instances), parataxis (2; 0% instances)
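Counts like these can be recomputed from the treebank's CoNLL-U files by tallying the DEPREL column for every word whose UPOS is CCONJ. The following is a minimal sketch under that assumption, using plain string parsing; the function name `relation_counts` and the sample fragment are hypothetical and not taken from UD_Arabic-PADT.

```python
from collections import Counter

# Tiny hand-made CoNLL-U fragment (hypothetical data).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = """\
# text = A and B that C
1\tA\ta\tNOUN\t_\t_\t0\troot\t_\t_
2\tand\tand\tCCONJ\t_\t_\t3\tcc\t_\t_
3\tB\tb\tNOUN\t_\t_\t1\tconj\t_\t_
4\tthat\tthat\tCCONJ\t_\t_\t5\tmark\t_\t_
5\tC\tc\tVERB\t_\t_\t1\tacl\t_\t_
"""

def relation_counts(conllu_text, upos="CCONJ"):
    """Count DEPREL values of all syntactic words tagged `upos`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts

counts = relation_counts(SAMPLE)
print(counts.most_common())
```

Dividing each count by the total number of CCONJ tokens gives the percentages listed above; for example, 13855 of 25241 is about 55%.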

Parents of CCONJ nodes belong to 14 different parts of speech: VERB (8821; 35% instances), NOUN (7708; 31% instances), [artificial root, no part of speech] (4148; 16% instances), ADJ (1699; 7% instances), X (959; 4% instances), CCONJ (594; 2% instances), NUM (533; 2% instances), DET (189; 1% instances), PRON (157; 1% instances), ADV (155; 1% instances), PART (146; 1% instances), ADP (101; 0% instances), PROPN (29; 0% instances), INTJ (2; 0% instances)

20281 (80%) CCONJ nodes are leaves.

603 (2%) CCONJ nodes have one child.

3759 (15%) CCONJ nodes have two children.

598 (2%) CCONJ nodes have three or more children.

The highest child degree of a CCONJ node is 25.
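The child-degree figures above can be derived by counting, for each CCONJ node, how many other nodes name it as their head. The sketch below shows one way to do that for a single sentence; the function name `child_degree_buckets`, the bucket labels, and the sample sentence are hypothetical.

```python
from collections import Counter

def child_degree_buckets(nodes, upos="CCONJ"):
    """nodes: list of (id, upos, head) tuples for one sentence; head 0 is the root."""
    n_children = Counter(head for _, _, head in nodes)
    buckets = Counter()
    for node_id, tag, _ in nodes:
        if tag != upos:
            continue
        degree = n_children[node_id]  # 0 if no node points to node_id
        if degree == 0:
            buckets["leaf"] += 1
        elif degree == 1:
            buckets["one"] += 1
        elif degree == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets

# Hypothetical sentence: a CCONJ at the root with two children, and a leaf CCONJ.
SENTENCE = [
    (1, "CCONJ", 0),
    (2, "VERB", 1),
    (3, "PUNCT", 1),
    (4, "CCONJ", 5),
    (5, "NOUN", 2),
]
buckets = child_degree_buckets(SENTENCE)
print(dict(buckets))
```

As a consistency check on the reported figures, the four buckets (20281, 603, 3759 and 598) sum to 25241, the total number of CCONJ tokens.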

Children of CCONJ nodes are attached using 24 different relations: punct (4839; 44% instances), parataxis (4559; 41% instances), fixed (614; 6% instances), cc (580; 5% instances), nsubj (196; 2% instances), dep (61; 1% instances), advcl (58; 1% instances), obl (52; 0% instances), ccomp (36; 0% instances), nmod (20; 0% instances), obl:arg (16; 0% instances), appos (15; 0% instances), case (14; 0% instances), obj (12; 0% instances), acl (9; 0% instances), conj (9; 0% instances), advmod:emph (8; 0% instances), mark (6; 0% instances), advmod (4; 0% instances), orphan (3; 0% instances), csubj (2; 0% instances), amod (1; 0% instances), det (1; 0% instances), dislocated (1; 0% instances)

Children of CCONJ nodes belong to 14 different parts of speech: PUNCT (4839; 44% instances), VERB (4408; 40% instances), CCONJ (594; 5% instances), PRON (401; 4% instances), NOUN (298; 3% instances), ADP (138; 1% instances), ADJ (131; 1% instances), DET (109; 1% instances), X (88; 1% instances), PART (63; 1% instances), ADV (24; 0% instances), NUM (19; 0% instances), PROPN (3; 0% instances), INTJ (1; 0% instances)