
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic: POS Tags: DET

There are 16 DET lemmas (0%), 49 DET types (0%) and 5769 DET tokens (2%). Out of 16 observed tags, the rank of DET is: 13 in number of lemmas, 10 in number of types and 10 in number of tokens.

The 10 most frequent DET lemmas: اَلَّذِي، هٰذَا، مَا، ذٰلِكَ، مَن، كَيفَ، أَينَ، مَاذَا، كَم، مَتَى

The 10 most frequent DET types: التي، ما، الذي، هذه، هذا، ذلك، الذين، التى، من، ذٰلك

The only ambiguous DET lemma: مَا (DET 1007, PART 67, INTJ 1)

The 10 most frequent ambiguous types: التي (DET 1356, X 54), ما (DET 1007, PART 67, X 4, INTJ 1), الذي (DET 708, X 65), هذه (DET 669, X 28), هذا (DET 623, X 34), ذلك (DET 273, X 69), الذين (DET 182, X 20), التى (DET 156, X 14), من (ADP 5381, DET 109), تلك (DET 101, X 7)

Morphology

The form / lemma ratio of DET is 3.062500 (the average of all parts of speech is 1.685281).
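The DET ratio above matches the counts reported earlier on this page: 49 types divided by 16 lemmas gives 3.0625. As a minimal sketch of how such a ratio could be recomputed from a CoNLL-U file, the helper below (a hypothetical name, `form_lemma_ratio`, not part of any UD tool) assumes that a "type" is a distinct surface form exactly as written in the FORM column, with no case folding or normalization, and that multiword token ranges and empty nodes are skipped.

```python
def form_lemma_ratio(conllu_text, upos="DET"):
    """Return (#distinct forms) / (#distinct lemmas) for one UPOS tag.

    Assumption: forms are compared exactly as they appear in the FORM column.
    """
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        if cols[3] == upos:
            forms.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(forms) / len(lemmas) if lemmas else 0.0
```

Calling `form_lemma_ratio(open(path, encoding="utf-8").read())` on the treebank files would be the intended use; whether it reproduces the figure above exactly depends on the assumptions just stated.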

The 1st highest number of forms (13) was observed with the lemma “هٰذَا”: هؤلاء, هاتان, هاتين, هذا, هذــه, هذه, هذين, هـــذه, هــــذه, هٰؤلاء, هٰذا, هٰذان, هٰذه.

The 2nd highest number of forms (12) was observed with the lemma “اَلَّذِي”: التى, التي, الذى, الذي, الذين, اللاتى, اللاتي, اللتان, اللتين, اللذان, اللذين, اللواتي.

The 3rd highest number of forms (6) was observed with the lemma “ذٰلِكَ”: أولئك, أولٰئك, اولئك, تلك, ذلك, ذٰلك.

DET occurs with 4 features: Case (4562; 79% instances), Gender (4562; 79% instances), Number (4562; 79% instances), PronType (4562; 79% instances)

DET occurs with 10 feature-value pairs: Case=Acc, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Number=Dual, Number=Plur, Number=Sing, PronType=Dem, PronType=Rel

DET occurs with 31 feature combinations. The most frequent feature combination is _, i.e. no features at all (1207 tokens). Examples: ما، من، كيف، ماذا، كم، أين، متى، لماذا، هكذا، اين

Relations

DET nodes are attached to their parents using 25 different relations: nsubj (2287; 40% instances), det (2093; 36% instances), obl (288; 5% instances), nsubj:pass (217; 4% instances), cc (206; 4% instances), obl:arg (153; 3% instances), obj (121; 2% instances), conj (95; 2% instances), mark (80; 1% instances), fixed (40; 1% instances), amod (36; 1% instances), parataxis (26; 0% instances), root (25; 0% instances), appos (24; 0% instances), advmod (22; 0% instances), aux (17; 0% instances), dep (9; 0% instances), cop (7; 0% instances), xcomp (7; 0% instances), iobj (6; 0% instances), ccomp (3; 0% instances), advmod:emph (2; 0% instances), case (2; 0% instances), orphan (2; 0% instances), acl (1; 0% instances)

Parents of DET nodes belong to 12 different parts of speech: VERB (2973; 52% instances), NOUN (2296; 40% instances), X (153; 3% instances), ADJ (134; 2% instances), CCONJ (57; 1% instances), DET (42; 1% instances), ADP (25; 0% instances), [artificial root, no POS tag] (25; 0% instances), NUM (21; 0% instances), PART (21; 0% instances), PRON (14; 0% instances), ADV (8; 0% instances)

4576 (79%) DET nodes are leaves.

543 (9%) DET nodes have one child.

428 (7%) DET nodes have two children.

222 (4%) DET nodes have three or more children.

The highest child degree of a DET node is 22.
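The four child-degree buckets above add up to the total number of DET tokens (4576 + 543 + 428 + 222 = 5769). The following is a minimal sketch of how such a histogram could be derived from the ID, UPOS and HEAD columns; the function name `child_degree_histogram` and the tuple-based input format are assumptions made for illustration, not the method used to generate this page.

```python
from collections import Counter

def child_degree_histogram(sentences, upos="DET"):
    """Map number-of-children -> number of tokens with the given UPOS tag.

    sentences: iterable of sentences, each a list of (id, upos, head) tuples
    with integer id and head (head 0 is the artificial root).
    """
    hist = Counter()
    for sent in sentences:
        # Count how many tokens name each id as their head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1  # a missing key counts as 0 (leaf)
    return hist
```

With this shape, `hist[0]` is the number of leaves, `hist[1]` the number of nodes with one child, and `max(hist)` the highest child degree observed.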

Children of DET nodes are attached using 27 different relations: acl (659; 30% instances), case (619; 28% instances), cc (233; 11% instances), nsubj (146; 7% instances), nmod (139; 6% instances), punct (104; 5% instances), obl (58; 3% instances), nummod (35; 2% instances), mark (29; 1% instances), conj (25; 1% instances), fixed (25; 1% instances), amod (20; 1% instances), dep (17; 1% instances), parataxis (14; 1% instances), obl:arg (13; 1% instances), cop (11; 1% instances), advmod:emph (10; 0% instances), advcl (8; 0% instances), aux (7; 0% instances), advmod (6; 0% instances), appos (6; 0% instances), csubj (4; 0% instances), det (2; 0% instances), obj (2; 0% instances), orphan (2; 0% instances), xcomp (2; 0% instances), ccomp (1; 0% instances)

Children of DET nodes belong to 14 different parts of speech: VERB (685; 31% instances), ADP (630; 29% instances), NOUN (273; 12% instances), CCONJ (182; 8% instances), PRON (111; 5% instances), PUNCT (104; 5% instances), X (56; 3% instances), NUM (51; 2% instances), DET (42; 2% instances), ADJ (35; 2% instances), PART (12; 1% instances), ADV (8; 0% instances), AUX (7; 0% instances), INTJ (1; 0% instances)