
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-PADT: POS Tags: DET

There are 24 DET lemmas (0%), 54 DET types (0%) and 5896 DET tokens (2%). Out of 16 observed tags, the rank of DET is: 10 in number of lemmas, 10 in number of types and 10 in number of tokens.
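As a rough illustration of how lemma, type and token counts like these could be derived from a CoNLL-U file, here is a minimal sketch. The sample sentences are hypothetical toy data (not taken from UD_Arabic-PADT), the function name `tag_counts` is an assumption, and the sketch treats word forms case-sensitively, which may differ from how the official statistics are computed.

```python
# Toy CoNLL-U fragment (hypothetical data, not from UD_Arabic-PADT).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
SAMPLE = """\
1\tthis\tthis\tDET\t_\tPronType=Dem\t2\tdet\t_\t_
2\tbook\tbook\tNOUN\t_\tNumber=Sing\t0\troot\t_\t_

1\tthese\tthis\tDET\t_\tPronType=Dem\t2\tdet\t_\t_
2\tbooks\tbook\tNOUN\t_\tNumber=Plur\t0\troot\t_\t_
"""


def tag_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # distinct word forms
            lemmas.add(cols[2])  # distinct lemmas
    return len(lemmas), len(types), tokens


print(tag_counts(SAMPLE, "DET"))  # (1, 2, 2)
```

In the toy data, the single lemma "this" appears in two forms across two tokens.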

The 10 most frequent DET lemmas: اَلَّذِي، هٰذَا، مَا، ذٰلِكَ، مَن، كَيفَ، أَينَ، مَاذَا، كَم، مَتَى

The 10 most frequent DET types: التي، ما، الذي، هذه، هذا، ذلك، الذين، ذٰلك، التى، هٰذا

The most frequent ambiguous lemmas (only 3 occur): مَا (DET 1021, PART 67, INTJ 1), ما (DET 4, X 4), هُوَ (PRON 10877, DET 1)

The 10 most frequent ambiguous types: التي (DET 1368, X 54), ما (DET 1025, PART 67, X 4, INTJ 1), الذي (DET 712, X 65), هذه (DET 669, X 28), هذا (DET 623, X 34), ذلك (DET 273, X 69), الذين (DET 185, X 20), التى (DET 156, X 14), من (ADP 5398, DET 109), تلك (DET 101, X 7)
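An ambiguous type here is a word form observed with more than one tag. The sketch below shows one way such a list could be built, assuming a sequence of (form, tag) pairs is already available. The observations and the function name `ambiguous_forms` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (form, UPOS) observations, for illustration only.
observations = [
    ("that", "DET"), ("that", "DET"), ("that", "SCONJ"),
    ("the", "DET"),
    ("to", "ADP"), ("to", "PART"),
]


def ambiguous_forms(pairs):
    """Map each form seen with two or more tags to its (tag, count) list."""
    by_form = defaultdict(Counter)
    for form, tag in pairs:
        by_form[form][tag] += 1
    # Keep only forms that occur with more than one distinct tag.
    found = {f: c.most_common() for f, c in by_form.items() if len(c) > 1}
    # Order forms by total frequency, most frequent first.
    return dict(sorted(found.items(), key=lambda kv: -sum(n for _, n in kv[1])))


for form, tags in ambiguous_forms(observations).items():
    print(form, tags)
```

The form "the" is dropped because it is only ever tagged DET.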

Morphology

The form / lemma ratio of DET is 2.250000 (the average of all parts of speech is 1.761701).
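The reported ratio is consistent with dividing the number of DET types by the number of DET lemmas given earlier on this page. A short check, assuming that is how the figure is defined:

```python
det_types = 54   # DET types reported on this page
det_lemmas = 24  # DET lemmas reported on this page

# Assumed definition: distinct forms divided by distinct lemmas.
ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 2.250000
```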

The 1st highest number of forms (13) was observed with the lemma “هٰذَا”: هؤلاء, هاتان, هاتين, هذا, هذــه, هذه, هذين, هـــذه, هــــذه, هٰؤلاء, هٰذا, هٰذان, هٰذه.

The 2nd highest number of forms (12) was observed with the lemma “اَلَّذِي”: التى, التي, الذى, الذي, الذين, اللاتى, اللاتي, اللتان, اللتين, اللذان, اللذين, اللواتي.

The 3rd highest number of forms (6) was observed with the lemma “ذٰلِكَ”: أولئك, أولٰئك, اولئك, تلك, ذلك, ذٰلك.

DET occurs with 5 features: Case (4670; 79% instances), Number (4670; 79% instances), Gender (4668; 79% instances), PronType (4662; 79% instances), Person (13; 0% instances)

DET occurs with 12 feature-value pairs: Case=Acc, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Number=Dual, Number=Plur, Number=Sing, Person=1, Person=3, PronType=Dem, PronType=Rel

DET occurs with 36 feature combinations. The most frequent feature combination is _ (1221 tokens). Examples: ما، من، كيف، ماذا، كم، أين، متى، لماذا، هكذا، اين
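The three counts above (features, feature-value pairs, feature combinations) can all be read off the FEATS column of a CoNLL-U file. The following sketch uses a hypothetical list of FEATS strings, and the helper `parse_feats` is an assumed name, not part of any UD tool.

```python
from collections import Counter

# Hypothetical FEATS values of four DET tokens; "_" means no features.
feats_column = [
    "Case=Nom|Number=Sing|PronType=Dem",
    "_",
    "Case=Gen|Number=Sing|PronType=Rel",
    "_",
]


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' gives an empty dict."""
    if feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


features = Counter()            # how many tokens carry each feature
pairs = Counter()               # how many tokens carry each feature-value pair
combos = Counter(feats_column)  # how often each full combination occurs

for feats in feats_column:
    for name, value in parse_feats(feats).items():
        features[name] += 1
        pairs[f"{name}={value}"] += 1

print(len(features), len(pairs), len(combos))  # 3 5 3
print(combos.most_common(1))                   # [('_', 2)]
```

As on this page, the empty combination `_` is counted as a combination in its own right.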

Relations

DET nodes are attached to their parents using 25 different relations: nsubj (2326; 39% instances), det (2110; 36% instances), obl (295; 5% instances), nsubj:pass (218; 4% instances), obl:arg (154; 3% instances), fixed (130; 2% instances), cc (128; 2% instances), obj (121; 2% instances), conj (100; 2% instances), mark (87; 1% instances), amod (36; 1% instances), advmod (29; 0% instances), case (29; 0% instances), parataxis (26; 0% instances), root (26; 0% instances), appos (25; 0% instances), aux (16; 0% instances), cop (9; 0% instances), dep (9; 0% instances), xcomp (7; 0% instances), iobj (6; 0% instances), advmod:emph (3; 0% instances), ccomp (3; 0% instances), orphan (2; 0% instances), acl (1; 0% instances)
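The percentages in this list appear to be each count divided by the total number of DET tokens (5896), rounded to the nearest whole number. A brief check on a few of the reported counts, under that assumption (the helper name `share` is hypothetical):

```python
det_tokens = 5896  # total DET tokens reported on this page

# A few relation counts copied from the list above.
relation_counts = {"nsubj": 2326, "det": 2110, "obl": 295, "advmod": 29}


def share(count, total):
    """Percentage of the total, rounded to the nearest integer."""
    return round(100 * count / total)


for rel, count in relation_counts.items():
    print(f"{rel} ({count}; {share(count, det_tokens)}% instances)")
```

This also explains why relations with only a handful of occurrences are shown as 0%.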

Parents of DET nodes belong to 12 different parts of speech: VERB (3031; 51% instances), NOUN (2283; 39% instances), ADJ (139; 2% instances), X (125; 2% instances), CCONJ (116; 2% instances), ADP (68; 1% instances), DET (43; 1% instances), artificial root with no part of speech (26; 0% instances), PART (22; 0% instances), NUM (20; 0% instances), PRON (13; 0% instances), ADV (10; 0% instances)

4684 (79%) DET nodes are leaves.

561 (10%) DET nodes have one child.

469 (8%) DET nodes have two children.

182 (3%) DET nodes have three or more children.

The highest child degree of a DET node is 22.
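The child degree of a node is the number of tokens whose HEAD points to it. A minimal sketch of how the leaf, one-child, two-children and three-or-more breakdown could be computed for a single sentence, using a hypothetical list of (id, UPOS, head) triples and an assumed function name `child_degrees`:

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD) triples; HEAD 0 is the root.
sentence = [
    (1, "DET", 2),
    (2, "NOUN", 0),
    (3, "DET", 2),
    (4, "ADP", 3),
    (5, "VERB", 3),
]


def child_degrees(tokens, upos="DET"):
    """Return the number of children of every node tagged `upos`."""
    n_children = Counter(head for _, _, head in tokens)
    return [n_children[tid] for tid, tag, _ in tokens if tag == upos]


degrees = child_degrees(sentence)
# Bucket 3 stands for "three or more children".
buckets = Counter(min(d, 3) for d in degrees)

print(degrees)       # [0, 2]
print(max(degrees))  # 2
```

In a full treebank the token IDs restart in every sentence, so the degrees would need to be computed per sentence and then pooled.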

Children of DET nodes are attached using 27 different relations: acl (684; 31% instances), case (624; 29% instances), cc (171; 8% instances), nsubj (150; 7% instances), nmod (128; 6% instances), punct (109; 5% instances), obl (60; 3% instances), nummod (35; 2% instances), mark (30; 1% instances), fixed (29; 1% instances), conj (25; 1% instances), amod (20; 1% instances), dep (18; 1% instances), obl:arg (15; 1% instances), parataxis (14; 1% instances), advmod:emph (12; 1% instances), cop (11; 1% instances), advcl (8; 0% instances), aux (7; 0% instances), advmod (6; 0% instances), appos (6; 0% instances), csubj (4; 0% instances), det (2; 0% instances), obj (2; 0% instances), orphan (2; 0% instances), xcomp (2; 0% instances), ccomp (1; 0% instances)

Children of DET nodes belong to 14 different parts of speech: VERB (711; 33% instances), ADP (598; 27% instances), NOUN (254; 12% instances), CCONJ (193; 9% instances), PRON (114; 5% instances), PUNCT (109; 5% instances), NUM (51; 2% instances), DET (43; 2% instances), ADJ (36; 2% instances), X (35; 2% instances), PART (13; 1% instances), ADV (10; 0% instances), AUX (7; 0% instances), INTJ (1; 0% instances)