
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-NYUAD: POS Tags: DET

There are 13 DET lemmas (0%), 1 DET type (6%) and 6362 DET tokens (1%). Out of 16 observed tags, DET ranks 14th in number of lemmas, 6th in number of types and 14th in number of tokens.
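Counts of this kind can be reproduced from a CoNLL-U file by tallying distinct lemmas, distinct word forms (types) and token occurrences per UPOS tag. The following is a minimal sketch, not the script that generated this page. The function name `count_upos` and the toy English sample are invented for illustration, and comparing forms case-sensitively is an assumption.

```python
from collections import defaultdict

# Toy CoNLL-U fragment (invented for illustration, not taken from UD_Arabic-NYUAD).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\tThe\tthe\tDET\t_\tDefinite=Def\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\t_\tNumber=Sing\t3\tnsubj\t_\t_",
    "3\tsaw\tsee\tVERB\t_\t_\t0\troot\t_\t_",
    "4\tthe\tthe\tDET\t_\tDefinite=Def\t5\tdet\t_\t_",
    "5\tcats\tcat\tNOUN\t_\tNumber=Plur\t3\tobj\t_\t_",
    "",
])


def count_upos(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms are compared case-sensitively
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


stats = count_upos(SAMPLE)
print(stats["DET"])  # (lemmas, types, tokens) for DET in the toy sample
```

In the toy sample, DET has one lemma, two types and two tokens. On this page every listed form is `_`, which is why only a single DET type is reported.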

The 10 most frequent DET lemmas: _، w، None، ,، “، .، l، )، 17، 34

The 10 most frequent DET types: _

The 10 most frequent ambiguous lemmas: _ (NOUN 216429, PUNCT 72574, ADJ 66760, ADP 62646, VERB 54473, PROPN 48965, ADV 26129, SCONJ 23987, NUM 15122, AUX 6581, DET 6330, PART 5856, CCONJ 5168, PRON 2460, INTJ 54, X 32), w (CCONJ 43321, NOUN 190, PUNCT 136, ADP 120, ADV 117, PROPN 78, VERB 71, SCONJ 69, ADJ 55, PRON 33, PART 10, DET 9, NUM 8, AUX 5, X 3), None (NOUN 457, X 344, VERB 264, ADJ 125, PROPN 124, ADV 34, CCONJ 20, PRON 16, SCONJ 16, PART 14, ADP 8, DET 6, AUX 2), , (NOUN 100, CCONJ 96, VERB 34, PROPN 33, ADJ 30, ADP 30, PRON 11, SCONJ 11, PART 10, AUX 5, DET 5, ADV 4), “ (NOUN 112, ADP 34, CCONJ 20, PROPN 20, ADJ 12, VERB 8, PART 6, PRON 6, SCONJ 6, ADV 5, AUX 2, DET 2, X 2), . (NOUN 107, ADJ 95, PROPN 67, PRON 20, VERB 12, PART 6, ADP 5, X 5, CCONJ 3, ADV 2, AUX 2, DET 2, SCONJ 1), l (ADP 15449, PART 123, NOUN 98, AUX 67, CCONJ 33, ADJ 30, PUNCT 19, VERB 9, SCONJ 8, PROPN 7, ADV 6, PRON 5, DET 2, INTJ 2, NUM 1, X 1), ) (PROPN 18, NOUN 16, CCONJ 6, PRON 3, ADP 2, ADJ 1, DET 1), 17 (ADJ 14, DET 1, NOUN 1, PROPN 1), TBupdate (NOUN 401, ADJ 280, VERB 263, X 174, ADV 74, PROPN 69, ADP 4, SCONJ 2, CCONJ 1, DET 1, PART 1, PRON 1)

The 10 most frequent ambiguous types: _ (NOUN 218254, ADP 91694, PUNCT 75148, ADJ 67604, PROPN 58325, VERB 55215, CCONJ 50032, PRON 31239, ADV 26527, SCONJ 26034, NUM 15147, PART 8612, AUX 7723, DET 6362, X 917, INTJ 56)

Morphology

The form / lemma ratio of DET is 0.076923 (the average of all parts of speech is 0.002933).
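The reported ratio is consistent with dividing the number of DET types (1) by the number of DET lemmas (13), both taken from the counts at the top of this page. A quick check, assuming that this is how the ratio is defined:

```python
# Figures reported on this page for DET.
det_types = 1    # distinct DET word forms
det_lemmas = 13  # distinct DET lemmas

# Assumed definition: form / lemma ratio = types / lemmas.
ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # prints 0.076923
```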

The 1st highest number of forms (1) was observed with the lemma “””: _.

The 2nd highest number of forms (1) was observed with the lemma “)”: _.

The 3rd highest number of forms (1) was observed with the lemma “,”: _.

DET occurs with 8 features: Gender (6040; 95% instances), Number (6040; 95% instances), Definite (6031; 95% instances), Case (70; 1% instances), Person (10; 0% instances), Mood (9; 0% instances), Voice (9; 0% instances), Polarity (2; 0% instances)

DET occurs with 17 feature-value pairs: Case=Acc, Case=Gen, Case=Nom, Definite=Com, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Mood=Ind, Mood=Jus, Number=Dual, Number=Plur, Number=Sing, Person=1, Person=3, Polarity=Neg, Voice=Act

DET occurs with 25 feature combinations. The most frequent feature combination is Definite=Ind|Gender=Masc|Number=Sing (3616 tokens). Examples: _
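The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column, where pairs are separated by `|` and names are separated from values by `=`. The sketch below uses an invented list of FEATS strings. The helper `parse_feats` is hypothetical, and counting the empty value `_` as a combination of its own is an assumption.

```python
from collections import Counter


def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into a dict; "_" means no features."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Invented FEATS values for four DET tokens (illustration only).
feats_column = [
    "Definite=Ind|Gender=Masc|Number=Sing",
    "Definite=Ind|Gender=Masc|Number=Sing",
    "Definite=Def|Gender=Fem|Number=Plur",
    "_",
]

# How many tokens carry each feature name (e.g. Gender).
feature_counts = Counter(name for f in feats_column for name in parse_feats(f))

# Distinct feature-value pairs (e.g. Gender=Fem).
pairs = {f"{k}={v}" for f in feats_column for k, v in parse_feats(f).items()}

# Whole FEATS strings counted as combinations.
combinations = Counter(feats_column)

print(feature_counts)
print(sorted(pairs))
print(combinations.most_common(1))
```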

Relations

DET nodes are attached to their parents using 10 different relations: det (5284; 83% instances), nsubj (689; 11% instances), aux (126; 2% instances), obj (110; 2% instances), nmod:poss (63; 1% instances), nmod (54; 1% instances), root (17; 0% instances), nsubj:pass (9; 0% instances), parataxis (9; 0% instances), iobj (1; 0% instances)
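The percentages in this list can be checked against the counts: the ten relation counts sum to the 6362 DET tokens, and each percentage is the count divided by that total, rounded to a whole number. A short check using only the figures above (rounding to the nearest integer is an assumption about how the page was generated):

```python
# Relation counts for DET nodes, copied from the list above.
relation_counts = {
    "det": 5284, "nsubj": 689, "aux": 126, "obj": 110, "nmod:poss": 63,
    "nmod": 54, "root": 17, "nsubj:pass": 9, "parataxis": 9, "iobj": 1,
}

total = sum(relation_counts.values())

# Assumption: percentages are rounded to the nearest whole number.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)
for rel, pct in percentages.items():
    print(f"{rel}: {pct}%")
```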

Parents of DET nodes belong to 15 different parts of speech: NOUN (4540; 71% instances), VERB (729; 11% instances), ADV (289; 5% instances), ADJ (166; 3% instances), NUM (152; 2% instances), SCONJ (143; 2% instances), PUNCT (130; 2% instances), PART (62; 1% instances), CCONJ (47; 1% instances), AUX (32; 1% instances), PRON (27; 0% instances), PROPN (19; 0% instances), [no part of speech; presumably the artificial root, matching the 17 root attachments] (17; 0% instances), X (7; 0% instances), DET (2; 0% instances)

5229 (82%) DET nodes are leaves.

905 (14%) DET nodes have one child.

98 (2%) DET nodes have two children.

130 (2%) DET nodes have three or more children.

The highest child degree of a DET node is 12.
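The child-degree figures above follow from the HEAD column: a node's degree is the number of tokens in the same sentence whose HEAD equals its ID. The sketch below works on one invented sentence given as `(id, upos, head)` triples. The function name `child_degrees` is hypothetical, and for a real treebank the count has to be done per sentence because IDs restart at 1.

```python
from collections import Counter

# One toy sentence as (id, upos, head) triples (invented for illustration).
SENTENCE = [
    (1, "ADP", 2),
    (2, "DET", 3),
    (3, "VERB", 0),
    (4, "DET", 5),
    (5, "NOUN", 3),
]


def child_degrees(sentence, upos):
    """Return the number of children of every node tagged `upos`."""
    n_children = Counter(head for _, _, head in sentence)
    # Counter returns 0 for nodes that never appear as a head (leaves).
    return [n_children[tid] for tid, tag, _ in sentence if tag == upos]


degrees = child_degrees(SENTENCE, "DET")
histogram = Counter(degrees)  # degree -> number of DET nodes with that degree

print(histogram)
print(max(degrees))  # highest child degree among DET nodes
```

In the toy sentence, one DET node is a leaf and the other has a single ADP child, which mirrors the most common non-leaf configuration reported on this page.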

Children of DET nodes are attached using 18 different relations: case (950; 57% instances), cc (113; 7% instances), mark (107; 6% instances), nmod (92; 6% instances), punct (82; 5% instances), ccomp (56; 3% instances), parataxis (42; 3% instances), amod (41; 2% instances), xcomp (41; 2% instances), cop (38; 2% instances), advmod (25; 2% instances), dep (22; 1% instances), nsubj (20; 1% instances), conj (18; 1% instances), nummod (8; 0% instances), obj (2; 0% instances), csubj (1; 0% instances), det (1; 0% instances)

Children of DET nodes belong to 16 different parts of speech: ADP (950; 57% instances), VERB (119; 7% instances), NOUN (115; 7% instances), CCONJ (113; 7% instances), SCONJ (106; 6% instances), PUNCT (82; 5% instances), ADJ (54; 3% instances), AUX (43; 3% instances), ADV (31; 2% instances), PART (18; 1% instances), PROPN (9; 1% instances), NUM (8; 0% instances), PRON (6; 0% instances), DET (2; 0% instances), X (2; 0% instances), INTJ (1; 0% instances)