
This page pertains to UD version 2.

Treebank Statistics: UD_Arabic-NYUAD: POS Tags: DET

There are 6 DET lemmas (0%), 1 DET type (6%) and 6363 DET tokens (1%). Out of 16 observed tags, the rank of DET is: 14 in number of lemmas, 6 in number of types and 13 in number of tokens.
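Counts of this kind can be reproduced from a treebank's CoNLL-U files. The sketch below is a minimal illustration, assuming the standard ten-column CoNLL-U layout (FORM in column 2, LEMMA in column 3, UPOS in column 4) and counting only regular syntactic words. The function name `count_upos` and the three-token sample are hypothetical and are not taken from UD_Arabic-NYUAD; the script that generated this page may count differently.

```python
def count_upos(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # distinct word forms
            lemmas.add(cols[2])  # distinct lemmas
    return len(lemmas), len(types), tokens


# Invented sample for illustration only, not from the treebank.
SAMPLE = "\n".join([
    "# text = this book that",
    "1\tthis\tthis\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tbook\tbook\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\tthat\tthat\tDET\t_\t_\t2\tdet\t_\t_",
])

print(count_upos(SAMPLE, "DET"))
```

Sets are used for lemmas and types so that repeated occurrences are counted once, while the token counter increases on every occurrence.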

The 10 most frequent DET lemmas (only 6 are observed): _، w، b، ,، .، l

The 10 most frequent DET types (only 1 is observed): _

The 10 most frequent ambiguous lemmas: _ (NOUN 221327, PUNCT 71973, ADJ 68841, ADP 62617, VERB 55127, PROPN 48391, ADV 23955, SCONJ 15652, NUM 15105, PRON 12926, AUX 6881, DET 6354, CCONJ 3889, PART 1501, X 380, INTJ 56), w (CCONJ 43819, SCONJ 235, ADP 42, NOUN 41, VERB 40, ADJ 14, PRON 12, PROPN 9, DET 4, PART 3, NUM 2, PUNCT 2, X 2), b (ADP 12334, NOUN 21, DET 2, PRON 2, SCONJ 2, X 2, ADJ 1, VERB 1), , (PUNCT 254, CCONJ 68, NOUN 9, ADJ 8, ADP 7, NUM 5, SCONJ 4, VERB 4, PRON 3, PROPN 3, ADV 2, DET 1, PART 1), . (PUNCT 312, ADP 3, CCONJ 3, NOUN 3, PROPN 3, VERB 2, DET 1), l (ADP 15628, PART 165, NOUN 29, SCONJ 28, ADV 2, VERB 2, ADJ 1, DET 1, NUM 1, PROPN 1, PUNCT 1, X 1)

The 10 most frequent ambiguous types: _ (NOUN 221899, ADP 91743, PUNCT 75266, ADJ 69355, PROPN 57421, VERB 55469, CCONJ 49161, PRON 43495, ADV 24067, SCONJ 16614, NUM 15377, AUX 9155, DET 6363, PART 2521, X 927, INTJ 56)

Morphology

The form / lemma ratio of DET is 0.166667 (the average of all parts of speech is 0.003044).
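The reported value is consistent with dividing the number of distinct DET forms (1) by the number of distinct DET lemmas (6). The page does not state the formula, so the sketch below rests on that assumption; `form_lemma_ratio` is a hypothetical helper.

```python
def form_lemma_ratio(pairs):
    """Distinct forms divided by distinct lemmas, given (form, lemma) pairs."""
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# The six DET lemmas listed on this page, each with the single form "_".
det_pairs = [("_", lemma) for lemma in ["_", "w", "b", ",", ".", "l"]]
print(f"{form_lemma_ratio(det_pairs):.6f}")  # 0.166667
```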

The 1st highest number of forms (1) was observed with the lemma “,”: _.

The 2nd highest number of forms (1) was observed with the lemma “.”: _.

The 3rd highest number of forms (1) was observed with the lemma “_”: _.

DET occurs with 8 features: Gender (6065; 95% instances), Number (6065; 95% instances), Definite (6060; 95% instances), Case (46; 1% instances), AdpType (8; 0% instances), Mood (5; 0% instances), Person (5; 0% instances), Voice (5; 0% instances)

DET occurs with 16 feature-value pairs: AdpType=Prep, Case=Acc, Case=Gen, Case=Nom, Definite=Com, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Mood=Ind, Mood=Jus, Number=Dual, Number=Plur, Number=Sing, Person=3, Voice=Act

DET occurs with 18 feature combinations. The most frequent feature combination is Definite=Ind|Gender=Masc|Number=Sing (3645 tokens). Examples: _

Relations

DET nodes are attached to their parents using 7 different relations: det (4202; 66% instances), obj (961; 15% instances), nsubj (731; 11% instances), nmod:poss (285; 4% instances), nmod (101; 2% instances), iobj (46; 1% instances), root (37; 1% instances)
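As a consistency check, the seven relation counts above add up to the 6363 DET tokens, and the percentages follow from dividing each count by that total. The sketch below assumes the page rounds to the nearest whole percent, which matches the figures shown; `shares` is a hypothetical helper.

```python
# Counts copied from the relation list above.
det_relations = {
    "det": 4202, "obj": 961, "nsubj": 731, "nmod:poss": 285,
    "nmod": 101, "iobj": 46, "root": 37,
}


def shares(counts):
    """Map each label to its share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


print(sum(det_relations.values()))  # 6363
print(shares(det_relations))
```

Because each share is rounded independently, the percentages need not sum to exactly 100.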

Parents of DET nodes belong to 12 different parts of speech, counting the untagged artificial root node as one: NOUN (4850; 76% instances), VERB (677; 11% instances), ADV (240; 4% instances), PRON (225; 4% instances), ADJ (171; 3% instances), NUM (108; 2% instances), untagged root (37; 1% instances), PROPN (36; 1% instances), CCONJ (9; 0% instances), AUX (4; 0% instances), DET (4; 0% instances), X (2; 0% instances)

5349 (84%) DET nodes are leaves.

798 (13%) DET nodes have one child.

95 (1%) DET nodes have two children.

121 (2%) DET nodes have three or more children.

The highest child degree of a DET node is 14.
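The child-degree figures above (5349 leaves, 798 with one child, 95 with two, 121 with three or more) add up to the 6363 DET tokens. A distribution like this can be derived from the HEAD column of a dependency tree. The following is a minimal sketch for a single sentence; the function name `child_degree_histogram`, its input format, and the four-node example are assumptions made for illustration.

```python
from collections import Counter


def child_degree_histogram(heads, is_target):
    """Count target nodes by their number of children.

    heads: dict mapping node id -> head id (0 for the root) in one sentence.
    is_target: dict mapping node id -> True if the node has the tag of interest.
    """
    # Each appearance of an id as a head means one dependent of that node.
    children = Counter(heads.values())
    histogram = Counter()
    for node, wanted in is_target.items():
        if wanted:
            histogram[children[node]] += 1  # missing nodes count as 0 children
    return histogram


# Invented four-node tree: node 2 is the root, nodes 1 and 3 are the targets.
heads = {1: 2, 2: 0, 3: 2, 4: 3}
is_det = {1: True, 2: False, 3: True, 4: False}
hist = child_degree_histogram(heads, is_det)
print(dict(hist))
print(max(hist))  # highest child degree among the target nodes
```

In the example, node 1 is a leaf and node 3 has one child, so the histogram has one entry for degree 0 and one for degree 1.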

Children of DET nodes are attached using 17 different relations: case (847; 55% instances), mark (124; 8% instances), punct (121; 8% instances), nmod (74; 5% instances), obj (68; 4% instances), amod (66; 4% instances), xcomp (52; 3% instances), cop (51; 3% instances), cc (48; 3% instances), ccomp (44; 3% instances), advmod (20; 1% instances), nsubj (12; 1% instances), nummod (9; 1% instances), aux (2; 0% instances), discourse (1; 0% instances), iobj (1; 0% instances), nmod:poss (1; 0% instances)

Children of DET nodes belong to 16 different parts of speech: ADP (847; 55% instances), PUNCT (121; 8% instances), NOUN (109; 7% instances), VERB (96; 6% instances), SCONJ (92; 6% instances), ADJ (68; 4% instances), AUX (54; 4% instances), CCONJ (49; 3% instances), PART (34; 2% instances), ADV (23; 1% instances), PRON (17; 1% instances), PROPN (10; 1% instances), NUM (9; 1% instances), X (7; 0% instances), DET (4; 0% instances), INTJ (1; 0% instances)