
This page pertains to UD version 2.

Treebank Statistics: UD_Persian-Seraji: POS Tags: DET

There are 30 DET lemmas (0%), 37 DET types (0%) and 3759 DET tokens (2%). Out of 15 observed tags, the rank of DET is: 10 in number of lemmas, 11 in number of types and 11 in number of tokens.
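Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: it assumes the standard ten tab-separated CoNLL-U columns (FORM in column 2, LEMMA in column 3, UPOS in column 4), skips multiword-token ranges and empty nodes, and compares forms case-sensitively. The function name `upos_counts` and the toy rows are illustrative and not taken from the treebank.

```python
def upos_counts(conllu_text, upos="DET"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment line
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID
        # (this drops ranges such as "1-2" and empty nodes such as "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


# Toy input (made up for illustration).
rows = [
    ["1", "این", "این", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "کتاب", "کتاب", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["3", "آن", "آن", "DET", "_", "_", "2", "det", "_", "_"],
    ["4", "این", "این", "DET", "_", "_", "2", "det", "_", "_"],
]
sample = "# toy sentence\n" + "\n".join("\t".join(r) for r in rows)
n_lemmas, n_types, n_tokens = upos_counts(sample)
```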

The 10 most frequent DET lemmas: این، آن، هر، هیچ، همان، همین، چه، تمام، برخی، همه

The 10 most frequent DET types: این، آن، هر، هیچ، همان، همین، چه، برخی، تمام، تمامی

The 10 most frequent ambiguous lemmas: این (DET 2403, PRON 583, CCONJ 1), آن (PRON 1094, DET 370, NOUN 3), هیچ (DET 129, ADV 5, INTJ 1, NOUN 1), همان (DET 128, PRON 8, NOUN 1), همین (DET 119, PRON 18), چه (ADV 89, DET 84, SCONJ 27, NOUN 7, CCONJ 2), تمام (DET 66, ADJ 38, PRON 1), برخی (DET 43, PRON 41, NOUN 1), همه (PRON 207, DET 24), دیگر (ADJ 292, ADV 39, DET 21, PRON 4, NOUN 2)

The 10 most frequent ambiguous types: این (DET 2370, PRON 489, CCONJ 1), آن (PRON 592, DET 366, NOUN 3), هیچ (DET 129, ADV 5, INTJ 1, NOUN 1), همان (DET 128, PRON 8, NOUN 1), همین (DET 119, PRON 18), چه (DET 84, ADV 61, SCONJ 27, NOUN 5, CCONJ 2), برخی (DET 43, PRON 41), تمام (DET 42, ADJ 38, PRON 1), دیگر (ADJ 204, ADV 39, DET 21, PRON 1), همهٔ (DET 20, PRON 15)

Morphology

The form / lemma ratio of DET is 1.233333 (the average of all parts of speech is 1.409222).
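The reported ratio is consistent with dividing the number of DET types by the number of DET lemmas (37 / 30). A small sketch under that assumption follows; the helper name `form_lemma_ratio` is hypothetical, and whether the page counts a form shared by two lemmas once or twice is not stated here.

```python
def form_lemma_ratio(pairs):
    """Distinct forms divided by distinct lemmas, given (form, lemma) pairs."""
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)


# Check against the figures quoted on this page: 37 types, 30 lemmas.
reported = f"{37 / 30:.6f}"
```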

The 1st highest number of forms (3) was observed with the lemma “آن”: آن, آن‌ها, دان.

The 2nd highest number of forms (3) was observed with the lemma “این”: این, دین, ین.

The 3rd highest number of forms (2) was observed with the lemma “تمام”: تمام, تمامی.

DET occurs with 1 feature: PronType (216; 6% instances)

DET occurs with 2 feature-value pairs: PronType=Int, PronType=Neg

DET occurs with 3 feature combinations. The most frequent feature combination is _ (3543 tokens). Examples: این، آن، هر، همان، همین، برخی، تمام، تمامی، دیگر، همهٔ

Relations

DET nodes are attached to their parents using 10 different relations: det (3660; 97% instances), det:predet (45; 1% instances), mark (18; 0% instances), nsubj (15; 0% instances), advmod (6; 0% instances), obj (6; 0% instances), nmod:poss (5; 0% instances), fixed (2; 0% instances), conj (1; 0% instances), parataxis (1; 0% instances)

Parents of DET nodes belong to 9 different parts of speech: NOUN (3629; 97% instances), VERB (30; 1% instances), ADJ (26; 1% instances), ADV (24; 1% instances), PRON (24; 1% instances), NUM (23; 1% instances), ADP (1; 0% instances), CCONJ (1; 0% instances), DET (1; 0% instances)

3725 (99%) DET nodes are leaves.

13 (0%) DET nodes have one child.

19 (1%) DET nodes have two children.

2 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 4.
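The child-degree breakdown above can be sketched as a count of how many nodes name each DET node as their head. This is an illustrative sketch only: the input shape (a list of sentences, each a list of `(id, upos, head)` tuples) and the function name `child_degree_buckets` are assumptions, not part of any UD tool.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="DET"):
    """Bucket nodes of one UPOS tag by number of children: 0, 1, 2, 3+."""
    buckets = Counter()
    for sent in sentences:
        # Number of children per node = how often its ID appears as a HEAD.
        n_children = Counter(head for _, _, head in sent)
        for node_id, tag, _ in sent:
            if tag == upos:
                degree = n_children[node_id]
                buckets["3+" if degree >= 3 else str(degree)] += 1
    return buckets


# Toy sentence: node 1 (DET) is a leaf, node 3 (DET) has one child (node 4).
toy = [[(1, "DET", 2), (2, "NOUN", 0), (3, "DET", 2), (4, "NOUN", 3)]]
buckets = child_degree_buckets(toy)

# The four buckets reported above add up to the DET token count.
total_det = 3725 + 13 + 19 + 2
```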

Children of DET nodes are attached using 11 different relations: fixed (40; 69% instances), case (5; 9% instances), nmod:poss (5; 9% instances), acl:relcl (1; 2% instances), amod (1; 2% instances), aux (1; 2% instances), cc (1; 2% instances), ccomp (1; 2% instances), det (1; 2% instances), nsubj (1; 2% instances), punct (1; 2% instances)

Children of DET nodes belong to 10 different parts of speech: NOUN (22; 38% instances), CCONJ (20; 34% instances), PART (5; 9% instances), ADJ (4; 7% instances), VERB (2; 3% instances), ADP (1; 2% instances), AUX (1; 2% instances), DET (1; 2% instances), NUM (1; 2% instances), PUNCT (1; 2% instances)
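The percentages for children of DET nodes appear to be shares of the total number of children (58, the sum of either list), rounded to the nearest integer. The sketch below checks this against the relation counts listed above; the rounding rule is an assumption, and `percent_shares` is a hypothetical helper.

```python
def percent_shares(counts):
    """Map each key to its share of the total, as a rounded percentage."""
    total = sum(counts.values())
    return {key: round(100 * value / total) for key, value in counts.items()}


# Relation counts for children of DET nodes, copied from the list above.
child_relations = {
    "fixed": 40, "case": 5, "nmod:poss": 5, "acl:relcl": 1, "amod": 1,
    "aux": 1, "cc": 1, "ccomp": 1, "det": 1, "nsubj": 1, "punct": 1,
}
shares = percent_shares(child_relations)
```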