
This page pertains to UD version 2.

Treebank Statistics: UD_Coptic: POS Tags: DET

There are 23 DET lemmas (2%), 46 DET types (3%) and 1390 DET tokens (12%). Out of 14 observed tags, the rank of DET is: 8 in number of lemmas, 7 in number of types and 5 in number of tokens.
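As a rough illustration of how lemma, type and token counts of this kind can be derived, the sketch below reads CoNLL-U text and counts distinct lemmas, distinct word forms (types) and occurrences (tokens) for one tag. The function name `pos_counts` and the two-sentence sample are hypothetical and not taken from UD_Coptic; whether the page's own script normalizes case or treats multiword tokens differently is not stated here, so treat this as an assumption.

```python
def pos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed rows, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Toy sample for illustration only (not real treebank sentences).
SAMPLE = "\n".join([
    "1\tⲡ\tⲡ\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tⲣⲱⲙⲉ\tⲣⲱⲙⲉ\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "1\tⲧ\tⲡ\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tⲥϩⲓⲙⲉ\tⲥϩⲓⲙⲉ\tNOUN\t_\t_\t0\troot\t_\t_",
])

print(pos_counts(SAMPLE, "DET"))  # (1, 2, 2): one lemma, two forms, two tokens
```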

The 10 most frequent DET lemmas: ⲡ, ⲟⲩ, ⲡⲉϥ, ⲡⲉⲕ, ⲡⲉⲩ, ⲡⲁⲓ, ⲡⲁ, ⲡⲟⲩ, ⲕⲉ, ⲡⲉⲓ

The 10 most frequent DET types: ⲡ, ⲛ, ⲧ, ⲟⲩ, ϩⲉⲛ, ⲧⲉ, ⲡⲉ, ⲛⲉ, ⲡⲉϥ, ⲛⲉϥ

The 10 most frequent ambiguous lemmas (only 7 are attested): ⲡ (DET 832, PRON 4), ⲟⲩ (DET 177, PRON 20, X 1), ⲕⲉ (DET 23, NOUN 10), ϩⲛ (ADP 220, DET 1), ϭⲉ (PART 11, DET 1), ⲛⲁ (ADP 116, AUX 89, VERB 4, DET 1, NOUN 1), ⲛⲧⲟⲟⲩ (PRON 495, DET 1)

The 10 most frequent ambiguous types: ⲛ (ADP 436, DET 158, AUX 74, PRON 30, ADV 22, NOUN 1, VERB 1), ⲧ (DET 153, PRON 3), ⲟⲩ (PRON 196, DET 106, X 1), ⲧⲉ (DET 58, PRON 20, NOUN 1, PART 1), ⲡⲉ (DET 48, PRON 31, PART 19, NOUN 3), ⲛⲉ (AUX 41, DET 34, PRON 15, PART 1), ⲙ (ADP 192, DET 14, ADV 1), ⲩ (PRON 264, DET 14), ⲛⲁ (ADP 104, AUX 84, DET 10, VERB 4, NOUN 1), ⲧⲁ (DET 10, PRON 1)

Morphology

The form / lemma ratio of DET is 2.000000 (the average of all parts of speech is 1.154412).
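The reported ratio is consistent with dividing the number of DET types (46) by the number of DET lemmas (23), both given above. A minimal sketch of that arithmetic follows; the helper name `form_lemma_ratio` is hypothetical, and the exact definition used by the statistics script is assumed rather than confirmed.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma (assumed definition)."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(46, 23)  # counts reported for DET on this page
print(f"{ratio:.6f}")  # 2.000000
```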

The highest number of forms (7) was observed with the lemma “ⲡ”: ⲙ, ⲛ, ⲛⲉ, ⲡ, ⲡⲉ, ⲧ, ⲧⲉ.

The 2nd highest number of forms (3) was observed with the lemma “ⲟⲩ”: ϩⲉⲛ, ⲟⲩ, ⲩ.

The 3rd highest number of forms (3) was observed with the lemma “ⲡⲁ”: ⲛⲁ, ⲡⲁ, ⲧⲁ.
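The ranking above can be reproduced by grouping forms under their lemma and sorting by the size of each group. The sketch below uses only the three lemmas and forms listed here; the variable names are hypothetical, and how the page itself breaks the tie between the two lemmas with 3 forms is not stated (this sketch simply keeps insertion order for ties).

```python
# Forms per lemma, copied from the three entries listed above.
forms_by_lemma = {
    "ⲡ": {"ⲙ", "ⲛ", "ⲛⲉ", "ⲡ", "ⲡⲉ", "ⲧ", "ⲧⲉ"},
    "ⲟⲩ": {"ϩⲉⲛ", "ⲟⲩ", "ⲩ"},
    "ⲡⲁ": {"ⲛⲁ", "ⲡⲁ", "ⲧⲁ"},
}

# sorted() is stable, so lemmas with equal counts keep their insertion order.
ranked = sorted(forms_by_lemma.items(), key=lambda kv: -len(kv[1]))
form_counts = [(lemma, len(forms)) for lemma, forms in ranked]

for lemma, n in form_counts:
    print(lemma, n)
```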

DET occurs with 6 features: Definite (1246; 90% instances), Number (1245; 90% instances), PronType (1084; 78% instances), Gender (780; 56% instances), Person (245; 18% instances), Poss (245; 18% instances)
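The percentages for the features can be checked against the 1390 DET tokens reported at the top of the page. This is a small sketch assuming the shares are rounded to the nearest integer; the rounding rule and the helper name `share` are assumptions, not something the page documents.

```python
DET_TOKENS = 1390  # total DET tokens reported on this page

# Feature counts copied from the line above.
feature_counts = {
    "Definite": 1246, "Number": 1245, "PronType": 1084,
    "Gender": 780, "Person": 245, "Poss": 245,
}


def share(count, total=DET_TOKENS):
    """Percentage of DET tokens, rounded to the nearest integer (assumed rule)."""
    return round(100 * count / total)


shares = {feat: share(n) for feat, n in feature_counts.items()}
print(shares)
```

Under that assumption the computed shares agree with the figures listed above (90, 90, 78, 56, 18 and 18 percent).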

DET occurs with 13 feature-value pairs: Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Poss=Yes, PronType=Art, PronType=Dem, PronType=Prs

DET occurs with 17 feature combinations. The most frequent feature combination is Definite=Def|Gender=Masc|Number=Sing|PronType=Art (396 tokens). Examples: ⲡ, ⲡⲉ

Relations

DET nodes are attached to their parents using 15 different relations: det (1214; 87% instances), obl (42; 3% instances), dislocated (25; 2% instances), nsubj (24; 2% instances), nmod (20; 1% instances), conj (14; 1% instances), obj (14; 1% instances), root (14; 1% instances), appos (9; 1% instances), ccomp (5; 0% instances), acl (4; 0% instances), advcl (2; 0% instances), aux (1; 0% instances), discourse (1; 0% instances), parataxis (1; 0% instances)

Parents of DET nodes belong to 12 different parts of speech (counting the artificial root node, which has no tag): NOUN (1177; 85% instances), VERB (111; 8% instances), PROPN (33; 2% instances), PRON (20; 1% instances), DET (15; 1% instances), [root, no tag] (14; 1% instances), NUM (9; 1% instances), ADV (4; 0% instances), AUX (3; 0% instances), X (2; 0% instances), ADP (1; 0% instances), PART (1; 0% instances)

1224 (88%) DET nodes are leaves.

60 (4%) DET nodes have one child.

65 (5%) DET nodes have two children.

41 (3%) DET nodes have three or more children.

The highest child degree of a DET node is 6.
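The leaf, one-child, two-children and three-or-more figures above form a child-degree histogram; they sum to the 1390 DET tokens (1224 + 60 + 65 + 41). The sketch below shows one way such a histogram could be computed from head indices. The function name `degree_buckets`, the tuple layout and the toy sentences are hypothetical and not drawn from the treebank.

```python
from collections import Counter


def degree_buckets(sentences, upos="DET"):
    """Count nodes with the given tag by number of children (0, 1, 2, 3+)."""
    buckets = Counter()
    for sent in sentences:
        # Each token is (id, upos, head); count how often each id is a head.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                # Bucket 3 stands for "three or more children".
                buckets[min(n_children[tok_id], 3)] += 1
    return buckets


# Toy input for illustration only.
TOY = [
    [(1, "DET", 2), (2, "NOUN", 0)],                  # DET is a leaf
    [(1, "DET", 0), (2, "ADP", 1), (3, "VERB", 1)],   # DET has two children
]

buckets = degree_buckets(TOY)
print(dict(buckets))

# Sanity check on the figures reported above.
print(1224 + 60 + 65 + 41)  # 1390
```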

Children of DET nodes are attached using 18 different relations: acl (132; 38% instances), case (78; 23% instances), cc (18; 5% instances), cop (18; 5% instances), punct (17; 5% instances), advmod (13; 4% instances), mark (12; 3% instances), nmod (12; 3% instances), nsubj (12; 3% instances), advcl (7; 2% instances), conj (7; 2% instances), csubj (7; 2% instances), appos (5; 1% instances), det (3; 1% instances), discourse (1; 0% instances), dislocated (1; 0% instances), orphan (1; 0% instances), parataxis (1; 0% instances)

Children of DET nodes belong to 12 different parts of speech: VERB (134; 39% instances), ADP (80; 23% instances), NOUN (27; 8% instances), PRON (25; 7% instances), PUNCT (17; 5% instances), CCONJ (15; 4% instances), DET (15; 4% instances), ADV (14; 4% instances), PART (10; 3% instances), SCONJ (4; 1% instances), PROPN (3; 1% instances), AUX (1; 0% instances)