This page pertains to UD version 2.

Treebank Statistics: UD_Haitian_Creole-Adolphe: POS Tags: DET

There are 32 DET lemmas (1%), 37 DET types (1%) and 5131 DET tokens (7%). Out of 17 observed tags, the rank of DET is: 9 in number of lemmas, 9 in number of types and 5 in number of tokens.
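Counts of this kind can be derived from a CoNLL-U file roughly as follows. This is a minimal sketch, not the script that generated this page: it assumes the standard 10-column CoNLL-U layout and counts types case-sensitively. The function name `upos_counts`, the helper `row`, and the two toy sentences are hypothetical and serve only to exercise the code.

```python
def row(tok_id, form, lemma, upos, head, deprel):
    """Build one 10-column CoNLL-U line (unused columns set to '_')."""
    return "\t".join([tok_id, form, lemma, upos, "_", "_", head, deprel, "_", "_"])

# Toy data (hypothetical annotations, not taken from the treebank).
SAMPLE = "\n".join([
    row("1", "kay", "kay", "NOUN", "0", "root"),
    row("2", "la", "la", "DET", "1", "det"),
    "",
    row("1", "dlo", "dlo", "NOUN", "0", "root"),
    row("2", "a", "la", "DET", "1", "det"),
])

def upos_counts(conllu_text, upos="DET"):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        # Skip malformed lines, multiword token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # distinct word forms
            lemmas.add(cols[2])  # distinct lemmas
    return len(lemmas), len(types), tokens

print(upos_counts(SAMPLE))
```

In the toy data both determiners share the lemma "la" but differ in form, so the sketch reports one lemma, two types and two tokens.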

The 10 most frequent DET lemmas: la, yon, sa, tout, yo, kèk, anpil, chak, plizyè, okenn

The 10 most frequent DET types: a, yon, la, an, sa, tout, yo, kèk, nan, anpil

The 10 most frequent ambiguous lemmas: la (DET 2475, ADV 15, VERB 4), yon (DET 1187, VERB 1), sa (PRON 1091, DET 399, NOUN 1), tout (DET 350, ADJ 5, ADV 2, PRON 2), yo (PRON 2446, DET 152), anpil (DET 96, ADV 84, ADJ 35), plizyè (DET 60, ADJ 4), lòt (ADJ 210, PRON 39, DET 31, NOUN 1), youn (PRON 55, DET 17, NUM 5, NOUN 1), pifò (DET 15, ADJ 1, NOUN 1)

The 10 most frequent ambiguous types: a (DET 1192, ADP 5, ADV 5, NOUN 2, PART 1, PRON 1), yon (DET 1120, PRON 1, VERB 1), la (DET 549, ADV 15, VERB 4), an (DET 531, ADP 66, NOUN 12, ADV 10), sa (PRON 1007, DET 399, NOUN 1), tout (DET 331, ADJ 5, ADV 2, PRON 2), yo (PRON 2314, DET 152), nan (ADP 1416, DET 139, VERB 7, PRON 1), anpil (DET 94, ADV 82, ADJ 35), plizyè (DET 60, ADJ 4)

Morphology

The form / lemma ratio of DET is 1.156250 (the average of all parts of speech is 1.008685).
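As a quick arithmetic check, the reported ratio agrees with the number of DET types divided by the number of DET lemmas given earlier on this page (37 and 32). Whether the generating script computes it in exactly this way is an assumption; only the agreement of the numbers is certain.

```python
det_types = 37   # DET types reported on this page
det_lemmas = 32  # DET lemmas reported on this page

ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 1.156250
```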

The highest number of forms (6) was observed with the lemma “la”: a, an, la, lan, nan, yon.

The 2nd highest number of forms (2) was observed with the lemma “kelkeswa”: Kèlkeswa, kelkeswa.

The 3rd highest number of forms (2) was observed with the lemma “yon”: yon, yoon.

DET occurs with 6 features: PronType (4145; 81% instances), Number (3889; 76% instances), Definite (3721; 73% instances), Polarity (12; 0% instances), Person (7; 0% instances), Poss (6; 0% instances)

DET occurs with 10 feature-value pairs: Definite=Def, Definite=Ind, Number=Plur, Number=Sing, Person=3, Polarity=Neg, Poss=Yes, PronType=Art, PronType=Dem, PronType=Prs

DET occurs with 27 feature combinations. The most frequent feature combination is Definite=Def|Number=Sing|PronType=Art (2422 tokens). Examples: a, la, an, nan, lan

Relations

DET nodes are attached to their parents using 17 different relations: det (4729; 92% instances), obl (219; 4% instances), nmod (61; 1% instances), root (31; 1% instances), acl:relcl (26; 1% instances), advcl (15; 0% instances), conj (14; 0% instances), obj (8; 0% instances), ccomp (7; 0% instances), acl (6; 0% instances), dep (5; 0% instances), nsubj (3; 0% instances), advmod (2; 0% instances), parataxis (2; 0% instances), appos (1; 0% instances), dislocated (1; 0% instances), xcomp (1; 0% instances)

Parents of DET nodes belong to 12 different parts of speech: NOUN (3933; 77% instances), PROPN (344; 7% instances), VERB (312; 6% instances), PRON (271; 5% instances), ADJ (165; 3% instances), [no parent: root] (31; 1% instances), DET (25; 0% instances), ADV (21; 0% instances), ADP (17; 0% instances), NUM (7; 0% instances), SCONJ (4; 0% instances), CCONJ (1; 0% instances)

4700 (92%) DET nodes are leaves.

53 (1%) DET nodes have one child.

228 (4%) DET nodes have two children.

150 (3%) DET nodes have three or more children.

The highest child degree of a DET node is 8.
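The four child-degree buckets above add up to the 5131 DET tokens (4700 + 53 + 228 + 150). A distribution like this could be computed from CoNLL-U data along the following lines. This is a minimal sketch under the same assumptions as any plain CoNLL-U reader (10 tab-separated columns, sentences separated by a blank line); the names `child_degree_histogram` and `row` and the toy sentences are hypothetical, and the annotations are chosen only to exercise the code.

```python
from collections import Counter

def row(tok_id, form, lemma, upos, head, deprel):
    """Build one 10-column CoNLL-U line (unused columns set to '_')."""
    return "\t".join([tok_id, form, lemma, upos, "_", "_", head, deprel, "_", "_"])

# Toy data: one leaf determiner and one determiner with a single child.
SAMPLE = "\n".join([
    row("1", "kay", "kay", "NOUN", "0", "root"),
    row("2", "la", "la", "DET", "1", "det"),
    "",
    row("1", "ale", "ale", "VERB", "0", "root"),
    row("2", "nan", "nan", "ADP", "3", "case"),
    row("3", "tout", "tout", "DET", "1", "obl"),
])

def child_degree_histogram(conllu_text, upos="DET"):
    """Map number of children -> number of nodes with the given UPOS tag."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):
        tags, n_children = {}, Counter()
        for line in block.splitlines():
            cols = line.split("\t")
            # Skip comments, multiword token ranges and empty nodes.
            if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
                continue
            tags[cols[0]] = cols[3]
            n_children[cols[6]] += 1  # one more child for this token's head
        for tok_id, tag in tags.items():
            if tag == upos:
                hist[n_children[tok_id]] += 1  # 0 means the node is a leaf
    return hist

print(dict(child_degree_histogram(SAMPLE)))
```

The largest key in the returned histogram corresponds to the highest child degree.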

Children of DET nodes are attached using 23 different relations: case (285; 25% instances), nmod (275; 25% instances), nsubj (87; 8% instances), punct (82; 7% instances), advmod (36; 3% instances), amod (35; 3% instances), obl (34; 3% instances), mark (33; 3% instances), compound (32; 3% instances), det (31; 3% instances), obj (30; 3% instances), compound:svc (25; 2% instances), acl (23; 2% instances), aux (23; 2% instances), cop (19; 2% instances), acl:relcl (15; 1% instances), cc (14; 1% instances), advcl (13; 1% instances), conj (13; 1% instances), xcomp (10; 1% instances), discourse (3; 0% instances), dislocated (2; 0% instances), parataxis (1; 0% instances)

Children of DET nodes belong to 14 different parts of speech: NOUN (329; 29% instances), ADP (296; 26% instances), PRON (92; 8% instances), VERB (84; 7% instances), PUNCT (82; 7% instances), PROPN (50; 4% instances), AUX (42; 4% instances), ADJ (40; 4% instances), ADV (37; 3% instances), SCONJ (26; 2% instances), DET (25; 2% instances), CCONJ (14; 1% instances), PART (3; 0% instances), NUM (1; 0% instances)