This page pertains to UD version 2.

Treebank Statistics: UD_French-PUD: POS Tags: DET

There are 2 DET lemmas (10%), 27 DET types (0%) and 3630 DET tokens (15%). Out of 16 observed tags, the rank of DET is: 2 in number of lemmas, 11 in number of types and 3 in number of tokens.
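As an illustration of how lemma, type and token counts like these could be derived, here is a minimal sketch that tallies them per UPOS tag from CoNLL-U text. The function name `upos_counts` and the tiny sample are hypothetical. The sketch assumes that types are distinct case-sensitive word forms and that multiword-token ranges and empty nodes are skipped, which may differ from how this page was actually generated.

```python
from collections import defaultdict


def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        tok_id, form, lemma, upos = cols[0], cols[1], cols[2], cols[3]
        # Assumption: ignore multiword-token ranges (e.g. 3-4) and empty nodes (e.g. 3.1).
        if "-" in tok_id or "." in tok_id:
            continue
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


# Hypothetical five-token sample, not taken from UD_French-PUD.
sample = "\n".join([
    "# text = le chat et les chiens",
    "1\tle\tle\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tchat\tchat\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\tet\tet\tCCONJ\t_\t_\t5\tcc\t_\t_",
    "4\tles\tle\tDET\t_\t_\t5\tdet\t_\t_",
    "5\tchiens\tchien\tNOUN\t_\t_\t2\tconj\t_\t_",
])

print(upos_counts(sample)["DET"])  # (1, 2, 2): one lemma, two types, two tokens
```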

The 10 most frequent DET lemmas: _, le

The 10 most frequent DET types: le, la, les, l’, un, une, des, l’, cette, de

The 10 most frequent ambiguous lemmas: _ (NOUN 4804, ADP 3324, DET 3037, VERB 3024, PUNCT 2548, ADJ 1607, PRON 1335, PROPN 1241, ADV 1035, CCONJ 562, NUM 458, AUX 274, SCONJ 206, X 48, SYM 34, PART 9)

The 10 most frequent ambiguous types: le (DET 758, PRON 17), la (DET 674, PRON 1), les (DET 631, PRON 2), l’ (DET 316, PRON 12), un (DET 226, NOUN 9, NUM 5), une (DET 208, NOUN 6), l’ (DET 124, PRON 6, PART 1), de (ADP 1557, DET 36), ce (DET 34, PRON 27), d’ (ADP 178, DET 14)

Morphology

The form / lemma ratio of DET is 13.5 (the average across all parts of speech is 309.55).
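The DET ratio appears to be the number of DET types divided by the number of DET lemmas (27 and 2, as reported at the top of the page). The sketch below reproduces it under that assumption; `form_lemma_ratio` is a hypothetical helper name.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Distinct word forms per lemma for one part of speech."""
    return n_types / n_lemmas


# Counts reported on this page for DET: 27 types, 2 lemmas.
det_ratio = form_lemma_ratio(27, 2)
print(det_ratio)  # 13.5
```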

The highest number of forms (27) was observed with the lemma “_”: That, a, ce, ces, cet, cette, d’, de, des, du, d’, e, l’, la, ladite, le, les, l‘, l’, quelle, the, tous, tout, toute, toutes, un, une.

The second-highest number of forms (2) was observed with the lemma “le”: le, les.

DET occurs with 2 features: Number (3627; 100% of instances), Gender (3613; 100% of instances)

DET occurs with 4 feature-value pairs: Gender=Fem, Gender=Masc, Number=Plur, Number=Sing

DET occurs with 7 feature combinations. The most frequent feature combination is Gender=Masc|Number=Sing (1348 tokens). Examples: le, un, l’, l’, ce, tout, de, du, cet, les
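A feature combination such as Gender=Masc|Number=Sing follows the CoNLL-U FEATS convention: pairs separated by `|`, each written as `Name=Value`, with `_` standing for no features. A minimal sketch of splitting one into its pairs, where `parse_feats` is a hypothetical helper name:

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":
        # An underscore marks a token with no morphological features.
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


combo = parse_feats("Gender=Masc|Number=Sing")
print(combo)  # {'Gender': 'Masc', 'Number': 'Sing'}
```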

Relations

DET nodes are attached to their parents using 7 different relations: det (3566; 98% of instances), fixed (36; 1% of instances), det:predet (20; 1% of instances), advmod (5; 0% of instances), case (1; 0% of instances), mark (1; 0% of instances), nmod (1; 0% of instances)

Parents of DET nodes belong to 10 different parts of speech: NOUN (3291; 91% of instances), PROPN (246; 7% of instances), ADV (41; 1% of instances), ADP (31; 1% of instances), NUM (9; 0% of instances), ADJ (3; 0% of instances), DET (3; 0% of instances), VERB (3; 0% of instances), PRON (2; 0% of instances), X (1; 0% of instances)
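The percentages in these lists look like each count divided by the total number of DET tokens and rounded to the nearest whole percent, which is why relations with only a handful of occurrences show as 0%. The sketch below checks that reading against the relation counts above. The rounding rule is an assumption, and `percent_shares` is a hypothetical helper name.

```python
def percent_shares(counts):
    """Map each key to its share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {key: round(100 * n / total) for key, n in counts.items()}


# Relation counts for DET nodes, copied from this page.
parent_relations = {
    "det": 3566, "fixed": 36, "det:predet": 20, "advmod": 5,
    "case": 1, "mark": 1, "nmod": 1,
}

shares = percent_shares(parent_relations)
print(sum(parent_relations.values()))  # 3630, the number of DET tokens
print(shares["det"], shares["det:predet"], shares["advmod"])  # 98 1 0
```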

3616 (100%) DET nodes are leaves.

9 (0%) DET nodes have one child.

3 (0%) DET nodes have two children.

2 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 5.

Children of DET nodes are attached using 5 different relations: fixed (16; 70% of instances), punct (3; 13% of instances), advmod (2; 9% of instances), case (1; 4% of instances), nmod (1; 4% of instances)

Children of DET nodes belong to 5 different parts of speech: ADV (8; 35% of instances), ADP (5; 22% of instances), NOUN (4; 17% of instances), DET (3; 13% of instances), PUNCT (3; 13% of instances)
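The child-degree figures and the two child breakdowns above can be cross-checked against each other. The following sketch uses only numbers copied from this page. The variable names are illustrative.

```python
# Number of DET nodes by child count, copied from this page.
degree_counts = {"leaf": 3616, "one": 9, "two": 3, "three_or_more": 2}

# Children of DET nodes by relation and by part of speech, copied from this page.
children_by_relation = {"fixed": 16, "punct": 3, "advmod": 2, "case": 1, "nmod": 1}
children_by_upos = {"ADV": 8, "ADP": 5, "NOUN": 4, "DET": 3, "PUNCT": 3}

total_nodes = sum(degree_counts.values())
total_children = sum(children_by_relation.values())

# Children left over for the two nodes that have three or more children.
high_degree_children = total_children - (degree_counts["one"] * 1 + degree_counts["two"] * 2)

print(total_nodes)           # 3630
print(total_children)        # 23
print(high_degree_children)  # 8
```

The degree classes add up to the 3630 DET tokens, and both child breakdowns cover the same 23 children. The 8 children left for the two highest-degree nodes are consistent with a maximum child degree of 5: one node with 5 children and one with 3.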