
This page pertains to UD version 2.

Treebank Statistics: UD_Portuguese-GSD: POS Tags: DET

There are 43 DET lemmas (0%), 128 DET types (0%) and 47602 DET tokens (15%). Out of 16 observed tags, the rank of DET is: 8 in number of lemmas, 11 in number of types and 3 in number of tokens.
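The lemma, type and token counts can be illustrated with a minimal Python sketch. It assumes the treebank is read as CoNLL-U text and that a "type" is a distinct, case-sensitive word form; the function name `tag_counts` and the five-token sample sentence are hypothetical and are not part of the treebank tooling.

```python
from collections import defaultdict

# Hypothetical five-token CoNLL-U sentence, for illustration only.
SAMPLE = "\n".join([
    "# text = O menino viu a casa",
    "1\tO\to\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tmenino\tmenino\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tviu\tver\tVERB\t_\t_\t0\troot\t_\t_",
    "4\ta\to\tDET\t_\t_\t5\tdet\t_\t_",
    "5\tcasa\tcasa\tNOUN\t_\t_\t3\tobj\t_\t_",
])

def tag_counts(conllu_text):
    """Return {UPOS: (distinct lemmas, distinct forms, tokens)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms are compared case-sensitively
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

print(tag_counts(SAMPLE)["DET"])
```

In the sample, "O" and "a" share the lemma "o", so DET has one lemma, two types and two tokens.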

The 10 most frequent DET lemmas: o, _, um, seu, este, esse, todo, outro, algum, cada

The 10 most frequent DET types: o, a, os, as, um, uma, sua, seu, seus, esta

The 10 most frequent ambiguous lemmas: o (DET 39669, PRON 164, PROPN 1), _ (PROPN 26803, ADP 7821, PRON 6131, DET 3765, NOUN 3010, NUM 2377, AUX 1984, CCONJ 1516, PUNCT 1272, VERB 1077, SYM 904, ADJ 597, PART 561, X 379, ADV 191, SCONJ 3), um (DET 3330, NUM 85, PROPN 2, NOUN 1), seu (DET 261, PRON 1), este (DET 136, PRON 10), esse (DET 74, PRON 12), todo (DET 59, ADJ 21, NOUN 18, PRON 9, ADV 1), outro (DET 51, PRON 21), algum (DET 28, PRON 7), cada (DET 24, NOUN 1)

The 10 most frequent ambiguous types: o (DET 16453, PRON 342, ADP 1, PROPN 1, X 1), a (DET 13291, ADP 3597, SCONJ 266, PRON 77, VERB 5, PROPN 3, CCONJ 1, X 1), os (DET 3846, PRON 47, PROPN 1), as (DET 2475, PRON 17, ADP 11, PROPN 1), um (DET 1704, NUM 154, PRON 143, PROPN 2, NOUN 1), uma (DET 1639, NUM 102, PRON 73, ADP 2), sua (DET 513, PRON 2), seu (DET 422, PRON 1), esta (DET 134, PRON 23, VERB 2, AUX 1), este (DET 118, PRON 16)
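A lemma or type counts as ambiguous when it occurs with more than one tag. The sketch below shows one way to derive such a listing; the helper name `ambiguous_items` and the toy input pairs are hypothetical.

```python
from collections import Counter, defaultdict

def ambiguous_items(pairs):
    """pairs: iterable of (item, upos), where item is a lemma or a word form.

    Returns only the items seen with more than one tag, each mapped to its
    tag counts sorted from most to least frequent.
    """
    by_item = defaultdict(Counter)
    for item, upos in pairs:
        by_item[item][upos] += 1
    return {
        item: tags.most_common()
        for item, tags in by_item.items()
        if len(tags) > 1
    }

# Hypothetical toy data, not taken from the treebank.
toy = [("o", "DET"), ("o", "DET"), ("o", "PRON"), ("cada", "DET")]
print(ambiguous_items(toy))
```

Here "cada" is dropped because it appears with a single tag in the toy data, while "o" is reported with both of its tags.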

Morphology

The form / lemma ratio of DET is 2.976744 (the average of all parts of speech is 2.236183).
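The reported ratio can be reproduced from the counts given earlier on this page, assuming it is simply the number of DET types divided by the number of DET lemmas. That assumption matches the published figure:

```python
det_types = 128   # DET types reported on this page
det_lemmas = 43   # DET lemmas reported on this page

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
ratio = det_types / det_lemmas
print(f"{ratio:.6f}")
```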

The 1st highest number of forms (116) was observed with the lemma “_”: Duas, Imensas, Três, Tua, a, aas, algum, alguma, algumas, alguns, ambas, ambos, aquela, aquele, as, bastante, cada, casa, certa, certas, certos, cuja, cujas, cujo, cujos, dado, de, demais, determinada, determinadas, determinado, determinados, diferente, diferentes, diversas, diversos, e, essa, essas, esse, esses, esta, estas, este, estes, flagrante, la, le, les, mais, menos, meu, meus, minha, minhas, muita, muitas, muito, muitos, múltiplos, nenhum, nenhuma, nossa, nossas, nosso, nossos, numerosas, nível, o, oa, onze, os, outra, outras, outro, outros, pouca, poucas, pouco, poucos, pouquíssimos, quais, quaisquer, qual, qualquer, quantas, quantos, que, seu, seus, sua, suas, sui, tais, tal, tanta, tantas, tanto, tantos, the, toda, todas, todo, todos, um, uma, umas, uns, varias, varios, vossa, vosso, várias, vários, à, às.

The 2nd highest number of forms (5) was observed with the lemma “pouco”: menos, pouca, poucas, pouco, poucos.

The 3rd highest number of forms (4) was observed with the lemma “algum”: algum, alguma, algumas, alguns.

DET occurs with 8 features: PronType (38842; 82% instances), Number (38837; 82% instances), Gender (38829; 82% instances), Definite (38009; 80% instances), ExtPos (3; 0% instances), Foreign (3; 0% instances), NumType (2; 0% instances), Poss (1; 0% instances)

DET occurs with 19 feature-value pairs: Definite=Def, Definite=Ind, ExtPos=PROPN, Foreign=Yes, Gender=Fem, Gender=Masc, NumType=Card, Number=Plur, Number=Sing, Poss=Yes, PronType=Art, PronType=Dem, PronType=Emp, PronType=Ind, PronType=Int, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot

DET occurs with 48 feature combinations. The most frequent feature combination is Definite=Def|Gender=Masc|Number=Sing|PronType=Art (17930 tokens). Examples: o, a

Relations

DET nodes are attached to their parents using 15 different relations: det (46161; 97% instances), det:poss (1317; 3% instances), fixed (33; 0% instances), mark (33; 0% instances), dep (13; 0% instances), nmod (9; 0% instances), conj (6; 0% instances), flat:name (6; 0% instances), appos (5; 0% instances), nsubj (5; 0% instances), obj (5; 0% instances), obl (4; 0% instances), advmod (3; 0% instances), case (1; 0% instances), root (1; 0% instances)
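As a consistency check, the relation counts above should add up to the total number of DET tokens, and each percentage appears to be the count divided by that total, rounded to a whole number. A small sketch under that assumption:

```python
det_tokens = 47602  # total DET tokens reported on this page

# Relation counts copied from the list above.
relation_counts = {
    "det": 46161, "det:poss": 1317, "fixed": 33, "mark": 33, "dep": 13,
    "nmod": 9, "conj": 6, "flat:name": 6, "appos": 5, "nsubj": 5,
    "obj": 5, "obl": 4, "advmod": 3, "case": 1, "root": 1,
}

# Every DET node has exactly one incoming relation, so the counts sum to the total.
assert sum(relation_counts.values()) == det_tokens

# Assumption: percentages are rounded to the nearest integer.
shares = {rel: round(100 * n / det_tokens) for rel, n in relation_counts.items()}
print(shares["det"], shares["det:poss"], shares["fixed"])
```

Under this rounding, any relation with fewer than about 238 instances shows as 0%.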

Parents of DET nodes belong to 14 different parts of speech: NOUN (37054; 78% instances), PROPN (9207; 19% instances), PRON (313; 1% instances), NUM (299; 1% instances), VERB (297; 1% instances), PART (132; 0% instances), ADV (130; 0% instances), ADJ (87; 0% instances), ADP (27; 0% instances), SYM (20; 0% instances), DET (18; 0% instances), X (16; 0% instances), CCONJ (1; 0% instances), no part of speech, i.e. the artificial root node (1; 0% instances)

47513 (100%) DET nodes are leaves.

37 (0%) DET nodes have one child.

42 (0%) DET nodes have two children.

10 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 8.
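The child-degree figures can be computed by counting, for every DET node, how many other nodes name it as their head. The sketch below assumes each sentence is already parsed into `(id, upos, head)` tuples; the function name `child_degree_histogram` and the toy sentence are hypothetical.

```python
from collections import Counter

def child_degree_histogram(sentences, upos):
    """Map number of children -> number of nodes with the given UPOS tag.

    sentences: list of sentences, each a list of (id, upos, head) tuples.
    """
    hist = Counter()
    for sent in sentences:
        # Count how often each node id is used as a head within the sentence.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tok_upos, _ in sent:
            if tok_upos == upos:
                hist[n_children[tok_id]] += 1
    return hist

# Hypothetical toy sentence: both DET nodes are leaves.
toy = [[(1, "DET", 2), (2, "NOUN", 3), (3, "VERB", 0), (4, "DET", 5), (5, "NOUN", 3)]]
print(child_degree_histogram(toy, "DET"))

# The four degree classes reported above cover all DET tokens.
assert 47513 + 37 + 42 + 10 == 47602
```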

Children of DET nodes are attached using 13 different relations: fixed (85; 53% instances), punct (15; 9% instances), flat:name (13; 8% instances), nmod (12; 7% instances), case (11; 7% instances), cc (7; 4% instances), conj (6; 4% instances), det (5; 3% instances), appos (3; 2% instances), acl:relcl (1; 1% instances), amod (1; 1% instances), nsubj (1; 1% instances), parataxis (1; 1% instances)

Children of DET nodes belong to 13 different parts of speech: NOUN (42; 26% instances), CCONJ (32; 20% instances), DET (18; 11% instances), PROPN (15; 9% instances), PUNCT (15; 9% instances), ADP (14; 9% instances), NUM (11; 7% instances), SCONJ (4; 2% instances), ADJ (3; 2% instances), PRON (3; 2% instances), ADV (2; 1% instances), PART (1; 1% instances), VERB (1; 1% instances)