Treebank Statistics: UD_Portuguese-GSD: POS Tags: X
There are 24 X lemmas (0%), 160 X types (0%) and 403 X tokens (0%).
Out of 16 observed tags, the rank of X is: 12 in number of lemmas, 8 in number of types and 16 in number of tokens.
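Counts of lemmas, types and tokens like those above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the tool that generated this page: `SAMPLE` is a hand-made fragment (not taken from the treebank), `upos_counts` is a hypothetical helper name, and it assumes that types are compared case-sensitively.

```python
# Tiny hand-made CoNLL-U fragment (hypothetical, not taken from the treebank).
SAMPLE = """\
1\tdisso\t_\tX\t_\t_\t2\tnmod\t_\t_
2\tfalou\tfalar\tVERB\t_\t_\t0\troot\t_\t_
3\tetc\t_\tX\t_\t_\t2\tdep\t_\t_
4\tdisso\t_\tX\t_\t_\t2\tnmod\t_\t_
"""

def upos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, forms, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:      # column 4 is UPOS
            forms.add(cols[1])   # column 2 is FORM (assumed case-sensitive)
            lemmas.add(cols[2])  # column 3 is LEMMA
            tokens += 1
    return len(lemmas), len(forms), tokens

print(upos_counts(SAMPLE, "X"))  # (1, 2, 3)
```

In the sample, the three X tokens share the lemma "_" and have two distinct forms, which mirrors how most X tokens in this treebank carry the placeholder lemma "_".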
The 10 most frequent X lemmas: _, dele, Y, art, avant, best, center, di, food, free
The 10 most frequent X types: disso, deles, delas, dele, do, +, etc, @, comigo, nele
The 10 most frequent ambiguous lemmas: _ (PROPN 26803, ADP 7821, PRON 6131, DET 3765, NOUN 3010, NUM 2377, AUX 1984, CCONJ 1516, PUNCT 1272, VERB 1077, SYM 904, ADJ 597, PART 561, X 379, ADV 191, SCONJ 3), art (NOUN 1, X 1), di (PROPN 1, X 1), of (PROPN 15, ADP 2, X 1), off (NOUN 1, X 1), spin (NOUN 1, X 1)
The 10 most frequent ambiguous types: disso (X 58, ADP 2), dele (X 16, PRON 2), + (X 10, PUNCT 8, PROPN 2), etc (X 10, ADV 1), @ (X 9, PROPN 1, PUNCT 1), comigo (X 9, NOUN 1), no (X 7, PRON 2), pelo (X 7, NOUN 1), desse (ADP 30, X 3, VERB 1), desses (ADP 15, X 4)
- disso
- dele
- +
- X 10: 10 dias sem custos ** Após este período , custo de R $ 0,31 + imp .
- PUNCT 8: E se você tem uma memória melhor de o que a minha , poderá decorar uma dezena de atalhos de teclado : basta teclar Ctrl + Alt + ?
- PROPN 2: A partida será exibida a o vivo em a ESPN + ( ex - ESPN HD ) , em a TV fechada , e em o Esporte Interativo , canal aberto ( em São Paulo , sintonizado por o 36 UHF ) .
- etc
- @
- X 9: ( 82 ) 8883-7564 ou com a filha de a dona de casa por o simone.bavaroski @ gmail.com .
- PROPN 1: O advogado de o meia , Sascha Beumer , decidiu empreender ações legais depois que um desconhecido , através de a conta “ @ PiratenOnline “ , assegurou , durante a partida entre Alemanha e Dinamarca , que o jogador não é alemão e criticou seu nome , de origem muçulmana .
- PUNCT 1: @ jabisbusqueti É COM IMENSO PESAR QUE FIQUEI SABENDO DE O FALECIMENTO DE O AMIGO ABDIEL – EX ASSESSOR DEP .
- comigo
- no
- pelo
- desse
- desses
Morphology
The form / lemma ratio of X is 6.666667 (the average of all parts of speech is 2.236183).
The 1st highest number of forms (139) was observed with the lemma “_”: #, &, +, @, Amazon.com, Destes, Flyscoot.com, GameSpot.com, Neles, OBS., T, UltimoInstante, a, a_0, a_1, a_2, a_3, a_i, a_n, amaralcarvalho.org.br, ao, aos, art, atributos.Alucard, b, cdots, comigo, conosco, consigo, contato@cinedireitoshumanos.org.br, contigo, cpae@unesc.net, d, da, daquelas, daquele, daqueles, das, dela, delas, dele, deles, denunciapropaganda@tre-rj.jus.br, dessa, dessas, desse, desses, desta, destas, deste, disso, disto, do, dos, durvalorlato, e, eletrônicowww.cespe.unb.br/concursos/pc_al_12, etc, ex, fake, g1.globo.com/economia, g1.globo.com/ma, g1.globo.com/para, g1.globo.com/piaui, g1.globo.com/politica, g1.globo.com/ribeirao, g1.globo.com/vanguarda, gmail.com, http://m.goal.com, http://t.co/HmrlNAqd, http://www.cmgww.com/stars/baker/about/biography.html, http://www.portal-gestao.com/financas/folhas-de-calculo.html, i, k, m, n, na, naquilo, nela, nelas, nele, nesta, neste, nisso, no, num, o, offs, ouvidoria@imepi.pi.gov.br, p, p.e., pelo, planeta1@sercomtel.com.br, play, poupatemposp, prev, que, r., simone.bavaroski, sum_, up, usopera.com, v1, v2, vm, www.anac.gov.br., www.barracaodosamba.com, www.centropaulasouza.sp.gov.br, www.cotec.unimontes.br, www.detran.rj.gov.br, www.edraaeronautica.com.br, www.goobec.com.br, www.informalcool.org.br, www.ingresso.com, www.ipem.rj.gov.br., www.planetaeducacao.com.br, www.planexcon.com.br, www.receita.fazenda.gov.br, www.saocaetanodosul.sp.gov.br, www.submarino.com.br, www.timedoemprego.sp.gov.br, www.universa.org.br, www.valeviagemcvc.com.br, www.vestibulinhoetec.com.br, x, à, àquela, àqueles, às, λ1, λ2, λm, الاذكار, مشرق, ☎, 天台, 日, 禅, 莲.
The 2nd highest number of forms (1) was observed with the lemma “Y”: \epsilon=\epsilon_{0}.
The 3rd highest number of forms (1) was observed with the lemma “art”: art.
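The form / lemma ratio reported in this section is consistent with dividing the number of X types by the number of X lemmas given at the top of the page. A minimal check, assuming that this is how the ratio is defined:

```python
# Figures reported on this page for the X tag.
x_types = 160   # distinct word forms tagged X
x_lemmas = 24   # distinct lemmas tagged X

# Assumption: the ratio is simply types divided by lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")  # 6.666667
```

The ratio is high because a single placeholder lemma, "_", accounts for 139 of the 160 forms.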
X occurs with 3 features: Gender (6; 1% instances), Number (6; 1% instances), ExtPos (4; 1% instances)
X occurs with 4 feature-value pairs: ExtPos=NOUN, Gender=Fem, Gender=Masc, Number=Sing
X occurs with 5 feature combinations.
The most frequent feature combination is _ (394 tokens).
Examples: disso, deles, delas, dele, do, +, etc, @, comigo, nele
Relations
X nodes are attached to their parents using 19 different relations: nmod (230; 57% instances), appos (35; 9% instances), conj (27; 7% instances), fixed (25; 6% instances), flat (23; 6% instances), case (10; 2% instances), parataxis (9; 2% instances), dep (8; 2% instances), flat:foreign (8; 2% instances), cc (6; 1% instances), obj (4; 1% instances), root (4; 1% instances), amod (3; 1% instances), nsubj (3; 1% instances), iobj (2; 0% instances), mark (2; 0% instances), obl (2; 0% instances), ccomp (1; 0% instances), nsubj:pass (1; 0% instances)
Parents of X nodes belong to 11 different parts of speech: NOUN (113; 28% instances), VERB (101; 25% instances), ADV (59; 15% instances), X (41; 10% instances), PRON (33; 8% instances), ADJ (20; 5% instances), PROPN (18; 4% instances), NUM (7; 2% instances), ADP (4; 1% instances), (4; 1% instances), SYM (3; 1% instances)
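The relation and parent-tag tallies above can be sketched from CoNLL-U data as follows. This is an illustrative sketch under stated assumptions, not the script behind this page: `SAMPLE` is hand-made, `attachment_stats` is a hypothetical helper, and the label "(root)" for nodes whose head is 0 is a choice made here. That choice reflects one plausible reading of the unlabeled parent entry above, whose count (4) equals the number of X nodes attached as root.

```python
from collections import Counter

# Two hand-made sentences (hypothetical, not taken from the treebank).
SAMPLE = """\
1\tfalou\tfalar\tVERB\t_\t_\t0\troot\t_\t_
2\tdisso\t_\tX\t_\t_\t1\tnmod\t_\t_
3\tetc\t_\tX\t_\t_\t2\tconj\t_\t_

1\t@\t_\tX\t_\t_\t0\troot\t_\t_
"""

def attachment_stats(conllu_text, upos):
    """Tally DEPREL and parent UPOS for every node tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    # Sentences are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Index syntactic words by ID; skip ranges (1-2) and empty nodes (1.1).
        words = {r[0]: r for r in rows if r[0].isdigit()}
        for r in words.values():
            if r[3] != upos:
                continue
            relations[r[7]] += 1          # column 8 is DEPREL
            parent = words.get(r[6])      # column 7 is HEAD; "0" has no entry
            parent_tags[parent[3] if parent else "(root)"] += 1
    return relations, parent_tags

relations, parent_tags = attachment_stats(SAMPLE, "X")
print(relations.most_common())
print(parent_tags.most_common())
```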
253 (63%) X nodes are leaves.
93 (23%) X nodes have one child.
30 (7%) X nodes have two children.
27 (7%) X nodes have three or more children.
The highest child degree of an X node is 38.
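As a consistency check, the four child-degree groups should add up to the 403 X tokens reported at the top of the page, and the percentages should follow from rounding each share. A short sketch using only the figures on this page (the dictionary keys are labels chosen here):

```python
# Child-degree counts for X nodes as reported on this page.
degree_counts = {
    "leaves": 253,
    "one child": 93,
    "two children": 30,
    "three or more children": 27,
}

total = sum(degree_counts.values())
# Share of each group, rounded to whole percent.
shares = {label: round(100 * n / total) for label, n in degree_counts.items()}

print(total)   # 403
print(shares)  # {'leaves': 63, 'one child': 23, 'two children': 7, 'three or more children': 7}
```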
Children of X nodes are attached using 22 different relations: punct (91; 32% instances), acl:relcl (46; 16% instances), flat (30; 11% instances), case (26; 9% instances), nmod (15; 5% instances), conj (14; 5% instances), det (14; 5% instances), flat:foreign (8; 3% instances), cc (7; 2% instances), appos (5; 2% instances), acl (4; 1% instances), amod (4; 1% instances), advmod (3; 1% instances), cop (3; 1% instances), nsubj (3; 1% instances), nummod (3; 1% instances), dep (1; 0% instances), det:poss (1; 0% instances), flat:name (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)
Children of X nodes belong to 12 different parts of speech: PUNCT (91; 32% instances), VERB (52; 18% instances), X (41; 15% instances), ADP (24; 9% instances), NOUN (20; 7% instances), DET (16; 6% instances), PROPN (13; 5% instances), CCONJ (8; 3% instances), NUM (6; 2% instances), ADJ (4; 1% instances), ADV (4; 1% instances), AUX (3; 1% instances)