
This page pertains to UD version 2.

Treebank Statistics: UD_Portuguese-DANTEStocks: POS Tags: PROPN

There are 1724 PROPN lemmas (19%), 1722 PROPN types (15%) and 11762 PROPN tokens (15%). Out of 16 observed tags, the rank of PROPN is: 2 in number of lemmas, 4 in number of types and 3 in number of tokens.

The 10 most frequent PROPN lemmas: petr4, #petr4, vale5, petrobras, #vale5, vale, @live_trade, petr3, bbas3, oibr4

The 10 most frequent PROPN types: petr4, #petr4, vale5, petrobras, #vale5, vale, @live_trade, petr3, bbas3, oibr4

The 10 most frequent ambiguous lemmas: #petr4 (PROPN 115, X 24), #vale5 (X 178, PROPN 63), vale (PROPN 6, NOUN 1), bolsa (PROPN 24, NOUN 2), #oibr4 (PROPN 5, X 1), #usim5 (PROPN 11, X 3), banco (NOUN 22, PROPN 3), $PETR3 (PROPN 29, X 3), #petrobras (PROPN 8, X 2), CPI (PROPN 21, NOUN 17)

The 10 most frequent ambiguous types: petr4 (PROPN 101, X 1), #petr4 (PROPN 115, X 24), #vale5 (X 178, PROPN 63), vale (VERB 31, PROPN 6, NOUN 1), bolsa (PROPN 24, NOUN 1), #oibr4 (PROPN 5, X 1), #usim5 (PROPN 11, X 3), banco (NOUN 7, PROPN 3), $PETR3 (PROPN 29, X 3), #petrobras (PROPN 8, X 2)

Morphology

The form / lemma ratio of PROPN is 0.998840 (the average of all parts of speech is 1.238049).
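As a quick sanity check, the ratio above is consistent with dividing the number of PROPN types by the number of PROPN lemmas reported earlier on this page. That this is how the figure is computed is an assumption, not something the page states; the variable names below are illustrative only.

```python
# Counts taken from the summary at the top of this page.
propn_types = 1722   # distinct PROPN word forms
propn_lemmas = 1724  # distinct PROPN lemmas

# Assumption: form / lemma ratio = types / lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # prints 0.998840
```

A ratio just below 1 means that almost every lemma appears with a single form, which fits the lists above: the most frequent lemmas and types are identical.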

The 1st highest number of forms (2) was observed with the lemma “@Bancotario”: @, @Bancotario.

The 2nd highest number of forms (1) was observed with the lemma “#”: #.

The 3rd highest number of forms (1) was observed with the lemma “#A”: #A.

PROPN occurs with 5 features: Typo (102; 1% instances), Number (6; 0% instances), Foreign (5; 0% instances), Gender (4; 0% instances), PronType (1; 0% instances)

PROPN occurs with 5 feature-value pairs: Foreign=Yes, Gender=Fem, Number=Sing, PronType=Art, Typo=Yes

PROPN occurs with 6 feature combinations. The most frequent feature combination is _, i.e. no features at all (11649 tokens). Examples: petr4, #petr4, vale5, petrobras, #vale5, vale, @live_trade, petr3, bbas3, oibr4

Relations

PROPN nodes are attached to their parents using 20 different relations: conj (3016; 26% instances), nmod (2874; 24% instances), nsubj (1573; 13% instances), vocative (895; 8% instances), appos (870; 7% instances), flat:name (668; 6% instances), parataxis (639; 5% instances), obj (569; 5% instances), obl (401; 3% instances), root (187; 2% instances), nsubj:pass (28; 0% instances), advcl (16; 0% instances), obl:agent (13; 0% instances), dislocated (3; 0% instances), xcomp (3; 0% instances), acl:relcl (2; 0% instances), discourse (2; 0% instances), acl (1; 0% instances), ccomp (1; 0% instances), reparandum (1; 0% instances)
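The percentages in the list above can be reproduced from the raw counts. The sketch below assumes that each percentage is the count divided by the total number of PROPN tokens, rounded to the nearest integer; the dictionary name is illustrative.

```python
# Relation counts copied from the list above.
relation_counts = {
    "conj": 3016, "nmod": 2874, "nsubj": 1573, "vocative": 895,
    "appos": 870, "flat:name": 668, "parataxis": 639, "obj": 569,
    "obl": 401, "root": 187, "nsubj:pass": 28, "advcl": 16,
    "obl:agent": 13, "dislocated": 3, "xcomp": 3, "acl:relcl": 2,
    "discourse": 2, "acl": 1, "ccomp": 1, "reparandum": 1,
}

# The counts add up to the number of PROPN tokens.
total = sum(relation_counts.values())
print(total)  # prints 11762

# Assumption: percentages are rounded to the nearest whole number.
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}
for rel in ("conj", "nmod", "nsubj"):
    print(rel, percentages[rel])  # conj 26, nmod 24, nsubj 13
```

Relations shown as 0% are not absent; their share is simply below 0.5% of all PROPN tokens.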

Parents of PROPN nodes belong to 12 different parts of speech (counting the artificial root, which has no tag, as one): PROPN (4765; 41% instances), VERB (3188; 27% instances), NOUN (2807; 24% instances), X (379; 3% instances), [root] (187; 2% instances), ADJ (133; 1% instances), SYM (119; 1% instances), ADV (81; 1% instances), PRON (46; 0% instances), NUM (40; 0% instances), AUX (9; 0% instances), INTJ (8; 0% instances)

4379 (37%) PROPN nodes are leaves.

3264 (28%) PROPN nodes have one child.

2456 (21%) PROPN nodes have two children.

1663 (14%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 28.
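The child-degree figures above could be recomputed from the treebank's CoNLL-U files along the following lines. This is a minimal sketch, not the script that generated this page: the function name `propn_child_degrees` and the sample sentence are hypothetical, and the code relies only on the standard CoNLL-U column layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).

```python
from collections import Counter

# Hypothetical sample sentence, not taken from the treebank.
SAMPLE = """\
# text = PETR4 e VALE5 sobem
1\tPETR4\tPETR4\tPROPN\t_\t_\t4\tnsubj\t_\t_
2\te\te\tCCONJ\t_\t_\t3\tcc\t_\t_
3\tVALE5\tVALE5\tPROPN\t_\t_\t1\tconj\t_\t_
4\tsobem\tsubir\tVERB\t_\t_\t0\troot\t_\t_
"""

def propn_child_degrees(conllu_text):
    """Map child degree -> number of PROPN nodes with that many children."""
    degrees = Counter()
    sentence = []  # (id, upos, head) for each basic token of the current sentence

    def flush():
        # Count how many tokens point to each head, then look up PROPN nodes.
        children = Counter(head for _, _, head in sentence)
        for tok_id, upos, _ in sentence:
            if upos == "PROPN":
                degrees[children[tok_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends a sentence
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        sentence.append((cols[0], cols[3], cols[6]))
    flush()  # handle a final sentence without a trailing blank line
    return degrees

print(propn_child_degrees(SAMPLE))  # Counter({1: 2})
```

In the sample, both PROPN nodes have exactly one child: PETR4 governs VALE5 via conj, and VALE5 governs the conjunction via cc.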

Children of PROPN nodes are attached using 28 different relations: punct (3937; 25% instances), conj (3128; 20% instances), nmod (2186; 14% instances), case (1923; 12% instances), det (1884; 12% instances), flat:name (670; 4% instances), appos (606; 4% instances), cc (426; 3% instances), parataxis (381; 2% instances), advmod (73; 0% instances), amod (47; 0% instances), discourse (47; 0% instances), vocative (46; 0% instances), cop (44; 0% instances), nsubj (43; 0% instances), acl (41; 0% instances), acl:relcl (19; 0% instances), orphan (19; 0% instances), mark (18; 0% instances), nummod (15; 0% instances), advcl (7; 0% instances), goeswith (6; 0% instances), list (5; 0% instances), dep (4; 0% instances), obl (3; 0% instances), csubj (1; 0% instances), nsubj:outer (1; 0% instances), reparandum (1; 0% instances)

Children of PROPN nodes belong to 16 different parts of speech: PROPN (4765; 31% instances), PUNCT (3937; 25% instances), ADP (1922; 12% instances), DET (1884; 12% instances), NUM (1174; 8% instances), SYM (669; 4% instances), CCONJ (418; 3% instances), NOUN (346; 2% instances), VERB (141; 1% instances), ADV (87; 1% instances), X (77; 0% instances), ADJ (54; 0% instances), AUX (49; 0% instances), PRON (29; 0% instances), INTJ (20; 0% instances), SCONJ (9; 0% instances)