Treebank Statistics: UD_English-EWT: POS Tags: PROPN
There are 5207 PROPN lemmas (28%), 5340 PROPN types (24%) and 16559 PROPN tokens (6%).
Out of 17 observed tags, the rank of PROPN is: 2 in number of lemmas, 2 in number of types and 8 in number of tokens.
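As a rough illustration of how per-tag lemma, type and token counts like these could be derived, here is a minimal sketch that reads CoNLL-U text. The sample sentences and the function name `tag_counts` are hypothetical, and forms are compared case-sensitively, which is an assumption about how types are counted rather than something this page states.

```python
from collections import defaultdict

# Tiny hypothetical CoNLL-U sample (not taken from the treebank).
SAMPLE = """\
1\tJohn\tJohn\tPROPN\tNNP\tNumber=Sing\t2\tnsubj\t_\t_
2\tvisited\tvisit\tVERB\tVBD\t_\t0\troot\t_\t_
3\tIraq\tIraq\tPROPN\tNNP\tNumber=Sing\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_

1\tjohn\tJohn\tPROPN\tNNP\tNumber=Sing\t2\tnsubj\t_\t_
2\tleft\tleave\tVERB\tVBD\t_\t0\troot\t_\t_
"""

def tag_counts(conllu_text):
    """Return {upos: (distinct lemmas, distinct forms, tokens)}."""
    lemmas = defaultdict(set)
    forms = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        forms[upos].add(form)  # assumption: case-sensitive types
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(forms[t]), tokens[t]) for t in tokens}

def rank(counts, tag, column):
    """Rank of `tag` (1 = highest) by the given column: 0 lemmas, 1 types, 2 tokens."""
    return 1 + sum(1 for c in counts.values() if c[column] > counts[tag][column])

counts = tag_counts(SAMPLE)
print(counts["PROPN"])           # (lemmas, types, tokens) for PROPN
print(rank(counts, "PROPN", 2))  # rank of PROPN by token count
```

In the sample, `John` and `john` count as two types of one lemma, which is the kind of variation behind the gap between the lemma and type counts above.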
The 10 most frequent PROPN lemmas: Bush, US, al, Iraq, enron, State, Iran, China, September, Qaeda
The 10 most frequent PROPN types: bush, US, al, Iraq, enron, Iran, China, states, John, Qaeda
The 10 most frequent ambiguous lemmas: enron (PROPN 7, X 5), president (NOUN 30, PROPN 1), American (ADJ 88, PROPN 54), mark (NOUN 13, VERB 12, PROPN 2), North (PROPN 34, ADJ 2), god (PROPN 5, NOUN 4), south (NOUN 10, ADV 6, ADJ 1, PROPN 1), street (NOUN 29, PROPN 1), Iraqi (ADJ 52, PROPN 18), West (PROPN 18, ADJ 3)
The 10 most frequent ambiguous types: al (PROPN 67, X 1), states (NOUN 10, PROPN 6, VERB 5), president (NOUN 24, PROPN 1), may (AUX 221, PROPN 1), google (PROPN 3, VERB 2), mark (NOUN 10, VERB 6, PROPN 2), north (NOUN 6, ADV 5, ADJ 2, PROPN 2), god (PROPN 5, NOUN 2), world (NOUN 131, PROPN 4), house (NOUN 60, PROPN 3, VERB 1)
- al
- states
- president
- may
  - AUX 221: Adobe Acrobat Reader 4.0 may be downloaded for FREE from www.adobe.com .
  - PROPN 1: Problem is , for some reason , the visa process took longer than it should , thus I missed school this semester ( visa was issued to me about 25 days after school started so I could n’t attend ) , now I no longer want to go into that school ( because they only would accept me again on September of 2012 ) , I found a school that accepted me for may 2012 , can I use the same visa that was issued to me ?
- mark
- north
  - NOUN 6: Bilboa on the north coast , Pamplona and the very famous Guernica .
  - ADV 5: There ‘s a Miramar in Florida , just north of Miami .
  - ADJ 2: There s a reason why Frank mcclelland was named best chef of the north east reigon .
  - PROPN 2: I have never been anywhere out side my home town Charlotte north Carolina please help !!!!
- god
  - PROPN 5: oh god is there an agenda .
  - NOUN 2: To Paul McCartney whose company ‘s logo was a person toying with the planets as if he was a god , and who was being very much deluded in his ego trip by the fact that he was made “ Sir “ ( when in England even the road sweeper is made Sir , as long as he produces money for the nation ) , GOD provided cancer to the wife .
- world
- house
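The ambiguous types listed above are word forms observed with more than one tag. A minimal sketch of how such forms could be detected from (form, tag) pairs follows; the function name `ambiguous_forms` and the toy data are hypothetical, and the case-sensitive comparison is an assumption.

```python
from collections import Counter, defaultdict

def ambiguous_forms(tagged_tokens):
    """Map each form seen with two or more UPOS tags to its per-tag counts."""
    by_form = defaultdict(Counter)
    for form, upos in tagged_tokens:
        by_form[form][upos] += 1  # assumption: forms are compared case-sensitively
    return {form: dict(tags) for form, tags in by_form.items() if len(tags) > 1}

# Hypothetical toy data, not drawn from the treebank.
toy = [
    ("may", "AUX"), ("may", "AUX"), ("may", "PROPN"),
    ("god", "PROPN"), ("god", "NOUN"),
    ("Iraq", "PROPN"), ("Iraq", "PROPN"),
]
result = ambiguous_forms(toy)
for form, tags in result.items():
    print(form, tags)
```

Forms that only ever carry one tag, such as `Iraq` in the toy data, are filtered out.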
Morphology
The form / lemma ratio of PROPN is 1.025543 (the average of all parts of speech is 1.237686).
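The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. The short check below reproduces that arithmetic; reading "forms" as the type count is an inference from the numbers, not something the page states, and the variable names are illustrative.

```python
# Counts reported at the top of this page.
propn_types = 5340   # distinct PROPN word forms
propn_lemmas = 5207  # distinct PROPN lemmas

# Assumption: "forms" in the ratio refers to the type count.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")
```

A ratio this close to 1 means most proper-noun lemmas appear in a single form, in contrast to the higher average across all parts of speech.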
The 1st highest number of forms (4) was observed with the lemma “Friday”: Fri, Fri., Fridays, friday.
The 2nd highest number of forms (4) was observed with the lemma “March”: MARCH, Mar, March, Marches.
The 3rd highest number of forms (4) was observed with the lemma “McDonald”: Mc.Donald, McDonal, mc, mcdonald.
PROPN occurs with 4 features: Number (16089; 97% instances), Abbr (121; 1% instances), Typo (85; 1% instances), Style (2; 0% instances)
PROPN occurs with 6 feature-value pairs: Abbr=Yes, Number=Plur, Number=Ptan, Number=Sing, Style=Expr, Typo=Yes
PROPN occurs with 10 feature combinations.
The most frequent feature combination is Number=Sing (15212 tokens).
Examples: bush, US, al, Iraq, enron, Iran, China, Qaeda, John, india
Relations
PROPN nodes are attached to their parents using 26 different relations: compound (3167; 19% instances), flat (2100; 13% instances), nmod (1985; 12% instances), nsubj (1982; 12% instances), obl (1692; 10% instances), root (1646; 10% instances), conj (1030; 6% instances), appos (715; 4% instances), obj (628; 4% instances), nmod:poss (482; 3% instances), list (282; 2% instances), obl:agent (140; 1% instances), vocative (133; 1% instances), nsubj:pass (124; 1% instances), obl:unmarked (81; 0% instances), iobj (77; 0% instances), parataxis (75; 0% instances), xcomp (72; 0% instances), nmod:unmarked (69; 0% instances), ccomp (32; 0% instances), advcl (28; 0% instances), acl:relcl (8; 0% instances), acl (4; 0% instances), discourse (3; 0% instances), csubj (2; 0% instances), reparandum (2; 0% instances)
Parents of PROPN nodes belong to 13 different parts of speech: PROPN (5940; 36% instances), VERB (4250; 26% instances), NOUN (4058; 25% instances), ROOT (1646; 10% instances), ADJ (372; 2% instances), ADV (105; 1% instances), PRON (77; 0% instances), NUM (45; 0% instances), AUX (26; 0% instances), INTJ (18; 0% instances), SYM (14; 0% instances), DET (7; 0% instances), ADP (1; 0% instances)
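The relation and parent part-of-speech tallies above can be approximated with a small script over CoNLL-U data. The sketch below is illustrative only: the sample sentences and the names `sentences` and `attachment_stats` are hypothetical, and the label `ROOT` for nodes attached to the artificial root is an assumed placeholder.

```python
from collections import Counter

# Tiny hypothetical CoNLL-U sample (not taken from the treebank).
SAMPLE = """\
# text = John Smith visited Iraq .
1\tJohn\tJohn\tPROPN\tNNP\tNumber=Sing\t3\tnsubj\t_\t_
2\tSmith\tSmith\tPROPN\tNNP\tNumber=Sing\t1\tflat\t_\t_
3\tvisited\tvisit\tVERB\tVBD\t_\t0\troot\t_\t_
4\tIraq\tIraq\tPROPN\tNNP\tNumber=Sing\t3\tobj\t_\t_
5\t.\t.\tPUNCT\t.\t_\t3\tpunct\t_\t_

# text = Iraq
1\tIraq\tIraq\tPROPN\tNNP\tNumber=Sing\t0\troot\t_\t_
"""

def sentences(text):
    """Yield each sentence as {token id: (upos, head id, deprel)}."""
    sent = {}
    for line in text.splitlines():
        if not line.strip():
            if sent:
                yield sent
                sent = {}
            continue
        if line.startswith("#"):
            continue  # comment line
        cols = line.split("\t")
        if cols[0].isdigit():  # skip multiword ranges and empty nodes
            sent[int(cols[0])] = (cols[3], int(cols[6]), cols[7])
    if sent:
        yield sent

def attachment_stats(text, tag="PROPN"):
    """Count incoming relations and parent UPOS for nodes tagged `tag`."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences(text):
        for upos, head, deprel in sent.values():
            if upos != tag:
                continue
            relations[deprel] += 1
            # head 0 is the artificial root; "ROOT" is an assumed label
            parent_pos[sent[head][0] if head else "ROOT"] += 1
    return relations, parent_pos

relations, parent_pos = attachment_stats(SAMPLE)
print(relations.most_common())
print(parent_pos.most_common())
```

Because every node attached by `root` has the artificial root as its parent, the two tallies agree for that entry (1646 in both lists above).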
6434 (39%) PROPN nodes are leaves.
4997 (30%) PROPN nodes have one child.
2802 (17%) PROPN nodes have two children.
2326 (14%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 18.
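The child-degree figures above partition all PROPN tokens by their number of dependents. The sketch below first checks that the four reported counts sum to the 16559 PROPN tokens, then shows one way such a histogram could be computed for a single sentence. The function name `child_degree_histogram` and the toy rows are hypothetical.

```python
from collections import Counter

# Counts reported above; they should cover every PROPN token exactly once.
reported = {"0": 6434, "1": 4997, "2": 2802, "3+": 2326}
total = sum(reported.values())
print(total)

def child_degree_histogram(rows, tag="PROPN"):
    """rows: (id, upos, head) triples for ONE sentence.

    Returns a Counter mapping number of children to number of `tag` nodes.
    """
    n_children = Counter(head for _, _, head in rows)
    return Counter(n_children[node_id] for node_id, upos, _ in rows if upos == tag)

# Hypothetical sentence: "John Smith visited Iraq ."
rows = [
    (1, "PROPN", 3),  # John  <- visited
    (2, "PROPN", 1),  # Smith <- John (flat)
    (3, "VERB", 0),   # visited <- artificial root
    (4, "PROPN", 3),  # Iraq  <- visited
    (5, "PUNCT", 3),  # .     <- visited
]
hist = child_degree_histogram(rows)
print(sorted(hist.items()))
```

For a whole treebank, the per-sentence histograms would simply be added together.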
Children of PROPN nodes are attached using 35 different relations: case (4575; 23% instances), punct (2905; 15% instances), flat (2105; 11% instances), compound (1942; 10% instances), det (1421; 7% instances), conj (1182; 6% instances), amod (1122; 6% instances), cc (735; 4% instances), appos (635; 3% instances), nummod (594; 3% instances), list (585; 3% instances), nmod (513; 3% instances), cop (198; 1% instances), nsubj (186; 1% instances), advmod (160; 1% instances), parataxis (135; 1% instances), acl:relcl (124; 1% instances), nmod:poss (116; 1% instances), nmod:unmarked (84; 0% instances), goeswith (77; 0% instances), discourse (53; 0% instances), acl (45; 0% instances), mark (42; 0% instances), obl (36; 0% instances), aux (30; 0% instances), cc:preconj (27; 0% instances), advcl (11; 0% instances), nmod:desc (8; 0% instances), obl:unmarked (6; 0% instances), expl (5; 0% instances), advcl:relcl (4; 0% instances), orphan (3; 0% instances), reparandum (3; 0% instances), vocative (2; 0% instances), det:predet (1; 0% instances)
Children of PROPN nodes belong to 17 different parts of speech: PROPN (5940; 30% instances), ADP (3967; 20% instances), PUNCT (2905; 15% instances), DET (1427; 7% instances), ADJ (1127; 6% instances), NOUN (1005; 5% instances), NUM (891; 5% instances), CCONJ (727; 4% instances), PART (586; 3% instances), VERB (324; 2% instances), AUX (230; 1% instances), PRON (178; 1% instances), ADV (152; 1% instances), X (87; 0% instances), SYM (62; 0% instances), SCONJ (34; 0% instances), INTJ (28; 0% instances)