This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

PROPN: proper noun

Definition

A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object. Names for the inhabitants of a place (such as Les Américains “The Americans”) should be tagged NOUN rather than PROPN, although the French data do not yet apply this consistently.

Examples


Treebank Statistics (UD_French)

There are 16554 PROPN lemmas (46%), 16554 PROPN types (35%) and 30746 PROPN tokens (8%). Out of 17 observed tags, the rank of PROPN is: 1 in number of lemmas, 1 in number of types and 6 in number of tokens.
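Counts like these can be reproduced from a treebank file in CoNLL-U format. The following is a minimal sketch, not the script that generated this page: it assumes that a "type" is a distinct word form (case-sensitive), that the percentages are shares of the treebank-wide totals, and that tied tags share a rank. The sample sentences and the names `read_tokens`, `tag_statistics` and `rank` are hypothetical.

```python
from collections import defaultdict

# Tiny hypothetical CoNLL-U sample (10 tab-separated columns per token line).
SAMPLE = (
    "1\tJean\tJean\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tvisite\tvisiter\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tParis\tParis\tPROPN\t_\t_\t2\tdobj\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
    "1\tParis\tParis\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tdort\tdormir\tVERB\t_\t_\t0\troot\t_\t_\n"
)


def read_tokens(conllu_text):
    """Yield (form, lemma, upos) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges such as "1-2"
        yield cols[1], cols[2], cols[3]


def tag_statistics(conllu_text):
    """Return {tag: (n_lemmas, n_types, n_tokens)}."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for form, lemma, upos in read_tokens(conllu_text):
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: a type is a distinct surface form
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), tokens[t]) for t in tokens}


def rank(stats, tag, index):
    """Rank of `tag` by column `index` (1 = largest; tied tags share a rank)."""
    return 1 + sum(1 for t in stats if stats[t][index] > stats[tag][index])


stats = tag_statistics(SAMPLE)
n_lemmas, n_types, n_tokens = stats["PROPN"]
total_tokens = sum(s[2] for s in stats.values())
print(f"PROPN: {n_lemmas} lemmas, {n_types} types, {n_tokens} tokens "
      f"({100 * n_tokens / total_tokens:.0f}% of tokens), "
      f"token rank {rank(stats, 'PROPN', 2)} of {len(stats)} tags")
```

To run this on a real treebank, the contents of the CoNLL-U file would be passed to `tag_statistics` in place of `SAMPLE`.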

The 10 most frequent PROPN lemmas: France, Paris, Europe, États-Unis, Jean, Maroc, État, la, Espagne, New

The 10 most frequent PROPN types: France, Paris, Europe, États-Unis, Jean, Maroc, État, la, Espagne, New

The 10 most frequent ambiguous lemmas: État (PROPN 80, NOUN 48), la (PROPN 3, NOUN 1, ADV 1), New (PROPN 65, X 1), York (PROPN 64, X 1), le (DET 43244, PRON 882, PROPN 3), The (PROPN 42, X 1), de (ADP 31526, PROPN 32), sud (NOUN 119, ADJ 2, PROPN 1), saint (NOUN 30, PROPN 10, ADJ 8), ONU (PROPN 33, X 1)

The 10 most frequent ambiguous types: État (PROPN 80, NOUN 45), la (DET 9727, PRON 110, PROPN 3, NOUN 1, ADV 1), New (PROPN 65, ADJ 2, X 1), York (PROPN 64, X 1), Nord (PROPN 54, NOUN 10), le (DET 13837, PRON 287, PROPN 3), The (DET 42, PROPN 42, X 1), de (ADP 26519, DET 434, PROPN 32), sud (NOUN 94, ADJ 2, PROPN 1), saint (PROPN 10, NOUN 8, ADJ 4)

Morphology

The form / lemma ratio of PROPN is 1.000000 (the average of all parts of speech is 1.307036).

The 1st highest number of forms (2) was observed with the lemma “Jésus-Christ”: J.-C., Jésus-Christ.

The 2nd highest number of forms (1) was observed with the lemma “’upa’upa”: ‘upa’upa.

The 3rd highest number of forms (1) was observed with the lemma “02”: 02.
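The morphology figures above can be sketched as follows. This is an illustration under assumptions, not the generator's actual method: the form / lemma ratio is taken to be the number of distinct forms divided by the number of distinct lemmas, and lemmas are ranked by how many distinct forms they have. The token list and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (form, lemma) pairs for PROPN tokens; not taken from UD_French.
PROPN_TOKENS = [
    ("J.-C.", "Jésus-Christ"),
    ("Jésus-Christ", "Jésus-Christ"),
    ("Paris", "Paris"),
    ("Paris", "Paris"),
]


def forms_per_lemma(pairs):
    """Map each lemma to the set of distinct forms observed with it."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return forms


def form_lemma_ratio(pairs):
    """Distinct forms divided by distinct lemmas (assumed definition)."""
    n_types = len({form for form, _ in pairs})
    n_lemmas = len({lemma for _, lemma in pairs})
    return n_types / n_lemmas


forms = forms_per_lemma(PROPN_TOKENS)
ratio = form_lemma_ratio(PROPN_TOKENS)

# Lemmas with the most forms first; ties are broken alphabetically.
ranked = sorted(forms.items(), key=lambda kv: (-len(kv[1]), kv[0]))
for lemma, lemma_forms in ranked:
    print(lemma, len(lemma_forms), sorted(lemma_forms))
print(f"form / lemma ratio: {ratio:.6f}")
```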

PROPN occurs with 2 features: fr-feat/Gender (1; 0% instances), fr-feat/Number (1; 0% instances)

PROPN occurs with 2 feature-value pairs: Gender=Masc, Number=Sing

PROPN occurs with 2 feature combinations. The most frequent feature combination is _ (30745 tokens). Examples: France, Paris, Europe, États-Unis, Jean, Maroc, État, la, Espagne, New

Relations

PROPN nodes are attached to their parents using 25 different relations: fr-dep/nmod (12655; 41% instances), fr-dep/name (6562; 21% instances), fr-dep/appos (3498; 11% instances), fr-dep/nsubj (3382; 11% instances), fr-dep/conj (2683; 9% instances), fr-dep/dobj (698; 2% instances), fr-dep/amod (266; 1% instances), fr-dep/nsubjpass (241; 1% instances), fr-dep/xcomp (183; 1% instances), fr-dep/root (180; 1% instances), fr-dep/det (162; 1% instances), fr-dep/compound (64; 0% instances), fr-dep/case (62; 0% instances), fr-dep/nummod (44; 0% instances), fr-dep/dep (21; 0% instances), fr-dep/nmod:poss (9; 0% instances), fr-dep/acl:relcl (8; 0% instances), fr-dep/parataxis (6; 0% instances), fr-dep/advmod (5; 0% instances), fr-dep/acl (4; 0% instances), fr-dep/ccomp (4; 0% instances), fr-dep/vocative (4; 0% instances), fr-dep/advcl (3; 0% instances), fr-dep/cc (1; 0% instances), fr-dep/mwe (1; 0% instances)

Parents of PROPN nodes belong to 16 different parts of speech: NOUN (11563; 38% instances), PROPN (11429; 37% instances), VERB (7007; 23% instances), ADJ (267; 1% instances), ROOT (180; 1% instances), PRON (172; 1% instances), NUM (54; 0% instances), X (25; 0% instances), SYM (13; 0% instances), ADV (11; 0% instances), ADP (9; 0% instances), INTJ (5; 0% instances), PUNCT (5; 0% instances), DET (3; 0% instances), AUX (2; 0% instances), CONJ (1; 0% instances)
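The two tallies above (incoming relation and parent part of speech) could be computed along the following lines. This is a hedged sketch: the sentence is invented, the row layout `(id, form, upos, head, deprel)` is a simplification of the CoNLL-U columns, and labelling the parent of a head-0 node as ROOT is an assumption based on how ROOT appears in the list above.

```python
from collections import Counter

# One hypothetical sentence as (id, form, upos, head, deprel); head 0 is the root.
SENTENCE = [
    (1, "Jean", "PROPN", 3, "nsubj"),
    (2, "Dupont", "PROPN", 1, "name"),
    (3, "visite", "VERB", 0, "root"),
    (4, "la", "DET", 5, "det"),
    (5, "ville", "NOUN", 3, "dobj"),
    (6, "de", "ADP", 7, "case"),
    (7, "Paris", "PROPN", 5, "nmod"),
]


def attachments(sentences, tag="PROPN"):
    """Count incoming relations and parent tags for nodes tagged `tag`."""
    relations, parent_tags = Counter(), Counter()
    for sentence in sentences:
        upos_by_id = {tid: upos for tid, _, upos, _, _ in sentence}
        for _, _, upos, head, deprel in sentence:
            if upos != tag:
                continue
            relations[deprel] += 1
            parent_tags["ROOT" if head == 0 else upos_by_id[head]] += 1
    return relations, parent_tags


def with_percentages(counter):
    """Return (key, count, rounded percent) triples, most frequent first."""
    total = sum(counter.values())
    return [(key, n, round(100 * n / total)) for key, n in counter.most_common()]


relations, parent_tags = attachments([SENTENCE])
for label, n, pct in with_percentages(relations):
    print(f"{label} ({n}; {pct}% instances)")
for label, n, pct in with_percentages(parent_tags):
    print(f"{label} ({n}; {pct}% instances)")
```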

10533 (34%) PROPN nodes are leaves.

9108 (30%) PROPN nodes have one child.

6083 (20%) PROPN nodes have two children.

5022 (16%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 68.
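The four buckets above (leaves, one child, two children, three or more) add up to the 30746 PROPN tokens reported earlier. A distribution of this kind might be computed as in the sketch below, which assumes each sentence is available as simplified `(id, upos, head)` rows; the sample sentence and the function name are hypothetical.

```python
from collections import Counter

# One hypothetical sentence as (id, upos, head); head 0 marks the root.
SENTENCE = [
    (1, "PROPN", 3),  # Jean
    (2, "PROPN", 1),  # Dupont, attached to Jean
    (3, "VERB", 0),   # visite
    (4, "DET", 5),    # la
    (5, "NOUN", 3),   # ville
    (6, "ADP", 7),    # de, attached to Paris
    (7, "PROPN", 5),  # Paris
]


def child_degree_buckets(sentences, tag="PROPN"):
    """Bucket nodes tagged `tag` by number of children (0, 1, 2, 3 = three or more)."""
    buckets = Counter()
    max_degree = 0
    for sentence in sentences:
        n_children = Counter(head for _, _, head in sentence)
        for tid, upos, _ in sentence:
            if upos != tag:
                continue
            degree = n_children[tid]
            buckets[min(degree, 3)] += 1
            max_degree = max(max_degree, degree)
    return buckets, max_degree


buckets, max_degree = child_degree_buckets([SENTENCE])
total = sum(buckets.values())
for degree in sorted(buckets):
    label = "3+" if degree == 3 else str(degree)
    print(f"{label} children: {buckets[degree]} ({100 * buckets[degree] / total:.0f}%)")
print("highest child degree:", max_degree)
```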

Children of PROPN nodes are attached using 30 different relations: fr-dep/case (13081; 31% instances), fr-dep/name (6558; 16% instances), fr-dep/det (6265; 15% instances), fr-dep/punct (4816; 11% instances), fr-dep/conj (2785; 7% instances), fr-dep/nmod (2529; 6% instances), fr-dep/cc (1606; 4% instances), fr-dep/appos (1414; 3% instances), fr-dep/amod (891; 2% instances), fr-dep/acl (525; 1% instances), fr-dep/acl:relcl (435; 1% instances), fr-dep/nummod (406; 1% instances), fr-dep/compound (237; 1% instances), fr-dep/advmod (170; 0% instances), fr-dep/nsubj (160; 0% instances), fr-dep/cop (145; 0% instances), fr-dep/nmod:poss (47; 0% instances), fr-dep/expl (25; 0% instances), fr-dep/dep (21; 0% instances), fr-dep/neg (14; 0% instances), fr-dep/mark (12; 0% instances), fr-dep/advcl (11; 0% instances), fr-dep/dobj (9; 0% instances), fr-dep/parataxis (8; 0% instances), fr-dep/ccomp (6; 0% instances), fr-dep/mwe (3; 0% instances), fr-dep/aux (2; 0% instances), fr-dep/xcomp (2; 0% instances), fr-dep/discourse (1; 0% instances), fr-dep/nsubjpass (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: ADP (12985; 31% instances), PROPN (11429; 27% instances), DET (6172; 15% instances), PUNCT (4815; 11% instances), NOUN (2002; 5% instances), CONJ (1530; 4% instances), VERB (1128; 3% instances), ADJ (728; 2% instances), NUM (702; 2% instances), ADV (272; 1% instances), PRON (164; 0% instances), X (156; 0% instances), SYM (46; 0% instances), PART (30; 0% instances), SCONJ (22; 0% instances), AUX (2; 0% instances), INTJ (2; 0% instances)

