This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

PROPN: proper noun

Definition

A proper noun is a noun that is the name of a specific individual, place, or object. Ukrainian proper nouns are always written with an initial uppercase letter. Note that the names of days of the week, months, and languages, as well as adjectives derived from geographical names, are not capitalized (unlike in English) and are not considered proper nouns.

Single-word named entities should be tagged PROPN even if they originate from a common noun (Заєць, Бук) or an adjective (Довгополий, Масна). Even if they were originally adjectives and inflect according to adjectival paradigms, they behave syntactically as nouns. For instance, Масна (the feminine form of the surname Масний) is originally the feminine form of the adjective масний “fatty”, but as an anthroponym it is a noun. It denotes a concrete person (rather than a property of somebody or something), and its gender is limited to feminine and masculine (while adjectives have forms in all three genders).

Personal names are typically treated as a sequence of proper nouns (one or more given names and one or more surnames). If the name contains prepositions, conjunctions or particles (foreign names), these are tagged as ADP, CONJ and DET, respectively.

Ukrainian (and other Slavic) multi-word named entities have internal syntactic structure, which is preserved in the annotation. The headword is always a noun, and other nouns may be involved. These nouns are tagged either PROPN or NOUN, and possible ambiguities must be resolved individually. Modifying adjectives are never tagged PROPN. Even if an adjective is the first word of a multi-word name, and thus starts with an uppercase letter, it is still tagged ADJ. Similarly, function words in named entities retain their normal tags. These rules are less strict for foreign named entities, where the original part of speech is not apparent to a Ukrainian speaker.

Examples


Treebank Statistics (UD_Ukrainian)

There are 38 PROPN lemmas (6%), 40 PROPN types (6%) and 51 PROPN tokens (3%). Out of 16 observed tags, the rank of PROPN is: 5 in number of lemmas, 6 in number of types and 9 in number of tokens.

The 10 most frequent PROPN lemmas: Микола, Богдан, Павло, Петро, Кеннеді, Крушельниця, Стрий, С’юзі, Іван, Ігор

The 10 most frequent PROPN types: Микола, Павло, Богдан, Кеннеді, Крушельниця, Петро, Стрий, С’юзі, Іван, Ігоря

The 10 most frequent ambiguous lemmas:

The 10 most frequent ambiguous types:

Morphology

The form / lemma ratio of PROPN is 1.052632 (the average of all parts of speech is 1.172859).
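
The reported ratio appears to be the number of distinct PROPN word forms (types) divided by the number of distinct PROPN lemmas, since 40 / 38 matches the figure above. A minimal Python sketch of that arithmetic, assuming this interpretation (the function name `form_lemma_ratio` is illustrative, not part of any UD tool):

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Distinct word forms per distinct lemma (assumed definition)."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts reported above for PROPN in UD_Ukrainian: 40 types, 38 lemmas.
ratio = form_lemma_ratio(40, 38)
print(f"{ratio:.6f}")  # prints 1.052632
```

A ratio this close to 1 means that most PROPN lemmas were observed in only one inflected form in this small treebank.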

The highest number of forms (2) was observed with the lemma “Богдан”: Богдан, Богдана.

The second highest number of forms (2) was observed with the lemma “Петро”: Петро, Петрові.

The third highest number of forms (1) was observed with the lemma “Іван”: Іван.

PROPN occurs with 4 features: uk-feat/Animacy (51; 100% instances), uk-feat/Case (50; 98% instances), uk-feat/Gender (49; 96% instances), uk-feat/Number (1; 2% instances)

PROPN occurs with 11 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Number=Plur

PROPN occurs with 15 feature combinations. The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc (24 tokens). Examples: Микола, Павло, Богдан, Петро, Іван, Гнатюк, Кеннеді, Макаревич, Мартин, Марчук
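
A feature combination such as the one above is a `|`-separated list of `Name=Value` pairs. As a rough sketch, it can be split into a dictionary as follows; the helper name `parse_feats` is hypothetical, and treating `_` as "no features" is an assumption based on the CoNLL-U convention.

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split 'Name=Value|Name=Value' into a dict; '_' or '' means no features."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# The most frequent PROPN feature combination reported above.
combo = parse_feats("Animacy=Anim|Case=Nom|Gender=Masc")
print(combo["Case"])  # prints Nom
```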

Relations

PROPN nodes are attached to their parents using 9 different relations: uk-dep/nsubj (18; 35% instances), uk-dep/appos (9; 18% instances), uk-dep/nmod (8; 16% instances), uk-dep/name (4; 8% instances), uk-dep/nsubjpass (3; 6% instances), uk-dep/remnant (3; 6% instances), uk-dep/vocative (3; 6% instances), uk-dep/iobj (2; 4% instances), uk-dep/root (1; 2% instances)

Parents of PROPN nodes belong to 6 different parts of speech: VERB (22; 43% instances), NOUN (16; 31% instances), PROPN (8; 16% instances), ADJ (3; 6% instances), PRON (1; 2% instances), ROOT (1; 2% instances)
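
Counts like those above can be derived by reading the HEAD and DEPREL columns of each PROPN token. The following Python sketch illustrates the idea on a toy CoNLL-U fragment; the sentence is made up for illustration and is not taken from UD_Ukrainian, and the function name `propn_attachment_stats` is hypothetical. It handles a single sentence and does not handle multiword-token or empty-node lines.

```python
from collections import Counter

# Toy CoNLL-U fragment (made up for illustration, NOT from UD_Ukrainian).
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = """\
1\tМикола\tМикола\tPROPN\t_\t_\t2\tnsubj\t_\t_
2\tбачить\tбачити\tVERB\t_\t_\t0\troot\t_\t_
3\tБогдана\tБогдан\tPROPN\t_\t_\t2\tdobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def propn_attachment_stats(conllu: str):
    """Count relations and parent UPOS tags of PROPN tokens in one sentence."""
    rows = [line.split("\t") for line in conllu.splitlines()
            if line and not line.startswith("#")]
    upos_by_id = {row[0]: row[3] for row in rows}
    upos_by_id["0"] = "ROOT"  # artificial root node, head of the root token
    relations, parents = Counter(), Counter()
    for row in rows:
        if row[3] == "PROPN":
            relations[row[7]] += 1               # DEPREL column
            parents[upos_by_id[row[6]]] += 1     # UPOS of the HEAD token
    return relations, parents


relations, parents = propn_attachment_stats(SAMPLE)
print(relations, parents)
```

In this toy sentence both PROPN tokens attach to the verb, one as `nsubj` and one as `dobj`.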

34 (67%) PROPN nodes are leaves.

10 (20%) PROPN nodes have one child.

5 (10%) PROPN nodes have two children.

2 (4%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 4.
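
As a quick consistency check on the child-degree figures above, the four counts should add up to the 51 PROPN tokens, and each percentage should be the rounded share of that total. A small Python sketch, where the dictionary keys are just illustrative labels:

```python
# Counts reported above: number of PROPN nodes by number of children.
degree_counts = {"0": 34, "1": 10, "2": 5, "3+": 2}

total = sum(degree_counts.values())
shares = {k: round(100 * v / total) for k, v in degree_counts.items()}

print(total)   # prints 51
print(shares)  # prints {'0': 67, '1': 20, '2': 10, '3+': 4}
```

The rounded shares sum to 101 rather than 100, which is an artifact of rounding each percentage independently.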

Children of PROPN nodes are attached using 8 different relations: uk-dep/punct (6; 22% instances), uk-dep/appos (5; 19% instances), uk-dep/name (4; 15% instances), uk-dep/remnant (4; 15% instances), uk-dep/amod (2; 7% instances), uk-dep/case (2; 7% instances), uk-dep/det (2; 7% instances), uk-dep/list (2; 7% instances)

Children of PROPN nodes belong to 6 different parts of speech: PROPN (8; 30% instances), NOUN (7; 26% instances), PUNCT (6; 22% instances), ADJ (2; 7% instances), ADP (2; 7% instances), DET (2; 7% instances)


PROPN in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]