PROPN: proper noun
Definition
A proper noun is a noun that is the name of a specific individual, place, or object. Ukrainian proper nouns are always written with an initial uppercase letter. Note that names of days of the week, months, and languages, as well as adjectives derived from geographical names, are not capitalized (unlike in English) and are not considered proper nouns.
Single-word named entities should be tagged PROPN even if they originate from a common noun (Заєць, Бук) or an adjective (Довгополий, Масна). Even if they were originally adjectives and inflect according to adjectival paradigms, they behave syntactically as nouns. For instance, Масна (the feminine version of the surname Масний) is originally the feminine form of the adjective масний “fatty”, but as an anthroponym it is a noun. It denotes a concrete person (rather than a property of somebody or something) and its gender is limited to feminine and masculine (while adjectives have forms in all three genders).
Personal names are typically treated as a sequence of proper nouns (one or more given names and one or more surnames). If the name contains prepositions, conjunctions or particles (foreign names), these are tagged as ADP, CONJ and DET, respectively.
Ukrainian (and other Slavic) multi-word named entities have internal syntactic structure, which is preserved in the annotation. The headword is always a noun, and other nouns may be involved. They are tagged either PROPN or NOUN, and possible ambiguities must be resolved individually. Modifying adjectives are never tagged PROPN. Even if an adjective is the first word of a multi-word name, and thus starts with an uppercase letter, it is still tagged ADJ. Similarly, function words in named entities retain their normal tags. These rules are less strict for foreign named entities, whose original part of speech is not apparent to a Ukrainian speaker.
Examples
- Франкфурт/PROPN на/ADP Майні/PROPN is a city. Франкфурт is the head, and the на Майні part refers to the river flowing through the city, distinguishing it from other Frankfurts.
- Організація/NOUN об’єднаних/ADJ націй/NOUN “United Nations Organization” consists of three words, none of which is a proper noun. However, the acronym ООН “UN” is a single-token name and is tagged PROPN.
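As a rough illustration, the two examples above can be written as (form, tag) pairs and inspected programmatically. This is only a sketch: the variable names and the helpers `propn_forms` and `render` are hypothetical and not part of any UD tooling.

```python
# Tagged tokens copied from the examples above: (word form, UPOS tag).
FRANKFURT = [("Франкфурт", "PROPN"), ("на", "ADP"), ("Майні", "PROPN")]
UN_FULL = [("Організація", "NOUN"), ("об’єднаних", "ADJ"), ("націй", "NOUN")]
UN_ACRONYM = [("ООН", "PROPN")]


def propn_forms(tokens):
    """Return the word forms tagged PROPN, in their original order."""
    return [form for form, upos in tokens if upos == "PROPN"]


def render(tokens):
    """Render a token sequence in word/TAG notation."""
    return " ".join(f"{form}/{upos}" for form, upos in tokens)


for name in (FRANKFURT, UN_FULL, UN_ACRONYM):
    print(render(name), "->", propn_forms(name))
```

The full name of the organization yields no PROPN tokens at all, while its acronym yields exactly one, which mirrors the rule that tags follow each word's own part of speech rather than capitalization.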
Treebank Statistics (UD_Ukrainian)
There are 38 PROPN lemmas (6%), 40 PROPN types (6%) and 51 PROPN tokens (3%).
Out of 16 observed tags, the rank of PROPN is: 5 in number of lemmas, 6 in number of types and 9 in number of tokens.
The 10 most frequent PROPN lemmas: Микола, Богдан, Павло, Петро, Кеннеді, Крушельниця, Стрий, С’юзі, Іван, Ігор
The 10 most frequent PROPN types: Микола, Павло, Богдан, Кеннеді, Крушельниця, Петро, Стрий, С’юзі, Іван, Ігоря
The 10 most frequent ambiguous lemmas:
The 10 most frequent ambiguous types:
Morphology
The form / lemma ratio of PROPN is 1.052632 (the average of all parts of speech is 1.172859).
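The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given in the treebank statistics. The sketch below assumes that this is how the ratio is defined; the variable names are illustrative.

```python
# Counts reported in the Treebank Statistics section.
propn_lemmas = 38
propn_types = 40  # distinct word forms

# Assumption: form/lemma ratio = distinct forms / distinct lemmas.
ratio = propn_types / propn_lemmas
print(f"{ratio:.6f}")  # prints 1.052632
```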
The 1st highest number of forms (2) was observed with the lemma “Богдан”: Богдан, Богдана.
The 2nd highest number of forms (2) was observed with the lemma “Петро”: Петро, Петрові.
The 3rd highest number of forms (1) was observed with the lemma “Іван”: Іван.
PROPN occurs with 4 features: uk-feat/Animacy (51; 100% instances), uk-feat/Case (50; 98% instances), uk-feat/Gender (49; 96% instances), uk-feat/Number (1; 2% instances)
PROPN occurs with 11 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Number=Plur
PROPN occurs with 15 feature combinations.
The most frequent feature combination is Animacy=Anim|Case=Nom|Gender=Masc (24 tokens).
Examples: Микола, Павло, Богдан, Петро, Іван, Гнатюк, Кеннеді, Макаревич, Мартин, Марчук
Relations
PROPN nodes are attached to their parents using 9 different relations: uk-dep/nsubj (18; 35% instances), uk-dep/appos (9; 18% instances), uk-dep/nmod (8; 16% instances), uk-dep/name (4; 8% instances), uk-dep/nsubjpass (3; 6% instances), uk-dep/remnant (3; 6% instances), uk-dep/vocative (3; 6% instances), uk-dep/iobj (2; 4% instances), uk-dep/root (1; 2% instances)
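The percentages in this list can be reproduced from the raw counts, assuming each share is the count divided by the 51 PROPN tokens and rounded to the nearest whole percent. The dictionary below simply restates the counts above; the names `relation_counts` and `shares` are illustrative.

```python
# Counts of parent relations for PROPN nodes, copied from the list above.
relation_counts = {
    "nsubj": 18, "appos": 9, "nmod": 8, "name": 4, "nsubjpass": 3,
    "remnant": 3, "vocative": 3, "iobj": 2, "root": 1,
}

# Every PROPN token has exactly one parent relation, so this is the token count.
total = sum(relation_counts.values())

# Assumption: shares are rounded to the nearest whole percent.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, pct in shares.items():
    print(f"{rel}: {relation_counts[rel]} ({pct}%)")
```

Because each share is rounded independently, the rounded percentages add up to 101 rather than exactly 100.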
Parents of PROPN nodes belong to 6 different parts of speech: VERB (22; 43% instances), NOUN (16; 31% instances), PROPN (8; 16% instances), ADJ (3; 6% instances), PRON (1; 2% instances), ROOT (1; 2% instances)
34 (67%) PROPN nodes are leaves.
10 (20%) PROPN nodes have one child.
5 (10%) PROPN nodes have two children.
2 (4%) PROPN nodes have three or more children.
The highest child degree of a PROPN node is 4.
Children of PROPN nodes are attached using 8 different relations: uk-dep/punct (6; 22% instances), uk-dep/appos (5; 19% instances), uk-dep/name (4; 15% instances), uk-dep/remnant (4; 15% instances), uk-dep/amod (2; 7% instances), uk-dep/case (2; 7% instances), uk-dep/det (2; 7% instances), uk-dep/list (2; 7% instances)
Children of PROPN nodes belong to 6 different parts of speech: PROPN (8; 30% instances), NOUN (7; 26% instances), PUNCT (6; 22% instances), ADJ (2; 7% instances), ADP (2; 7% instances), DET (2; 7% instances)