POS tags
| Open class words | Closed class words | Other |
|---|---|---|
| ADJ | ADP | PUNCT |
| ADV | AUX | SYM |
| INTJ | CONJ | X |
| NOUN | DET | |
| PROPN | NUM | |
| VERB | PART | |
| | PRON | |
| | SCONJ | |
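The classification in the table above can be written down as a small lookup, which is handy when checking tagger output. This is only a sketch: the names `TAG_CLASSES` and `tag_class` are hypothetical and not part of any Uralic Dependencies tooling.

```python
# Hypothetical lookup mirroring the three columns of the POS tag table.
TAG_CLASSES = {
    "open": ["ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB"],
    "closed": ["ADP", "AUX", "CONJ", "DET", "NUM", "PART", "PRON", "SCONJ"],
    "other": ["PUNCT", "SYM", "X"],
}


def tag_class(tag: str) -> str:
    """Return 'open', 'closed' or 'other' for a POS tag; raise on unknown tags."""
    for name, tags in TAG_CLASSES.items():
        if tag in tags:
            return name
    raise ValueError(f"unknown POS tag: {tag}")


print(tag_class("PRON"))  # closed
```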
ADJ
: adjective
Definition
The ADJ tag is used for adjectives. These are morphologically nominals that
typically have comparative derivations as well. They modify nominals and
typically agree with the modified nominal in some morphological features.
Adjectives are typically well established in grammars and easy to spot.
Less standard cases:
- Pro-adjectives are tagged as ADJ.
- Ordinal numerals are tagged as ADJ.
- Participles, when lexicalised, are tagged as ADJ.
Telling DET apart from ADJ …
Examples
- [fi] paha “evil”
- [fi] sellainen “like that”
- [fi] ensimmäinen “first”
ADP
: adposition
Definition
ADP is used for adpositions, i.e., prepositions and postpositions (many
Uralic languages do not make a strict distinction between the two). Adpositions
have nominal complements that can be identified by their case, or the complement
may be encoded by a possessive suffix on the adposition itself. ADP can be
distinguished from ADV by its complement: an adverb has no complement, or its
complement is a verb or a clause.
Adpositions are often derived from defective or historical nominal paradigms; the tagging should follow a modern synchronic analysis rather than etymology.
Examples
- [fi] alla “under”, alta “from under”, alle “to under”
- [fi] vuoksi “because of”, vuokseni “because of me”
ADV
: adverb
Description
Adverbs are words that modify verbs, clauses or other ad-words, but not nouns. Adverbs are commonly derived from all parts of speech, and sometimes the distinction between an adverb and an inflected form is not clear. The most common adverb derivation in most languages turns an adjective into an adverb of manner, e.g. [fi] kauniisti “beautifully” (< kaunis “beautiful”).
Examples
- [fi] nopeasti “fast”, nopeammin “faster”
- [fi] maanantaisin “on Mondays”, kahdesti “twice”
- [fi] käsin “manually” (< käsi “hand”, instructive being inflection as well)
- [fi] puolueittain “party by party” (< puolue “party” + distributive)
AUX
: auxiliary verb
Description
An auxiliary verb is a verb that carries some of the verb phrase’s features or categories. Auxiliaries have not been dealt with very systematically in traditional Uralic grammars. Most languages have selected a closed subset of verbs, for example verbs that take infinitive complements, verbs used in syntactic tense constructions, and verbs used in possessive structures.
Negation verbs are not currently marked as AUX, even when they carry the main verb’s inflectional features (among the current treebanks, only in Finnish).
Examples
- [fi] on tullut “has come”, olisi vienyt “would’ve taken”
- [fi] pitää olla “must be”
~~~ conllu
# sentence-text: Grande finalen pitää olla grande, eikä pienestä lurituksesta tule muuta kuin kiukkuiseksi.
1	Grande	Grande	X	Foreign	Foreign=Foreign	5	nsubj:cop	_	_
2	finalen	finale	X	Foreign	Foreign=Foreign	1	foreign	_	_
3	pitää	pitää	AUX	V	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act	5	aux	_	_
4	olla	olla	VERB	V	InfForm=1|Number=Sing|VerbForm=Inf|Voice=Act	5	cop	_	_
5	grande	grande	X	Foreign	Foreign=Foreign	0	root	_	SpaceAfter=No
6	,	,	PUNCT	Punct	_	5	punct	_	_
7	eikä	ei	VERB	V	Clitic=Ka|Negative=Neg|Number=Sing|Person=3|VerbForm=Fin|Voice=Act	5	cc	_	_
8	pienestä	pieni	ADJ	A	Case=Ela|Degree=Pos|Number=Sing	9	amod	_	_
9	lurituksesta	luritus	NOUN	N	Case=Ela|Number=Sing	10	nmod	_	_
10	tule	tulla	VERB	V	Connegative=Yes|Mood=Ind|Tense=Pres|VerbForm=Fin	5	conj	_	_
11	muuta	muu	PRON	Pron	Case=Par|Number=Sing|PronType=Ind	10	xcomp	_	_
12	kuin	kuin	SCONJ	C	_	13	mark	_	_
13	kiukkuiseksi	kiukkuinen	ADJ	A	Case=Tra|Degree=Pos|Number=Sing	11	advcl	_	SpaceAfter=No
14	.	.	PUNCT	Punct	_	5	punct	_	_
~~~
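Each token line in a CoNLL-U example like the one above has ten tab-separated columns, with the universal POS tag in the fourth. A minimal sketch of reading such lines and picking out the tokens with a given tag follows; the helper names `parse_token_line` and `forms_with_tag` are hypothetical, and the sample holds only two tokens from the example sentence.

```python
# Column names of the CoNLL-U format, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos", "feats", "head", "deprel", "deps", "misc"]


def parse_token_line(line: str) -> dict:
    """Split one tab-separated CoNLL-U token line into named columns."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(columns)}")
    return dict(zip(FIELDS, columns))


def forms_with_tag(lines, tag):
    """Return the word forms whose universal POS tag equals `tag`."""
    tokens = [
        parse_token_line(line)
        for line in lines
        if line.strip() and not line.startswith("#")  # skip blanks and comments
    ]
    return [token["form"] for token in tokens if token["upos"] == tag]


# Two tokens from the sentence above, rebuilt with explicit tabs.
SAMPLE = [
    "# sentence-text: (excerpt)",
    "\t".join(["3", "pitää", "pitää", "AUX", "V",
               "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act",
               "5", "aux", "_", "_"]),
    "\t".join(["4", "olla", "olla", "VERB", "V",
               "InfForm=1|Number=Sing|VerbForm=Inf|Voice=Act",
               "5", "cop", "_", "_"]),
]

print(forms_with_tag(SAMPLE, "AUX"))  # ['pitää']
```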
CONJ
: coordinating conjunction
Definition
A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.
Some coordinating conjunctions appear in pairs or groups of more than one: [fi] sekä … että “both … and”.
Co-ordination and enclitic particles
Many Uralic languages have enclitic particles that are used in co-ordination in addition to or instead of specific words. It is not yet clear how best to analyse these in Uralic Dependencies.
Examples
- [fi] ja “and”, tai “or”, vai “or”, sekä “and”, mutta “but”
- [fi] eikä “and not”, enkä “and I do not”
~~~ conllu
# sentence-text: Jäällä kävely avaa aina hauskoja ja erikoisia näkökulmia kaupunkiin.
1	Jäällä	jää	NOUN	N	Case=Ade|Number=Sing	2	nmod	_	_
2	kävely	kävely	NOUN	N	Case=Nom|Number=Sing	3	nsubj	_	_
3	avaa	avata	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act	0	root	_	_
4	aina	aina	ADV	Adv	_	3	advmod	_	_
5	hauskoja	hauska	ADJ	A	Case=Par|Degree=Pos|Number=Plur	8	amod	_	_
6	ja	ja	CONJ	C	_	5	cc	_	_
7	erikoisia	erikoinen	ADJ	A	Case=Par|Degree=Pos|Number=Plur	5	conj	8:amod	_
8	näkökulmia	näkö#kulma	NOUN	N	Case=Par|Number=Plur	3	dobj	_	_
9	kaupunkiin	kaupunki	NOUN	N	Case=Ill|Number=Sing	8	nmod	_	SpaceAfter=No
10	.	.	PUNCT	Punct	_	3	punct	_	_
~~~
DET
: determiner
Description
Determiners are words that modify nouns or noun phrases. Determiners are rarely used in the Uralic grammar tradition; Hungarian, however, has proper articles that are analysed as determiners.
Some pronouns, demonstratives for example, could be re-analysed as determiners, as well as some non-inflecting adjectives.
Examples
- [fi] se “that”, yksi “one”
- [fi] mikä “what”
INTJ
: interjection
Description
An interjection is a word that is used most often as an exclamation or part of an exclamation. It typically expresses an emotional reaction, is not syntactically related to other accompanying expressions, and may include a combination of sounds not otherwise found in the language. Interjections are used for many spoken language tokens.
Enclitic discourse particles are not, at the moment, analysed at all.
Some interjections feature rich morphology or have counterparts in other parts of speech.
Examples
- [fi] nam “yum”, voi “oh”, vittu “fuck”
NOUN
: noun
Description
Nouns are a part of speech typically denoting a person, place, thing, animal or
idea. The NOUN tag is intended for common nouns only. See PROPN for proper
nouns and PRON for pronouns. Nouns are as described in the Universal rules.
Uralic nouns can typically be recognised from their inflection for case, number
and so on. Some noun forms lexicalise into adverbs or adpositions. Nouns do not
typically have as strong a tendency towards comparative derivation as
adjectives do.
Examples
- [fi] talo “house”, tyttö “girl”
~~~ conllu
# sentence-text: Oma jääkaappini oli aivan tyhjä ja koska kauppareissu tällä jalalla ei houkuttanut sitten alkuunkaan, turvauduin ystäväni apuun.
1	Oma	oma	ADJ	A	Case=Nom|Degree=Pos|Number=Sing	2	amod	_	_
2	jääkaappini	jää#kaappi	NOUN	N	Case=Nom|Number=Sing|Number[psor]=Sing|Person[psor]=1	5	nsubj:cop	_	_
3	oli	olla	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin|Voice=Act	5	cop	_	_
4	aivan	aivan	ADV	Adv	_	5	advmod	_	_
5	tyhjä	tyhjä	ADJ	A	Case=Nom|Degree=Pos|Number=Sing	0	root	_	_
6	ja	ja	CONJ	C	_	5	cc	_	_
7	koska	koska	SCONJ	C	_	12	mark	_	_
8	kauppareissu	kauppa#reissu	NOUN	N	Case=Nom|Number=Sing	12	nsubj	_	_
9	tällä	tämä	PRON	Pron	Case=Ade|Number=Sing|PronType=Dem	10	det	_	_
10	jalalla	jalka	NOUN	N	Case=Ade|Number=Sing	8	nmod	_	_
11	ei	ei	VERB	V	Negative=Neg|Number=Sing|Person=3|VerbForm=Fin|Voice=Act	12	neg	_	_
12	houkuttanut	houkuttaa	VERB	V	Case=Nom|Degree=Pos|Number=Sing|PartForm=Past|VerbForm=Part|Voice=Act	16	advcl	_	_
13	sitten	sitten	ADV	Adv	_	14	advmod	_	_
14	alkuunkaan	alkuunkaan	ADV	Adv	_	12	advmod	_	SpaceAfter=No
15	,	,	PUNCT	Punct	_	12	punct	_	_
16	turvauduin	turvautua	VERB	V	Mood=Ind|Number=Sing|Person=1|Tense=Past|VerbForm=Fin|Voice=Act	5	conj	_	_
17	ystäväni	ystävä	NOUN	N	Case=Gen|Number=Sing|Number[psor]=Sing|Person[psor]=1	18	nmod:poss	_	_
18	apuun	apu	NOUN	N	Case=Ill|Number=Sing	16	nmod	_	SpaceAfter=No
19	.	.	PUNCT	Punct	_	5	punct	_	_
~~~
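The case, number and possessor features by which nouns are recognised sit in the sixth column of each token line, as `Name=Value` pairs joined by `|`. A minimal sketch of unpacking that column follows, assuming well-formed input; the helper name `parse_feats` is hypothetical.

```python
def parse_feats(feats: str) -> dict:
    """Turn a FEATS string such as 'Case=Nom|Number=Sing' into a dict.

    An underscore means the token has no features.
    """
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Features of "jääkaappini" from the example above.
feats = parse_feats("Case=Nom|Number=Sing|Number[psor]=Sing|Person[psor]=1")
print(feats["Case"], feats["Person[psor]"])  # Nom 1
```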
NUM
: numeral
Description
A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.
Ordinal numerals are classified as adjectives (ADJ) if they act like adjectives
in syntax.
Examples
- [fi] yksi “one”, kaksi “two”, pari “few”
- [fi] 3, 20, 3.14
PART
: particle
Description
Particles are function words that must be associated with another word or
phrase to impart meaning and that do not satisfy the definitions of other
universal parts of speech (e.g. adpositions, coordinating conjunctions,
subordinating conjunctions or auxiliary verbs). Particles should be used
sparingly in Uralic dependencies. What much of the Uralic literature calls a
particle is typically an adverb (ADV), interjection (INTJ) or adposition (ADP)
in Universal Dependencies, and in a limited number of cases CONJ or SCONJ.
Enclitic discourse particles are not analysed in the current versions of Uralic dependency schemes.
Examples
None in Finnish.
PRON
: pronoun
Description
Pronouns are words that substitute for nouns or noun phrases and whose meaning is recoverable from the linguistic or extralinguistic context. Pronouns are used for nominal references without intrinsic semantic value. Some pronouns have defective case paradigms, but they usually resemble nouns in their paradigms.
Pro-words of parts of speech other than noun are common in Uralic languages. In the current Universal scheme they are still categorised under the respective non-pro part of speech, i.e., ADJ for pro-adjectives, ADV for pro-adverbs and so forth.
Examples
- [fi] minä “I”, me “we”, nämä “these”
- [fi] sama “same”, kaikki “all”
- [fi] joku “someone”, jokin “something”
- [fi] toinen “other”, itse “self”, jokainen “every”
- [fi] ei mikään “none, not any”
PROPN
: proper noun
Description
A proper noun is a noun (or nominal content word) that is the name (or part of
the name) of a specific individual, place, or object. Acronyms made of a
proper noun or proper nouns should also be tagged PROPN.
Examples
- [fi] Pekka (first name), Helsinki (city name)
- [fi] EU “European Union”
PUNCT
: punctuation
Description
Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text. Uralic languages share mostly the same limited set of punctuation marks.
Examples
- [fi] .,!,?,”,–
- [fi] …
~~~ conllu
# sentence-text: Vähän samanlainen tunne kuin silloin, kun ystävämme vei meidät kerran ylöstuomiokirkon torniin.
1	Vähän	vähän	ADV	Adv	_	2	advmod	_	_
2	samanlainen	samanlainen	ADJ	A	Case=Nom|Degree=Pos|Number=Sing	3	amod	_	_
3	tunne	tunne	NOUN	N	Case=Nom|Number=Sing	0	root	_	_
4	kuin	kuin	SCONJ	C	_	5	mark	_	_
5	silloin	silloin	ADV	Adv	_	2	advcl	_	SpaceAfter=No
6	,	,	PUNCT	Punct	_	9	punct	_	_
7	kun	kun	SCONJ	C	_	9	mark	_	_
8	ystävämme	ystävä	NOUN	N	Case=Nom|Number=Sing|Number[psor]=Plur|Person[psor]=1	9	nsubj	_	_
9	vei	viedä	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin|Voice=Act	5	advcl	_	_
10	meidät	minä	PRON	Pron	Case=Acc|Number=Plur|Person=1|PronType=Prs	9	dobj	_	_
11	kerran	kerran	ADV	Adv	_	9	advmod	_	_
12	ylös	ylös	ADV	Adv	_	14	advmod	_	SpaceAfter=No
13	tuomiokirkon	tuomio#kirkko	NOUN	N	Case=Gen|Number=Sing	14	nmod:poss	_	_
14	torniin	torni	NOUN	N	Case=Ill|Number=Sing	9	nmod	_	SpaceAfter=No
15	.	.	PUNCT	Punct	_	3	punct	_	_
~~~
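In the example above, the tokens immediately before punctuation marks carry `SpaceAfter=No` in the last column, which is how the original spacing is recorded. A minimal sketch of rebuilding the surface text from (form, misc) pairs follows; the function name `detokenize` is hypothetical and the input is a shortened excerpt of the sentence.

```python
def detokenize(tokens):
    """Join word forms, omitting the space after tokens marked SpaceAfter=No."""
    parts = []
    for form, misc in tokens:
        parts.append(form)
        # MISC may hold several |-separated attributes; "_" means none.
        if "SpaceAfter=No" not in misc.split("|"):
            parts.append(" ")
    return "".join(parts).rstrip()


# Shortened excerpt: (form, misc) pairs taken from the example sentence.
TOKENS = [
    ("silloin", "SpaceAfter=No"),
    (",", "_"),
    ("kun", "_"),
    ("torniin", "SpaceAfter=No"),
    (".", "_"),
]

print(detokenize(TOKENS))  # silloin, kun torniin.
```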
SCONJ
: subordinating conjunction
Description
A subordinating conjunction is a conjunction that links constructions by making
one of them a constituent of the other. What some grammars call adverbial
conjunctions are also tagged SCONJ in Uralic dependencies.
As with co-ordinating conjunctions, there is no common best practice for annotating Uralic conjunction enclitics.
Examples
- [fi] että “that”, koska “because”, jos “if”
- [fi] parempi kuin “better than”
# sentence-text: Vähän samanlainen tunne kuin silloin, kun ystävämme vei meidät kerran ylöstuomiokirkon torniin.
1 Vähän vähän ADV Adv _ 2 advmod _ _
2 samanlainen samanlainen ADJ A Case=Nom|Degree=Pos|Number=Sing 3 amod _ _
3 tunne tunne NOUN N Case=Nom|Number=Sing 0 root _ _
4 kuin kuin SCONJ C _ 5 mark _ _
5 silloin silloin ADV Adv _ 2 advcl _ SpaceAfter=No
6 , , PUNCT Punct _ 9 punct _ _
7 kun kun SCONJ C _ 9 mark _ _
8 ystävämme ystävä NOUN N Case=Nom|Number=Sing|Number[psor]=Plur|Person[psor]=1 9 nsubj _ _
9 vei viedä VERB V Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin|Voice=Act 5 advcl _ _
10 meidät minä PRON Pron Case=Acc|Number=Plur|Person=1|PronType=Prs 9 dobj _ _
11 kerran kerran ADV Adv _ 9 advmod _ _
12 ylös ylös ADV Adv _ 14 advmod _ SpaceAfter=No
13 tuomiokirkon tuomio#kirkko NOUN N Case=Gen|Number=Sing 14 nmod:poss _ _
14 torniin torni NOUN N Case=Ill|Number=Sing 9 nmod _ SpaceAfter=No
15 . . PUNCT Punct _ 3 punct _ _
SYM
: symbol
Description
A symbol is a word-like entity that differs from ordinary words by form, function, or both. Symbols are character sequences other than those tagged PUNCT.
Symbols in many Uralic languages inflect like common nominals, often with an intervening punctuation mark such as a colon before the case ending.
Examples
- [fi] $, :-), ☺, <3, ♡
- [fi] www.google.fi:ssä “in www.google.fi”
~~~ conllu
# sentence-text: Tässä mallissa on sama 6,1 -litrainen Hemi -moottori kuin SRT -8:ssa, mutta väri on erikoinen “Detonator Yellow” ja siinä on mustat teippaukset.
1	Tässä	tämä	PRON	Pron	Case=Ine|Number=Sing|PronType=Dem	2	det	_	_
2	mallissa	malli	NOUN	N	Case=Ine|Number=Sing	3	nmod	_	_
3	on	olla	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act	0	root	_	_
4	sama	sama	PRON	Pron	Case=Nom|Number=Sing|PronType=Ind	8	det	_	_
5	6,1	6,1	NUM	Num	NumType=Card	6	nummod	_	_
6	-litrainen	litra	ADJ	A	Case=Nom|Degree=Pos|Derivation=Inen|Number=Sing	8	amod	_	_
7	Hemi	Hemi	PROPN	N	Case=Nom|Number=Sing	8	compound:nn	_	_
8	-moottori	moottori	NOUN	N	Case=Nom|Number=Sing	3	nsubj	_	_
9	kuin	kuin	SCONJ	C	_	10	mark	_	_
10	SRT-8:ssa	SRT#8	SYM	Symb	Case=Ine	4	advcl	_	SpaceAfter=No
11	,	,	PUNCT	Punct	_	3	punct	_	_
12	mutta	mutta	CONJ	C	_	3	cc	_	_
13	väri	väri	NOUN	N	Case=Nom|Number=Sing	17	nsubj:cop	_	_
14	on	olla	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act	17	cop	_	_
15	erikoinen	erikoinen	ADJ	A	Case=Nom|Degree=Pos|Number=Sing	17	amod	_	_
16	“	“	PUNCT	Punct	_	17	punct	_	SpaceAfter=No
17	Detonator	Detonator	PROPN	N	_	3	conj	_	_
18	Yellow	Yellow	PROPN	N	Case=Nom|Number=Sing	17	name	_	SpaceAfter=No
19	“	“	PUNCT	Punct	_	17	punct	_	_
20	ja	ja	CONJ	C	_	17	cc	_	_
21	siinä	se	PRON	Pron	Case=Ine|Number=Sing|PronType=Dem	22	nmod	_	_
22	on	olla	VERB	V	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act	22	conj	_	_
23	mustat	musta	ADJ	A	Case=Nom|Degree=Pos|Number=Plur	24	amod	_	_
24	teippaukset	teippaus	NOUN	N	Case=Nom|Number=Plur	22	nsubj	_	SpaceAfter=No
25	.	.	PUNCT	Punct	_	3	punct	_	_
~~~
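Inflected symbols such as www.google.fi:ssä and SRT-8:ssa attach the case ending after a colon. The sketch below separates the symbol from its ending with a simple heuristic. It assumes the ending follows the last colon and consists only of letters, which matches the Finnish examples here but is not claimed to be a general rule; `split_inflected_symbol` is a hypothetical helper.

```python
def split_inflected_symbol(form: str) -> tuple:
    """Split 'symbol:ending' into (symbol, ending); ending is '' if absent.

    Heuristic: the ending follows the last colon and is purely alphabetic,
    so emoticons such as ':-)' are left whole.
    """
    stem, sep, suffix = form.rpartition(":")
    if not sep or not stem or not suffix.isalpha():
        return form, ""
    return stem, suffix


print(split_inflected_symbol("www.google.fi:ssä"))  # ('www.google.fi', 'ssä')
print(split_inflected_symbol(":-)"))                # (':-)', '')
```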
VERB
: verb
Description
Verbs signal events and actions and typically inflect for tense, mood and person, using personal suffixes as well as tense, aspect and mood suffixes. Uralic verbs often have a large number of nominal forms with more or less full paradigms, typically described in grammars as infinitives and participles. There is no established best practice for deciding whether such a word form should be treated as a derived new word or as a form of the verb.
Examples
- [fi] juosta “run”
- [fi] ei “no”, en “I do not”, älkää “you must not”
- [fi] syönyt “eaten”, olin syömässä “I was eating”, syövä mies “eating man”
X
: other
Description
The tag X is used for words that for some reason cannot be assigned a real
part-of-speech category. X is typically used for foreign-language material or
other unprocessable data.
Code-switching is likely to be more common in data for many Uralic languages;
should the other-language parts be tagged X throughout?
Examples
- [fi] cookie, open-source