
This page pertains to UD version 2.

POS tags


ADJ: adjective


Adjectives are words that typically modify nouns and specify their properties or attributes. They may also function as predicates. These include the categories known as 區別詞 / keoi1bit6ci4 and 形容詞 / jing4jung4ci4.

The adjective may be accompanied by the particle 嘅 / ge3 when functioning as a prenominal modifier (for either 區別詞 / keoi1bit6ci4 or 形容詞 / jing4jung4ci4). When a 區別詞 / keoi1bit6ci4 functions as a predicate, the particle is often obligatory.

Note that ordinal numerals such as 第一 / dai6jat1 “first” and 第三 / dai6saam1 “third” are to be treated as adjectives and tagged ADJ per UD specifications, even though they are traditionally classified as numerals in Chinese.



ADP: adposition


The Cantonese ADP covers three categories of function words analyzed as adpositions: (1) prepositions, (2) valence markers, and (3) “localizers”/postpositions.

Prepositions introduce an extra argument to the event or main verb of a clause, or give information about time, location, direction, and so on.

Valence markers such as 將 / zoeng1 and the formal 被 / bei6 are also tagged ADP.

Localizers (also known as 方位詞 / fong1wai2ci4) typically indicate spatial information in relation to the noun that precedes them. Some localizers have also grammaticalized into clausal markers indicating temporal information. Localizers with this clausal function are still tagged ADP (but are labeled with the dependency relation mark).



ADV: adverb


Adverbs (副詞 / fu3ci4) typically modify verbs, adjectives, and other adverbs for such categories as time, manner, degree, frequency, or negation.

Some adverbs also modify clauses with conjunctive and discursive functions.

A small number of adverbs may also modify numerals and determiners, or nouns and pronouns.

There is a closed subclass of pronominal adverbs that refer to circumstances in context, rather than naming them directly; similarly to pronouns, these can be categorized as interrogative, demonstrative, etc. These should be treated as adverbs when modifying a predicate, but otherwise some of them can function as a nominal in the syntax, in which case they should be tagged PRON.

Note that although some adverbs express temporal information, many common time expressions (e.g., 今日 / gam1jat6 “today”, 舊年 / gau6nin4*2 “last year”, 夜晚 / je6maan5 “night”) are actually nouns and should be tagged NOUN.



AUX: auxiliary verb


An auxiliary is a word that accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb, such as person, number, tense, mood, aspect, and voice.

In Cantonese, auxiliaries fall into two groups. Modal and modal-like auxiliaries are mostly preverbal (except for the postverbal usage of 得 / dak1 “can”) and do not have to be adjacent to the verb. Aspect markers must come immediately after the verb or certain verb compounds. Note that some modal auxiliaries can also function as main verbs, usually when they have a direct object or full clausal complement.



CCONJ: coordinating conjunction


A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.



DET: determiner


Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context.

Note that Chinese does not traditionally define determiners as a separate word class, but categorizes them as pronouns and/or adjectives. For this reason, in the UD framework some words in Cantonese may function as determiners and be tagged DET in one syntactic context (i.e., when modifying a noun), and as pronouns (tagged PRON) when behaving as a nominal or the head of a nominal phrase.



INTJ: interjection


An interjection is a word that is used most often as an exclamation or part of an exclamation. It typically expresses an emotional reaction, is not syntactically related to other accompanying expressions, and may include a combination of sounds not otherwise found in the language.

Note that words primarily belonging to another part of speech retain their original category when used in exclamations. For example, 係! / hai6 “Yes!” would still be tagged VERB.

Onomatopoeic words (擬聲詞 / ji4sing1ci4) should only be treated as interjections if they are used as an exclamation (e.g., 喵! / miau1 “Meow!”), otherwise they should be tagged according to their syntactic function in context (often as adverbs in Chinese, e.g., 佢哋吱吱喳喳噉叫 / keoi5dei5 zi1zi1za1za1 gam2 giu3 “They yell jijizhazha-ly (i.e., lively, noisily, etc.).”).



NOUN: noun


Nouns are a part of speech typically denoting a person, place, thing, animal, or idea.

The NOUN tag is intended for common nouns only. See PROPN for proper nouns and PRON for pronouns.

As a special case, classifiers (量詞 / loeng6ci4) are also tagged NOUN per UD guidelines. Their classifier status may be preserved in the feature column (FEATS) as NounType=Clf.
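As an illustration of how the classifier feature might appear in annotated data, the following Python sketch formats tokens as 10-column CoNLL-U lines. The helper name conllu_line and the example phrase 三本書 / saam1 bun2 syu1 “three books” are assumptions made for this example, not part of the guidelines; columns that the sketch does not fill are left as underscores.

```python
def conllu_line(idx: int, form: str, upos: str, feats: str = "_") -> str:
    """Format one token as a CoNLL-U line, leaving unknown columns as '_'."""
    # Column order: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    columns = [str(idx), form, "_", upos, "_", feats, "_", "_", "_", "_"]
    return "\t".join(columns)


# Illustrative phrase (an assumption, not from the guidelines): "three books".
lines = [
    conllu_line(1, "三", "NUM"),
    conllu_line(2, "本", "NOUN", feats="NounType=Clf"),  # classifier tagged NOUN
    conllu_line(3, "書", "NOUN"),
]
print("\n".join(lines))
```

The classifier and the head noun share the tag NOUN; only the FEATS column distinguishes them.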



NUM: numeral


A numeral is a word, functioning most typically as a determiner, a pronoun or an adjective, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.

Cardinal numerals are covered by NUM regardless of syntactic function and regardless of whether they are expressed as words (五 / ng5 “five”) or digits (5). By contrast, ordinal numerals (such as 第一 / dai6jat1 “first”) are always tagged ADJ.
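The cardinal/ordinal split can be sketched as a small Python heuristic. This is only a rough approximation under stated assumptions: the function names, the character set CHINESE_DIGITS, and the rule "第 followed by a cardinal is an ordinal" are hypothetical simplifications, and real annotation depends on context.

```python
from typing import Optional

# Assumed, non-exhaustive set of characters used to write cardinal numerals.
CHINESE_DIGITS = set("零〇一二兩三四五六七八九十百千萬億")


def is_cardinal(token: str) -> bool:
    """True if the token consists only of digits or numeral characters."""
    return bool(token) and all(
        ch.isdigit() or ch in CHINESE_DIGITS for ch in token
    )


def tag_numeral(token: str) -> Optional[str]:
    """Return 'ADJ' for ordinals, 'NUM' for cardinals, None otherwise."""
    # Ordinal: the prefix 第 followed by a cardinal, e.g. 第一 "first".
    if token.startswith("第") and is_cardinal(token[1:]):
        return "ADJ"
    # Cardinal: written as words (五) or as digits (5).
    if is_cardinal(token):
        return "NUM"
    return None  # not a numeral under this heuristic


for tok in ["五", "5", "第一", "第三"]:
    print(tok, tag_numeral(tok))
```

The ordinal check runs first so that 第一 is never mistaken for a cardinal.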



PART: particle


Particles are function words that must be associated with another word, phrase, or clause to impart meaning and that do not satisfy definitions of other universal parts of speech (such as ADP, AUX, CCONJ, or SCONJ).

In Cantonese, particles include the genitive/associative/relativizer/nominalizer marker 嘅 / ge3; 得 / dak1 and 到 / dou3 in V-得/到 extent/descriptive constructions (see compound:ext); the manner adverbializer 噉 / gam2; the “et cetera” marker 等(等) / dang2(dang2); sentence-final particles; the quantifiers 埋 / maai4 and 晒 / saai3; the adversative 親 / can1; and so on.



PRON: pronoun


Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context. Some pronouns (in particular certain demonstrative, total, indefinite, and interrogative pronouns) may also function as determiners (DET) and are tagged as such when functioning as a modifier of a noun.



PROPN: proper noun


A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object. Institutional names often contain regular words belonging to other parts of speech, such as nouns (e.g., 公司 / gung1si1 “company”, 大學 / daai6hok6 “university”, etc.). Those words should be segmented as their own tokens and still tagged with their native part of speech; only the proper nouns in such complex names should be tagged PROPN.
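A minimal Python sketch of this rule for complex names follows. It assumes the name has already been segmented, and it uses a hypothetical two-entry lexicon (REGULAR_WORDS) containing only the nouns named above; the function name and the example input 香港 大學 are likewise illustrative. Defaulting every other token to PROPN is a simplification.

```python
from typing import List, Tuple

# Hypothetical mini-lexicon: regular words that keep their native tag
# even when they occur inside an institutional name.
REGULAR_WORDS = {"公司": "NOUN", "大學": "NOUN"}


def tag_name_tokens(tokens: List[str]) -> List[Tuple[str, str]]:
    """Tag pre-segmented tokens of a complex name.

    Tokens found in REGULAR_WORDS keep their native part of speech;
    all remaining tokens are assumed to be proper nouns (a simplification).
    """
    return [(tok, REGULAR_WORDS.get(tok, "PROPN")) for tok in tokens]


# Illustrative input (an assumption): a name segmented into two tokens.
print(tag_name_tokens(["香港", "大學"]))
```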



PUNCT: punctuation


Punctuation marks are character groups used to delimit linguistic units in printed text.

Punctuation is not taken to include logograms such as $, %, and §, which are instead tagged as SYM.



SCONJ: subordinating conjunction


A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other. The subordinating conjunction typically marks the incorporated constituent which has the status of a subordinate clause.

Subordinating conjunctions in Cantonese include all markers of subordinate clauses, including conditional clauses, purpose clauses, etc.

In paired clauses where both clauses are marked by a conjunctive word and the first is subordinate to the second, we tag the conjunctive word in the first clause as SCONJ and the one in the second (main) clause as an adverb (ADV) (e.g., 雖然/SCONJ… 但係/ADV… / seoi1jin4… daan6hai6… “Although… however…”).
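The paired-clause convention can be expressed as a short Python sketch. It assumes the caller already knows which clause is subordinate; the function name and the input shape (a list of word/flag pairs) are hypothetical choices for illustration.

```python
from typing import List, Tuple


def tag_conjunctive_words(
    clauses: List[Tuple[str, bool]]
) -> List[Tuple[str, str]]:
    """Tag the conjunctive word of each clause in a paired construction.

    Each input pair is (conjunctive word, clause_is_subordinate).
    The marker of a subordinate clause is tagged SCONJ; the marker of
    the main clause is tagged ADV.
    """
    return [
        (word, "SCONJ" if is_subordinate else "ADV")
        for word, is_subordinate in clauses
    ]


# The pair from the text: 雖然 marks the subordinate clause, 但係 the main one.
print(tag_conjunctive_words([("雖然", True), ("但係", False)]))
```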



SYM: symbol


A symbol is a word-like entity that differs from ordinary words by form, function, or both.

Many symbols are or contain special non-alphanumeric, non-standard logographic characters, similarly to punctuation. What makes them different from punctuation is that they can be substituted by normal words. This includes all currency symbols; for example, $ 75 is equivalent to 七十五 蚊 / cat1sap6ng5 man1 “seventy-five dollars”.

Mathematical operators form another group of symbols.

Another group of symbols is emoticons and emoji.



VERB: verb


A verb is a member of the syntactic class of words that typically signal events and actions, can constitute a minimal predicate in a clause, and govern the number and types of other constituents which may occur in the clause.

Despite its use in copular constructions, 係 / hai6 “be” is tagged as a verb due to its other non-copular meanings and functions.



X: other


The tag X is used for words that for some reason cannot be assigned a real part-of-speech category. It should be used very restrictively.

A special usage of X is for cases of code-switching where it is not possible (or meaningful) to analyze the intervening language grammatically (and where the dependency relation foreign is typically used in the syntactic analysis). This usage does not extend to ordinary loanwords, which should be assigned a normal part of speech.

