Layered universal features
In some languages, some features are marked more than once on the same word. We say that there are several layers of the feature. The exact meaning of individual layers is language-dependent.
For example, possessive adjectives, determiners and pronouns may have two different values of Gender and two of Number. One of the values is determined by agreement with the modified (possessed) noun. This is parallel to other (non-possessive) adjectives and determiners that agree in gender and number with the nouns they modify. The other value is determined lexically because it is a property of the possessor. The following table contrasts three languages. English distinguishes only the possessor’s gender and number. Hindi distinguishes gender in agreement, and number both in agreement and of the possessor (there is no neuter gender in Hindi). German distinguishes both features in both dimensions (more differences would be seen if we also showed German dative and accusative forms, not just nominatives).
|Possessor / Agreement||Sing Masc||Sing Fem||Sing Neut||Plur Masc||Plur Fem|
If a feature is (can be) layered in a language, the name of the feature must
indicate the layer. An additional identifier in square brackets is used to
distinguish layers, e.g.
Gender[psor] for the possessor’s gender.
We recommend that the layer identifiers consist of lowercase English letters
[a-z] and/or digits [0-9].
The layers, their meaning and their
identifiers must be defined in a language-specific extension to this
documentation. For each layered feature, one layer may be defined as default
and the corresponding features then appear without an identifier.
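The naming convention can be checked mechanically. The following is a minimal sketch in Python, not part of any official tool: the function name `split_layered_feature` and the regular expression are illustrative assumptions, and the sketch treats the recommended character set for layer identifiers as if it were a strict requirement.

```python
import re

# A feature name, optionally followed by a layer identifier in square brackets.
# Assumptions of this sketch: feature names are alphanumeric and start with an
# uppercase letter; layer identifiers use only lowercase English letters and digits.
_FEATURE_NAME = re.compile(r"([A-Z][A-Za-z0-9]*)(?:\[([a-z0-9]+)\])?")


def split_layered_feature(name):
    """Return (base_name, layer); layer is None when no identifier is present."""
    match = _FEATURE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a well-formed feature name: {name!r}")
    return match.group(1), match.group(2)


print(split_layered_feature("Gender[psor]"))  # ('Gender', 'psor')
print(split_layered_feature("Gender"))        # ('Gender', None)
```

A name without an identifier is reported with `None` as its layer, which corresponds to the default layer described above.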
In the following, we list some examples of layered features attested in existing corpora. These may be used as inspiration or they may be used as-is in treebanks for which they are found appropriate. Note that even if a treebank uses a layered feature from this section, it should still be described in the language-specific documentation.
Possessive adjectives and pronouns may have two different genders: that of the possessed object (gender agreement with modified noun) and that of the possessor (lexical feature, inherent gender).
The Gender[psor] feature captures the possessor’s gender.
In the Czech examples below, the masculine Gender[psor] implies using one of the suffixes -ův, -ova, -ovo, and the feminine Gender[psor] implies using one of -in, -ina, -ino.
Masc: masculine possessor
otcův syn (father’s son)
otcova dcera (father’s daughter)
otcovo dítě (father’s child)
Fem: feminine possessor
matčin syn (mother’s son)
matčina dcera (mother’s daughter)
matčino dítě (mother’s child)
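The interaction of the two gender layers in the Czech examples can be written down as a small lookup table. This is only a sketch: the names `CZECH_POSSESSIVE_SUFFIX` and `possessive_suffix` are hypothetical, the table covers only the nominative singular suffixes quoted above, and the gender values of the modified nouns (syn masculine, dcera feminine, dítě neuter) are an assumption read off the examples.

```python
# Suffix of the possessive adjective, indexed by the possessor's gender
# (Gender[psor]) and by the gender of the modified noun (Gender).
# Only the nominative singular suffixes quoted in the text are covered.
CZECH_POSSESSIVE_SUFFIX = {
    ("Masc", "Masc"): "-ův",   # otcův syn
    ("Masc", "Fem"): "-ova",   # otcova dcera
    ("Masc", "Neut"): "-ovo",  # otcovo dítě
    ("Fem", "Masc"): "-in",    # matčin syn
    ("Fem", "Fem"): "-ina",    # matčina dcera
    ("Fem", "Neut"): "-ino",   # matčino dítě
}


def possessive_suffix(features):
    """Look up the suffix from a dict holding both gender layers."""
    return CZECH_POSSESSIVE_SUFFIX[(features["Gender[psor]"], features["Gender"])]


# otcova dcera: masculine possessor (father), feminine modified noun (daughter)
print(possessive_suffix({"Gender": "Fem", "Gender[psor]": "Masc"}))  # -ova
```

The possessor's gender selects the suffix family, while the agreement gender selects the ending within that family, which is why both layers are needed on the same word.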
In other languages (Hebrew, Arabic), the possessor’s gender and number are agreement rather than lexical features:
Examples: [he] HKPH FL HARC (perimeter of country).
Features of the two nouns are as follows:
The [psor] features of perimeter are dictated by agreement with the possessor, country.
(This is a partial description of this example. HKPH has many morphological analyses: some are masculine and single-layered, others are feminine and single-layered. The right morphosyntactic analysis can be found only by detecting the two layers of agreement features and identifying this specific agreement pattern.)
Possessive adjectives and pronouns may have two different numbers: that of the
possessed object (number agreement with modified noun) and that of the possessor.
The Number[psor] feature captures the possessor’s number.
Sing: singular possessor
my, his, her, its;
Plur: plural possessor
The possessor’s person is marked e.g. on Hungarian nouns. These noun forms would be translated to English as possessive pronoun + noun.
Note that it is reasonable to make this a layered feature even though
the default Person is normally not
marked on nouns. In relation to verbs (which may have to mark person
agreement with nouns), a noun is almost always in the third person.
So even if this default person is not explicitly marked morphologically,
and probably the default
Person does not appear among features of
the noun, we should not use the default layer of persons to mark the
possessor. If we abused the default layer, the annotation would no longer
be parallel to personal pronouns that could be substituted for the noun.
On the other hand, we probably do not want a separate Person[psor] layer
for the person of possessive determiners / pronouns.
They usually modify nouns, not verbs, and agreement with verbs does
not play any role. Arguably they have only one Person
feature and it is lexical (while for the Hungarian nouns,
Person[psor] is inflectional).
Moreover, in some languages possessive pronouns are actually identical
to personal pronouns in the genitive case,
and it is logical that they have the same
Person as in the nominative.
1: first person possessor
Examples: [hu] kutya = dog; kutyám = my dog; kutyánk = our dog.
2: second person possessor
Examples: [hu] kutya = dog; kutyád = your.Sing dog; kutyátok = your.Plur dog.
3: third person possessor
Examples: [hu] kutya = dog; kutyája = his/her/its dog; kutyájuk = their dog.
lit. John his-bone
lit. John his-bones
Péternek sok pénze van.
lit. to-Peter much his-money there-is
Peter has a lot of money.
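The Hungarian kutya forms listed above for the three values of Person[psor] can be paired with the possessor's number and written out as feature strings. The sketch below is illustrative only: the names `POSSESSED_FORMS` and `to_feats` are hypothetical, only the possessor layer is shown (the noun's own features are omitted), and the serialization assumes the CoNLL-U convention of `Name=Value` pairs joined by `|` and sorted by feature name.

```python
# Possessor features of the Hungarian forms glossed in the text.
# Only the [psor] layer is listed; other features of the noun are omitted.
POSSESSED_FORMS = {
    "kutyám":   {"Person[psor]": "1", "Number[psor]": "Sing"},  # my dog
    "kutyánk":  {"Person[psor]": "1", "Number[psor]": "Plur"},  # our dog
    "kutyád":   {"Person[psor]": "2", "Number[psor]": "Sing"},  # your.Sing dog
    "kutyátok": {"Person[psor]": "2", "Number[psor]": "Plur"},  # your.Plur dog
    "kutyája":  {"Person[psor]": "3", "Number[psor]": "Sing"},  # his/her/its dog
    "kutyájuk": {"Person[psor]": "3", "Number[psor]": "Plur"},  # their dog
}


def to_feats(features):
    """Serialize a feature dict as a |-separated string, sorted by feature name."""
    if not features:
        return "_"
    pairs = sorted(features.items(), key=lambda item: item[0].lower())
    return "|".join(f"{name}={value}" for name, value in pairs)


print(to_feats(POSSESSED_FORMS["kutyánk"]))  # Number[psor]=Plur|Person[psor]=1
```

Keeping the possessor's person in its own layer leaves the plain Person feature free, which is the point made above about not abusing the default layer.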
The Number[psee] feature seems to be very specific to Hungarian. It denotes the number of the possessee (the possessed, owned noun phrase). Hungarian has three types of number in the nominal inflection:
- The number of the noun (inflectional, non-agreement).
- The number of owners that own the noun (inflectional, agreement with possessor that may or may not be pronounced).
- The number of the context-given referent, which is some possession of the noun, i.e. belongs to the noun (anaphoric possessive; in a sense, this is an agreement feature, but the head noun is not pronounced in the sentence).
Examples from the Multext-East Hungarian lexicon:
- könnyedén (SSS)
- könny = a tear (singular)
- könnyed = your tear (singular owner)
- könnyedé = (possession) of your tear (singular possession)
- könnyedén = (on the possession) of your tear (superessive case)
- ellenfeleié (PSS)
- ellenfél = an opponent (singular)
- ellenfele = his/her/its opponent (singular owner)
- ellenfelei = his/her/its opponents (core plural, singular owner)
- ellenfeleié = (possession) of his/her/its opponents (singular possession)
- életeké (SPS)
- él = point (singular)
- élek = points (plural)
- élén = his/her/its point (singular owner)
- élünk = our point (plural owner)
- életeké = (possession) of our point (singular possession)
- tárgyalópartnereinkét (PPS)
- tárgyalópartner = negotiator (singular)
- tárgyalópartnerei = his/her/its negotiators (plural, singular owner)
- tárgyalópartnereinkét = (possession) of our negotiators (plural, plural owner, singular possession, accusative case)
Words marked for plural possessions are very rare, though. Note that in the following example from Multext-East, Columbus is marked for plural possession, but not for his own owner.
- Kolumbusz = Columbus (singular)
- Kolumbuszéi = (possessions) of Columbus (plural possession)
- Kolumbuszéinál = (at the possessions) of Columbus (adessive case)
See also Éva Dékány (2014): The syntax of anaphoric possessives in Hungarian: In anaphoric possessives the possessed noun, the head of the whole nominal phrase, is not pronounced, and its reference has to be recovered from the context. The possessor in Hungarian anaphoric possessives has to bear the -é suffix.
Since Number[psee]=Plur is extremely rare, this feature is not so important
for distinguishing singular and plural possessions. However, the mere presence
of Number[psee]=Sing signals that there is the -é suffix and thus that
there is an unpronounced possession.
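The three-letter codes attached to the Multext-East examples above (SSS, PSS, SPS, PPS) appear to list the number of the noun, of the owner and of the possession, in that order. Under that reading, which is inferred from the glosses and is an assumption of this sketch, the codes can be expanded into the three number layers. The names `NUMBER_LAYERS` and `decode_number_code` are hypothetical.

```python
# Assumed order of the letters in a code such as "PSS":
# the noun's own number, the owner's number, the possession's number.
NUMBER_LAYERS = ("Number", "Number[psor]", "Number[psee]")
NUMBER_VALUES = {"S": "Sing", "P": "Plur"}


def decode_number_code(code):
    """Expand a three-letter code into the three layers of Number."""
    if len(code) != len(NUMBER_LAYERS):
        raise ValueError(f"expected {len(NUMBER_LAYERS)} letters, got {code!r}")
    return {layer: NUMBER_VALUES[letter] for layer, letter in zip(NUMBER_LAYERS, code)}


# ellenfeleié: plural noun, singular owner, singular possession
print(decode_number_code("PSS"))
# {'Number': 'Plur', 'Number[psor]': 'Sing', 'Number[psee]': 'Sing'}
```

The sketch handles only words that carry all three layers. A form such as Kolumbuszéi, which has no owner of its own, would simply lack Number[psor] and is not covered here.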
Layered verb agreement in Basque
Some verbs in Basque must agree in person and number with up to three arguments: the absolutive argument (subject of intransitive verbs and object of transitive verbs), the ergative argument (subject of transitive verbs) and the dative argument (indirect object).
We could make the absolutive agreement the default, thus using
Person and Number without layer identifiers.
If there is also one of the other two arguments, we will have
Example: nahi dizkiegu, lemma = nahi_izan,
(we want them to them).
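As a rough illustration of this scheme, the sketch below assembles a layered feature set for a verb that agrees with up to three arguments. The layer identifiers `erg` and `dat` are hypothetical placeholders chosen for this sketch; the actual identifiers must be taken from the language-specific documentation. The person and number values for nahi dizkiegu are read off the English gloss (we = ergative, them = absolutive, to them = dative) and are likewise an assumption.

```python
def layered_agreement(absolutive, ergative=None, dative=None):
    """Build agreement features; each argument is a (person, number) pair.

    The absolutive argument uses the default layer (no identifier).
    The identifiers "erg" and "dat" are hypothetical placeholders.
    """
    person, number = absolutive
    features = {"Person": person, "Number": number}
    for layer, argument in (("erg", ergative), ("dat", dative)):
        if argument is not None:
            person, number = argument
            features[f"Person[{layer}]"] = person
            features[f"Number[{layer}]"] = number
    return features


# nahi dizkiegu, "we want them to them" (values assumed from the gloss)
feats = layered_agreement(
    absolutive=("3", "Plur"),
    ergative=("1", "Plur"),
    dative=("3", "Plur"),
)
pairs = sorted(feats.items(), key=lambda item: item[0].lower())
print("|".join(f"{name}={value}" for name, value in pairs))
# Number=Plur|Number[dat]=Plur|Number[erg]=Plur|Person=3|Person[dat]=3|Person[erg]=1
```

An intransitive verb would be described by calling the function with the absolutive argument only, in which case no layer identifiers appear at all.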