This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

NOUN: noun

Definition

Nouns are a part of speech typically denoting a person, place, thing, animal or idea.
The POS tag NOUN is used only for common nouns.
Proper nouns are annotated as PROPN and pronouns as PRON.


Treebank Statistics (UD_Estonian)

There are 14039 NOUN lemmas (50%), 26541 NOUN types (52%) and 60337 NOUN tokens (26%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
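
The three counts above can be reproduced from a treebank file. The following is a minimal sketch, assuming the standard 10-column, tab-separated CoNLL-U layout (ID, FORM, LEMMA, UPOSTAG, ...) and assuming that a "type" means a distinct word form exactly as written. The sample sentence and the function name `pos_counts` are hypothetical and are not taken from UD_Estonian.

```python
# Tiny hand-made CoNLL-U-style sample (hypothetical, not real treebank data).
SAMPLE = """\
1\taasta\taasta\tNOUN\t_\tCase=Nom|Number=Sing\t0\troot\t_\t_
2\taastal\taasta\tNOUN\t_\tCase=Ade|Number=Sing\t1\tnmod\t_\t_
3\tkord\tkord\tADV\t_\t_\t1\tadvmod\t_\t_
4\taastal\taasta\tNOUN\t_\tCase=Ade|Number=Sing\t1\tnmod\t_\t_
"""


def pos_counts(conllu_text, upos="NOUN"):
    """Return (distinct lemmas, distinct forms, tokens) for one POS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and malformed rows
        if cols[3] == upos:
            lemmas.add(cols[2])  # LEMMA column
            types.add(cols[1])   # FORM column
            tokens += 1
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE))
```

On the sample this counts one lemma, two types and three tokens for NOUN. Whether the published figures fold case or treat punctuation-bearing forms specially is not stated here, so the sketch makes no such adjustment.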

The 10 most frequent NOUN lemmas: aasta, inimene, aeg, riik, mees, töö, sisse_tulek, eba_võrdsus, kord, asi

The 10 most frequent NOUN types: aasta, aastal, sissetulekute, raha, aastat, krooni, korda, osa, mees, ajal

The 10 most frequent ambiguous lemmas: kord (NOUN 256, ADV 44), pea (NOUN 118, ADV 20), koer (NOUN 72, ADJ 2), noor (ADJ 75, NOUN 62), pool (NUM 98, NOUN 49, ADV 20, ADP 17), vahe (NOUN 42, ADP 1), vara (NOUN 42, ADV 7), paik (NOUN 39, ADP 1), jõud (NOUN 33, VERB 4), laul (NOUN 32, VERB 2)

The 10 most frequent ambiguous types: krooni (NOUN 168, VERB 1), ajal (NOUN 133, ADP 54), tee (NOUN 49, VERB 24), aja (NOUN 51, VERB 2), kätte (NOUN 50, ADP 20, ADV 10), elus (NOUN 41, ADJ 13), pea (NOUN 36, AUX 29, ADV 16, VERB 12), teed (NOUN 38, VERB 12), käsi (NOUN 32, VERB 2), korral (ADP 54, NOUN 29)

Morphology

The form / lemma ratio of NOUN is 1.890519 (the average of all parts of speech is 1.839644).
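
The reported ratio appears consistent with dividing the number of NOUN types by the number of NOUN lemmas given in the treebank statistics. A short check, assuming that is how the figure was derived (the variable names are illustrative):

```python
# Counts copied from the Treebank Statistics section of this page.
noun_types = 26541   # distinct NOUN word forms
noun_lemmas = 14039  # distinct NOUN lemmas

# Assumption: "form / lemma ratio" means types divided by lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")
```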

The highest number of forms (22) was observed with the lemma “riik”: riigi, riigi-, riigid, riigiga, riigiks, riigil, riigile, riigilt, riigina, riigini, riigis, riigist, riik, riike, riiki, riikide, riikidega, riikidel, riikidele, riikides, riikidesse, riikidest.

The second highest number of forms (21) was observed with the lemma “aasta”: aasta, aastad, aastaga, aastaid, aastail, aastaks, aastal, aastale, aastalgi, aastani, aastas, aastasse, aastast, aastat, aastate, aastatega, aastateks, aastatel, aastateni, aastatest, aastatki.

The third highest number of forms (19) was observed with the lemma “inimene”: inimene, inimese, inimesed, inimesega, inimesel, inimesele, inimeselt, inimesesse, inimesest, inimesi, inimest, inimeste, inimestega, inimestel, inimestele, inimestelt, inimestes, inimestesse, inimestest.

NOUN occurs with 10 features: Number (59475; 99% instances), Case (59466; 99% instances), Abbr (1100; 2% instances), Hyph (203; 0% instances), VerbForm (128; 0% instances), Tense (124; 0% instances), Voice (124; 0% instances), PronType (5; 0% instances), NumForm (3; 0% instances), Person (1; 0% instances)

NOUN occurs with 29 feature-value pairs: Abbr=Yes, Case=Abe, Case=Abl, Case=Add, Case=Ade, Case=All, Case=Com, Case=Ela, Case=Ess, Case=Gen, Case=Ill, Case=Ine, Case=Nom, Case=Par, Case=Ter, Case=Tra, Hyph=Yes, NumForm=Letter, NumForm=Roman, Number=Plur, Number=Sing, Person=3, PronType=Prs, PronType=Tot, Tense=Past, Tense=Pres, VerbForm=Part, Voice=Act, Voice=Pass

NOUN occurs with 99 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing (12416 tokens). Examples: mees, inimene, naine, asi, riik, aeg, osa, president, enamik, ebavõrdsus
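
A feature combination such as Case=Nom|Number=Sing is a list of Feature=Value pairs joined by `|`. The sketch below splits such a string into a dictionary; the helper name `parse_feats` is hypothetical, and treating `_` as "no features" follows the CoNLL-U convention for empty fields.

```python
def parse_feats(feats):
    """Split a 'Feat=Val|Feat=Val' string into a {feature: value} dict."""
    if not feats or feats == "_":
        return {}  # "_" marks an empty FEATS field in CoNLL-U
    return dict(pair.split("=", 1) for pair in feats.split("|"))


combo = parse_feats("Case=Nom|Number=Sing")
print(combo["Case"], combo["Number"])
```

Counting how often each full string occurs (for example with `collections.Counter`) would give per-combination frequencies of the kind reported above.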

Relations

NOUN nodes are attached to their parents using 25 different relations: nmod (29739; 49% instances), nsubj (9699; 16% instances), dobj (9471; 16% instances), conj (4113; 7% instances), root (1819; 3% instances), nsubj:cop (1779; 3% instances), appos (1401; 2% instances), xcomp (679; 1% instances), advcl (534; 1% instances), parataxis (334; 1% instances), advmod:quant (286; 0% instances), dep (153; 0% instances), ccomp (79; 0% instances), vocative (62; 0% instances), acl:relcl (51; 0% instances), foreign (38; 0% instances), acl (32; 0% instances), amod (19; 0% instances), list (19; 0% instances), compound (8; 0% instances), name (8; 0% instances), csubj (6; 0% instances), cc:preconj (4; 0% instances), cc (3; 0% instances), compound:prt (1; 0% instances)

Parents of NOUN nodes belong to 15 different parts of speech: VERB (35580; 59% instances), NOUN (16341; 27% instances), ADJ (3797; 6% instances), ROOT (1819; 3% instances), PROPN (1592; 3% instances), ADV (399; 1% instances), NUM (343; 1% instances), PRON (332; 1% instances), AUX (56; 0% instances), ADP (45; 0% instances), SYM (17; 0% instances), X (8; 0% instances), CONJ (4; 0% instances), INTJ (2; 0% instances), SCONJ (2; 0% instances)

22927 (38%) NOUN nodes are leaves.

23637 (39%) NOUN nodes have one child.

8248 (14%) NOUN nodes have two children.

5525 (9%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 26.
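
As a consistency check, the four child-degree counts above should add up to the total number of NOUN tokens (60337), and the rounded shares should match the percentages quoted. A small sketch, with the dictionary keys chosen for illustration only:

```python
# Counts copied from the child-degree statements above.
degree_counts = {"0": 22927, "1": 23637, "2": 8248, "3+": 5525}

total = sum(degree_counts.values())

# Percentage of NOUN nodes in each bucket, rounded to whole percent.
shares = {k: round(100 * v / total) for k, v in degree_counts.items()}

print(total)
print(shares)
```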

Children of NOUN nodes are attached using 34 different relations: nmod (16309; 26% instances), amod (11848; 19% instances), punct (5774; 9% instances), conj (4015; 6% instances), case (3999; 6% instances), acl (3001; 5% instances), cc (2952; 5% instances), det (2889; 5% instances), advmod (2596; 4% instances), nummod (2542; 4% instances), acl:relcl (1330; 2% instances), cop (1160; 2% instances), nsubj:cop (1094; 2% instances), nmod:poss (843; 1% instances), mark (665; 1% instances), advmod:quant (655; 1% instances), appos (419; 1% instances), parataxis (337; 1% instances), advcl (191; 0% instances), cc:preconj (126; 0% instances), dep (66; 0% instances), xcomp (45; 0% instances), nsubj (44; 0% instances), csubj:cop (35; 0% instances), discourse (31; 0% instances), foreign (31; 0% instances), dobj (30; 0% instances), compound:prt (11; 0% instances), name (10; 0% instances), compound (9; 0% instances), vocative (7; 0% instances), list (6; 0% instances), aux (2; 0% instances), ccomp (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (16341; 26% instances), ADJ (14628; 23% instances), PUNCT (5774; 9% instances), PRON (5173; 8% instances), PROPN (4403; 7% instances), ADP (4018; 6% instances), VERB (3313; 5% instances), ADV (3165; 5% instances), CONJ (2947; 5% instances), NUM (2584; 4% instances), SCONJ (640; 1% instances), SYM (36; 0% instances), INTJ (31; 0% instances), X (18; 0% instances), AUX (2; 0% instances)

