This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

NOUN: noun

Definition

Nouns are a part of speech typically denoting a person, place, thing, animal or idea.

The NOUN tag is intended for common nouns only. See PROPN for proper nouns and PRON for pronouns.

Czech nouns have the lexical feature cs-feat/Gender and inflect for cs-feat/Number and cs-feat/Case.

A verbal noun can be derived productively from almost every verb (e.g. dělat “to do” → dělání “doing”). In other languages a corresponding form may be called a gerund and tagged VERB, but in Czech it is tagged NOUN. It is always neuter and has the full number-case inflectional paradigm.
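As a rough illustration of how a verbal noun might look in an annotated file, the sketch below parses one hand-written CoNLL-U token line for dělání. The line is an assumption made up for this example (the case, number, head and relation are invented, and the helper name parse_token is hypothetical); only the NOUN tag and the neuter gender follow from the text above.

```python
# Hypothetical CoNLL-U token line (tab-separated, 10 columns); not quoted from any treebank.
LINE = "1\tdělání\tdělání\tNOUN\t_\tCase=Nom|Gender=Neut|Negative=Pos|Number=Sing\t0\troot\t_\t_"

# Column names as defined by the CoNLL-U format.
FIELDS = ["id", "form", "lemma", "upostag", "xpostag",
          "feats", "head", "deprel", "deps", "misc"]


def parse_token(line):
    """Split one CoNLL-U token line into a dict; FEATS becomes a nested dict."""
    token = dict(zip(FIELDS, line.split("\t")))
    feats = token["feats"]
    # "_" marks an empty feature set; otherwise pairs are separated by "|".
    token["feats"] = {} if feats == "_" else dict(
        pair.split("=", 1) for pair in feats.split("|")
    )
    return token


token = parse_token(LINE)
print(token["form"], token["upostag"], token["feats"]["Gender"])
```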

Examples


Treebank Statistics (UD_Czech)

There are 17839 NOUN lemmas (30%), 40084 NOUN types (31%) and 372366 NOUN tokens (25%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
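The three counts above can be read as distinct lemmas, distinct word forms, and running tokens that carry the tag. The following is a minimal sketch of that counting, assuming types are case-sensitive distinct forms; the sample triples and the function name tag_counts are made up for illustration and are not treebank data.

```python
# Hypothetical (form, lemma, upos) triples standing in for treebank tokens.
TOKENS = [
    ("roku", "rok", "NOUN"),
    ("let", "rok", "NOUN"),
    ("roku", "rok", "NOUN"),
    ("strany", "strana", "NOUN"),
    ("stát", "stát", "VERB"),
    ("stát", "stát", "NOUN"),
]


def tag_counts(tokens, tag):
    """Return (number of lemmas, number of types, number of tokens) for one tag."""
    lemmas, types, n_tokens = set(), set(), 0
    for form, lemma, upos in tokens:
        if upos == tag:
            lemmas.add(lemma)
            types.add(form)
            n_tokens += 1
    return len(lemmas), len(types), n_tokens


print(tag_counts(TOKENS, "NOUN"))
```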

The 10 most frequent NOUN lemmas: rok, strana, člověk, společnost, firma, doba, země, cena, koruna, zákon

The 10 most frequent NOUN types: roku, korun, let, roce, strany, procent, společnosti, době, případě, firmy

The 10 most frequent ambiguous lemmas: stát (VERB 1542, NOUN 1446), den (NOUN 1193, ADJ 1), místo (NOUN 1144, ADP 191, SCONJ 10, ADV 5), bod (NOUN 708, PROPN 3), klub (NOUN 506, PROPN 1), růst (NOUN 353, VERB 149), kontakt (NOUN 331, PROPN 2), tisíc (NUM 539, NOUN 330, ADV 1), televize (NOUN 320, PROPN 3), sto (NOUN 304, NUM 41)

The 10 most frequent ambiguous types: r (NOUN 433, ADV 1, PROPN 1), září (NOUN 449, VERB 2), s (ADP 8728, NOUN 381, PART 21, ADJ 9), místo (NOUN 359, ADP 140, SCONJ 7, ADV 5), stát (NOUN 301, VERB 226), p (NOUN 199, ADJ 4), den (NOUN 303, ADJ 1), m (NOUN 235, ADJ 2, ADP 1), tel (NOUN 273, ADJ 2), klubu (NOUN 218, PROPN 1)
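An ambiguous type is a word form that occurs with more than one tag. One way to find such forms is to keep a per-form tag counter, as in the sketch below; the sample pairs and the function name ambiguous_types are hypothetical, and the counts are not those of the treebank.

```python
from collections import Counter, defaultdict

# Hypothetical (form, upos) pairs; invented for illustration only.
TOKENS = [
    ("místo", "NOUN"), ("místo", "ADP"), ("místo", "NOUN"),
    ("září", "NOUN"), ("září", "NOUN"), ("září", "VERB"),
    ("roku", "NOUN"),
]


def ambiguous_types(tokens):
    """Map each form seen with two or more tags to its (tag, count) list."""
    tags_by_form = defaultdict(Counter)
    for form, upos in tokens:
        tags_by_form[form][upos] += 1
    return {
        form: tags.most_common()  # most frequent tag first
        for form, tags in tags_by_form.items()
        if len(tags) > 1
    }


for form, tags in ambiguous_types(TOKENS).items():
    print(form, tags)
```

The same grouping keyed by lemma instead of form would give the ambiguous lemmas.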

Morphology

The form / lemma ratio of NOUN is 2.246987 (the average of all parts of speech is 2.195930).
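The page does not state how the ratio is computed, but the figure agrees with dividing the number of NOUN types by the number of NOUN lemmas reported above. A small check under that assumption (the function name is hypothetical):

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct forms per lemma."""
    return n_types / n_lemmas


# Counts taken from the UD_Czech statistics above: 40084 types, 17839 lemmas.
ratio = form_lemma_ratio(40084, 17839)
print(f"{ratio:.6f}")  # prints 2.246987
```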

The highest number of forms (18) was observed with the lemma “rok”: l, let, letech, lety, letům, léta, létech, léty, létům, r, roce, rok, roka, rokem, roku, roky, roků, rokům.

The second-highest number of forms (15) was observed with the lemma “milión”: mil, milion, milionech, milionem, milionu, miliony, milionů, milionům, milión, miliónech, miliónem, miliónu, milióny, miliónů, miliónům.

The third-highest number of forms (14) was observed with the lemma “druh”: druh, druha, druhem, druhové, druhu, druhy, druhů, druhům, druzi, druzích, družka, družkou, družku, družky.

NOUN occurs with 9 features: cs-feat/Negative (372366; 100% instances), cs-feat/Gender (371970; 100% instances), cs-feat/Number (363302; 98% instances), cs-feat/Case (362915; 97% instances), cs-feat/Animacy (163546; 44% instances), cs-feat/Abbr (5768; 2% instances), cs-feat/Foreign (1813; 0% instances), cs-feat/Style (137; 0% instances), cs-feat/Aspect (2; 0% instances)
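The percentages in this list appear to be each feature's token count divided by the total number of NOUN tokens (372366), rounded to a whole percent. That is why 1813 tokens with Foreign show up as 0%. A short sketch under that assumption, using the counts reported above:

```python
TOTAL_NOUN_TOKENS = 372366

# Feature counts copied from the list above.
FEATURE_COUNTS = {
    "Negative": 372366,
    "Gender": 371970,
    "Number": 363302,
    "Case": 362915,
    "Animacy": 163546,
    "Abbr": 5768,
    "Foreign": 1813,
    "Style": 137,
    "Aspect": 2,
}


def coverage(counts, total):
    """Percentage of tokens carrying each feature, rounded to an integer."""
    return {name: round(100 * count / total) for name, count in counts.items()}


for name, percent in coverage(FEATURE_COUNTS, TOTAL_NOUN_TOKENS).items():
    print(f"{name}: {percent}%")
```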

NOUN occurs with 22 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Aspect=Perf, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Foreign, Gender=Fem, Gender=Masc, Gender=Neut, Negative=Neg, Negative=Pos, Number=Dual, Number=Plur, Number=Sing, Style=Arch, Style=Coll

NOUN occurs with 201 feature combinations. The most frequent feature combination is Case=Gen|Gender=Fem|Negative=Pos|Number=Sing (29588 tokens). Examples: strany, vlády, společnosti, firmy, práce, republiky, doby, země, rady, banky
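A feature combination is the full FEATS string of a token, with feature=value pairs joined by "|". Counting combinations could look like the sketch below. The sample strings are hypothetical, and the canonical helper is an assumption added so that the same features written in a different order count as one combination.

```python
from collections import Counter

# Hypothetical FEATS values for a few NOUN tokens (invented for illustration).
FEATS = [
    "Case=Gen|Gender=Fem|Negative=Pos|Number=Sing",
    "Number=Sing|Case=Gen|Negative=Pos|Gender=Fem",  # same features, different order
    "Animacy=Inan|Case=Gen|Gender=Masc|Negative=Pos|Number=Sing",
]


def canonical(feats):
    """Sort the feature=value pairs so that equal feature sets compare equal."""
    return "|".join(sorted(feats.split("|")))


combinations = Counter(canonical(f) for f in FEATS)
top, count = combinations.most_common(1)[0]
print(len(combinations), top, count)
```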

Relations

NOUN nodes are attached to their parents using 26 different relations: cs-dep/nmod (189946; 51% instances), cs-dep/dobj (60134; 16% instances), cs-dep/nsubj (53738; 14% instances), cs-dep/conj (27838; 7% instances), cs-dep/root (11507; 3% instances), cs-dep/nsubjpass (6150; 2% instances), cs-dep/appos (4761; 1% instances), cs-dep/iobj (4566; 1% instances), cs-dep/dep (4086; 1% instances), cs-dep/mwe (2321; 1% instances), cs-dep/xcomp (1906; 1% instances), cs-dep/advmod (1579; 0% instances), cs-dep/advcl (1231; 0% instances), cs-dep/acl (705; 0% instances), cs-dep/case (645; 0% instances), cs-dep/ccomp (644; 0% instances), cs-dep/foreign (275; 0% instances), cs-dep/csubj (94; 0% instances), cs-dep/parataxis (89; 0% instances), cs-dep/vocative (59; 0% instances), cs-dep/det:nummod (37; 0% instances), cs-dep/cc (27; 0% instances), cs-dep/csubjpass (21; 0% instances), cs-dep/advmod:emph (4; 0% instances), cs-dep/mark (2; 0% instances), cs-dep/discourse (1; 0% instances)

Parents of NOUN nodes belong to 16 different parts of speech: VERB (176681; 47% instances), NOUN (144052; 39% instances), ADJ (15022; 4% instances), PROPN (12757; 3% instances), ROOT (11507; 3% instances), NUM (4421; 1% instances), ADV (3269; 1% instances), ADP (2339; 1% instances), PRON (1719; 0% instances), SYM (320; 0% instances), PART (120; 0% instances), DET (71; 0% instances), PUNCT (48; 0% instances), CONJ (21; 0% instances), INTJ (15; 0% instances), SCONJ (4; 0% instances)
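Both lists above are tallies over the incoming edge of each NOUN node: the relation label on that edge and the part of speech of the node it points to, with the artificial ROOT standing in for head 0. The sketch below shows that tally for a single sentence. The sentence and its analysis are invented for illustration, and the function name noun_attachments is hypothetical.

```python
from collections import Counter

# Hypothetical four-token sentence in CoNLL-U; the analysis is invented for illustration.
CONLLU = """\
1\tCena\tcena\tNOUN\t_\t_\t3\tnsubj\t_\t_
2\tzákona\tzákon\tNOUN\t_\t_\t1\tnmod\t_\t_
3\troste\trůst\tVERB\t_\t_\t0\troot\t_\t_
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_
"""


def noun_attachments(conllu):
    """Count relations and parent POS tags of NOUN nodes in one sentence."""
    rows = [line.split("\t") for line in conllu.splitlines()
            if line and not line.startswith("#")]
    upos_by_id = {row[0]: row[3] for row in rows}
    upos_by_id["0"] = "ROOT"  # head 0 is the artificial root
    relations, parent_pos = Counter(), Counter()
    for row in rows:
        if row[3] == "NOUN":
            relations[row[7]] += 1             # DEPREL column
            parent_pos[upos_by_id[row[6]]] += 1  # UPOS of the HEAD token
    return relations, parent_pos


relations, parent_pos = noun_attachments(CONLLU)
print(relations, parent_pos)
```

For a whole treebank the same function would be applied sentence by sentence, since token IDs restart in every sentence.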

68688 (18%) NOUN nodes are leaves.

128557 (35%) NOUN nodes have one child.

106503 (29%) NOUN nodes have two children.

68618 (18%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 47.
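The child degree of a node is the number of tokens whose head is that node. The four groups above (leaves, one child, two children, three or more) partition all NOUN tokens, so their counts should add up to the token total. The sketch below shows the bucketing on a hypothetical head map and then checks that sum with the reported figures; the function name and the sample data are assumptions.

```python
from collections import Counter


def child_degree_buckets(heads, node_ids):
    """Bucket nodes by number of children; heads maps token id -> head id."""
    degree = Counter(heads.values())  # how often each id is used as a head
    buckets = Counter()
    for node in node_ids:
        n = degree[node]
        buckets["3+" if n >= 3 else str(n)] += 1
    return buckets


# Hypothetical sentence: tokens 1 and 2 are nouns, token 3 is the root verb.
HEADS = {1: 3, 2: 1, 3: 0, 4: 3}
print(child_degree_buckets(HEADS, [1, 2]))

# Consistency check on the UD_Czech counts reported above.
REPORTED = [68688, 128557, 106503, 68618]
print(sum(REPORTED) == 372366)
```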

Children of NOUN nodes are attached using 34 different relations: cs-dep/amod (150933; 25% instances), cs-dep/nmod (137711; 23% instances), cs-dep/case (113794; 19% instances), cs-dep/punct (40359; 7% instances), cs-dep/det (25982; 4% instances), cs-dep/conj (25774; 4% instances), cs-dep/cc (19341; 3% instances), cs-dep/acl (17612; 3% instances), cs-dep/nummod (17443; 3% instances), cs-dep/advmod:emph (13690; 2% instances), cs-dep/cop (7524; 1% instances), cs-dep/nummod:gov (7179; 1% instances), cs-dep/nsubj (5813; 1% instances), cs-dep/appos (5038; 1% instances), cs-dep/mark (4703; 1% instances), cs-dep/dep (3974; 1% instances), cs-dep/advmod (2410; 0% instances), cs-dep/xcomp (1438; 0% instances), cs-dep/foreign (1041; 0% instances), cs-dep/det:numgov (956; 0% instances), cs-dep/csubj (730; 0% instances), cs-dep/det:nummod (569; 0% instances), cs-dep/advcl (397; 0% instances), cs-dep/parataxis (321; 0% instances), cs-dep/aux (203; 0% instances), cs-dep/dobj (154; 0% instances), cs-dep/neg (140; 0% instances), cs-dep/mwe (44; 0% instances), cs-dep/ccomp (39; 0% instances), cs-dep/discourse (32; 0% instances), cs-dep/vocative (9; 0% instances), cs-dep/auxpass:reflex (3; 0% instances), cs-dep/expl (3; 0% instances), cs-dep/nsubjpass (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: ADJ (153531; 25% instances), NOUN (144052; 24% instances), ADP (112932; 19% instances), PUNCT (40365; 7% instances), DET (27491; 5% instances), VERB (26838; 4% instances), PROPN (26226; 4% instances), NUM (26036; 4% instances), CONJ (22100; 4% instances), ADV (11203; 2% instances), PRON (6319; 1% instances), SCONJ (4747; 1% instances), PART (3107; 1% instances), AUX (203; 0% instances), SYM (168; 0% instances), INTJ (38; 0% instances), X (4; 0% instances)


Treebank Statistics (UD_Czech-CAC)

There are 11136 NOUN lemmas (39%), 23495 NOUN types (37%) and 136182 NOUN tokens (28%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: rok, práce, člověk, doba, úkol, pracovník, oblast, vztah, výroba, výsledek

The 10 most frequent NOUN types: práce, roce, let, práci, roku, oblasti, době, společnosti, hlediska, činnosti

The 10 most frequent ambiguous lemmas: místo (NOUN 366, ADP 50, SCONJ 2), potřeba (NOUN 303, ADJ 14), vedoucí (NOUN 187, ADJ 90), třída (NOUN 176, PROPN 1), stát (VERB 348, NOUN 169), růst (NOUN 104, VERB 58), výše (NOUN 80, ADV 25), ulice (NOUN 75, PROPN 2), doprava (NOUN 68, ADV 2), fakt (NOUN 64, ADV 1)

The 10 most frequent ambiguous types: prací (NOUN 144, ADJ 13), oddělení (NOUN 139, ADJ 1), vedoucí (NOUN 86, ADJ 51), místo (NOUN 99, ADP 42, SCONJ 2), září (NOUN 73, VERB 1), vědomí (NOUN 61, ADJ 1), fronty (NOUN 48, PROPN 1), věda (NOUN 38, VERB 1), růst (NOUN 40, VERB 5), pracích (NOUN 41, ADJ 3)

Morphology

The form / lemma ratio of NOUN is 2.109824 (the average of all parts of speech is 2.206260).

The highest number of forms (11) was observed with the lemma “mistr”: Mistře, mistr, mistra, mistrem, mistrovi, mistrovou, mistrová, mistru, mistry, mistrů, mistři.

The second-highest number of forms (11) was observed with the lemma “řád”: řad, řadem, řadu, řady, řadů, řád, řádech, řádem, řádu, řády, řádů.

The third-highest number of forms (10) was observed with the lemma “den”: den, dne, dnech, dnem, dni, dnu, dny, dní, dnů, dnům.

NOUN occurs with 9 features: cs-feat/Negative (136182; 100% instances), cs-feat/Gender (136143; 100% instances), cs-feat/Number (135047; 99% instances), cs-feat/Case (135026; 99% instances), cs-feat/Animacy (56383; 41% instances), cs-feat/Abbr (982; 1% instances), cs-feat/Foreign (257; 0% instances), cs-feat/Style (8; 0% instances), cs-feat/Aspect (6; 0% instances)

NOUN occurs with 22 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Aspect=Imp, Aspect=Perf, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Foreign, Gender=Fem, Gender=Masc, Gender=Neut, Negative=Neg, Negative=Pos, Number=Dual, Number=Plur, Number=Sing, Style=Coll

NOUN occurs with 127 feature combinations. The most frequent feature combination is Case=Gen|Gender=Fem|Negative=Pos|Number=Sing (12879 tokens). Examples: práce, společnosti, výroby, strany, činnosti, doby, vědy, vody, techniky, teorie

Relations

NOUN nodes are attached to their parents using 25 different relations: cs-dep/nmod (69149; 51% instances), cs-dep/dobj (19980; 15% instances), cs-dep/nsubj (17143; 13% instances), cs-dep/conj (16534; 12% instances), cs-dep/nsubjpass (3004; 2% instances), cs-dep/root (2675; 2% instances), cs-dep/appos (1685; 1% instances), cs-dep/iobj (1367; 1% instances), cs-dep/mwe (1340; 1% instances), cs-dep/dep (962; 1% instances), cs-dep/xcomp (884; 1% instances), cs-dep/advcl (411; 0% instances), cs-dep/advmod (299; 0% instances), cs-dep/acl (255; 0% instances), cs-dep/case (194; 0% instances), cs-dep/ccomp (116; 0% instances), cs-dep/vocative (48; 0% instances), cs-dep/foreign (32; 0% instances), cs-dep/csubj (31; 0% instances), cs-dep/parataxis (28; 0% instances), cs-dep/cop (15; 0% instances), cs-dep/csubjpass (12; 0% instances), cs-dep/det:nummod (10; 0% instances), cs-dep/cc (7; 0% instances), cs-dep/auxpass (1; 0% instances)

Parents of NOUN nodes belong to 16 different parts of speech: NOUN (63413; 47% instances), VERB (57829; 42% instances), ADJ (6476; 5% instances), ROOT (2675; 2% instances), PROPN (1356; 1% instances), ADP (1348; 1% instances), ADV (1103; 1% instances), PRON (781; 1% instances), NUM (572; 0% instances), SYM (550; 0% instances), DET (22; 0% instances), PART (21; 0% instances), CONJ (16; 0% instances), SCONJ (15; 0% instances), AUX (4; 0% instances), INTJ (1; 0% instances)

25243 (19%) NOUN nodes are leaves.

45395 (33%) NOUN nodes have one child.

38278 (28%) NOUN nodes have two children.

27266 (20%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 56.

Children of NOUN nodes are attached using 32 different relations: cs-dep/amod (60288; 26% instances), cs-dep/nmod (50983; 22% instances), cs-dep/case (40482; 18% instances), cs-dep/conj (15422; 7% instances), cs-dep/punct (13327; 6% instances), cs-dep/cc (11088; 5% instances), cs-dep/det (10419; 5% instances), cs-dep/acl (6024; 3% instances), cs-dep/advmod:emph (4274; 2% instances), cs-dep/nummod (4127; 2% instances), cs-dep/cop (2545; 1% instances), cs-dep/nsubj (2080; 1% instances), cs-dep/mark (1848; 1% instances), cs-dep/appos (1594; 1% instances), cs-dep/nummod:gov (1157; 1% instances), cs-dep/dep (867; 0% instances), cs-dep/advmod (823; 0% instances), cs-dep/xcomp (787; 0% instances), cs-dep/det:numgov (358; 0% instances), cs-dep/det:nummod (194; 0% instances), cs-dep/parataxis (175; 0% instances), cs-dep/csubj (174; 0% instances), cs-dep/advcl (126; 0% instances), cs-dep/foreign (81; 0% instances), cs-dep/dobj (66; 0% instances), cs-dep/aux (58; 0% instances), cs-dep/neg (22; 0% instances), cs-dep/mwe (19; 0% instances), cs-dep/expl (17; 0% instances), cs-dep/discourse (10; 0% instances), cs-dep/ccomp (4; 0% instances), cs-dep/vocative (3; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: NOUN (63413; 28% instances), ADJ (61026; 27% instances), ADP (39935; 17% instances), PUNCT (13332; 6% instances), CONJ (11541; 5% instances), DET (10966; 5% instances), VERB (8844; 4% instances), NUM (5425; 2% instances), PROPN (4203; 2% instances), ADV (3911; 2% instances), PRON (2598; 1% instances), SCONJ (1915; 1% instances), SYM (1169; 1% instances), PART (1107; 0% instances), AUX (56; 0% instances), INTJ (1; 0% instances)


Treebank Statistics (UD_Czech-CLTT)

There are 862 NOUN lemmas (32%), 1669 NOUN types (35%) and 11303 NOUN tokens (32%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: jednotka, majetek, položka, závěrka, den, období, záznam, ocenění, závazek, účetnictví

The 10 most frequent NOUN types: jednotky, jednotka, majetku, období, ocenění, účetnictví, položka, závěrky, dni, ustanovení

The most frequent ambiguous lemmas (only 4 are listed): stát (NOUN 40, VERB 7), účetní (ADJ 1467, NOUN 22), provozní (ADJ 17, NOUN 3), místo (NOUN 2, ADP 1)

The 10 most frequent ambiguous types: ustanovení (NOUN 63, ADJ 1), výše (NOUN 35, ADV 6), účetní (ADJ 873, NOUN 21), celkem (ADV 18, NOUN 2), daní (NOUN 2, VERB 1), koupí (NOUN 2, VERB 2), provozní (ADJ 13, NOUN 2), místo (NOUN 1, ADP 1), prostřednictvím (ADP 17, NOUN 1), provozních (ADJ 2, NOUN 1)

Morphology

The form / lemma ratio of NOUN is 1.936195 (the average of all parts of speech is 1.764161).

The highest number of forms (9) was observed with the lemma “jednotka”: jednotce, jednotek, jednotka, jednotkami, jednotkou, jednotku, jednotky, jednotkách, jednotkám.

The second-highest number of forms (9) was observed with the lemma “osoba”: osob, osoba, osobami, osobou, osobu, osoby, osobách, osobám, osobě.

The third-highest number of forms (9) was observed with the lemma “změna”: změn, změna, změnami, změnou, změnu, změny, změnách, změnám, změně.

NOUN occurs with 6 features: cs-feat/Gender (11303; 100% instances), cs-feat/Negative (11303; 100% instances), cs-feat/Case (11245; 99% instances), cs-feat/Number (11245; 99% instances), cs-feat/Animacy (4548; 40% instances), cs-feat/Abbr (27; 0% instances)

NOUN occurs with 16 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Negative=Neg, Negative=Pos, Number=Plur, Number=Sing

NOUN occurs with 57 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Masc|Negative=Pos|Number=Sing (1135 tokens). Examples: majetku, dne, odstavce, zákona, zisku, státu, záznamu, rejstříku, předpisu, podniku

Relations

NOUN nodes are attached to their parents using 19 different relations: cs-dep/nmod (5853; 52% instances), cs-dep/conj (1909; 17% instances), cs-dep/dobj (1397; 12% instances), cs-dep/nsubj (1008; 9% instances), cs-dep/nsubjpass (349; 3% instances), cs-dep/mwe (258; 2% instances), cs-dep/root (169; 1% instances), cs-dep/dep (89; 1% instances), cs-dep/appos (68; 1% instances), cs-dep/acl (64; 1% instances), cs-dep/xcomp (48; 0% instances), cs-dep/advcl (42; 0% instances), cs-dep/iobj (29; 0% instances), cs-dep/cop (7; 0% instances), cs-dep/case (5; 0% instances), cs-dep/advmod (3; 0% instances), cs-dep/auxpass (2; 0% instances), cs-dep/parataxis (2; 0% instances), cs-dep/ccomp (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: NOUN (6306; 56% instances), VERB (3404; 30% instances), ADJ (889; 8% instances), ADP (260; 2% instances), X (172; 2% instances), ROOT (169; 1% instances), ADV (44; 0% instances), NUM (25; 0% instances), PRON (23; 0% instances), SYM (8; 0% instances), DET (2; 0% instances), SCONJ (1; 0% instances)

1601 (14%) NOUN nodes are leaves.

3601 (32%) NOUN nodes have one child.

3486 (31%) NOUN nodes have two children.

2615 (23%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 48.

Children of NOUN nodes are attached using 27 different relations: cs-dep/amod (6074; 28% instances), cs-dep/nmod (4870; 23% instances), cs-dep/case (3420; 16% instances), cs-dep/conj (1864; 9% instances), cs-dep/cc (1371; 6% instances), cs-dep/punct (1273; 6% instances), cs-dep/acl (615; 3% instances), cs-dep/det (592; 3% instances), cs-dep/nummod (284; 1% instances), cs-dep/advmod:emph (231; 1% instances), cs-dep/dep (179; 1% instances), cs-dep/nsubj (136; 1% instances), cs-dep/cop (135; 1% instances), cs-dep/mark (102; 0% instances), cs-dep/appos (68; 0% instances), cs-dep/advmod (51; 0% instances), cs-dep/xcomp (49; 0% instances), cs-dep/parataxis (28; 0% instances), cs-dep/nummod:gov (24; 0% instances), cs-dep/dobj (20; 0% instances), cs-dep/advcl (7; 0% instances), cs-dep/csubj (4; 0% instances), cs-dep/auxpass:reflex (3; 0% instances), cs-dep/ccomp (2; 0% instances), cs-dep/expl (2; 0% instances), cs-dep/det:nummod (1; 0% instances), cs-dep/nsubjpass (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: NOUN (6306; 29% instances), ADJ (6150; 29% instances), ADP (3399; 16% instances), CONJ (1331; 6% instances), PUNCT (1273; 6% instances), VERB (730; 3% instances), X (672; 3% instances), DET (594; 3% instances), NUM (337; 2% instances), ADV (302; 1% instances), PRON (166; 1% instances), SCONJ (114; 1% instances), PART (27; 0% instances), SYM (5; 0% instances)

