
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-CLTT: POS Tags: NOUN

There are 858 NOUN lemmas (30%), 1665 NOUN types (34%) and 11292 NOUN tokens (31%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
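As a rough illustration of how such counts can be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag from CoNLL-U text. The function name `upos_counts` and the tiny `SAMPLE` sentence are hypothetical, and treating types as case-sensitive word forms is an assumption; the script that generated this page may count differently.

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM (assumed case-sensitive)
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Hypothetical four-token sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = účetní jednotky a jednotka",
    "1\túčetní\túčetní\tADJ\t_\t_\t2\tamod\t_\t_",
    "2\tjednotky\tjednotka\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\ta\ta\tCCONJ\t_\t_\t4\tcc\t_\t_",
    "4\tjednotka\tjednotka\tNOUN\t_\t_\t2\tconj\t_\t_",
])

print(upos_counts(SAMPLE, "NOUN"))
```

In the sample, the two NOUN tokens share one lemma but have two distinct forms.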

The 10 most frequent NOUN lemmas: jednotka, majetek, položka, závěrka, den, období, záznam, ocenění, závazek, účetnictví

The 10 most frequent NOUN types: jednotky, jednotka, majetku, období, ocenění, účetnictví, položka, závěrky, dni, ustanovení

The 10 most frequent ambiguous lemmas: jednotka (NOUN 713, X 9), majetek (NOUN 309, X 17), závěrka (NOUN 281, X 1), den (NOUN 238, X 1), období (NOUN 204, X 11), ocenění (NOUN 156, X 5), hodnota (NOUN 125, X 1), osoba (NOUN 117, X 6), přeměna (NOUN 101, X 1), společnost (NOUN 101, X 1)

The 10 most frequent ambiguous types: jednotky (NOUN 295, X 1), jednotka (NOUN 226, X 9), majetku (NOUN 225, X 11), období (NOUN 190, X 11), ocenění (NOUN 141, X 5), ustanovení (NOUN 63, ADJ 1), společnosti (NOUN 79, X 1), položek (NOUN 70, X 4), majetek (NOUN 65, X 17), náklady (NOUN 61, X 15)

Morphology

The form / lemma ratio of NOUN is 1.940559 (the average of all parts of speech is 1.723629).
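The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given above. The short check below assumes that this is how the figure was computed; the variable names are illustrative only.

```python
# Figures taken from the summary at the top of this page.
noun_types = 1665
noun_lemmas = 858

# Assumption: form / lemma ratio = distinct forms (types) / distinct lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")
```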

The lemma with the highest number of forms (9) is “jednotka”: jednotce, jednotek, jednotka, jednotkami, jednotkou, jednotku, jednotky, jednotkách, jednotkám.

The lemma ranked second, also with 9 forms, is “osoba”: osob, osoba, osobami, osobou, osobu, osoby, osobách, osobám, osobě.

The lemma ranked third, also with 9 forms, is “změna”: změn, změna, změnami, změnou, změnu, změny, změnách, změnám, změně.

NOUN occurs with 6 features: Gender (11292; 100% instances), Polarity (11292; 100% instances), Case (11245; 100% instances), Number (11245; 100% instances), Animacy (4548; 40% instances), Abbr (27; 0% instances)

NOUN occurs with 16 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing, Polarity=Neg, Polarity=Pos

NOUN occurs with 57 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|Polarity=Pos (1135 tokens). Examples: majetku, dne, odstavce, zákona, zisku, státu, záznamu, rejstříku, předpisu, podniku
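A feature combination is the full value of the FEATS column, with `Name=Value` pairs separated by `|`. The sketch below splits such a string into a dictionary and finds the most frequent combination in a list of FEATS values. The helper names and the three-item sample list are hypothetical; they are not part of any UD tool.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":  # "_" marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def most_common_combination(feats_column):
    """Return (combination, count) for the most frequent FEATS string."""
    return Counter(feats_column).most_common(1)[0]


# Hypothetical FEATS values for three NOUN tokens.
feats_column = [
    "Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|Polarity=Pos",
    "Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|Polarity=Pos",
    "Case=Nom|Gender=Fem|Number=Sing|Polarity=Pos",
]

combo, count = most_common_combination(feats_column)
print(combo, count)
print(parse_feats(combo)["Case"])
```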

Relations

NOUN nodes are attached to their parents using 20 different relations: nmod (4236; 38% instances), conj (1891; 17% instances), obl (1626; 14% instances), nsubj (1010; 9% instances), obj (806; 7% instances), obl:arg (608; 5% instances), nsubj:pass (347; 3% instances), fixed (258; 2% instances), root (169; 1% instances), dep (89; 1% instances), appos (67; 1% instances), acl:relcl (61; 1% instances), xcomp (48; 0% instances), advcl (44; 0% instances), orphan (18; 0% instances), case (5; 0% instances), acl (3; 0% instances), parataxis (3; 0% instances), iobj (2; 0% instances), ccomp (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: NOUN (6308; 56% instances), VERB (3068; 27% instances), ADJ (1186; 11% instances), ADP (260; 2% instances), X (170; 2% instances), no parent, i.e. the node is the root (169; 1% instances), ADV (46; 0% instances), NUM (25; 0% instances), DET (24; 0% instances), AUX (23; 0% instances), SYM (8; 0% instances), PRON (5; 0% instances)

1082 (10%) NOUN nodes are leaves.

3700 (33%) NOUN nodes have one child.

3779 (33%) NOUN nodes have two children.

2731 (24%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 25.
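The child-degree breakdown above can be reproduced by counting, for each node, how many other nodes name it as their head. The sketch below does this for a single sentence given as `(id, head, upos)` tuples; the function name, the bucket labels and the sample sentence are hypothetical.

```python
from collections import Counter


def child_degree_buckets(tokens, upos):
    """Bucket nodes of one UPOS by number of children.

    tokens: list of (id, head, upos) tuples for ONE sentence,
            where head 0 denotes the artificial root.
    """
    # Number of children per node id = how often the id appears as a head.
    n_children = Counter(head for _, head, _ in tokens)
    buckets = {"leaves": 0, "one": 0, "two": 0, "three_or_more": 0}
    for tok_id, _, tok_upos in tokens:
        if tok_upos != upos:
            continue
        degree = n_children[tok_id]  # Counter returns 0 for unseen ids
        if degree == 0:
            buckets["leaves"] += 1
        elif degree == 1:
            buckets["one"] += 1
        elif degree == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets


# Hypothetical sentence: ADJ(1) -> NOUN(2, root) <- NOUN(4) <- CCONJ(3)
sentence = [(1, 2, "ADJ"), (2, 0, "NOUN"), (3, 4, "CCONJ"), (4, 2, "NOUN")]
print(child_degree_buckets(sentence, "NOUN"))
```

For a whole treebank, the per-sentence dictionaries would be summed; the four buckets reported on this page add up to the 11292 NOUN tokens.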

Children of NOUN nodes are attached using 30 different relations: amod (6118; 28% instances), nmod (4842; 22% instances), case (3420; 16% instances), conj (1859; 9% instances), cc (1365; 6% instances), punct (1343; 6% instances), det (644; 3% instances), acl:relcl (481; 2% instances), nummod (284; 1% instances), advmod:emph (229; 1% instances), dep (179; 1% instances), cop (135; 1% instances), nsubj (135; 1% instances), mark (106; 0% instances), acl (75; 0% instances), appos (67; 0% instances), obl (56; 0% instances), xcomp (49; 0% instances), advmod (45; 0% instances), parataxis (29; 0% instances), nummod:gov (24; 0% instances), orphan (14; 0% instances), advcl (9; 0% instances), obl:arg (9; 0% instances), expl:pass (3; 0% instances), ccomp (2; 0% instances), csubj (2; 0% instances), expl:pv (2; 0% instances), det:nummod (1; 0% instances), nsubj:pass (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (6308; 29% instances), ADJ (6238; 29% instances), ADP (3408; 16% instances), CCONJ (1352; 6% instances), PUNCT (1343; 6% instances), X (722; 3% instances), DET (712; 3% instances), VERB (475; 2% instances), NUM (338; 2% instances), ADV (305; 1% instances), AUX (142; 1% instances), SCONJ (115; 1% instances), PRON (36; 0% instances), PART (29; 0% instances), SYM (5; 0% instances)