This page pertains to UD version 2.

Treebank Statistics: UD_Czech-CLTT: POS Tags: NOUN

There are 848 NOUN lemmas (30%), 1640 NOUN types (34%) and 11062 NOUN tokens (31%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: jednotka, položka, majetek, závěrka, den, období, záznam, závazek, ocenění, účetnictví

The 10 most frequent NOUN types: jednotky, jednotka, majetku, období, ocenění, účetnictví, položka, dni, závěrky, ustanovení

The 10 most frequent ambiguous lemmas: jednotka (NOUN 699, X 9), majetek (NOUN 303, X 17), závěrka (NOUN 276, X 1), den (NOUN 237, X 1), období (NOUN 204, X 11), ocenění (NOUN 151, X 5), hodnota (NOUN 125, X 1), osoba (NOUN 116, X 6), přeměna (NOUN 100, X 1), společnost (NOUN 100, X 1)

The 10 most frequent ambiguous types: jednotky (NOUN 286, X 1), jednotka (NOUN 225, X 9), majetku (NOUN 221, X 11), období (NOUN 190, X 11), ocenění (NOUN 138, X 5), ustanovení (NOUN 63, ADJ 1), společnosti (NOUN 78, X 1), položek (NOUN 70, X 4), majetek (NOUN 64, X 17), náklady (NOUN 58, X 15)

Morphology

The form / lemma ratio of NOUN is 1.933962 (the average of all parts of speech is 1.713272).
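The reported ratio matches the counts given above: 1640 NOUN types divided by 848 NOUN lemmas is approximately 1.933962. The following is a minimal sketch of how such a ratio could be computed from CoNLL-U text. The function name `form_lemma_ratio`, the lowercasing of word forms, and the tiny sample are assumptions made for illustration, not the procedure used to generate this page.

```python
def form_lemma_ratio(conllu_text, upos="NOUN"):
    """Distinct word forms divided by distinct lemmas for one UPOS tag."""
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            forms.add(cols[1].lower())  # assumption: forms compared case-insensitively
            lemmas.add(cols[2])
    return len(forms) / len(lemmas) if lemmas else 0.0


def _row(idx, form, lemma, upos):
    # Hypothetical helper: build one 10-column CoNLL-U token line.
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "0", "dep", "_", "_"])


sample = "\n".join([
    "# text = illustrative sample only",
    _row(1, "jednotka", "jednotka", "NOUN"),
    _row(2, "jednotky", "jednotka", "NOUN"),
    _row(3, "Jednotky", "jednotka", "NOUN"),
    _row(4, "osoba", "osoba", "NOUN"),
    _row(5, "je", "být", "AUX"),
])

ratio = form_lemma_ratio(sample)
```

In the sample, three distinct lowercased forms map to two lemmas, so the ratio is 1.5.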

Three lemmas are tied for the highest number of forms (9).

The lemma “jednotka”: jednotce, jednotek, jednotka, jednotkami, jednotkou, jednotku, jednotky, jednotkách, jednotkám.

The lemma “osoba”: osob, osoba, osobami, osobou, osobu, osoby, osobách, osobám, osobě.

The lemma “změna”: změn, změna, změnami, změnou, změnu, změny, změnách, změnám, změně.

NOUN occurs with 6 features: Gender (11062; 100% instances), Polarity (11062; 100% instances), Case (11016; 100% instances), Number (11016; 100% instances), Animacy (4459; 40% instances), Abbr (27; 0% instances)

NOUN occurs with 16 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing, Polarity=Neg, Polarity=Pos

NOUN occurs with 57 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|Polarity=Pos (1107 tokens). Examples: majetku, dne, odstavce, zákona, zisku, státu, záznamu, rejstříku, předpisu, zápisu

Relations

NOUN nodes are attached to their parents using 19 different relations: nmod (4149; 38% instances), conj (1871; 17% instances), obl (1558; 14% instances), nsubj (988; 9% instances), obj (804; 7% instances), obl:arg (601; 5% instances), nsubj:pass (340; 3% instances), fixed (251; 2% instances), root (170; 2% instances), dep (82; 1% instances), appos (65; 1% instances), acl:relcl (62; 1% instances), xcomp (48; 0% instances), advcl (45; 0% instances), orphan (17; 0% instances), case (5; 0% instances), acl (3; 0% instances), parataxis (2; 0% instances), ccomp (1; 0% instances)
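A relation breakdown of this kind can be sketched as a simple frequency count with rounded percentages. The function name `relation_counts` and the input shape (pairs of UPOS tag and relation label) are assumptions chosen for illustration; the sample rows are made up and are not treebank data.

```python
from collections import Counter


def relation_counts(rows, upos="NOUN"):
    """rows: iterable of (upos, deprel) pairs, one per token.

    Returns (relation, count, rounded percentage) tuples, most frequent first.
    """
    counts = Counter(deprel for tag, deprel in rows if tag == upos)
    total = sum(counts.values())
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


# Illustrative input only.
rows = [("NOUN", "nmod")] * 3 + [("NOUN", "obl"), ("VERB", "root")]
result = relation_counts(rows)
```

Applied to the figures above, the same rounding gives 38% for nmod (4149 of 11062 NOUN tokens).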

Parents of NOUN nodes belong to 12 different parts of speech (counting root nodes, which have no parent, as a separate category): NOUN (6202; 56% instances), VERB (3030; 27% instances), ADJ (1165; 11% instances), ADP (255; 2% instances), no parent, i.e. root (170; 2% instances), X (130; 1% instances), ADV (46; 0% instances), NUM (25; 0% instances), DET (23; 0% instances), SYM (8; 0% instances), PRON (5; 0% instances), AUX (3; 0% instances)

1062 (10%) NOUN nodes are leaves.

3602 (33%) NOUN nodes have one child.

3692 (33%) NOUN nodes have two children.

2706 (24%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 25.
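The four child-degree counts above sum to the 11062 NOUN tokens. A minimal sketch of how such a distribution could be tallied follows; the function name `child_degree_buckets` and the per-token tuple layout `(id, head, upos)` are assumptions for illustration, and the sample sentence is invented.

```python
from collections import Counter


def child_degree_buckets(sentences, upos="NOUN"):
    """sentences: list of sentences, each a list of (id, head, upos) tuples.

    Buckets nodes with the given UPOS tag by number of children: 0, 1, 2, 3+.
    """
    buckets = Counter()
    for sent in sentences:
        # Each token contributes one child to its head.
        n_children = Counter(head for _, head, _ in sent)
        for tok_id, _, tag in sent:
            if tag == upos:
                degree = n_children[tok_id]
                buckets["3+" if degree >= 3 else str(degree)] += 1
    return buckets


# Illustrative sentence only: token 3 is the root (head 0).
sentence = [
    (1, 2, "ADJ"),
    (2, 3, "NOUN"),
    (3, 0, "VERB"),
    (4, 5, "ADP"),
    (5, 3, "NOUN"),
    (6, 5, "NOUN"),
]
buckets = child_degree_buckets([sentence])
```

Here token 6 is a leaf, token 2 has one child, and token 5 has two children.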

Children of NOUN nodes are attached using 30 different relations: amod (6005; 28% instances), nmod (4489; 21% instances), case (3377; 16% instances), conj (1846; 9% instances), cc (1342; 6% instances), punct (1337; 6% instances), det (627; 3% instances), acl:relcl (474; 2% instances), obl (335; 2% instances), nummod (281; 1% instances), advmod:emph (222; 1% instances), dep (177; 1% instances), cop (140; 1% instances), nsubj (136; 1% instances), mark (107; 1% instances), acl (74; 0% instances), appos (65; 0% instances), xcomp (49; 0% instances), advmod (45; 0% instances), nummod:gov (24; 0% instances), parataxis (24; 0% instances), orphan (13; 0% instances), advcl (9; 0% instances), obl:arg (9; 0% instances), expl:pass (3; 0% instances), ccomp (2; 0% instances), csubj (2; 0% instances), expl:pv (2; 0% instances), det:nummod (1; 0% instances), nsubj:pass (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (6202; 29% instances), ADJ (6125; 29% instances), ADP (3366; 16% instances), PUNCT (1337; 6% instances), CCONJ (1328; 6% instances), X (740; 3% instances), DET (694; 3% instances), VERB (467; 2% instances), NUM (332; 2% instances), ADV (300; 1% instances), AUX (143; 1% instances), SCONJ (113; 1% instances), PRON (35; 0% instances), PART (31; 0% instances), SYM (5; 0% instances)