
This page pertains to UD version 2.

Treebank Statistics: UD_Czech-CLTT: POS Tags: NOUN

There are 862 NOUN lemmas (32%), 1669 NOUN types (35%) and 11303 NOUN tokens (32%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
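The three counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the `SAMPLE` fragment is hand-made for illustration, the function name `tag_counts` is hypothetical, and treating word forms case-sensitively when counting types is an assumption.

```python
# Tiny hand-made CoNLL-U fragment (hypothetical, not taken from UD_Czech-CLTT).
SAMPLE = """\
# text = illustrative only
1\tjednotky\tjednotka\tNOUN\t_\t_\t0\troot\t_\t_
2\tmajetku\tmajetek\tNOUN\t_\t_\t1\tnmod\t_\t_
3\tje\tbýt\tAUX\t_\t_\t1\tcop\t_\t_
4\tjednotka\tjednotka\tNOUN\t_\t_\t1\tconj\t_\t_
"""


def tag_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        if cols[3] == upos:
            lemmas.add(cols[2])   # LEMMA column
            types.add(cols[1])    # FORM column (case-sensitive: an assumption)
            tokens += 1
    return len(lemmas), len(types), tokens


counts = tag_counts(SAMPLE, "NOUN")
print(counts)
```

On the sample, two distinct lemmas appear in three distinct word forms across three NOUN tokens.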

The 10 most frequent NOUN lemmas: jednotka, majetek, položka, závěrka, den, období, záznam, ocenění, závazek, účetnictví

The 10 most frequent NOUN types: jednotky, jednotka, majetku, období, ocenění, účetnictví, položka, závěrky, dni, ustanovení

The most frequent ambiguous lemmas (only 4 are listed): stát (NOUN 40, VERB 7), účetní (ADJ 1467, NOUN 22), provozní (ADJ 17, NOUN 3), místo (NOUN 2, ADP 1)

The 10 most frequent ambiguous types: ustanovení (NOUN 63, ADJ 1), výše (NOUN 35, ADV 6), účetní (ADJ 873, NOUN 21), standardy (NOUN 14, X 1), ministerstvo (NOUN 6, X 1), celkem (ADV 18, NOUN 2), daní (NOUN 2, VERB 1), koupí (NOUN 2, VERB 2), ložisko (NOUN 1, X 1), provozní (ADJ 13, NOUN 2)
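An ambiguous type here is a word form observed with more than one tag. A minimal sketch of how such a list could be collected is shown below; the function name `ambiguous_forms` and the `PAIRS` data are hypothetical, and the counts in `PAIRS` are invented for illustration rather than taken from the treebank.

```python
from collections import Counter, defaultdict


def ambiguous_forms(pairs):
    """Map each form seen with more than one UPOS tag to its per-tag counts."""
    by_form = defaultdict(Counter)
    for form, upos in pairs:
        by_form[form][upos] += 1
    # Keep only forms that were observed with at least two different tags.
    return {form: dict(tags) for form, tags in by_form.items() if len(tags) > 1}


# Hypothetical (form, UPOS) observations, not real treebank counts.
PAIRS = [
    ("výše", "NOUN"),
    ("výše", "ADV"),
    ("výše", "NOUN"),
    ("majetku", "NOUN"),
]

result = ambiguous_forms(PAIRS)
print(result)
```

The same function works for ambiguous lemmas if it is fed (lemma, UPOS) pairs instead of (form, UPOS) pairs.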

Morphology

The form / lemma ratio of NOUN is 1.936195 (the average of all parts of speech is 1.766716).
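The reported ratio appears to equal the number of NOUN types divided by the number of NOUN lemmas given at the top of this page. The short check below, which uses only those two figures, is an observation about the arithmetic and not a statement of how the page was generated.

```python
# Figures taken from the NOUN summary on this page.
noun_types = 1669
noun_lemmas = 862

# Average number of distinct word forms per lemma.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")
```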

Three lemmas share the highest number of forms (9). The first is “jednotka”: jednotce, jednotek, jednotka, jednotkami, jednotkou, jednotku, jednotky, jednotkách, jednotkám.

The second is “osoba”: osob, osoba, osobami, osobou, osobu, osoby, osobách, osobám, osobě.

The third is “změna”: změn, změna, změnami, změnou, změnu, změny, změnách, změnám, změně.

NOUN occurs with 6 features: Gender (11303; 100% instances), Polarity (11303; 100% instances), Case (11245; 99% instances), Number (11245; 99% instances), Animacy (4548; 40% instances), Abbr (27; <1% instances)

NOUN occurs with 16 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing, Polarity=Neg, Polarity=Pos

NOUN occurs with 57 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|Polarity=Pos (1135 tokens). Examples: majetku, dne, odstavce, zákona, zisku, státu, záznamu, rejstříku, předpisu, podniku
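A feature combination is written in the CoNLL-U FEATS column as `Name=Value` pairs joined by `|`, with `_` standing for no features. A minimal sketch of splitting such a string into feature-value pairs follows; the helper name `parse_feats` is hypothetical.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dictionary."""
    # An underscore means the token carries no morphological features.
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# The most frequent NOUN feature combination reported above.
combo = "Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|Polarity=Pos"
parsed = parse_feats(combo)
print(parsed)
```

Counting the distinct FEATS strings of NOUN tokens would give the number of feature combinations, and counting the distinct items of the parsed dictionaries would give the number of feature-value pairs.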

Relations

NOUN nodes are attached to their parents using 22 different relations: nmod (4227; 37% instances), conj (1891; 17% instances), obl (1629; 14% instances), obj (1054; 9% instances), nsubj (1008; 9% instances), nsubj:pass (349; 3% instances), obl:arg (298; 3% instances), fixed (258; 2% instances), root (169; 1% instances), dep (89; 1% instances), appos (68; 1% instances), acl (64; 1% instances), obl:agent (59; 1% instances), xcomp (48; <1% instances), advcl (42; <1% instances), orphan (18; <1% instances), iobj (15; <1% instances), cop (7; <1% instances), case (5; <1% instances), aux:pass (2; <1% instances), parataxis (2; <1% instances), ccomp (1; <1% instances)

Parents of NOUN nodes belong to 12 different parts of speech: NOUN (6313; 56% instances), VERB (3088; 27% instances), ADJ (1190; 11% instances), ADP (260; 2% instances), X (174; 2% instances), [none: artificial root node] (169; 1% instances), ADV (46; <1% instances), NUM (25; <1% instances), DET (24; <1% instances), SYM (8; <1% instances), PRON (5; <1% instances), SCONJ (1; <1% instances)

1091 (10%) NOUN nodes are leaves.

3713 (33%) NOUN nodes have one child.

3786 (33%) NOUN nodes have two children.

2713 (24%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 25.
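The child-degree figures above can be derived from the ID and HEAD columns alone. The sketch below shows one way to do it on a single hand-made sentence; `NODES` and `child_degree_buckets` are hypothetical names, and the final line only checks that the four counts reported above add up to the 11303 NOUN tokens.

```python
from collections import Counter

# (ID, HEAD, UPOS) triples for one hypothetical sentence; HEAD 0 is the artificial root.
NODES = [
    (1, 2, "ADJ"),
    (2, 0, "NOUN"),
    (3, 4, "ADP"),
    (4, 2, "NOUN"),
    (5, 2, "PUNCT"),
    (6, 4, "NOUN"),
]


def child_degree_buckets(nodes, upos):
    """Return (Counter of degrees capped at 3, highest degree) for nodes with the given tag."""
    # Number of children per node = how often its ID appears as someone's HEAD.
    n_children = Counter(head for _, head, _ in nodes)
    degrees = [n_children[node_id] for node_id, _, tag in nodes if tag == upos]
    # Bucket 3 stands for "three or more children".
    buckets = Counter(min(d, 3) for d in degrees)
    return buckets, max(degrees)


buckets, highest = child_degree_buckets(NODES, "NOUN")
print(dict(buckets), highest)

# The four reported counts cover every NOUN token exactly once.
reported_total = 1091 + 3713 + 3786 + 2713
print(reported_total)
```

In the sample, node 2 has three children, node 4 has two, and node 6 is a leaf. Note that `max` would raise an error for a tag that does not occur at all, which a fuller implementation would need to handle.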

Children of NOUN nodes are attached using 30 different relations: amod (6117; 29% instances), nmod (4781; 22% instances), case (3420; 16% instances), conj (1859; 9% instances), cc (1364; 6% instances), punct (1290; 6% instances), det (644; 3% instances), acl (556; 3% instances), nummod (284; 1% instances), advmod:emph (231; 1% instances), dep (179; 1% instances), nsubj (138; 1% instances), cop (135; 1% instances), mark (102; <1% instances), appos (68; <1% instances), obl (53; <1% instances), advmod (51; <1% instances), xcomp (49; <1% instances), parataxis (28; <1% instances), nummod:gov (24; <1% instances), obj (15; <1% instances), orphan (14; <1% instances), advcl (7; <1% instances), obl:arg (5; <1% instances), expl:pass (3; <1% instances), ccomp (2; <1% instances), csubj (2; <1% instances), expl:pv (2; <1% instances), det:nummod (1; <1% instances), nsubj:pass (1; <1% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (6313; 29% instances), ADJ (6249; 29% instances), ADP (3399; 16% instances), CCONJ (1321; 6% instances), PUNCT (1290; 6% instances), DET (741; 3% instances), X (672; 3% instances), VERB (482; 2% instances), NUM (338; 2% instances), ADV (305; 1% instances), AUX (133; 1% instances), SCONJ (114; 1% instances), PRON (36; <1% instances), PART (27; <1% instances), SYM (5; <1% instances)