Treebank Statistics: UD_Czech-PDT: POS Tags: NOUN
There are 9039 NOUN lemmas (33%), 18229 NOUN types (34%) and 83173 NOUN tokens (25%).
Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
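Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming forms are compared case-sensitively and that multiword-token ranges and empty nodes are skipped; the `SAMPLE` fragment and the function name `upos_counts` are hypothetical and not taken from UD_Czech-PDT or its tooling.

```python
from collections import defaultdict

# Tiny hypothetical CoNLL-U fragment (not taken from UD_Czech-PDT).
SAMPLE = """\
1\tCena\tcena\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\trostla\trůst\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

1\tCeny\tcena\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tklesly\tklesnout\tVERB\t_\t_\t0\troot\t_\t_
3\tv\tv\tADP\t_\t_\t4\tcase\t_\t_
4\troce\trok\tNOUN\t_\t_\t2\tobl\t_\t_
"""

def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

counts = upos_counts(SAMPLE)
# Rank of NOUN by token count (1 = most frequent tag in the sample).
by_tokens = sorted(counts, key=lambda u: counts[u][2], reverse=True)
noun_rank = by_tokens.index("NOUN") + 1
print(counts["NOUN"], noun_rank)
```

Ranks by lemmas or types follow the same pattern with index 0 or 1 of the tuple as the sort key.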
The 10 most frequent NOUN lemmas: rok, strana, léta, cena, firma, doba, vláda, zákon, společnost, země
The 10 most frequent NOUN types: p, let, roku, korun, roce, Kč, r, strany, firmy, případě
The 10 most frequent ambiguous lemmas: bod (NOUN 338, PROPN 1), stát (VERB 344, NOUN 328), den (NOUN 272, X 1), místo (NOUN 222, ADP 45, ADV 6), a (CCONJ 7162, NOUN 17, X 6), teplo (NOUN 91, ADV 1), pravda (NOUN 69, PART 2), s (ADP 2504, NOUN 27, X 10, PART 6), růst (NOUN 60, VERB 26), x (NOUN 32, SYM 19)
The 10 most frequent ambiguous types: p (NOUN 163, ADJ 2), s (ADP 1960, NOUN 72, X 10, PART 6), a (CCONJ 6945, ADJ 32, NOUN 17, X 6), září (NOUN 102, VERB 2), j (NOUN 9, ADJ 1), bod (NOUN 87, PROPN 1), stát (NOUN 75, VERB 50), den (NOUN 70, X 1), místo (NOUN 69, ADP 34, ADV 4), x (NOUN 32, SYM 19)
- p
- s
- a
  - CCONJ 6945: Je naprosto bezbřehý a nevypočitatelný .
  - ADJ 32: a . s . Malostranské nám . 2 118 00 Praha 1 Tel . / fax : 684 62 55
  - NOUN 17: Ušetříte téměř 90 % proti variantě a ) .
  - X 6: V gigantickém výstavním komplexu Porte de Versailles začal včera v Paříži jeden z největších a nejprestižnějších světových veletrhů módy Pret a Porter .
- září
- j
- bod
- stát
- den
- místo
- x
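An ambiguous lemma or type here is one that occurs with NOUN and with at least one other tag. A minimal sketch of that filter follows; the `observations` list and the function name `ambiguous` are hypothetical, the counts are illustrative only, and the ordering used for the top-10 lists above is not reproduced.

```python
from collections import Counter, defaultdict

# Hypothetical (item, UPOS) observations; the counts are illustrative only.
observations = (
    [("stát", "VERB")] * 3 + [("stát", "NOUN")] * 2
    + [("místo", "NOUN")] * 2 + [("místo", "ADP")]
    + [("rok", "NOUN")] * 4
)

def ambiguous(pairs, upos="NOUN"):
    """Map each item seen with `upos` and at least one other tag to its tag counts."""
    per_item = defaultdict(Counter)
    for item, tag in pairs:
        per_item[item][tag] += 1
    return {
        item: tags.most_common()  # tags sorted by descending frequency
        for item, tags in per_item.items()
        if upos in tags and len(tags) > 1
    }

result = ambiguous(observations)
for item, tags in result.items():
    print(item, ", ".join(f"{tag} {n}" for tag, n in tags))
```

The same function works for lemmas or for word forms, depending on which column supplies the first element of each pair.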
Morphology
The form / lemma ratio of NOUN is 2.016705 (the average of all parts of speech is 1.961704).
The highest number of forms (11) was observed with the lemma “strana”: s, str, stran, strana, stranami, stranou, stranu, strany, stranách, stranám, straně.
The 2nd highest number of forms (10) was observed with the lemma “hodina”: Hodina, h, hod, hodin, hodinami, hodinou, hodinu, hodiny, hodinách, hodině.
The 3rd highest number of forms (10) was observed with the lemma “ministr”: ministr, ministra, ministrem, ministrovi, ministru, ministry, ministrů, ministrům, ministře, ministři.
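The reported form / lemma ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page (18229 / 9039). The sketch below checks that arithmetic and shows one way to rank lemmas by their number of distinct forms; the `pairs` list is hypothetical and much smaller than the real data.

```python
from collections import defaultdict

# Counts reported on this page for NOUN.
noun_types, noun_lemmas = 18229, 9039
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.016705

# Hypothetical (form, lemma) pairs to illustrate counting distinct forms per lemma.
pairs = [("strana", "strana"), ("strany", "strana"), ("straně", "strana"),
         ("strany", "strana"), ("hodin", "hodina"), ("hodinu", "hodina")]
forms = defaultdict(set)
for form, lemma in pairs:
    forms[lemma].add(form)  # repeated forms are counted once

# Lemmas ordered by number of distinct forms, largest first.
ranked = sorted(forms.items(), key=lambda kv: len(kv[1]), reverse=True)
for lemma, lemma_forms in ranked:
    print(lemma, len(lemma_forms), ", ".join(sorted(lemma_forms)))
```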
NOUN occurs with 9 features: Gender (79711; 96% instances), Case (78979; 95% instances), Number (78979; 95% instances), Animacy (34831; 42% instances), VerbForm (5750; 7% instances), Abbr (4056; 5% instances), Style (80; 0% instances), Typo (18; 0% instances), Foreign (1; 0% instances)
NOUN occurs with 23 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing, Style=Coll, Style=Expr, Style=Slng, Style=Vrnc, Typo=Yes, VerbForm=Vnoun
NOUN occurs with 130 feature combinations.
The most frequent feature combination is Case=Gen|Gender=Fem|Number=Sing (6423 tokens).
Examples: strany, práce, vlády, společnosti, firmy, republiky, rady, přímky, doby, obrany
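The three levels of feature statistics above (features, feature-value pairs, full combinations) can all be derived from the FEATS column of a CoNLL-U file. A minimal sketch, assuming the usual `Name=Value|Name=Value` encoding with `_` for an empty feature set; the `feats_column` values and the helper `parse_feats` are hypothetical.

```python
from collections import Counter

# Hypothetical FEATS values for a handful of NOUN tokens ("_" means no features).
feats_column = [
    "Case=Gen|Gender=Fem|Number=Sing",
    "Case=Gen|Gender=Fem|Number=Sing",
    "Animacy=Anim|Case=Nom|Gender=Masc|Number=Plur",
    "Abbr=Yes",
    "_",
]

def parse_feats(feats):
    """Turn a FEATS string into a {feature: value} dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

feature_counts = Counter()  # tokens per feature name, e.g. Case
pair_counts = Counter()     # tokens per feature-value pair, e.g. Case=Gen
combo_counts = Counter()    # tokens per full combination
for feats in feats_column:
    parsed = parse_feats(feats)
    feature_counts.update(parsed.keys())
    pair_counts.update(f"{k}={v}" for k, v in parsed.items())
    combo = "|".join(f"{k}={v}" for k, v in sorted(parsed.items())) or "_"
    combo_counts[combo] += 1

print(feature_counts.most_common())
print(combo_counts.most_common(1))
```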
Relations
NOUN nodes are attached to their parents using 27 different relations: nmod (27075; 33% instances), obl (14235; 17% instances), nsubj (13058; 16% instances), obj (9252; 11% instances), conj (5858; 7% instances), obl:arg (5217; 6% instances), root (2762; 3% instances), nsubj:pass (1419; 2% instances), appos (1018; 1% instances), dep (914; 1% instances), fixed (493; 1% instances), xcomp (470; 1% instances), advcl (344; 0% instances), orphan (298; 0% instances), ccomp (202; 0% instances), acl:relcl (154; 0% instances), case (139; 0% instances), acl (88; 0% instances), iobj (61; 0% instances), csubj (32; 0% instances), flat (29; 0% instances), parataxis (27; 0% instances), vocative (18; 0% instances), csubj:pass (5; 0% instances), advmod (3; 0% instances), amod (1; 0% instances), discourse (1; 0% instances)
Parents of NOUN nodes belong to 17 different parts of speech: VERB (35924; 43% instances), NOUN (33274; 40% instances), ADJ (6318; 8% instances), [no parent; root nodes] (2762; 3% instances), PROPN (1459; 2% instances), NUM (980; 1% instances), ADV (912; 1% instances), ADP (497; 1% instances), DET (355; 0% instances), PRON (218; 0% instances), AUX (155; 0% instances), SYM (128; 0% instances), X (116; 0% instances), PART (62; 0% instances), CCONJ (10; 0% instances), INTJ (2; 0% instances), SCONJ (1; 0% instances)
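Both tables above come from looking at the incoming edge of each NOUN node: the relation label and the part of speech of the head. A minimal sketch on one hypothetical sentence, assuming head 0 denotes the artificial root, which has no part of speech of its own; the `"(root)"` label is an illustrative choice.

```python
from collections import Counter

# One hypothetical sentence as (id, form, upos, head, deprel) tuples.
# Head 0 is the artificial root.
sentence = [
    (1, "Cena", "NOUN", 3, "nsubj"),
    (2, "akcií", "NOUN", 1, "nmod"),
    (3, "vzrostla", "VERB", 0, "root"),
    (4, "v", "ADP", 5, "case"),
    (5, "roce", "NOUN", 3, "obl"),
]

upos_by_id = {tid: upos for tid, _, upos, _, _ in sentence}
relations, parent_pos = Counter(), Counter()
for tid, form, upos, head, deprel in sentence:
    if upos != "NOUN":
        continue
    relations[deprel] += 1
    # A head of 0 has no entry in upos_by_id, so it falls back to "(root)".
    parent_pos[upos_by_id.get(head, "(root)")] += 1

print(relations.most_common())
print(parent_pos.most_common())
```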
13297 (16%) NOUN nodes are leaves.
28956 (35%) NOUN nodes have one child.
24606 (30%) NOUN nodes have two children.
16314 (20%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 17.
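The child-degree breakdown can be sketched by counting, for every node, how many tokens name it as their head, and then bucketing NOUN nodes into 0, 1, 2 and "three or more". The `tokens` list below is a hypothetical sentence, not treebank data.

```python
from collections import Counter

# Hypothetical sentence as (id, upos, head); head 0 is the artificial root.
tokens = [(1, "ADJ", 2), (2, "NOUN", 3), (3, "VERB", 0),
          (4, "ADP", 6), (5, "ADJ", 6), (6, "NOUN", 3), (7, "NOUN", 6)]

# Number of children per node id (ids that never appear as a head count as 0).
children = Counter(head for _, _, head in tokens)

buckets = Counter()
for tid, upos, _ in tokens:
    if upos == "NOUN":
        buckets[min(children[tid], 3)] += 1  # key 3 stands for "three or more"

max_degree = max(children[tid] for tid, upos, _ in tokens if upos == "NOUN")
print(dict(buckets), max_degree)
```

The four bucket counts always sum to the number of NOUN tokens, which matches the figures above: 13297 + 28956 + 24606 + 16314 = 83173.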
Children of NOUN nodes are attached using 36 different relations: amod (33105; 24% instances), nmod (29721; 21% instances), case (25090; 18% instances), punct (9915; 7% instances), det (6573; 5% instances), conj (5678; 4% instances), cc (4480; 3% instances), nummod (3670; 3% instances), advmod:emph (3051; 2% instances), acl:relcl (2878; 2% instances), cop (2305; 2% instances), flat (2288; 2% instances), nsubj (1814; 1% instances), nummod:gov (1790; 1% instances), mark (1199; 1% instances), appos (1028; 1% instances), acl (982; 1% instances), dep (953; 1% instances), advmod (607; 0% instances), obl (498; 0% instances), xcomp (344; 0% instances), orphan (235; 0% instances), det:numgov (214; 0% instances), csubj (184; 0% instances), det:nummod (115; 0% instances), advcl (99; 0% instances), aux (84; 0% instances), parataxis (77; 0% instances), obl:arg (27; 0% instances), discourse (15; 0% instances), fixed (13; 0% instances), ccomp (10; 0% instances), obj (8; 0% instances), flat:foreign (6; 0% instances), vocative (2; 0% instances), expl:pv (1; 0% instances)
Children of NOUN nodes belong to 17 different parts of speech: ADJ (33662; 24% instances), NOUN (33274; 24% instances), ADP (24868; 18% instances), PUNCT (9915; 7% instances), DET (7582; 5% instances), PROPN (6881; 5% instances), NUM (5816; 4% instances), CCONJ (5203; 4% instances), VERB (3978; 3% instances), AUX (2444; 2% instances), ADV (2297; 2% instances), SCONJ (1214; 1% instances), PART (980; 1% instances), X (510; 0% instances), PRON (372; 0% instances), SYM (61; 0% instances), INTJ (2; 0% instances)
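The NOUN entry in this list (33274) equals the NOUN entry among the parents of NOUN nodes, as expected: every NOUN-to-NOUN edge is counted once from each end. A minimal sketch of that consistency check follows, using a hypothetical `edges` list of (child part of speech, parent part of speech) pairs.

```python
from collections import Counter

# Hypothetical dependency edges as (child_upos, parent_upos) pairs.
edges = [("ADJ", "NOUN"), ("NOUN", "NOUN"), ("ADP", "NOUN"),
         ("NOUN", "VERB"), ("NOUN", "NOUN"), ("PUNCT", "VERB")]

# Part of speech of each child of a NOUN node.
children_of_noun = Counter(child for child, parent in edges if parent == "NOUN")
# Part of speech of each parent of a NOUN node.
parents_of_noun = Counter(parent for child, parent in edges if child == "NOUN")

# NOUN-to-NOUN edges appear once in each table, so the two cells agree.
assert children_of_noun["NOUN"] == parents_of_noun["NOUN"]
print(children_of_noun.most_common())
print(parents_of_noun.most_common())
```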