
This page pertains to UD version 2.

Treebank Statistics: UD_Belarusian-HSE: POS Tags: NOUN

There are 9109 NOUN lemmas (30%), 18525 NOUN types (35%) and 72552 NOUN tokens (24%). Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
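Counts of this kind can be recomputed from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` is hypothetical, the sample rows are toy data rather than treebank content, and it assumes that "types" means distinct word forms compared case-sensitively, which this page does not state.

```python
def pos_counts(conllu_text, upos="NOUN"):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)    # assumption: forms are compared case-sensitively
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


# Toy rows for illustration only (not taken from UD_Belarusian-HSE).
TOY_ROWS = "\n".join("\t".join(row) for row in [
    ["1", "год", "год", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["2", "года", "год", "NOUN", "_", "_", "1", "nmod", "_", "_"],
    ["3", "быў", "быць", "AUX", "_", "_", "1", "cop", "_", "_"],
    ["4", "года", "год", "NOUN", "_", "_", "1", "nmod", "_", "_"],
])

print(pos_counts(TOY_ROWS))
```

On the toy rows, the three NOUN tokens share one lemma and two distinct forms.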

The 10 most frequent NOUN lemmas: год, чалавек, дзень, час, мова, беларус, гурт, сядзіба, варта, арт

The 10 most frequent NOUN types: дзень, людзей, чалавек, арт, годзе, гадоў, час, людзі, года, год

The 10 most frequent ambiguous lemmas: год (NOUN 1642, PRON 1), варта (NOUN 315, VERB 61), справа (NOUN 264, ADV 5), тысяча (NOUN 210, ADV 1), раз (NOUN 176, SCONJ 1), частка (NOUN 144, ADV 2), жанчына (NOUN 136, ADJ 1), імя (NOUN 108, ADP 1), клуб (NOUN 97, SCONJ 1), праўда (NOUN 80, ADV 1)

The 10 most frequent ambiguous types: г. (NOUN 150, ADV 18, ADJ 1, PRON 1), варта (NOUN 58, VERB 52), свабода (NOUN 8, PROPN 1), раз (NOUN 84, SCONJ 1), імя (NOUN 79, ADP 1), варты (NOUN 51, ADJ 3), BYN (NOUN 36, X 1), справа (NOUN 30, ADV 2), ахвяраў (NOUN 32, VERB 1), дома (ADV 39, NOUN 24)

Morphology

The form / lemma ratio of NOUN is 2.033703 (the average of all parts of speech is 1.753662).
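The NOUN ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas reported above. A quick check, using only the figures from this page:

```python
noun_types = 18525   # NOUN types reported on this page
noun_lemmas = 9109   # NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.033703
```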

The highest number of forms (19) was observed with the lemma “чалавек”: людей, людзi, людзей, людзмі, людзт, людзьмі, людзям, людзямі, людзях, людзі, чал., чалаве, чалавек, чалавека, чалавекам, чалавекаў, чалавеку, чалавекі, чалавеча.

The 2nd highest number of forms (18) was observed with the lemma “год”: г, г., гадамі, гадах, гадоу, гадох, гадоў, гады, гг, гг., го, год, года, годам, годдзе, годзе, году, годы.

The 3rd highest number of forms (16) was observed with the lemma “улада”: улада, уладай, уладам, уладамі, уладаў, уладзе, уладу, улады, ўлад, ўлада, ўладай, ўладамі, ўладаў, ўладзе, ўладу, ўлады.

NOUN occurs with 14 features: Number (71319; 98% instances), Case (71318; 98% instances), Gender (71309; 98% instances), Animacy (71308; 98% instances), Abbr (1016; 1% instances), Foreign (194; 0% instances), Typo (36; 0% instances), Degree (12; 0% instances), Person (2; 0% instances), Aspect (1; 0% instances), Mood (1; 0% instances), Tense (1; 0% instances), VerbForm (1; 0% instances), Voice (1; 0% instances)

NOUN occurs with 24 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Aspect=Imp, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Degree=Pos, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Mood=Ind, Number=Plur, Number=Sing, Person=3, Tense=Past, Typo=Yes, VerbForm=Fin, Voice=Act

NOUN occurs with 107 feature combinations. The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing (6269 tokens). Examples: года, году, сакавіка, красавіка, лістапада, траўня, гурта, жніўня, дня, часу

Relations

NOUN nodes are attached to their parents using 33 different relations: nmod (20048; 28% instances), obl (13485; 19% instances), nsubj (10659; 15% instances), obj (8748; 12% instances), conj (5762; 8% instances), root (5446; 8% instances), appos (1610; 2% instances), flat (1284; 2% instances), iobj (1247; 2% instances), parataxis (1074; 1% instances), nsubj:pass (781; 1% instances), xcomp (499; 1% instances), compound (399; 1% instances), list (337; 0% instances), obl:agent (243; 0% instances), fixed (207; 0% instances), nummod (146; 0% instances), orphan (131; 0% instances), vocative (119; 0% instances), ccomp (107; 0% instances), advcl (70; 0% instances), acl:relcl (40; 0% instances), acl (31; 0% instances), nummod:gov (29; 0% instances), csubj (23; 0% instances), flat:foreign (8; 0% instances), dep (5; 0% instances), dislocated (4; 0% instances), discourse (3; 0% instances), flat:name (3; 0% instances), reparandum (2; 0% instances), case (1; 0% instances), goeswith (1; 0% instances)

Parents of NOUN nodes belong to 18 different parts of speech: VERB (33024; 46% instances), NOUN (27138; 37% instances), [root, no parent] (5446; 8% instances), ADJ (3071; 4% instances), PROPN (1246; 2% instances), ADV (812; 1% instances), PRON (450; 1% instances), DET (327; 0% instances), NUM (323; 0% instances), X (266; 0% instances), ADP (184; 0% instances), SYM (137; 0% instances), AUX (61; 0% instances), PART (35; 0% instances), SCONJ (18; 0% instances), INTJ (9; 0% instances), CCONJ (3; 0% instances), PUNCT (2; 0% instances)
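The relation and parent-tag tallies can be sketched as two counters over the DEPREL and HEAD columns of a CoNLL-U file. This is an illustrative sketch under stated assumptions: `attachment_stats` is a hypothetical name, the sentence is toy data, sentences are assumed to be separated by blank lines with `\n` line endings, and a node whose HEAD is 0 (the artificial root) is counted under an empty parent tag, since the root has no part of speech.

```python
from collections import Counter


def attachment_stats(conllu_text, upos="NOUN"):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`."""
    relations, parent_tags = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = []
        for line in block.splitlines():
            cols = line.split("\t")
            if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
                continue  # comments, multiword-token ranges, empty nodes
            rows.append(cols)
        tag_by_id = {cols[0]: cols[3] for cols in rows}
        for cols in rows:
            if cols[3] != upos:
                continue
            relations[cols[7]] += 1
            # HEAD 0 is the artificial root; it has no tag, so use "".
            parent_tags[tag_by_id.get(cols[6], "")] += 1
    return relations, parent_tags


# Toy sentence for illustration only (not taken from UD_Belarusian-HSE).
TOY_SENTENCE = "\n".join("\t".join(row) for row in [
    ["1", "Стары", "стары", "ADJ", "_", "_", "2", "amod", "_", "_"],
    ["2", "чалавек", "чалавек", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "чытае", "чытаць", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", "кнігу", "кніга", "NOUN", "_", "_", "3", "obj", "_", "_"],
    [".", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"][:0] or
    ["5", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
])

relations, parent_tags = attachment_stats(TOY_SENTENCE)
print(relations)
print(parent_tags)
```

In the toy sentence both nouns attach to the verb, one as `nsubj` and one as `obj`.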

13341 (18%) NOUN nodes are leaves.

24591 (34%) NOUN nodes have one child.

20396 (28%) NOUN nodes have two children.

14224 (20%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 13.
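The child-degree breakdown (leaves, one child, two children, three or more) amounts to a histogram of how often each NOUN node's ID appears in the HEAD column of its sentence. A minimal sketch, assuming blank-line-separated sentences with `\n` line endings; `child_degrees` is a hypothetical name and the sentence is toy data:

```python
from collections import Counter


def child_degrees(conllu_text, upos="NOUN"):
    """Return a Counter mapping number of children -> number of `upos` nodes."""
    histogram = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        tags, heads = {}, []
        for line in block.splitlines():
            cols = line.split("\t")
            if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
                continue  # comments, multiword-token ranges, empty nodes
            tags[cols[0]] = cols[3]
            heads.append(cols[6])
        n_children = Counter(heads)  # node ID -> number of dependents
        for node_id, tag in tags.items():
            if tag == upos:
                histogram[n_children[node_id]] += 1  # missing key counts as 0
    return histogram


# Toy sentence for illustration only (not taken from UD_Belarusian-HSE).
TOY_TREE = "\n".join("\t".join(row) for row in [
    ["1", "Стары", "стары", "ADJ", "_", "_", "2", "amod", "_", "_"],
    ["2", "чалавек", "чалавек", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "чытае", "чытаць", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", "кнігу", "кніга", "NOUN", "_", "_", "3", "obj", "_", "_"],
    ["5", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
])

degrees = child_degrees(TOY_TREE)
print(degrees)
```

Here one noun is a leaf and the other has a single child; the maximum key of the returned counter corresponds to the highest child degree.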

Children of NOUN nodes are attached using 40 different relations: nmod (25576; 22% instances), case (21393; 18% instances), amod (20460; 17% instances), punct (16383; 14% instances), conj (5589; 5% instances), appos (5018; 4% instances), det (5018; 4% instances), cc (3492; 3% instances), parataxis (2219; 2% instances), advmod (1910; 2% instances), acl:relcl (1829; 2% instances), nummod:gov (1715; 1% instances), nsubj (1707; 1% instances), nummod (1410; 1% instances), acl (820; 1% instances), dep (646; 1% instances), compound (413; 0% instances), cop (402; 0% instances), list (364; 0% instances), mark (257; 0% instances), obl (209; 0% instances), iobj (202; 0% instances), expl (138; 0% instances), orphan (133; 0% instances), advcl (71; 0% instances), discourse (67; 0% instances), csubj (60; 0% instances), flat:foreign (51; 0% instances), vocative (43; 0% instances), goeswith (34; 0% instances), flat (9; 0% instances), aux (8; 0% instances), ccomp (4; 0% instances), flat:name (4; 0% instances), dislocated (3; 0% instances), reparandum (3; 0% instances), aux:pass (2; 0% instances), obj (2; 0% instances), xcomp (2; 0% instances), fixed (1; 0% instances)

Children of NOUN nodes belong to 17 different parts of speech: NOUN (27138; 23% instances), ADJ (21240; 18% instances), ADP (21165; 18% instances), PUNCT (16383; 14% instances), PROPN (8417; 7% instances), DET (5177; 4% instances), VERB (3947; 3% instances), NUM (3548; 3% instances), CCONJ (3434; 3% instances), X (2377; 2% instances), ADV (1374; 1% instances), PRON (987; 1% instances), PART (975; 1% instances), SYM (593; 1% instances), SCONJ (463; 0% instances), AUX (423; 0% instances), INTJ (26; 0% instances)