This page pertains to UD version 2.

Treebank Statistics: UD_Erzya-JR: POS Tags: NOUN

There are 1327 NOUN lemmas (40%), 2926 NOUN types (43%) and 5120 NOUN tokens (25%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
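As a rough illustration of how lemma, type and token counts like these could be reproduced from a CoNLL-U file, the sketch below counts distinct lemmas, distinct word forms and total occurrences for one tag. The names `iter_words`, `pos_counts` and `SAMPLE` are hypothetical, the sample sentence is invented rather than taken from UD_Erzya-JR, and the sketch assumes that types are distinct surface forms compared case-sensitively.

```python
def iter_words(conllu_text):
    """Yield (form, lemma, upos) for each syntactic word in CoNLL-U text."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[1], cols[2], cols[3]


def pos_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one tag."""
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in iter_words(conllu_text):
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)  # assumption: case-sensitive word forms
            tokens += 1
    return len(lemmas), len(types), tokens


def _row(idx, form, lemma, upos):
    # Build a 10-column CoNLL-U line; unused columns are left as "_".
    return "\t".join([str(idx), form, lemma, upos, "_", "_", "_", "_", "_", "_"])


# Toy input for illustration only, not taken from the treebank.
SAMPLE = "\n".join([
    "# toy sentence",
    _row(1, "кудо", "кудо", "NOUN"),
    _row(2, "кудосо", "кудо", "NOUN"),
    _row(3, "веле", "веле", "NOUN"),
    _row(4, ".", ".", "PUNCT"),
])

print(pos_counts(SAMPLE, "NOUN"))  # (2, 3, 3)
```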

The 10 most frequent NOUN lemmas: ланго, кудо, веле, ломань, шка, бандит, кедь, чи, пря, тев

The 10 most frequent NOUN types: лангс, ёнов, лангсо, бандитэсь, партизантнэ, ланга, ялгат, кедензэ, кудов, прянзо

The 10 most frequent ambiguous lemmas: ён (NOUN 55, ADJ 2, ADV 1), ведь (NOUN 24, PART 4), ашо (NOUN 14, ADJ 9), ве (NUM 18, NOUN 13, DET 10, ADV 2), ни (NOUN 13, X 2), чокшне (NOUN 11, ADV 1), экше (NOUN 8, ADJ 1), ков (ADV 21, NOUN 7), пандя (NOUN 7, VERB 1), валдо (ADJ 6, NOUN 6)

The 10 most frequent ambiguous types: лангс (NOUN 55, ADV 2), ланга (NOUN 19, ADV 2), ведь (NOUN 13, PART 1), пельде (NOUN 9, ADV 3, ADP 2), ладсо (NOUN 6, ADP 4), пиземе (NOUN 6, VERB 1), валт (NOUN 5, VERB 1), пелев (ADV 5, NOUN 5), потс (NOUN 5, ADV 3), ков (ADV 11, NOUN 4)
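In both ambiguity lists the entries appear to be ordered by their NOUN count (55, 24, 14, 13, ... for lemmas), not by total frequency. A minimal sketch of that ranking, assuming the input is a sequence of (lemma, tag) pairs, might look as follows. The function name `ambiguous_lemmas` and the toy counts are hypothetical; the same logic would apply to word forms in place of lemmas.

```python
from collections import Counter, defaultdict


def ambiguous_lemmas(words, upos="NOUN", top=10):
    """Rank lemmas seen with `upos` and at least one other tag by their `upos` count."""
    by_lemma = defaultdict(Counter)
    for lemma, tag in words:
        by_lemma[lemma][tag] += 1
    ambiguous = {
        lemma: tags
        for lemma, tags in by_lemma.items()
        if len(tags) > 1 and upos in tags
    }
    # Sort by how often the lemma occurs with the tag of interest (descending).
    ranked = sorted(ambiguous.items(), key=lambda item: -item[1][upos])
    return ranked[:top]


# Toy counts for illustration only, not the treebank's real figures.
words = (
    [("ков", "ADV")] * 3 + [("ков", "NOUN")]
    + [("ашо", "NOUN")] * 2 + [("ашо", "ADJ")]
    + [("кудо", "NOUN")] * 5  # unambiguous, so it is filtered out
)

for lemma, tags in ambiguous_lemmas(words):
    print(lemma, dict(tags))
```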

Morphology

The form / lemma ratio of NOUN is 2.204974 (the average of all parts of speech is 2.080547).
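The NOUN ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas reported at the top of this page. The short check below assumes that this is how the figure is defined; the variable names are illustrative.

```python
# Figures reported on this page for NOUN.
noun_types = 2926
noun_lemmas = 1327

# Assumption: form / lemma ratio = distinct word forms / distinct lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.204974
```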

The highest number of forms (24) was observed with the lemma “кудо”: Кудотнеде, кудо, кудов, кудованть, кудодо, кудозо, кудозонзо, кудонзо, кудонтень, кудонть, кудонь, кудос, кудосо, кудосонть, кудост, кудосто, кудостонть, кудось, кудоськак, кудот, кудоткак, кудотне, кудотнеяк, кудояк.

The 2nd highest number of forms (23) was observed with the lemma “веле”: Велентькак, Велесэнек, веле, велев, велева, велеванть, веледенть, велекс, велем, веленек, велентень, веленть, велень, велес, велестэ, велестэнть, велесь, велесэ, велесэнк, велесэнть, велетненень, велетнень, велетнестэ.

The 3rd highest number of forms (22) was observed with the lemma “кедь”: кедезэ, кедезэнзэ, кедензэ, кеденть, кедень, кедест, кедеть, кедте, кедтнеде, кедть, кедтькак, кедь, кедьс, кедьстэ, кедьстэнзэ, кедьсэ, кедьсэнзэ, кетьнесэ, кецтэнзэ, кецэ, кецэст, кецэтькак.

NOUN occurs with 22 features: Case (5081; 99% instances), Number (5060; 99% instances), Definite (4278; 84% instances), Number[psor] (800; 16% instances), Person[psor] (800; 16% instances), NounType (203; 4% instances), Clitic (114; 2% instances), Animacy (81; 2% instances), Nomzr (50; 1% instances), Number[subj] (33; 1% instances), Person[subj] (33; 1% instances), Tense (33; 1% instances), AdvType (29; 1% instances), Derivation (18; 0% instances), Degree (14; 0% instances), VerbForm (14; 0% instances), Typo (9; 0% instances), Abbr (8; 0% instances), ExtPos (8; 0% instances), Style (6; 0% instances), NameType (4; 0% instances), NumType (1; 0% instances)

NOUN occurs with 52 feature-value pairs: Abbr=Yes, AdvType=Loc, AdvType=Tim, Animacy=Anim, Animacy=Hum, Case=Abe, Case=Abl, Case=Cmp, Case=Com, Case=Dat, Case=Ela, Case=Gen, Case=Ill, Case=Ine, Case=Lat, Case=Loc, Case=Nom, Case=Prl, Case=Tem, Case=Tra, Clitic=Add, Definite=Def, Definite=Ind, Degree=Dim, Derivation=Omka, Derivation=Voc, Derivation=VocKaj, ExtPos=ADV, NameType=Geo, NameType=Sur, Nomzr=Ag, NounType=Relat, NumType=Frac, Number=Plur, Number=Plur,Sing, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Number[subj]=Plur, Number[subj]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Person[subj]=1, Person[subj]=2, Person[subj]=3, Style=Arch, Tense=Past, Tense=Pres, Typo=Yes, VerbForm=Part, VerbForm=Vnoun

NOUN occurs with 266 feature combinations. The most frequent feature combination is Case=Nom|Definite=Ind|Number=Sing (876 tokens). Examples: тев, ломань, тол, ведь, сельме, сёвонь, веле, атя, вирь, гудок
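The three feature statistics above (features, feature-value pairs, feature combinations) can all be read off the FEATS column. The sketch below assumes the input is a list of FEATS strings for NOUN tokens, treats a multi-value pair such as `Number=Plur,Sing` as a single pair (as the list on this page does), and counts the empty value `_` as a combination of its own, which may or may not match how the page counts it. The name `feature_stats` and the toy input are hypothetical.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count feature names, feature=value pairs and whole FEATS combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":
            continue  # token without morphological features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Toy FEATS values for illustration only.
feats_column = [
    "Case=Nom|Definite=Ind|Number=Sing",
    "Case=Nom|Definite=Ind|Number=Sing",
    "Case=Ine|Definite=Def|Number=Sing",
    "_",
]

features, pairs, combos = feature_stats(feats_column)
print(len(features), len(pairs), len(combos))  # 3 5 3
print(combos.most_common(1))
```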

Relations

NOUN nodes are attached to their parents using 34 different relations: obl (1514; 30% instances), nsubj (1044; 20% instances), obj (764; 15% instances), nmod (727; 14% instances), conj (240; 5% instances), compound (176; 3% instances), root (173; 3% instances), appos (70; 1% instances), vocative (60; 1% instances), nsubj:cop (50; 1% instances), xcomp (35; 1% instances), obl:tmod (31; 1% instances), nmod:poss (30; 1% instances), nmod:gobj (27; 1% instances), fixed (26; 1% instances), obl:cmp (23; 0% instances), advcl (19; 0% instances), orphan (19; 0% instances), amod (12; 0% instances), acl (10; 0% instances), discourse (10; 0% instances), dislocated (8; 0% instances), flat:name (8; 0% instances), nmod:gsubj (8; 0% instances), parataxis (8; 0% instances), ccomp (7; 0% instances), obl:agent (5; 0% instances), compound:nn (4; 0% instances), flat (4; 0% instances), acl:relcl (2; 0% instances), advmod (2; 0% instances), obl:own (2; 0% instances), csubj (1; 0% instances), nummod (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: VERB (3244; 63% instances), NOUN (1256; 25% instances), ADJ (200; 4% instances), [no parent; the node is the root] (173; 3% instances), PRON (92; 2% instances), ADV (82; 2% instances), PROPN (33; 1% instances), AUX (17; 0% instances), DET (8; 0% instances), ADP (7; 0% instances), NUM (5; 0% instances), INTJ (3; 0% instances)
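A minimal sketch of how the incoming relations and parent tags could be tallied for one sentence, assuming each word is given as an (id, head, tag, relation) tuple with head 0 for the root. NOUN nodes attached as root have no parent word, so the sketch records them under an empty tag; this is consistent with the 173 root attachments matching the 173 parentless entries in the list above. The name `parent_stats` and the toy sentence are hypothetical.

```python
from collections import Counter


def parent_stats(sentence, upos="NOUN"):
    """Count incoming relations and parent tags for all nodes tagged `upos`."""
    tag_of = {idx: tag for idx, _head, tag, _rel in sentence}
    relations, parent_tags = Counter(), Counter()
    for idx, head, tag, deprel in sentence:
        if tag != upos:
            continue
        relations[deprel] += 1
        # Head 0 is the artificial root and has no tag of its own.
        parent_tags[tag_of.get(head, "")] += 1
    return relations, parent_tags


# Toy sentence for illustration only: (id, head, tag, relation).
sentence = [
    (1, 2, "NOUN", "nsubj"),
    (2, 0, "VERB", "root"),
    (3, 2, "NOUN", "obj"),
    (4, 3, "NOUN", "nmod"),
]

relations, parent_tags = parent_stats(sentence)
print(dict(relations))    # {'nsubj': 1, 'obj': 1, 'nmod': 1}
print(dict(parent_tags))  # {'VERB': 2, 'NOUN': 1}
```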

2292 (45%) NOUN nodes are leaves.

1889 (37%) NOUN nodes have one child.

599 (12%) NOUN nodes have two children.

340 (7%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 8.
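The four child-degree counts above add up to the 5120 NOUN tokens (2292 + 1889 + 599 + 340). A histogram of this kind could be computed per sentence roughly as follows, assuming each word is an (id, head, tag) tuple with head 0 for the root. The name `child_degree_histogram` and the toy sentence are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentence, upos="NOUN"):
    """Map number of children -> number of `upos` nodes with that many children."""
    # How many words point to each id as their head.
    n_children = Counter(head for _idx, head, _tag in sentence)
    # A Counter returns 0 for ids that never occur as a head, i.e. leaves.
    return Counter(n_children[idx] for idx, _head, tag in sentence if tag == upos)


# Toy sentence for illustration only: (id, head, tag).
sentence = [
    (1, 2, "ADJ"),
    (2, 3, "NOUN"),   # one child (word 1)
    (3, 0, "VERB"),
    (4, 3, "NOUN"),   # leaf
    (5, 3, "PUNCT"),
]

histogram = child_degree_histogram(sentence)
print(dict(histogram))  # {1: 1, 0: 1}
print(max(histogram))   # highest child degree among NOUN nodes: 1
```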

Children of NOUN nodes are attached using 43 different relations: nmod (835; 19% instances), punct (817; 19% instances), amod (581; 13% instances), det (296; 7% instances), case (284; 7% instances), conj (235; 5% instances), compound (186; 4% instances), acl (135; 3% instances), nummod (130; 3% instances), advmod (109; 2% instances), nsubj (100; 2% instances), nmod:poss (96; 2% instances), cc (75; 2% instances), appos (57; 1% instances), obl (57; 1% instances), acl:relcl (54; 1% instances), compound:nn (45; 1% instances), advcl (43; 1% instances), aux:neg (33; 1% instances), cop (28; 1% instances), parataxis (24; 1% instances), discourse (22; 1% instances), nsubj:cop (20; 0% instances), mark (17; 0% instances), orphan (17; 0% instances), flat:name (11; 0% instances), obj (10; 0% instances), vocative (10; 0% instances), fixed (8; 0% instances), flat (5; 0% instances), nmod:gobj (4; 0% instances), cc:preconj (3; 0% instances), obl:tmod (3; 0% instances), xcomp (3; 0% instances), aux (2; 0% instances), aux:opt (2; 0% instances), ccomp (2; 0% instances), expl (2; 0% instances), aux:aspect (1; 0% instances), csubj (1; 0% instances), csubj:cop (1; 0% instances), dislocated (1; 0% instances), nmod:gsubj (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (1256; 29% instances), PUNCT (817; 19% instances), ADJ (568; 13% instances), VERB (319; 7% instances), PRON (288; 7% instances), ADP (268; 6% instances), DET (193; 4% instances), PROPN (182; 4% instances), ADV (161; 4% instances), NUM (132; 3% instances), CCONJ (75; 2% instances), AUX (67; 2% instances), PART (18; 0% instances), INTJ (15; 0% instances), SCONJ (7; 0% instances)