
This page pertains to UD version 2.

Treebank Statistics: UD_Erzya-JR: POS Tags: NOUN

There are 1151 NOUN lemmas (41%), 2480 NOUN types (43%) and 4286 NOUN tokens (25%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
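The three counts above can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the names `SAMPLE`, `read_words` and `pos_counts` are hypothetical, the fragment is a toy example rather than treebank data, and it assumes that a "type" is a distinct case-sensitive word form.

```python
# Toy CoNLL-U fragment (10 tab-separated columns); illustrative only,
# not copied from UD_Erzya-JR.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\tкудо\tкудо\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tкудов\tкудо\tNOUN\t_\t_\t1\tconj\t_\t_",
    "3\tкудо\tкудо\tNOUN\t_\t_\t1\tconj\t_\t_",
    "4\tвеле\tвеле\tNOUN\t_\t_\t1\tconj\t_\t_",
    "5\tпаро\tпаро\tADJ\t_\t_\t4\tamod\t_\t_",
    "",
])


def read_words(conllu_text):
    """Yield (form, lemma, upos) for every syntactic word.

    Comment lines, blank lines, multiword-token ranges (IDs like 1-2)
    and empty nodes (IDs like 1.1) are skipped.
    """
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3]


def pos_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for form, lemma, tag in read_words(conllu_text):
        if tag == upos:
            lemmas.add(lemma)
            types.add(form)
            tokens += 1
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "NOUN"))  # (2, 3, 4)
```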

The 10 most frequent NOUN lemmas: ланго, кудо, веле, бандит, ломань, шка, кедь, ён, пря, тев

The 10 most frequent NOUN types: лангс, ёнов, бандитэсь, лангсо, партизантнэ, ялгат, кедензэ, ланга, ёндо, прянзо

The 10 most frequent ambiguous lemmas: ён (NOUN 51, ADJ 2, ADV 1), тев (NOUN 50, ADV 4), ведь (NOUN 20, PART 2), ашо (NOUN 14, ADJ 7), ве (NUM 18, DET 10, NOUN 10), ни (NOUN 9, X 2), экше (NOUN 8, ADJ 1), паро (ADJ 27, NOUN 6), валдо (NOUN 5, ADJ 3), ков (ADV 17, NOUN 5)

The 10 most frequent ambiguous types: лангс (NOUN 49, ADV 2), ланга (NOUN 18, ADV 2), тев (NOUN 11, ADV 4), пельде (NOUN 8, ADV 2), валт (NOUN 5, VERB 1), пелев (ADV 5, NOUN 5), таркас (ADP 4, NOUN 4), келес (NOUN 3, ADV 2), ков (ADV 7, NOUN 3), потс (ADV 3, NOUN 3)
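Both ambiguity lists above appear to be ordered by the NOUN count of each item, with the tags inside each entry ordered by frequency. A sketch of that ranking, under those assumptions, is shown here. The function name `ambiguous_items` and the alphabetical tie-breaking are hypothetical, and the input pairs are toy data.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos, top=10):
    """Rank keys (lemmas or forms) that occur with `upos` and another tag.

    `pairs` is an iterable of (key, tag). Keys are ordered by their count
    under `upos`; ties are broken alphabetically (an assumption).
    """
    by_key = defaultdict(Counter)
    for key, tag in pairs:
        by_key[key][tag] += 1
    hits = [(k, c) for k, c in by_key.items() if len(c) > 1 and upos in c]
    hits.sort(key=lambda kc: (-kc[1][upos], kc[0]))
    return [
        (k, sorted(c.items(), key=lambda tc: (-tc[1], tc[0])))
        for k, c in hits[:top]
    ]


# Toy (lemma, tag) pairs, not treebank counts.
toy_pairs = (
    [("тев", "NOUN")] * 3 + [("тев", "ADV")]
    + [("ков", "ADV")] * 2 + [("ков", "NOUN")]
    + [("кудо", "NOUN")] * 2
)
for key, tags in ambiguous_items(toy_pairs, "NOUN"):
    print(key, tags)
```

Passing (lemma, tag) pairs gives the ambiguous lemmas; passing (form, tag) pairs gives the ambiguous types.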

Morphology

The form / lemma ratio of NOUN is 2.154648 (the average of all parts of speech is 2.044845).
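The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The helper name `form_lemma_ratio` below is hypothetical, and whether the page generator computes the value exactly this way is an assumption.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# 2480 NOUN types and 1151 NOUN lemmas, as reported above.
noun_ratio = form_lemma_ratio(2480, 1151)
print(f"{noun_ratio:.6f}")  # 2.154648
```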

The highest number of forms (23) was observed with the lemma “кудо”: Кудотнеде, кудо, кудов, кудованть, кудодо, кудозо, кудозонзо, кудонзо, кудонтень, кудонть, кудонь, кудос, кудосо, кудосонть, кудост, кудосто, кудостонть, кудось, кудоськак, кудот, кудоткак, кудотне, кудояк.

The 2nd highest number of forms (22) was observed with the lemma “веле”: Велентькак, Велесэнек, веле, велев, велева, велеванть, веледенть, велекс, веленек, велентень, веленть, велень, велес, велестэ, велестэнть, велесь, велесэ, велесэнк, велесэнть, велетненень, велетнень, велетнестэ.

The 3rd highest number of forms (21) was observed with the lemma “кедь”: кедезэ, кедезэнзэ, кедензэ, кеденть, кедень, кедест, кедеть, кедтнеде, кедть, кедтькак, кедь, кедьс, кедьстэ, кедьстэнзэ, кедьсэ, кедьсэнзэ, кетьнесэ, кецтэнзэ, кецэ, кецэст, кецэтькак.

NOUN occurs with 19 features: Case (4264; 99% instances), Number (4245; 99% instances), Definite (3590; 84% instances), Number[psor] (677; 16% instances), Person[psor] (677; 16% instances), Clitic (102; 2% instances), Animacy (80; 2% instances), NounType (57; 1% instances), Derivation (46; 1% instances), Number[subj] (24; 1% instances), Person[subj] (24; 1% instances), Tense (24; 1% instances), Abbr (6; 0% instances), Style (6; 0% instances), VerbForm (6; 0% instances), AdvType (4; 0% instances), NameType (4; 0% instances), NumType (1; 0% instances), Typo (1; 0% instances)

NOUN occurs with 51 feature-value pairs: Abbr=Yes, AdvType=Loc, AdvType=Tim, Animacy=Anim, Animacy=Hum, Case=Abe, Case=Abl, Case=Cmp, Case=Com, Case=Dat, Case=Ela, Case=Gen, Case=Ill, Case=Ine, Case=Lat, Case=Loc, Case=Nom, Case=Prl, Case=Tem, Case=Tra, Clitic=Add, Definite=Def, Definite=Ind, Derivation=Dimin, Derivation=NomAg, Derivation=Omka, Derivation=VerbYcja, Derivation=Voc, Derivation=VocKaj, NameType=Geo, NameType=Sur, NounType=Relat, NumType=Frac, Number=Plur, Number=Plur,Sing, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Number[subj]=Plur, Number[subj]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Person[subj]=1, Person[subj]=2, Person[subj]=3, Style=Arch, Tense=Past, Tense=Pres, Typo=Yes, VerbForm=Vnoun

NOUN occurs with 216 feature combinations. The most frequent feature combination is Case=Nom|Definite=Ind|Number=Sing (748 tokens). Examples: тол, ведь, ломань, тев, сёвонь, веле, сельме, гудок, револьвер, атя
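The three feature statistics (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of the NOUN tokens. The sketch below assumes the standard CoNLL-U FEATS syntax (`Name=Value` items joined by `|`, with `_` for no features) and keeps a multi-valued item such as `Number=Plur,Sing` as a single pair, as the list above does. The name `feature_stats` and the input list are hypothetical.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count features, feature-value pairs and full combinations.

    `feats_column` is an iterable of FEATS strings, one per token.
    """
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":  # token without morphological features
            continue
        for item in feats.split("|"):
            name, _value = item.split("=", 1)
            features[name] += 1
            pairs[item] += 1
    return features, pairs, combos


# Toy FEATS values, not treebank counts.
toy_feats = [
    "Case=Nom|Definite=Ind|Number=Sing",
    "Case=Nom|Definite=Ind|Number=Sing",
    "Case=Ill|Definite=Ind|Number=Plur,Sing",
    "_",
]
features, pairs, combos = feature_stats(toy_feats)
print(len(features), len(pairs), len(combos))  # 3 5 3
print(combos.most_common(1))
```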

Relations

NOUN nodes are attached to their parents using 37 different relations: nsubj (894; 21% instances), obl (892; 21% instances), obj (667; 16% instances), nmod (590; 14% instances), conj (211; 5% instances), compound (184; 4% instances), root (149; 3% instances), obl:lmod (104; 2% instances), obl:lmp (84; 2% instances), obl:lto (69; 2% instances), obl:inst (67; 2% instances), appos (53; 1% instances), obl:tmod (37; 1% instances), vocative (35; 1% instances), xcomp (29; 1% instances), obl:lfrom (27; 1% instances), nmod:poss (23; 1% instances), fixed (21; 0% instances), nsubj:cop (18; 0% instances), nmod:comp (17; 0% instances), advcl (13; 0% instances), orphan (12; 0% instances), amod (11; 0% instances), nmod:gobj (11; 0% instances), flat (10; 0% instances), acl (9; 0% instances), flat:name (7; 0% instances), nmod:gsubj (7; 0% instances), ccomp (6; 0% instances), discourse (6; 0% instances), dislocated (6; 0% instances), parataxis (6; 0% instances), nmod:lmod (5; 0% instances), obl:agent (3; 0% instances), acl:relcl (1; 0% instances), csubj (1; 0% instances), nummod (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech (counting the artificial root, which has no tag, as one): VERB (2705; 63% instances), NOUN (1075; 25% instances), [root] (149; 3% instances), ADJ (107; 2% instances), PRON (75; 2% instances), ADV (74; 2% instances), PROPN (49; 1% instances), AUX (32; 1% instances), DET (8; 0% instances), ADP (5; 0% instances), NUM (5; 0% instances), INTJ (2; 0% instances)
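Relation and parent statistics of this kind can be gathered in one pass over the dependency trees. In the sketch below, each sentence is a list of (id, head, upos, deprel) tuples; nodes whose head is 0 are attached to the artificial root, which has no part of speech and is therefore counted under a placeholder. The names `attachment_stats` and `ROOT_LABEL` are hypothetical and the sentences are toy data.

```python
from collections import Counter

ROOT_LABEL = "(root)"  # hypothetical placeholder for the artificial root


def attachment_stats(sentences, upos):
    """Count incoming relations and parent tags for nodes tagged `upos`."""
    relations, parent_pos = Counter(), Counter()
    for sentence in sentences:
        pos_of = {node_id: tag for node_id, _, tag, _ in sentence}
        for _, head, tag, deprel in sentence:
            if tag != upos:
                continue
            relations[deprel] += 1
            # Head 0 is not a real node, so it has no entry in pos_of.
            parent_pos[pos_of.get(head, ROOT_LABEL)] += 1
    return relations, parent_pos


# Toy trees, not treebank data.
toy_sentences = [
    [(1, 2, "NOUN", "nsubj"), (2, 0, "VERB", "root"),
     (3, 2, "NOUN", "obj"), (4, 3, "NOUN", "nmod")],
    [(1, 0, "NOUN", "root")],
]
relations, parent_pos = attachment_stats(toy_sentences, "NOUN")
print(dict(relations))
print(dict(parent_pos))
```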

1912 (45%) NOUN nodes are leaves.

1597 (37%) NOUN nodes have one child.

483 (11%) NOUN nodes have two children.

294 (7%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 8.
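The child-degree figures above partition the NOUN tokens by number of dependents: 1912 + 1597 + 483 + 294 = 4286. A sketch of how such buckets could be computed is given here, again with sentences as lists of (id, head, upos) tuples. The name `child_degree_stats` is hypothetical and the tree is a toy example.

```python
from collections import Counter


def child_degree_stats(sentences, upos):
    """Return (buckets, max_degree) for nodes tagged `upos`.

    `buckets` maps 0, 1, 2 and 3 (meaning "three or more") to node counts.
    """
    buckets, max_degree = Counter(), 0
    for sentence in sentences:
        # Number of children of each node = number of times it is a head.
        n_children = Counter(head for _, head, _ in sentence)
        for node_id, _, tag in sentence:
            if tag != upos:
                continue
            degree = n_children[node_id]
            buckets[min(degree, 3)] += 1
            max_degree = max(max_degree, degree)
    return buckets, max_degree


# Toy tree, not treebank data.
toy_tree = [
    (1, 2, "NOUN"), (2, 0, "VERB"), (3, 2, "NOUN"), (4, 3, "ADJ"),
    (5, 3, "NOUN"), (6, 5, "ADP"), (7, 2, "PUNCT"),
]
buckets, max_degree = child_degree_stats([toy_tree], "NOUN")
print(dict(buckets), max_degree)  # {0: 1, 2: 1, 1: 1} 2
```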

Children of NOUN nodes are attached using 52 different relations: nmod (683; 19% instances), punct (676; 18% instances), amod (479; 13% instances), det (286; 8% instances), case (249; 7% instances), conj (202; 5% instances), compound (189; 5% instances), nummod (122; 3% instances), acl (120; 3% instances), nsubj (96; 3% instances), cc (65; 2% instances), advmod (57; 2% instances), appos (52; 1% instances), acl:relcl (50; 1% instances), obl (38; 1% instances), advcl (37; 1% instances), nmod:poss (35; 1% instances), cop (34; 1% instances), aux:neg (27; 1% instances), parataxis (20; 1% instances), discourse (18; 0% instances), orphan (15; 0% instances), advmod:tmod (13; 0% instances), fixed (12; 0% instances), vocative (11; 0% instances), flat:name (10; 0% instances), obj (10; 0% instances), mark (9; 0% instances), obl:lmod (9; 0% instances), advmod:lmod (8; 0% instances), advmod:foc (7; 0% instances), advmod:eval (5; 0% instances), flat (5; 0% instances), nmod:lmod (5; 0% instances), nmod:gobj (4; 0% instances), nsubj:cop (4; 0% instances), obl:tmod (4; 0% instances), advmod:deg (3; 0% instances), cc:preconj (3; 0% instances), xcomp (3; 0% instances), advmod:lto (2; 0% instances), aux:opt (2; 0% instances), expl (2; 0% instances), advmod:lmp (1; 0% instances), aux:aspect (1; 0% instances), csubj (1; 0% instances), csubj:cop (1; 0% instances), dislocated (1; 0% instances), nmod:gsubj (1; 0% instances), obl:inst (1; 0% instances), obl:lmp (1; 0% instances), obl:lto (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (1075; 29% instances), PUNCT (676; 18% instances), ADJ (459; 12% instances), VERB (288; 8% instances), PRON (248; 7% instances), ADP (236; 6% instances), DET (154; 4% instances), ADV (136; 4% instances), NUM (124; 3% instances), PROPN (123; 3% instances), AUX (71; 2% instances), CCONJ (65; 2% instances), INTJ (15; 0% instances), PART (14; 0% instances), SCONJ (6; 0% instances)