
This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-TOROT: POS Tags: NOUN

There are 2852 NOUN lemmas (31%), 9966 NOUN types (30%) and 31705 NOUN tokens (21%). Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 2 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: лѣто, кънязь, богъ, дьнь, земля, градъ, отьць, братъ, сынъ, людие

The 10 most frequent NOUN types: лѣт, лѣт҃, б҃ъ, дн҃ь, землю, б҃а, кнѧзь, земли, дн҃и, половци

The 10 most frequent ambiguous lemmas: грьчинъ (NOUN 96, PROPN 1), вѣсть (NOUN 54, ADV 1), вечеръ (NOUN 43, ADV 2), добро (NOUN 38, ADV 4), печера (NOUN 37, PROPN 2), словѣнинъ (NOUN 33, PROPN 1), рѣчь (NOUN 32, ADV 4), переяславьць (NOUN 19, PROPN 6), вятичи (NOUN 18, PROPN 1), вьрхъ (NOUN 16, ADP 2, PROPN 1)

The 10 most frequent ambiguous types: б҃ъ (NOUN 304, PROPN 1), дн҃ь (NOUN 275, ADV 1), б҃у (NOUN 66, PROPN 1), градѣ (NOUN 65, PROPN 1), имѧ (NOUN 59, PRON 1, VERB 1), зло (NOUN 45, ADJ 4), дѣти (NOUN 43, VERB 2), зла (NOUN 44, ADJ 15), кнѧже (NOUN 41, ADJ 1), вѣсть (NOUN 39, VERB 8, ADV 1)

Morphology

The form / lemma ratio of NOUN is 3.494390 (the average of all parts of speech is 3.571475).
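The reported ratio is consistent with dividing the number of distinct NOUN types by the number of distinct NOUN lemmas given above (9966 / 2852). A minimal Python sketch under that assumption follows; the helper name `form_lemma_ratio` and the sample pairs are illustrative and are not part of the UD tooling.

```python
def form_lemma_ratio(pairs):
    """Return (#distinct forms) / (#distinct lemmas) for (form, lemma) pairs."""
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    if not lemmas:
        raise ValueError("no lemmas given")
    return len(forms) / len(lemmas)

# Hypothetical sample of (form, lemma) pairs; the repeated pair counts once.
sample = [
    ("кнѧзь", "кънязь"),
    ("кнѧзѧ", "кънязь"),
    ("б҃ъ", "богъ"),
    ("кнѧзь", "кънязь"),
]
ratio = form_lemma_ratio(sample)  # 3 distinct forms / 2 distinct lemmas

# The same division applied to the counts reported on this page.
page_ratio = 9966 / 2852

print(f"{ratio:.2f} {page_ratio:.6f}")
```

Using sets means that repeated tokens of the same form do not inflate the ratio, which matches a count of types rather than tokens.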

The 1st highest number of forms (80) was observed with the lemma “кънязь”: кнзе, кнзеи, кнзем, кнземъ, кнзи, кнзмъ, кнзь, кнзьми, кнзю, кнзя, кнз҃емь, кнз҃и, кнз҃мь, кнз҃ь, кнз҃ю, кнз҃ѧ, кнз’, княже, княз, князеи, князей, княземъ, князи, князь, князю, князя, кнѕи, кнѧ, кнѧже, кнѧж҃, кнѧз, кнѧзем, кнѧзема, кнѧземъ, кнѧземь, кнѧзем҃, кнѧзех, кнѧзехъ, кнѧзи, кнѧзии, кнѧзъ, кнѧзь, кнѧзьмь, кнѧзю, кнѧзѣ, кнѧзѧ, кнѧзꙗ, кн҃же, кн҃зеи, кн҃зема, кн҃земъ, кн҃земь, кн҃зи, кн҃зихъ, кн҃змь, кн҃зь, кн҃зьма, кн҃зю, кн҃зѣмь, кн҃зѣхъ, кн҃зѧ, кн҃ѧзеи, кн҃ѧзь, кн҃ѧзю, кн҃ѧзѧ, кнꙗзи, кнꙗзь, кнꙗзьмь, кнꙗзю, кнꙗзѧ, кнꙗзꙗ, къ[н]ѧз[ь, кънѧже, кънѧзь, кънѧзьхъ, кънѧзю, кънѧзѧ, кънѧзѹ, кънꙗземъ, к҃нзь.

The 2nd highest number of forms (56) was observed with the lemma “цьркы”: церквеи, церкви, церквии, церковь, церкы, црвкъ, црквамъ, цркве, црквь, цркв҃и, цркв҃ь, црксви, црк҃вамъ, црк҃вахъ, црк҃ве, црк҃ви, црк҃вии, црк҃вма, црк҃въ, црк҃вь, црк҃вью, црк҃и, црк҃кꙑ, црк҃ъви, црк҃ы, црк҃ꙑ, црькви, црьковь, црькꙑ, цр҃вь, цр҃квам, цр҃квамъ, цр҃кве, цр҃кви, цр҃квии, цр҃квию, цр҃квхъ, цр҃квь, цр҃кв҃и, цр҃ки, цр҃ковь, цр҃къвь, цр҃кы, цр҃кꙑ, цр҃ькви, цр҃ькъвь, цьркви, цьрковь, цьркъви, ц҃рквам, ц҃рквамъ, ц҃ркве, ц҃ркви, ц҃рквь, ц҃рки, чрк҃ꙑ.

The 3rd highest number of forms (55) was observed with the lemma “богъ”: ба, ба҃, бв҃и, бгм҃ь, бгу, бгъ, бгъ҃, бг҃а, бг҃ви, бг҃мь, бг҃ѹ, бе, бж҃е, бз҃и, бз҃ѣ, бм҃ь, бога, богмь, бого, богу, богъ, богѹ, боже, бо҃мь, бу, бу҃, бъ, бъгъмь, бъ҃, бъ҃мь, бь҃, бѹ, бѹ҃, б҃, б҃а, б҃ви, б҃га, б҃гу, б҃гъ, б҃гꙋ, б҃е, б҃зи, б҃мъ, б҃мь, б҃о, б҃овъ, б҃омь, б҃оу, б҃у, б҃ъ, б҃ъмъ, б҃ы, б҃ь, б҃ѹ, б҃ꙋ.

NOUN occurs with 3 features: Case (31604; 100% instances), Gender (31604; 100% instances), Number (31604; 100% instances)

NOUN occurs with 15 feature-value pairs: Case=Acc, Case=Dat, Case=Dat,Gen, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Fem,Masc, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing

NOUN occurs with 69 feature combinations. The most frequent feature combination is Case=Gen|Gender=Masc|Number=Sing (3079 tokens). Examples: б҃а, града, брата, сн҃а, кнѧзѧ, мс҃цѧ, мсца, мцса, города, ѡц҃а
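Feature combinations such as Case=Gen|Gender=Masc|Number=Sing follow the CoNLL-U FEATS notation: pairs are separated by `|`, and a multi-valued feature such as Case=Dat,Gen lists its values with commas. The sketch below shows one way such strings could be parsed and tallied in Python; `parse_feats` and the sample strings are hypothetical, and this is not necessarily how the statistics on this page were produced.

```python
from collections import Counter

def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {feature: [values]}; '_' means no features."""
    if feats == "_":
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, value = pair.split("=", 1)
        parsed[name] = value.split(",")  # e.g. "Dat,Gen" -> ["Dat", "Gen"]
    return parsed

# Hypothetical FEATS values for four NOUN tokens.
sample_feats = [
    "Case=Gen|Gender=Masc|Number=Sing",
    "Case=Gen|Gender=Masc|Number=Sing",
    "Case=Dat,Gen|Gender=Masc|Number=Sing",
    "Case=Nom|Gender=Fem|Number=Plur",
]

# Tokens per full feature combination.
combos = Counter(sample_feats)

# Tokens per feature-value pair; a multi-valued pair such as
# "Case=Dat,Gen" is kept as a single pair, as in the list above.
pairs = Counter(
    f"{name}={','.join(values)}"
    for feats in sample_feats
    for name, values in parse_feats(feats).items()
)

print(combos.most_common(1))
print(len(pairs))
```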

Relations

NOUN nodes are attached to their parents using 19 different relations: obl (9507; 30% instances), obj (5567; 18% instances), nsubj (5077; 16% instances), nmod (3173; 10% instances), conj (2969; 9% instances), appos (1751; 6% instances), iobj (1297; 4% instances), root (770; 2% instances), vocative (508; 2% instances), xcomp (247; 1% instances), orphan (211; 1% instances), nsubj:pass (187; 1% instances), obl:agent (162; 1% instances), advcl (98; 0% instances), dislocated (87; 0% instances), dep (47; 0% instances), ccomp (35; 0% instances), flat (8; 0% instances), parataxis (4; 0% instances)

Parents of NOUN nodes belong to 14 different parts of speech: VERB (20439; 64% instances), NOUN (5734; 18% instances), PROPN (1016; 3% instances), ADJ (983; 3% instances), NUM (897; 3% instances), no parent, i.e. the NOUN is the root (770; 2% instances), AUX (687; 2% instances), ADV (547; 2% instances), CCONJ (342; 1% instances), PRON (243; 1% instances), INTJ (21; 0% instances), ADP (16; 0% instances), X (9; 0% instances), DET (1; 0% instances)
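Counts of incoming relations and parent parts of speech like those above can be derived from the HEAD and DEPREL columns of a CoNLL-U file. The following Python sketch assumes each token is already split into its ten CoNLL-U columns; `noun_attachment_stats` and the four-token sentence are hypothetical illustrations, not the script behind this page. A NOUN whose HEAD is 0 has no parent token, so its parent part of speech is recorded as an empty string.

```python
from collections import Counter

# Hypothetical sentence; columns are
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
sentence = [
    ["1", "кнѧзь", "кънязь", "NOUN", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "иде", "ити", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "въ", "въ", "ADP", "_", "_", "4", "case", "_", "_"],
    ["4", "градъ", "градъ", "NOUN", "_", "_", "2", "obl", "_", "_"],
]

def noun_attachment_stats(rows, upos="NOUN"):
    """Count DEPREL values and parent UPOS tags for tokens tagged `upos`."""
    upos_by_id = {row[0]: row[3] for row in rows}
    relations, parent_pos = Counter(), Counter()
    for row in rows:
        if row[3] != upos:
            continue
        relations[row[7]] += 1
        # HEAD "0" is the artificial root and has no UPOS, hence "".
        parent_pos[upos_by_id.get(row[6], "")] += 1
    return relations, parent_pos

relations, parent_pos = noun_attachment_stats(sentence)
print(relations)
print(parent_pos)
```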

9493 (30%) NOUN nodes are leaves.

12152 (38%) NOUN nodes have one child.

6797 (21%) NOUN nodes have two children.

3263 (10%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 16.
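The four child-count buckets above partition all NOUN tokens: 9493 + 12152 + 6797 + 3263 = 31705. A child-degree histogram of this kind can be sketched in Python by counting how often each token ID appears in the HEAD column; `child_degree_histogram` and the sample sentence are hypothetical, and tokens are assumed to be pre-split into the ten CoNLL-U columns.

```python
from collections import Counter

# Hypothetical sentence; columns are
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
sentence = [
    ["1", "кнѧзь", "кънязь", "NOUN", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "иде", "ити", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "въ", "въ", "ADP", "_", "_", "4", "case", "_", "_"],
    ["4", "градъ", "градъ", "NOUN", "_", "_", "2", "obl", "_", "_"],
]

def child_degree_histogram(rows, upos="NOUN"):
    """Map number of children -> number of tokens tagged `upos` with that many."""
    children = Counter(row[6] for row in rows)  # how often each ID is a HEAD
    return Counter(children[row[0]] for row in rows if row[3] == upos)

histogram = child_degree_histogram(sentence)
leaves = histogram[0]                        # nodes without children
max_degree = max(histogram, default=0)       # highest child degree observed

# Consistency check on the figures reported on this page.
page_total = 9493 + 12152 + 6797 + 3263

print(histogram, leaves, max_degree, page_total)
```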

Children of NOUN nodes are attached using 27 different relations: case (10498; 28% instances), amod (5867; 15% instances), nmod (5323; 14% instances), det (3652; 10% instances), conj (2753; 7% instances), cc (2707; 7% instances), appos (2320; 6% instances), nummod (1217; 3% instances), advmod (805; 2% instances), acl (777; 2% instances), nsubj (557; 1% instances), cop (542; 1% instances), orphan (246; 1% instances), discourse (236; 1% instances), obl (168; 0% instances), mark (94; 0% instances), advcl (67; 0% instances), ccomp (47; 0% instances), vocative (45; 0% instances), dislocated (23; 0% instances), obj (19; 0% instances), iobj (17; 0% instances), aux (15; 0% instances), flat (8; 0% instances), obl:agent (6; 0% instances), parataxis (5; 0% instances), xcomp (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: ADP (10511; 28% instances), ADJ (9227; 24% instances), NOUN (5734; 15% instances), CCONJ (2722; 7% instances), DET (2469; 6% instances), PROPN (1946; 5% instances), PRON (1539; 4% instances), NUM (1300; 3% instances), ADV (950; 2% instances), VERB (917; 2% instances), AUX (582; 2% instances), SCONJ (94; 0% instances), INTJ (20; 0% instances), X (4; 0% instances)