Treebank Statistics: UD_Old_East_Slavic-TOROT: POS Tags: NOUN
There are 2852 NOUN lemmas (31%), 9966 NOUN types (30%) and 31705 NOUN tokens (21%).
Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 2 in number of types and 1 in number of tokens.
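Counts and ranks of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming that "types" means distinct word forms and "lemmas" means distinct LEMMA values per tag; the page does not state its exact counting rules (for example, whether forms are case-folded). The sample data and the helper names `pos_counts` and `rank` are hypothetical.

```python
from collections import defaultdict

# Toy CoNLL-U fragment (hypothetical data, not taken from the treebank).
SAMPLE = """\
1\tбогъ\tбогъ\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tрече\tрещи\tVERB\t_\t_\t0\troot\t_\t_
3\tбога\tбогъ\tNOUN\t_\t_\t2\tobj\t_\t_

1\tградъ\tградъ\tNOUN\t_\t_\t0\troot\t_\t_
"""

def pos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

def rank(counts, upos, index):
    """Rank of a tag (1 = highest) by lemmas (0), types (1) or tokens (2)."""
    ordered = sorted(counts, key=lambda tag: counts[tag][index], reverse=True)
    return ordered.index(upos) + 1

counts = pos_counts(SAMPLE)
print(counts["NOUN"], rank(counts, "NOUN", 2))
```

On the toy fragment, NOUN has 2 lemmas, 3 types and 3 tokens, and ranks first by token count.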
The 10 most frequent NOUN lemmas: лѣто, кънязь, богъ, дьнь, земля, градъ, отьць, братъ, сынъ, людие
The 10 most frequent NOUN types: лѣт, лѣт҃, б҃ъ, дн҃ь, землю, б҃а, кнѧзь, земли, дн҃и, половци
The 10 most frequent ambiguous lemmas: грьчинъ (NOUN 96, PROPN 1), вѣсть (NOUN 54, ADV 1), вечеръ (NOUN 43, ADV 2), добро (NOUN 38, ADV 4), печера (NOUN 37, PROPN 2), словѣнинъ (NOUN 33, PROPN 1), рѣчь (NOUN 32, ADV 4), переяславьць (NOUN 19, PROPN 6), вятичи (NOUN 18, PROPN 1), вьрхъ (NOUN 16, ADP 2, PROPN 1)
The 10 most frequent ambiguous types: б҃ъ (NOUN 304, PROPN 1), дн҃ь (NOUN 275, ADV 1), б҃у (NOUN 66, PROPN 1), градѣ (NOUN 65, PROPN 1), имѧ (NOUN 59, PRON 1, VERB 1), зло (NOUN 45, ADJ 4), дѣти (NOUN 43, VERB 2), зла (NOUN 44, ADJ 15), кнѧже (NOUN 41, ADJ 1), вѣсть (NOUN 39, VERB 8, ADV 1)
Morphology
The form / lemma ratio of NOUN is 3.494390 (the average of all parts of speech is 3.571475).
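The ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas reported at the top of this page. The page does not spell out the formula, so the sketch below treats that reading as an assumption and simply checks it against the published figures.

```python
# Figures quoted on this page for NOUN.
noun_types = 9966   # distinct word forms
noun_lemmas = 2852  # distinct lemmas

# Assumed definition: form / lemma ratio = types / lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # prints 3.494390
```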
The highest number of forms (80) was observed with the lemma “кънязь”: кнзе, кнзеи, кнзем, кнземъ, кнзи, кнзмъ, кнзь, кнзьми, кнзю, кнзя, кнз҃емь, кнз҃и, кнз҃мь, кнз҃ь, кнз҃ю, кнз҃ѧ, кнз’, княже, княз, князеи, князей, княземъ, князи, князь, князю, князя, кнѕи, кнѧ, кнѧже, кнѧж҃, кнѧз, кнѧзем, кнѧзема, кнѧземъ, кнѧземь, кнѧзем҃, кнѧзех, кнѧзехъ, кнѧзи, кнѧзии, кнѧзъ, кнѧзь, кнѧзьмь, кнѧзю, кнѧзѣ, кнѧзѧ, кнѧзꙗ, кн҃же, кн҃зеи, кн҃зема, кн҃земъ, кн҃земь, кн҃зи, кн҃зихъ, кн҃змь, кн҃зь, кн҃зьма, кн҃зю, кн҃зѣмь, кн҃зѣхъ, кн҃зѧ, кн҃ѧзеи, кн҃ѧзь, кн҃ѧзю, кн҃ѧзѧ, кнꙗзи, кнꙗзь, кнꙗзьмь, кнꙗзю, кнꙗзѧ, кнꙗзꙗ, къ[н]ѧз[ь, кънѧже, кънѧзь, кънѧзьхъ, кънѧзю, кънѧзѧ, кънѧзѹ, кънꙗземъ, к҃нзь.
The 2nd highest number of forms (56) was observed with the lemma “цьркы”: церквеи, церкви, церквии, церковь, церкы, црвкъ, црквамъ, цркве, црквь, цркв҃и, цркв҃ь, црксви, црк҃вамъ, црк҃вахъ, црк҃ве, црк҃ви, црк҃вии, црк҃вма, црк҃въ, црк҃вь, црк҃вью, црк҃и, црк҃кꙑ, црк҃ъви, црк҃ы, црк҃ꙑ, црькви, црьковь, црькꙑ, цр҃вь, цр҃квам, цр҃квамъ, цр҃кве, цр҃кви, цр҃квии, цр҃квию, цр҃квхъ, цр҃квь, цр҃кв҃и, цр҃ки, цр҃ковь, цр҃къвь, цр҃кы, цр҃кꙑ, цр҃ькви, цр҃ькъвь, цьркви, цьрковь, цьркъви, ц҃рквам, ц҃рквамъ, ц҃ркве, ц҃ркви, ц҃рквь, ц҃рки, чрк҃ꙑ.
The 3rd highest number of forms (55) was observed with the lemma “богъ”: ба, ба҃, бв҃и, бгм҃ь, бгу, бгъ, бгъ҃, бг҃а, бг҃ви, бг҃мь, бг҃ѹ, бе, бж҃е, бз҃и, бз҃ѣ, бм҃ь, бога, богмь, бого, богу, богъ, богѹ, боже, бо҃мь, бу, бу҃, бъ, бъгъмь, бъ҃, бъ҃мь, бь҃, бѹ, бѹ҃, б҃, б҃а, б҃ви, б҃га, б҃гу, б҃гъ, б҃гꙋ, б҃е, б҃зи, б҃мъ, б҃мь, б҃о, б҃овъ, б҃омь, б҃оу, б҃у, б҃ъ, б҃ъмъ, б҃ы, б҃ь, б҃ѹ, б҃ꙋ.
NOUN occurs with 3 features: Case (31604; 100% instances), Gender (31604; 100% instances), Number (31604; 100% instances)
NOUN occurs with 15 feature-value pairs: Case=Acc, Case=Dat, Case=Dat,Gen, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Fem,Masc, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing
NOUN occurs with 69 feature combinations.
The most frequent feature combination is Case=Gen|Gender=Masc|Number=Sing (3079 tokens).
Examples: б҃а, града, брата, сн҃а, кнѧзѧ, мс҃цѧ, мсца, мцса, города, ѡц҃а
Relations
NOUN nodes are attached to their parents using 19 different relations: obl (9507; 30% instances), obj (5567; 18% instances), nsubj (5077; 16% instances), nmod (3173; 10% instances), conj (2969; 9% instances), appos (1751; 6% instances), iobj (1297; 4% instances), root (770; 2% instances), vocative (508; 2% instances), xcomp (247; 1% instances), orphan (211; 1% instances), nsubj:pass (187; 1% instances), obl:agent (162; 1% instances), advcl (98; 0% instances), dislocated (87; 0% instances), dep (47; 0% instances), ccomp (35; 0% instances), flat (8; 0% instances), parataxis (4; 0% instances)
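The percentages in the relation list can be recomputed from the raw counts. The sketch below assumes each share is the relation count divided by the sum of all listed counts and rounded to the nearest whole percent; the variable names are illustrative only.

```python
# Incoming relation counts for NOUN nodes, as listed on this page.
relation_counts = {
    "obl": 9507, "obj": 5567, "nsubj": 5077, "nmod": 3173, "conj": 2969,
    "appos": 1751, "iobj": 1297, "root": 770, "vocative": 508, "xcomp": 247,
    "orphan": 211, "nsubj:pass": 187, "obl:agent": 162, "advcl": 98,
    "dislocated": 87, "dep": 47, "ccomp": 35, "flat": 8, "parataxis": 4,
}

total = sum(relation_counts.values())  # equals the NOUN token count

# Whole-percent share of each relation (assumed rounding rule).
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

for rel, n in relation_counts.items():
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```

The counts sum to 31705, the number of NOUN tokens, which is why relations with fewer than about 160 instances are shown as 0%.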
Parents of NOUN nodes belong to 14 different parts of speech: VERB (20439; 64% instances), NOUN (5734; 18% instances), PROPN (1016; 3% instances), ADJ (983; 3% instances), NUM (897; 3% instances), the artificial root, which has no tag (770; 2% instances), AUX (687; 2% instances), ADV (547; 2% instances), CCONJ (342; 1% instances), PRON (243; 1% instances), INTJ (21; 0% instances), ADP (16; 0% instances), X (9; 0% instances), DET (1; 0% instances)
9493 (30%) NOUN nodes are leaves.
12152 (38%) NOUN nodes have one child.
6797 (21%) NOUN nodes have two children.
3263 (10%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 16.
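The child-degree figures above count how many dependents each NOUN node has. A minimal sketch of that computation follows, assuming each token is reduced to an (id, UPOS, head) triple taken from the ID, UPOS and HEAD columns of a CoNLL-U sentence; the sentence and the helper name `child_degrees` are hypothetical.

```python
from collections import Counter

# (id, upos, head) triples for one hypothetical sentence; head 0 is the root.
SENTENCE = [
    (1, "ADP", 2),
    (2, "NOUN", 4),
    (3, "ADJ", 2),
    (4, "VERB", 0),
    (5, "NOUN", 4),
    (6, "NOUN", 5),
]

def child_degrees(sentence, upos="NOUN"):
    """Map child degree -> number of nodes with that tag and that degree."""
    # How many tokens name each id as their head.
    n_children = Counter(head for _, _, head in sentence)
    # A Counter returns 0 for ids that never appear as a head (leaves).
    return Counter(n_children[tid] for tid, tag, _ in sentence if tag == upos)

degrees = child_degrees(SENTENCE)
print(sorted(degrees.items()))  # prints [(0, 1), (1, 1), (2, 1)]
print(max(degrees))             # highest child degree: prints 2
```

Summing such counters over all sentences of a treebank would give the leaf, one-child, two-children and three-or-more buckets reported here.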
Children of NOUN nodes are attached using 27 different relations: case (10498; 28% instances), amod (5867; 15% instances), nmod (5323; 14% instances), det (3652; 10% instances), conj (2753; 7% instances), cc (2707; 7% instances), appos (2320; 6% instances), nummod (1217; 3% instances), advmod (805; 2% instances), acl (777; 2% instances), nsubj (557; 1% instances), cop (542; 1% instances), orphan (246; 1% instances), discourse (236; 1% instances), obl (168; 0% instances), mark (94; 0% instances), advcl (67; 0% instances), ccomp (47; 0% instances), vocative (45; 0% instances), dislocated (23; 0% instances), obj (19; 0% instances), iobj (17; 0% instances), aux (15; 0% instances), flat (8; 0% instances), obl:agent (6; 0% instances), parataxis (5; 0% instances), xcomp (1; 0% instances)
Children of NOUN nodes belong to 14 different parts of speech: ADP (10511; 28% instances), ADJ (9227; 24% instances), NOUN (5734; 15% instances), CCONJ (2722; 7% instances), DET (2469; 6% instances), PROPN (1946; 5% instances), PRON (1539; 4% instances), NUM (1300; 3% instances), ADV (950; 2% instances), VERB (917; 2% instances), AUX (582; 2% instances), SCONJ (94; 0% instances), INTJ (20; 0% instances), X (4; 0% instances)