
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-SynTagRus: POS Tags: ADJ

There are 10891 ADJ lemmas (20%), 34232 ADJ types (24%) and 140593 ADJ tokens (9%). Out of 17 observed tags, the rank of ADJ is: 2 in number of lemmas, 3 in number of types and 5 in number of tokens.
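
The three counts above can be sketched from a CoNLL-U file as follows. This is a minimal illustration, not the script that produced this page: the sample sentences are hypothetical, the function name `pos_counts` is made up, and treating forms case-sensitively is an assumption.

```python
from collections import defaultdict

# Toy CoNLL-U fragment (hypothetical, not taken from the treebank).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = """\
1\tновый\tновый\tADJ\t_\tCase=Nom|Degree=Pos|Gender=Masc|Number=Sing\t2\tamod\t_\t_
2\tдом\tдом\tNOUN\t_\t_\t0\troot\t_\t_

1\tновые\tновый\tADJ\t_\tCase=Nom|Degree=Pos|Number=Plur\t2\tamod\t_\t_
2\tдома\tдом\tNOUN\t_\t_\t0\troot\t_\t_
"""

def pos_counts(conllu_text):
    """Return {UPOS: (distinct lemmas, distinct types, tokens)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms compared case-sensitively
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

counts = pos_counts(SAMPLE)
print(counts["ADJ"])  # (1, 2, 2)
```

In the toy data the single lemma новый appears in two forms, so ADJ has one lemma, two types and two tokens.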

The 10 most frequent ADJ lemmas: новый, первый, должен, российский, большой, нужный, последний, политический, главный, второй

The 10 most frequent ADJ types: нужно, должны, должен, первый, второй, новые, российской, последние, должна, большой

The 10 most frequent ambiguous lemmas: первый (ADJ 1946, NOUN 1), хороший (ADJ 742, ADV 2), военный (ADJ 537, NOUN 90), русский (ADJ 537, NOUN 87), простой (ADJ 424, NOUN 3), местный (ADJ 341, NOUN 1), малый (ADJ 257, NOUN 2, ADV 1), плохой (ADJ 257, ADV 1), рабочий (ADJ 237, NOUN 97), дорогой (ADJ 185, NOUN 5)

The 10 most frequent ambiguous types: первый (ADJ 312, NOUN 1), невозможно (ADJ 248, ADV 5), необходимо (ADJ 211, ADV 2), должно (ADJ 229, ADV 4), важно (ADJ 194, ADV 6), лучше (ADJ 194, ADV 159), трудно (ADJ 182, ADV 17), равно (ADJ 194, ADV 26), понятно (ADJ 106, ADV 15), первым (ADJ 127, NOUN 1)
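
A lemma or type is ambiguous here in the sense that it occurs with ADJ and with at least one other tag. The sketch below finds such items from (item, tag) pairs; the observations are hypothetical, the name `ambiguous_items` is made up, and ranking by the ADJ count is an assumption, since the page does not state its ordering criterion.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) observations, for illustration only.
observations = [
    ("русский", "ADJ"), ("русский", "ADJ"), ("русский", "NOUN"),
    ("новый", "ADJ"), ("новый", "ADJ"),
    ("рабочий", "NOUN"), ("рабочий", "ADJ"),
]

def ambiguous_items(pairs, target="ADJ"):
    """List items seen with `target` and at least one other tag."""
    by_item = defaultdict(Counter)
    for item, upos in pairs:
        by_item[item][upos] += 1
    found = {
        item: dict(tags)
        for item, tags in by_item.items()
        if target in tags and len(tags) > 1
    }
    # Assumed ordering: most frequent under the target tag first.
    return sorted(found.items(), key=lambda kv: -kv[1][target])

ranked = ambiguous_items(observations)
for item, tags in ranked:
    print(item, tags)
```

The same function works for types if forms are passed instead of lemmas.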

Morphology

The form / lemma ratio of ADJ is 3.143146 (the average of all parts of speech is 2.668075).
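
The reported ratio can be reproduced from the counts at the top of this page, assuming it is the number of distinct ADJ types divided by the number of distinct ADJ lemmas. That definition is an inference from the figures, not something the page states.

```python
# Figures quoted on this page for ADJ in UD_Russian-SynTagRus.
adj_lemmas = 10891
adj_types = 34232

# Assumed definition: distinct word forms per distinct lemma.
form_lemma_ratio = adj_types / adj_lemmas

print(f"{form_lemma_ratio:.6f}")  # 3.143146
```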

The highest number of forms (29) was observed with the lemma “серьезный”: посерьезнее, посерьезней, серьезен, серьезна, серьезная, серьезнее, серьезней, серьезно, серьезного, серьезное, серьезной, серьезном, серьезному, серьезную, серьезны, серьезные, серьезный, серьезным, серьезными, серьезных, серьёзного, серьёзное, серьёзному, серьёзную, серьёзные, серьёзным, серьёзными, серьёзных, сурьезный.

The 2nd highest number of forms (22) was observed with the lemma “тяжелый”: тяжела, тяжелая, тяжелее, тяжело, тяжелого, тяжелое, тяжелой, тяжелом, тяжелому, тяжелую, тяжелы, тяжелые, тяжелый, тяжелым, тяжелыми, тяжелых, тяжёлого, тяжёлое, тяжёлой, тяжёлом, тяжёлые, тяжёлым.

The 3rd highest number of forms (21) was observed with the lemma “жесткий”: жесткая, жесткие, жесткий, жестким, жесткими, жестких, жестко, жесткого, жесткое, жесткой, жестком, жесткому, жесткую, жестче, жёсткая, жёсткие, жёсткий, жёстким, жёстких, жёсткого, жёсткой.
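
The lists above count spellings with ё and with е as separate forms (for example жесткая and жёсткая), which inflates the totals for these lemmas. The sketch below ranks lemmas by number of distinct forms and shows the effect of folding ё into е. The function name `forms_per_lemma` and the `fold_yo` switch are hypothetical, and only lowercase ё is handled.

```python
from collections import defaultdict

# (form, lemma) pairs; the forms are a small subset of those listed above,
# and the repeated "жесткая" is made up to show that duplicates collapse.
observed = [
    ("жесткая", "жесткий"),
    ("жёсткая", "жесткий"),
    ("жестче", "жесткий"),
    ("жесткая", "жесткий"),
    ("тяжело", "тяжелый"),
]

def forms_per_lemma(pairs, fold_yo=False):
    """Rank lemmas by their number of distinct forms, largest first."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        if fold_yo:
            form = form.replace("ё", "е")  # treat ё and е as one spelling
        forms[lemma].add(form)
    return sorted(forms.items(), key=lambda kv: (-len(kv[1]), kv[0]))

ranking = forms_per_lemma(observed)
top_lemma, top_forms = ranking[0]
print(top_lemma, len(top_forms))  # жесткий 3

folded = forms_per_lemma(observed, fold_yo=True)
print(len(folded[0][1]))  # 2
```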

ADJ occurs with 11 features: Degree (134775; 96% instances), Number (133505; 95% instances), Case (121367; 86% instances), Gender (91801; 65% instances), Animacy (12703; 9% instances), Variant (12141; 9% instances), NumForm (5607; 4% instances), NumType (5607; 4% instances), Abbr (53; 0% instances), Typo (5; 0% instances), Foreign (1; 0% instances)

ADJ occurs with 25 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Degree=Pos, Degree=Sup, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Ord, Number=Plur, Number=Sing, Typo=Yes, Variant=Short

ADJ occurs with 101 feature combinations. The most frequent feature combination is Case=Gen|Degree=Pos|Number=Plur (13938 tokens). Examples: новых, российских, разных, научных, политических, последних, различных, экономических, крупных, государственных
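
Features, feature-value pairs and feature combinations are three views of the same FEATS column. A minimal sketch of how they could be tallied follows; the FEATS values are hypothetical, `feature_stats` is a made-up name, and counting the empty value `_` as a combination of its own is an assumption. Multi-valued features such as `Case=Acc,Nom` are not split.

```python
from collections import Counter

# Hypothetical FEATS values of a few ADJ tokens.
feats_column = [
    "Case=Gen|Degree=Pos|Number=Plur",
    "Case=Gen|Degree=Pos|Number=Plur",
    "Case=Nom|Degree=Pos|Gender=Masc|Number=Sing",
    "Degree=Pos|Gender=Neut|Number=Sing|Variant=Short",
    "_",
]

def feature_stats(feats_values):
    """Count feature names, feature=value pairs and whole combinations."""
    features = Counter()
    pairs = Counter()
    combos = Counter()
    for feats in feats_values:
        combos[feats] += 1  # assumption: "_" counts as its own combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos

features, pairs, combos = feature_stats(feats_column)
print(features.most_common())
print(combos.most_common(1))
```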

Relations

ADJ nodes are attached to their parents using 33 different relations: amod (109750; 78% instances), root (8349; 6% instances), conj (7818; 6% instances), obl (4636; 3% instances), parataxis (1848; 1% instances), acl (1200; 1% instances), ccomp (967; 1% instances), nmod (942; 1% instances), compound (834; 1% instances), nsubj (740; 1% instances), fixed (723; 1% instances), acl:relcl (689; 0% instances), advcl (630; 0% instances), xcomp (541; 0% instances), obj (357; 0% instances), appos (171; 0% instances), orphan (102; 0% instances), iobj (96; 0% instances), nsubj:pass (54; 0% instances), list (42; 0% instances), csubj (41; 0% instances), parataxis:discourse (15; 0% instances), obl:tmod (13; 0% instances), obl:pronmod (11; 0% instances), flat (7; 0% instances), vocative (6; 0% instances), advmod (3; 0% instances), csubj:pass (2; 0% instances), flat:foreign (2; 0% instances), flat:name (1; 0% instances), nsubj:outer (1; 0% instances), obl:depict (1; 0% instances), obl:float (1; 0% instances)

Parents of ADJ nodes belong to 16 different parts of speech, counting the artificial root (which has no part of speech) as its own category: NOUN (110967; 79% instances), root, no part of speech (8349; 6% instances), VERB (8269; 6% instances), ADJ (7870; 6% instances), PROPN (2052; 1% instances), PRON (1247; 1% instances), ADV (458; 0% instances), ADP (453; 0% instances), NUM (411; 0% instances), DET (322; 0% instances), X (89; 0% instances), SYM (61; 0% instances), PART (37; 0% instances), INTJ (5; 0% instances), CCONJ (2; 0% instances), SCONJ (1; 0% instances)
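
The 8349 parents that have no part of speech of their own coincide with the 8349 ADJ nodes attached as root: a root node has HEAD 0, and there is no token 0 to look a tag up for. The sketch below tallies incoming relations and parent tags for one sentence. The sentence is hypothetical and `adj_attachment` is a made-up name; because token IDs restart in every sentence, it would have to be applied sentence by sentence.

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD, DEPREL) tuples.
sentence = [
    (1, "ADJ", 2, "amod"),
    (2, "NOUN", 3, "nsubj"),
    (3, "ADJ", 0, "root"),
    (4, "PUNCT", 3, "punct"),
]

def adj_attachment(tokens, target="ADJ"):
    """Count incoming relations and parent UPOS tags of `target` nodes."""
    upos_by_id = {tid: upos for tid, upos, _, _ in tokens}
    relations = Counter()
    parent_pos = Counter()
    for tid, upos, head, deprel in tokens:
        if upos != target:
            continue
        relations[deprel] += 1
        # HEAD 0 is the artificial root, so its tag is recorded as "".
        parent_pos[upos_by_id.get(head, "")] += 1
    return relations, parent_pos

relations, parent_pos = adj_attachment(sentence)
print(dict(relations))
print(dict(parent_pos))
```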

101587 (72%) ADJ nodes are leaves.

17400 (12%) ADJ nodes have one child.

6773 (5%) ADJ nodes have two children.

14833 (11%) ADJ nodes have three or more children.

The highest child degree of an ADJ node is 12.
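
The child degree of a node is the number of tokens whose HEAD points to it. A minimal sketch of the leaf / one / two / three-or-more breakdown for a single sentence follows; the sentence is hypothetical and `child_degrees` is a made-up name. As token IDs are local to a sentence, a full treebank would be processed sentence by sentence and the histograms summed.

```python
from collections import Counter

# Hypothetical sentence as (ID, UPOS, HEAD) triples; HEAD 0 marks the root.
sentence = [
    (1, "ADV", 2),
    (2, "ADJ", 0),
    (3, "ADJ", 4),
    (4, "NOUN", 2),
    (5, "PUNCT", 2),
]

def child_degrees(tokens, target="ADJ"):
    """Return (histogram, max_degree) over nodes tagged `target`."""
    children = Counter(head for _, _, head in tokens)
    degrees = [children[tid] for tid, upos, _ in tokens if upos == target]
    # Buckets 0, 1, 2 and 3; bucket 3 collects "three or more children".
    histogram = Counter(min(d, 3) for d in degrees)
    return histogram, max(degrees, default=0)

histogram, max_degree = child_degrees(sentence)
print(dict(histogram), max_degree)
```

Here token 2 governs tokens 1, 4 and 5, so it falls in the three-or-more bucket, while token 3 is a leaf.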

Children of ADJ nodes are attached using 41 different relations: punct (24716; 27% instances), advmod (12689; 14% instances), nsubj (9378; 10% instances), conj (8349; 9% instances), cc (6258; 7% instances), obl (5950; 6% instances), parataxis (3135; 3% instances), cop (3082; 3% instances), mark (2550; 3% instances), csubj (2475; 3% instances), xcomp (2256; 2% instances), case (2042; 2% instances), det (1772; 2% instances), nmod (1716; 2% instances), iobj (1569; 2% instances), advcl (1216; 1% instances), compound (926; 1% instances), ccomp (726; 1% instances), flat (378; 0% instances), aux (233; 0% instances), amod (164; 0% instances), acl:relcl (132; 0% instances), acl (110; 0% instances), appos (102; 0% instances), nsubj:pass (98; 0% instances), orphan (88; 0% instances), discourse (50; 0% instances), expl (49; 0% instances), flat:foreign (42; 0% instances), parataxis:discourse (41; 0% instances), nummod:gov (39; 0% instances), nummod (35; 0% instances), obl:tmod (25; 0% instances), vocative (12; 0% instances), obj (11; 0% instances), nsubj:outer (4; 0% instances), csubj:pass (3; 0% instances), fixed (3; 0% instances), aux:pass (1; 0% instances), dislocated (1; 0% instances), flat:name (1; 0% instances)

Children of ADJ nodes belong to 17 different parts of speech: PUNCT (24716; 27% instances), NOUN (13981; 15% instances), ADV (10298; 11% instances), VERB (9280; 10% instances), ADJ (7870; 9% instances), CCONJ (6146; 7% instances), PRON (4692; 5% instances), PART (3701; 4% instances), AUX (3317; 4% instances), DET (2536; 3% instances), SCONJ (2481; 3% instances), ADP (2095; 2% instances), PROPN (915; 1% instances), NUM (277; 0% instances), X (73; 0% instances), SYM (42; 0% instances), INTJ (7; 0% instances)