This page pertains to UD version 2.

Treebank Statistics: UD_Russian-Taiga: POS Tags: ADJ

There are 3295 ADJ lemmas (16%), 7212 ADJ types (19%) and 16885 ADJ tokens (9%). Out of 17 observed tags, the rank of ADJ is: 3 in number of lemmas, 3 in number of types and 4 in number of tokens.
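As a rough illustration of how counts like these could be derived, the sketch below tallies lemmas, types and tokens for one UPOS tag from CoNLL-U text. The function name `count_upos` and the two-sentence sample are hypothetical, and treating word forms as case-sensitive types is an assumption; the script that generated this page may count differently.

```python
def count_upos(conllu_text, upos="ADJ"):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        # Skip malformed lines, multiword token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column (assumed case-sensitive)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Hypothetical sample: two short sentences, one ADJ lemma in two forms.
SAMPLE = "\n".join([
    "# text = хороший дом",
    "1\tхороший\tхороший\tADJ\t_\t_\t2\tamod\t_\t_",
    "2\tдом\tдом\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
    "# text = хорошая книга",
    "1\tхорошая\tхороший\tADJ\t_\t_\t2\tamod\t_\t_",
    "2\tкнига\tкнига\tNOUN\t_\t_\t0\troot\t_\t_",
])

print(count_upos(SAMPLE))
```

On the sample, the single lemma хороший occurs in two distinct forms across two tokens.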

The 10 most frequent ADJ lemmas: хороший, большой, нужный, отличный, удобный, вкусный, первый, вежливый, новый, неплохой

The 10 most frequent ADJ types: хороший, большой, нужно, отличный, лучше, неплохой, хорошая, хорошо, хорошее, вежливый

The 10 most frequent ambiguous lemmas: любимый (ADJ 56, NOUN 5), русский (ADJ 47, NOUN 10), лучше (ADJ 41, ADV 3), возможный (ADJ 40, ADV 1), святой (ADJ 38, NOUN 11), старший (ADJ 32, NOUN 10), супер (ADJ 30, PART 2, ADV 1), больной (ADJ 26, NOUN 11), близкий (ADJ 23, NOUN 2), золотой (ADJ 23, NOUN 1)

The 10 most frequent ambiguous types: лучше (ADJ 88, ADV 21), хорошо (ADJ 60, ADV 45, PART 3), дорого (ADJ 49, ADV 6), вкусно (ADJ 43, ADV 10), удобно (ADJ 48, ADV 4), чисто (ADJ 26, ADV 16), интересно (ADJ 32, ADV 3), общем (ADJ 44, NOUN 5), уютно (ADJ 21, ADV 1), красиво (ADJ 19, ADV 10)

Morphology

The form / lemma ratio of ADJ is 2.188771 (the average of all parts of speech is 1.879397).
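The ADJ figure appears to be the number of ADJ types divided by the number of ADJ lemmas, using the counts reported at the top of this page. A quick check, offered as a reading of the numbers rather than a statement of how the page was generated:

```python
adj_types = 7212   # ADJ types reported on this page
adj_lemmas = 3295  # ADJ lemmas reported on this page

# Average number of distinct word forms per lemma.
ratio = adj_types / adj_lemmas
print(round(ratio, 6))
```

Rounded to six decimal places, the result agrees with the ratio stated above.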

The 1st highest number of forms (22) was observed with the lemma “хороший”: Хорлший, Хорошое, Шорошая, лучше, получше, хоро, хорош, хороша, хорошая, хорошего, хорошее, хорошей, хорошем, хороши, хорошие, хорошии, хороший, хорошим, хорошими, хороших, хорошо, хорошую.

The 2nd highest number of forms (20) was observed with the lemma “большой”: Бооольшой, Юольшой, блльшой, большая, больше, большие, большим, большими, больших, большого, большое, большой, большом, большому, большую, велик, велика, велики, по, побольше.

The 3rd highest number of forms (19) was observed with the lemma “черный”: Черная, черна, чернее, черного, черной, черном, черную, черные, черный, черных, чёрен, чёрная, чёрного, чёрное, чёрной, чёрном, чёрные, чёрный, чёрных.

ADJ occurs with 12 features: Degree (16287; 96% instances), Number (15894; 94% instances), Case (13543; 80% instances), Gender (12181; 72% instances), Variant (2362; 14% instances), Animacy (1187; 7% instances), NumForm (717; 4% instances), NumType (717; 4% instances), Abbr (163; 1% instances), Typo (128; 1% instances), Poss (93; 1% instances), Foreign (3; 0% instances)

ADJ occurs with 27 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Degree=Pos, Degree=Sup, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Frac, NumType=Ord, Number=Plur, Number=Sing, Poss=Yes, Typo=Yes, Variant=Short

ADJ occurs with 172 feature combinations. The most frequent feature combination is Case=Nom|Degree=Pos|Gender=Masc|Number=Sing (2766 tokens). Examples: хороший, большой, отличный, неплохой, вежливый, обычный, добрый, красивый, широкий, единственный
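The three kinds of counts above (features, feature-value pairs, feature combinations) can all be read off the FEATS column of a CoNLL-U file. The sketch below is a minimal illustration; the helper `parse_feats` and the four sample values are hypothetical, built only from feature-value pairs listed on this page.

```python
from collections import Counter


def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS values for four ADJ tokens.
feats_column = [
    "Case=Nom|Degree=Pos|Gender=Masc|Number=Sing",
    "Case=Nom|Degree=Pos|Gender=Masc|Number=Sing",
    "Degree=Pos|Gender=Neut|Number=Sing|Variant=Short",
    "Degree=Cmp",
]

# Whole FEATS strings are the "feature combinations".
combinations = Counter(feats_column)
# Feature names, counted once per token that carries them.
features = Counter(name for f in feats_column for name in parse_feats(f))
# Distinct Feature=Value pairs.
pairs = {f"{k}={v}" for f in feats_column for k, v in parse_feats(f).items()}

print(combinations.most_common(1))
print(features["Degree"], features["Number"])
print(len(pairs))
```

Counting whole FEATS strings works here because CoNLL-U requires features within a token to be sorted, so equal feature sets produce equal strings.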

Relations

ADJ nodes are attached to their parents using 28 different relations: amod (11258; 67% instances), root (2120; 13% instances), conj (1579; 9% instances), parataxis (359; 2% instances), obl (293; 2% instances), xcomp (177; 1% instances), nmod (171; 1% instances), acl (155; 1% instances), ccomp (145; 1% instances), nsubj (114; 1% instances), advcl (104; 1% instances), fixed (83; 0% instances), acl:relcl (79; 0% instances), obj (79; 0% instances), compound (46; 0% instances), appos (36; 0% instances), list (24; 0% instances), iobj (20; 0% instances), csubj (12; 0% instances), orphan (10; 0% instances), vocative (6; 0% instances), nsubj:pass (5; 0% instances), flat:name (4; 0% instances), advmod (2; 0% instances), case (1; 0% instances), flat (1; 0% instances), flat:foreign (1; 0% instances), mark (1; 0% instances)

Parents of ADJ nodes belong to 17 different categories, counting the artificial root (which has no part of speech) as one: NOUN (11568; 69% instances), root (2120; 13% instances), VERB (1317; 8% instances), ADJ (1115; 7% instances), PROPN (265; 2% instances), PRON (258; 2% instances), NUM (63; 0% instances), DET (52; 0% instances), ADV (50; 0% instances), ADP (31; 0% instances), X (16; 0% instances), AUX (10; 0% instances), PART (8; 0% instances), INTJ (5; 0% instances), SYM (5; 0% instances), CCONJ (1; 0% instances), SCONJ (1; 0% instances)

10798 (64%) ADJ nodes are leaves.

2182 (13%) ADJ nodes have one child.

1427 (8%) ADJ nodes have two children.

2478 (15%) ADJ nodes have three or more children.

The highest child degree of an ADJ node is 9.
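The leaf, one-child, two-children and three-or-more buckets above can be reproduced by counting how often each token ID appears in the HEAD column. The sketch below assumes a sentence has already been reduced to hypothetical `(id, upos, head)` triples; the function name `child_degree_histogram` and the sample sentence are illustrative only.

```python
from collections import Counter


def child_degree_histogram(rows, upos="ADJ"):
    """rows: (id, upos, head) triples for one sentence; head 0 is the root."""
    # Number of children per token = how often its ID is used as a HEAD.
    n_children = Counter(head for _, _, head in rows)
    hist = Counter()
    for tok_id, tag, _ in rows:
        if tag == upos:
            degree = n_children[tok_id]
            hist[min(degree, 3)] += 1  # bucket 3 = three or more children
    return hist


# Hypothetical sentence: "очень хороший и удобный дом ."
sentence = [
    (1, "ADV", 2),    # очень   -> хороший
    (2, "ADJ", 5),    # хороший -> дом
    (3, "CCONJ", 4),  # и       -> удобный
    (4, "ADJ", 2),    # удобный -> хороший
    (5, "NOUN", 0),   # дом     -> root
    (6, "PUNCT", 5),  # .       -> дом
]

hist = child_degree_histogram(sentence)
print(dict(hist))
```

As a sanity check on the figures above, the four buckets sum to the ADJ token count: 10798 + 2182 + 1427 + 2478 = 16885.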

Children of ADJ nodes are attached using 40 different relations: punct (3879; 27% instances), nsubj (2068; 14% instances), advmod (1966; 14% instances), conj (1635; 11% instances), cc (957; 7% instances), obl (894; 6% instances), csubj (431; 3% instances), parataxis (425; 3% instances), mark (330; 2% instances), case (306; 2% instances), cop (281; 2% instances), iobj (238; 2% instances), det (215; 1% instances), advcl (144; 1% instances), xcomp (128; 1% instances), nmod (111; 1% instances), discourse (97; 1% instances), ccomp (89; 1% instances), flat (78; 1% instances), amod (32; 0% instances), vocative (32; 0% instances), aux (26; 0% instances), compound (24; 0% instances), obj (17; 0% instances), acl:relcl (15; 0% instances), appos (12; 0% instances), expl (12; 0% instances), goeswith (9; 0% instances), orphan (9; 0% instances), acl (8; 0% instances), nsubj:pass (8; 0% instances), list (5; 0% instances), nummod (5; 0% instances), fixed (4; 0% instances), nummod:gov (3; 0% instances), aux:pass (2; 0% instances), dislocated (2; 0% instances), flat:name (2; 0% instances), dep (1; 0% instances), obl:agent (1; 0% instances)

Children of ADJ nodes belong to 17 different parts of speech: PUNCT (3879; 27% instances), NOUN (2692; 19% instances), ADV (1428; 10% instances), VERB (1315; 9% instances), ADJ (1115; 8% instances), CCONJ (956; 7% instances), PRON (908; 6% instances), PART (687; 5% instances), SCONJ (321; 2% instances), AUX (315; 2% instances), ADP (302; 2% instances), DET (268; 2% instances), PROPN (123; 1% instances), SYM (85; 1% instances), NUM (69; 0% instances), X (23; 0% instances), INTJ (15; 0% instances)