This page pertains to UD version 2.

Treebank Statistics: UD_Russian-GSD: POS Tags: X

There are 1205 X lemmas (6%), 1215 X types (4%) and 1505 X tokens (2%). Out of 16 observed tags, the rank of X is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent X lemmas: the, of, _, a, and, Airlines, company, de, music, to

The 10 most frequent X types: the, of, a, and, Airlines, Music, Records, company, de, to

The 10 most frequent ambiguous lemmas: a (X 6, NOUN 2), де (PART 6, X 5), ISO (X 4, PROPN 2), windows (X 4, NOUN 1), F (X 3, NOUN 2), MTV (X 3, PROPN 1), i (NOUN 2, X 1), а (CCONJ 275, NOUN 10, X 3), и (CCONJ 2230, PART 119, X 3), стрит (X 2, NOUN 1)

The 10 most frequent ambiguous types: a (X 4, NOUN 2), де (PART 26, X 4), же (PART 115, X 5), C (X 4, ADP 2, NOUN 2), ISO (X 4, PROPN 2), Windows (X 4, NOUN 1), F (X 3, NOUN 2), I (ADJ 22, X 3), MTV (X 3, PROPN 1), а (CCONJ 261, X 3, NOUN 1)
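The two lists above can be read as "lemmas (or types) that occur with X and with at least one other tag". A minimal sketch of that selection follows. It assumes this definition, since the page does not state one, and uses a hypothetical helper name and toy input. The ISO counts are taken from the list above; the count for "the" is invented for illustration.

```python
from collections import Counter, defaultdict

def ambiguous_lemmas(tagged, tag="X"):
    """Return {lemma: Counter(tag -> count)} for lemmas seen with `tag`
    and with at least one other tag. Hypothetical helper, not the
    script that generated this page."""
    by_lemma = defaultdict(Counter)
    for lemma, upos in tagged:
        by_lemma[lemma][upos] += 1
    return {lemma: counts for lemma, counts in by_lemma.items()
            if tag in counts and len(counts) > 1}

# Toy input: ISO as reported above (X 4, PROPN 2), plus an unambiguous
# lemma "the" with an illustrative count of 3.
toy = [("ISO", "X")] * 4 + [("ISO", "PROPN")] * 2 + [("the", "X")] * 3
result = ambiguous_lemmas(toy)
print(result)
```

The same function applies to types if word forms are passed in place of lemmas.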

Morphology

The form / lemma ratio of X is 1.008299 (the average of all parts of speech is 1.598617).
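The reported ratio is consistent with dividing the number of X types by the number of X lemmas given at the top of this page. The short check below assumes that this is how the figure is derived; the variable names are illustrative.

```python
# Counts taken from the summary line of this page.
x_types = 1215   # distinct X word forms
x_lemmas = 1205  # distinct X lemmas

# Assumed definition: forms (types) per lemma.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")
```

A value this close to 1 is what one would expect for a tag dominated by uninflected foreign material.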

The 1st highest number of forms (8) was observed with the lemma “_”: ru, ЗЗ, бы, же, западе, нибудь, соm, таки.

The 2nd highest number of forms (2) was observed with the lemma “Hume”: Hume, Hume's.

The 3rd highest number of forms (2) was observed with the lemma “boy”: Boy, Boys.

X occurs with 3 features: Foreign (1483; 99% instances), Abbr (2; 0% instances), Typo (1; 0% instances)

X occurs with 3 feature-value pairs: Abbr=Yes, Foreign=Yes, Typo=Yes

X occurs with 4 feature combinations. The most frequent feature combination is Foreign=Yes (1482 tokens). Examples: the, of, a, and, Airlines, Music, Records, company, de, to

Relations

X nodes are attached to their parents using 21 different relations: flat:foreign (635; 42% instances), appos (368; 24% instances), conj (138; 9% instances), nmod (99; 7% instances), nsubj (74; 5% instances), obl (40; 3% instances), flat:name (39; 3% instances), list (14; 1% instances), compound (12; 1% instances), goeswith (12; 1% instances), obj (11; 1% instances), orphan (11; 1% instances), parataxis (10; 1% instances), amod (9; 1% instances), nsubj:pass (8; 1% instances), xcomp (7; 0% instances), obl:agent (5; 0% instances), root (5; 0% instances), iobj (4; 0% instances), case (3; 0% instances), cc (1; 0% instances)

Parents of X nodes belong to 14 different parts of speech: X (846; 56% instances), NOUN (452; 30% instances), VERB (121; 8% instances), PROPN (48; 3% instances), ADJ (13; 1% instances), NUM (6; 0% instances), [no tag; root-attached nodes] (5; 0% instances), CCONJ (4; 0% instances), SYM (4; 0% instances), ADP (2; 0% instances), ADV (1; 0% instances), DET (1; 0% instances), PART (1; 0% instances), SCONJ (1; 0% instances)

790 (52%) X nodes are leaves.

291 (19%) X nodes have one child.

155 (10%) X nodes have two children.

269 (18%) X nodes have three or more children.

The highest child degree of an X node is 17.
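The child-degree figures above could be recomputed from the treebank's CoNLL-U files along the following lines. This is a sketch rather than the script that produced this page: the function name is hypothetical, the sample sentence and its annotation are made up, and multiword-token ranges and empty nodes are skipped.

```python
from collections import Counter

def child_degree_histogram(conllu_text, upos="X"):
    """Map number of children -> number of nodes tagged `upos`."""
    hist = Counter()
    sentence = []  # (id, upos, head) for each basic token

    def flush():
        # Count how many tokens point to each head within the sentence.
        children = Counter(head for _, _, head in sentence)
        for tok_id, tok_upos, _ in sentence:
            if tok_upos == upos:
                hist[children[tok_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()          # blank line ends a sentence
            continue
        if line.startswith("#"):
            continue         # sentence-level comment
        cols = line.split("\t")
        # Keep only basic tokens: 10 columns and a plain integer ID.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence.append((cols[0], cols[3], cols[6]))
    flush()                  # last sentence may lack a trailing blank line
    return hist

# Made-up two-token sample; the annotation is illustrative only.
SAMPLE = "\n".join([
    "# text = The Beatles",
    "1\tThe\tthe\tX\t_\tForeign=Yes\t0\troot\t_\t_",
    "2\tBeatles\tBeatles\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_",
    "",
])
hist = child_degree_histogram(SAMPLE)
print(dict(hist))
```

In the returned histogram, key 0 corresponds to the leaves, keys 1 and 2 to nodes with one and two children, and the largest key to the highest child degree.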

Children of X nodes are attached using 25 different relations: punct (680; 36% instances), flat:foreign (647; 34% instances), conj (145; 8% instances), appos (82; 4% instances), case (81; 4% instances), cc (51; 3% instances), flat:name (38; 2% instances), nummod:entity (36; 2% instances), nmod (23; 1% instances), amod (21; 1% instances), list (16; 1% instances), advmod (10; 1% instances), nummod (7; 0% instances), acl (6; 0% instances), dep (6; 0% instances), parataxis (6; 0% instances), acl:relcl (5; 0% instances), nummod:gov (4; 0% instances), orphan (3; 0% instances), advcl (2; 0% instances), det (2; 0% instances), nsubj (2; 0% instances), obl (2; 0% instances), compound (1; 0% instances), goeswith (1; 0% instances)

Children of X nodes belong to 13 different parts of speech: X (846; 45% instances), PUNCT (680; 36% instances), ADP (73; 4% instances), NOUN (67; 4% instances), NUM (55; 3% instances), CCONJ (48; 3% instances), PROPN (34; 2% instances), ADJ (27; 1% instances), VERB (18; 1% instances), SYM (13; 1% instances), ADV (9; 0% instances), DET (4; 0% instances), PART (3; 0% instances)