
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-SynTagRus: POS Tags: X

There are 1860 X lemmas (3%), 1860 X types (1%) and 3495 X tokens (0%). Out of 17 observed tags, the rank of X is: 6 in number of lemmas, 6 in number of types and 15 in number of tokens.

The 10 most frequent X lemmas: MBA, the, of, ButtKicker, Facebook, FIA, Iridium, and, RoboCup, Apple

The 10 most frequent X types: MBA, the, of, ButtKicker, Facebook, FIA, Iridium, and, RoboCup, Apple

The 10 most frequent ambiguous lemmas: x (SYM 4, X 1), daily (X 3, ADJ 1), а (CCONJ 8346, INTJ 16, NOUN 7, PART 6, X 4), б (NOUN 4, X 3), и (CCONJ 35005, PART 6267, NOUN 4, X 3, VERB 1), * (X 2, PUNCT 1), S. (PROPN 2, X 2), Х (PROPN 9, X 2), аль (PART 10, X 2), грата (X 2, NOUN 1)

The 10 most frequent ambiguous types: X (X 12, ADJ 11, NUM 1), daily (X 3, ADJ 1), а (CCONJ 5759, INTJ 5, X 4, NOUN 3, PART 2), б (AUX 23, X 3), и (CCONJ 31845, PART 6208, NOUN 4, X 3), * (X 2, PUNCT 1), S. (PROPN 2, X 2), Х (PROPN 3, X 2), аль (PART 9, X 2), грата (X 2, NOUN 1)

Morphology

The form / lemma ratio of X is 1.000000 (the average of all parts of speech is 2.668075).

The 1st highest number of forms (1) was observed with the lemma “&”: &.

The 2nd highest number of forms (1) was observed with the lemma “*”: *.

The 3rd highest number of forms (1) was observed with the lemma “.doc”: .doc.
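As an illustration of how a form / lemma ratio like the one above could be derived, here is a minimal sketch. It assumes the ratio is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas for a given tag; the actual statistics script may count differently. The `SAMPLE` fragment and the function name `form_lemma_ratio` are made up for this example and are not taken from the treebank.

```python
from collections import defaultdict

# Made-up CoNLL-U fragment with the standard 10 tab-separated columns:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
SAMPLE = "\n".join([
    "1\tкниги\tкнига\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tкнигу\tкнига\tNOUN\t_\t_\t1\tconj\t_\t_",
    "3\tthe\tthe\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_",
    "4\tthe\tthe\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_",
])


def form_lemma_ratio(conllu_text, upos):
    """Average number of distinct forms per lemma for one UPOS tag (assumed definition)."""
    forms_by_lemma = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            forms_by_lemma[cols[2]].add(cols[1])
    if not forms_by_lemma:
        return 0.0
    total_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return total_forms / len(forms_by_lemma)


print(form_lemma_ratio(SAMPLE, "X"))
print(form_lemma_ratio(SAMPLE, "NOUN"))
```

Under this reading, a ratio of exactly 1 means that every lemma with this tag appears in only one surface form, which is what one would expect for uninflected foreign material.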

X occurs with 2 features: Foreign (3489; 100% instances), Abbr (1; 0% instances)

X occurs with 2 feature-value pairs: Abbr=Yes, Foreign=Yes

X occurs with 3 feature combinations. The most frequent feature combination is Foreign=Yes (3489 tokens). Examples: MBA, the, of, ButtKicker, Facebook, FIA, Iridium, and, RoboCup, Apple
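The feature counts above can be reproduced in spirit with a short sketch. It assumes that a token with no features (`_` in the FEATS column) counts as its own feature combination, which would be one way to arrive at 2 features but 3 combinations; that interpretation is a guess, not something the page states. The `SAMPLE` data and the name `feature_stats` are hypothetical.

```python
from collections import Counter

# Made-up CoNLL-U fragment; column 6 (index 5) is FEATS.
SAMPLE = "\n".join([
    "1\tMBA\tMBA\tX\t_\tForeign=Yes\t0\troot\t_\t_",
    "2\tthe\tthe\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_",
    "3\tetc.\tetc.\tX\t_\tAbbr=Yes\t1\tconj\t_\t_",
    "4\t*\t*\tX\t_\t_\t1\tconj\t_\t_",
])


def feature_stats(conllu_text, upos):
    """Count feature names and whole FEATS combinations for one UPOS tag."""
    features, combos = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit() or cols[3] != upos:
            continue
        feats = cols[5]
        combos[feats] += 1  # "_" (no features) is counted as a combination here
        if feats != "_":
            for pair in feats.split("|"):
                features[pair.split("=", 1)[0]] += 1
    return features, combos


features, combos = feature_stats(SAMPLE, "X")
print(features)
print(combos.most_common(1))
```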

Relations

X nodes are attached to their parents using 25 different relations: flat:foreign (2317; 66% instances), appos (288; 8% instances), nsubj (265; 8% instances), nmod (167; 5% instances), parataxis (89; 3% instances), root (78; 2% instances), conj (75; 2% instances), obl (66; 2% instances), obj (31; 1% instances), flat (24; 1% instances), compound (22; 1% instances), nsubj:pass (20; 1% instances), iobj (12; 0% instances), flat:name (8; 0% instances), vocative (5; 0% instances), amod (4; 0% instances), orphan (4; 0% instances), xcomp (4; 0% instances), advcl (3; 0% instances), case (3; 0% instances), fixed (3; 0% instances), acl (2; 0% instances), csubj (2; 0% instances), list (2; 0% instances), ccomp (1; 0% instances)

Parents of X nodes belong to 14 different parts of speech: NOUN (2055; 59% instances), X (523; 15% instances), VERB (514; 15% instances), PROPN (187; 5% instances), [no parent: root] (78; 2% instances), ADJ (73; 2% instances), NUM (18; 1% instances), ADV (16; 0% instances), DET (16; 0% instances), PRON (6; 0% instances), PART (4; 0% instances), CCONJ (2; 0% instances), SYM (2; 0% instances), ADP (1; 0% instances)
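Counts of incoming relations and parent parts of speech, like those listed above, could be gathered along the following lines. This is only a sketch over a made-up two-sentence sample; the function name `attachment_counts` and the `"ROOT"` placeholder for nodes without a parent are assumptions of this example.

```python
from collections import Counter

# Made-up CoNLL-U fragment: two sentences separated by a blank line.
# Column 7 (index 6) is HEAD, column 8 (index 7) is DEPREL.
SAMPLE = "\n".join([
    "1\tкомпания\tкомпания\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tApple\tApple\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_",
    "3\tInc\tInc\tX\t_\tForeign=Yes\t2\tflat:foreign\t_\t_",
    "",
    "1\tFacebook\tFacebook\tX\t_\tForeign=Yes\t0\troot\t_\t_",
])


def attachment_counts(conllu_text, upos):
    """Count DEPREL labels and parent UPOS tags for nodes with the given UPOS."""
    relations, parent_pos = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [l.split("\t") for l in block.splitlines() if l and not l.startswith("#")]
        rows = [r for r in rows if r[0].isdigit()]  # basic tokens only
        pos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] != upos:
                continue
            relations[r[7]] += 1
            # HEAD 0 has no entry in pos_by_id, so the node is the sentence root.
            parent_pos[pos_by_id.get(r[6], "ROOT")] += 1
    return relations, parent_pos


relations, parent_pos = attachment_counts(SAMPLE, "X")
print(relations)
print(parent_pos)
```

Keeping the root case as its own bucket makes the parent counts add up to the total number of nodes with the tag.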

2217 (63%) X nodes are leaves.

701 (20%) X nodes have one child.

322 (9%) X nodes have two children.

255 (7%) X nodes have three or more children.

The highest child degree of an X node is 23.
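The leaf and child-degree figures above amount to a histogram of how many dependents each node has. A minimal sketch, again on made-up data and with a hypothetical function name `child_degree_histogram`, might look like this:

```python
from collections import Counter

# Made-up CoNLL-U fragment: two sentences separated by a blank line.
SAMPLE = "\n".join([
    "1\tкомпания\tкомпания\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tApple\tApple\tX\t_\tForeign=Yes\t1\tflat:foreign\t_\t_",
    "3\tInc\tInc\tX\t_\tForeign=Yes\t2\tflat:foreign\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
    "",
    "1\tFacebook\tFacebook\tX\t_\tForeign=Yes\t0\troot\t_\t_",
])


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> number of nodes with the given UPOS."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [l.split("\t") for l in block.splitlines() if l and not l.startswith("#")]
        rows = [r for r in rows if r[0].isdigit()]  # basic tokens only
        # Each token contributes one child to the node named in its HEAD column.
        n_children = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == upos:
                hist[n_children[r[0]]] += 1
    return hist


hist = child_degree_histogram(SAMPLE, "X")
print(hist[0], "leaves; highest child degree:", max(hist))
```

In the sample, "Apple" governs both "Inc" and the full stop, so it has two children, while "Inc" and "Facebook" are leaves.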

Children of X nodes are attached using 36 different relations: punct (1262; 50% instances), flat:foreign (478; 19% instances), case (134; 5% instances), appos (98; 4% instances), amod (72; 3% instances), conj (70; 3% instances), parataxis (70; 3% instances), cc (55; 2% instances), nsubj (46; 2% instances), advmod (40; 2% instances), nmod (36; 1% instances), flat (29; 1% instances), obl (24; 1% instances), det (17; 1% instances), acl:relcl (15; 1% instances), acl (13; 1% instances), list (8; 0% instances), mark (8; 0% instances), obj (8; 0% instances), orphan (7; 0% instances), advcl (6; 0% instances), cop (5; 0% instances), nummod (5; 0% instances), nsubj:pass (4; 0% instances), aux:pass (3; 0% instances), fixed (3; 0% instances), flat:name (3; 0% instances), parataxis:discourse (2; 0% instances), ccomp (1; 0% instances), compound (1; 0% instances), csubj:pass (1; 0% instances), expl (1; 0% instances), iobj (1; 0% instances), nummod:entity (1; 0% instances), nummod:gov (1; 0% instances), xcomp (1; 0% instances)

Children of X nodes belong to 16 different parts of speech: PUNCT (1262; 50% instances), X (523; 21% instances), NOUN (178; 7% instances), ADP (134; 5% instances), VERB (90; 4% instances), ADJ (89; 4% instances), CCONJ (53; 2% instances), ADV (43; 2% instances), NUM (30; 1% instances), PROPN (30; 1% instances), SYM (22; 1% instances), DET (21; 1% instances), PRON (18; 1% instances), PART (15; 1% instances), SCONJ (13; 1% instances), AUX (8; 0% instances)