
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-Taiga: POS Tags: PART

There are 178 PART lemmas (0%), 214 PART types (0%) and 53493 PART tokens (3%). Out of 17 observed tags, the rank of PART is: 10 in number of lemmas, 12 in number of types and 11 in number of tokens.

The 10 most frequent PART lemmas: не, и, же, только, вот, даже, ли, ну, это, ни

The 10 most frequent PART types: не, и, же, только, вот, даже, ли, ну, это, ни

The 10 most frequent ambiguous lemmas: не (PART 21392, X 14, CCONJ 3), и (CCONJ 47408, PART 5895, X 27, NOUN 26), же (PART 4293, X 4, CCONJ 1), только (PART 3290, CCONJ 129, SCONJ 11), вот (PART 2237, ADV 1, X 1), даже (PART 1549, CCONJ 1), ли (PART 1540, X 2, CCONJ 1), ну (PART 1189, INTJ 10, VERB 4, X 4), это (PRON 6178, PART 1170, DET 2), ни (PART 1051, CCONJ 769, VERB 3, X 1)

The 10 most frequent ambiguous types: не (PART 19740, VERB 34, X 13, ADV 3, ADJ 2, CCONJ 2, NUM 1, PRON 1), и (CCONJ 42913, PART 5881, X 27, NOUN 19, ADP 2), же (PART 4292, X 12, CCONJ 1), только (PART 2915, CCONJ 75, SCONJ 9), вот (PART 1116, ADV 1), ли (PART 1537, X 2, NOUN 1), ну (PART 190, VERB 4, X 4, INTJ 2), это (PRON 3287, PART 1038, DET 866), ни (PART 954, CCONJ 707, VERB 3, PRON 2, ADV 1, DET 1, X 1), нет (VERB 860, PART 365)

Morphology

The form / lemma ratio of PART is 1.202247 (the average of all parts of speech is 2.706171).
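The reported ratio is consistent with dividing the number of PART types by the number of PART lemmas given above. A minimal Python check, assuming that is how the figure is derived (the variable names are illustrative only):

```python
# Counts reported on this page for PART in UD_Russian-Taiga.
part_types = 214   # distinct word forms tagged PART
part_lemmas = 178  # distinct lemmas tagged PART

# Assumption: "form / lemma ratio" means types divided by lemmas.
form_lemma_ratio = part_types / part_lemmas
print(f"{form_lemma_ratio:.6f}")  # 1.202247
```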

The 1st highest number of forms (8) was observed with the lemma “не”: Неееее, е, н-не, на, не, неее, ни, нп.

The 2nd highest number of forms (8) was observed with the lemma “пожалуйста”: Пожалоста, Пожалуйстаааа, пж, пжж, подалуйста, пожалуйста, пожалуйстааа, пожалюйста.

The 3rd highest number of forms (5) was observed with the lemma “да”: Д-да, Да-а, Даа, да, даааа.

PART occurs with 5 features: Polarity (23245; 43% instances), ExtPos (1244; 2% instances), Foreign (273; 1% instances), Typo (43; 0% instances), Abbr (6; 0% instances)

PART occurs with 11 feature-value pairs: Abbr=Yes, ExtPos=ADV, ExtPos=CCONJ, ExtPos=DET, ExtPos=INTJ, ExtPos=PART, ExtPos=SCONJ, ExtPos=VERB, Foreign=Yes, Polarity=Neg, Typo=Yes

PART occurs with 21 feature combinations. The most frequent feature combination is _ (29479 tokens). Examples: и, же, только, вот, даже, ли, это, ну, то, ведь

Relations

PART nodes are attached to their parents using 30 different relations: advmod (43181; 81% instances), fixed (3699; 7% instances), discourse (1487; 3% instances), root (1252; 2% instances), expl (1165; 2% instances), cc (817; 2% instances), parataxis:discourse (765; 1% instances), conj (347; 1% instances), flat:name (251; 0% instances), parataxis (224; 0% instances), mark (132; 0% instances), appos (44; 0% instances), orphan (44; 0% instances), nmod (14; 0% instances), advcl (12; 0% instances), obl (12; 0% instances), compound (8; 0% instances), nsubj (7; 0% instances), obj (7; 0% instances), case (6; 0% instances), ccomp (4; 0% instances), dislocated (3; 0% instances), dep (2; 0% instances), flat (2; 0% instances), flat:goeswith (2; 0% instances), vocative (2; 0% instances), acl (1; 0% instances), csubj (1; 0% instances), det (1; 0% instances), iobj (1; 0% instances)

Parents of PART nodes belong to 17 different parts of speech: VERB (23373; 44% instances), NOUN (9333; 17% instances), ADV (5620; 11% instances), ADJ (3836; 7% instances), PRON (3208; 6% instances), DET (2694; 5% instances), PART (1806; 3% instances), [root, no POS] (1252; 2% instances), PROPN (887; 2% instances), NUM (644; 1% instances), CCONJ (319; 1% instances), SCONJ (296; 1% instances), ADP (80; 0% instances), AUX (55; 0% instances), INTJ (47; 0% instances), X (40; 0% instances), SYM (3; 0% instances)
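Counts like the relation and parent-POS figures above can be reproduced from a treebank's CoNLL-U files. The following is a minimal sketch using only the Python standard library. It is not the script that generated this page, and the sample sentences, `ROWS`, `sentences` and `part_attachments` are hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical sample, not taken from the treebank. Fields are written with
# spaces for readability and converted to the tab-separated CoNLL-U layout.
ROWS = [
    "1 Он он PRON _ _ 3 nsubj _ _",
    "2 не не PART _ Polarity=Neg 3 advmod _ _",
    "3 пришёл прийти VERB _ _ 0 root _ _",
    "4 . . PUNCT _ _ 3 punct _ _",
    "",
    "1 Да да PART _ _ 0 root _ _",
    "2 . . PUNCT _ _ 1 punct _ _",
    "",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def sentences(conllu_text):
    """Yield each sentence as a list of 10-column token rows."""
    tokens = []
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if tokens:
                yield tokens
                tokens = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep plain word lines; skip multiword ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                tokens.append(cols)
    if tokens:
        yield tokens


def part_attachments(conllu_text):
    """Count the relations and the parent POS tags of PART nodes."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences(conllu_text):
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            if cols[3] == "PART":
                relations[cols[7]] += 1
                # HEAD 0 is the artificial root, which has no POS tag,
                # so it is recorded here under the empty string.
                parent_pos[upos_by_id.get(cols[6], "")] += 1
    return relations, parent_pos


relations, parent_pos = part_attachments(SAMPLE)
print(relations)
print(parent_pos)
```

In this sketch a PART node attached as `root` has a parent with no POS tag. On this page, the parent entry with 1252 instances has the same count as the `root` relation, which is consistent with it standing for the artificial root.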

46440 (87%) PART nodes are leaves.

5321 (10%) PART nodes have one child.

875 (2%) PART nodes have two children.

857 (2%) PART nodes have three or more children.

The highest child degree of a PART node is 13.
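The child-degree figures above can be derived by counting, for each PART node, how many tokens name it as their head. A minimal sketch under the same caveat: the sample sentences and the helper `part_child_degrees` are hypothetical, and this is not the tool that produced the page.

```python
from collections import Counter

# Hypothetical sample, not taken from the treebank. Fields are written with
# spaces for readability and converted to the tab-separated CoNLL-U layout.
ROWS = [
    "1 Да да PART _ _ 0 root _ _",
    "2 , , PUNCT _ _ 3 punct _ _",
    "3 конечно конечно ADV _ _ 1 parataxis _ _",
    "4 . . PUNCT _ _ 1 punct _ _",
    "",
    "1 Он он PRON _ _ 3 nsubj _ _",
    "2 не не PART _ Polarity=Neg 3 advmod _ _",
    "3 пришёл прийти VERB _ _ 0 root _ _",
    "4 . . PUNCT _ _ 3 punct _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def part_child_degrees(conllu_text):
    """Return the number of children of every PART node, in corpus order."""
    degrees = []
    for block in conllu_text.split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep plain word lines; skip multiword ranges and empty nodes.
        rows = [r for r in rows if r[0].isdigit()]
        # Count how often each token ID appears in the HEAD column.
        children = Counter(r[6] for r in rows)
        degrees.extend(children[r[0]] for r in rows if r[3] == "PART")
    return degrees


degrees = part_child_degrees(SAMPLE)
# Bucket the degrees as on this page: 0, 1, 2, and "three or more" (stored as 3).
histogram = Counter(min(d, 3) for d in degrees)
print(histogram, max(degrees))
```

As a sanity check on the reported figures, the four buckets add up to the PART token total: 46440 + 5321 + 875 + 857 = 53493.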

Children of PART nodes are attached using 35 different relations: punct (5061; 49% instances), advmod (1548; 15% instances), fixed (1308; 13% instances), parataxis (478; 5% instances), nsubj (473; 5% instances), conj (310; 3% instances), cc (308; 3% instances), flat:name (239; 2% instances), discourse (225; 2% instances), vocative (99; 1% instances), iobj (51; 0% instances), parataxis:discourse (45; 0% instances), mark (33; 0% instances), obl (23; 0% instances), advcl (17; 0% instances), case (17; 0% instances), csubj (10; 0% instances), cop (8; 0% instances), det (8; 0% instances), goeswith (8; 0% instances), aux (6; 0% instances), flat (6; 0% instances), obj (6; 0% instances), orphan (5; 0% instances), acl:relcl (4; 0% instances), appos (4; 0% instances), nmod (4; 0% instances), amod (3; 0% instances), obl:tmod (3; 0% instances), xcomp (2; 0% instances), acl (1; 0% instances), ccomp (1; 0% instances), expl (1; 0% instances), nsubj:outer (1; 0% instances), nummod:gov (1; 0% instances)

Children of PART nodes belong to 17 different parts of speech: PUNCT (5061; 49% instances), PART (1806; 18% instances), ADV (973; 9% instances), VERB (575; 6% instances), NOUN (505; 5% instances), CCONJ (348; 3% instances), PROPN (294; 3% instances), PRON (278; 3% instances), SCONJ (197; 2% instances), AUX (73; 1% instances), INTJ (61; 1% instances), ADJ (50; 0% instances), DET (36; 0% instances), ADP (30; 0% instances), X (14; 0% instances), SYM (12; 0% instances), NUM (4; 0% instances)