
This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-SiMoNERo: POS Tags: X

There are 624 X lemmas (5%), 627 X types (3%) and 1273 X tokens (1%). Out of 16 observed tags, the rank of X is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.
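As a rough illustration of how such counts could be obtained, the sketch below tallies lemmas, types and tokens for one tag from CoNLL-U text. It assumes that "types" means distinct, case-sensitive word forms and that multiword-token ranges and empty nodes are skipped; the function name `x_counts` and the sample sentence are hypothetical and not taken from the treebank or from the tool that generated this page.

```python
def x_counts(conllu_text, upos="X"):
    """Count distinct lemmas, distinct word forms (types) and tokens for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip malformed lines, multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM (assumed case-sensitive)
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Hypothetical three-token example, not a sentence from the treebank.
sample = "\n".join([
    "# text = GH beta GH .",
    "1\tGH\tGH\tX\t_\t_\t0\troot\t_\t_",
    "2\tbeta\tbeta\tX\t_\t_\t1\tflat\t_\t_",
    "3\tGH\tGH\tX\t_\t_\t1\tflat\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
])
print(x_counts(sample))  # (lemmas, types, tokens)
```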

The 10 most frequent X lemmas: ß, jeun, GH, à, T1DM, beta, in, Ca2, of, et

The 10 most frequent X types: ß, jeun, GH, à, T1DM, in, beta, Ca2, of, et

The 10 most frequent ambiguous lemmas: ß (X 26, NOUN 7), GH (X 21, NOUN 8), T1DM (X 19, NOUN 7), beta (X 19, NOUN 7, ADJ 1), in (X 19, NOUN 5), T2DM (NOUN 24, X 13), SF-36 (X 12, NOUN 3), post (X 10, NOUN 1), al (DET 2868, X 7, NOUN 2), factor (NOUN 229, X 7)

The 10 most frequent ambiguous types: ß (X 26, NOUN 7), GH (X 21, NOUN 8), T1DM (X 19, NOUN 7), in (X 18, NOUN 5), beta (X 18, NOUN 6, ADJ 1), T2DM (NOUN 24, X 13), SF-36 (X 12, NOUN 3), &b.beta; (X 10, ADJ 1), al (DET 569, X 7, NOUN 2), factor (NOUN 51, X 6)

Morphology

The form / lemma ratio of X is 1.004808 (the average of all parts of speech is 1.666637).
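The reported ratio is consistent with dividing the number of X types by the number of X lemmas given above (627 and 624). The short check below assumes that this is how the figure was derived; the variable names are illustrative only.

```python
x_types = 627   # X types reported on this page
x_lemmas = 624  # X lemmas reported on this page

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")
```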

The 1st highest number of forms (5) was observed with the lemma “_”: celulare, chirurgical, compensarea, jeun, mult.

The 2nd highest number of forms (2) was observed with the lemma “beta”: beta, 𝛽.

The 3rd highest number of forms (2) was observed with the lemma “sine”: se, sine.

X occurs with 2 features: Abbr (24; 2% instances), Typo (1; 0% instances)

X occurs with 2 feature-value pairs: Abbr=Yes, Typo=Yes

X occurs with 3 feature combinations. The most frequent feature combination is _ (1248 tokens). Examples: ß, jeun, à, T1DM, in, beta, Ca2, of, et, T2DM

Relations

X nodes are attached to their parents using 22 different relations: nmod (569; 45% instances), flat (255; 20% instances), appos (104; 8% instances), amod (91; 7% instances), conj (78; 6% instances), nsubj (34; 3% instances), obl (33; 3% instances), obj (18; 1% instances), case (16; 1% instances), dep (16; 1% instances), fixed (11; 1% instances), obl:agent (9; 1% instances), parataxis (9; 1% instances), advmod (6; 0% instances), root (6; 0% instances), goeswith (5; 0% instances), nsubj:pass (3; 0% instances), obl:pmod (3; 0% instances), acl (2; 0% instances), iobj (2; 0% instances), xcomp (2; 0% instances), csubj (1; 0% instances)
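The percentages appear to be each relation's share of the 1273 X tokens, rounded to the nearest whole percent. The sketch below reproduces the first five figures under that assumption; the dictionary holds only a subset of the counts listed above.

```python
x_tokens = 1273  # total X tokens reported on this page

# Subset of the relation counts listed above.
relation_counts = {"nmod": 569, "flat": 255, "appos": 104, "amod": 91, "conj": 78}

# Assumption: percentages are rounded to the nearest integer.
shares = {rel: round(100 * n / x_tokens) for rel, n in relation_counts.items()}

for rel, n in relation_counts.items():
    print(f"{rel} ({n}; {shares[rel]}% instances)")
```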

Parents of X nodes belong to 9 different parts of speech (counting the artificial root, which has no tag): NOUN (635; 50% instances), X (445; 35% instances), VERB (93; 7% instances), PROPN (47; 4% instances), ADJ (31; 2% instances), PRON (8; 1% instances), no parent, i.e. root (6; 0% instances), NUM (5; 0% instances), ADV (3; 0% instances)

703 (55%) X nodes are leaves.

245 (19%) X nodes have one child.

145 (11%) X nodes have two children.

180 (14%) X nodes have three or more children.

The highest child degree of an X node is 13.
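The four child-degree counts above (703, 245, 145 and 180) add up to the 1273 X tokens. A minimal sketch of how such a breakdown could be computed for a single sentence follows; the `(id, upos, head)` row layout, the function name `child_degree_buckets` and the sample rows are assumptions made for illustration.

```python
from collections import Counter


def child_degree_buckets(rows, upos="X"):
    """Bucket nodes of one UPOS tag by number of children: 0, 1, 2, or 3 (meaning 3+).

    rows: list of (id, upos, head) tuples for a single sentence.
    """
    # Number of children per node = how often its id appears as a HEAD value.
    n_children = Counter(head for _, _, head in rows)
    buckets = Counter()
    for tok_id, tag, _ in rows:
        if tag == upos:
            buckets[min(n_children[tok_id], 3)] += 1
    return buckets


# Hypothetical sentence: node 1 has three children, node 2 has one, nodes 3 and 5 are leaves.
rows = [(1, "X", 0), (2, "X", 1), (3, "X", 1), (4, "PUNCT", 1), (5, "X", 2)]
buckets = child_degree_buckets(rows)
print(dict(buckets))
```

Summing such buckets over all sentences would give the treebank-level figures.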

Children of X nodes are attached using 25 different relations: punct (337; 26% instances), nmod (208; 16% instances), flat (202; 16% instances), case (133; 10% instances), amod (98; 8% instances), conj (92; 7% instances), cc (57; 4% instances), det (32; 2% instances), nummod (30; 2% instances), appos (29; 2% instances), acl (20; 2% instances), advmod (12; 1% instances), fixed (10; 1% instances), cop (6; 0% instances), nsubj (6; 0% instances), dep (3; 0% instances), parataxis (3; 0% instances), aux (2; 0% instances), advcl (1; 0% instances), cc:preconj (1; 0% instances), ccomp:pmod (1; 0% instances), goeswith (1; 0% instances), mark (1; 0% instances), obl (1; 0% instances), xcomp (1; 0% instances)

Children of X nodes belong to 14 different parts of speech: X (445; 35% instances), PUNCT (337; 26% instances), NOUN (136; 11% instances), ADP (118; 9% instances), ADJ (69; 5% instances), CCONJ (54; 4% instances), DET (34; 3% instances), NUM (34; 3% instances), ADV (19; 1% instances), VERB (19; 1% instances), PROPN (11; 1% instances), AUX (8; 1% instances), PRON (2; 0% instances), PART (1; 0% instances)