
This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-SiMoNERo: POS Tags: X

There are 622 X lemmas (5%), 623 X types (3%) and 1271 X tokens (1%). Out of 16 observed tags, the rank of X is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.
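As a rough illustration of how counts like these can be obtained, the sketch below tallies lemmas, types and tokens for one tag from CoNLL-U text. It assumes that a "type" is a distinct, case-sensitive word form. The sample data and the names `SAMPLE`, `read_tokens` and `pos_counts` are made up for this example and are not part of the treebank or of the tool that generated this page.

```python
# Toy CoNLL-U fragment (hypothetical, not taken from the treebank).
SAMPLE = """\
1\tGH\tGH\tX\t_\t_\t0\troot\t_\t_
2\tbeta\tbeta\tX\t_\t_\t1\tflat\t_\t_
3\tfactor\tfactor\tNOUN\t_\t_\t1\tnmod\t_\t_

1\tbeta\tbeta\tX\t_\t_\t0\troot\t_\t_
"""


def read_tokens(text):
    """Yield the 10 CoNLL-U columns of every regular token line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def pos_counts(text, upos):
    """Return (number of lemmas, number of types, number of tokens) for a UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for cols in read_tokens(text):
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column, assumed case-sensitive
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


print(pos_counts(SAMPLE, "X"))
```

The percentages in the paragraph above would then follow from dividing each count by the corresponding total over all tags.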

The 10 most frequent X lemmas: ß, jeun, GH, à, T1DM, beta, in, Ca2, of, et

The 10 most frequent X types: ß, jeun, GH, à, T1DM, in, beta, Ca2, of, et

The 10 most frequent ambiguous lemmas: ß (X 26, NOUN 7), GH (X 21, NOUN 8), T1DM (X 19, NOUN 7), beta (X 19, NOUN 7, ADJ 1), in (X 19, NOUN 5), T2DM (NOUN 24, X 13), SF-36 (X 12, NOUN 3), post (X 10, NOUN 1), al (DET 2868, X 7, NOUN 2), factor (NOUN 229, X 7)

The 10 most frequent ambiguous types: ß (X 26, NOUN 7), GH (X 21, NOUN 8), T1DM (X 19, NOUN 7), in (X 18, NOUN 5), beta (X 18, NOUN 6, ADJ 1), T2DM (NOUN 24, X 13), SF-36 (X 12, NOUN 3), al (DET 569, X 7, NOUN 2), factor (NOUN 51, X 6), per (X 7, ADP 6)

Morphology

The form / lemma ratio of X is 1.001608 (the average of all parts of speech is 1.666462).
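The reported ratio agrees with dividing the number of X types by the number of X lemmas given earlier on this page. The short check below assumes that this is how the figure is computed; the variable names are illustrative only.

```python
# Counts taken from the summary at the top of this page.
x_types = 623
x_lemmas = 622

# Assumption: the form / lemma ratio is distinct forms divided by distinct lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")  # 1.001608
```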

The highest number of forms (2) was observed with the lemma “beta”: beta, 𝛽.

The second highest number of forms (also 2) was observed with the lemma “sine”: se, sine.

The third highest number of forms (1) was observed with the lemma “&b.alpha”: &b.alpha.

X occurs with 1 feature: Abbr (24; 2% instances)

X occurs with 1 feature-value pair: Abbr=Yes

X occurs with 2 feature combinations. The most frequent feature combination is _ (1247 tokens). Examples: ß, jeun, à, T1DM, in, beta, Ca2, of, et, T2DM
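The three feature statistics above (features, feature-value pairs, and feature combinations) can all be derived from the FEATS column. A minimal sketch follows, assuming FEATS values in the standard `Name=Value|Name=Value` form with `_` for "no features". The function name `feature_stats` and the input list are hypothetical.

```python
from collections import Counter


def feature_stats(feats_values):
    """Count feature names, feature=value pairs and whole FEATS combinations."""
    names, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1  # the full FEATS string is one combination
        if feats == "_":
            continue        # "_" means the token carries no features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            names[name] += 1
            pairs[pair] += 1
    return names, pairs, combos


# Hypothetical FEATS values for five X tokens.
names, pairs, combos = feature_stats(["_", "_", "Abbr=Yes", "_", "Abbr=Yes"])
print(len(names), len(pairs), len(combos))
print(combos.most_common(1))
```

On this page, the same bookkeeping gives 24 tokens with Abbr=Yes and 1247 tokens with the empty combination, which together account for all 1271 X tokens.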

Relations

X nodes are attached to their parents using 22 different relations: nmod (569; 45% instances), flat (254; 20% instances), appos (104; 8% instances), amod (92; 7% instances), conj (78; 6% instances), nsubj (35; 3% instances), obl (33; 3% instances), obj (18; 1% instances), dep (17; 1% instances), case (16; 1% instances), fixed (11; 1% instances), obl:agent (9; 1% instances), parataxis (9; 1% instances), advmod (6; 0% instances), root (6; 0% instances), nsubj:pass (3; 0% instances), obl:pmod (3; 0% instances), acl (2; 0% instances), iobj (2; 0% instances), xcomp (2; 0% instances), csubj (1; 0% instances), goeswith (1; 0% instances)

Parents of X nodes belong to 9 different categories (8 parts of speech, plus an empty category for root nodes, which have no parent): NOUN (636; 50% instances), X (444; 35% instances), VERB (94; 7% instances), PROPN (47; 4% instances), ADJ (29; 2% instances), PRON (8; 1% instances), [no parent] (6; 0% instances), NUM (5; 0% instances), ADV (2; 0% instances)
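The relation and parent statistics above can be sketched as two counters over the HEAD and DEPREL columns. The example below uses one toy sentence. The data, the tuple layout, and the names `parent_stats` and `as_percentages` are assumptions made for illustration, and rounding to whole percentages is assumed to match the figures shown on this page.

```python
from collections import Counter

# One toy sentence as (id, form, upos, head, deprel) tuples; hypothetical data.
SENTENCE = [
    (1, "nivel", "NOUN", 0, "root"),
    (2, "GH", "X", 1, "nmod"),
    (3, "beta", "X", 2, "flat"),
    (4, ".", "PUNCT", 1, "punct"),
]


def parent_stats(sentence, upos):
    """Count incoming relations and parent UPOS tags for nodes tagged `upos`."""
    upos_by_id = {tok[0]: tok[2] for tok in sentence}
    relations, parent_pos = Counter(), Counter()
    for _id, _form, tok_upos, head, deprel in sentence:
        if tok_upos != upos:
            continue
        relations[deprel] += 1
        # Head 0 is the artificial root, which has no part of speech.
        parent_pos[upos_by_id.get(head, "")] += 1
    return relations, parent_pos


def as_percentages(counter):
    """Convert raw counts to whole-number percentages of the total."""
    total = sum(counter.values())
    return {key: round(100 * n / total) for key, n in counter.items()}


relations, parent_pos = parent_stats(SENTENCE, "X")
print(relations, parent_pos)
print(as_percentages(relations))
```

Applied to the counts on this page, 569 nmod attachments out of 1271 X tokens round to the 45% shown above.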

701 (55%) X nodes are leaves.

246 (19%) X nodes have one child.

144 (11%) X nodes have two children.

180 (14%) X nodes have three or more children.

The highest child degree of an X node is 13.
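The child-degree figures above amount to counting, for every X node, how many tokens name it as their head. A minimal sketch on toy data follows; the sentence, the tuple layout, and the names `child_degrees` and `bucket` are hypothetical.

```python
from collections import Counter

# Toy sentence as (id, form, upos, head) tuples; hypothetical data.
SENTENCE = [
    (1, "nivel", "NOUN", 0),
    (2, "GH", "X", 1),
    (3, "beta", "X", 2),
    (4, "in", "X", 2),
    (5, ".", "PUNCT", 1),
]


def child_degrees(sentence, upos):
    """Map each node tagged `upos` to its number of children."""
    n_children = Counter(head for _id, _form, _upos, head in sentence)
    return {
        tok_id: n_children[tok_id]
        for tok_id, _form, tok_upos, _head in sentence
        if tok_upos == upos
    }


def bucket(degrees):
    """Group degrees into the four classes reported on this page."""
    labels = {0: "leaf", 1: "one child", 2: "two children"}
    return Counter(labels.get(d, "three or more children") for d in degrees.values())


degrees = child_degrees(SENTENCE, "X")
print(degrees)
print(bucket(degrees), max(degrees.values()))
```

The four classes are exhaustive, which is why the counts above (701, 246, 144 and 180) add up to the 1271 X tokens.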

Children of X nodes are attached using 26 different relations: punct (337; 26% instances), nmod (206; 16% instances), flat (200; 16% instances), case (133; 10% instances), amod (97; 8% instances), conj (92; 7% instances), cc (57; 4% instances), det (30; 2% instances), nummod (30; 2% instances), appos (29; 2% instances), acl (18; 1% instances), advmod (12; 1% instances), fixed (11; 1% instances), cop (6; 0% instances), nsubj (6; 0% instances), dep (3; 0% instances), goeswith (3; 0% instances), parataxis (3; 0% instances), aux (2; 0% instances), advcl (1; 0% instances), cc:preconj (1; 0% instances), ccomp:pmod (1; 0% instances), mark (1; 0% instances), nsubj:pass (1; 0% instances), obl (1; 0% instances), xcomp (1; 0% instances)

Children of X nodes belong to 14 different parts of speech: X (444; 35% instances), PUNCT (337; 26% instances), NOUN (134; 10% instances), ADP (118; 9% instances), ADJ (70; 5% instances), CCONJ (54; 4% instances), NUM (34; 3% instances), DET (33; 3% instances), ADV (19; 1% instances), VERB (17; 1% instances), PROPN (11; 1% instances), AUX (8; 1% instances), PRON (2; 0% instances), PART (1; 0% instances)