This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.

X: other

Description

The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.

Foreign words (see Foreign) are also tagged X.

Examples


Treebank Statistics (UD_Irish)

There are 151 X lemmas (4%), 152 X types (3%) and 265 X tokens (1%). Out of 16 observed tags, X ranks 5th in number of lemmas, 7th in number of types and 14th in number of tokens.
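Counts of this kind can be reproduced from a treebank in CoNLL-U format. The following is a minimal sketch, assuming that "types" means distinct word forms and "lemmas" means distinct lemma strings; the `SAMPLE` data and the function name `tag_counts` are hypothetical and are not taken from UD_Irish or from the tooling that generated this page.

```python
# Hypothetical CoNLL-U fragment (10 tab-separated columns); not real UD_Irish data.
SAMPLE = """\
1\t(a)\t(a)\tX\t_\t_\t2\tnummod\t_\t_
2\tteach\tteach\tNOUN\t_\t_\t0\troot\t_\t_
3\tsan\tsin\tX\t_\t_\t2\tdet\t_\t_

1\t(a)\t(a)\tX\t_\t_\t2\tnummod\t_\t_
2\tdhein\tdein\tX\t_\t_\t0\troot\t_\t_
3\tdein\tdein\tX\t_\t_\t2\tconj\t_\t_
"""


def tag_counts(conllu_text, tag):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and malformed lines.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        if upos == tag:
            tokens += 1
            types.add(form)
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


print(tag_counts(SAMPLE, "X"))
```

Whether the original statistics normalise case before counting types is not stated here, so this sketch compares forms exactly as written.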

The 10 most frequent X lemmas: sin, (2), (a), (b), seo, (1), (c), (3), (4), Co.

The 10 most frequent X types: san, (2), (a), (b), so, (1), (c), (3), (4), Co.

The 10 most frequent ambiguous lemmas: sin (PRON 109, DET 106, X 16, VERB 2), (2) (X 12, NUM 1), seo (DET 114, PRON 26, X 10, VERB 4), (1) (X 9, NUM 2), (3) (X 5, NUM 1), (4) (X 5, NUM 1), a (PART 864, DET 182, X 4, ADP 1), dein (X 5, VERB 1), faoi (ADP 106, X 3, PART 1), fás (VERB 4, X 2, NOUN 1)

The 10 most frequent ambiguous types: san (ADP 45, X 16, PRON 4), (2) (X 12, NUM 1), (1) (X 9, NUM 2), (3) (X 5, NUM 1), (4) (X 5, NUM 1), a (PART 855, DET 187, X 1, ADP 1), chan (X 3, VERB 1), dhein (X 2, VERB 1), (5) (X 1, NUM 1), I (ADP 18, X 1)

Morphology

The form / lemma ratio of X is 1.006623 (the average of all parts of speech is 1.449988).
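The reported ratio is consistent with dividing the number of X types by the number of X lemmas given in the statistics above. A short check, assuming that this is how the figure was derived (the variable names are illustrative only):

```python
# Figures reported for X in UD_Irish on this page.
x_types = 152
x_lemmas = 151

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")
```

A ratio this close to 1 means that almost every X lemma appears in a single form, which fits a class made up largely of list markers and abbreviations.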

The highest number of forms (3) was observed with the lemma “dein”: dein, deineadh, dhein.

The second highest number of forms (1) was observed with the lemma “(1)”: (1).

The third highest number of forms (1) was observed with the lemma “(11)”: (11).

X occurs with 14 features: ga-feat/Dialect (49; 18% instances), ga-feat/PronType (27; 10% instances), ga-feat/Number (10; 4% instances), ga-feat/Negative (8; 3% instances), ga-feat/Person (8; 3% instances), ga-feat/Mood (7; 3% instances), ga-feat/PartType (6; 2% instances), ga-feat/Gender (5; 2% instances), ga-feat/Tense (5; 2% instances), ga-feat/Form (4; 2% instances), ga-feat/Case (2; 1% instances), ga-feat/Definite (1; 0% instances), ga-feat/VerbForm (1; 0% instances), ga-feat/Voice (1; 0% instances)

X occurs with 23 feature-value pairs: Case=Com, Definite=Def, Dialect=Connaught, Dialect=Munster, Dialect=Ulster, Form=Emph, Form=Len, Gender=Fem, Gender=Masc, Mood=Imp, Mood=Ind, Negative=Neg, Number=Sing, PartType=Vb, Person=1, Person=2, Person=3, PronType=Dem, PronType=Rel, Tense=Past, Tense=Pres, VerbForm=Cop, Voice=Auto

X occurs with 17 feature combinations. The most frequent feature combination is _, that is, no features at all (214 tokens). Examples: (2), (a), (b), (1), (c), (3), (4), Co., a, Uimh.

Relations

X nodes are attached to their parents using 21 different relations: ga-dep/nummod (81; 31% instances), ga-dep/nmod (34; 13% instances), ga-dep/compound (31; 12% instances), ga-dep/det (25; 9% instances), ga-dep/foreign (17; 6% instances), ga-dep/nsubj (10; 4% instances), ga-dep/root (10; 4% instances), ga-dep/list (9; 3% instances), ga-dep/conj (8; 3% instances), ga-dep/dobj (7; 3% instances), ga-dep/appos (6; 2% instances), ga-dep/amod (4; 2% instances), ga-dep/case (4; 2% instances), ga-dep/ccomp (4; 2% instances), ga-dep/mark:prt (4; 2% instances), ga-dep/name (4; 2% instances), ga-dep/advmod (2; 1% instances), ga-dep/xcomp:pred (2; 1% instances), ga-dep/acl:relcl (1; 0% instances), ga-dep/parataxis (1; 0% instances), ga-dep/punct (1; 0% instances)

Parents of X nodes belong to 11 different parts of speech: NOUN (139; 52% instances), VERB (55; 21% instances), X (21; 8% instances), ADJ (13; 5% instances), PROPN (10; 4% instances), ROOT (10; 4% instances), NUM (6; 2% instances), ADP (5; 2% instances), PRON (4; 2% instances), CONJ (1; 0% instances), SCONJ (1; 0% instances)

146 (55%) X nodes are leaves.

54 (20%) X nodes have one child.

36 (14%) X nodes have two children.

29 (11%) X nodes have three or more children.

The highest child degree of an X node is 8.
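The child-degree figures above can be derived by counting, for each X node, how many other nodes name it as their head. The sketch below assumes a sentence is represented as a list of `(token_id, head_id, upos)` tuples; that representation, the function name `child_degree_buckets` and the example tree are hypothetical. It also checks that the four reported counts add up to the 265 X tokens and round to the reported percentages.

```python
from collections import Counter


def child_degree_buckets(sentence, tag):
    """Bucket nodes with the given UPOS tag by their number of children.

    sentence: list of (token_id, head_id, upos) tuples for one tree.
    """
    # Number of children per node = how often its id is used as a head.
    degree = Counter(head for _, head, _ in sentence)
    buckets = Counter()
    for token_id, _, upos in sentence:
        if upos != tag:
            continue
        n = degree[token_id]
        buckets["3+" if n >= 3 else str(n)] += 1
    return buckets


# Hypothetical tree: node 2 is the root and has three children.
sentence = [(1, 2, "X"), (2, 0, "X"), (3, 2, "PUNCT"), (4, 2, "NOUN"), (5, 4, "X")]
print(child_degree_buckets(sentence, "X"))

# Consistency check on the figures reported on this page.
reported = {"0": 146, "1": 54, "2": 36, "3+": 29}
total = sum(reported.values())
percentages = {k: round(100 * v / total) for k, v in reported.items()}
print(total, percentages)
```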

Children of X nodes are attached using 25 different relations: ga-dep/punct (75; 32% instances), ga-dep/case (26; 11% instances), ga-dep/nmod (16; 7% instances), ga-dep/nummod (14; 6% instances), ga-dep/compound (13; 6% instances), ga-dep/conj (12; 5% instances), ga-dep/det (12; 5% instances), ga-dep/appos (11; 5% instances), ga-dep/cc (9; 4% instances), ga-dep/dobj (7; 3% instances), ga-dep/amod (6; 3% instances), ga-dep/foreign (5; 2% instances), ga-dep/nsubj (5; 2% instances), ga-dep/xcomp (5; 2% instances), ga-dep/name (3; 1% instances), ga-dep/acl:relcl (2; 1% instances), ga-dep/advcl (2; 1% instances), ga-dep/advmod (2; 1% instances), ga-dep/csubj:cleft (2; 1% instances), ga-dep/mark:prt (2; 1% instances), ga-dep/neg (2; 1% instances), ga-dep/nmod:prep (2; 1% instances), ga-dep/cop (1; 0% instances), ga-dep/mark (1; 0% instances), ga-dep/xcomp:pred (1; 0% instances)

Children of X nodes belong to 14 different parts of speech: PUNCT (75; 32% instances), NOUN (42; 18% instances), ADP (28; 12% instances), X (21; 9% instances), NUM (16; 7% instances), DET (12; 5% instances), CONJ (10; 4% instances), PROPN (9; 4% instances), VERB (9; 4% instances), ADJ (5; 2% instances), PART (5; 2% instances), PRON (2; 1% instances), ADV (1; 0% instances), SCONJ (1; 0% instances)


X in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]