X

This is part of archived UD v1 documentation. See http://universaldependencies.org/ for the current version.


`X`: other

Definition

The X tag is used for words that for some reason cannot be assigned a real part-of-speech category.

In the Slovenian UD Treebank, this tag is mostly used for cases of code-switching where it was not meaningful to analyze the intervening language, such as Europe of knowledge, La connaissance de soi, Bundesvereinigung der Deutschen Arbeitgeberverbände. In cases where foreign-language sequences include both foreign and loan words, only the foreign words are assigned the X tag, as in The Life of Brian, where Life and Brian are marked as NOUN and PROPN, respectively.

Other subcategories marked with X include abbreviations with dots (dr.), URLs (www.radenska.si), news author abbreviations (sta) and alphanumeric tokens (6230i).

Conversion from JOS

All tokens with the tag Residual are converted to X. All abbreviations are also converted to X.
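
The conversion rule can be sketched as a simple membership test. This is only an illustration: the category names `"Residual"` and `"Abbreviation"` and the helper `converts_to_x` are assumptions made for the example, not identifiers taken from the actual JOS conversion script.

```python
# Hypothetical JOS category names; the page only states that Residual tokens
# and abbreviations end up as X.
X_SOURCE_CATEGORIES = {"Residual", "Abbreviation"}


def converts_to_x(jos_category: str) -> bool:
    """Return True if a token of this JOS category is converted to UD X."""
    return jos_category in X_SOURCE_CATEGORIES


print(converts_to_x("Residual"))      # category named in the rule
print(converts_to_x("Abbreviation"))  # category named in the rule
print(converts_to_x("Noun"))          # any other category is handled elsewhere
```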

Treebank Statistics (UD_Slovenian)

There are 164 X lemmas (1%), 165 X types (1%) and 339 X tokens (0%). Out of 16 observed tags, the rank of X is: 7 in number of lemmas, 9 in number of types and 15 in number of tokens.
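
Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, assuming the standard 10-column CoNLL-U layout (FORM in column 2, LEMMA in column 3, the universal POS tag in column 4); the function name `count_tag` and the sample sentences are invented for illustration and are not treebank data.

```python
def count_tag(conllu_text: str, tag: str = "X"):
    """Return (distinct lemmas, distinct types, tokens) for one POS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comment lines
        cols = line.split("\t")
        # Keep only ordinary word lines: 10 columns and a plain integer ID
        # (this skips multiword-token ranges such as "1-2").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == tag:
            tokens += 1
            types.add(cols[1])   # word forms are compared case-sensitively
            lemmas.add(cols[2])
    return len(lemmas), len(types), tokens


# Invented sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# invented sample",
    "1\tdr.\tdr.\tX\t_\t_\t2\tnmod\t_\t_",
    "2\tNovak\tNovak\tPROPN\t_\t_\t0\troot\t_\t_",
    "",
    "1\tEuropean\teuropean\tX\t_\t_\t0\troot\t_\t_",
    "2\tEUROPEAN\teuropean\tX\t_\t_\t1\tforeign\t_\t_",
])

print(count_tag(SAMPLE))
```

In the sample, European and EUROPEAN count as two types of a single lemma, which mirrors how the statistics on this page distinguish types from lemmas.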

The 10 most frequent X lemmas: dr., t., d., sv., P., i., of, oz., the, M.

The 10 most frequent X types: dr., t., d., sv., P., i., of, oz., the, M.

The 10 most frequent ambiguous lemmas: V. (X 3, NUM 1), da (SCONJ 1772, PART 8, X 2), les (NOUN 9, X 2), a (CONJ 96, ADV 2, X 1), do (ADP 353, X 1), in (CONJ 3242, ADV 5, X 1), life (X 1, NOUN 1), on (PRON 1561, X 1), pa (CONJ 957, X 1)

The 10 most frequent ambiguous types: de (X 5, VERB 1), sta (AUX 165, VERB 36, X 4), V. (X 3, NUM 1), Les (PROPN 2, X 2), da (SCONJ 1726, VERB 9, X 2, PART 1), mu (PRON 158, X 2), A (CONJ 31, NOUN 7, ADV 1, X 1), Art (PROPN 1, X 1), Life (NOUN 1, X 1), National (PROPN 2, X 1)

de
- X 5: Madame de La Motte je brez razmisleka pritrdila .
- VERB 1: Nič ne de , vsi jemo prigrizke .
sta
- AUX 165: Spominjam se dveh sosedov , ki sta bila med seboj skregana .
- VERB 36: Zakonodaja in trg delovne sile sta med seboj tesno povezana .
- X 4: ( sta )
V.
- X 3: M. V.
- NUM 1: Nikolaj V. je , ko je nastopil papeško službo , plačal več kot 1000 funtov pesniku Francescu Filelfu , da je napisal knjigo zgodb , ki so jih opisali kot “ najbolj nagnusna dela , ki jih je kdaj ustvarila surova kljubovalnost in umazana domišljija “ .
Les
- PROPN 2: Justin pripelje Janet v zanemarjen stanovanjski blok , kjer živi Les , ki izdeluje maske .
- X 2: Njegov prvi uspeh je bil roman Les Chouans ( 1829 ) , ki mu je sledilo še več kot 90 romanov in kratkih zgodb .
da
- SCONJ 1726: Škoda je , da slovenski uporabniki iščejo informacije na tujih straneh .
- VERB 9: Iz pogodbe se ne da razbrati načina plačila .
- X 2: 1489 Vasco da Gama , ki izpluje iz Portugalske , se na poti v Indijo ustavi v vzhodnoafriškem pristanišču Mombasa .
- PART 1: To je ideja , da !
mu
- PRON 158: » Potem pa le pazi name , « se mu je v odgovor nasmehnila Regina .
- X 2: ( mu )
A
- CONJ 31: A to ni le zavodska krinka , pod katero bi se skrivalo vse kaj drugega .
- NOUN 7: V obliki črke A .
- ADV 1: A veste , koliko žensk takole zapusti svoje partnerje in se potem vrnejo … ?
- X 1: MEMOIRS OF A WOMAN TRAVELLING ALONE .
Art
- PROPN 1: Janez Pipan , ravnatelj ljubljanske Drame , mi je zaupal režijo gledališke uspešnice Art , ki jo je napisala naturalizirana Francozinja Yasmina Reza .
- X 1: Danes je zvrst in vse njene izpeljave bolj znana pod izrazom performans ( Performance Art ) .
Life
- NOUN 1: Harrisonov prijatelj Michael Palin iz ekipe Leteči cirkus Montyja Pytona mu je nekoč namreč potožil , da mu je zmanjkalo denarja za film The Life of Brian .
- X 1: Louise Hay v svoji uspešnici You Can Heal Your Life ( Svoje življenje lahko ozdravite sami ) piše :
National
- PROPN 2: Ljubiteljev konjeniškega športa je namreč v Angliji še vedno veliko in za neposredni prenos dirke Grand National jih niso želeli prikrajšati .
- X 1: Volilna taktika Pauline Howard utegne njegovo koalicijo , sestavljeno iz torijcev ( Liberal Party ) in kmetovalcev ( National Party ) veljati precej marginalnih sedežev na podeželju .
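
A list of ambiguous types like the one above can be derived by grouping tokens by word form and keeping the forms that occur with X and at least one other tag. The sketch below assumes the input is a sequence of `(form, tag)` pairs; the function name `ambiguous_types` is hypothetical, the counts for de follow the figures above, and the count for dr. is invented filler.

```python
from collections import Counter, defaultdict


def ambiguous_types(tagged_tokens, tag="X"):
    """Map each form seen with `tag` and another tag to its tag counts."""
    by_form = defaultdict(Counter)
    for form, upos in tagged_tokens:
        by_form[form][upos] += 1
    return {
        form: counts.most_common()
        for form, counts in by_form.items()
        if tag in counts and len(counts) > 1
    }


# "de" follows the counts listed above (X 5, VERB 1); "dr." uses an invented count.
tokens = [("de", "X")] * 5 + [("de", "VERB")] + [("dr.", "X")] * 3

print(ambiguous_types(tokens))
```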

Morphology

The form / lemma ratio of X is 1.006098 (the average of all parts of speech is 1.894262).
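
The reported ratio is consistent with dividing the number of X types (165) by the number of X lemmas (164) given in the treebank statistics above. That reading is an inference from the numbers rather than something the page states, and the helper name `form_lemma_ratio` is made up for this check.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# 165 X types and 164 X lemmas, as reported for UD_Slovenian.
ratio = form_lemma_ratio(165, 164)
print(f"{ratio:.6f}")
```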

The 1st highest number of forms (2) was observed with the lemma “european”: EUROPEAN, European.

The 2nd highest number of forms (1) was observed with the lemma “18f”: 18F.

The 3rd highest number of forms (1) was observed with the lemma “A.”: A..

X occurs with 1 feature: Foreign (110; 32% instances)

X occurs with 1 feature-value pair: Foreign=Foreign

X occurs with 2 feature combinations. The most frequent feature combination is _ (229 tokens). Examples: dr., t., d., sv., P., i., oz., M., j., o.

Relations

X nodes are attached to their parents using 13 different relations: nmod (142; 42% instances), root (58; 17% instances), foreign (57; 17% instances), mwe (19; 6% instances), nsubj (11; 3% instances), appos (9; 3% instances), name (9; 3% instances), advmod (7; 2% instances), dobj (7; 2% instances), amod (6; 2% instances), aux (6; 2% instances), cc (4; 1% instances), conj (4; 1% instances)
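
The percentages in the list above can be checked against the raw counts. The sketch below copies the counts from the list and assumes the percentages are rounded to the nearest integer; the variable names are chosen for the example.

```python
# Counts of incoming relations for X nodes, copied from the list above.
deprel_counts = {
    "nmod": 142, "root": 58, "foreign": 57, "mwe": 19, "nsubj": 11,
    "appos": 9, "name": 9, "advmod": 7, "dobj": 7, "amod": 6,
    "aux": 6, "cc": 4, "conj": 4,
}

total = sum(deprel_counts.values())  # should match the 339 X tokens
percentages = {rel: round(100 * n / total) for rel, n in deprel_counts.items()}

print(total)
print(percentages)
```

The counts sum to 339, the number of X tokens reported for UD_Slovenian, so every X token has exactly one incoming relation in this tally.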

Parents of X nodes belong to 11 different parts of speech: X (109; 32% instances), NOUN (95; 28% instances), ROOT (58; 17% instances), PROPN (40; 12% instances), VERB (28; 8% instances), ADJ (4; 1% instances), ADV (1; 0% instances), NUM (1; 0% instances), PRON (1; 0% instances), PUNCT (1; 0% instances), SCONJ (1; 0% instances)

213 (63%) X nodes are leaves.

46 (14%) X nodes have one child.

53 (16%) X nodes have two children.

27 (8%) X nodes have three or more children.

The highest child degree of an X node is 8.
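
The child-degree figures above can be computed from the HEAD column of a dependency tree. This is a minimal sketch, assuming each sentence is given as a list of `(tag, head)` pairs with 1-based head indices and 0 for the root; the function names and the sample tree are invented for illustration.

```python
from collections import Counter


def bucket(n_children: int) -> str:
    """Name the child-degree bucket used in the statistics above."""
    if n_children == 0:
        return "leaf"
    if n_children == 1:
        return "one"
    if n_children == 2:
        return "two"
    return "three_or_more"


def child_degree_buckets(sentences, tag="X"):
    """Count nodes with the given tag by how many children they have."""
    buckets = Counter()
    for sent in sentences:
        # Number of children per node index, derived from the head pointers.
        degree = Counter(head for _upos, head in sent if head > 0)
        for idx, (upos, _head) in enumerate(sent, start=1):
            if upos == tag:
                buckets[bucket(degree[idx])] += 1
    return buckets


# Invented tree: node 1 is the root and governs nodes 2, 3 and 4.
SAMPLE = [[("X", 0), ("X", 1), ("X", 1), ("PUNCT", 1)]]
result = child_degree_buckets(SAMPLE)
print(result)

# Sanity check on the reported figures: the four buckets cover all 339 X tokens.
reported = {"leaf": 213, "one": 46, "two": 53, "three_or_more": 27}
print(sum(reported.values()))
```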

Children of X nodes are attached using 12 different relations: punct (114; 44% instances), foreign (57; 22% instances), nmod (36; 14% instances), mwe (18; 7% instances), amod (9; 3% instances), case (8; 3% instances), name (7; 3% instances), conj (4; 2% instances), appos (3; 1% instances), cc (3; 1% instances), acl (2; 1% instances), nummod (1; 0% instances)

Children of X nodes belong to 11 different parts of speech: PUNCT (114; 44% instances), X (109; 42% instances), ADJ (9; 3% instances), PROPN (9; 3% instances), ADP (6; 2% instances), NOUN (6; 2% instances), CONJ (3; 1% instances), SCONJ (2; 1% instances), VERB (2; 1% instances), NUM (1; 0% instances), PRON (1; 0% instances)

Treebank Statistics (UD_Slovenian-SST)

There are 82 X lemmas (2%), 226 X types (4%) and 1673 X tokens (6%). Out of 16 observed tags, the rank of X is: 6 in number of lemmas, 6 in number of types and 7 in number of tokens.

The 10 most frequent X lemmas: [gap], _, [pause], [speaker:laughter], [audience:laughter], [:voice], [all:laughter], [incident], green, of

The 10 most frequent X types: [gap], [pause], [speaker:laughter], [audience:laughter], [:voice], s, [all:laughter], [incident], j, n

The 10 most frequent ambiguous lemmas: ka (SCONJ 22, X 3), on (PRON 309, X 2), a (ADV 137, INTJ 16, NOUN 6, CONJ 3, X 1), da (SCONJ 533, PART 16, X 1), imam (X 1, NOUN 1), in (CONJ 414, ADV 1, X 1), kaj (PRON 196, ADV 43, X 1), la (ADV 1, X 1), minus (NOUN 6, X 1), od (ADP 83, X 1)

The 10 most frequent ambiguous types: s (ADP 80, X 34, NOUN 1), m (X 16, INTJ 5, NOUN 2), p (X 16, NOUN 1), z (ADP 117, X 16), k (X 10, ADP 8, SCONJ 2, NOUN 1), v (ADP 478, X 10, NOUN 1), d (X 9, NOUN 1), ka (SCONJ 22, X 8), po (ADP 78, X 8), u (X 8, INTJ 1)

s
- ADP 80: smo prišli popoldne s šole ob dveh ob treh smo gnali pasti
- X 34: kdaj si s [gap]
- NOUN 1: eee hkrati se bo pa naredila z eee projekt z izvedbo s in bo z [gap] eee zajemalo notri eee detajle kako narediti ograjo kako narediti tlake med sabo da bojo višinsko enaki
m
- X 16: [all:laughter] saj zdaj si pa kot m [gap] ne vem beavis and butthead
- INTJ 5: na računalnik čakat ja m karte naročit do dvanajstih ne
- NOUN 2: m m ne avto [gap] saj avtomob [gap] [:voice] menda je šel do vsakega avta da ga je požegnal
p
- X 16: p [gap] pozna še to ja
- NOUN 1: p čez trikotnik naj gre ja
z
- ADP 117: eem z električnimi motorji ne
- X 16: ena bo še pa z [gap] pa [name:personal] zrihtala ne eno zvezo
k
- X 10: ne ta ta je taka eem k [gap] jaz temu pravim življenjska
- ADP 8: na faksu se bom tam k enemu radiatorju stisnil pa bo
- SCONJ 2: jo pa ej sploh ne vem a so se pogledal oni mislim na teh slikah k
- NOUN 1: eee ti in jaz čisto v dobri veri dajta mi malo za piti a veš pa me zebe sem šla se obleči in kostanj lepo lupimo tam k eee [name:personal] ga pa peče
v
- ADP 478: pol pa v oklepaju piše le za ta to pa to vizo lahko trideset dni ostaneš
- X 10: zdaj eee v [gap] kaj več pa je težko reči v tem trenutku ne
- NOUN 1: brez drugega dramaturga tako da pozorni še posebej bodite na [name:personal] ker bo drugi del v d dramaturga v smislu da te stvari kar bomo eee poskušali v likih ustvarjati bomo t [gap] skoz govor in bo [name:personal] mogla malo večkrat posr [gap] pomagati na tej poti da to dejansko dobimo
d
- X 9: [all:laughter] tako da bodite previdni ker d [gap] eee
- NOUN 1: brez drugega dramaturga tako da pozorni še posebej bodite na [name:personal] ker bo drugi del v d dramaturga v smislu da te stvari kar bomo eee poskušali v likih ustvarjati bomo t [gap] skoz govor in bo [name:personal] mogla malo večkrat posr [gap] pomagati na tej poti da to dejansko dobimo
ka
- SCONJ 22: ka so stopnjo više kakor ti [gap]
- X 8: ta človek se tako pisal ka to fi ?
po
- ADP 78: pa ti po tistih terenih smučaš vsi ostali pa ne rabijo zraven tebe biti
- X 8: tudi to je po [gap] kar pogosta rastlina na naših travnikih ne
u
- X 8: ja zdajle imate u [gap] a je tu kolektiven ali kako je zdaj to
- INTJ 1: u zanimivo

Morphology

The form / lemma ratio of X is 2.756098 (the average of all parts of speech is 1.575031).

The 1st highest number of forms (152) was observed with the lemma “_”: Bel, Franc, Oma, Slove, [all:laughter], a, am, an, anal, avto, avtomob, b, ba, ce, ci, d, dej, dela, des, deves, devetinos, di, do, dovoli, dovre, e, fe, fizi, g, gos, gospo, gre, grn, hotl, i, ins, ist, istrija, j, jabol, k, ka, km, knji, kolo, kom, kompliciraš, ku, kur, l, le, lu, m, ma, mar, mat, mest, mi, mid, mis, mišani, moš, n, na, naj, napi, nar, naslednj, nek, ni, nih, nikak, nje, njegov, o, od, ojo, om, on, op, opaz, orož, ose, p, pet, petnš, po, pok, ponava, pos, posr, pr, pre, pred, prelis, preve, prevo, pri, psi, r, ra, raz, razu, raču, re, rec, s, sa, se, si, sko, sla, slovarj, so, spa, spe, spla, sprašva, st, t, tist, to, trans, u, usta, uze, v, va, ver, vs, vslak, vzmeti, z, zaba, zac, zag, zar, zdaj, zl, zmišlajo, zob, zve, č, čak, š, še, špi, šte, štir, ž, žens, žul.

The 2nd highest number of forms (1) was observed with the lemma “Bewegung”: bewegung.

The 3rd highest number of forms (1) was observed with the lemma “Mission”: mission.

X occurs with 1 feature: Foreign (96; 6% instances)

X occurs with 1 feature-value pair: Foreign=Foreign

X occurs with 2 feature combinations. The most frequent feature combination is _ (1577 tokens). Examples: [gap], [pause], [speaker:laughter], [audience:laughter], [:voice], s, [all:laughter], [incident], j, n

Relations

X nodes are attached to their parents using 24 different relations: punct (920; 55% instances), reparandum (317; 19% instances), root (307; 18% instances), foreign (61; 4% instances), nmod (15; 1% instances), conj (9; 1% instances), parataxis (6; 0% instances), advcl (4; 0% instances), advmod (4; 0% instances), dobj (4; 0% instances), goeswith (4; 0% instances), aux (3; 0% instances), mwe (3; 0% instances), nsubj (3; 0% instances), vocative (3; 0% instances), cc (2; 0% instances), acl (1; 0% instances), amod (1; 0% instances), appos (1; 0% instances), case (1; 0% instances), ccomp (1; 0% instances), cop (1; 0% instances), dislocated (1; 0% instances), expl (1; 0% instances)

Parents of X nodes belong to 16 different parts of speech: VERB (628; 38% instances), ROOT (307; 18% instances), NOUN (187; 11% instances), X (115; 7% instances), ADJ (104; 6% instances), ADV (91; 5% instances), PART (63; 4% instances), PRON (61; 4% instances), PROPN (35; 2% instances), INTJ (21; 1% instances), CONJ (16; 1% instances), SCONJ (11; 1% instances), NUM (10; 1% instances), ADP (9; 1% instances), DET (9; 1% instances), AUX (6; 0% instances)

1550 (93%) X nodes are leaves.

83 (5%) X nodes have one child.

11 (1%) X nodes have two children.

29 (2%) X nodes have three or more children.

The highest child degree of an X node is 9.

Children of X nodes are attached using 27 different relations: foreign (56; 22% instances), punct (37; 15% instances), reparandum (27; 11% instances), case (19; 7% instances), advmod (15; 6% instances), nsubj (13; 5% instances), mark (12; 5% instances), aux (10; 4% instances), discourse (10; 4% instances), cop (9; 4% instances), dobj (7; 3% instances), parataxis (5; 2% instances), det (4; 2% instances), discourse:filler (4; 2% instances), cc (3; 1% instances), expl (3; 1% instances), goeswith (3; 1% instances), mwe (3; 1% instances), neg (3; 1% instances), nmod (3; 1% instances), amod (2; 1% instances), parataxis:restart (2; 1% instances), acl (1; 0% instances), appos (1; 0% instances), conj (1; 0% instances), iobj (1; 0% instances), parataxis:discourse (1; 0% instances)

Children of X nodes belong to 15 different parts of speech: X (115; 45% instances), PRON (23; 9% instances), ADP (20; 8% instances), VERB (17; 7% instances), PART (16; 6% instances), SCONJ (12; 5% instances), AUX (10; 4% instances), ADV (9; 4% instances), NOUN (8; 3% instances), CONJ (7; 3% instances), PUNCT (6; 2% instances), INTJ (5; 2% instances), DET (4; 2% instances), ADJ (2; 1% instances), NUM (1; 0% instances)

