home edit page issue tracker

This page pertains to UD version 2.

goeswith: goes with

This relation is used to link characters that have been mistakenly split during tokenization.

This relation will be removed in the final delivery of the language data. The words connected by goeswith will simply be fused together. For parser training purposes, however, we keep the relation so that the parser can learn to correct mistakenly separated tokens. All separated units should be given the same POS tag as one would give the combined unit, despite what the individual units may be tagged as had they been treated as individual words.

# visual-style 1 2 goeswith	color:blue
# visual-style 1 3 goeswith	color:blue
# visual-style 1	bgColor:blue
# visual-style 1	fgColor:white
# visual-style 2	bgColor:blue
# visual-style 2	fgColor:white
# visual-style 3	bgColor:blue
# visual-style 3	fgColor:white
1	不	_	INTJ	_	_	0	root	_	_
2	好	_	INTJ	_	_	1	goeswith	_	_
3	意思	_	INTJ	_	_	1	goeswith	_	_

1	"sorry"	_	_	_	_	0	_	_	_

Note that the particular class of idiomatic expressions called 成語 chéngyǔ as well as quotations from Classical Chinese texts, may not always be recognized by automatic segmentation. We use the goeswith relation for such cases as well. Because these expressions are fossilized from earlier stages of the Chinese language, their syntax and composition often do not have a modern analysis, and therefore should not be tokenized apart.

# visual-style 1 2 goeswith	color:blue
# visual-style 1 3 goeswith	color:blue
# visual-style 1	bgColor:blue
# visual-style 1	fgColor:white
# visual-style 2	bgColor:blue
# visual-style 2	fgColor:white
# visual-style 3	bgColor:blue
# visual-style 3	fgColor:white
1	百	_	NOUN	_	_	0	root	_	_
2	無	_	NOUN	_	_	1	goeswith	_	_
3	禁忌	_	NOUN	_	_	1	goeswith	_	_

1	"Nothing	_	_	_	_	0	_	_	_
2	is	_	_	_	_	0	_	_	_
3	taboo."	_	_	_	_	0	_	_	_


goeswith in other languages: [cs] [el] [en] [fi] [fr] [hy] [it] [ka] [pt] [ru] [sl] [sv] [tr] [u] [yue] [zh]