Generating Text by Splicing Together Nearest Neighbors

2021-01-20 18:43:11

Sam Wiseman, Arturs Backurs, Karl Stratos

arXiv_CL

arXiv_CL Text_Generation Pose


Abstract

We propose to tackle conditional text generation tasks, especially those which require generating formulaic text, by splicing together segments of text from retrieved "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors in an encoder-decoder setting but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text (i.e., by inserting or replacing them) to form an output. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way allow for interpretable table-to-text and headline generation that is competitive with or better than state-of-the-art autoregressive token-level policies in terms of automatic metrics, and moreover allow for faster decoding.
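
To make the insert-or-replace idea concrete, here is a minimal Python sketch of how segment-level splicing could build an output from neighbor text. It is an illustration under stated assumptions, not the authors' implementation: the `Splice` record, `apply_splice`, `generate`, and the toy sentences are all hypothetical, and the learned policy and the grammar-based oracle derivation described in the abstract are omitted (the operations are simply given by hand).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Splice:
    """Hypothetical edit: copy neighbors[neighbor][src_start:src_end]
    over canvas[dst_start:dst_end]. When dst_start == dst_end the
    segment is inserted; otherwise it replaces the canvas span."""
    neighbor: int
    src_start: int
    src_end: int
    dst_start: int
    dst_end: int


def apply_splice(canvas: List[str], neighbors: List[List[str]], op: Splice) -> List[str]:
    # Take a contiguous segment of tokens from the chosen neighbor.
    segment = neighbors[op.neighbor][op.src_start:op.src_end]
    # Splice it into the canvas, dropping the replaced span (if any).
    return canvas[:op.dst_start] + segment + canvas[op.dst_end:]


def generate(canvas: List[str], neighbors: List[List[str]], ops: List[Splice]) -> List[str]:
    # Apply the operations in order; a learned policy would choose them instead.
    for op in ops:
        canvas = apply_splice(canvas, neighbors, op)
    return canvas


# Toy neighbors (assumed data, not from the paper).
neighbors = [
    "the team won the game on friday".split(),
    "the hawks beat the bulls".split(),
]

# Start from the first neighbor, then replace two spans with text from the second.
ops = [
    Splice(neighbor=1, src_start=1, src_end=2, dst_start=1, dst_end=2),  # team -> hawks
    Splice(neighbor=1, src_start=2, src_end=5, dst_start=2, dst_end=5),  # won the game -> beat the bulls
]
result = " ".join(generate(list(neighbors[0]), neighbors, ops))
```

Because each step moves a whole segment rather than a single token, an output can be assembled in far fewer steps than its token count, which is consistent with the faster decoding the abstract reports.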

URL

https://arxiv.org/abs/2101.08248

PDF

https://arxiv.org/pdf/2101.08248.pdf
