Abstract
This paper addresses the problem of stylized text generation in a multilingual setup. A variant of a language model based on a long short-term memory (LSTM) artificial neural network with extended phonetic and semantic embeddings is used for stylized poetry generation. Phonetics is shown to be comparably important to information on the target author for this task. The quality of the generated poems is estimated through bilingual evaluation understudy (BLEU), a survey, and a new cross-entropy-based metric proposed for problems of this type. The experiments show that the proposed model consistently outperforms random-sample and vanilla LSTM baselines, and that humans tend to attribute machine-generated texts to the target author.
URL
https://arxiv.org/abs/1807.07147