Paper Reading AI Learner

Pun Generation with Surprise

2019-04-15 03:40:16
He He, Nanyun Peng, Percy Liang

Abstract

We tackle the problem of generating a pun sentence given a pair of homophones (e.g., "died" and "dyed"). Supervised text generation is inappropriate due to the lack of a large corpus of puns, and even if such a corpus existed, mimicry is at odds with generating novel content. In this paper, we propose an unsupervised approach to pun generation using a corpus of unhumorous text and what we call the local-global surprisal principle: we posit that in a pun sentence, there is a strong association between the pun word (e.g., "dyed") and the distant context, as well as a strong association between the alternative word (e.g., "died") and the immediate context. This contrast creates surprise and thus humor. We instantiate this principle for pun generation in two ways: (i) as a measure based on the ratio of probabilities under a language model, and (ii) a retrieve-and-edit approach based on words suggested by a skip-gram model. Human evaluation shows that our retrieve-and-edit approach generates puns successfully 31% of the time, tripling the success rate of a neural generation baseline.
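The first instantiation described above is a measure based on the ratio of probabilities under a language model. The sketch below is a minimal illustration of that idea rather than the paper's exact scoring function: it contrasts how strongly a language model predicts the pun word versus the alternative word given the immediate context and given the distant context. The function `surprisal_contrast` and the `log_prob` scorer are hypothetical placeholders, not names from the paper.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical scorer: returns log p(word | context) under any conditional
# language model (an n-gram model, GPT-2, etc.). Not part of the paper's code.
LogProbFn = Callable[[str, Sequence[str]], float]

def surprisal_contrast(
    tokens: Sequence[str],      # candidate pun sentence, tokenized
    pun_idx: int,               # position of the pun word in `tokens`
    pun_word: str,              # e.g. "dyed"
    alt_word: str,              # e.g. "died"
    log_prob: LogProbFn,
    window: int = 2,            # size of the "immediate" context
) -> Tuple[float, float]:
    """Illustrative local-global contrast, expressed as log-probability ratios.

    local_contrast:  how much better the alternative word fits the immediate
                     context than the pun word does (e.g. "died" right after
                     "I feel like I've ...").
    global_contrast: how much better the pun word fits the distant context
                     than the alternative does (e.g. "dyed" given an earlier
                     mention of "food coloring").
    Large values on both suggest the kind of contrast the paper associates
    with surprise, and hence with a successful pun.
    """
    immediate = list(tokens[max(0, pun_idx - window):pun_idx])
    distant = list(tokens[:max(0, pun_idx - window)])

    # log[ p(alt | immediate) / p(pun | immediate) ]
    local_contrast = log_prob(alt_word, immediate) - log_prob(pun_word, immediate)
    # log[ p(pun | distant) / p(alt | distant) ]
    global_contrast = log_prob(pun_word, distant) - log_prob(alt_word, distant)
    return local_contrast, global_contrast

# Usage, with any real LM scorer plugged in for `log_prob`:
# tokens = "i swallowed some food coloring and i feel like i've dyed a little inside".split()
# local, global_ = surprisal_contrast(tokens, tokens.index("dyed"), "dyed", "died", my_lm_scorer)
```

The second instantiation, retrieve-and-edit, is a separate pipeline (retrieving a sentence containing the alternative word and editing its context using replacement words suggested by a skip-gram model) and is not sketched here.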

URL

https://arxiv.org/abs/1904.06828

PDF

https://arxiv.org/pdf/1904.06828.pdf

