ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation

2021-06-10 17:59:52

Wanrong Zhu, Xin Eric Wang, An Yan, Miguel Eckstein, William Yang Wang

arXiv_AI

arXiv_AI Embedding Relation Text_Generation Pose Embodied

Abstract

Automatic evaluation of natural language generation (NLG) conventionally relies on token-level or embedding-level comparisons with text references. This differs from human language processing, in which visual imagination often improves comprehension. In this work, we propose ImaginE, an imagination-based automatic evaluation metric for natural language generation. With the help of CLIP and DALL-E, two cross-modal models pre-trained on large-scale image-text pairs, we automatically generate an image as the embodied imagination of a text snippet and compute the imagination similarity using contextual embeddings. Experiments spanning several text generation tasks show that adding imagination with ImaginE has great potential for introducing multi-modal information into NLG evaluation, and that it improves existing automatic metrics' correlations with human similarity judgments in many circumstances.
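
The similarity step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity function, so cosine similarity is assumed here, and the short hand-written vectors are placeholders for the embeddings that an encoder such as CLIP would produce for the images generated from the reference and candidate texts. The function names `cosine_similarity` and `imagination_similarity` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def imagination_similarity(
    reference_embedding: Sequence[float],
    candidate_embedding: Sequence[float],
) -> float:
    """Assumed scoring rule: cosine similarity between the embedding of the
    image imagined for the reference text and the embedding of the image
    imagined for the candidate text."""
    return cosine_similarity(reference_embedding, candidate_embedding)


# Placeholder embeddings (made-up numbers, not real model outputs).
reference = [0.2, 0.7, 0.1]
close_candidate = [0.25, 0.65, 0.1]
distant_candidate = [0.9, 0.05, 0.05]

close_score = imagination_similarity(reference, close_candidate)
distant_score = imagination_similarity(reference, distant_candidate)
print(close_score > distant_score)
```

In a real pipeline the placeholder vectors would be replaced by the outputs of the image generation and encoding models, and the resulting score would be reported alongside an existing text-based metric; how the two are combined is not stated in the abstract.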

URL

https://arxiv.org/abs/2106.05970

PDF

https://arxiv.org/pdf/2106.05970.pdf