Counterfactual VQA: A Cause-Effect Look at Language Bias

2020-06-08 01:49:27

Yulei Niu, Kaihua Tang, Hanwang Zhang, Zhiwu Lu, Xian-Sheng Hua, Ji-Rong Wen

arXiv_CV

arXiv_CV VQA QA Inference Knowledge Pose

Abstract

Visual Question Answering (VQA) models tend to rely on language bias and thus fail to learn to reason from visual knowledge, which is the original intention of VQA. In this paper, we propose a novel cause-effect look at language bias, in which the bias is formulated as the direct effect of the question on the answer from the viewpoint of causal inference. This effect can be captured by counterfactual VQA, an imagined scenario in which the image had not existed. Our proposed cause-effect look 1) is general to any baseline VQA architecture, 2) achieves significant improvement on the language-bias-sensitive VQA-CP dataset, and 3) fills the theoretical gap in recent language-prior-based works.
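
The abstract gives no formula, so the following is only a minimal sketch of one way to read the idea, not the authors' implementation. It assumes a hypothetical model that exposes two sets of answer logits: one from the full image-plus-question input, and one from a question-only branch that stands in for the counterfactual "no image" scenario. Subtracting the second from the first removes the part of the prediction attributable to the question alone. The function names, answer vocabulary, and logit values are all illustrative assumptions.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def debiased_scores(fused_logits, question_only_logits):
    """Subtract the question-only prediction from the full prediction.

    Assumption: question_only_logits approximates what the model would
    answer had the image not existed (the direct effect of the question).
    """
    full = log_softmax(fused_logits)
    q_only = log_softmax(question_only_logits)
    return [f - q for f, q in zip(full, q_only)]


# Hypothetical example: "What color is the banana?" with a green banana.
answers = ["yellow", "green", "white"]
fused_logits = [2.0, 1.8, 0.1]          # image + question (made-up values)
question_only_logits = [2.5, 0.2, 0.1]  # question alone (made-up values)

# Answer chosen from the biased, full prediction.
biased_pick = answers[max(range(len(answers)), key=lambda i: fused_logits[i])]

# Answer chosen after removing the question-only contribution.
scores = debiased_scores(fused_logits, question_only_logits)
debiased_pick = answers[max(range(len(answers)), key=lambda i: scores[i])]

print(biased_pick, debiased_pick)
```

In this toy setting the language prior pushes the full prediction toward "yellow", while the subtracted scores favor "green", the answer supported by the image. How the two branches are trained and fused in the paper itself is not stated in the abstract.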

URL

https://arxiv.org/abs/2006.04315

PDF

https://arxiv.org/pdf/2006.04315.pdf