Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2

Abstract
Abstract (translated)
URL
PDF

Abstract

The field of image synthesis has made great strides in the last couple of years. Recent models are capable of generating images with astonishing quality. Fine-grained evaluation of these models on some interesting categories such as faces is still missing. Here, we conduct a quantitative comparison of three popular systems including Stable Diffusion, Midjourney, and DALL-E 2 in their ability to generate photorealistic faces in the wild. We find that Stable Diffusion generates better faces than the other systems, according to the FID score. We also introduce a dataset of generated faces in the wild dubbed GFW, including a total of 15,076 faces. Furthermore, we hope that our study spurs follow-up research in assessing the generative models and improving them. Data and code are available at data and code, respectively.

Abstract (translated)

URL

https://arxiv.org/abs/2210.00586

PDF

https://arxiv.org/pdf/2210.00586.pdf