Abstract
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems. Recently, the emergence of the Sora model has attracted significant attention due to its remarkable simulation capabilities, which exhibit an incipient comprehension of physical laws. In this survey, we embark on a comprehensive exploration of the latest advancements in world models. Our analysis navigates through the forefront of generative methodologies in video generation, where world models stand as pivotal constructs facilitating the synthesis of highly realistic visual content. Additionally, we scrutinize the burgeoning field of autonomous-driving world models, meticulously delineating their indispensable role in reshaping transportation and urban mobility. Furthermore, we delve into the intricacies inherent in world models deployed within autonomous agents, shedding light on their profound significance in enabling intelligent interactions within dynamic environmental contexts. Finally, we examine the challenges and limitations of world models and discuss their potential future directions. We hope this survey can serve as a foundational reference for the research community and inspire continued innovation. This survey will be regularly updated at: this https URL.
URL
https://arxiv.org/abs/2405.03520