Paper Reading AI Learner

PADL: Language-Directed Physics-Based Character Control

2023-01-31 18:59:22
Jordan Juravsky, Yunrong Guo, Sanja Fidler, Xue Bin Peng

Abstract

Developing systems that can synthesize natural and life-like motions for simulated characters has long been a focus for computer animation. But in order for these systems to be useful for downstream applications, they must not only produce high-quality motions, but also provide an accessible and versatile interface through which users can direct a character's behaviors. Natural language provides a simple-to-use and expressive medium for specifying a user's intent. Recent breakthroughs in natural language processing (NLP) have demonstrated effective use of language-based interfaces for applications such as image generation and program synthesis. In this work, we present PADL, which leverages recent innovations in NLP in order to take steps towards developing language-directed controllers for physics-based character animation. PADL allows users to issue natural language commands for specifying both high-level tasks and low-level skills that a character should perform. We present an adversarial imitation learning approach for training policies to map high-level language commands to low-level controls that enable a character to perform the desired task and skill specified by a user's commands. Furthermore, we propose a multi-task aggregation method that leverages a language-based multiple-choice question-answering approach to determine high-level task objectives from language commands. We show that our framework can be applied to effectively direct a simulated humanoid character to perform a diverse array of complex motor skills.
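
A brief, hedged illustration may help make the described pipeline concrete. The sketch below is not the authors' implementation: the text encoder (sentence-transformers "all-MiniLM-L6-v2"), network sizes, observation/action dimensions, and the cosine-similarity task selection are assumptions made purely for illustration. PADL conditions a physics-based control policy on an embedding of the user's command, trains it with an adversarial imitation objective against reference motion data, and resolves the high-level task via multiple-choice question answering; the code only mirrors that structure at a toy level.

import torch
import torch.nn as nn
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer  # assumed pretrained text encoder

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dim sentence embeddings

class LanguageConditionedPolicy(nn.Module):
    """Maps (character observation, command embedding) -> low-level action."""
    def __init__(self, obs_dim, lang_dim, act_dim, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + lang_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, lang):
        return self.net(torch.cat([obs, lang], dim=-1))

class SkillDiscriminator(nn.Module):
    """Simplified stand-in for the adversarial imitation objective: scores whether
    a state transition resembles reference motion data for the commanded skill."""
    def __init__(self, obs_dim, lang_dim, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim + lang_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, next_obs, lang):
        return self.net(torch.cat([obs, next_obs, lang], dim=-1))

def select_task(command, task_descriptions):
    """Toy task aggregation: pick the task description most similar to the command.
    (The paper instead frames this step as multiple-choice question answering.)"""
    cmd = encoder.encode(command, convert_to_tensor=True)
    tasks = encoder.encode(task_descriptions, convert_to_tensor=True)
    return int(F.cosine_similarity(cmd.unsqueeze(0), tasks).argmax())

# Example usage with placeholder dimensions.
policy = LanguageConditionedPolicy(obs_dim=105, lang_dim=384, act_dim=28)
command = "walk to the red block and wave"
lang = torch.as_tensor(encoder.encode(command)).unsqueeze(0)
obs = torch.zeros(1, 105)
action = policy(obs, lang)
task_id = select_task(command, ["move to a target location", "strike an object", "stand idle"])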

Abstract (translated)

Developing systems that can synthesize natural and lifelike motions for simulated characters has long been a focus of computer animation. However, for these systems to be useful in downstream applications, they must not only produce high-quality motions but also provide an easy-to-use and expressive interface through which users can direct a character's behaviors. Natural language offers a simple and expressive medium for specifying a user's intent. Recent breakthroughs in natural language processing (NLP) have demonstrated the effective use of language-based interfaces in applications such as image generation and program synthesis. In this work, we present PADL, which leverages recent innovations in NLP to take steps toward language-directed controllers for physics-based character animation. PADL allows users to issue natural language commands specifying both the high-level tasks and low-level skills a character should perform. We propose an adversarial imitation learning approach for training policies that map high-level language commands to low-level controls, enabling the character to perform the task and skill specified by the user's commands. In addition, we propose a multi-task aggregation method that uses language-based multiple-choice question answering to determine high-level task objectives from language commands. We show that our framework can effectively direct a simulated humanoid character to perform a diverse range of complex motor skills.

URL

https://arxiv.org/abs/2301.13868

PDF

https://arxiv.org/pdf/2301.13868.pdf

