Multi-Task Learning with Multi-Task Optimization

2024-03-24 14:04:40
Lu Bai, Abhishek Gupta, Yew-Soon Ong

Abstract

Multi-task learning solves multiple correlated tasks. However, conflicts may exist between them. In such circumstances, a single solution can rarely optimize all the tasks, leading to performance trade-offs. To arrive at a set of optimized yet well-distributed models that collectively embody different trade-offs in one algorithmic pass, this paper proposes to view Pareto multi-task learning through the lens of multi-task optimization. Multi-task learning is first cast as a multi-objective optimization problem, which is then decomposed into a diverse set of unconstrained scalar-valued subproblems. These subproblems are solved jointly using a novel multi-task gradient descent method, whose uniqueness lies in the iterative transfer of model parameters among the subproblems during the course of optimization. A theorem proving faster convergence through the inclusion of such transfers is presented. We investigate the proposed multi-task learning with multi-task optimization for solving various problem settings including image classification, scene understanding, and multi-target regression. Comprehensive experiments confirm that the proposed method significantly advances the state-of-the-art in discovering sets of Pareto-optimized models. Notably, on the large image dataset we tested on, namely NYUv2, the hypervolume convergence achieved by our method was found to be nearly two times faster than the next-best among the state-of-the-art.
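The abstract describes the method only at a high level, so the following is a toy sketch of the general idea rather than the authors' algorithm. It assumes a weighted-sum decomposition of two hypothetical scalar objectives, f1(theta) = theta^2 and f2(theta) = (theta - 1)^2, and it assumes that "transfer" means averaging each subproblem's parameter with its neighbours in weight order, with a mixing coefficient that decays to zero. The names `solve_subproblems`, `mix0`, and `decay` are illustrative and do not come from the paper.

```python
def grad_weighted_sum(theta, w):
    """Gradient of g_w(theta) = w * theta**2 + (1 - w) * (theta - 1)**2."""
    return 2.0 * w * theta + 2.0 * (1.0 - w) * (theta - 1.0)


def solve_subproblems(weights, steps=200, lr=0.1, mix0=0.5, decay=0.9):
    """Jointly run gradient descent on one scalarized subproblem per weight.

    After every local gradient step, each parameter is blended with the
    average of its neighbours' parameters (an assumed form of transfer).
    The blend coefficient mix0 * decay**t shrinks towards zero, so late
    iterations reduce to plain per-subproblem gradient descent.
    """
    thetas = [5.0 for _ in weights]  # shared, deliberately poor starting point
    for t in range(steps):
        # Local gradient step on each scalar subproblem.
        stepped = [th - lr * grad_weighted_sum(th, w)
                   for th, w in zip(thetas, weights)]
        # Transfer step: mix with neighbouring subproblems' parameters.
        mix = mix0 * decay ** t
        mixed = []
        for k, th in enumerate(stepped):
            nbrs = [stepped[j] for j in (k - 1, k + 1) if 0 <= j < len(stepped)]
            avg = sum(nbrs) / len(nbrs) if nbrs else th
            mixed.append((1.0 - mix) * th + mix * avg)
        thetas = mixed
    return thetas


if __name__ == "__main__":
    weights = [0.0, 0.25, 0.5, 0.75, 1.0]
    for w, th in zip(weights, solve_subproblems(weights)):
        print(f"w={w:.2f}  theta={th:.4f}")
```

For this toy problem the minimizer of each subproblem is theta = 1 - w, so the returned parameters approach a spread of trade-off points between the two objectives. The sketch illustrates the structure of the decompose, solve jointly, and transfer loop only; it makes no claim about the convergence speed-up that the paper's theorem establishes.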

URL

https://arxiv.org/abs/2403.16162

PDF

https://arxiv.org/pdf/2403.16162.pdf

