Abstract
Multi-task learning solves multiple correlated tasks. However, conflicts may arise among them. In such circumstances, a single solution can rarely optimize all the tasks, leading to performance trade-offs. To arrive at a set of optimized yet well-distributed models that collectively embody different trade-offs in one algorithmic pass, this paper proposes to view Pareto multi-task learning through the lens of multi-task optimization. Multi-task learning is first cast as a multi-objective optimization problem, which is then decomposed into a diverse set of unconstrained scalar-valued subproblems. These subproblems are solved jointly using a novel multi-task gradient descent method, whose uniqueness lies in the iterative transfer of model parameters among the subproblems during the course of optimization. A theorem proving faster convergence through the inclusion of such transfers is presented. We investigate the proposed multi-task learning with multi-task optimization for solving various problem settings including image classification, scene understanding, and multi-target regression. Comprehensive experiments confirm that the proposed method significantly advances the state-of-the-art in discovering sets of Pareto-optimized models. Notably, on NYUv2, the large image dataset in our experiments, the hypervolume convergence achieved by our method was nearly two times faster than that of the next-best state-of-the-art method.
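The abstract's pipeline (decompose a multi-objective problem into scalarized subproblems, run gradient descent on each, and periodically transfer parameters among subproblems) can be sketched on a toy bi-objective problem. Everything below is an illustrative assumption, not the paper's actual algorithm: the two quadratic losses, linear scalarization via weight vectors, the neighbour-adoption transfer rule, and all step sizes are made up for the sketch.

```python
import numpy as np

# Two conflicting toy task losses: f1 pulls parameters toward +1, f2 toward -1.
def f1(x):
    return np.sum((x - 1.0) ** 2)

def f2(x):
    return np.sum((x + 1.0) ** 2)

def scalarized(x, w):
    # Linear scalarization of the two objectives (an assumed choice).
    return w[0] * f1(x) + w[1] * f2(x)

def grad_scalarized(x, w):
    # Gradient of w[0]*f1 + w[1]*f2 for the toy quadratics above.
    return w[0] * 2.0 * (x - 1.0) + w[1] * 2.0 * (x + 1.0)

# Decompose the bi-objective problem into subproblems via weight vectors.
weights = [np.array([w, 1.0 - w]) for w in np.linspace(0.05, 0.95, 7)]
params = [np.zeros(2) for _ in weights]  # one model per subproblem

lr, steps, transfer_every = 0.05, 200, 20
for t in range(steps):
    # Independent gradient-descent step on each scalarized subproblem.
    params = [x - lr * grad_scalarized(x, w) for x, w in zip(params, weights)]
    # Periodic transfer: a subproblem adopts a neighbour's parameters when
    # they score better on its own scalarized objective (assumed rule).
    if (t + 1) % transfer_every == 0:
        for i, w in enumerate(weights):
            for j in (i - 1, i + 1):
                if 0 <= j < len(weights) and \
                        scalarized(params[j], w) < scalarized(params[i], w):
                    params[i] = params[j].copy()

# Each resulting model embodies a different trade-off between f1 and f2.
front = [(f1(x), f2(x)) for x in params]
```

On this toy problem the subproblem minimizers spread along the Pareto front between the two conflicting optima, so the single run yields a set of well-distributed trade-off solutions rather than one compromise model.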
URL
https://arxiv.org/abs/2403.16162