Abstract
This work formulates the machine learning mechanism as a bi-level optimization problem. The inner-level optimization loop minimizes a properly chosen loss function evaluated on the training data; this is the well-studied training process, which pursues optimal model parameters. The outer-level optimization loop is less well studied and maximizes a properly chosen performance metric evaluated on the validation data; this is what we call the "iteration process", which pursues optimal model hyper-parameters. Among many other degrees of freedom, this process entails model engineering (e.g., neural network architecture design) and management, experiment tracking, and dataset versioning and augmentation. The iteration process can be automated via Automatic Machine Learning (AutoML) or left to the intuition of machine learning students, engineers, and researchers. Regardless of the route taken, there is a need to reduce the computational cost of the iteration step and, as a direct consequence, the carbon footprint of developing artificial intelligence algorithms. Although the formulation of the iteration step as a bi-level optimization problem is clean and unified, its solutions are case-specific and complex. This work considers such cases in increasing order of complexity, from supervised learning to semi-supervised, self-supervised, unsupervised, few-shot, federated, reinforcement, and physics-informed learning. As a consequence of this exercise, this proposal surfaces a plethora of open problems in the field, many of which can be addressed in parallel.
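The bi-level structure described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the model is a one-parameter linear fit y = w*x, the single hyper-parameter is an L2 penalty `lam`, the performance metric is negative mean squared error, the outer level is a plain grid search, and the names `train`, `validation_metric`, and `iterate` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Data = Sequence[Tuple[float, float]]  # (x, y) pairs


def train(data: Data, lam: float, lr: float = 0.05, steps: int = 500) -> float:
    """Inner level: minimize the regularized training loss over the parameter w."""
    w = 0.0
    n = len(data)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) + lam * w^2 with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / n + 2.0 * lam * w
        w -= lr * grad
    return w


def validation_metric(w: float, data: Data) -> float:
    """Performance metric to maximize (assumed here): negative mean squared error."""
    return -sum((w * x - y) ** 2 for x, y in data) / len(data)


def iterate(train_data: Data, val_data: Data,
            candidates: Sequence[float]) -> Tuple[float, float, float]:
    """Outer level: pick the hyper-parameter that maximizes the validation metric."""
    best: Optional[Tuple[float, float, float]] = None
    for lam in candidates:
        w = train(train_data, lam)               # solve the inner problem
        score = validation_metric(w, val_data)   # evaluate on validation data
        if best is None or score > best[2]:
            best = (lam, w, score)
    if best is None:
        raise ValueError("candidates must not be empty")
    return best


# Toy data, invented for this illustration only.
train_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
val_data = [(1.5, 3.0), (2.5, 5.0)]
best_lam, best_w, best_score = iterate(train_data, val_data, [0.0, 0.1, 1.0])
```

Each outer-level candidate triggers a full inner-level training run, which is why the cost of the iteration step grows quickly with the number of hyper-parameter settings explored.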
URL
https://arxiv.org/abs/2301.11316