This paper presents a 6-DoF range-based Monte Carlo localization method with a GPU-accelerated Stein particle filter. To update a massive number of particles, we propose a Gauss-Newton-based Stein variational gradient descent (SVGD) with iterative neighbor particle search. This method uses SVGD to collectively update particle states with gradient and neighborhood information, which provides efficient particle sampling. For an efficient neighbor particle search, it uses locality-sensitive hashing and iteratively updates the neighbor list of each particle over time. The neighbor list is then used to propagate the posterior probabilities of particles over the neighbor particle graph. The proposed method is capable of evaluating one million particles in real time on a single GPU and enables robust pose initialization and re-localization without an initial pose estimate. In experiments, the proposed method showed extreme robustness to complete sensor occlusion (i.e., kidnapping) and enabled pinpoint sensor localization without any prior information.
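As a rough, self-contained sketch of the plain SVGD update underlying this line of work (not the paper's Gauss-Newton variant, neighbor hashing, or GPU implementation), here is the standard two-term update on a toy 1-D standard-normal target; all names and constants are illustrative:

```python
import math

def svgd_step(xs, grad_logp, h=1.0, eps=0.1):
    """One plain SVGD update on a list of 1-D particles.

    Each particle moves along a kernel-averaged score (attraction toward
    high-density regions) plus a kernel-gradient term (repulsion that
    keeps the particle set spread out).
    """
    n = len(xs)
    new = []
    for xi in xs:
        phi = 0.0
        for xj in xs:
            k = math.exp(-((xj - xi) ** 2) / (2 * h * h))  # RBF kernel
            phi += k * grad_logp(xj)          # driving (score) term
            phi += (xi - xj) / (h * h) * k    # repulsive term, d/dxj k
        new.append(xi + eps * phi / n)
    return new

# Target: standard normal, so grad log p(x) = -x.
particles = [2.0 + 0.1 * i for i in range(20)]
for _ in range(500):
    particles = svgd_step(particles, lambda x: -x)

mean = sum(particles) / len(particles)
var = sum((x - mean) ** 2 for x in particles) / len(particles)
```

The collective update is what the abstract exploits: because particles share gradient and neighborhood information, far fewer samples are wasted than in plain Monte Carlo resampling.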
https://arxiv.org/abs/2404.16370
Neural architecture search (NAS) is a challenging problem. Hierarchical search spaces allow for cheap evaluations of neural network submodules to serve as surrogates for architecture evaluations. Yet, sometimes the hierarchy is too restrictive or the surrogate fails to generalize. We present FaDE, which uses differentiable architecture search to obtain relative performance predictions on finite regions of a hierarchical NAS space. The relative nature of these ranks calls for a memory-less, batch-wise outer search algorithm, for which we use an evolutionary algorithm with pseudo-gradient descent. FaDE is especially suited to deep hierarchical, i.e., multi-cell, search spaces, which it can explore at linear instead of exponential cost, and it therefore eliminates the need for a proxy search space. Our experiments show that, first, FaDE ranks on finite regions of the search space correlate with the corresponding architecture performances and, second, the ranks can empower a pseudo-gradient evolutionary search on the complete neural architecture search space.
https://arxiv.org/abs/2404.16218
This paper introduces FlowMap, an end-to-end differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Our method performs per-video gradient-descent minimization of a simple least-squares objective that compares the optical flow induced by depth, intrinsics, and poses against correspondences obtained via off-the-shelf optical flow and point tracking. Alongside the use of point tracks to encourage long-term geometric consistency, we introduce differentiable re-parameterizations of depth, intrinsics, and pose that are amenable to first-order optimization. We empirically show that camera parameters and dense depth recovered by our method enable photo-realistic novel view synthesis on 360-degree trajectories using Gaussian Splatting. Our method not only far outperforms prior gradient-descent based bundle adjustment methods, but surprisingly performs on par with COLMAP, the state-of-the-art SfM method, on the downstream task of 360-degree novel view synthesis (even though our method is purely gradient-descent based, fully differentiable, and presents a complete departure from conventional SfM).
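The core least-squares residual can be sketched for a single pixel: backproject with depth and intrinsics, rigidly transform by the relative pose, and reproject; the difference to the original pixel is the induced flow that FlowMap compares against off-the-shelf correspondences. The helper below is a minimal illustrative version (pinhole model, no lens distortion), not the paper's implementation:

```python
def matvec(M, v):
    """3x3 matrix times 3-vector, with plain lists."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def induced_flow(uv, depth, fx, fy, cx, cy, R, t):
    """Flow at pixel `uv` induced by depth, pinhole intrinsics, and a
    relative pose (R, t): backproject, rigidly transform, reproject."""
    u, v = uv
    X = [depth * (u - cx) / fx, depth * (v - cy) / fy, depth]  # backproject
    Xp = [a + b for a, b in zip(matvec(R, X), t)]              # transform
    up = fx * Xp[0] / Xp[2] + cx                               # reproject
    vp = fy * Xp[1] / Xp[2] + cy
    return (up - u, vp - v)

I3 = [[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0]]
# Identity motion induces zero flow; a 0.1 m sideways translation of a
# point at 2 m depth moves its projection by fx * tx / z = 5 pixels.
zero = induced_flow((50.0, 50.0), 2.0, 100, 100, 50, 50, I3, [0, 0, 0])
shift = induced_flow((50.0, 50.0), 2.0, 100, 100, 50, 50, I3, [0.1, 0, 0])
```

FlowMap's objective would then sum the squared difference between such induced flows and the observed flow/track correspondences over all pixels and frames.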
https://arxiv.org/abs/2404.15259
This paper puts forth a new training data-untethered model poisoning (MP) attack on federated learning (FL). The new MP attack extends an adversarial variational graph autoencoder (VGAE) to create malicious local models based solely on the benign local models overheard without any access to the training data of FL. Such an advancement leads to the VGAE-MP attack that is not only efficacious but also remains elusive to detection. VGAE-MP attack extracts graph structural correlations among the benign local models and the training data features, adversarially regenerates the graph structure, and generates malicious local models using the adversarial graph structure and benign models' features. Moreover, a new attacking algorithm is presented to train the malicious local models using VGAE and sub-gradient descent, while enabling an optimal selection of the benign local models for training the VGAE. Experiments demonstrate a gradual drop in FL accuracy under the proposed VGAE-MP attack and the ineffectiveness of existing defense mechanisms in detecting the attack, posing a severe threat to FL.
https://arxiv.org/abs/2404.15042
Understanding cognitive processes in the brain demands sophisticated models capable of replicating neural dynamics at large scales. We present a physiologically inspired speech recognition architecture, compatible and scalable with deep learning frameworks, and demonstrate that end-to-end gradient descent training leads to the emergence of neural oscillations in the central spiking neural network. Significant cross-frequency couplings, indicative of these oscillations, are measured within and across network layers during speech processing, whereas no such interactions are observed when handling background noise inputs. Furthermore, our findings highlight the crucial inhibitory role of feedback mechanisms, such as spike frequency adaptation and recurrent connections, in regulating and synchronising neural activity to improve recognition performance. Overall, on top of developing our understanding of synchronisation phenomena notably observed in the human auditory pathway, our architecture exhibits dynamic and efficient information processing, with relevance to neuromorphic technology.
https://arxiv.org/abs/2404.14024
A primary function of back-propagation is to compute both the gradient of hidden representations and parameters for optimization with gradient descent. Training large models requires high computational costs due to their vast parameter sizes. While Parameter-Efficient Fine-Tuning (PEFT) methods aim to train smaller auxiliary models to save computational space, they still present computational overheads, especially in Fine-Tuning as a Service (FTaaS) for numerous users. We introduce Collaborative Adaptation (ColA) with Gradient Learning (GL), a parameter-free, model-agnostic fine-tuning approach that decouples the computation of the gradient of hidden representations and parameters. In comparison to PEFT methods, ColA facilitates more cost-effective FTaaS by offloading the computation of the gradient to low-cost devices. We also provide a theoretical analysis of ColA and experimentally demonstrate that ColA can perform on par or better than existing PEFT methods on various benchmarks.
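The decoupling idea can be sketched on a single linear layer: the "server" computes only the gradient with respect to the hidden representation, and a "low-cost device" turns that into a parameter gradient with a cheap outer product. This is an illustrative reading of the split, not ColA's actual implementation:

```python
def matvec(W, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def loss(W, x, y):
    h = matvec(W, x)
    return 0.5 * sum((hi - yi) ** 2 for hi, yi in zip(h, y))

x = [1.0, 2.0]
y = [0.5, -1.0]
W = [[0.3, -0.2], [0.1, 0.4]]

# "Server" side: forward pass plus the gradient w.r.t. the hidden
# representation only -- no parameter gradients are formed here.
h = matvec(W, x)
g_h = [hi - yi for hi, yi in zip(h, y)]       # dL/dh for 0.5*||h - y||^2

# "Low-cost device" side: turn g_h into a parameter gradient with a
# cheap outer product, then apply the update.
g_W = [[ghi * xj for xj in x] for ghi in g_h]  # dL/dW = g_h x^T
W_new = [[wij - 0.1 * gij for wij, gij in zip(rw, rg)]
         for rw, rg in zip(W, g_W)]

# Finite-difference check of one entry of the offloaded gradient.
eps = 1e-6
W_pert = [row[:] for row in W]
W_pert[0][0] += eps
fd = (loss(W_pert, x, y) - loss(W, x, y)) / eps
```

The point of the split is that the expensive backbone backward pass and the per-user adapter updates can run on different hardware.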
https://arxiv.org/abs/2404.13844
The Neural Tangent Kernel (NTK) has emerged as a fundamental concept in the study of wide Neural Networks. In particular, it is known that the positivity of the NTK is directly related to the memorization capacity of sufficiently wide networks, i.e., to the possibility of reaching zero loss in training, via gradient descent. Here we will improve on previous works and obtain a sharp result concerning the positivity of the NTK of feedforward networks of any depth. More precisely, we will show that, for any non-polynomial activation function, the NTK is strictly positive definite. Our results are based on a novel characterization of polynomial functions which is of independent interest.
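For context, the standard NTK definition and the link between its positivity and memorization can be stated as follows (generic notation, not necessarily the paper's):

```latex
% NTK of a network f(\cdot;\theta):
\Theta(x, x') \;=\; \big\langle \nabla_\theta f(x;\theta),\; \nabla_\theta f(x';\theta) \big\rangle .
% Strict positive definiteness: for any distinct inputs x_1,\dots,x_n,
% the Gram matrix G_{ij} = \Theta(x_i, x_j) is positive definite.
% Under gradient flow on the squared loss \mathcal{L}, in the NTK regime,
\frac{d}{dt}\,\mathcal{L}(t) \;\le\; -2\,\lambda_{\min}(G)\,\mathcal{L}(t)
\;\;\Longrightarrow\;\;
\mathcal{L}(t) \;\le\; \mathcal{L}(0)\, e^{-2\lambda_{\min}(G)\, t},
% so \lambda_{\min}(G) > 0 implies convergence to zero training loss,
% i.e., memorization of the training set.
```

This is why a sharp positivity result for any non-polynomial activation directly translates into a memorization guarantee for sufficiently wide networks.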
https://arxiv.org/abs/2404.12928
Conditional Generative Adversarial Networks (CGANs) exhibit significant potential in supervised learning model training by virtue of their ability to generate realistic labeled images. However, numerous studies have indicated the privacy leakage risk in CGANs models. The solution DPCGAN, incorporating the differential privacy framework, faces challenges such as heavy reliance on labeled data for model training and potential disruptions to original gradient information due to excessive gradient clipping, making it difficult to ensure model accuracy. To address these challenges, we present a privacy-preserving training framework called PATE-TripleGAN. This framework incorporates a classifier to pre-classify unlabeled data, establishing a three-party min-max game to reduce dependence on labeled data. Furthermore, we present a hybrid gradient desensitization algorithm based on the Private Aggregation of Teacher Ensembles (PATE) framework and Differential Private Stochastic Gradient Descent (DPSGD) method. This algorithm allows the model to retain gradient information more effectively while ensuring privacy protection, thereby enhancing the model's utility. Privacy analysis and extensive experiments affirm that the PATE-TripleGAN model can generate a higher quality labeled image dataset while ensuring the privacy of the training data.
https://arxiv.org/abs/2404.12730
This paper focuses on reducing the communication cost of federated learning by exploring generalization bounds and representation learning. We first characterize a tighter generalization bound for one-round federated learning based on local clients' generalizations and heterogeneity of data distribution (non-iid scenario). We also characterize a generalization bound in R-round federated learning and its relation to the number of local updates (local stochastic gradient descents (SGDs)). Then, based on our generalization bound analysis and our representation learning interpretation of this analysis, we show for the first time that less frequent aggregations, hence more local updates, for the representation extractor (usually corresponding to the initial layers) lead to the creation of more generalizable models, particularly for non-iid scenarios. We design a novel Federated Learning with Adaptive Local Steps (FedALS) algorithm based on our generalization bound and representation learning analysis. FedALS employs varying aggregation frequencies for different parts of the model and thus reduces the communication cost. We conclude with experimental results showing the effectiveness of FedALS.
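The varying-aggregation-frequency idea can be illustrated with a toy simulation: the "head" is averaged every round while the "extractor" is averaged only every few rounds, cutting its share of the communication by that factor. The schedule, the scalar stand-ins for weights, and the drift model below are all hypothetical simplifications, not FedALS's adaptive rule:

```python
def run_rounds(n_rounds, n_clients, tau_extractor):
    """Toy FedAvg where the extractor is averaged only every
    `tau_extractor` rounds while the head is averaged every round."""
    # Each client holds scalar stand-ins for (extractor, head) weights.
    clients = [{"ext": float(c), "head": float(c)} for c in range(n_clients)]
    for r in range(1, n_rounds + 1):
        # Local SGD steps (stand-in: each client drifts differently).
        for c, w in enumerate(clients):
            w["ext"] += 0.1 * (c + 1)
            w["head"] += 0.1 * (c + 1)
        # Head: aggregated (communicated) every round.
        avg_head = sum(w["head"] for w in clients) / n_clients
        for w in clients:
            w["head"] = avg_head
        # Extractor: aggregated only every tau_extractor rounds,
        # cutting its communication cost by that factor.
        if r % tau_extractor == 0:
            avg_ext = sum(w["ext"] for w in clients) / n_clients
            for w in clients:
                w["ext"] = avg_ext
    return clients

clients = run_rounds(n_rounds=5, n_clients=3, tau_extractor=4)
heads = {w["head"] for w in clients}   # synced at round 5 -> identical
exts = {w["ext"] for w in clients}     # last synced at round 4 -> diverged
```

In FedALS the aggregation interval per model part is chosen from the generalization-bound analysis rather than fixed as here.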
https://arxiv.org/abs/2404.11754
A variety of forms of artificial intelligence systems have been developed. Two well-known techniques are neural networks and rule-fact expert systems. The former can be trained from presented data while the latter is typically developed by human domain experts. A combined implementation that uses gradient descent to train a rule-fact expert system has been previously proposed. A related system type, the Blackboard Architecture, adds an actualization capability to expert systems. This paper proposes and evaluates the incorporation of a defensible-style gradient descent training capability into the Blackboard Architecture. It also introduces the use of activation functions for defensible artificial intelligence systems and implements and evaluates a new best path-based training algorithm.
https://arxiv.org/abs/2404.11714
Large Language Models (LLMs) have exhibited an impressive ability to perform In-Context Learning (ICL) from only a few examples. Recent works have indicated that the functions learned by ICL can be represented through compressed vectors derived from the transformer. However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introduce the concept of state vector. Inspired by the works on model soup and momentum-based gradient descent, we propose inner and momentum optimization methods that are applied to refine the state vector progressively as test-time adaptation. Moreover, we simulate state vector aggregation in the multiple example setting, where demonstrations comprising numerous examples are usually too lengthy for regular ICL, and further propose a divide-and-conquer aggregation method to address this challenge. We conduct extensive experiments using Llama-2 and GPT-J in both zero-shot setting and few-shot setting. The experimental results show that our optimization method effectively enhances the state vector and achieves the state-of-the-art performance on diverse tasks. Code is available at this https URL
https://arxiv.org/abs/2404.11225
Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved state-of-the-art performance in learning equilibrium policies of two-player zero-sum games. However, the hand-crafted hyperparameter value selection in most of the existing works requires extensive domain knowledge, forming the main barrier to applying PSRO to different games. In this work, we make the first attempt to investigate the possibility of self-adaptively determining the optimal hyperparameter values in the PSRO framework. Our contributions are three-fold: (1) Using several hyperparameters, we propose a parametric PSRO that unifies the gradient descent ascent (GDA) and different PSRO variants. (2) We propose the self-adaptive PSRO (SPSRO) by casting the hyperparameter value selection of the parametric PSRO as a hyperparameter optimization (HPO) problem where our objective is to learn an HPO policy that can self-adaptively determine the optimal hyperparameter values during the running of the parametric PSRO. (3) To overcome the poor performance of online HPO methods, we propose a novel offline HPO approach to optimize the HPO policy based on the Transformer architecture. Experiments on various two-player zero-sum games demonstrate the superiority of SPSRO over different baselines.
https://arxiv.org/abs/2404.11144
For natural language understanding and generation, embedding concepts using an order-based representation is an essential task. Unlike traditional point-vector-based representation, an order-based representation imposes geometric constraints on the representation vectors for explicitly capturing various semantic relationships that may exist between a pair of concepts. In the existing literature, several approaches to order-based embedding have been proposed, mostly focusing on capturing hierarchical relationships; examples include embeddings based on Euclidean vectors, complex vectors, hyperbolic space, order embedding, and box embedding. Box embedding creates a rich, region-based representation of concepts, but in the process it sacrifices simplicity, requiring a custom-made optimization scheme to learn the representation. Hyperbolic embedding improves embedding quality by exploiting the ever-expanding property of hyperbolic space, but it suffers the same fate as box embedding, since gradient-descent-like optimization is not simple in hyperbolic space. In this work, we propose Binder, a novel approach for order-based representation. Binder uses binary vectors for embedding, so the embedding vectors are compact, with an order-of-magnitude smaller footprint than those of other methods. Binder uses a simple and efficient optimization scheme for learning representation vectors with a linear time complexity. Our comprehensive experimental results show that Binder is very accurate, yielding competitive results on the representation task. Binder stands out from its competitors, however, on the transitive-closure link prediction task, as it can learn concept embeddings from the direct edges alone, whereas all existing order-based approaches rely on the indirect edges.
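One plausible reading of an order relation over binary vectors is bitwise containment, which makes the order check a single AND and gives transitivity for free. The convention below (more specific concepts carry a superset of their ancestors' bits) and the example concepts are purely illustrative, not Binder's exact formulation:

```python
def is_a(child: int, parent: int) -> bool:
    """Order check on binary embeddings packed into ints: `child` is
    below `parent` iff every bit set in `parent` is also set in `child`.
    (Illustrative convention: specific concepts carry more set bits.)"""
    return child & parent == parent

animal = 0b0001   # generic concept: few bits set
dog    = 0b0011   # refines animal
poodle = 0b0111   # refines dog
```

Transitivity falls out of bitwise containment: `is_a(poodle, dog)` and `is_a(dog, animal)` together imply `is_a(poodle, animal)` without storing the indirect edge.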
https://arxiv.org/abs/2404.10924
This paper presents a reactive navigation method that leverages a Model Predictive Path Integral (MPPI) control enhanced with spline interpolation for the control input sequence and Stein Variational Gradient Descent (SVGD). The MPPI framework addresses a nonlinear optimization problem by determining an optimal sequence of control inputs through a sampling-based approach. The efficacy of MPPI is significantly influenced by the sampling noise. To rapidly identify routes that circumvent large and/or newly detected obstacles, it is essential to employ high levels of sampling noise. However, such high noise levels result in jerky control input sequences, leading to non-smooth trajectories. To mitigate this issue, we propose the integration of spline interpolation within the MPPI process, enabling the generation of smooth control input sequences despite the utilization of substantial sampling noises. Nonetheless, the standard MPPI algorithm struggles in scenarios featuring multiple optimal or near-optimal solutions, such as environments with several viable obstacle avoidance paths, due to its assumption that the distribution over an optimal control input sequence can be closely approximated by a Gaussian distribution. To address this limitation, we extend our method by incorporating SVGD into the MPPI framework with spline interpolation. SVGD, rooted in the optimal transportation algorithm, possesses the unique ability to cluster samples around an optimal solution. Consequently, our approach facilitates robust reactive navigation by swiftly identifying obstacle avoidance paths while maintaining the smoothness of the control input sequences. The efficacy of our proposed method is validated on simulations with a quadrotor, demonstrating superior performance over existing baseline techniques.
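The sampling-plus-weighting core of MPPI with smooth input sequences can be sketched as follows. Noise is drawn only at a few knots and interpolated over the horizon (linear interpolation here is a simplified stand-in for the paper's spline), and rollouts are softmax-weighted by cost; the toy cost function and all constants are illustrative:

```python
import math, random

def mppi_weights(costs, lam=1.0):
    """Standard MPPI softmax weighting of sampled rollouts by cost."""
    m = min(costs)
    w = [math.exp(-(c - m) / lam) for c in costs]
    s = sum(w)
    return [x / s for x in w]

def smooth_sequence(knots, horizon):
    """Noise sampled at a few knots, interpolated over the horizon.
    (Linear interpolation stands in for the spline in the paper.)"""
    seq = []
    for t in range(horizon):
        pos = t * (len(knots) - 1) / (horizon - 1)
        i = min(int(pos), len(knots) - 2)
        a = pos - i
        seq.append((1 - a) * knots[i] + a * knots[i + 1])
    return seq

random.seed(0)
horizon, n_samples = 20, 8
# Large noise at only 4 knots still yields smooth dense input sequences.
samples = [smooth_sequence([random.gauss(0, 1) for _ in range(4)], horizon)
           for _ in range(n_samples)]
# Toy cost: keep the control input close to zero.
costs = [sum(u * u for u in seq) for seq in samples]
weights = mppi_weights(costs)
# Weighted average of the samples = updated control input sequence.
u_new = [sum(w * seq[t] for w, seq in zip(weights, samples))
         for t in range(horizon)]
```

The SVGD extension in the paper replaces this single Gaussian-weighted average with particle updates that can cluster around several distinct low-cost paths.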
https://arxiv.org/abs/2404.10395
This paper presents a novel approach to optimizing profit margins in non-life insurance markets through a gradient-descent-based method, targeting three key objectives: 1) maximizing profit margins, 2) ensuring conversion rates, and 3) enforcing fairness criteria such as demographic parity (DP). Traditional pricing optimization, which leans heavily on linear and semidefinite programming, encounters challenges in balancing profitability and fairness. These challenges become especially pronounced in situations that necessitate continuous rate adjustments and the incorporation of fairness criteria. Specifically, indirect Ratebook optimization, a widely used method for new-business price setting, relies on predictor models such as XGBoost or GLMs/GAMs to estimate downstream individually optimized prices. However, this strategy is prone to sequential errors and struggles to effectively manage optimization over continuous rate scenarios. In practice, to save time, actuaries frequently opt for optimization within discrete intervals (e.g., a range of [-20%, +20%] with fixed increments), leading to approximate estimations. Moreover, to circumvent infeasible solutions they often use relaxed constraints, leading to suboptimal pricing strategies. The reverse-engineered nature of traditional models complicates the enforcement of fairness and can lead to biased outcomes. Our method addresses these challenges by employing a direct optimization strategy in the continuous space of rates and by embedding fairness through an adversarial predictor model. This innovation not only reduces sequential errors and simplifies the complexities found in traditional models but also directly integrates fairness measures into the commercial premium calculation. We demonstrate improved margin performance and stronger enforcement of fairness, highlighting the critical need to evolve existing pricing strategies.
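The demographic parity criterion mentioned above can be measured as a gap in outcome rates between protected groups; a minimal illustrative version on quoted premiums (the threshold, prices, and group labels are made up) is:

```python
def demographic_parity_gap(prices, groups, threshold):
    """Demographic parity on a binary outcome (here: whether the quoted
    premium exceeds `threshold`), measured as the absolute difference in
    outcome rates between the two groups."""
    rate = {}
    for g in (0, 1):
        outcomes = [p > threshold for p, grp in zip(prices, groups) if grp == g]
        rate[g] = sum(outcomes) / len(outcomes)
    return abs(rate[0] - rate[1])

prices = [90, 110, 130, 95, 96, 140]
groups = [0,   0,   0,  1,  1,   1]
gap = demographic_parity_gap(prices, groups, threshold=100)
```

In the adversarial setup the paper describes, a predictor tries to recover the group label from the price, and a gradient penalty on its success drives a gap like this toward zero during optimization.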
https://arxiv.org/abs/2404.10275
While 3D Gaussian Splatting has recently become popular for neural rendering, current methods rely on carefully engineered cloning and splitting strategies for placing Gaussians, which does not always generalize and may lead to poor-quality renderings. In addition, for real-world scenes, they rely on a good initial point cloud to perform well. In this work, we rethink 3D Gaussians as random samples drawn from an underlying probability distribution describing the physical representation of the scene -- in other words, Markov Chain Monte Carlo (MCMC) samples. Under this view, we show that the 3D Gaussian updates are strikingly similar to a Stochastic Gradient Langevin Dynamics (SGLD) update. As with MCMC, samples are nothing but past visit locations; adding new Gaussians under our framework can thus be realized without heuristics, by simply placing Gaussians at existing Gaussian locations. To encourage using fewer Gaussians for efficiency, we introduce an L1 regularizer on the Gaussians. On various standard evaluation scenes, we show that our method provides improved rendering quality, easy control over the number of Gaussians, and robustness to initialization.
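The SGLD update the abstract alludes to has a standard form, stated here generically for comparison (this is textbook SGLD, not the paper's exact parameterization of the Gaussian updates):

```latex
% SGLD update on parameters \theta of the scene representation:
\theta_{k+1} \;=\; \theta_k \;+\; \frac{\epsilon}{2}\,\nabla_\theta \log p(\theta_k)
\;+\; \sqrt{\epsilon}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I).
% The drift term mirrors the gradient step on the rendering loss, and the
% injected Gaussian noise mirrors the stochasticity of the 3D Gaussian
% updates -- the structural analogy the paper draws.
```

Under this reading, cloning a Gaussian at an existing location is just re-using a past MCMC sample, which is why no placement heuristic is needed.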
https://arxiv.org/abs/2404.09591
The worldwide adoption of machine learning (ML) and deep learning models, particularly in critical sectors, such as healthcare and finance, presents substantial challenges in maintaining individual privacy and fairness. These two elements are vital to a trustworthy environment for learning systems. While numerous studies have concentrated on protecting individual privacy through differential privacy (DP) mechanisms, emerging research indicates that differential privacy in machine learning models can unequally impact separate demographic subgroups regarding prediction accuracy. This leads to a fairness concern, and manifests as biased performance. Although the prevailing view is that enhancing privacy intensifies fairness disparities, a smaller, yet significant, subset of research suggests the opposite view. In this article, with extensive evaluation results, we demonstrate that the impact of differential privacy on fairness is not monotonous. Instead, we observe that the accuracy disparity initially grows as more DP noise (enhanced privacy) is added to the ML process, but subsequently diminishes at higher privacy levels with even more noise. Moreover, implementing gradient clipping in the differentially private stochastic gradient descent ML method can mitigate the negative impact of DP noise on fairness. This mitigation is achieved by moderating the disparity growth through a lower clipping threshold.
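The clipping-plus-noise step of differentially private SGD referenced above can be sketched as follows; the constants and the toy gradients are illustrative, and this is the generic DP-SGD recipe rather than the exact experimental setup of the article:

```python
import math, random

def clip_grad(g, clip_norm):
    """Scale a per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(gi * gi for gi in g))
    scale = min(1.0, clip_norm / norm)
    return [gi * scale for gi in g]

def dp_sgd_gradient(per_example_grads, clip_norm, noise_mult, rng):
    """One DP-SGD aggregation: clip each per-example gradient, average,
    then add Gaussian noise scaled to the clipping bound. A lower
    `clip_norm` caps any one example's influence more tightly, which is
    the fairness lever discussed above."""
    clipped = [clip_grad(g, clip_norm) for g in per_example_grads]
    n, d = len(clipped), len(clipped[0])
    mean = [sum(g[i] for g in clipped) / n for i in range(d)]
    sigma = noise_mult * clip_norm / n
    return [m + rng.gauss(0.0, sigma) for m in mean], clipped

rng = random.Random(0)
# Three per-example gradients, one of them an outlier with a large norm.
grads = [[rng.gauss(0, s) for _ in range(5)] for s in (0.5, 1.0, 10.0)]
noisy_mean, clipped = dp_sgd_gradient(grads, clip_norm=1.0,
                                      noise_mult=1.0, rng=rng)
```

Because the noise scale is tied to `clip_norm`, tightening the clip both bounds the outlier's influence and reduces the injected noise, which is the mechanism behind the mitigation the article reports.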
https://arxiv.org/abs/2404.09391
This paper presents the first algorithm for model-based offline quantum reinforcement learning and demonstrates its functionality on the cart-pole benchmark. The model and the policy to be optimized are each implemented as variational quantum circuits. The model is trained by gradient descent to fit a pre-recorded data set. The policy is optimized with a gradient-free optimization scheme using the return estimate given by the model as the fitness function. This model-based approach allows, in principle, full realization on a quantum computer during the optimization phase and gives hope that a quantum advantage can be achieved as soon as sufficiently powerful quantum computers are available.
https://arxiv.org/abs/2404.10017
Neural Cellular Automata (NCA) is a class of Cellular Automata where the update rule is parameterized by a neural network that can be trained using gradient descent. In this paper, we focus on NCA models used for texture synthesis, where the update rule is inspired by partial differential equations (PDEs) describing reaction-diffusion systems. To train the NCA model, the spatio-temporal domain is discretized, and Euler integration is used to numerically simulate the PDE. However, whether a trained NCA truly learns the continuous dynamic described by the corresponding PDE or merely overfits the discretization used in training remains an open question. We study NCA models at the limit where space-time discretization approaches continuity. We find that existing NCA models tend to overfit the training discretization, especially in the proximity of the initial condition, also called the "seed". To address this, we propose a solution that utilizes uniform noise as the initial condition. We demonstrate the effectiveness of our approach in preserving the consistency of NCA dynamics across a wide range of spatio-temporal granularities. Our improved NCA model enables two new test-time interactions by allowing continuous control over the speed of pattern formation and the scale of the synthesized patterns. We demonstrate this new NCA feature in our interactive online demo. Our work reveals that NCA models can learn continuous dynamics and opens new avenues for NCA research from a dynamical systems' perspective.
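The Euler-discretized PDE update that NCA cells apply can be sketched on a pure diffusion equation; the grid size, step sizes, and point seed below are illustrative, and a real NCA would replace the fixed Laplacian with a learned update rule:

```python
def euler_diffusion_step(state, dt, dx):
    """One explicit Euler step of du/dt = laplacian(u) on a 1-D periodic
    grid -- the kind of discretized PDE update an NCA cell applies.
    Halving dx (a finer grid) requires shrinking dt to keep the same
    dynamics, which is where discretization overfitting can creep in."""
    n = len(state)
    lap = [(state[(i - 1) % n] - 2 * state[i] + state[(i + 1) % n]) / dx**2
           for i in range(n)]
    return [u + dt * l for u, l in zip(state, lap)]

state = [0.0] * 16
state[8] = 1.0                       # point "seed" initial condition
total = sum(state)
for _ in range(50):
    state = euler_diffusion_step(state, dt=0.2, dx=1.0)
```

Note the stability constraint dt/dx^2 <= 1/2 for this explicit scheme; a model trained at one (dt, dx) pair has no a-priori reason to behave consistently at another, which is exactly the continuity question the paper studies.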
https://arxiv.org/abs/2404.06279
One of the objectives of continual learning is to prevent catastrophic forgetting in learning multiple tasks sequentially, and the existing solutions have been driven by the conceptualization of the plasticity-stability dilemma. However, the convergence of continual learning for each sequential task is less studied so far. In this paper, we provide a convergence analysis of memory-based continual learning with stochastic gradient descent and empirical evidence that training current tasks causes the cumulative degradation of previous tasks. We propose an adaptive method for nonconvex continual learning (NCCL), which adjusts step sizes of both previous and current tasks with the gradients. The proposed method can achieve the same convergence rate as the SGD method when the catastrophic forgetting term which we define in the paper is suppressed at each iteration. Further, we demonstrate that the proposed algorithm improves the performance of continual learning over existing methods for several image classification tasks.
https://arxiv.org/abs/2404.05555