Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation

Abstract

Self-supervised sequential recommendation significantly improves recommendation performance by maximizing mutual information with well-designed data augmentations. However, mutual information estimation relies on the Kullback-Leibler (KL) divergence, which has several limitations: asymmetric estimation, a required sample size that grows exponentially, and training instability. In addition, existing data augmentations are mostly stochastic, and their random modifications can break sequential correlations. These two issues motivate us to investigate an alternative, robust mutual information measurement that can model uncertainty and alleviate the limitations of KL divergence. To this end, we propose MStein, a novel self-supervised learning framework for sequential recommendation based on Mutual WasserStein discrepancy minimization. We propose the Wasserstein Discrepancy Measurement to measure the mutual information between augmented sequences. It builds upon the 2-Wasserstein distance, which is more robust, more efficient with small batch sizes, and able to model the uncertainty of stochastic augmentation processes. We also propose a novel contrastive learning loss based on the Wasserstein Discrepancy Measurement. Extensive experiments on four benchmark datasets demonstrate the effectiveness of MStein over baselines. Further quantitative analyses show its robustness against perturbations and its training efficiency with respect to batch size. Finally, an analysis of the improvements indicates better representations of popular users or items with significant uncertainty. The source code is at this https URL.
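
To illustrate the contrast between KL divergence and the 2-Wasserstein distance, the following is a minimal sketch, not the MStein implementation. It assumes each sequence representation is a diagonal Gaussian with a mean vector and a standard-deviation vector, which the abstract does not state. Under that assumption both quantities have well-known closed forms. The function names and the toy values are hypothetical.

```python
import math


def w2_squared(mu_p, sigma_p, mu_q, sigma_q):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    Closed form: ||mu_p - mu_q||^2 + ||sigma_p - sigma_q||^2,
    where sigma_* are per-dimension standard deviations.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_p, mu_q))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_p, sigma_q))
    return mean_term + std_term


def kl_divergence(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return total


# Hypothetical 2-dimensional representations of two augmented views.
mu_a, sigma_a = [0.0, 1.0], [1.0, 0.5]
mu_b, sigma_b = [1.0, 1.0], [2.0, 0.5]

w_ab = w2_squared(mu_a, sigma_a, mu_b, sigma_b)
w_ba = w2_squared(mu_b, sigma_b, mu_a, sigma_a)
kl_ab = kl_divergence(mu_a, sigma_a, mu_b, sigma_b)
kl_ba = kl_divergence(mu_b, sigma_b, mu_a, sigma_a)

print("W2^2(a, b) =", w_ab, " W2^2(b, a) =", w_ba)  # identical in both directions
print("KL(a||b) =", kl_ab, " KL(b||a) =", kl_ba)    # differ by direction
```

In this toy case the squared 2-Wasserstein distance is the same in both directions, while the two KL values differ, which is the asymmetry the abstract lists as a limitation. The standard-deviation term also shows how a difference in uncertainty contributes to the distance directly.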

URL

https://arxiv.org/abs/2301.12197

PDF

https://arxiv.org/pdf/2301.12197.pdf