
Sever: A Robust Meta-Algorithm for Stochastic Optimization

2019-05-29 20:51:06
Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart

Abstract

In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers. To address this, we introduce a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers. Our method, Sever, possesses strong theoretical guarantees yet is also highly scalable -- beyond running the base learner itself, it only requires computing the top singular vector of a certain $n \times d$ matrix. We apply Sever to a drug design dataset and a spam classification dataset, and find that in both cases it is substantially more robust than several baselines. On the spam dataset, with $1\%$ corruptions, we achieved $7.4\%$ test error, compared to $13.4\%-20.5\%$ for the baselines and $3\%$ error on the uncorrupted dataset. Similarly, on the drug design dataset, with $10\%$ corruptions, we achieved a test mean-squared error of $1.42$, compared to $1.51$-$2.33$ for the baselines and $1.23$ on the uncorrupted dataset.
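For concreteness, here is a minimal Python sketch of the filtering idea the abstract describes: re-run a base learner, center the per-example gradients, score points by their squared projection onto the top singular vector of the centered $n \times d$ gradient matrix, and drop the highest-scoring points. The helper names, defaults, and the deterministic removal rule are illustrative assumptions, not the authors' implementation (the paper uses a randomized filter).

```python
import numpy as np


def sever(X, y, base_learner, gradient_fn, frac_to_remove=0.02, n_iters=4):
    """Sketch of a Sever-style filtering loop (not the authors' code).

    X: (n, d) feature matrix, y: (n,) labels.
    base_learner(X, y) -> parameter vector theta of shape (d,).
    gradient_fn(X, y, theta) -> (n, d) matrix of per-example gradients.
    frac_to_remove and n_iters are illustrative defaults.
    """
    active = np.arange(len(y))                      # indices of currently trusted points
    for _ in range(n_iters):
        theta = base_learner(X[active], y[active])  # run the base learner on trusted points
        G = gradient_fn(X[active], y[active], theta)
        G_centered = G - G.mean(axis=0, keepdims=True)
        # Top right-singular vector of the centered gradient matrix:
        # the only linear-algebra work beyond re-running the base learner.
        _, _, Vt = np.linalg.svd(G_centered, full_matrices=False)
        v = Vt[0]
        scores = (G_centered @ v) ** 2              # outlier score: squared projection onto v
        k = int(frac_to_remove * len(active))
        if k == 0:
            break
        # Simplification: drop the k highest-scoring points outright,
        # rather than using the paper's randomized filter.
        keep = np.argsort(scores)[: len(active) - k]
        active = active[keep]
    return base_learner(X[active], y[active]), active


if __name__ == "__main__":
    # Toy regression with 10% corrupted responses (synthetic, for illustration only).
    rng = np.random.default_rng(0)
    n, d = 500, 20
    X = rng.normal(size=(n, d))
    theta_true = rng.normal(size=d)
    y = X @ theta_true + 0.1 * rng.normal(size=n)
    y[:50] += 10.0

    def ridge(Xa, ya, lam=0.1):
        # Ridge regression as the base learner.
        return np.linalg.solve(Xa.T @ Xa + lam * np.eye(Xa.shape[1]), Xa.T @ ya)

    def squared_loss_grads(Xa, ya, theta):
        # Per-example gradients of 0.5 * (x . theta - y)^2 (regularizer omitted).
        return (Xa @ theta - ya)[:, None] * Xa

    theta_hat, kept = sever(X, y, ridge, squared_loss_grads, frac_to_remove=0.03)
    print("parameter error:", np.linalg.norm(theta_hat - theta_true))
    print("points kept:", len(kept), "of", n)
```

The only step here beyond re-fitting the base learner is the SVD of the centered gradient matrix, which matches the scalability claim in the abstract.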

URL

https://arxiv.org/abs/1803.02815

PDF

https://arxiv.org/pdf/1803.02815.pdf
