Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations

2024-04-19 03:28:18
Sibei Chen, Yeye He, Weiwei Cui, Ju Fan, Song Ge, Haidong Zhang, Dongmei Zhang, Surajit Chaudhuri

Abstract

Spreadsheets are widely recognized as the most popular end-user programming tools, blending the power of formula-based computation with an intuitive table-based interface. Today, spreadsheets are used by billions of users to manipulate tables, most of whom are neither database experts nor professional programmers. Despite the success of spreadsheets, authoring complex formulas remains challenging, as non-technical users need to look up and understand non-trivial formula syntax. To address this pain point, we leverage the observation that there is often an abundance of similar-looking spreadsheets in the same organization, which not only have similar data but also share similar computation logic encoded as formulas. We develop an Auto-Formula system that can accurately predict the formulas users want to author in a target spreadsheet cell, by learning and adapting formulas that already exist in similar spreadsheets, using contrastive-learning techniques inspired by "similar-face recognition" from computer vision. Extensive evaluations on over 2K test formulas extracted from real enterprise spreadsheets show the effectiveness of Auto-Formula over alternatives. Our benchmark data is available at this https URL to facilitate future research.
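
The abstract names two ingredients: a contrastive objective that pulls representations of similar spreadsheets together, and a lookup of the most similar existing spreadsheet whose formulas can then be adapted. The sketch below illustrates both in miniature. It is not the authors' implementation: the triplet-margin form of the loss, the function names, the file names, and the hand-made three-dimensional embeddings are all assumptions chosen for illustration, and a real system would obtain the embeddings from a learned table encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet margin loss on squared Euclidean distances.

    Assumption: this is one common contrastive objective from face
    recognition, used here only as a stand-in. The loss is zero once the
    positive is closer to the anchor than the negative by at least `margin`.
    """
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def most_similar_sheet(target_embedding, candidates):
    """Return the name of the candidate whose embedding is closest in cosine similarity."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(target_embedding, candidates[name]),
    )


# Hypothetical, hand-made embeddings standing in for encoder outputs.
target = [0.9, 0.1, 0.0]
candidates = {
    "sales_2023.xlsx": [0.8, 0.2, 0.1],
    "hr_roster.xlsx": [0.0, 0.1, 0.9],
}
best = most_similar_sheet(target, candidates)
```

Here `best` names the spreadsheet whose existing formulas would be the first candidates to adapt for the target cell; how the adaptation itself is done is not specified in the abstract and is left out of the sketch.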

URL

https://arxiv.org/abs/2404.12608

PDF

https://arxiv.org/pdf/2404.12608.pdf

