Improvement in Semantic Address Matching using Natural Language Processing

2024-04-17 18:42:36
Vansh Gupta, Mohit Gupta, Jai Garg, Nitesh Garg

Abstract

Address matching is an important task for many businesses, especially delivery and takeout companies, which need to retrieve a specific address from their data warehouse. Existing solutions use string similarity and edit-distance algorithms to find similar addresses in the address database, but these algorithms do not work effectively with redundant, unstructured, or incomplete address data. This paper discusses a semantic address matching technique, by which we can find a particular address in a list of candidate addresses. We also review existing practices and their shortcomings. Semantic address matching is essentially an NLP task in the field of deep learning. With this technique, we can overcome the drawbacks of existing methods, such as problems with redundant or abbreviated data. The solution applies OCR to invoices to extract addresses and build a pool of address data. This data is then fed to the BM-25 algorithm, which scores the best-matching entries. Finally, the top-scoring entries are passed through BERT, which selects the best result from among the similar candidates. Our investigation shows that our methodology greatly improves on existing state-of-the-art techniques in both accuracy and recall.
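The BM-25 scoring stage mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a plain Okapi BM25 ranker written with the Python standard library only, and the class name `BM25`, the `tokenize` helper, the parameter defaults `k1=1.5` and `b=0.75`, and the sample addresses are all assumptions made for illustration. The OCR step and the BERT re-ranking step are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25:
    """Minimal Okapi BM25 ranker over a fixed, non-empty list of documents."""

    def __init__(self, documents, k1=1.5, b=0.75):
        self.k1 = k1  # term-frequency saturation (assumed default)
        self.b = b    # length-normalization strength (assumed default)
        self.docs = [tokenize(d) for d in documents]
        self.term_freqs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)

        # Document frequency: number of documents containing each term.
        n = len(self.docs)
        df = Counter()
        for d in self.docs:
            df.update(set(d))
        # Smoothed IDF; the "1 +" inside the log keeps every weight positive.
        self.idf = {
            term: math.log(1 + (n - f + 0.5) / (f + 0.5))
            for term, f in df.items()
        }

    def score(self, query, index):
        """BM25 score of the document at `index` for `query`."""
        freqs = self.term_freqs[index]
        doc_len = len(self.docs[index])
        total = 0.0
        for term in tokenize(query):
            tf = freqs.get(term, 0)
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            norm = tf + self.k1 * (1 - self.b + self.b * doc_len / self.avgdl)
            total += self.idf[term] * tf * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=3):
        """Return up to k (index, score) pairs, best score first."""
        scored = [(i, self.score(query, i)) for i in range(len(self.docs))]
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:k]


# Hypothetical address pool, standing in for addresses extracted by OCR.
addresses = [
    "221B Baker Street, London NW1 6XE",
    "10 Downing Street, London SW1A 2AA",
    "742 Evergreen Terrace, Springfield",
    "12 Baker Road, Leeds LS1 4AB",
]

bm25 = BM25(addresses)
ranked = bm25.top_k("baker street 221b london", k=3)
for index, score in ranked:
    print(f"{score:.3f}  {addresses[index]}")
```

In a two-stage setup like the one the abstract describes, the short candidate list returned by `top_k` would then be handed to a BERT-based model for final selection. BM25 still depends on shared tokens, so a query that abbreviates "Street" as "St" loses that term's contribution, which is presumably part of why a semantic second stage is used.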

URL

https://arxiv.org/abs/2404.11691

PDF

https://arxiv.org/pdf/2404.11691.pdf

