Paper Reading AI Learner

VertAttack: Taking advantage of Text Classifiers' horizontal vision

2024-04-12 15:32:17
Jonathan Rusert

Abstract

Text classification systems have continuously improved in performance over the years. However, nearly all current SOTA classifiers share a similar shortcoming: they process text in a horizontal manner, so vertically written words are not recognized. In contrast, humans are easily able to recognize and read words written both horizontally and vertically. Hence, a human adversary could write problematic words vertically and the meaning would still be preserved for other humans. We simulate such an attack, VertAttack. VertAttack identifies which words a classifier relies on and then rewrites those words vertically. We find that VertAttack greatly reduces the accuracy of 4 different transformer models on 5 datasets. For example, on the SST2 dataset, VertAttack drops RoBERTa's accuracy from 94% to 13%. Furthermore, since VertAttack does not replace words, meaning is easily preserved. We verify this via a human study and find that crowdworkers are able to correctly label 77% of perturbed texts, compared to 81% of the original texts. We believe VertAttack offers a look into how humans might circumvent classifiers in the future, and we hope it inspires research into more robust algorithms.
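The core perturbation the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the set of classifier-important words is assumed to be supplied externally (the paper's actual importance-scoring step is not shown), and `vert_attack` simply rewrites each such word with one character per line.

```python
def vertical(word):
    # Rewrite a word vertically: one character per line.
    # A human still reads it top-to-bottom, but a horizontal
    # tokenizer sees isolated single characters.
    return "\n".join(word)

def vert_attack(text, important_words):
    # Hypothetical sketch of the attack: replace each word the
    # classifier is assumed to rely on with its vertical form,
    # leaving all other words untouched.
    words = text.split(" ")
    return " ".join(
        vertical(w) if w in important_words else w for w in words
    )

print(vert_attack("a great movie", {"great"}))
# prints:
# a g
# r
# e
# a
# t movie
```

Because the perturbed word is spelled out rather than substituted, no semantic content is removed, which is consistent with the human-study result that readers still label most perturbed texts correctly.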

URL

https://arxiv.org/abs/2404.08538

PDF

https://arxiv.org/pdf/2404.08538.pdf
