Forensic Analysis and Localization of Multiply Compressed MP3 Audio Using Transformers

2022-03-30 17:32:37

Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

arXiv_SD

Abstract

Audio signals are often stored and transmitted in compressed formats. Among the many available audio compression schemes, MPEG-1 Audio Layer III (MP3) is one of the most widely used. Because MP3 is lossy, it leaves characteristic traces in the compressed audio that can be used forensically to expose the processing history of an audio file. In this paper, we consider the scenario of audio signal manipulation done by temporal splicing of compressed and uncompressed audio signals. We propose a method based on transformer networks to find the temporal location of the splices. Our method identifies which temporal portions of an audio signal have undergone single or multiple compression at the temporal frame level, which is the smallest temporal unit of MP3 compression. We tested our method on a dataset of 486,743 MP3 audio clips. Compared with existing methods, our method achieved higher performance and demonstrated robustness across different MP3 data.
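
To make frame-level localization concrete, the following is a minimal sketch of how per-frame predictions could be turned into splice locations. It is not the authors' method: the label sequence, the function name `splice_points`, and the 44.1 kHz sample rate are assumptions made for illustration. The only fixed quantity is the MPEG-1 Layer III frame length of 1152 samples.

```python
from itertools import groupby

# An MPEG-1 Layer III frame carries 1152 PCM samples per channel.
SAMPLES_PER_FRAME = 1152


def splice_points(frame_labels, sample_rate=44100):
    """Return (frame_index, time_in_seconds) wherever the per-frame label changes.

    frame_labels is a hypothetical sequence with one entry per MP3 frame,
    e.g. the predicted number of compressions (1 = single, 2 = double).
    """
    points = []
    index = 0          # index of the first frame in the current run
    first_run = True
    for _label, run in groupby(frame_labels):
        if not first_run:
            # A new run starts here, so this frame is a candidate splice.
            points.append((index, index * SAMPLES_PER_FRAME / sample_rate))
        first_run = False
        index += len(list(run))
    return points


# Hypothetical predictions: frames 3 and 4 were compressed twice.
labels = [1, 1, 1, 2, 2, 1, 1, 1, 1]
for frame, seconds in splice_points(labels):
    print(f"label change at frame {frame} ({seconds:.4f} s)")
```

At 44.1 kHz one frame spans about 26 ms, which bounds the temporal resolution of any frame-level localization.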

URL

https://arxiv.org/abs/2203.16499

PDF

https://arxiv.org/pdf/2203.16499.pdf