Paper Reading AI Learner

Rate-Perception Optimized Preprocessing for Video Coding

2023-01-25 08:21:52
Chengqian Ma, Zhiqiang Wu, Chunlei Cai, Pengwei Zhang, Yi Wang, Long Zheng, Chao Chen, Quan Zhou

Abstract

In the past decades, much progress has been made in the video compression field, including traditional video codecs and learning-based video codecs. However, few studies focus on using preprocessing techniques to improve rate-distortion performance. In this paper, we propose a rate-perception optimized preprocessing (RPP) method. We first introduce an adaptive Discrete Cosine Transform (DCT) loss function that saves bitrate while keeping essential high-frequency components. We also combine several state-of-the-art techniques from low-level vision into our approach, such as a high-order degradation model, an efficient lightweight network design, and an Image Quality Assessment model. By jointly using these techniques, our RPP approach achieves, on average, 16.27% bitrate saving with different video encoders such as AVC, HEVC, and VVC under multiple quality metrics. In the deployment stage, our RPP method is simple and efficient, requiring no changes to the settings of video encoding, streaming, or decoding. Each input frame needs only a single pass through RPP before being sent to the video encoder. In addition, in our subjective visual quality test, 87% of users judged videos with RPP to be better than or equal to videos compressed with the codec alone, while the videos with RPP save about 12% bitrate on average. Our RPP framework has been integrated into the production environment of our video transcoding services, which serve millions of users every day.
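
The abstract does not give the form of the adaptive DCT loss, so the following is only a generic illustration of what a loss defined in the DCT domain can look like, not the authors' formulation. It is a minimal NumPy sketch that compares a preprocessed block with its source block coefficient by coefficient, using a per-frequency weight map. The function names (`dct_matrix`, `block_dct2`, `weighted_dct_loss`), the 8x8 block size, and the weight map are all assumptions made for this example.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)      # the DC row uses a different scale
    return mat

def block_dct2(block: np.ndarray) -> np.ndarray:
    """2D DCT-II of a square block (separable: rows, then columns)."""
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

def weighted_dct_loss(output: np.ndarray, target: np.ndarray,
                      weights: np.ndarray) -> float:
    """Mean squared coefficient difference, scaled per frequency.

    `weights` is a hypothetical map with the same shape as the block;
    larger entries make errors at that frequency more costly.
    """
    diff = block_dct2(output) - block_dct2(target)
    return float(np.mean(weights * diff ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8))
    output = target + 0.01 * rng.standard_normal((8, 8))

    # Assumed weighting: penalize low-frequency errors more than
    # high-frequency ones. The paper's adaptive scheme is not specified here.
    u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
    weights = 1.0 / (1.0 + u + v)

    print(weighted_dct_loss(output, target, weights))
```

Because the basis is orthonormal, uniform weights of 1 reduce this loss to the ordinary pixel-domain mean squared error, so any departure from uniform weights is what shifts the emphasis between frequency bands.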

URL

https://arxiv.org/abs/2301.10455

PDF

https://arxiv.org/pdf/2301.10455.pdf

