Abstract
Wastewater treatment plants pose unique process-control challenges because of their complex dynamics, slow time constants, and stochastic delays in observations and actions. These characteristics make conventional control methods, such as Proportional-Integral-Derivative (PID) controllers, suboptimal for efficient phosphorus removal, a critical component of wastewater treatment for environmental sustainability. This study addresses these challenges with a novel deep reinforcement learning approach based on the Soft Actor-Critic algorithm, integrated with a custom simulator designed to model the delayed feedback inherent in wastewater treatment plants. The simulator incorporates Long Short-Term Memory networks for accurate multi-step state predictions, enabling realistic training scenarios. To account for the stochastic nature of delays, agents were trained under three delay scenarios: no delay, constant delay, and random delay. The results demonstrate that incorporating random delays into the reinforcement learning framework significantly improves phosphorus removal efficiency while reducing operational costs. Specifically, in the simulated environment the delay-aware agent achieved a 36% reduction in phosphorus emissions, a 55% higher reward, a 77% lower deviation from the regulatory limit target, and 9% lower total costs than traditional control methods. These findings underscore the potential of reinforcement learning to overcome the limitations of conventional control strategies in wastewater treatment, providing an adaptive and cost-effective solution for phosphorus removal.
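The three delay scenarios named above (no delay, constant delay, random delay) can be illustrated with a small sketch. This is not the authors' simulator: the class name `DelayedObservation`, the `max_delay` parameter, and the uniform delay distribution are all assumptions made for illustration, and only observation delay is modeled, although the abstract also mentions delayed actions.

```python
import random
from collections import deque


class DelayedObservation:
    """Hypothetical helper that hands the agent stale observations.

    mode="none":     the current observation is returned.
    mode="constant": the observation from `max_delay` steps ago is returned.
    mode="random":   the delay is drawn uniformly from 0..max_delay each step
                     (the uniform distribution is an assumption).
    """

    def __init__(self, mode="random", max_delay=3, seed=None):
        if mode not in ("none", "constant", "random"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        # Keep just enough history to serve the largest possible delay.
        self.history = deque(maxlen=max_delay + 1)

    def reset(self):
        """Forget all stored observations at the start of an episode."""
        self.history.clear()

    def step(self, observation):
        """Store the newest observation and return (delayed_obs, delay)."""
        self.history.append(observation)
        if self.mode == "none":
            delay = 0
        elif self.mode == "constant":
            delay = self.max_delay
        else:
            delay = self.rng.randint(0, self.max_delay)
        # Early in an episode fewer observations exist than the delay asks
        # for, so fall back to the oldest one available.
        delay = min(delay, len(self.history) - 1)
        return self.history[-1 - delay], delay
```

In a training loop, the raw simulator state would be passed through `step` before it reaches the agent, so that a policy trained with `mode="random"` never relies on the observation being fresh.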
URL
https://arxiv.org/abs/2411.18305