Abstract
High-fidelity wildfire monitoring with Unmanned Aerial Vehicles (UAVs) typically requires multimodal sensing, especially paired RGB and thermal imagery, which increases hardware cost and power consumption. This paper introduces SAM-TIFF, a novel teacher-student distillation framework for pixel-level wildfire temperature prediction and segmentation from RGB input alone. A multimodal teacher network, trained on paired RGB-thermal imagery and radiometric TIFF ground truth, distills its knowledge to a unimodal RGB student network, enabling inference without a thermal sensor. Segmentation supervision is generated by a hybrid pipeline that combines Segment Anything Model (SAM)-guided mask generation, mask selection via TOPSIS, and Canny edge detection with Otsu's thresholding for automatic point prompt selection. Our method is the first to perform per-pixel temperature regression from RGB UAV data, and it demonstrates strong generalization on the recent FLAME 3 dataset. This work lays the foundation for lightweight, cost-effective UAV-based wildfire monitoring systems that do not require thermal sensors.
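To make the automatic point-prompt step concrete, the sketch below shows one plausible way to combine Otsu's thresholding and Canny edge detection to pick candidate points that could then be fed to SAM as prompts. This is only an illustration under assumptions, not the authors' implementation: the thresholds, the number of sampled points, and the function name `select_point_prompts` are hypothetical.

```python
# Minimal sketch (assumed, not the paper's code) of automatic point-prompt
# selection from an RGB frame using Otsu's thresholding + Canny edges.
import cv2
import numpy as np

def select_point_prompts(rgb_image: np.ndarray, max_points: int = 5) -> np.ndarray:
    """Return up to `max_points` (x, y) candidate prompts for SAM."""
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2GRAY)
    # Otsu's thresholding isolates bright regions that may correspond to fire.
    _, fire_mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Canny edges keep candidates near high-contrast region boundaries.
    edges = cv2.Canny(gray, 100, 200)  # thresholds are illustrative
    candidates = cv2.bitwise_and(fire_mask, edges)
    ys, xs = np.nonzero(candidates)
    if len(xs) == 0:
        return np.empty((0, 2), dtype=np.int64)
    idx = np.random.choice(len(xs), size=min(max_points, len(xs)), replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)
```

In the framework described by the abstract, such points would prompt SAM to propose candidate masks, with TOPSIS used to rank and select among them; those details are not reproduced here.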
URL
https://arxiv.org/abs/2505.01638