Abstract
We propose a two-stage unsupervised approach for parsing videos into phases. We use motion cues to divide the video into coarse segments. Noisy segment labels are then used to weakly supervise an appearance-based classifier. We show the effectiveness of the method for phase detection in colonoscopy videos.
Abstract (translated)
URL
https://arxiv.org/abs/2210.10594