Abstract: Weakly-supervised temporal action localization (WTAL) aims to localize and classify action instances in untrimmed videos with only video-level labels available. Despite the remarkable ...