ICCV 2023 – UnLoc: A Unified Framework for Video Localization Tasks


In this episode we discuss UnLoc: A Unified Framework for Video Localization Tasks
by Shen Yan, Xuehan Xiong, Arsha Nagrani, Anurag Arnab, Zhonghao Wang, Weina Ge, David Ross, Cordelia Schmid. The paper introduces UnLoc, a unified framework for video localization using large-scale image-text pretrained models. UnLoc eliminates the need for action proposals, motion-based features, and representation masking by combining moment retrieval, temporal localization, and action segmentation in a single stage model. Experimental results show that UnLoc outperforms previous methods and achieves state-of-the-art results in all three localization tasks.


Posted

in

by

Tags: