arXiv preprint – UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces


In this episode, we discuss "UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces" by Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, and Ping Luo. The paper introduces UniRef++, a single architecture that handles four reference-based object segmentation tasks: referring image segmentation (RIS), few-shot image segmentation (FSS), referring video object segmentation (RVOS), and video object segmentation (VOS). At its core is the UniFusion module, which performs multiway fusion conditioned on each task's reference, combined with a unified Transformer architecture for instance-level segmentation. UniRef++ achieves state-of-the-art performance on RIS and RVOS benchmarks and competitive results on FSS and VOS. The UniFusion module can also be integrated into existing models, such as SAM, for parameter-efficient finetuning.

