arXiv paper – VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models


In this episode, we discuss "VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models" by Hila Chefer, Uriel Singer, Amit Zohar, Yuval Kirstain, Adam Polyak, Yaniv Taigman, Lior Wolf, and Shelly Sheynin. Generative video models typically prioritize appearance accuracy over motion coherence, which limits their ability to capture realistic dynamics. The paper presents VideoJAM, a framework that teaches the model a joint appearance-motion representation and uses an Inner-Guidance mechanism to improve motion consistency during generation. The authors report state-of-the-art motion realism together with improved visual quality, and describe the framework as adaptable to existing video models without major changes.

