Arxiv paper - Improving Video Generation with Human Feedback

In this episode, we discuss Improving Video Generation with Human Feedback by Jie Liu, Gongye Liu, Jiajun Liang, Ziyang Yuan, Xiaokun Liu, Mingwu Zheng, Xiele Wu, Qiulin Wang, Wenyu Qin, Menghan Xia, Xintao Wang, Xiaohong Liu, Fei Yang, Pengfei Wan, Di Zhang, Kun Gai, Yujiu Yang, Wanli Ouyang. The paper introduces a pipeline that utilizes human feedback to enhance video generation, addressing issues like unsmooth motion and prompt-video misalignment. It presents **VideoReward**, a multi-dimensional reward model trained on a large-scale human preference dataset, and develops three alignment algorithms—Flow-DPO, Flow-RWR, and Flow-NRG—to optimize flow-based video models. Experimental results show that VideoReward outperforms existing models, Flow-DPO achieves superior performance over other methods, and Flow-NRG allows for personalized video quality adjustments during inference.

Arxiv paper – Improving Video Generation with Human Feedback