Podcast
The podcast where we break down recent AI papers and explain them in simple terms.

Arxiv paper – VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning – AI Breakdown
In this episode, we discuss VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning by Ye Liu, Kevin Qinghong Lin, Chang Wen Chen, Mike Zheng Shou. The paper introduces VideoMind, a novel video-language agent designed for precise temporal-grounded video understanding. It employs a role-based workflow with components like a planner, grounder, verifier, and answerer, integrated efficiently using a Chain-of-LoRA strategy for seamless role-switching without heavy model overhead. Extensive testing on 14 benchmarks shows VideoMind achieves state-of-the-art results in various video understanding tasks, highlighting its effectiveness in multi-modal and long-form temporal reasoning.
- Arxiv paper – VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning
- Arxiv paper – SynCity: Training-Free Generation of 3D Worlds
- Arxiv paper – HD-EPIC: A Highly-Detailed Egocentric Video Dataset
- Arxiv paper – Video-T1: Test-Time Scaling for Video Generation
- Arxiv paper – Calibrated Multi-Preference Optimization for Aligning Diffusion Models
News
- Arxiv paper – VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning
  In this episode, we discuss VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning by Ye Liu, Kevin Qinghong Lin, Chang…
- Arxiv paper – SynCity: Training-Free Generation of 3D Worlds
  In this episode, we discuss SynCity: Training-Free Generation of 3D Worlds by Paul Engstler, Aleksandar Shtedritski, Iro Laina, Christian Rupprecht,…
- Arxiv paper – HD-EPIC: A Highly-Detailed Egocentric Video Dataset
  In this episode, we discuss HD-EPIC: A Highly-Detailed Egocentric Video Dataset by Toby Perrett, Ahmad Darkhalil, Saptarshi Sinha, Omar Emara,…
- Arxiv paper – Video-T1: Test-Time Scaling for Video Generation
  In this episode, we discuss Video-T1: Test-Time Scaling for Video Generation by Fangfu Liu, Hanyang Wang, Yimo Cai, Kaiyan Zhang,…
- Arxiv paper – Calibrated Multi-Preference Optimization for Aligning Diffusion Models
  In this episode, we discuss Calibrated Multi-Preference Optimization for Aligning Diffusion Models by Kyungmin Lee, Xiaohang Li, Qifei Wang, Junfeng…