In this episode we discuss Matryoshka Diffusion Models
by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, Navdeep Jaitly. The paper introduces Matryoshka Diffusion Models (MDM) for high-resolution image and video synthesis. The authors propose a diffusion process that denoises inputs at multiple resolutions simultaneously. They also present a NestedUNet architecture that combines features and parameters for small-scale inputs with larger scales, allowing for improved optimization for high-resolution generation. The approach is demonstrated to be effective on various benchmarks, achieving strong zero-shot generalization using a dataset of only 12 million images.
ArXiv Preprint – Matryoshka Diffusion Models
by
Tags: