ArXiv Preprint – Matryoshka Diffusion Models


In this episode, we discuss Matryoshka Diffusion Models by Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, and Navdeep Jaitly. The paper introduces Matryoshka Diffusion Models (MDM), a framework for high-resolution image and video synthesis. The authors propose a diffusion process that denoises inputs at multiple resolutions jointly, along with a NestedUNet architecture in which the features and parameters for small-scale inputs are nested within those for larger scales, a design intended to improve optimization for high-resolution generation. The approach is shown to be effective on various benchmarks and achieves strong zero-shot generalization when trained on a dataset of only 12 million images.

