ICLR 2023 – Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning


In this episode we discuss "Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning" by Zeyuan Allen-Zhu and Yuanzhi Li. The paper explores how ensembles of deep learning models improve test accuracy and how an ensemble can be distilled into a single model using knowledge distillation. It presents a theoretical framework showing that ensembles enhance test accuracy when the data has a "multi-view" structure. The paper also highlights the "dark knowledge" carried in ensemble outputs and demonstrates that self-distillation can be viewed as implicitly combining ensemble learning and knowledge distillation to improve test accuracy.
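To make the distillation setting discussed above concrete, here is a minimal sketch of a standard knowledge-distillation objective, where a student is trained against the softened outputs of a teacher (for example, an averaged ensemble). The function names, temperature, and mixing weight are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F


def ensemble_logits(models, x):
    """Illustrative ensemble "teacher": average the logits of several
    independently trained models on the same input batch."""
    with torch.no_grad():
        return torch.stack([m(x) for m in models]).mean(dim=0)


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard soft-label distillation loss (a sketch, not the paper's code)."""
    # Soft targets: KL divergence between temperature-scaled distributions.
    # The teacher's full output distribution is the "dark knowledge" the
    # student learns from, beyond the hard label alone.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In self-distillation, the teacher and student share the same architecture: a model is trained once, then a fresh copy is trained with this same loss using the first model's outputs as the soft targets.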

