arXiv Preprint – DoG is SGD’s Best Friend: A Parameter-Free Dynamic Step Size Schedule


In this episode we discuss “DoG is SGD’s Best Friend: A Parameter-Free Dynamic Step Size Schedule” by Maor Ivgi, Oliver Hinder, and Yair Carmon. The paper introduces DoG (Distance over Gradients), a dynamic SGD step size formula that requires no manual learning-rate tuning: at each iteration, the step size is the maximum distance traveled from the initialization divided by the square root of the sum of squared gradient norms seen so far. The authors prove strong convergence guarantees for DoG in stochastic convex optimization. Empirically, DoG performs comparably to SGD with a tuned learning rate, and a per-layer variant of DoG can even outperform tuned SGD.
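To make the update rule concrete, here is a minimal NumPy sketch of SGD with the DoG step size, eta_t = rbar_t / sqrt(sum of ||g_i||^2 for i <= t), where rbar_t tracks the maximum distance from the initial point. The function name `dog_sgd`, the `grad_fn` interface, and the toy quadratic objective are illustrative choices, and `r_eps` stands in for the paper's small initial-movement parameter (rbar_0 would otherwise be zero); consult the paper for the exact variants analyzed.

```python
import numpy as np

def dog_sgd(grad_fn, x0, steps=1000, r_eps=1e-4, seed=0):
    """Sketch of SGD with the DoG (Distance over Gradients) step size:
        eta_t = rbar_t / sqrt(sum_{i<=t} ||g_i||^2),
    where rbar_t = max(r_eps, max_{i<=t} ||x_i - x0||).
    grad_fn(x, rng) returns a stochastic gradient at x."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    rbar = r_eps        # stand-in for rbar_0; ||x_0 - x_0|| = 0, so a small floor is needed
    grad_sq_sum = 0.0   # running sum of squared gradient norms
    for _ in range(steps):
        g = grad_fn(x, rng)
        grad_sq_sum += float(np.dot(g, g))
        eta = rbar / np.sqrt(grad_sq_sum)   # "distance over gradients"
        x = x - eta * g
        rbar = max(rbar, float(np.linalg.norm(x - x0)))  # update max distance from init
    return x

if __name__ == "__main__":
    # Toy example (not from the paper): noisy gradients of f(x) = 0.5 * ||x||^2.
    grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
    x_final = dog_sgd(grad, x0=np.ones(10))
    print(np.linalg.norm(x_final))  # should shrink toward the minimizer at 0
```

Note how no learning rate is supplied: the step size adapts purely from the iterates' distance to the initialization and the accumulated gradient norms, which is what makes the schedule parameter-free.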

