arXiv preprint – Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

In this episode, we discuss Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, and Hengshuang Zhao. “Depth Anything” improves monocular depth estimation by exploiting a massive collection of about 62 million unlabeled images, scaling up the training data to reduce generalization error without requiring novel technical components. Performance is further strengthened through strategic data augmentation and by incorporating semantic knowledge from pre-trained encoders, yielding strong zero-shot generalization on a variety of public datasets and in-the-wild images. After additional fine-tuning with metric depth data, the model sets new benchmarks on the NYUv2 and KITTI datasets and enhances a depth-conditioned ControlNet, with all models released for public use.
