arXiv preprint – Sapiens: Foundation for Human Vision Models

In this episode, we discuss Sapiens: Foundation for Human Vision Models by Rawal Khirodkar, Timur Bagautdinov, Julieta Martinez, Su Zhaoen, Austin James, Peter Selednik, Stuart Anderson, Shunsuke Saito. The Sapiens model family addresses four key human-centric vision tasks: 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction. The models natively support 1K high-resolution inference and are adapted to each task by fine-tuning a model pretrained on a large dataset of human images. Self-supervised pretraining on this human-centric data significantly improves performance across the tasks, especially when labeled data is limited. Sapiens models achieve state-of-the-art results on the Humans-5K (pose), Humans-2K (part segmentation), Hi4D (depth), and THuman2 (surface normal) benchmarks, improving on prior methods by substantial margins.

