arXiv paper - Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

In this episode, we discuss Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach by Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein. The paper presents a language model architecture that scales test-time computation by reasoning implicitly in latent space: a recurrent block is applied repeatedly, so the model can be unrolled to an arbitrary depth at inference time. Unlike chain-of-thought approaches, which scale compute by generating more tokens, it doesn't require specialized training data, works with small context windows, and can capture kinds of reasoning that are not easily expressed in words. The authors scale a proof-of-concept model to 3.5 billion parameters and 800 billion training tokens, and show that its performance on reasoning benchmarks improves, sometimes dramatically, up to a computation load equivalent to that of a 50-billion-parameter model.
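
To make the recurrent-depth idea concrete, here is a toy sketch in plain Python. It is not the authors' implementation: the class name `ToyRecurrentDepthModel`, the layer sizes, the random weights, and the tanh update are all assumptions chosen for illustration. It only shows the general shape described above: an input is embedded once, a single shared block is applied a variable number of times to a latent state, and the result is read out at the end.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.3):
    """Random weight matrix as a list of rows (illustrative only, untrained)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyRecurrentDepthModel:
    """Hypothetical toy model: embed once, iterate one shared block, read out."""

    def __init__(self, d_in=4, d_latent=8, d_out=3, seed=0):
        rng = random.Random(seed)
        self.d_latent = d_latent
        self.embed = make_matrix(d_latent, d_in, rng)         # input -> latent
        self.w_state = make_matrix(d_latent, d_latent, rng)   # shared block, state path
        self.w_input = make_matrix(d_latent, d_latent, rng)   # shared block, input path
        self.readout = make_matrix(d_out, d_latent, rng)      # latent -> output

    def forward(self, x, num_iterations):
        e = matvec(self.embed, x)
        # Assumption: a fixed-seed random initial state, so results are repeatable.
        state_rng = random.Random(1234)
        s = [state_rng.gauss(0.0, 1.0) for _ in range(self.d_latent)]
        # The same weights are reused at every step; only the step count changes.
        for _ in range(num_iterations):
            from_state = matvec(self.w_state, s)
            from_input = matvec(self.w_input, e)
            s = [math.tanh(a + b) for a, b in zip(from_state, from_input)]
        return matvec(self.readout, s)


model = ToyRecurrentDepthModel()
x = [1.0, 0.5, -0.5, 2.0]
shallow = model.forward(x, num_iterations=1)   # little test-time compute
deep = model.forward(x, num_iterations=16)     # more compute, same parameters
```

The point of the sketch is that `num_iterations` is chosen at inference time, so extra computation comes from reapplying the same parameters rather than from a larger model or a longer generated token sequence.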
Hugging Face: https://huggingface.co/papers/2502.05171

GitHub: https://github.com/seal-rg/recurrent-pretraining
