Arxiv paper - Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

In this episode, we discuss Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs by Yue Wang, Qiuzhi Liu, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Linfeng Song, Dian Yu, Juntao Li, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu. The paper identifies “underthinking” in large language models like OpenAI’s GPT-4, where models frequently switch reasoning paths without fully exploring promising solutions, leading to errors on complex tasks such as challenging mathematical problems. Through experiments on multiple test sets and models, the authors demonstrate that frequent thought switching is linked to incorrect responses and introduce a metric to measure this underthinking based on token efficiency. To address the issue, they propose a thought switching penalty (TIP) decoding strategy that encourages deeper exploration of each reasoning path, resulting in improved accuracy without requiring model fine-tuning.

Arxiv paper – Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs