ArXiv Preprint - Learning From Mistakes Makes LLM Better Reasoner

In this episode we discuss Learning From Mistakes Makes LLM Better Reasoner
by Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen. The paper introduces LEarning from MistAkes (LEMA), a method that improves large language models’ (LLMs) ability to solve math problems by fine-tuning them using GPT-4-generated mistake-correction data pairs. LEMA involves identifying an LLM’s errors in reasoning, explaining why the mistake occurred, and providing the correct solution. LEMA showed significant performance enhancements on mathematical reasoning tasks, surpassing state-of-the-art performances of open-source models, with the intention to release the code, data, and models publicly.

ArXiv Preprint – Learning From Mistakes Makes LLM Better Reasoner