arXiv Preprint – Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation


In this episode we discuss "Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation" by Eric Zelikman, Eliana Lorch, Lester Mackey, and Adam Tauman Kalai. The paper presents Self-Taught Optimizer (STOP), a method that uses a language model to improve a scaffolding program for solving optimization problems, and then applies that scaffolding program to improve itself. The language model proposes self-improvement strategies such as beam search, genetic algorithms, and simulated annealing. The authors evaluate STOP by comparing the improved program with its original version on several downstream tasks, and they analyze the risk that generated code bypasses a sandbox.
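
To make the idea of an "improver" concrete, here is a minimal sketch of one improvement step. It is not the paper's code: the names `improve`, `toy_propose`, and `toy_utility` are hypothetical, programs are represented as plain strings, and the language model is replaced by a deterministic stand-in so that the example runs without any external service.

```python
from typing import Callable, List


def improve(
    program: str,
    utility: Callable[[str], float],
    propose: Callable[[str, int], List[str]],
    n_candidates: int = 4,
) -> str:
    """Return the highest-utility program among the input and the proposals.

    `propose` plays the role of the language model (an assumption of this
    sketch): given a program, it returns a list of candidate rewrites.
    """
    candidates = [program] + propose(program, n_candidates)
    return max(candidates, key=utility)


def toy_propose(program: str, n: int) -> List[str]:
    # Hypothetical stand-in for a language model call:
    # candidate k appends k copies of "x" to the program text.
    return [program + "x" * k for k in range(1, n + 1)]


def toy_utility(program: str) -> float:
    # Toy utility: prefers program text whose length is close to 5.
    return -abs(len(program) - 5)


# Apply the improver for a few rounds, feeding each result back in.
best = "ab"
for _ in range(3):
    best = improve(best, toy_utility, toy_propose)

print(best)
```

In the recursive setting the summary describes, the program passed to the improver would be the improver's own source code, with a utility that scores how well a candidate improver performs on downstream tasks. This sketch leaves that step out and never executes generated code, which is the step where sandboxing matters.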

