arXiv preprint – Knowledge is a Region in Weight Space for Fine-tuned Language Models


In this episode we discuss Knowledge is a Region in Weight Space for Fine-tuned Language Models
by Almog Gueta, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, and Leshem Choshen. The paper studies the relationships between neural network models trained on different datasets, analyzing both their weight space and their loss landscape. It finds that language models fine-tuned on the same task, even on different datasets, form tight clusters in weight space, and that traversing the regions between these clusters yields new models with comparable or even improved performance on various tasks. Building on this observation, the authors propose starting fine-tuning from the center of a model cluster rather than from the pretrained model, which improves accuracy on 11 of 12 datasets, by 3.06 points on average.
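
The cluster-center initialization idea is simple to sketch: average, parameter by parameter, the weights of several models fine-tuned on the same task, then begin a new fine-tuning run from that average. Below is a minimal PyTorch sketch of this idea; it is not the authors' released code, and the checkpoint filenames and the `cluster_center` helper are hypothetical placeholders.

```python
import torch

def cluster_center(state_dicts):
    """Parameter-wise average of a list of model state dicts.

    Assumes all checkpoints share one architecture (identical keys
    and shapes), e.g. several models fine-tuned from the same
    pretrained model. Values are cast to float so that integer
    buffers (if any) average without overflow; this is a sketch,
    not a production-ready merger.
    """
    center = {}
    for key in state_dicts[0]:
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        center[key] = stacked.mean(dim=0)
    return center

# Hypothetical usage: the paths below are placeholders for
# checkpoints fine-tuned on different datasets of the same task.
paths = ["ft_sst2.pt", "ft_imdb.pt", "ft_rotten_tomatoes.pt"]
checkpoints = [torch.load(p, map_location="cpu") for p in paths]
center = cluster_center(checkpoints)

# model.load_state_dict(center)  # then fine-tune on the target dataset
```

The design choice mirrors the paper's framing: because fine-tuned models on a task cluster together in weight space, their average sits inside the knowledge region for that task and makes a better starting point than the pretrained weights outside it.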

