arXiv preprint - NOLA: Compressing LoRA using Linear Combination of Random Basis

In this episode, we discuss NOLA: Compressing LoRA using Linear Combination of Random Basis by Soroush Abbasi Koohpayegani, KL Navaneet, Parsa Nooralinejad, Soheil Kolouri, and Hamed Pirsiavash. The paper introduces NOLA, a technique for fine-tuning and deploying large language models (LLMs) such as GPT-3 more efficiently by addressing limitations of existing Low-Rank Adaptation (LoRA) methods. NOLA re-parameterizes the low-rank matrices used in LoRA as linear combinations of randomly generated basis matrices, so that only the combination coefficients are optimized rather than the matrices themselves. Evaluations on models such as GPT-2 and LLaMA-2 show performance comparable to LoRA with significantly fewer trainable parameters, making the approach more practical for diverse applications.
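
To make the re-parameterization concrete, here is a minimal NumPy sketch of the idea as summarized above. It is an illustration under our own assumptions, not the authors' implementation: the function name `nola_delta`, the Gaussian bases, the layer sizes, and the coefficient values are all hypothetical. The frozen random bases are regenerated from a seed, so only the two small coefficient vectors would need to be trained and stored.

```python
import numpy as np

def nola_delta(alpha, beta, d_out, d_in, rank, seed=0):
    """Rebuild a low-rank weight update from coefficients and a seed (illustrative)."""
    rng = np.random.default_rng(seed)
    k, l = alpha.shape[0], beta.shape[0]
    # Frozen random bases: regenerated from the seed, never trained or stored.
    basis_a = rng.standard_normal((k, d_out, rank))
    basis_b = rng.standard_normal((l, rank, d_in))
    # Each low-rank factor is a linear combination of its random bases.
    A = np.tensordot(alpha, basis_a, axes=1)  # shape (d_out, rank)
    B = np.tensordot(beta, basis_b, axes=1)   # shape (rank, d_in)
    return A @ B                              # shape (d_out, d_in)

# Hypothetical sizes chosen only for illustration.
d_out, d_in, rank, k, l = 64, 48, 4, 8, 8
alpha = np.full(k, 0.1)  # trainable coefficients for the first factor
beta = np.full(l, 0.1)   # trainable coefficients for the second factor

delta_w = nola_delta(alpha, beta, d_out, d_in, rank)

lora_params = rank * (d_out + d_in)  # entries LoRA would train for this layer
nola_params = k + l                  # coefficients trained in this sketch
print(delta_w.shape, lora_params, nola_params)
```

In this toy setup the trainable parameter count depends on the number of bases (`k + l`) rather than on the layer dimensions or the rank, which is the property the summary above points to.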
