In this episode, we discuss "Phantom of Latent for Large Language and Vision Models" by Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, and Yong Man Ro. The paper introduces Phantom, an efficient family of large language and vision models (LLVMs) ranging from 0.5B to 7B parameters and designed to perform comparably to much larger models. By temporarily enlarging the latent hidden dimension during multi-head self-attention, Phantom increases its learning capacity without a substantial increase in model size. The paper also introduces Phantom Optimization (PO), which combines autoregressive supervised fine-tuning with a direct preference optimization (DPO)-like objective. With this training, the authors report state-of-the-art performance compared with larger LLVMs.
arXiv preprint – Phantom of Latent for Large Language and Vision Models