In this episode, we discuss Future Lens: Anticipating Subsequent Tokens from a Single Hidden State by Koyena Pal, Jiuding Sun, Andrew Yuan, Byron C. Wallace, and David Bau. The paper investigates whether a single hidden state vector, taken at one input token in a model such as GPT-J-6B, carries enough information to predict multiple subsequent tokens in the sequence. Using linear approximation and causal intervention methods, the researchers find that at some layers a single hidden state can approximate the model's predictions of subsequent tokens with more than 48% accuracy. They also introduce “Future Lens,” a visualization tool that uses these findings to give a new perspective on transformer states and their predictive capabilities.
arXiv preprint – Future Lens: Anticipating Subsequent Tokens from a Single Hidden State