AI Breakdown

Podcast

The podcast where we break down recent AI papers and explain them in simple terms for you to understand.

Arxiv paper – How much do language models memorize?

In this episode, we discuss How much do language models memorize? by John X. Morris, Chawin Sitawarin, Chuan Guo, Narine Kokhlikyan, G. Edward Suh, Alexander M. Rush, Kamalika Chaudhuri, Saeed Mahloujifar. The paper introduces a method to quantify how much a language model memorizes from its training data as opposed to generalizing, defining model capacity as total memorization excluding generalization. Through extensive experiments on GPT-family models of varying sizes, the authors find that models memorize data until their capacity is full, after which generalization (or "grokking") increases and unintended memorization decreases. They establish scaling laws linking model capacity, dataset size, and membership inference, estimating that GPT-style models store about 3.6 bits per parameter.
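To get a feel for what the paper's roughly 3.6 bits-per-parameter estimate implies, here is a quick back-of-envelope sketch (the model sizes below are illustrative, not taken from the paper):

```python
# Rough illustration of the ~3.6 bits-per-parameter capacity estimate.
# The parameter counts here are hypothetical examples, not the paper's models.
BITS_PER_PARAM = 3.6  # capacity estimate reported in the paper

def capacity_megabytes(num_params: int) -> float:
    """Total memorization capacity in megabytes (1 MB = 8e6 bits)."""
    return num_params * BITS_PER_PARAM / 8 / 1e6

for name, params in [("125M params", 125_000_000), ("1.3B params", 1_300_000_000)]:
    print(f"{name}: ~{capacity_megabytes(params):.0f} MB of raw memorized data")
```

Under this estimate, even a billion-parameter model can memorize only on the order of hundreds of megabytes, far less than a typical training corpus, which is consistent with the paper's finding that memorization saturates and generalization takes over.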
  1. Arxiv paper – How much do language models memorize?
  2. Arxiv paper – MMaDA: Multimodal Large Diffusion Language Models
  3. Arxiv paper – Superhuman performance of a large language model on the reasoning tasks of a physician
  4. Arxiv paper – The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
  5. Arxiv paper – DanceGRPO: Unleashing GRPO on Visual Generation