Category: Uncategorized
-
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL
In this episode, we discuss DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL by Rui Lu, Zhenyu Hou, Zihan Wang, Hanchen Zhang, Xiao Liu, Yujiang Li, Shi Feng, Jie Tang, Yuxiao Dong. The paper introduces DeepDive, a method to improve large language models’ deep search capabilities by automatically generating complex questions and…
-
Towards a Physics Foundation Model
In this episode, we discuss Towards a Physics Foundation Model by Florian Wiesner, Matthias Wessling, Stephen Baek. This paper introduces the General Physics Transformer (GPhyT), a foundation model trained on diverse simulation data that can simulate multiple complex physical systems without explicit knowledge of governing equations. GPhyT outperforms specialized models by up to 29 times,…
-
Scalable Option Learning in High-Throughput Environments
In this episode, we discuss Scalable Option Learning in High-Throughput Environments by Mikael Henaff, Scott Fujimoto, Michael Rabbat. The paper presents Scalable Option Learning (SOL), a hierarchical reinforcement learning algorithm designed for high-throughput environments. SOL achieves a 25x increase in training speed and outperforms flat agents by training on 20 billion frames in the game…
-
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
In this episode, we discuss Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning by Shenzhi Wang, Le Yu, Chang Gao, Chujie Zheng, Shixuan Liu, Rui Lu, Kai Dang, Xionghui Chen, Jianxin Yang, Zhenru Zhang, Yuqiong Liu, An Yang, Andrew Zhao, Yang Yue, Shiji Song, Bowen Yu, Gao Huang, Junyang…
-
Reverse-Engineered Reasoning for Open-Ended Generation
In this episode, we discuss Reverse-Engineered Reasoning for Open-Ended Generation by Haozhe Wang, Haoran Que, Qixin Xu, Minghao Liu, Wangchunshu Zhou, Jiazhan Feng, Wanjun Zhong, Wei Ye, Tong Yang, Wenhao Huang, Ge Zhang, Fangzhen Lin. The paper introduces REverse-Engineered Reasoning (REER), a novel backward approach that uncovers deep reasoning steps from known good solutions instead…
-
Scaling Performance of Large Language Model Pretraining
In this episode, we discuss Scaling Performance of Large Language Model Pretraining by Alexander Interrante-Grant, Carla Varela-Rosa, Suhaas Narayan, Chris Connelly, Albert Reuther. The paper explores the challenges and strategies involved in training large language models (LLMs) at scale, focusing on distributed training and managing massive datasets across many computing nodes. It provides practical recommendations…
-
General Social Agents
In this episode, we discuss General Social Agents by Benjamin S. Manning, John J. Horton. The paper proposes using AI agents guided by social science theory and natural language instructions to predict human behavior in novel settings without ad hoc adjustments. By training these agents on human data from related “seed” games, they successfully predict…
-
We need a new ethics for a world of AI agents
In this episode, we discuss We need a new ethics for a world of AI agents by Iason Gabriel, Geoff Keeling, Arianna Manzini & James Evans. The paper examines the shift toward autonomous AI agents capable of goal-directed actions with minimal human oversight. It highlights both the potential benefits of these agents, such as economic…
-
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts
In this episode, we discuss ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts by Yuying Ge, Yixiao Ge, Chen Li, Teng Wang, Junfu Pu, Yizhuo Li, Lu Qiu, Jin Ma, Lisheng Duan, Xinyu Zuo, Jinwen Luo, Weibo Gu, Zexuan Li, Xiaojing Zhang, Yangyu Tao, Han Hu, Di Wang, Ying Shan. The paper presents ARC-Hunyuan-Video, a 7B-parameter…
-
Small Language Models are the Future of Agentic AI
In this episode, we discuss Small Language Models are the Future of Agentic AI by Peter Belcak, Greg Heinrich, Shizhe Diao, Yonggan Fu, Xin Dong, Saurav Muralidharan, Yingyan Celine Lin, Pavlo Molchanov. The paper argues that small language models (SLMs) are more suitable, powerful enough, and cost-effective for many specialized agentic AI tasks compared to…
-
Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents
In this episode, we discuss Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents by Davide Paglieri, Bartłomiej Cupiał, Jonathan Cook, Ulyana Piterbarg, Jens Tuyls, Edward Grefenstette, Jakob Nicolaus Foerster, Jack Parker-Holder, Tim Rocktäschel. The paper introduces a framework enabling large language model agents to dynamically decide when to plan during task execution,…
-
Why Language Models Hallucinate
In this episode, we discuss Why Language Models Hallucinate by The authors of the paper are: – Adam Tauman Kalai – Ofir Nachum – Santosh S. Vempala – Edwin Zhang. The paper explains that hallucinations in large language models arise because training and evaluation reward guessing over admitting uncertainty, framing the issue as errors in…
-
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
In this episode, we discuss Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens by Chengshuai Zhao, Zhen Tan, Pingchuan Ma, Dawei Li, Bohan Jiang, Yancheng Wang, Yingzhen Yang, Huan Liu. The paper investigates Chain-of-Thought (CoT) reasoning in large language models, revealing it may not reflect true inferential processes but rather learned patterns…
-
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
In this episode, we discuss Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models by Vlad Sobal, Wancong Zhang, Kyunghyun Cho, Randall Balestriero, Tim G. J. Rudner, Yann LeCun. The paper compares model-free reinforcement learning and model-based control methods for solving navigation tasks using offline, reward-free data. It finds that reinforcement…
-
Persona Vectors: Monitoring and Controlling Character Traits in Language Models
In this episode, we discuss Persona Vectors: Monitoring and Controlling Character Traits in Language Models by Runjin Chen, Andy Arditi, Henry Sleight, Owain Evans, Jack Lindsey. The paper introduces persona vectors in large language models’ activation space that correspond to traits like evil or sycophancy and can track personality changes. These vectors help predict, control,…
-
Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning
In this episode, we discuss Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning by Jaedong Hwang, Kumar Tanmay, Seok-Jin Lee, Ayush Agrawal, Hamid Palangi, Kumar Ayush, Ila Fiete, Paul Pu Liang. The paper introduces GEOFACT-X, a multilingual factual reasoning benchmark with annotated reasoning traces in five languages to better evaluate language consistency in…
-
Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards
In this episode, we discuss Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards by Jaeho Kim, Yunseok Lee, Seulki Lee. The paper addresses challenges in AI conference peer review caused by massive submission volumes and declining review quality. It proposes a bi-directional review system where authors evaluate reviewers, and reviewers…
-
Working with AI: Measuring the Occupational Implications of Generative AI
In this episode, we discuss Working with AI: Measuring the Occupational Implications of Generative AI by Kiran Tomlinson, Sonia Jaffe, Will Wang, Scott Counts, Siddharth Suri. The paper analyzes 200,000 anonymized interactions between users and Microsoft Bing Copilot to understand how AI assists with various work activities. It identifies information gathering, writing, teaching, and advising…
-
Towards physician-centered oversight of conversational diagnostic AI
In this episode, we discuss Towards physician-centered oversight of conversational diagnostic AI by Elahe Vedadi, David Barrett, Natalie Harris, Ellery Wulczyn, Shashir Reddy, Roma Ruparel, Mike Schaekermann, Tim Strother, Ryutaro Tanno, Yash Sharma, Jihyeon Lee, Cían Hughes, Dylan Slack, Anil Palepu, Jan Freyberg, Khaled Saab, Valentin Liévin, Wei-Hung Weng, Tao Tu, Yun Liu, Nenad Tomasev,…
-
Learning without training: The implicit dynamics of in-context learning
In this episode, we discuss Learning without training: The implicit dynamics of in-context learning by Benoit Dherin, Michael Munn, Hanna Mazzawi, Michael Wunder, Javier Gonzalvo. The paper investigates how Large Language Models (LLMs) can learn new patterns during inference without weight updates, a phenomenon called in-context learning. It proposes that the interaction between self-attention and…
-
Aime: Towards Fully-Autonomous Multi-Agent Framework
In this episode, we discuss Aime: Towards Fully-Autonomous Multi-Agent Framework by Yexuan Shi, Mingyu Wang, Yunxiang Cao, Hongjie Lai, Junjian Lan, Xin Han, Yu Wang, Jie Geng, Zhenan Li, Zihao Xia, Xiang Chen, Chen Li, Jian Xu, Wenbo Duan, Yuanshuo Zhu. The paper presents Aime, a novel multi-agent system framework that improves upon traditional plan-and-execute…
-
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation
In this episode, we discuss ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation by Reza Yousefi Maragheh, Pratheek Vadla, Priyank Gupta, Kai Zhao, Aysenur Inan, Kehui Yao, Jianpeng Xu, Praveen Kanumala, Jason Cho, Sushant Kumar. The paper proposes ARAG, a multi-agent Retrieval-Augmented Generation framework that enhances personalized recommendation by using specialized LLM agents to better…
-
4KAgent: Agentic Any Image to 4K Super-Resolution
In this episode, we discuss 4KAgent: Agentic Any Image to 4K Super-Resolution by Yushen Zuo, Qi Zheng, Mingyang Wu, Xinrui Jiang, Renjie Li, Jian Wang, Yide Zhang, Gengchen Mai, Lihong V. Wang, James Zou, Xiaoyu Wang, Ming-Hsuan Yang, Zhengzhong Tu. The paper introduces 4KAgent, a versatile image super-resolution model capable of upscaling any image to…
-
Critiques of World Models
In this episode, we discuss Critiques of World Models by Eric Xing, Mingkai Deng, Jinyu Hou, Zhiting Hu. The paper critiques existing approaches to world models by emphasizing their role in simulating all actionable possibilities for reasoning and acting. It proposes a new general-purpose world model architecture featuring hierarchical, multi-level, and mixed continuous/discrete representations learned…
-
Arxiv paper – Expert-level validation of AI-generated medical text with scalable language models
In this episode, we discuss Expert-level validation of AI-generated medical text with scalable language models by Asad Aali, Vasiliki Bikia, Maya Varma, Nicole Chiou, Sophie Ostmeier, Arnav Singhvi, Magdalini Paschali, Ashwin Kumar, Andrew Johnston, Karimar Amador-Martinez, Eduardo Juan Perez Guerrero, Paola Naovi Cruz Rivera, Sergios Gatidis, Christian Bluethgen, Eduardo Pontes Reis, Eddy D. Zandee van…