Nathan Lambert publishes ATOM Report update, RLHF book completion, and post-training course

Nathan Lambert, a researcher and author of the Interconnects newsletter, has published a roundup of four projects that he says did not warrant standalone posts.

TL;DR: Lambert updated the ATOM Report on open model adoption, completed his RLHF book for publisher Manning, launched a free post-training lecture series, and co-authored two papers on multi-turn dialogue and meta-RL for agentic search.

What it says

The first item is an updated version of the ATOM Report, co-authored with Florian, which accompanies the ATOM Project memo arguing for investment in open language models in the United States. Lambert says the report covers GPT-OSS adoption trends, inference market share, the influence of Chinese mid-tier players including Moonshot, Z.ai, and MiniMax, and updates to the Relative Adoption Metric (RAM). The RAM, he writes, is designed so that a score above 1 indicates a model is on track to become a top-10 most-downloaded model in its size category. He notes a recent Gemma model is showing high early RAM numbers.

The second project is his RLHF book, which he says has been sent to production with Manning. He writes that content edits are complete, that the book will go to print in approximately two months, and that it is available for pre-order on Amazon and from Manning.

Third, Lambert describes a post-training lecture series he is building to accompany the book. Three lectures are available on YouTube: an overview of RLHF and post-training; foundations, including instruction fine-tuning and reward modeling; and policy gradient algorithms. He says further lectures and in-person events are planned for the summer.

The final section covers two co-authored papers. TurnWise (Graf et al., 2026) examines the gap between single-turn and multi-turn language model capabilities and how to build training data to close it. Lambert writes that his own interests have shifted toward agents, where multi-turn interaction is “a very important user interface problem.” The second paper, Meta-Reinforcement Learning with Self-Reflection for Agentic Search (Xiao et al., 2026), frames solving hard problems with RLVR (reinforcement learning with verifiable rewards) as a meta-learning problem, in which context from previous attempts informs future rollouts. Lambert describes the core idea as treating RL for LLMs as a form of continual learning that happens in context, not only in weights.
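The idea of carrying context from earlier attempts into later rollouts can be illustrated with a small sketch. This is not the method from Xiao et al.; it is a toy loop under stated assumptions, and every name in it (Episode, solve_with_reflection, the toy policy, verifier, and reflect functions) is hypothetical. It shows only the in-context part: no weights are updated, and the verifiable reward is used solely to decide whether to stop.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Episode:
    """Context carried across attempts at one problem (hypothetical structure)."""
    question: str
    reflections: List[str] = field(default_factory=list)

    def prompt(self) -> str:
        # Notes from earlier attempts are placed before the question so the
        # next rollout can condition on them.
        if not self.reflections:
            return f"Question: {self.question}"
        notes = "\n".join(f"- {r}" for r in self.reflections)
        return f"Notes from earlier attempts:\n{notes}\n\nQuestion: {self.question}"


def solve_with_reflection(
    question: str,
    policy: Callable[[str], str],     # stand-in for an LLM rollout
    verifier: Callable[[str], float],  # stand-in for a verifiable reward
    reflect: Callable[[str], str],     # stand-in for a self-reflection step
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Retry a problem, feeding a reflection on each failure into the next prompt."""
    episode = Episode(question)
    for attempt in range(1, max_attempts + 1):
        answer = policy(episode.prompt())
        if verifier(answer) > 0:
            return answer, attempt
        episode.reflections.append(reflect(answer))
    return None, max_attempts


# Toy stand-ins (assumptions for illustration only).
def toy_policy(prompt: str) -> str:
    # Guess the first candidate that the notes have not already ruled out.
    for candidate in ("3", "5", "7"):
        if f"{candidate} was wrong" not in prompt:
            return candidate
    return "7"


def toy_verifier(answer: str) -> float:
    return 1.0 if answer == "7" else 0.0


def toy_reflect(answer: str) -> str:
    return f"{answer} was wrong"


answer, attempts = solve_with_reflection(
    "Which number am I thinking of?", toy_policy, toy_verifier, toy_reflect
)
print(answer, attempts)
```

In this toy setup the policy only succeeds because its prompt accumulates notes about failed guesses, which is the sense in which learning happens in context. A real training setup would also use the reward to update the policy, which this sketch leaves out.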

Lambert closes by noting planned travel to China and Washington, DC over the coming months.