About

Hi, I'm Lovish! I'm a founding member at Recursive. My research these days centers on reinforcement learning, self-improvement, and automating AI research.

Before that, I wrapped up my PhD at MSL, Meta and UCL NLP, working with Pontus Stenetorp and Dieuwke Hupkes. My research there covered reinforcement learning, evaluations, and Meta's large language models.

Earlier, I spent time at Google Research India in the Machine Learning and Optimization Team, working alongside Prateek Jain and Srinadh Bhojanapalli on inference-efficient machine learning and natural language processing.

I earned my B.Tech and M.Tech in Computer Science and Engineering from the Indian Institute of Technology, Delhi, where I had the good fortune of working with Parag Singla, Sayan Ranu, and Aaditeshwar Seth on a range of research projects.

Selected publications

2026
The Art of Scaling Reinforcement Learning Compute for LLMs

Devvrit Khatri*, Lovish Madaan*, Rishabh Tiwari, Rachit Bansal, Sai Surya Duvvuri, Manzil Zaheer, Inderjit S. Dhillon, David Brandfonbrener, Rishabh Agarwal

ICLR 2026 · Oral · Top 1%

2026
Rethinking Thinking Tokens: LLMs as Improvement Operators

Lovish Madaan, Aniket Didolkar, Suchin Gururangan, John Quan, Ruan Silva, Ruslan Salakhutdinov, Manzil Zaheer, Sanjeev Arora, Anirudh Goyal

2025
MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Deepak Nathani, Lovish Madaan, Nicholas Roberts, Nikolay Bashlykov, Ajay Menon, Vincent Moens, Amar Budhiraja, Despoina Magka, Vladislav Vorotilov, Gaurav Chaurasia, Dieuwke Hupkes, Ricardo Silveira Cabral, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach, William Yang Wang, Roberta Raileanu

2025
HARP: A Challenging Human-Annotated Math Reasoning Benchmark

Albert S. Yue, Lovish Madaan, Ted Moskovitz, DJ Strouse, Aaditya K. Singh

2025
Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models

Lovish Madaan, David Esiobu, Pontus Stenetorp, Barbara Plank, Dieuwke Hupkes

2024
The Llama 3 Herd of Models

Core Contributor · Llama Team, AI @ Meta

2024
Quantifying Variance in Evaluation Benchmarks

Lovish Madaan, Aaditya K. Singh, Rylan Schaeffer, Andrew Poulton, Sanmi Koyejo, Pontus Stenetorp, Sharan Narang, Dieuwke Hupkes

2023
Treeformer: Dense Gradient Trees for Efficient Attention Computation

Lovish Madaan, Srinadh Bhojanapalli, Himanshu Jain, Prateek Jain

2022
Field Study in Deploying Restless Multi-Armed Bandits for Maternal & Child Health

Aditya Mate*, Lovish Madaan*, Aparna Taneja, Neha Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham, Milind Tambe

* — equal contribution
