About

Hi, I'm Lovish! I am pursuing my PhD at MSL, Meta and UCL NLP. My current focus is on thinking models, reinforcement learning, and self-improvement.

In the past, I worked at Google Research India in the Machine Learning and Optimization Team with Prateek Jain and Srinadh Bhojanapalli on inference-efficient machine learning and natural language processing.

I graduated with a B.Tech and M.Tech in Computer Science and Engineering from the Indian Institute of Technology, Delhi. I worked with Parag Singla, Sayan Ranu, and Aaditeshwar Seth on a variety of research projects during my time there.

Selected publications

2026
The Art of Scaling Reinforcement Learning Compute for LLMs

Devvrit Khatri*, Lovish Madaan*, Rishabh Tiwari, Rachit Bansal, Sai Surya Duvvuri, Manzil Zaheer, Inderjit S. Dhillon, David Brandfonbrener, Rishabh Agarwal

ICLR 2026 · Oral · Top 1%

2026
Rethinking Thinking Tokens: LLMs as Improvement Operators

Lovish Madaan, Aniket Didolkar, Suchin Gururangan, John Quan, Ruan Silva, Ruslan Salakhutdinov, Manzil Zaheer, Sanjeev Arora, Anirudh Goyal

2025
MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Deepak Nathani, Lovish Madaan, Nicholas Roberts, Nikolay Bashlykov, Ajay Menon, Vincent Moens, Amar Budhiraja, Despoina Magka, Vladislav Vorotilov, Gaurav Chaurasia, Dieuwke Hupkes, Ricardo Silveira Cabral, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach, William Yang Wang, Roberta Raileanu

2025
HARP: A Challenging Human-Annotated Math Reasoning Benchmark

Albert S. Yue, Lovish Madaan, Ted Moskovitz, DJ Strouse, Aaditya K. Singh

2025
Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models

Lovish Madaan, David Esiobu, Pontus Stenetorp, Barbara Plank, Dieuwke Hupkes

2024
The Llama 3 Herd of Models

Core Contributor · Llama Team, AI @ Meta

2024
Quantifying Variance in Evaluation Benchmarks

Lovish Madaan, Aaditya K. Singh, Rylan Schaeffer, Andrew Poulton, Sanmi Koyejo, Pontus Stenetorp, Sharan Narang, Dieuwke Hupkes

2023
Treeformer: Dense Gradient Trees for Efficient Attention Computation

Lovish Madaan, Srinadh Bhojanapalli, Himanshu Jain, Prateek Jain

2022
Field Study in Deploying Restless Multi-Armed Bandits for Maternal & Child Health

Aditya Mate*, Lovish Madaan*, Aparna Taneja, Neha Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham, Milind Tambe

* denotes equal contribution

AI at Meta
UCL NLP
Google Research
IIT Delhi