Hi, I'm Lovish! I am pursuing my PhD at AI at Meta and UCL NLP. My current focus is on thinking models, synthetic data, and better evaluations.
In the past, I worked at Google Research India in the Machine Learning and Optimization Team with Prateek Jain and Srinadh Bhojanapalli on inference-efficient machine learning and natural language processing.
I graduated with a B.Tech and M.Tech in Computer Science and Engineering from the Indian Institute of Technology, Delhi. I worked with Parag Singla, Sayan Ranu, and Aaditeshwar Seth on a variety of research projects during my time there.
Recent Publications
Beyond Verifiable Rewards: Scaling Reinforcement Learning for Language Models to Unverifiable Data
arXiv Preprint
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
arXiv Preprint
HARP: A challenging human-annotated math reasoning benchmark
arXiv Preprint
Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models
NAACL 2025
The Llama 3 Herd of Models
Quantifying Variance in Evaluation Benchmarks
Regulatable ML, NeurIPS 2024
Treeformer: Dense Gradient Trees for Efficient Attention Computation
ICLR 2023
Google AI Blog Coverage
Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-profits in Improving Maternal and Child Health
AAAI 2022
MLPH Workshop, NeurIPS 2021. Best Paper Award
Google AI Blog Coverage
* indicates equal contribution.