Hi, my name is Marvin.
I’m an algo developer at Hudson River Trading.
Previously, I studied the theory of generative models and developed methods to improve the capabilities and safety of large language models. I had the privilege of collaborating with Prof. Sitan Chen and Prof. Seth Neel. Earlier this year, I graduated summa cum laude with highest honors from Harvard with a B.A. in Computer Science & Mathematics.
My most recent project was on a unified theoretical framework for feature emergence in generative models (ICML 2024; ICML 2025 oral). In this work, we identified the mechanisms behind the emergence of high-level features, such as reasoning accuracy and toxicity, during the sampling trajectories of diffusion and large language models. This research, detailed in my thesis, was recognized with both the Hoopes Prize for outstanding undergraduate research and the Captain Jonathan Fay Prize, given to the top three theses across all disciplines at Harvard College.
I love meeting and chatting with new people. Please reach out at marvin[dot]fangzhou[dot]li[at]gmail.com.
Selected Publications
* denotes equal contribution
Blink of an Eye: A Simple Theory for Feature Localization in Generative Models
Marvin Li, Aayush Karan, Sitan Chen.
ICML, 2025 (Oral, top 1% of submissions)
arXiv / code
A unifying theory showing why and when features suddenly “lock in” during generation in both diffusion and autoregressive models.
Critical Windows: Non-Asymptotic Theory for Feature Emergence in Diffusion Models
Marvin Li, Sitan Chen.
ICML, 2024
arXiv / code
Introduces tight, distribution-agnostic bounds pinpointing when image features appear along the diffusion trajectory.
MoPe: Model Perturbation-Based Privacy Attacks on Language Models
Marvin Li*, Jason Wang*, Jeffrey Wang*, Seth Neel.
EMNLP Main Conference, 2023
arXiv
Shows that second-order gradient information lets an attacker detect training-set membership far more reliably than loss-only baselines.