Hi, my name is Marvin.
I’m an algo developer at Hudson River Trading.
Previously, I studied the theory of generative models and developed methods to improve the capabilities and safety of large language models. I had the privilege of collaborating with Prof. Sitan Chen and Prof. Seth Neel. Earlier this year, I graduated summa cum laude with highest honors from Harvard with a B.A. in Computer Science & Mathematics.
My most recent project was on a unified theoretical framework for feature emergence in generative models (ICML 2024; ICML 2025 oral). In this work, we identified the mechanisms behind the emergence of high-level features, such as reasoning accuracy and toxicity, during the sampling trajectories of diffusion and large language models. This research, detailed in my thesis, was recognized with both the Hoopes Prize for outstanding undergraduate research and the Captain Jonathan Fay Prize, given to the top three theses across all disciplines at Harvard College.
I love meeting and chatting with new people. Please reach out at marvin[dot]fangzhou[dot]li[at]gmail.com.
Selected Publications
* denotes equal contribution
Blink of an Eye: A Simple Theory for Feature Localization in Generative Models
Marvin Li, Aayush Karan, Sitan Chen.
ICML, 2025 (Oral, top 1% of submissions)
arXiv / code
A unifying theory showing why and when features suddenly “lock in” during generation in both diffusion and autoregressive models.
Critical Windows: Non-Asymptotic Theory for Feature Emergence in Diffusion Models
Marvin Li, Sitan Chen.
ICML, 2024
arXiv / code
Introduces tight, distribution-agnostic bounds pinpointing when image features appear along the diffusion trajectory.
MoPe: Model Perturbation-Based Privacy Attacks on Language Models
Marvin Li*, Jason Wang*, Jeffrey Wang*, Seth Neel.
EMNLP Main Conference, 2023
arXiv
Shows that second-order gradient information lets an attacker detect training-set membership far more reliably than loss-only baselines.