Posts by Collection
publications
MOPE: Model Perturbation-based Privacy Attacks on Language Models
Published in EMNLP 2023 Main Conference – Large Language Models and the Future of NLP track, 2023
Critical Windows: Non-Asymptotic Theory for Feature Emergence in Diffusion Models
Published in International Conference on Machine Learning, 2024
Bias Begets Bias: The Impact of Biased Embeddings on Diffusion Models
Published in ICML 2024 Workshop on Trustworthy Multi-modal Foundation Models and AI Agents, 2024
Blink of an Eye: A Simple Theory for Feature Localization in Generative Models
Published in International Conference on Machine Learning (Oral, top 1%), 2025
In the Blink of an Eye: A Unified Theory for Feature Emergence in Generative Models
Published in Harvard College thesis, 2025
Teaching Models to Verbalize Reward Hacking in Chain-of-Thought Reasoning
Published in ICML 2025 Workshop on Reliable and Responsible Foundation Models, 2025
Firm Foundations for Membership Inference Attacks Against Large Language Models
Published in ICML 2025 Workshop on Data in Generative Models, 2025
talks
Critical windows: a non-asymptotic theory for feature emergence in generative models
Published:
teaching
COMPSCI 124
Undergraduate course, 2024
COMPSCI 2243
Graduate course, 2024
COMPSCI 1240
Undergraduate course, 2025