My work focuses on the theoretical foundations of machine learning, with an emphasis on generative models and large-scale systems. I am currently mentored by Prof. Han Liu (Northwestern University).
Research Interests
I work in machine learning theory, concentrating on the mathematical foundations of
large language models and generative AI. My recent work spans distribution estimation,
approximation theory for transformers, and flow-based generative models.
Papers & Projects
- We provide the first end-to-end theoretical analysis of Discrete Flow Matching (DFM) generative models. Key contributions include a discrete-to-continuous functional extension technique and a method for bounding the discrete distribution estimation error by the velocity estimation error; a schematic version of this style of bound appears after this list.
- We develop a systematic recipe for translating ReLU approximation results to the softmax attention mechanism. The framework covers a wide range of approximation targets and yields target-specific, economical resource bounds, going beyond generic universal approximation statements; the attention map in question is written out after this list.
- Learning Low-Dimensional Manifold Data with Flow-Matching Transformers (ICML 2026). We establish statistical estimation rates for flow matching models when the data lie on a low-dimensional manifold, characterizing how the latent dimension and the smoothness of the velocity function jointly govern the rates; the typical shape of such rates is sketched after this list.
- We introduce DoMinO (Discrete flow Matching policy Optimization), a unified framework for reinforcement learning fine-tuning of DFM models under a broad class of policy gradient methods; the classical policy gradient identity this line of work builds on is recalled after this list. I am responsible for the theoretical analysis in this paper.
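As a rough illustration of the first result above (a schematic of the general style of bound, in my own notation, not the paper's exact statement): writing $u^\star$ for the ground-truth velocity of the probability path $(p_t)_{t\in[0,1]}$ and $u_\theta$ for its learned estimate, such analyses control the terminal distribution estimation error by the time-integrated velocity estimation error,
\[
\mathrm{err}\big(p_1, \hat p_1\big) \;\lesssim\; \int_0^1 \mathbb{E}_{x \sim p_t} \big\| u_\theta(x, t) - u^\star(x, t) \big\|^2 \, dt,
\]
where $\mathrm{err}$ stands for a distributional discrepancy such as total variation or KL divergence, and $\hat p_1$ is the law of the learned sampler at time $1$.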
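For concreteness, the softmax attention map referenced in the second project is the standard one (this definition is textbook, not specific to the paper): for input $X \in \mathbb{R}^{n \times d}$ and weight matrices $W_Q, W_K, W_V$,
\[
\mathrm{Attn}(X) \;=\; \mathrm{softmax}\!\Big( \tfrac{1}{\sqrt{d_k}}\, X W_Q \,(X W_K)^\top \Big)\, X W_V,
\]
with the softmax applied row-wise. The recipe transfers approximation guarantees stated for ReLU networks to networks built from this map.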
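For the manifold result, estimation rates of this kind typically take the classical nonparametric form (the exponent below is the generic shape of such results, not the paper's exact statement): with $n$ samples, velocity smoothness $s$, and latent dimension $d$,
\[
\text{estimation error} \;\lesssim\; n^{-\frac{s}{2s + d}} \quad \text{(up to logarithmic factors)},
\]
the key point being that $d$ is the latent dimension rather than the ambient one, so the rates escape the ambient curse of dimensionality.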
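Finally, for context on DoMinO: the classical policy gradient identity underlying this family of fine-tuning methods is (in its standard advantage form, not the paper's specific estimator)
\[
\nabla_\theta J(\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\!\Big[ \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, A^{\pi_\theta}(s_t, a_t) \Big],
\]
where $A^{\pi_\theta}$ is the advantage function; the framework studies how such updates behave when the policy $\pi_\theta$ is realized by a DFM sampler.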
Future Directions
Building on my current work, I plan to deepen my investigation of machine learning
theory as it applies to large language models and generative AI, while actively seeking
opportunities to connect theoretical insights with empirical practice.