I am a Ph.D. student at Princeton working with Sanjeev Arora (research group page, ML theory at Princeton).
I focus on machine learning theory, and also have broad interests in theoretical computer science and related math.
Although machine learning (and deep learning in particular) has made great advances in recent years, our mathematical understanding of it is shallow. Learning problems can be highly nonconvex, yet tractable in practice. What hidden structure do these problems have, and how can we design algorithms to take advantage of it?
Current interests include:
- Probabilistic modeling: How can we design provable algorithms for learning probability distributions and sampling from them? How can we improve classical algorithms like Markov chain Monte Carlo, or test the quality of the samples?
- Control theory and reinforcement learning: Finding the optimal control for a known linear dynamical system is a well-studied problem. Reinforcement learning, however, deals with learning how to act in unknown, combinatorially complex systems, where algorithms are heuristic and slow. How can we bridge this gap?
- Neural networks: Neural networks tackle highly nonconvex problems but do very well in practice. Why? What kind of algorithmic improvements can we come up with by understanding their theoretical foundations more deeply?
- Natural language processing: Language is a fundamental part of human intelligence and a big frontier for machine learning. How do we create machines that can understand “grammar” and “semantics”?
Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo.
with Rong Ge and Andrej Risteski. NIPS AABI Workshop 2017. [arXiv, pdf, webpage]
On the Ability of Neural Nets to Express Distributions.
with Rong Ge, Tengyu Ma, Andrej Risteski, and Sanjeev Arora. COLT 2017. [arXiv, PMLR 65:1271-1296, webpage]