Brownian motion (and other probability prerequisites)
Stochastic calculus
High-dimensional probability (LSI, etc.)
Langevin papers
Boosting, etc.
Fix notes!
Speculative thoughts
Relationship between Rademacher width and size of \(\ep\)-net?
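Partial answer in one direction, recorded here with hedged constants (they vary by source): for functions with values in \([-1,1]\), writing \(N(\mathcal F,\ep)\) for the size of an \(\ep\)-net in the empirical \(L^2\) metric, one-step discretization plus Massart's finite-class lemma gives
\[
\hat{\mathcal R}_n(\mathcal F) \;\le\; \ep + \sqrt{\frac{2\log N(\mathcal F,\ep)}{n}},
\]
and Dudley's chaining refines this to (up to absolute constants)
\[
\hat{\mathcal R}_n(\mathcal F) \;\lesssim\; \inf_{\alpha>0}\Big(\alpha + \frac{1}{\sqrt n}\int_\alpha^\infty \sqrt{\log N(\mathcal F,\ep)}\,d\ep\Big).
\]
A partial converse is Sudakov minoration, which lower-bounds the Gaussian complexity by \(\sup_{\ep>0}\ep\sqrt{\log N(\mathcal F,\ep)/n}\) up to constants, so the two quantities determine each other up to the gap between the entropy integral and its largest term.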
Can we rephrase the Frieze-Kannan regularity lemma as follows: there exists a small neural net that approximately computes cuts? Number of hidden nodes independent of the size of the graph?
FK as: set of graphs is “small” in cut metric, so has a small \(\ep\)-net. Rademacher?
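For reference, the cut-decomposition form of Frieze-Kannan (constants hedged): for any graph on \(n\) vertices with adjacency matrix \(A\) and any \(\ep>0\), there are \(k = O(1/\ep^2)\) pairs \((S_i,T_i)\) and reals \(d_i\) with
\[
\Big\| A - \sum_{i=1}^{k} d_i\,\mathbf 1_{S_i}\mathbf 1_{T_i}^{\top} \Big\|_{\square} \;\le\; \ep n^2,
\]
so for every pair \((S,T)\), with \(e(S,T) := \mathbf 1_S^{\top} A\,\mathbf 1_T\),
\[
e(S,T) \;=\; \sum_{i=1}^{k} d_i\,|S\cap S_i|\,|T\cap T_i| \;\pm\; \ep n^2.
\]
So cut values are approximated by a bilinear form in the \(2k\) overlap counts \(|S\cap S_i|,|T\cap T_i|\): width \(O(1/\ep^2)\) over indicator inputs, independent of \(n\). Caveat: this is bilinear rather than a sum of ridge functions, so turning it into a literal 2-layer net needs an extra step to compute the products.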
Can we show that certain solutions to problems are approximable by 2-layer neural nets, by showing that their Barron norm is small? (A form of Barron's theorem is recalled below, after the RL example.)
Ex. Reinforcement learning value function. Must assume some kind of geometry on the state space. (For arbitrary partitions, trivial, as all value vectors are in \(\R^A\).)
On policy side: it’s clear that graph can be partitioned.
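The statement I have in mind above is Barron's theorem (Barron 1993; constants as I recall them, not checked): if \(f\) has a Fourier representation with
\[
C_f \;:=\; \int_{\R^d} \|\omega\|\,|\hat f(\omega)|\,d\omega \;<\; \infty,
\]
then for every \(k\) there is a 2-layer network \(f_k(x) = \sum_{j=1}^{k} a_j\,\sigma(w_j^{\top} x + b_j)\) (plus a constant), \(\sigma\) sigmoidal, with
\[
\int_{B_r} (f - f_k)^2\,d\mu \;\le\; \frac{(2r\,C_f)^2}{k},
\]
with no explicit dependence on \(d\). For the RL example, the quantity to bound would be \(C_{V^\pi}\), where for a discounted MDP \(V^\pi = (I - \gamma P^\pi)^{-1} r^\pi\); presumably this requires smoothness of \(P^\pi\) and \(r^\pi\) with respect to the assumed geometry on the state space.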
Convolution: what's the analogue here? It's tricky because functions form an infinite-dimensional space, and nonlinear maps on functions are awkward to work with.
Something about looking at \(\ep^{-\lg |S|}\) nodes, depth \(-\lg \ep\)?
Would be nice to study niceness of the value function and the optimal policy in belief space.
Barron is a stronger condition than Lipschitz: it says there aren't many directions of variation.
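To make that quantitative (rates as I recall them, hedged): small Barron norm gives \(k\)-unit error \(\lesssim C_f/\sqrt{k}\), dimension-free, whereas over the unit Lipschitz ball on \([0,1]^d\) the best \(k\)-unit rate degrades roughly like \(k^{-1/d}\) (curse of dimensionality). So, as I understand it, the Lipschitz ball cannot have uniformly bounded Barron norm, consistent with reading \(C_f = \int \|\omega\|\,|\hat f(\omega)|\,d\omega\) as a "few directions / low frequencies of variation" condition rather than a mere gradient bound.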