Musings on Barron's Theorem

Posted: 2016-12-22, Modified: 2017-03-15

Tags: neural net

Directions 12/22

Generalization bounds

229T p. 198

Let \(h\) be a 1-Lipschitz sigmoid function.

\[\begin{align} f_{w,\alpha}(x) &=\sum_{j=1}^m \alpha_j h(w_j\cdot x)\\ \mathcal F &= \left\{ f_{w,\alpha} : \|\alpha\|_2\le B_2',\ \|w_j\|_2\le B_2 \ \forall j\in[m] \right\}\\ R_n(\mathcal F)&\le \frac{2B_2B_2'C_2\sqrt m}{\sqrt n} \end{align}\]
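Rough sketch of where this comes from (I'm assuming \(C_2\) bounds the input norm, \(\|x\|_2\le C_2\), and I'm not tracking the factor of 2, which depends on which version of the contraction lemma one uses): the linear class has Rademacher complexity at most \(B_2C_2/\sqrt n\), composing with the 1-Lipschitz \(h\) preserves this up to a constant, and taking combinations with \(\|\alpha\|_1\le\sqrt m\,\|\alpha\|_2\le\sqrt m\,B_2'\) scales it by that amount:

\[\begin{align} R_n(\{x\mapsto w\cdot x : \|w\|_2\le B_2\}) &\le \frac{B_2C_2}{\sqrt n} && \text{(linear class)}\\ R_n(\{x\mapsto h(w\cdot x) : \|w\|_2\le B_2\}) &\lesssim \frac{B_2C_2}{\sqrt n} && \text{(Lipschitz contraction)}\\ R_n(\mathcal F) &\lesssim \|\alpha\|_1\cdot \frac{B_2C_2}{\sqrt n} \le \frac{B_2B_2'C_2\sqrt m}{\sqrt n} && (\|\alpha\|_1\le\sqrt m\,\|\alpha\|_2) \end{align}\]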

The composition properties of Rademacher complexity align nicely with the layer-by-layer compositionality of neural networks.
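As a sanity check, here is a toy numerical sketch (my own illustration, not from the notes): it lower-bounds the empirical Rademacher complexity of \(\mathcal F\) by random search over the weight constraint set, using the fact that the supremum over \(\alpha\) and the \(w_j\) decouples, and compares it to the bound above. All sizes and norm bounds are arbitrary choices, and \(C_2\) is again assumed to bound \(\|x\|_2\).

```python
import numpy as np

# Crude Monte Carlo check of the Rademacher bound above (an illustration, not a proof).
# We lower-bound the empirical Rademacher complexity by random search over the weight
# constraint set and compare it to the upper bound 2*B2*B2p*C2*sqrt(m)/sqrt(n).
# Names (B2, B2p, C2, m, n) follow the notation in the post; values are arbitrary.

rng = np.random.default_rng(0)

n, d, m = 200, 5, 10          # samples, input dimension, hidden units
B2, B2p = 1.0, 1.0            # norm bounds on w_j and alpha
C2 = 1.0                      # assumed bound on ||x||_2
h = np.tanh                   # a 1-Lipschitz sigmoid-like activation

# Inputs on the sphere of radius C2, so ||x||_2 <= C2 holds exactly.
X = rng.standard_normal((n, d))
X = C2 * X / np.linalg.norm(X, axis=1, keepdims=True)

def sup_single_unit(sigma, num_tries=2000):
    """Random-search estimate of sup_{||w||_2 <= B2} |(1/n) sum_i sigma_i h(w.x_i)|."""
    W = rng.standard_normal((num_tries, d))
    W = B2 * W / np.linalg.norm(W, axis=1, keepdims=True)
    vals = h(X @ W.T).T @ sigma / n      # shape (num_tries,)
    return np.abs(vals).max()

# For this class the supremum decouples:
#   sup_{w, alpha} = B2p * sqrt(m) * sup_{||w||_2 <= B2} |(1/n) sum_i sigma_i h(w.x_i)|
est = np.mean([B2p * np.sqrt(m) * sup_single_unit(rng.choice([-1.0, 1.0], size=n))
               for _ in range(20)])

bound = 2 * B2 * B2p * C2 * np.sqrt(m) / np.sqrt(n)
print(f"random-search estimate ~ {est:.3f}, theoretical upper bound = {bound:.3f}")
```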

Problems

[Andoni, Panigrahy, Valiant, Zhang] Neural networks approximate bounded-degree polynomials.
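A toy sketch of the phenomenon (this is not the algorithm from the paper; it is just full-batch gradient descent on a small one-hidden-layer tanh network fitting a fixed degree-2 polynomial, with arbitrary sizes, step size, and target):

```python
import numpy as np

# Toy illustration: train a one-hidden-layer tanh network by gradient descent
# to fit a fixed degree-2 polynomial of the inputs. Hyperparameters are arbitrary.

rng = np.random.default_rng(1)

n, d, m = 500, 3, 50
X = rng.standard_normal((n, d))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2] ** 2 - X[:, 0]   # target: a degree-2 polynomial

# Parameters of f(x) = sum_j alpha_j * tanh(w_j . x + b_j)
W = rng.standard_normal((m, d)) / np.sqrt(d)
b = np.zeros(m)
alpha = np.zeros(m)

lr = 0.05
for step in range(5000):
    Z = X @ W.T + b            # (n, m) pre-activations
    H = np.tanh(Z)             # (n, m) hidden activations
    pred = H @ alpha
    err = pred - y             # (n,)

    # Gradients of the mean squared error 0.5/n * ||pred - y||^2
    g_alpha = H.T @ err / n
    g_H = np.outer(err, alpha) / n      # (n, m)
    g_Z = g_H * (1 - H ** 2)            # tanh'
    g_W = g_Z.T @ X
    g_b = g_Z.sum(axis=0)

    alpha -= lr * g_alpha
    W -= lr * g_W
    b -= lr * g_b

    if step % 1000 == 0:
        print(f"step {step:5d}  mse {np.mean(err ** 2):.4f}")
```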

Kernel methods

From email

Some things:

Conclusion:

Other musings: