(A16) Provable algorithms for inference in topic models

Posted: 2016-04-04 , Modified: 2016-04-04

Tags: topic models

Model

Definitions

Define

Note \[\la_\de\le \la_0=\la \le \ka.\] (To see this, take \(x=By\).)

Model

Algorithm

Given \(y\sim Ax\) and \(A\),

Analysis

Thresholded linear inverse algorithm

We have \[ \E \wh x_i =x_j^* + \sumo jk ((BA)_{ij} - \de_{i,j}) x_j^*.\] This is why it’s natural to consider the \(\de\)-biased inverse: we don’t need \(B=A^+\) exactly, we can relax this to each \((BA)_{ij} - \de_{i,j}\) being small. Now use Bernstein’s inequality to get concentration on the order of \(\tau\). Finally use union bound.

MLE estimate

If

then with high probability \[\begin{align} \ve{Ax_{MLE}-Ax^*}_1 &\le \wt O\pa{\sfc rn}\\ \ve{x_{MLE}-x^*}_1 &\le \wt O\pa{\ol \ka \sfc rn} \end{align}\]

The proof is like the proof of asymptotic normality of MLE, but with matrix concentration to get a finite sample bound.

Sample complexity lower bounds

These exist!