Weekly summary 2016-08-06

Posted: 2016-08-06, Modified: 2016-08-06

Tags: none

Representation learning

In dictionary learning, we assume we have samples \(y = Ax + e\) where \(x\) comes from a sparse distribution (e.g. the \(x_i\) are independent, each \(x_i\neq 0\) with probability \(s/n\), and the nonzero entries are drawn from some distribution not concentrated at 0) and \(e\) is error (e.g. Gaussian).
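For concreteness, here is a minimal sketch of this generative model in Python/NumPy; the dimensions \(d, n\), the sparsity \(s\), the \(\pm 1\) distribution for the nonzero entries, and the noise level are illustrative choices, not taken from the setup above.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, s = 20, 50, 5                            # observation dim, code dim, expected sparsity
A = rng.standard_normal((d, n)) / np.sqrt(d)   # dictionary; columns are the atoms

def sample(sigma=0.01):
    # each x_i is nonzero independently with probability s/n,
    # and the nonzero entries come from a distribution not concentrated at 0
    support = rng.random(n) < s / n
    x = support * rng.choice([-1.0, 1.0], size=n)
    e = sigma * rng.standard_normal(d)          # Gaussian error
    return A @ x + e, x

y, x = sample()
```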

The way we stated our problem is that \(x\cdot a_i\) is large for only a few \(i\). This is similar to dictionary learning with dictionary \((A^+)^T\), where the columns of \(A\) are the \(a_i\): the vector \(z = A^Tx\), whose entries are the \(x\cdot a_i\), is sparse, and when the \(a_i\) span the space we have \(x = (A^+)^Tz\), so \(x\) has a sparse representation in the dictionary \((A^+)^T\). (I.e., the \(x\)'s here are really the \(y\)'s in DL.)
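A quick NumPy check of this correspondence (a sketch with arbitrary dimensions, taken overcomplete so that the \(a_i\) span the space):

```python
import numpy as np

rng = np.random.default_rng(1)

d, m = 10, 30                       # ambient dimension, number of a_i (overcomplete: m > d)
A = rng.standard_normal((d, m))     # columns are the a_i

x = rng.standard_normal(d)
z = A.T @ x                         # z_i = x . a_i; in our problem, large for only a few i

# dictionary-learning view: x is the dictionary (A^+)^T applied to the code z
x_back = np.linalg.pinv(A).T @ z    # (A^+)^T z = (A^T)^+ A^T x

print(np.allclose(x, x_back))       # True when the a_i span R^d
```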

I may be wrong, but I think that what's different is that…

(Actually, I think the undercomplete case, where the number of \(a_i\) is less than the dimension \(n\), doesn't quite correspond to DL because the map \(x\mapsto (x\cdot a_i)_i\) is not invertible…)
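To see the non-invertibility concretely, here is a small sketch (again with made-up dimensions) showing that with fewer \(a_i\) than the dimension, distinct \(x\)'s can give the same values \((x\cdot a_i)_i\):

```python
import numpy as np

rng = np.random.default_rng(2)

d, m = 10, 4                              # undercomplete: fewer a_i than the dimension d
A = rng.standard_normal((d, m))           # columns are the a_i

x = rng.standard_normal(d)
# any vector in the null space of A^T is invisible to the map x -> (x . a_i)_i
null_basis = np.linalg.svd(A.T)[2][m:]    # rows spanning the null space of A^T
x2 = x + null_basis[0]

print(np.allclose(A.T @ x, A.T @ x2))     # True: same (x . a_i)_i
print(np.allclose(x, x2))                 # False: different x
```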