(SV16) The mixing time of the Dikin walk in a polytope

Posted: 2016-09-07 , Modified: 2016-09-07

Dikin walk

\(K\)

\(a_i^Tx\ge b\)

\[\begin{align} F(x) :&= -\sum_{i\in [m]} \ln (a_i^Tx-b_i)\\ H(x) = \nb^2F(x) &= \sum_{i\in [m]} \rc{(a_i^Tx-b_i)^2}a_ia_i^T\\ \ve{v}_x^2 :&= v^TH(x)v\\ \text{Dikin ellipsoid }\mathcal E_x:&=\set{z}{\ve{z-x}_x\le 1}. \end{align}\]

Consider the random variable \(g_x\) given by \[ z = x + \fc{r}{\sqrt n} H(x)^{-\rc 2}g,\quad g\sim N(0,1),\] i.e. it is a Gaussian with covariance matrix \(\fc{r^2}{n}H(x)^{-1}\) centered at \(x\).

Let \(p_x\) be the \(g_x\) with the Metropolis filter applied, \[ p_x(z) = \min\{g_x(z),g_z(x)\}\] (stay at \(x\) with remaining probability).

Theorem and Proof

Define the cross-ratio distance \[\si(x,y) = \fc{|\ol{xy}||\ol{pq}|}{|\ol{px}||\ol{qy}|},\] where \(xy\) hits the boundary of \(K\) at \(p,q\) (\(pxyq\) are in that order). \(\ln (1+\si(x,y))\) is a metric on \(K\).

Theorem 1: Given a reversible random walk in \(K\) with stationary distribution uniform, suppose \(\exists \De>0\) such that for all \(x,y\in K\) with \(\si(x,y)\le \De\), \(\ve{p_x-p_y}_1\le 1-\Om(1)\). Then after \(O(\De^{-2})\), the lazy version of the walk from a warm start is within \(\rc4\) TV distance from the uniform distribution.

(This is saying something about uniform continuity of \(x\mapsto p_x\). There’s a “smoothing out” effect.)

Proof of main theorem.

Theorem 2: For the Gaussian Dikin walk with \(r\le \fc{\ep}{400}\pa{\ln \pf{200}{\ep}}^{-\fc 32}\), for \(x,y\in K\) with \(\ve{x-y}_x\le \fc{r}{\sqrt n}\), \(\ve{p_x-p_y}\le \ep\).

Lemma 3: Let \(r\le 1, c\ge 0, c\le \min\{r,\rc 3\}\), \(x,y\in K\). If \(\ve{x-y}_x\le \fc{c}{\sqrt n}\), then \(\ve{g_x-g_y}_1\le 3c\). (Proof: KL divergence between Gaussians, Pinsker’s inequality.)
Lemma 4: Given \(\ep\in [0,\rc 2]\), for \(r\le \fc{\ep}{100}(\ln \fc{50}{\ep})^{-\fc 32}\), we have \(\ve{p_x-g_x}_1\le \ep\). (Proof: Hypercontractivity)
Use triangle inequality.

For any polytope \(K\) described using \(m\) inequalities, \(\si(x,y) \ge \rc{\sqrt m}\ve{x-y}_x\).

Use Theorem 1. \(\si(x,y)\le \De\) implies \(\ve{x-y}_x\le De\sqrt m\). Take \(\De =O\prc{\sqrt{mn}}\).

(SV16) The mixing time of the Dikin walk in a polytope - a simple proof

Dikin walk

Theorem and Proof