(R16) How to calculate partition functions using convex programming hierarchies - provable bounds for variational methods

Posted: 2016-12-28 , Modified: 2016-12-28

Tags: dynamical systems, quasi-convexity

Introduction

Two approaches for calculating partition functions

  1. Markov chains to sample
    • (Jerrum04, JerrumSinclair1993) Certain Markov chains mix rapidly; used to approximate permanent with nonnegative entries and partition function for ferromagnetic Ising.
  2. Variational methods: partition function is the solution of a (intractable) optimization problem over the polytope of valid distributions.
    • Popular, faster, easier to parallelize
    • Little theoretical understanding
    • Another algorithm: Belief propagation (solving non-convex relaxation)
      • Regime of correlation decay and locally tree-like graphs.

Use Sherali-Adams and Lasserre convex programming hierarchies to get the first provable, convex variational methods. They work because local correlations propagate to global correlations, the opposite of correlation decay.

(See hierarchies.)

Definitions and setting

Ising model \(p(x)\propto \exp\pa{\sum_{i,j} J_{i,j}x_ix_j}\), \(x\in \{-1,1\}^n\) is \(\De\)-dense if letting \(J_T=\sum_{i,j}|J_{i,j}|\), \[ \forall i,j\in [n], \De |J_{i,j}|\le \fc{J_T}{n^2}, \] (Ex. if the \(J_{i,j}\) is 1 for an edge and 0 for a non-edge, and the graph has density \(cn^2\), then \(\De = \rc{c}\).)

\(p\) is regular if \(\sum_j |J_{i,j}|=J'\). The adjacency matrix is \(A_{i,j} = \fc{|J_{i,j}|}{J'}\). Let \(\rank_\tau(A)\) be the number of eigenvalues with \(\ad\ge\tau\).

Why dense Ising models?

Result

Method

Dense graphs, SA

Low threshold rank, Lasserre

Interpretation