LDCs

Posted: 2016-03-03 , Modified: 2016-03-14

Tags: LDC

Previous work

Lower bounds

Unless otherwise specified, all bounds are for nonadaptive linear LCC’s/LDC’s. What about adaptive codes?

LCC’s

LDC’s

Let \(\de,\ep\in (0,\rc2]\), \(q\ge 2\).

Upper bounds (construction)

Families

Constant query

A matching vector family in \((\Z/(p-1))^m\) of size \(n\) gives a \((p-1)\)-query LDC \((\Z/(p-1))^m \to (\Z/(p-1))^m\). There is also a trick to reduce the number of queries (e.g. to 3).

Construction: Given \((v_i)\), the encoding is \[(a_i)\mapsto f=\sum_i a_j x^{v_j}.\] (\(x\) is a vector in \(\F_p^{\times n}\).) Recovery works by taking a primitive \(g\) and summing \(f(g)+\cdots + f(g^{p-1}) = a_i\) by orthogonality of characters.

Motivation: Ordinary Reed-Muller codes are \[(a_d)\mapsto \sum a_d x^d.\] We modify by replacing \(x\) by an element in \(\F_p^{\times}\).

More queries

Basic definitions

LDC’s

A \((q,\ep,\de)\), \(k\to n\) LDC consists of

Let \(\mu_i(x)=\EE_{t\sim M_i} D_{i,t}(x)\), \(\mu(x)=(\mu_1(x),\ldots, \mu_k(x))\).

Can we assume WLOG there is perfect decoding of codewords? \(D_i(C(x))=x_i\).

If we can, then \(\{-1,1\}^k\subeq \im \mu\). Otherwise, we still have: for each \(x\in \{-1,1\}\), there is \(x'\) with \(\sgn(x')=\sgn(x)\) such that \(|x_i'|\ge \rc2 + \ep\).5 (If there is \(\mu\) such that this is true, does that mean we can make a LDC out of it? Yes—send the messages to the preimages of those points.)

The picture to have in mind: \[\{\pm 1\}^k \xra{C} \{\pm 1\}^n \xra{\text{noise}} \{\pm 1\}^n \xra{\mu} [-1,1]^k\] (\(\mu\) is the average decoding according to \(D\).)

Notation

Lower bound

It is necessary for \(\im \mu\) to have a large \(\ep\)-net. Otherwise it can’t contin \(\{-1,1\}^k\). Thus we can show lower bounds by showing any \(\im\mu\) has a small \(\ep\)-net. Think of this as a communication complexity problem: Alice wants to communicate \(\mu(y)\) to Bob with \(\ep\) accuracy without sending many bits. (Be careful about \(\ep\) vs. \(\eph\) here.)

Lemma: If there is an \(\ep\)-net of size \(s\), then \(\ln s \ge \pa{1-H\pa{\eph}}k\). (Generalization \(p\)-norm: \(\ge \pa{1-H\pf{\ep^p}2}\).)

Thus if \(s= n^{O_\ep(1)}\) then we get an exponential lower bound.

Toy problem

A random matching should not work. Can we generalize to a larger class of mappings that don’t work? Yes, if we have the property, say, that every set of \(\Om(\ln n)\) of the matchings form an expander, because then the average of \(\mu_i\) for \(d=C\ln n\) indices is concentrated around 0, and even stronger than not having an \(\ep\)-net, the larged \(d\) of them has to be on the order of \(\rc{\sqrt{d}}\), and can’t be close to 1.

Abelian group

Consider matchings \((x,x+y)\) where \(y\) ranges over the whole group. For \(q\)-query, consider \((x,x+y,\ldots, x+(q-1)y)\).

Is this reasonable? In the MVC construction, we query \((h,hg,\ldots, hg^{(p-2)})\) (i.e., fixing a generator \(g\), this is a line \(g^{w+tv}\)). So the \(p-1\)-query MVC is an example of such a construction for \(q=p-1\). Does a bound here for arbitrary \(q\) give a bound for MVC’s? (But it can’t because doesn’t MVC achieve \(2^{n^{o(1)}}\)

Theorem (Approximate Caratheodory): For \(2\le p<\iy\), given \((x_i)_{i=1}^m\in \R^d\), \(\ep>0\), \(\ga = \max_{i\in[n]}\ve{x_i}_p\), for every \(\mu\in \conv\{x_1,\ldots, x_n\}\) there exists \(i_1,\ldots, i_s,s\le \fc{4p\ga^2}{\ep^2}\) such that \[\ve{\mu-\rc{m}\sum_{i\in S} x_i}_p\le \ep.\]

For 2-query LDC, let \(M_y\) be a group matching, \(\set{(x,x+y)}{y\in G}\), \(\La_{M_y}(f,g) = f*g\).6 Now \(f*g\in \conv\{\chi_a\}\) so Theorem gives that it is \(\ep\)-approximated in \(L_2\) by \(O\prc{\ep^2}\) characters. The \(\ep\)-net is of size \(O\pf{\log n}{\ep^2}\), which gives an exponential bound.

3-query LCC’s:

This doesn’t work for 3-query LDC’s. (It can’t because of MVC’s.) The decoding functions are no longer given by all \(y\in G\), but only \(y\in S\sub G\) where \(|S|=k\). The \(L_1\) norm (normalized) blows up by \(\fc nk\). We have to take \(\ep\leftarrow \ep \fc{k}{n}\), which can make \(c(\ep,d)\) a LOT larger.

Question: Generalize this to

Matrices

Upper bound

Let \[\begin{align} \La_D(f_1,\ldots, f_q)&= \sum_{d\in D}\sum_{g\in G} f_1(g)\cdots f_q(g+(q-1)d)\\ \ve{A} &= \sup\set{\fc{A(f_1,\ldots, f_q)}{\prod_i\ve{f_i}_{\ell_q}}}{f_i\in \R^n\bs 0}\\ \la(D) &= \ve{\rc{|D|}\La_D-\rc{|G|}\La_G}\\ \ka(\ep) &= \max\set{k\in \N}{\E[\la(D)]\le \ep}. \end{align}\]

(Use the \(q\)-norm for \(q\)-linear forms because scaling is good?)

Theorem 3.3: For all \(\ep>0\) there exists \(\al,\be\in(0,1]\), \(C:\{-1,1\}^k\to \{-1,1\}^n\) (where \(k=\al \ka(\ep)\), \(n=q|G|\)), \(M_i\) matchings of size \(\fc nq\), such that for all \(x\in \{-1,1\}^k\), \[\EE_{i\in [k],(j_1,\ldots,j_q)\in M_i}[C(x)_{j_1}\cdots C(x)_{j_q}=x_i] \ge\be,\] i.e.,there is a average case LDC with \((r,\fc{\be}2, q)\).7

Proof: Let \(K=\ka(\ep)\) and \(D=(d_1,\ldots, d_K)\sub G\) be a random multiset. Let \[A_i=\rc{K} \La_{d_i} - \rc{K|G|}\La_G.\] Then \[\begin{align} \ep&\le \EE_D\ba{\ve{\rc K\La_D-\rc{|G|}\La_G}}\\ &=\EE_{\mathcal A} \ba{\ve{\sumo ik \pa{A_i-\EE_{\mathcal A'}A_i'}}}\\ &\le \EE_{\mathcal A, \mathcal A'}\ba{\ve{ \sumo ik (A_i-A_i')}} & \text{Jensen}\\ &=\EE_{x_i}\EE_{\mathcal A,\mathcal A'} \ba{\ve{\sumo ik x_i(A_i-A_i')}}& \text{symmetrization}\\ &\le 2 \EE_{x_i,\mathcal A} \ba{\ve{\sumo ik x_iA_i}} &\text{Jensen}\\ &\le \fc 2K \EE_{D,D',x_i} \ba{\ve{\sumo ik \La_{d_i} - \La_{d_i'}}}&\text{symmetrization}\\ &\le \fc 4K \EE_{D,x_i} \ve{\sumo ik x_i\La_{d_i}}& \text{Jensen}\\ \implies \exists D^*, \quad \rc{K}\ve{\sumo ik x_i \La_{d_i^*}}&\ge \fc \ep4. \end{align}\] By Elton’s theorem there exists \(k\ge \fc{cK\ep^2}{16}\), \(E\subeq D^*\) of size \(k\), for all \(x\in \{-1,1\}^E\), \[\rc{k}\ve{\sum_{d\in E} \sum x_d \La_{d}}\ge \fc{c\ep}{4}.\] For each \(x\) there exists \(f^x\), \(\ve{f^x}=1\) with \[\rc k \ve{\sum_{d\in E} x_d \La_d(f_1^x,\ldots, f_q^x)} \ge \fc{c\ep}4.\] We define the LDC by \[\begin{align} C:\{-1,1\}^E &\to (\R^q)^G\\ C(x) &=(f_1^x,\ldots, f_q^x). \end{align}\] Then \(\La\) exactly gives the correlation of the decoded bit: \[\begin{align} \La_d((f_i^x)_i) &= \sum_{g\in G}\prod_i f_i^x(g+(i-1)d)\\ &=\sum_{(j_i)\in M_d} \prod_i C(x)_{j_i}. \end{align}\]

Defining \(C'(x)=\fc{C(x)}e\) so that \(\ve{C'(x)}\le \fc{q^{\rc q}}e\le 1\), we get \[ \rc k \sum_{d\in E} \sum_{(j_i)\in M_d} x_d C'(x)_{j_1}\cdots C'(x)_{j_q}\ge \fc{c\ep}{4e^q}, \] the theorem with \(\al=\fc{c\ep^2}{16}\) and \(\be = \fc{c\ep}{4e^q}\). \(\square\)

Note there is a quadratic loss in going from average-case LDC to LDC (using Chernoff).

Scraps

Smooth decoding: For \(C:\{\pm 1\}^k\to (B^n)^N\) a \((q,\de,\ep)\)-LDC, for each \(i\in [k]\) there exists a probability distribution \(\mu_i\) over \([N]^q\) and functions \(f_S^i,S\in [N]^q\),

Directions

3/10

Talk with Zeev 3/2

Consider a partition of the complete bipartite graph into matchings. What is the worst partition, in the sense that you have to take the union of many (\(\Om(\ln n)\)) matchings before you get an expander in the sense of the Expander Mixing Lemma?

Now consider if the bipartite graph represents a Cayley graph (\(x\) on the left is connected to \(gx\) on the right). “Abelian groups are the worst expanders.”

If you need \(\ge d\) matchings, you get a LDC from \(d\) to \(n\).

Consider tripartite hypergraphs, with \(n^3\) matchings partitioned into \(n^2\). Is the worst partition take \((\ln n)^2\)?

Low-weight LDCs to LDCs

To add

Open questions


  1. Is this true for just linear codes or any codes?

  2. Over other fields, the decoding functions are \(\prod \ze_p^{k_iy_i}\) where \(y_i\in \Z/p\).

  3. We can assume exactly \(q\) queries because we can add a few bits that are always 1. What about if we want to negate? Will this ever be the case? In any case we can cover this by adding \(\pm\).

  4. Why do these have to be matchings? The fact they are matchings means it’s an LDC with parameters \(\de n q\) for error \(\de q\).

  5. This doesn’t make much of a difference for the \(\ep\)-net argument? So for simplicity we just assume \(\{-1,1\}^k\sub \im C\)?

  6. (Do we just need \(f*f\)?)

  7. Here by \((q,\ep,\de)\)-average case LDC, the parameter \(\de\) means each entry is queried with probability \(\le \fc{\de}{n}\). In my definition of LDC, it’s a \((r,\fc{\be}2-\de q, \de)\)-LDC.