Lion tracks in Wolverine territory

Anna Gilbert and Martin Strauss are experts in many aspects of mathematics and its relationship to fundamental questions of computer science. They are professors in the Department of Mathematics at the University of Michigan, and also have joint appointments within the Department of Electrical Engineering and Computer Science. They are also the joint hosts of the Coding, Complexity, and Sparsity Workshop on the University of Michigan campus, which we are currently enjoying. The workshop was also organized by Muthu Muthukrishnan, Ely Porat, and Ken’s Buffalo colleagues Hung Ngo and Atri Rudra, and sponsored by the NSF and the two University of Michigan departments.

Today we wish to convey a little of the flavor of the workshop, and discuss two talks in greater detail.

Anna and Martin are joint in others ways: they are a married couple, have four joint patents, and two children. They were colleagues at AT&T Laboratories for most of 1997–2004, following PhD’s at Princeton and Rutgers, respectively. Martin wrote a thesis on resource-bounded measure theory under Eric Allender in 1995, and then did a postdoc with that subfield’s progenitor, Jack Lutz, at Iowa State. That’s how I (Ken) came to work with Martin—in Germany where Jack co-organized a Dagstuhl workshop on this subject. Our discussions expanded work I’d started with my own student, D. Sivakumar, into a long project motivated to clarify the relationship between ${\mathsf{BPP}}$ and ${\mathsf{EXP}}$ which was joined by Harry Buhrman and Dieter van Melkebeek.

Then Central New Jersey called him back to its great tradition of applied mathematics related to signal processing, and Anna joined in after her doctoral work on methods for making differential equations more tractable. We did not “lose” a complexity theorist, rather complexity joined up with this area for important advances—hence the theme of the workshop. But let us take this tradition back even further, not 50 years but 350 years, to Cambridge, England rather than central Jersey, when Isaac Newton was developing tools for applied mathematics. There is a story, related in Jeremy Gray’s prefatory essay on the Millennium Prize Problems, that Johann Jakob Bernoulli recognized an anonymous correct answer to his Brachistochrone Problem as coming from Newton with the words:

I recognize the lion by his claws.

Let’s see what lion trail Newton left in a few of the talks.

## Newton Act I

Newton is famous for the creation of calculus, among too many other things to list here. Given a function ${f(x)}$ it is possible to define the derivative of ${f}$ by using limits—of course Newton used “infinitesimals” and “fluxions” which are geometric, intuitive, but not rigorous. However, they were later made rigorous by Augustin Cauchy, who introduced the now infamous ${\epsilon}$ and ${\delta}$ method. Some thank him, while others…

But what about derivatives for functions in finite fields? There is no obvious way to define a limiting process in a finite field, but there is a way to define derivatives. The key is to assume that the function is a polynomial:

$\displaystyle f(x) = a_nx^n + \cdots + a_1x + a_0.$

Then it has the derivative

$\displaystyle na_nx^{n-1} + \cdots + a_1.$

Note this is nothing more than using the rule that the derivative of ${x^k}$ is ${kx^{k-1}}$ and then extending the definition by linearity. If it helps just think of the derivative as the “formal” derivative. It makes sense for any polynomial, and has many interesting properties. Of course since the field is a finite field many theorems from the real-numbers case now fail. For example, there is a non-trivial polynomial ${f(x)}$ so that its derivative is zero:

$\displaystyle f(x) = x^p.$

Here we are working over a finite field with characteristic ${p}$. So while the derivative makes sense one must be a bit careful in using it.

Partial derivatives can even be extended to multiple-variable polynomials. The key is that if ${f(x,y)}$ is a polynomial then we define:

$\displaystyle \begin{array}{rcl} \partial_x (x^iy^j) &=& ix^{i-1}y^j \\ \partial_y (x^iy^j) &=& jx^iy^{j-1} \end{array}$

Of course these derivatives do not have the same geometric properties as those over the reals, but they do have some key properties. These properties allow the construction of local codes.

Swastik Kopparty gave a talk titled “Multiplicity Codes” on locally-decodable codes. In local decoding, one does not need to output the entire plaintext ${x \in \{0,1\}^k}$, but must output any single bit ${x_j}$ on demand. The code is locally decodable if this can be done in ${o(k)}$ time. His joint paper with Shubhangi Saraf and Sergey Yekhanin is the first to find codes with high rate and sharply fast local decoding:

Theorem 1 For all ${R < 1}$ and all ${\epsilon > 0}$, and all ${k}$, one can construct codes ${C}$ of rate ${R}$ that correct ${(1-R)^2\epsilon^2}$ errors, with local decoding in time ${O(k^{\epsilon})}$.

See his draft paper and older joint work for the details, but you may have guessed they make essential use of derivatives in finite fields. Essentially information is encoded by using a polynomial over ${x,y}$ and its two partial derivatives: one with respect to ${x}$ and the other with repect to ${y}$.

## Newton Act II

Newton also worked on symmetric functions. He discovered earlier work of Albert Girard that are now known as the Newton-Girard identities. The identities connect the elementary symmetric polynomials with all symmetric polynomials. The elementary ones are:

$\displaystyle \begin{array}{rcl} S_1(x_1,\dots,x_n) &=& x_1 + \cdots + x_n \\ S_2(x_1,\dots,x_n) &=& x_1x_2 + \cdots + x_{n-1}x_n, \end{array}$

and in general ${S_k(x_1,\dots,x_n)}$ is the sum over all products of ${k}$ distinct variables from ${x_1,\dots,x_n}$.

The power symmetric functions ${P_k(x_1,\dots,x_n)}$ are equal to

$\displaystyle x_1^i + x_2^i + \cdots + x_n^i.$

The identities show how compute the power symmetric functions from the elementary ones.

Raghu Meka presented new results on pseudo-random number generators, in joint work with Parikshit Gopalan, Omer Reingold, Luca Trevisan, and Salil Vadhan. The details are quite technical, and will appear in their paper which is accepted to this year’s FOCS conference.

One of their lemmas, however, caught our special attention:

Lemma 2 Suppose that

$\displaystyle \begin{array}{rcl} |S_1(x_1,\dots x_n)| &\le& \mu/2 \\ |S_2(x_1,\dots x_n)| &\le& \mu^2/2 \\ \end{array}$

then it follows for all ${k \geq 3}$ that

$\displaystyle S_k(x_1,\dots x_n) \le \mu^k.$

This is a quite neat inequality that may have many other applications.

## Newton Act III?

Newton is of course famous for his iterative method for finding solutions of equations to high numerical precision. This was part of Dick’s talk on linear-time computation along lines of his post on the Hartmanis-Stearns Conjecture. But it has general application in compressive sensing in particular, for example these papers by Emanuel Candés, and by Justin Romberg and Salman Asif at Georgia Tech. The question is what other ramifications this may have for computational complexity.

## Open Problems

There was an Open Problems session at the end today. We presented problems from the end of the above-linked Hartmanis-Stearns post, this post on quantum circuits, and this one on “lazy” problem-solving. Andrew McGregor presented one on unweighted data streams. Two problems at the end connect to things we’ve talked about, and can be stated here:

• Mahdi Cheraghchi: Consider circuits ${C_n}$ computing a linear operator ${A x}$ with ${A \in F^{n \times n}}$, where ${F}$ is the finite field ${F_q}$, each of whose gates computes a fixed linear combination of its inputs. It is trivial to achieve ${|C_n| = O(n^2)}$, and many important cases such as the Fourier transform can be done in ${O(n\log n)}$ size. Can we compute more functions in sub-quadratic size if we allow gates to compute over extension fields ${F_Q}$, ${Q = q^r}$?

If ${r = O(1)}$ there is no asymptotic difference, but the problem is interesting for unbounded ${r}$. What if we allow the trace operator ${tr_{Q:q}}$ as a primitive operation?

• Swastik Kopparty: Can we achieve a two-variable extension of the Berlekamp-Welch algorithm for Reed-Solomon decoding? Namely given ${S \subset F_q}$ and ${h: S \times S \longrightarrow F_q}$, can we find a polynomial ${p(x,y)}$ of a specified total degree ${d}$ such that

$\displaystyle |\{(x,y): p(x,y) \neq h(x,y)\}| \leq \frac{1}{2}|S|(|S|-d)?$

We thank Martin and Anna and the other organizers for their work and invitation.

[fixed j superscripts; needed to add absolute value bars to Lemma 2]