Comments on three papers from the Conference on Computational Complexity

Michael Saks is Chair of the Program Committee of this year’s Conference on Computational Complexity (CCC). He was helped by Paul Beame, Lance Fortnow, Elena Grigorescu, Yuval Ishai, Shachar Lovett, Alexander Sherstov, Srikanth Srinivasan, Madhur Tulsiani, Ryan Williams, and Ronald de Wolf. I have no doubt that they faced many difficult decisions, and surely some worthy papers could not be included. The program committee’s work does not completely end once the list of accepted papers is posted, but it is not too early to thank them all for their hard work in putting together a terrific program.

Today I wish to highlight three papers from the list of accepted ones.

The biggest danger in selecting three is that I will disappoint the authors of the remaining ${29-3}$ papers. Of course that assumes that they care what we say at GLL, so perhaps there is really no danger. In any event I selected the papers because they caught my immediate fancy. I plan on discussing others later on, but let’s just look at these now. Two concern favorite problems of mine, and the third is an amazing result—in my opinion.

Of course as a PC member, Mike was not allowed to submit a paper, so it follows that he is not an author of any of the CCC papers we mention. But one of his recent papers, with Ankur Moitra of Princeton IAS, appeals to me as an example of learning from the past. Their paper analyzes an old algorithm for learning a distribution from random samples to show that it runs in polynomial time, and beefs it up to apply in more-general cases. Their proof uses linear programming duality and analyzes invertible matrices that have bad condition numbers, something Ken and I have been thinking about in connection with the Jacobian Conjecture and other matters. Their paper appeared at FOCS 2013.

## Something New About Kolmogorov Complexity

${\bullet}$ Linear list-approximation for short programs (or the power of a few random bits) by Bruno Bauwens and Marius Zimand.

They study the Kolmogorov complexity of ${x}$—the length ${C(x)}$ of the shortest program that generates ${x}$. One of the great “paradoxes” of Kolmogorov complexity is that while most strings have ${C(x)}$ close to the length of ${x}$, the function ${C(x)}$ is not computable. Even worse, for any fixed formal theory there is a constant ${k}$ such that the theory cannot prove ${C(x) \ge k}$ for any particular ${x}$, even though the statement is true for almost all strings. See here for more details.
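The counting behind the first half of that paradox is worth making concrete. Here is a minimal sketch (the standard counting argument, not anything from the paper): there are simply far fewer short programs than strings, so most strings cannot be compressed by more than a few bits.

```python
# Standard counting argument: there are 2^n binary strings of length n,
# but fewer than 2^(n-c) programs of length < n - c, so fewer than a
# 2^(-c) fraction of strings can satisfy C(x) < n - c.

def compressible_fraction_bound(n: int, c: int) -> float:
    """Upper bound on the fraction of n-bit strings with C(x) < n - c."""
    programs = 2 ** (n - c) - 1   # descriptions of length < n - c
    strings = 2 ** n              # all n-bit strings
    return programs / strings

# Fewer than 1/8 of all 20-bit strings can be compressed by 3 bits or more.
print(compressible_fraction_bound(20, 3))
```

So a random string is incompressible with high probability, yet no algorithm can certify any particular string as such.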

Their cool result is that given a string ${x}$ they can construct a list of ${n=|x|}$ candidate “answers,” one of which is a description of ${x}$ of length ${C(x)+O(\log n)}$. The list can be computed by a randomized algorithm that uses only ${O(\log n)}$ random bits. As they say:

Thus using only few random bits it is possible to do tasks that cannot be done by any deterministic algorithm regardless of its running time.

Very neat.

## Progress on Group Isomorphism

${\bullet}$ Algorithms for group isomorphism via group extensions and cohomology by Joshua Grochow and Youming Qiao.

Group isomorphism for groups defined by multiplication tables (${\mathsf{GpI}}$) is one of my favorite problems—note the initials are “Gp” for group and “I” for isomorphism, while the more-storied graph-isomorphism problem gets the shorter moniker ${\mathsf{GI}}$. The best general bound remains the longstanding one of order ${n^{O(\log n)}}$, which we have discussed recently and before. That result uses just one important but elementary property of finite groups: by Lagrange’s theorem, any proper subgroup of a group has at most half the size of the group. Hence every group of order ${n}$ is generated by at most ${\log_2 n}$ elements, and one can simply try all ${n^{\log_2 n}}$ ways of mapping such a generating set into the other group.
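To see this strategy in action, here is a small toy sketch (my own illustration, not any paper's algorithm): groups are given as multiplication tables over elements ${0,\dots,n-1}$, a generating set of size at most ${\log_2 n}$ is found greedily, and every possible image of the generators is tried and extended to a homomorphism.

```python
from itertools import product

def identity(table):
    """Find the identity element of the group given by its table."""
    n = len(table)
    for e in range(n):
        if all(table[e][x] == x and table[x][e] == x for x in range(n)):
            return e
    raise ValueError("not a group table")

def closure(table, gens):
    """Subgroup generated by gens (elements are 0..n-1)."""
    S = {identity(table), *gens}
    frontier = list(S)
    while frontier:
        a = frontier.pop()
        for b in list(S):
            for c in (table[a][b], table[b][a]):
                if c not in S:
                    S.add(c)
                    frontier.append(c)
    return S

def generating_set(table):
    """Greedy generating set: each new generator at least doubles the
    generated subgroup (Lagrange), so the set has size <= log2(n)."""
    n = len(table)
    gens, S = [], closure(table, [])
    for x in range(n):
        if x not in S:
            gens.append(x)
            S = closure(table, gens)
            if len(S) == n:
                break
    return gens

def extend(t1, t2, f):
    """Close a partial map f: G1 -> G2 under products; detect conflicts."""
    changed = True
    while changed:
        changed = False
        for a, fa in list(f.items()):
            for b, fb in list(f.items()):
                ab, fab = t1[a][b], t2[fa][fb]
                if ab in f:
                    if f[ab] != fab:
                        return False           # not a homomorphism
                elif fab in f.values():
                    return False               # not injective
                else:
                    f[ab] = fab
                    changed = True
    return len(f) == len(t1) and len(set(f.values())) == len(f)

def isomorphic(t1, t2):
    """Brute force in the spirit of the n^(O(log n)) bound: try every
    way to map a generating set of G1 into G2 and extend it."""
    if len(t1) != len(t2):
        return False
    gens = generating_set(t1)
    e1, e2 = identity(t1), identity(t2)
    for images in product(range(len(t2)), repeat=len(gens)):
        f = {e1: e2, **dict(zip(gens, images))}
        if extend(t1, t2, f):
            return True
    return False

z4 = [[(a + b) % 4 for b in range(4)] for a in range(4)]   # cyclic Z_4
v4 = [[a ^ b for b in range(4)] for a in range(4)]         # Klein four-group
print(isomorphic(z4, z4), isomorphic(z4, v4))              # True False
```

The loop over generator images is exactly where the ${n^{\log_2 n}}$ factor comes from; everything else is polynomial.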

A curious consequence of the classification of finite simple groups is that isomorphism for simple groups is easy—in polynomial time. This is because finite simple groups are all ${2}$-generated. By the way, there still appears to be no way to prove this directly: all known roads to ${2}$-generation lead through the whole classification theorem. Indeed a direct proof of ${2}$-generation would greatly simplify the classification proof itself, as remarked in this MathOverflow thread.

So the “hard” cases are all groups that have non-trivial normal subgroups. Suppose that ${G_{1}}$ and ${G_{2}}$ are two groups that we wish to test for isomorphism. Let ${H_{1}}$ be a non-trivial normal subgroup of ${G_{1}}$. An obvious idea is to try to find the corresponding ${H_{2}}$ of ${G_{2}}$. Suppose that we could do this. Then we can check whether ${H_{1}}$ is isomorphic to ${H_{2}}$ by recursion. The obstacle is that even if they are isomorphic, it is possible that the subgroups “sit” inside their respective groups in vastly different ways. This is what makes ${\mathsf{GpI}}$ hard and interesting.

The group extension problem is all about how a subgroup sits inside a larger group. And cohomology is one tool that can be used to understand this.

In order to make further progress on group isomorphism, the authors believe—I think correctly—that much deeper properties of groups must be brought to bear on ${\mathsf{GpI}}$. They argue that any strategy must begin to look at the so-called cohomology classes and the structure of group extensions. Following this strategy, they solve ${\mathsf{GpI}}$ in ${n^{O(\log \log n)}}$ time for certain important families of groups. These results use the detailed structure of cohomology classes, as well as algorithms for equivalence, coset intersection, and determining the structure of modules over finite-dimensional algebras.

Note that the bound is still super-polynomial: the exponent is still unbounded. But as someone who has thought about this problem all too often, I can aver that going from ${\log n}$ to ${\log \log n}$ in the exponent is quite impressive.

## PIT and Polynomial Factoring

${\bullet}$ Equivalence of Polynomial Identity Testing and Deterministic Multivariate Polynomial Factorization by Swastik Kopparty, Shubhangi Saraf, and Amir Shpilka.

The title is direct and almost needs no comment. They prove that deterministic polynomial identity testing (PIT) is equivalent to deterministic factorization of multivariate polynomials. One ingredient is the—as they say—amazing fact that if a polynomial ${f}$ in ${n}$ variables has degree ${d}$ and an arithmetic circuit of size ${s}$, then each irreducible factor of ${f}$ has an arithmetic circuit of size ${\mathsf{poly}(s,d,n)}$.
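For context, the randomized algorithm that one wants to derandomize is easy to sketch. The following is a standard Schwartz–Zippel black-box test (my illustration, not the paper's construction); `poly` is any black box computing a polynomial over the integers, evaluated modulo a large prime.

```python
import random

P = 10**9 + 7  # a large prime; per-trial error is at most degree / P

def prob_zero_test(poly, n_vars, degree, trials=20):
    """Black-box randomized PIT.  By the Schwartz-Zippel lemma, a nonzero
    polynomial of total degree d vanishes at a uniformly random point of
    (Z_P)^n with probability at most d/P, so each trial rarely errs."""
    for _ in range(trials):
        point = [random.randrange(P) for _ in range(n_vars)]
        if poly(*point) % P != 0:
            return False   # a witness point: definitely not identically zero
    return True            # identically zero, with overwhelming probability

# (x + y)^2 and x^2 + 2xy + y^2 are the same polynomial ...
print(prob_zero_test(lambda x, y: (x + y)**2 - (x*x + 2*x*y + y*y), 2, 2))
# ... while x*y and x are not.
print(prob_zero_test(lambda x, y: x*y - x, 2, 2))
```

The open question the paper connects to factoring is whether the random points above can be replaced by a small deterministic set of evaluation points.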

They also use the famous Hilbert Irreducibility Theorem, or rather an effective version of it. Rather than outline their proof, which is so carefully explained in their paper, I will instead say something about this theorem.

Suppose that ${f(x,y)}$ is a polynomial with rational coefficients. Suppose further that ${f(x,y)}$ is irreducible over the rationals. Hilbert’s question was the natural one:

Must there be a rational ${r}$ so that ${f(x,r)}$ remains irreducible?

The answer is yes. And it generalizes to multiple variables and to other fields, but not all. Indeed a field with this property is called—you guessed right—a Hilbertian field. Essentially a Hilbertian field is the opposite of algebraically closed: the complex numbers are not Hilbertian. There is further interest in which subsets of the rationals can be used and still leave ${f(x,r)}$ irreducible. It is known that various subsets of the rationals have this kind of universality property, and there is continuing interest in which sets work.
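A toy illustration of the phenomenon (mine, not from the paper): take ${f(x,y) = x^2 - y}$, which is irreducible over the rationals. The specialization ${f(x,r) = x^2 - r}$ is reducible exactly when ${r}$ is the square of a rational, and such ${r}$ are rare.

```python
from fractions import Fraction
from math import isqrt

def is_perfect_square(n: int) -> bool:
    return n >= 0 and isqrt(n) ** 2 == n

def specialization_irreducible(r: Fraction) -> bool:
    """Is x^2 - r irreducible over Q?  A degree-2 polynomial is reducible
    over Q iff it has a rational root, i.e. iff r is a rational square;
    for r = p/q in lowest terms that means p and q are both squares."""
    return not (is_perfect_square(r.numerator)
                and is_perfect_square(r.denominator))

# Scan a grid of rationals: only the rational squares (1, 4, 1/4, ...)
# give reducible specializations.
sample = [Fraction(n, d) for n in range(1, 8) for d in range(1, 8)]
bad = sorted({r for r in sample if not specialization_irreducible(r)})
print(bad)
```

Hilbert's theorem says this picture is typical: for any irreducible ${f(x,y)}$ over ${\mathbb{Q}}$, the "bad" specializations form a thin set.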

## Open Problems

Some obvious questions arise in regard to each paper. Can we use the list of descriptions of a string ${x}$ to solve some interesting problem? It seems like a powerful idea that should help. Also, let’s finally solve the group isomorphism problem. And let’s find accessible problems about PIT too.

1. February 18, 2014 9:22 am

Regarding Kolmogorov complexity: why do they end up with a list of possible answers? Why can’t they just run the programs in the resulting list and see which one produces $x$?

• February 18, 2014 4:08 pm

Perhaps the issue is that all the candidate programs potentially could produce x, but some may be shorter than others. The halting problem prevents an attempt to check whether the shorter ones suffice. Just a guess.

February 18, 2014 7:49 pm

Dan

The point is that we could run the programs in parallel and eventually find one that outputs $x$, but there is no computable bound on how long that would take. The list itself, by contrast, can be generated fast. Usually, loosely speaking, we expect that the short description is a computation that runs for a very long time.

2. February 18, 2014 10:50 am

I understand that the Bauwens-Zimand result has theoretical interest from the point of view of “doing something with randomization that cannot be done deterministically”. But I am wondering whether the object computed, namely the list one of whose elements is a succinct representation of x, has any interest. (I consider the computation of a succinct representation of a string interesting; but from such a list, one cannot in general deduce which element is the succinct representation.) Could you say something about that?

February 18, 2014 7:51 pm

Amir

I agree. I like the result, but would really like to see some neat application. At first I thought there might be some way around the fact that the list is not a single element. But my ideas do not yield anything yet.

So I agree, we need to figure out some application.

It is impossible to effectively obtain a shorter list containing the succinct representation. The paper proves a lower bound of $\Omega(n/(c+1))$ on the size of any list obtained by a randomized algorithm (no matter how slow) that contains a representation within c bits of optimal. For deterministic algorithms there is a lower bound of $\Omega((n/(c+1))^2)$, proved in a paper by Bauwens, Makhlin, Vereshchagin, and Zimand (CCC 2013).