Fast Matrix Products and Other Amazing Results
Some amazing mathematical results
Volker Strassen has won many prizes for his terrific work in computational complexity, including the Cantor Medal, the Paris Kanellakis Award, and most recently the Knuth Prize. He is perhaps most famous for his brilliant work on matrix multiplication, but he has done so many important things that others may have their own favorites.
Today I plan on talking about some amazing results. Two are classic results due to Strassen, and the third is a recent result related to Boolean matrix products.
I will have to do another, longer discussion on “Amazing Mathematical Results.” I need to think, first, about how I would define them; for today I will just give three examples.
Fast Matrix Product
Strassen’s 1969 result Gaussian Elimination is not Optimal shows, of course, a way to multiply two matrices in time order $n^{\log_2 7} \approx n^{2.81}$ instead of the classic cubic method. This is certainly an amazing result; it came as a shock to everyone that the classic method was not optimal. Strassen’s fast matrix algorithm launched a thousand results; today many of the best algorithms for X are obtained by reducing X to matrix products. As a consequence, it is quite common to see algorithms which run in time $O(n^{\omega})$, for instance, where $\omega$ is the exponent of the fastest known matrix multiplication algorithm.
The current best matrix product algorithm is due to Don Coppersmith and Shmuel Winograd, and runs in time $O(n^{2.376})$. I must mention the brilliant work of Henry Cohn, Robert Kleinberg, Balázs Szegedy, and Christopher Umans, who reduce the search for faster matrix algorithms to certain group theory questions. I will discuss their work another time.
There is a story, which I believe is true, about how Strassen found his terrific algorithm. He was trying to prove that the naive cubic algorithm is optimal. A natural idea was to first try to prove the case of two by two matrices; one by one matrices are trivial. One day he suddenly realized that if there were a way to multiply two by two matrices with $7$ multiplications rather than $8$, there would be a sub-cubic recursive method for $n$ by $n$ matrices. He immediately sat down and quickly discovered his now famous formulas. At least this is the story.
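Those formulas give a complete algorithm: split each matrix into four half-size blocks, form Strassen’s seven products of block combinations, and recurse, for $O(n^{\log_2 7})$ operations in total. Here is a minimal sketch in Python using Strassen’s formulas, assuming for simplicity that $n$ is a power of two; a practical version would pad the matrices and fall back to the classic method below some cutoff size.

```python
import numpy as np

def strassen(A, B):
    """Strassen's recursion: seven half-size products instead of eight.
    Minimal sketch; assumes n is a power of two."""
    n = A.shape[0]
    if n == 1:
        return A * B
    k = n // 2
    A11, A12, A21, A22 = A[:k, :k], A[:k, k:], A[k:, :k], A[k:, k:]
    B11, B12, B21, B22 = B[:k, :k], B[:k, k:], B[k:, :k], B[k:, k:]
    # Strassen's seven products.
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    # Reassemble the four blocks of the product.
    C = np.empty_like(A)
    C[:k, :k] = M1 + M4 - M5 + M7
    C[:k, k:] = M3 + M5
    C[k:, :k] = M2 + M4
    C[k:, k:] = M1 - M2 + M3 + M6
    return C
```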
All For The Price of One
Strassen also has another amazing result, in my opinion, on arithmetic circuits; this is joint work with Walter Baur. They used this result to get some tight lower bounds on arithmetic computations. In 1983 they proved their result about the arithmetic straight-line complexity of polynomials: let $L(f)$ denote the minimum number of basic arithmetic operations needed to compute the multivariate polynomial $f(x_1, \ldots, x_n)$, and let
$$L\left(f, \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)$$ denote the minimum number of operations needed to compute the polynomials $f$, $\frac{\partial f}{\partial x_1}$, $\frac{\partial f}{\partial x_2}$, and so on, all at the same time. As usual $\frac{\partial f}{\partial x_i}$ is the partial derivative of $f$ with respect to the variable $x_i$. The amazing result they prove is:
Theorem: For any polynomial $f$,
$$L\left(f, \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right) = O(L(f)).$$
Thus, the cost of computing $f$ is almost the same as the cost of computing $f$ and all of its first order partial derivatives. Getting $n+1$ functions for the “price” of one seems surprising to me.
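The proof idea is, in modern terms, reverse-mode differentiation: evaluate the straight-line program once forward, then sweep backward over it accumulating adjoints, so all $n$ partials cost only a constant factor more than $f$ itself. Here is a toy sketch of that sweep in Python; it is my own illustration, handles only $+$, $-$, $\times$ (the theorem also allows division), and the program encoding is invented for the example.

```python
def gradient_of_slp(program, n_inputs, point):
    """Evaluate a straight-line program and all its partial derivatives
    with one forward sweep and one backward sweep (reverse mode).
    `program` is a list of (op, i, j) triples, op in {'+', '-', '*'};
    i and j index earlier slots, where slots 0..n_inputs-1 hold the
    inputs and the last instruction's slot holds the output f."""
    vals = list(point)
    for op, i, j in program:  # forward sweep: one new value per instruction
        a, b = vals[i], vals[j]
        vals.append(a + b if op == '+' else a - b if op == '-' else a * b)

    adj = [0.0] * len(vals)   # adj[k] will hold d(f)/d(slot k)
    adj[-1] = 1.0
    for k in range(len(program) - 1, -1, -1):  # backward sweep
        op, i, j = program[k]
        w = adj[n_inputs + k]
        if op == '+':
            adj[i] += w
            adj[j] += w
        elif op == '-':
            adj[i] += w
            adj[j] -= w
        else:                 # product rule for multiplication
            adj[i] += w * vals[j]
            adj[j] += w * vals[i]
    return vals[-1], adj[:n_inputs]

# f(x, y) = (x + y) * x, encoded with slots 0=x, 1=y, 2=x+y, 3=(x+y)*x.
f, grad = gradient_of_slp([('+', 0, 1), ('*', 2, 0)], 2, [3.0, 4.0])
# f == 21.0 and grad == [10.0, 3.0], matching df/dx = 2x + y, df/dy = x.
```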
This result has been used to get lower bounds: this was why Baur and Strassen proved their result. The general method is this:
- Argue the cost of computing the polynomials $f_1, \ldots, f_n$ is at least $B$.
- Then, reduce computing these polynomials to computing $g$ and all its partial derivatives, for some polynomial $g$.
- This implies, by the Baur-Strassen theorem, that the cost of $g$ must be at least $\Omega(B)$.
The critical point is that proving a lower bound on the cost of one polynomial may be hard, but proving a lower bound on the cost of $n$ polynomials may be possible. This is the magic of their method.
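A classic example: take $g = x_1^d + \cdots + x_n^d$. Its partial derivatives are $d x_1^{d-1}, \ldots, d x_n^{d-1}$, and Strassen’s degree bound shows that computing all $n$ of these powers at once takes $\Omega(n \log d)$ operations. Hence $L(g) = \Omega(n \log d)$, which repeated squaring shows is tight.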
Virginia Vassilevska Williams and Ryan Williams have proved a number of surprising results about various matrix products and triangle detection.
One of the long-standing open questions is: what is the complexity of determining whether a graph has a triangle? That is, does the graph have three distinct vertices $x$, $y$, $z$ so that $\{x,y\}$, $\{y,z\}$, and $\{x,z\}$ are all edges of the graph?
One way to detect this is to use matrix product: invoke Strassen, now Coppersmith-Winograd. Another approach is combinatorial. It is easy to see that there is an algorithm that runs in time order $e^{3/2}$, where $e$ is the number of edges of the graph.
Of course, this can be as large as cubic in the number of vertices.
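Here is the matrix-product route in code: if $A$ is the adjacency matrix, then $(A^2)[x,z]$ counts the two-step paths from $x$ to $z$, and a triangle exists exactly when some such pair is also an edge. A small Python sketch, assuming a symmetric 0/1 adjacency matrix with zero diagonal; numpy’s `@` stands in for whichever fast matrix product one invokes.

```python
import numpy as np

def has_triangle(adj):
    """Triangle detection via matrix product: (adj @ adj)[x, z] counts
    two-step paths x-y-z, and adj[x, z] == 1 closes one into a triangle.
    Assumes a symmetric 0/1 adjacency matrix with zero diagonal."""
    paths2 = adj @ adj
    return bool(np.any((paths2 > 0) & (adj > 0)))

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])   # three mutually adjacent vertices
print(has_triangle(A))      # True
```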
What Williams and Williams consider is the following question: is there a relationship between detecting triangles and computing Boolean matrix products? They also consider the same type of question between Boolean matrix verification and Boolean matrix products.
The surprise is that the ability to detect triangles, or the ability to verify Boolean products, can also be used to compute Boolean matrix products. What is so surprising about this is that both return a single bit, while matrix product computes many bits. How can a single bit help compute many bits? The simple answer is that it cannot, not without some additional insight. Their insight is that the algorithms can be used over and over, and in this manner be used to speed up the naive Boolean matrix product.
One Bit, Two Bits, Three Bits, $n^2$ Bits
Let me give a bit more detail on what they prove, not on how they prove it. See their paper for the full details. Since they prove many theorems, I will state just one.
Theorem: Suppose Boolean matrix product verification can be done in time $O(n^c)$ for some $c \ge 2$. Then graph triangle detection on $n$-node graphs can be done in $O(n^c)$ time.
Note, matrix product verification is simple with randomness, but Boolean matrix product seems much harder to check. Recall, to check that $AB = C$ for $n$ by $n$ matrices, pick a random $0$-$1$ vector $r$. Then, check that
$$A(Br) = Cr.$$
This can be done in $O(n^2)$ time.
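For concreteness, here is that check (Freivalds’ algorithm) sketched in Python for integer matrices; the function name and trial count are mine.

```python
import numpy as np

def freivalds_check(A, B, C, trials=20):
    """Freivalds' randomized check that A @ B == C for integer matrices.
    Each trial is three matrix-vector products, so O(n^2) time, and a
    wrong C passes a single trial with probability at most 1/2."""
    n = A.shape[0]
    for _ in range(trials):
        r = np.random.randint(0, 2, size=n)  # random 0-1 vector
        if not np.array_equal(A @ (B @ r), C @ r):
            return False   # a witness: certainly A @ B != C
    return True            # equal with probability >= 1 - 2**(-trials)
```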
The main open problems are, of course, to finally resolve the exponent of matrix product: is it possible to multiply in time $O(n^{2+\epsilon})$ for every $\epsilon > 0$? Also, the Williams-Williams results show there are very interesting connections between checking and computing. I think more can be proved in this direction.