Valiant’s Permanent Contributions
The complexity theory of the permanent of a matrix
Leslie G. Valiant is one of the top theorists in the world. He has just won the 2010 Turing Award to add to the many other honors he has already received. He is a perfect choice for computer science’s highest award.
Today Ken and I would like to talk about just a part of Valiant’s seminal contributions, in particular his work on the complexity of counting.
His Turing Award announcement cites his work on machine learning and more. We think his paper on machine learning was an important event that helped create the vital field of modern machine learning. Since we are not experts on machine learning, we will let others discuss that contribution, and will discuss here his brilliant work on the permanent of a matrix.
Permanent
The permanent is like the determinant in that it maps square matrices to values. The definition of the permanent of an $n \times n$ matrix $A$ is:

$$\operatorname{per}(A) \;=\; \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}.$$

Here the sum is over all permutations $\sigma$ of $\{1,\dots,n\}$. It was introduced by Augustin-Louis Cauchy in 1812 while he developed the notion of determinant.
We note that David Glynn in this paper has given a new formula for the permanent. It can be implemented to have the same best-known complexity as a formula due to Herbert Ryser. Glynn's formula takes an average over all the $2^{n-1}$ vectors $\delta$ with entries $\pm 1$ and beginning with a $+1$, where $\prod_k \delta_k$ means the product (i.e., parity) of those entries:

$$\operatorname{per}(A) \;=\; \frac{1}{2^{n-1}} \sum_{\delta} \Big(\prod_{k=1}^{n} \delta_k\Big) \prod_{j=1}^{n} \sum_{i=1}^{n} \delta_i a_{ij}.$$
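To make the two formulas concrete, here is a minimal Python sketch (ours, for illustration only—both run in exponential time, and `perm_glynn` assumes integer entries so that the final division is exact):

```python
from itertools import permutations, product
from math import prod

def perm_naive(A):
    """Permanent straight from the definition: sum over all n! permutations."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

def perm_glynn(A):
    """Glynn's formula: a signed average over the 2^(n-1) vectors delta
    with entries +-1 whose first entry is +1."""
    n = len(A)
    total = 0
    for rest in product((1, -1), repeat=n - 1):
        delta = (1,) + rest
        sign = prod(delta)  # the parity of the sign vector
        total += sign * prod(sum(delta[i] * A[i][j] for i in range(n))
                             for j in range(n))
    return total // 2 ** (n - 1)  # exact for integer matrices

A = [[1, 2], [3, 4]]
assert perm_naive(A) == perm_glynn(A) == 1 * 4 + 2 * 3  # both give 10
```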
Valiant’s Papers
The work was published in 1979 as, “The Complexity of Computing the Permanent.” It introduced the complexity class $\#P$ and many ideas that are fundamental to modern complexity theory, along with the companion paper, “The Complexity of Enumeration and Reliability Problems.” Both papers were originally written in 1977 while he was at the University of Edinburgh, made into technical reports, and submitted straight to journals; neither appeared at STOC or FOCS or apparently any other conference. A third paper, “Completeness Classes in Algebra,” did make it into STOC 1979. The main theorem, which entails that the computation of $\operatorname{per}(A)$ is $\#P$-hard, is:
Theorem: Over any field of characteristic other than $2$, the permanent is complete for $\#P$ under polynomial-time reductions.
Of course the restriction to characteristic other than $2$ is due to the simple observation:

$$\operatorname{per}(A) \equiv \det(A) \pmod{2},$$

which follows from the neat fact that $-1 = 1$ in characteristic $2$. In mathematics $2$ is often special—being the only even prime makes it unique.
Genesis of the Ideas
George Pólya asked, in 1913, a question about the relationship between the permanent and the determinant. The question asked whether there was an “easy” way to reduce the calculation of a permanent to that of a determinant. The rationale was simple: computing determinants is easy, while at the time it was unknown how to compute the permanent in any manner significantly faster than summing all $n!$ terms of the definition.
The question precisely was: Given a matrix $A$, is there a way to set the signs of the entries so that the resulting matrix $B$ satisfies

$$\det(B) \;=\; \operatorname{per}(A)?$$

This is trivially true for matrices of size $2$, and the question was, is there a way to do this in general?
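For instance, in the $2 \times 2$ case a single sign flip already works:

$$\operatorname{per}\begin{pmatrix} a & b \\ c & d \end{pmatrix} \;=\; ad + bc \;=\; \det\begin{pmatrix} a & -b \\ c & d \end{pmatrix}.$$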
I think the thought process behind the question may have been something like this: The determinant differs from the permanent only in the signs on its terms. Perhaps there is a way to get all the signs right by flipping some signs of the entries of $A$. Actually proving that this is impossible is not trivial, and it was done by Gábor Szegő. Note the power of Valiant’s notion of $\#P$. We now know that if there were a way to do this, even a way that was non-uniform, then $\#P$ at least would have polynomial-size circuits. Since this would be quite surprising—against most people’s beliefs—we could say even without Szegő’s proof that finding such sign-setting rules is quite unlikely.
Actually the beauty of this argument via complexity is that it even “proves” a much stronger rule: there is no sign setting method that is adaptive. Adaptive means the method is allowed to decide which signs to flip based on the values of the entire matrix. I believe proving this result without complexity theory would be very hard.
The general impact of Valiant’s work is that it explained why mathematicians had been unable to make an easy connection between two seemingly similar concepts. It is surprising that the permanent is less tractable to deal with, because its formula looks simpler without the signs that the determinant has. This was an early example of complexity theory speaking truth to the deep structure of mathematics. Or at least speaking evident truth—no one has yet proved that the permanent cannot be computed feasibly like the determinant can. The rest of this discussion is about efforts to understand and compute permanents.
Inequalities
We have just mentioned the famous van der Waerden Conjecture: a doubly-stochastic $n \times n$ matrix $A$ of non-negative real numbers has

$$\operatorname{per}(A) \;\ge\; \frac{n!}{n^n},$$

and this is uniquely achieved by the matrix with all entries $1/n$. In 2008, Leonid Gurvits proved an inequality that generalizes the van der Waerden Conjecture, giving a new and simpler proof besides. This also extended related inequalities by Alexander Schrijver and W.G. Valiant. (No—we have not had another visit from the Leprechaun Neil L. playing tricks with Les’ name. The name “W.G. Valiant” appears on their paper, but Lex Schrijver tells us that it was a pseudonym for the researchers in the Werkgroep van Lint, and hence honors not Les or “Prince Valiant” but the (late) famous combinatorialist Jack H. van Lint.)
Here is a statement of Gurvits’ inequality, as given in a recent article by Schrijver with Monique Laurent in the American Mathematical Monthly: for a doubly-stochastic $n \times n$ matrix $A$,

$$\operatorname{per}(A) \;\ge\; \prod_{j=1}^{n} \left(\frac{d_j - 1}{d_j}\right)^{d_j - 1}.$$

Here $d_j$ is the number of non-zero entries in column $j$ of $A$, and a factor is read as $1$ when $d_j \le 1$. Gurvits’ paper actually proves a more-general version involving $n$-fold partial derivatives of homogeneous polynomials. Here is a video of Donald Knuth recently giving a talk on this work.
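As a sanity check—not a proof—the bound is easy to test numerically. This sketch (ours) builds a random doubly stochastic matrix by averaging random permutation matrices and reuses the naive permanent from above:

```python
import random
from itertools import permutations
from math import prod

def perm_naive(A):
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

def random_doubly_stochastic(n, k=50):
    """Average k random permutation matrices; rows and columns sum to 1."""
    A = [[0.0] * n for _ in range(n)]
    for _ in range(k):
        s = random.sample(range(n), n)  # a random permutation
        for i in range(n):
            A[i][s[i]] += 1.0 / k
    return A

def gurvits_bound(A):
    n = len(A)
    bound = 1.0
    for j in range(n):
        d = sum(1 for i in range(n) if A[i][j] != 0)  # nonzeros in column j
        if d >= 2:
            bound *= ((d - 1) / d) ** (d - 1)
    return bound

A = random_doubly_stochastic(5)
assert perm_naive(A) >= gurvits_bound(A)
```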
A final inequality that we find quite interesting is the following:

$$|\operatorname{per}(A)|^2 \;\le\; \operatorname{per}(AA^*),$$

for any square matrix $A$. The key point is that this implies the neat fact—one that I found surprising—that the permanent of any unitary matrix is at most $1$ in absolute value.
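The unitary case is easy to check numerically. This sketch (ours, assuming NumPy) draws a random unitary via QR factorization of a complex Gaussian matrix and verifies the inequality:

```python
import numpy as np
from itertools import permutations

def perm_np(A):
    n = A.shape[0]
    return sum(np.prod([A[i, s[i]] for i in range(n)])
               for s in permutations(range(n)))

rng = np.random.default_rng(0)
G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Q, _ = np.linalg.qr(G)  # Q is (numerically) unitary

lhs = abs(perm_np(Q)) ** 2
rhs = perm_np(Q @ Q.conj().T).real  # Q Q* = I, so this is per(I) = 1
assert lhs <= rhs + 1e-9            # hence |per(Q)| <= 1
```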
Reductions to Determinants
Valiant’s STOC 1979 paper proved that every arithmetical function with a formula of size $s$ can be computed symbolically by taking the determinant of an $O(s) \times O(s)$ matrix. For a slight tightening, see this paper.

Note that the determinant itself is not known to have polynomial-sized formulas. The best known formulas have size $n^{O(\log n)}$. Of course if the permanent had formulas of such quasi-polynomial size $n^{O(\log n)}$, then it could be computed by taking the determinant of basically an $n^{O(\log n)} \times n^{O(\log n)}$ matrix. Thus by proving a lower bound $\mathrm{dc}(n)$ on the size of matrices $B$ giving $\det(B) = \operatorname{per}(A)$, we could prove formula-size lower bounds on the permanent itself.
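In symbols—our paraphrase of the standard definition of determinantal complexity:

$$\mathrm{dc}(n) \;=\; \min\{\, m : \operatorname{per}(A) = \det(B) \text{ for some } m \times m \text{ matrix } B \text{ whose entries are affine linear functions of the entries of } A \,\}.$$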
However, even proving $\mathrm{dc}(n)$ to be more than linear was a devil of a task. For a long time the best known lower bound was only linear in $n$, by Jin-Yi Cai in this 1990 paper. Then in 2004 came a breakthrough result of Thierry Mignon and Nicolas Ressayre, showing that $\mathrm{dc}(n) \ge n^2/2$ for fields of characteristic zero. Others including Cai with Xi Chen and Dong Li have refined the technique, but have not yet definitively pushed $\mathrm{dc}(n)$ above quadratic.
Approximating the Permanent
Mark Jerrum and Alistair Sinclair proved in 1989 that the permanent of 0-1 matrices could be given a fully polynomial randomized approximation scheme in several cases, including the case when the matrix has sufficiently many 1’s. This condition was lifted, and the scheme extended to any matrix with non-negative real entries, when Eric Vigoda joined the team for this paper. This result has been cited as a major answer to the question, Who cares about Permanents?
Other Aspects of The Permanent
There are many other studies of the permanent that were launched by Valiant’s seminal work, and they have helped shape computational complexity. Since this is already getting long, let me just list the top-level ideas.
Random Self Reductions:
I proved in 1991 that the ability to compute $\operatorname{per}(A)$ for a large fraction of all matrices $A$ was enough to compute the permanent for all matrices. I would like to discuss the story of this simple, but useful, result another time.
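Here is a minimal sketch (ours) of the interpolation idea behind that result, over a prime field $F_p$ with $p > n+1$. An always-correct brute-force routine stands in for the unreliable oracle; the real proof must also cope with the oracle's occasional wrong answers, which we omit:

```python
import random
from itertools import permutations

P = 101  # a prime larger than n + 1

def perm_mod(A, p=P):
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        term = 1
        for i in range(n):
            term = term * A[i][s[i]] % p
        total = (total + term) % p
    return total

oracle = perm_mod  # pretend: correct only on most matrices

def perm_via_oracle(A, p=P):
    n = len(A)
    B = [[random.randrange(p) for _ in range(n)] for _ in range(n)]
    # perm(A + t*B) is a polynomial of degree <= n in t, and for each
    # fixed t != 0 the matrix A + t*B is uniformly random, so with high
    # probability the oracle answers most of these queries correctly.
    xs = list(range(1, n + 2))
    ys = [oracle([[(A[i][j] + t * B[i][j]) % p for j in range(n)]
                  for i in range(n)], p) for t in xs]
    # Lagrange-interpolate the degree-n polynomial at t = 0.
    total = 0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for m, xm in enumerate(xs):
            if m != k:
                num = num * (-xm) % p
                den = den * (xk - xm) % p
        total = (total + yk * num * pow(den, -1, p)) % p
    return total

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
assert perm_via_oracle(A) == perm_mod(A)
```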
Lower Bounds:
Eric Allender has shown that uniform constant-depth threshold circuits of polynomial size cannot compute the permanent. This is one of the few provable results we have on the permanent—one of the few that actually show it is not too easy to compute.
Values of Random Permanents:
There has been recent interest in the value distribution of the permanent of random Bernoulli matrices. Terry Tao and Van Vu have just proved some anti-concentration theorems—see their paper here for the details.
The Road Ahead
Perhaps the greatest homage paid to Les’ work on the permanent is that it has become a metonym for the $P$ versus $NP$ question itself. (Ken wrote this sentence, since I had no idea what “metonym” means.) Or rather vice-versa: when people talk of Ketan Mulmuley’s attack on $P$ versus $NP$ with Milind Sohoni and others, they are really talking about a strategy that has not yet gone beyond showing that the permanent does not polynomial-time reduce to the determinant. This is symbolized by Valiant as $VP \ne VNP$, and is not unconditionally known to be related either way to $P \ne NP$—see this recent “Stack Exchange” discussion. Nevertheless, $VP \ne VNP$ captures so much of the spirit that it can even stand for the other question in what we teach our children.
Open Problems
We owe Les a great deal of thanks, for without his seminal work connecting the permanent to complexity theory, much research would have been impossible. Thanks Les and congratulations on receiving this year’s Turing Award.
The main open problem is: while various barriers seem to stop us cold on $P \ne NP$, or on trying to show that $NP$ does not have polynomial-size circuits directly, can we develop mathematical tools and make steady incremental progress toward $VP \ne VNP$?
[fixed Diaconis reference “Why care…” to “Who cares…”; 2011 -> 2010 as date of his Turing Award]
Hi Dick,
Speaking of the permanent — and the related topics of random walks and approximation algorithms — can you write sometime about the two X-Karp-Lipton-Lovasz-Y papers?
Thanks,
Aravind
Excuse me, my URL is corrected now.
Aravind
My favorite variation (generalization, really) on the determinant and permanent is the immanant (http://en.wikipedia.org/wiki/Immanant_of_a_matrix)…basically replace the sign in the permanent by a character of an irreducible representation of the symmetric group. Some interesting things are known about immanants; for example, just as the characteristic polynomial det(xI-A) of the adjacency matrix of a graph is a graph invariant, the polynomial imm_\lambda(xI-A) of the adjacency matrix of a graph is a graph invariant. All of these are not, however, complete invariants. Also interesting is that the complexity of computing immanants appears to be related to the size of the largest step in the Young tableau for the irrep…if you consider immanants where this grows like O(n) then these seem to be as hard to compute as the permanent.
Dave, is immanant sampling as easy (for a QC) as BosonSampling?
To indicate the richness here, we chose to leave the following content “on the cutting room floor” to avoid overweighting things:
(*) It was something of a surprise that determining whether a bipartite graph has a perfect matching is in P, yet counting the number of such matchings is #P-complete, hence NP-hard. Same for #2SAT, as Valiant also proved in the 1979 SIAM J. Comp. paper.
(*) Counting the number of witnesses mod 2 or mod 3 is typically NP-hard under randomized reductions. However, for certain counting problems where this is hard mod 2 or mod 3 etc., it suddenly becomes polynomial-time feasible to count modulo 7, or mod other Mersenne primes. See this paper by Jin-Yi Cai and Pinyan Lu, and many others on Jin-Yi’s site.
I can give a hat-tip (HT) to Craig Feinstein for the Fortnow link at the end, which also had Lance’s permission to use, and thank Alexander Schrijver for the explanation of “W.G. Valiant”.
“In mathematics 2 is often special—being the only even prime makes it odd.”
Why do people find it so hard to believe that #P cannot be the same as FP (function P)? Why could this not be the missing peace of the puzzle (NP=?P)? Nobody has even thought of trying to prove #P=FP (I suppose).
Oops– I meant “piece of the puzzle”.
“Actually the beauty of this argument via complexity is that it even “proves” a much stronger rule: there is no sign setting method that is adaptive.” I don’t think that’s quite true: it seems like that only “proves” the weaker statement, that there is no *efficient* adaptive sign-setting algorithm.
“This result has been cited as a major reason to answer `yes’ to the question, Why care about Permanents?”
Proofread!
Thanks!—sure thought I read the wording as “why care—?”, probably because I was wearing Neil L.’s glasses…
Great post, great homage to a great scientist! I agree that Valiant’s permanental contribution,
including his algebraic version of P vs NP, was and is a major stimulant in modern mathematics.
A few comments:
1. A formula from David Glynn’s paper was known for ages; it appeared, for instance, in Eric Bax and Joel Franklin’s 1997 paper “A permanent formula with many zero-valued terms”. Interestingly, this formula, viewed as an unbiased estimator of the permanent, is provably good for matrices which are contractions with respect to some nice norms. See the appendix in the recent S. Aaronson and A. Arkhipov paper on Quantum Linear Optics.
2. Fix the inequality |per(A)| \leq per(AA^*) to |per(A)|^2 \leq per(AA^*).
Finally, many thanks for your great service!
Pardon my inability to comment with more substance. I thought the following phrase was rather interesting: “In mathematics 2 is often special—being the only even prime makes it unique.”
I believe, at least shallowly, this isn’t quite right. Since “even” by definition means “multiple of two”, you would naturally expect that there is only one prime “multiple of two” just as you would expect there is only one prime multiple of 3, and indeed exactly one prime multiple of p for any prime.
I don’t doubt there are many deeper reasons why two is special, but saying that it is the only even prime seems a bit vacuous to me.
Supposing we knew how to calculate the permanent of an $n \times n$ matrix efficiently, how could one translate this into breaking a one-way function, such as factoring $N = pq$ where $N = 2^{O(n)}$?
Is there a paper or reference that talks about this?
To factor using an efficient permanent algorithm you can do the following steps. (1) Build a circuit that takes p, q (as binary numbers) as input and outputs 1 or 0 depending on whether pq = N. (2) Convert this to 3SAT. (3) Convert the problem of “number of solutions to this 3SAT instance” to a permanent, which is easy because the permanent is #P-complete. (4) Fill in the bits of the factors p, q one by one, testing each bit for permanent-is-nonzero and choosing that setting for the given bit.
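To illustrate just step (4), here is a toy Python sketch; `count_with_prefix` is a hypothetical stand-in (plain brute force here) for the counting oracle that the circuit-to-3SAT-to-permanent chain would actually provide:

```python
def count_with_prefix(N, bits):
    """Count divisors p of N with 1 < p <= sqrt(N) whose low bits match `bits`.
    Stand-in only: the real reduction answers this via the permanent."""
    return sum(1 for p in range(2, int(N ** 0.5) + 1)
               if N % p == 0
               and all((p >> i) & 1 == b for i, b in enumerate(bits)))

def find_factor(N):
    bits = []
    if count_with_prefix(N, bits) == 0:
        return None  # no nontrivial divisor up to sqrt(N)
    while True:
        p = sum(b << i for i, b in enumerate(bits))
        if p > 1 and N % p == 0:
            return p
        # extend the known low bits: one of the two choices must keep
        # the count of consistent divisors nonzero
        bits = bits + [0] if count_with_prefix(N, bits + [0]) > 0 else bits + [1]

assert find_factor(91) in (7, 13)
```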
The permanent of a matrix can be defined recursively for any i as:
perm(A)= Sum_{j=1}^n a_{ij} * perm(A_{ij}) and perm([a_{11}])=a_{11}.
It’s clear that this formula cannot be made simpler, since each term contains an independent a_{ij} and all of the terms have positive signs unlike the similar recursive formula for the determinant, so there are no clever ways of cancelling out terms. To compute the permanent using this formula, you need at least 2^n time.
So there is just no chance any algorithm will beat 2^n time for computing the permanent.
> so there are no clever ways of cancelling out terms. To compute the permanent
> using this formula, you need at least 2^n time. So there is just no chance any
> algorithm will beat 2^n time for computing the permanent.
Wouldn’t that same logic establish an n^3 time bound for calculating the product of two n-by-n elementwise positive matrices?
This logic would imply that computing an individual entry of the product of two n by n matrices requires order of n time, since each entry is of the form:
Sum_{k=1}^n a_{ik}b_{kj}
and this cannot be simplified. But it doesn’t imply that you have to compute each entry separately, as the entries are not independent from one another. It is conceivable (without knowing anything about o(n^3) multiplication algorithms like the Strassen algorithm) that there is enough overlap between the matrix entries that one can beat the n^3 time.
@Craig “It is conceivable (without knowing anything about o(n^3) multiplication algorithms like the Strassen algorithm) that there is enough overlap between the matrix entries that one can beat the n^3 time.”
In Strassen’s technique the matrices involved have no overlap!
“In Strassen’s technique the matrices involved have no overlap!”
You misunderstood how I was using the word “overlap”.
Then please explain your ideas clearly. I don’t get it. If you (looking at a problem once) think that it is solvable in a few lines, then do you think the rest of the scientists are conmen trying to get government funding for their research, or that they are all stupid??
My idea is simple.
The permanent of a matrix can be defined recursively for any i as:
perm(A)= Sum_{j=1}^n a_{ij} * perm(A_{ij}) and perm([a_{11}])=a_{11}.
To compute the permanent using this formula, you need at least 2^n time, as it involves 2^n sub-matrices of A. This formula cannot be simplified. Therefore, it is impossible to compute the permanent of a matrix faster than 2^n time. QED
I don’t think the rest of the scientists are con-men or are stupid. I just think they are making this problem into a bigger problem than it is.
I don’t see how your recursion is the shortest way. In fact Ryser’s formula provides a shorter formula. Yours needs n! steps. Ryser’s needs just 2^n. I have another which needs (n^3)(2^(n log n)). Both are faster than your naive approach.
This recursion does not require n! steps. It requires at least 2^n steps.
You can use simple dynamic programming to get it to run in 2^n times a polynomial time, because there is overlap between the subproblems of computing the permanent of each A_{ij} at every step in the recursion.
Only if you don’t take advantage of this overlap will it require n! steps.
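Concretely, the memoized version of that recursion looks like this (a sketch; the set of unused columns is the DP state, so there are at most 2^n states, each costing O(n)):

```python
from functools import lru_cache

def perm_dp(A):
    n = len(A)

    @lru_cache(maxsize=None)
    def rec(i, cols):
        # permanent of the submatrix on rows i..n-1 and the column set `cols`
        if i == n:
            return 1
        return sum(A[i][j] * rec(i + 1, cols & ~(1 << j))
                   for j in range(n) if cols & (1 << j))

    return rec(0, (1 << n) - 1)

assert perm_dp([[1, 2], [3, 4]]) == 10
```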
@Craig: So got you. You have reduced an obvious n! algorithm to 2^n steps. How do you know 2^(1-\epsilon) for some epsilon > 0 is impossible?
Because the recursive formula requires at least 2^n recursive calls and the recursive formula cannot be simplified.
@Craig: You don’t get the point. Do you? “Your” recursion does not yield an epsilon > 0. How do you know there is nothing else you can come up with?
I don’t know what you mean by epsilon>0. You said, “2^(1-\epsilon)”. Did you mean 2^(n-\epsilon)?
Or in other words make precise the mathematical meaning of “cannot be simplified” in your statement from the a consistent set theory.
Which consistent set theory? The ones I know to be consistent are very weak.
“a” consistent set theory!
So Craig. I am sure you are in contention for the next Fields. Congrats! If only below-average Lipton and Regan get it! Even Gödel would be turning over in his grave by now if he heard your argument! Congrats again!
from,
I thought Fields medals are for people under 40. I’m not under 40.
I don’t think Lipton and Regan are under 40 either.
Dont worry Craig! They may change the rules for you!
So I assume that you agree with my proof now.
Sure Dr.Lipton is probably recommending you to the president’s advisory board as well.
I put the proof here: http://arxiv.org/PS_cache/cs/pdf/0305/0305035v26.pdf
@Craig, will you accept the structurally similar proof here:
@Craig Are you autistic? check with your doctor! serious!! you are missing even rudimentary higher level reasoning skills!!
Barak,
What are you trying to prove? That it cannot be simplified? What is the formula? And what do the x variables represent?
I cannot tell whether it can be simplified or not, as it may be possible to factor the x’s. Your formula may be very similar to the normal definition of the permanent with the n! terms, which can be simplified into the recursive formula that I gave.
Barak’s polynomial is from the ring $\mathbb{Z}[x_{1},x_{2},\dots,x_{13}]$! You can take the constants to be 1 instead of the numbers he has mentioned.
Why am I even trying to correct an unknown nut case! I am bailing out!
@Craig “What are you trying to prove? That it cannot be simplified? … I cannot tell whether it can be simplified or not.”
That is the point: maybe there is some tricky way to evaluate that polynomial, given values for the x1…x13, which requires significantly less than 2^13 operations. It is hard to tell just by looking at this one particular way of evaluating it which happens to call for more than 2^13 operations. The permanent is also a polynomial. You have a particular way of evaluating it which requires O(2^n) operations. And you don’t see any tricky way to evaluate the permanent more efficiently. But, as with the big polynomial I exhibited, just because you don’t see any good tricky way does not prove that none exists.
@from: spoilsport
Barak,
But it is easy to see that there is no way of simplifying the recursive formula for the permanent that I gave above just by looking at it. This is just high school algebra.
In the case of your polynomial, it is not easy to see that there is no way of simplifying it. I have no idea whether you can simplify it or not. It’s just too complicated for me to know.
This is the big difference between the two cases.
@Craig “But it is easy to see that there is no way of simplifying the recursive formula for the permanent that I gave above just by looking at it. This is just high school algebra.”
Maybe, as with the determinant, there is some other completely different way of decomposing it which can be evaluated much more efficiently.
That is impossible, because everything has a positive sign in front of it. Negative signs cannot help us here as they helped us with the determinant.
Here is what happens when you try my argument with the determinant:You have the same recursive formula,
det(A)= Sum_{j=1}^n (-1)^{i+j}*a_{ij} * det(A_{ij}) and det([a_{11}])=a_{11}.
Then, you transform the matrix A into a matrix with the first column all zeroes except the ith row, call it matrix B. (For simplicity, we won’t worry about pivoting rows.) Then the first equation in the recursive formula gets reduced to:
det(A)=det(B)=(-1)^{i+1}*a_{i1}*det(B_{i1}),
since all of the other determinants are zero. This process continues until n=1.
You cannot do this with the permanent of a matrix. None of the n permanents of the (n-1) x (n-1) submatrices can be zeroed out like in the case of the determinant, since there are no negative signs in the definition of a permanent. So the recursive formula for the permanent of a matrix is of the form a_i1*x_1+….+a_in*x_n. There is no way to simplify this expression, so the fastest way to compute it is to directly compute it.
My point is that the polynomial-time algorithm for computing the determinant of a matrix didn’t come from nowhere and is not magical. So your comment, “Maybe, as with the determinant, there is some other completely different way of decomposing it which can be evaluated much more efficiently.” has as much credibility as “maybe there is an elementary formula (with only radicals and arithmetic) for computing the roots of a degree 5 polynomial.” 🙂
Someone made the point that the Fibonacci recursive formula a_{n+1}=a_n + a_{n-1} a_0=0, a_1=1 also cannot be simplified, yet you don’t have to compute each of the n a_k’s in order to compute the final a_n.
My response is that this Fibonacci recursive formula can in fact be simplified to a_n=(phi^n-psi^n)/(phi-psi), where phi and psi are the roots to x^2=x+1. But the recursive formula for the permanent reduces to an expression with n! terms, which is not simpler than the recursive formula for permanent.
That person also claimed that my proof is not valid because it is not true when the entries in the matrix are in Z/2Z.
My response is that the reason why it is not true when the entries in the matrix are in Z/2Z is because you can simplify the permanent formula through cancellation of terms just like the formula for the determinant. When the entries in the matrix are in Z, you cannot do this, so my proof is still valid then.
@Barak Don’t even try. The nut case will not get it! See all his proofs on arxiv by typing Craig Feinstein.
➡ ⭐ 💡 ❗ fyi, a quick summary/overview of some striking new results in this area; hope you blog on these recent developments: arithmetic circuit complexity, Valiant’s VP=?VNP, recent advances/breakthroughs, new directions 😀 😎