Skip to content

Modulating the Permanent

December 21, 2014


When does it become hard to compute?

Muir

Thomas Muir coined the term “permanent” as a noun in his treatise on determinants in 1882. He took it from Augustin Cauchy’s distinction in 1815 between symmetric functions that alternate when rows of a matrix are interchanged versus ones that “stay permanent.” To emphasize that all terms of the permanent have positive sign, he modified the contemporary notation {\left| A \right|} for the determinant of a matrix {A} into

\displaystyle  \overset{+}{|} A \overset{+}{|}

for the permanent. Perhaps we should be glad that this notation did not become permanent.

Today Ken and I wish to highlight some interesting results on computing the permanent modulo some integer value.

Recall the permanent of a square matrix {A} is the function defined as follows by summing over all permutations of {\{1,\dots,n\}}, that is, over all members of the symmetric group {S_n}:

\displaystyle  \mathrm{perm}(A)=\sum_{\sigma\in S_n}\prod_{i=1}^n a_{i,\sigma(i)}.

It looks simpler than the determinant formula,

\displaystyle  \det(A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{i=1}^n a_{i,\sigma_i},

but soon acquired a reputation as being ‘strangely unfriendly’ compared to the determinant. We owe to Les Valiant the brilliant explanation that computing the permanent exactly, even restricted to matrices with all entries 0 or 1, is likely to be very hard, whereas the determinant is easy to compute.

Muir is known for rigorizing a lemma by Arthur Cayley involving another matrix polynomial that at first looks hard to compute but turns out to be easy. The Pffafian of a {2n \times 2n} matrix {A} is defined by

\displaystyle  \mathrm{pf}(A)= \frac{1}{2^n n!}\sum_{\sigma\in S_{2n}} \mathrm{sgn}(\sigma) \prod_{i=1}^n a_{\sigma(2i-1),\sigma(2i)}.

This vanishes unless {A} is skew-symmetric, meaning {A^T = -A}, whereupon Muir following Cayley proved the relation

\displaystyle  \mathrm{pf}^2(A) = \det(A).

The Pfaffian, and its use in the FKT algorithm for counting matchings in planar graphs, figures large in Valiant’s 2006 discovery that some generally-hard computations become easy modulo certain primes. All this is background to our query about which matrices might have easy permanent computations modulo which primes.

Permanent Computations

Herbert Ryser found an inclusion-exclusion method to compute the permanent in {O(2^{n}n)} operations:

\displaystyle  \mathrm{perm} (A) = (-1)^n \sum_{S\subseteq\{1,\dots,n\}} (-1)^{|S|} \prod_{i=1}^n \sum_{j\in S} a_{ij}

This was found in 1963 and still stands as the best exact method. Note that it is exponential but is still better than the naive method which would sum {n!} terms. David Glynn recently found a different formula giving the same order of performance.

Mark Jerrum, Alistair Sinclair, and Eric Vigoda found an approximation method for non-negative matrices that runs in probabilistic polynomial time. This award-winning result is based on delicate analysis of certain random walks. It fails for a matrix with even one negative term, since they show that such matrices can have permanents that are NP-hard to approximate.

Modulo 2, of course, the determinant and permanent of integer matrices are the same. It seems to be less well known that the permanent is easy modulo any power of 2. Modulo {2^k}, the known time is {O(n^{4k-3})}, and this too was proved in Valiant’s famous paper. However, subsequent to that paper, computing the permanent modulo any other integer was shown to be NP-hard under randomized reductions.

But wait. There are some special cases modulo {3} that we would like to point out that actually are easy to compute—that take only polynomial time.

Permanent Modulo 3

Grigory Kogan wrote a paper in FOCS 1996 that addresses this issue. It is titled “Computing Permanents over Fields of Characteristic 3: Where and Why It Becomes Difficult.” His main positive result was the following:

Theorem 1 Let {F} be a field of characteristic {3}. Let {U} be a {n \times n} matrix over {F} such that {UU^{T}=I_{n}}. Then {\mathrm{perm}(U)} can be computed in time {O(n^{4})}.

Further, he gave a slightly worse polynomial time for computing {\mathrm{perm}(U)} when {UU^{T} - I_{n}} is a matrix of rank one. When {UU^{T} - I_{n}} has rank two, however, computing the permanent mod 3 remains randomly hard for {\mathsf{NP}}, indeed complete for {\mathsf{Mod3P}}.

The details in the full version involve using mod 3 to regard certain matrices as skew-symmetric and hence work with their Pfaffians. The proof also uses extension fields in which {\sqrt{2} = \sqrt{-1}} exists, and the theorem holds over any such field.

We wonder what similar tricks might be available modulo other primes. One advantage of working modulo {p} is that the permanent becomes randomly self-reducible with no loss in numerical accuracy.

Possible Ideas

Let’s look at using this idea to answer questions about the permanent of the famous Hadamard matrices. As we have remarked before, the original ones of order {n = 2^k} were previously defined by Joseph Sylvester:

\displaystyle  \begin{array}{rcl}  H_1 &=& \begin{bmatrix} 1 \end{bmatrix},\\ H_2 &=& \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},\\ H_{2^k} &=& \begin{bmatrix} H_{2^{k-1}} & H_{2^{k-1}}\\ H_{2^{k-1}} & -H_{2^{k-1}}\end{bmatrix}. \end{array}

Jacques Hadamard gave the functional equation for any matrix bearing his name,

\displaystyle  HH^T = nI,

which for {n \geq 4} is known to require {n} to be a multiple of {4}. Put {n = 2^k m} with {m} odd and {k \geq 2}. Whether such matrices exist is known when {m^2 < 2^k}, when {n - 1} is a prime power, or when {k=2} and {2m - 1} is a prime power. The case {n = 668 = 4\cdot 167} avoids these since {667 = 23\cdot 29} and {333 = 9 \cdot 37}, and remains unknown.

If {n \equiv 1 \bmod{3}} then {H} satisfies Kogan’s theorem. If {n \equiv 2} then it appears we can still leverage the proof by working with {U = \sqrt{-1} H} instead. However—and this is by way of caution—if {n} is a multiple of 3 then every {H} is nilpotent mod 3, and it follows that {\mathrm{perm}(A) \equiv 0 \bmod{3}}. Nevertheless, all this means that we can compute the permanent of any Hadamard-type matrix modulo {3} in polynomial time.

A further open question is whether there exists a Hadamard matrix {H} with {\mathrm{perm}(H) = 0}. This is open even for Sylvester’s matrices. This is known to need {n \geq 32}. Of course, to vanish, it must vanish mod 3. We wonder how far gaining knowledge about behavior modulo 3 and other primes might help with these problems.

Some of these papers treat related questions for other matrices of entries {\pm 1}, perhaps with some {0}‘s and/or a normalizing constant factor. Greater reasons for interest in questions about permanents has come recently from boson sampling, for which we also recommend these online notes scribed by Ernesto Galvão of lectures given by Scott Aaronson in Rio de Janeiro a year ago. A main issue is whether the randomized equivalence of worst and average case for permanents mod {p} can be carried over to the kinds of real-or-complex matrices that arise.

Open Problems

Can we do better? Can we compute the permanent for a larger class of orthogonal matrices?

10 Comments leave one →
  1. December 21, 2014 10:26 pm

    Reblogged this on josephdung.

  2. December 22, 2014 7:58 am

    Reblogged this on countingcomplexity and commented:
    Modulo calculating the permanent is an interesting topic in counting complexity🙂

  3. December 23, 2014 1:24 am

    What needs to be done to improve the quadratic complexity lower bound for permanents?

  4. December 25, 2014 1:53 pm

    Question (informal)
    Are trapdoor sampling algorithms known?

    Definitions  A_n be a set parameterized by an integer {n > 0}; let f\colon A_n\to \mathbb{R} be a function outside of the complexity class P; and let D_{\mathbb{R}} be a distribution on \mathbb{R}.

    Example  Let A_n be a set of n\times n matrices having random entries [0,1], let f be the permanent, and let D_{\mathbb{R}} be the log-normal distribution.

    Further definition  We say that a distribution D_{A_n} is a P-trapdoor distribution on the set A_n iff it induces a distribution D^\prime_{\mathbb{R}} that is indistinguishable by m-sample Kolmogorov-Smirnov test from D_{\mathbb{R}}, for any sample-size m that is a polynomial in n.

    Question (formal)
    Are P-trapdoor sampling algorithms known?

    Welcome answers

    • Yes, and here’s an example so trivial that it’s not in the literature.

    • Yes, and here’s an example that unnaturally exploits a loophole in the definition of “P-trapdoor”.

    • Yes, and the literature provides the following natural-yet-nontrivial example.

    • Yes, but the literature provides only a non-constructive existence-proof.

    • No natural examples are known, and none are expected to be found, because any such algorithm would violate long-standing conjectures in complexity theory (e.g., such an algorithm would collapse the polynomial hierarchy).

    • The problem is open: no natural P-trapdoor sampling algorithms are known, yet neither is there any evident reason why such algorithms could not be constructed.

    Needless to say, the most hoped-for answer is …

    • No examples are in the literature, but here’s a natural-yet-nontrivial example that is based upon the wonderful ideas and powerful mathematical frameworks that have been provided by Dick Lipton’s and Ken Regan’s fine essays here on Gödel’s Lost Letter and P=NP

    …  especially if the answer is from a student!

    ————-

    Background  This question is grounded in the sampling problems set forth by Lund et al in their recent article “Boson sampling from a gaussian state” (2014), as contrasted with the sampling problems set forth in Mark Murcko’s survey “Accelerating drug discovery: the accurate prediction of potency” (2014).

    The former article asserts (broadly) “sampling is generically hard”; the latter survey asserts (broadly) “sampling is generically easy”. Hmmm … the themes of the Harrow-Kalai quantum debate are flowering perennially and bountifully!

    Falsifiable Prediction  The exhibition of a concrete P-trapdoor sampling algorithm would go far to productively reconcile the (Aram Harrow-style) Church of the Larger Hilbert Space with the (Gil Kalai-style) Free Masonry of Efficient Sampling-Simulation.

    Holiday Wishes  Sincere appreciation and best wishes are extended to Dick Lipton, Ken Regan, Aram Harrow, Gil Kalai — and dozens more colleague-commenters too — for rational respectful responsible discourse that, in 2014 as in so many previous years, has elevated, instructed, and amused us all.

    Like Oliver Twist, “please sirs, we want some more!”

    
    @article{cite-key, Title = {Boson Sampling
    from a Gaussian State}, Note = {on-line at
    \url{http://arxiv.org/abs/1305.4346}}.
    Author = {Lund, A. P. and Laing, A. and
    Rahimi-Keshari, S. and Rudolph, T. and
    O'Brien, J. L. and Ralph, T. C.}, Journal =
    {Phys. Rev. Lett.}, Month = {Sep}, Pages =
    {100502}, Volume = {113}, Year = {2014}, }
    
    @inproceedings{cite-key, Title =
    {Accelerating Drug Discovery: The Accurate
    Prediction of Potency}, Note = {on-line at
    \url{https://www.youtube.com/watch?v=
    wa59ZgflJD8}}. Author = {Mark Murcko},
    Booktitle = {Advances in Drug Discovery and
    Development}, Month = {24 September},
    Organization = {Chemical \&\ Engineering
    News (Virtual Symposium)}, Year = {2014}}
    
    • December 25, 2014 2:09 pm

      Dick and Ken, please correct residual LaTeX errors in the above (which is a draft of a question planned for TCS StackExchange).

      Hmmm … and it should have been stated explicitly that computing an individual P-trapdoor sample from the distribution D_{A_n} has to reside in the complexity class P.

      Very probably this natural-yet-tricky problem (as it seems to me) has remaining definitional defects … suggestions for improvements prior to TCS StackExchange posting will be gratefully received.

    • December 25, 2014 3:27 pm

      Further background information (that is central to the themes of the Harrow-Kalai quantum debate, as it seems to me) comes from engineering articles like Zvi Or-Bach’s 28nm: The Last Node of Moore’s Law (2014), whose much-discussed bottom-line is “transistors aren’t getting cheaper any more.”

      Open questions  Is the end of Moore’s Law for hardware being compensated by an acceleration of Moore’s Law for sensing, simulation, and sampling? And if so, what foundations in mathematics can we evolve, to support a natural understanding of this sustained acceleration?

      Concerns  The present-day literature of complexity theory (as I read it) scarcely begins to address these pressing strategic concerns, and one reason (as it seems to me) that its canonical definitions are relatively rigid. The resulting frameworks are better-suited to framing no-go postulates (per the Lund et al article above) than to pivoting enterprises and economies (per the Murcko survey above). Perhaps we can move toward a middle ground?

    • December 26, 2014 8:54 am

      Further references and reflections

      One of the most popular of all MathOverflow questions is What are your favorite instructional counterexamples?, and it is unsurprising that a search of Google Books for titles of the form “Counterexamples in \langle\mathsf{\text{any topic}}\rangle finds Counterexamples in Topology, […] Analysis, […] Several Complex Variables, […] Optimal Control Theory, […] Model Theory (etc.).

      Surveys like the appended articles by Goldreich and Vadhan, and by Bogdanov and Trevisan, make me long for a juicy compendium of Counterexamples in Sampling Complexity.

      Certainly there is plenty of material for such a survey … and sustained Moore’s Law advances in practical simulation-sampling capabilities are rapidly increasing our store of natural sampling problems (in the most practical sense of “natural”).

      For the reasons set forth in Dick and Ken’s essay, a survey of sampling-related permanent-problems could provide one of the most rich and subtle chapters in any such compendium.

      ————-
      Readings

      
      @article{Bogdanov:2006:AC:1295194.1295195,
      Author = {Bogdanov, Andrej and Trevisan,
      Luca}, Journal = {Found. Trends Theor.
      Comput. Sci.}, Month = oct, Number = {1},
      Pages = {1--106}, Title = {Average-case
      Complexity}, Volume = {2}, Year = {2006}}
      
      @article{Goldreich:2007:SIW:1341675.
      1341678, Author = {Goldreich, Oded and
      Vadhan, Salil}, Journal = {Comput.
      Complex.}, Month = dec, Number = {4},
      Pages = {325--330}, Title = {Special Issue
      On Worst-case Versus Average-case
      Complexity: Editors' Foreword}, Volume =
      {16}, Year = {2007}}
      
    • January 11, 2015 8:24 am

      Dear John, at the end I am not sure what your question is. Do you want a distribution that one cannot generate in P but one cannot distinguish from a distribution generated in P on a polynomial number of samples?

  5. Grigory Kogan permalink
    May 27, 2015 11:51 pm

    Hello to everyone participating in this discussion! I’m happy to know that my results on computing permanents over Characteristic 3 are not yet forgotten. The problem is indeed very interesting as Characterisctic 3 is the least nonzero characteristic where the permanent differs from the determinant and that’s why, seemingly, it possesses so many useful properties. It concerns not only unitary matrices, but, say, the matrices {1/(x[i]-x[j])} and many others. But even a more intriguing fact is an analogy which we can find between the permanent modulo 3 and the hamiltonian modulo 2 (the hamiltonian of an nXn-matrix is the sum of the products of its entries corresponding to a Hamiltonian (having only one cycle) permutation over all the Hamiltonian permutations of the set {1,…,n}). If A is a unitary matrix (i.e. rank(AA^T – I) = 0) over a field of Characteristic 2 then ham(A) = (det(A + I[\1]))^2 where |[\1] is the identity matrix with its entry of indexes 1,1 replaced by 0. And, in the same time, if, in such a field, rank(AA^T – I) = 1 then the problem becomes as hard as in the generic case, i.e. #2P-comlete!

  6. Grigory Kogan permalink
    May 27, 2015 11:54 pm

    Hello to everyone participating in this discussion! I’m happy to know that my results on computing permanents over Characteristic 3 are not yet forgotten. The problem is indeed very interesting as Characterisctic 3 is the least nonzero characteristic where the permanent differs from the determinant and that’s why, seemingly, it possesses so many useful properties. It concerns not only unitary matrices, but, say, the matrices {1/(x[i]-x[j])} and many others. But even a more intriguing fact is an analogy which we can find between the permanent modulo 3 and the hamiltonian modulo 2 (the hamiltonian of an nXn-matrix is the sum of the products of its entries corresponding to a Hamiltonian (having only one cycle) permutation over all the Hamiltonian permutations of the set {1,…,n}). If A is a unitary matrix (i.e. rank(AA^T – I) = 0) over a field of Characteristic 2 then ham(A) = (det(A + I[\1]))^2 where I[\1] is the identity matrix with its entry of indexes 1,1 replaced by 0. And, in the same time, if, in such a field, rank(AA^T – I) = 1 then the problem becomes as hard as in the generic case, i.e. #2P-comlete!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s