LSE source: “Calculus on Clay?”

Norman Biggs is the author of the wonderful book *Algebraic Graph Theory*. Both Ken and I read it long ago, and both of us have it out now because of its relevance to Hao Huang’s beautiful short proof of the Boolean Sensitivity Conjecture.

Today we wish to ask, *What are your top five favorite books on mathematics and theory for summer reading?*

There’s an aporia in that question. A working definition of aporia is: “a self-contradiction that isn’t.” The point is that books for summer reading should be new, so how would you already know which are your favorites? Well, we are thinking of books that are so rich you can always find new things in them—and that also played formative roles earlier in our careers.

Ken knew Biggs during his first year at Oxford when Biggs was visiting there from London. He took part in a weekly sitting-room seminar organized by Peter Neumann. Biggs’s book was a central reference for Ken’s undergraduate senior thesis at Princeton, and both he and Ken presented material based on it.

Here are my votes for all-time best books in mathematics and in computer science theory.

*Algebraic Graph Theory*, by Norman Biggs. A wonderful book. First appeared in 1974.

*An Introduction to Probability Theory and Its Applications, Vol. 1*, by William Feller. This is the book I used to learn probability theory.

*An Introduction to the Theory of Numbers*, by Godfrey Hardy and Edward Wright. Now updated by Roger Heath-Brown and Joseph Silverman, with a foreword by Andrew Wiles.

*Elements of Number Theory*, by Ivan Vinogradov. Another small book that is loaded with ideas.

*The Art of Counting*, by Paul Erdős and Joel Spencer. This book changed my life. Today the book is of course *The Probabilistic Method*, by Noga Alon and Joel Spencer.

Ken reaches back to his teen years but it’s still the same span of years as my list. Here he tells it:

All books by Martin Gardner—in particular, the books of collections of his “Mathematical Games” columns in *Scientific American*. Here is an overview.

*Scarne on Dice* and *Scarne on Cards*. Originally it was neither of these books—nor John Scarne’s *Complete Guide to Gambling*—but a different book in which both Scarne and Gardner figured prominently. Alas I, Ken, cannot trace it. That’s what I used to learn probability theory.

*Spectra of Graphs*, by Dragoš Cvetković, Michael Doob, and Horst Sachs. I could put Biggs’s book here, but this is the one that got me on to the whole subject just before my senior year at Princeton. It was fresh out in 1980—I recall the tactile sensation of the dark green spanking new cover in the Fine Hall Library’s copy. A great book with pictures and algebra.

*Ideals, Varieties, and Algorithms*, by David Cox, John Little, and Donal O’Shea. Fast forward to 1997. Having realized that techniques from algebraic geometry could surmount the “Natural Proofs” barrier (see also GCT), I went whole-hog after it. See “Manic Monomials” in this post for one thing that tripped it up. The book remains incredibly stimulating. It has a sequel, *Using Algebraic Geometry*.

*Quantum Computation and Quantum Information* by Michael Nielsen and Isaac Chuang. As with Hardy and Wright, it has its own Wikipedia page. Dick and I could be said to be nominating a competitor, but Nielsen & Chuang is really in a class by itself for the sheer richness and writing style. One odd mark of its influence: In 2006, when I reacted to the sensational and frightening accusations of cheating at the world championship match, my first thought was to apply distributional distance measures of the kind used in its later chapters. Among such measures is (quantum) fidelity, and although I focused more on Jensen-Shannon divergence before deciding on simpler stuff, my chess research website retains “fidelity” in its name as part of a multi-way reference to FIDE, faith, and playing in good faith.

What books most influenced you? What are your votes for the best books that might influence others?

Cropped from Emory homepage

Hao Huang is a mathematician and computer scientist at Emory University. Last week he released a paper of only six pages that solves the Boolean Sensitivity Conjecture, which goes back at least to a 1992 paper by Noam Nisan and Mario Szegedy.

Today we discuss his brilliant proof and what it means for sensitivity of the *tools* one employs.

Several of our blogging friends have covered this news in posts already, and Ryan O’Donnell even summarized the proof in one tweet. Scott Aaronson’s thread includes a comment by Huang on how he came by his proof.

We will try to draw implications for the related matter of how *you* might come by proofs of *other* conjectures. We have previously discussed the possibility of overlooking short solutions to major problems. Here we will discuss how to *find* them.

To get a flavor of what Huang proved, consider the graph of an ordinary cube:

The question is, *can you color 5 vertices red so that no red node has 3 red neighbors?* Your first impulse might be to color 4 nodes red according to parity so that none has a red neighbor, per below left:

But then any 5th node will have 3 red neighbors. Another “greedy” idea is to pack a subgraph of the allowed degree 2 into half the cube, as at right. Any 5th node will again create a degree-3 vertex in the subgraph induced by the red nodes.

The answer is that actually one can pack 6 nodes that induce a simple cycle:
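This little puzzle is small enough to check by machine. Here is a brute-force sketch in Python (our own illustration, not from Huang's paper) confirming that 6 is the largest red set the 3-cube can hold with induced degree at most 2:

```python
from itertools import combinations

def max_induced_degree(nodes):
    """Max degree of the subgraph of the 3-cube induced by `nodes`;
    vertices are the ints 0..7, and cube edges join ints differing in one bit."""
    return max((sum(1 for v in nodes if bin(u ^ v).count("1") == 1)
                for u in nodes), default=0)

# Largest red set whose induced subgraph has maximum degree <= 2.
best = max(len(s) for r in range(9) for s in combinations(range(8), r)
           if max_induced_degree(s) <= 2)
print(best)  # 6
```

Any 7-node subset misses only one vertex, and the antipode of the missing vertex keeps all 3 of its neighbors, which is why the search tops out at 6.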

Now let’s up the dimension by one—that is, take the 4-dimensional hypercube, whose 16 nodes form an “outer” and an “inner” 3-cube joined in parallel. How many nodes can we color red and keep the induced degree 2?

Again the parity trick gives us degree 0 with 8 nodes, but then we can’t add a 9th. We can greedily try to pack the outer cube with our 6-node solution, but then—perhaps surprisingly—we can add only 2 more red nodes from the inner cube. So we can only do 5 from the outer cube. We can get 9 overall by:

The fact that one red node is isolated seems to give room to improve, but there is no way to make 10.

The calculations have left an interesting jump from degree 0 with eight red nodes and degree 2 with nine. How about degree 1? Can we do that with 9 nodes? We can pack four disjoint edges but then there is nowhere to stick an isolated node.

So for 9 nodes, which is $2^{4-1}+1$, the best we can do is degree 2, which is $\sqrt{4}$. This is what Huang proved:

Theorem 1 *Every subgraph induced by $2^{n-1}+1$ nodes of the $n$-dimensional hypercube graph has a node of degree at least $\sqrt{n}$.*

This is completely tight. When $n$ is a perfect square there is a way to achieve $\sqrt{n}$ as the maximum degree (shown here). Otherwise the least integer above $\sqrt{n}$ is best. Thus every subgraph of the $5$-cube induced by 17 nodes has a node with three neighbors, but you can go as high as 257 nodes in the $9$-cube while keeping the maximum degree to 3.

We will mention the relation to Boolean sensitivity only briefly. The nodes of the $n$-cube correspond to truth assignments in $\{0,1\}^n$. Since every red node has $n$ neighbors in the cube but (in the extremal examples) only about $\sqrt{n}$ red neighbors, the color function is highly sensitive to bitflips. But every flip also changes the parity of the assignment. Hence the *exclusive-or* of the color function with the parity function has *low* sensitivity.

But not too low: Huang proved it is at least $\sqrt{n}$. That was enough to prove the conjecture. I’ve cut two sections on Boolean sensitivity from this post’s original draft—let’s just say the connection to the $n$-cube and graph degree was known since this 1992 paper. Here we’ll focus on what it took to prove this theorem.

From my undergrad days I’ve kept an interest in spectral graph theory. One of the basic facts is that the maximum degree $\Delta$ of a graph is always at least as great as the largest eigenvalue of its adjacency matrix $A$. For a $d$-regular graph they are equal. Huang’s first trick is to note that the classic proof of this also allows $\pm 1$ values on edges:

Lemma 2 *Let $B$ be a symmetric matrix obtained from the adjacency matrix $A$ of a graph $G$ by multiplying some entries by $-1$, and let $\lambda$ be any of its eigenvalues. Then $\Delta(G) \geq |\lambda|$.*

*Proof:* Choose an eigenvector $v$ such that $Bv = \lambda v$ and take an index $i$ that maximizes $|v_i|$. Then

$$|\lambda|\,|v_i| = \Big|\sum_j B_{ij} v_j\Big| \leq \sum_{j:\, B_{ij} \neq 0} |v_j| \leq \Delta(G)\,|v_i|.$$

Dividing out $|v_i|$ gives the lemma.

So now what we want to do is find conditions that force $|\lambda| \geq \sqrt{n}$ when the graph is an $m$-vertex induced subgraph $H$ of the $n$-cube with $m = 2^{n-1}+1$. The trick that Huang realized is that he could do this by making the signed matrix of $H$ sit inside a $2^n \times 2^n$ matrix $B$ with at least $2^{n-1}$ eigenvalues of $\sqrt{n}$.

To see how, form $B'$ by knocking out the last row and column of $B$. Since $B$ and $B'$ are both real and symmetric, their eigenvalues are real, so we can order them $\lambda_1 \geq \cdots \geq \lambda_N$ and $\mu_1 \geq \cdots \geq \mu_{N-1}$ in nonincreasing order, where $N = 2^n$. The basic fact is that they always *interlace*:

$$\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \mu_2 \geq \cdots \geq \mu_{N-1} \geq \lambda_N.$$

See this for a one-page proof. The neat point is that you can repeat this: if you get $B''$ by knocking out another row and corresponding column, and $\nu_1 \geq \nu_2 \geq \cdots$ are its eigenvalues in order, then

$$\mu_1 \geq \nu_1 \geq \mu_2.$$

It follows that $\nu_1 \geq \lambda_3$. If you do this again, you get a matrix whose leading eigenvalue is still at least as big as $\lambda_4$. Do it $2^{n-1}-1$ times inside $B$, and you’re still above $\lambda_{2^{n-1}}$, which we just said we will arrange to be $\sqrt{n}$. Thus if we knock out the white nodes, we will get the graph on the red nodes with signed adjacency matrix $B_H$ and conclude:

$$\lambda_1(B_H) \geq \lambda_{2^{n-1}}(B) = \sqrt{n}.$$

Plugging into the lemma gives:

$$\Delta(H) \geq \lambda_1(B_H) \geq \sqrt{n}.$$

(In fact, as also noted on Scott’s blog, this case of interlacing can be inferred from simpler reasoning—but our point is that the interlacing theorem was in Huang’s bag of tricks.)
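For concreteness, here is a quick numerical check of Cauchy interlacing on a random symmetric matrix (our own illustration of the standard theorem, using NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = (A + A.T) / 2                                    # random real symmetric matrix
lam = np.sort(np.linalg.eigvalsh(A))[::-1]           # eigenvalues, nonincreasing
mu = np.sort(np.linalg.eigvalsh(A[:-1, :-1]))[::-1]  # knock out last row and column

# Cauchy interlacing: lam[0] >= mu[0] >= lam[1] >= mu[1] >= ...
assert all(lam[i] >= mu[i] - 1e-9 and mu[i] >= lam[i + 1] - 1e-9
           for i in range(len(mu)))
print("interlacing verified")
```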

Finally, how do we lay hands on $B$? We want a matrix of trace zero such that $B^2 = nI$. Then all its eigenvalues are $\sqrt{n}$ and $-\sqrt{n}$. They come in equal numbers because they sum to the trace, which is zero. So we will have $2^{n-1}$ eigenvalues of $\sqrt{n}$, as needed. And we would want $B$ to be the adjacency matrix of the $n$-cube, but that doesn’t work: each off-diagonal entry of its square counts all paths of length 2 from node $i$ to node $j$, and that number can be nonzero.

This is where the trick of putting $-1$ on edges comes in, and we can explain it in a way familiar from quantum. We arrange that every 4-cycle of the $n$-cube has exactly one edge with $-1$. Then the pairs of paths from one corner to the opposite corner will always *cancel*, leaving $(B^2)_{ij} = 0$ whenever $i \neq j$. And $(B^2)_{ii} = n$ because there are $n$ ways to go out and come back along the same edge, always contributing $(+1)^2$ or $(-1)^2 = 1$ either way. Huang defines the needed labeling explicitly by the recursion:

$$B_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad B_n = \begin{pmatrix} B_{n-1} & I \\ I & -B_{n-1} \end{pmatrix}.$$

This puts a $-1$ sign on exactly one-fourth of the entries in the needed way. OK, we changed Huang’s subscripts for consistency with “$B$” above and also to note that the basis could be the $1 \times 1$ matrix $B_0 = (0)$. Anyway, he verifies $B_n^2 = nI$ directly by simple algebra and induction. That’s it—that’s the proof.
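The recursion is also easy to verify by machine. Here is a short NumPy sketch (our own check of Huang's recursion for the signed matrix):

```python
import numpy as np

def signed_cube_matrix(n):
    """Huang's signed adjacency matrix of the n-cube, built by the
    recursion B_1 = [[0,1],[1,0]], B_n = [[B_{n-1}, I], [I, -B_{n-1}]]."""
    B = np.array([[0, 1], [1, 0]])
    for _ in range(n - 1):
        I = np.eye(len(B), dtype=int)
        B = np.block([[B, I], [I, -B]])
    return B

# Verify B_n^2 = n I for small n.
for n in range(1, 7):
    B = signed_cube_matrix(n)
    assert np.array_equal(B @ B, n * np.eye(2 ** n, dtype=int))
print("B_n^2 = nI holds for n = 1..6")
```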

Why was it hard to spot? Dick and I believe it was the $-1$ trick. In the 1980s, I thought about ways to convert undirected graphs into directed ones by putting arrows on the edges, but not signs. The chance of thinking of it maybe rises with knowing quantum ideas such as interference and amplification. Now we can see, OK, $B_1$ is the quantum NOT gate and the recursion treats signs in similar fashion to the recursion defining Hadamard matrices. The matrix $\frac{1}{\sqrt{n}}B_n$ is unitary, so it defines a quantum operator. This all goes to our main point about having tools at one’s command—the more tools, the better.

Huang’s theorem still leaves a gap between a quadratic lower bound and his 4th-power upper bound (my longer draft lays this out). Can this gap be closed? In discussing this, Huang notes that his spectral methods need not be confined to sub-matrices of the -cube, and our thoughts of involving quantum are similar. Can quantum tools improve the results even further?


*Can theory help?*

source; art by Bill Hennessy

John Roberts is the Chief Justice of the United States.

Today I will discuss the recent Supreme Court decision on gerrymandering.

The 5-4 decision in Rucho v. Common Cause takes the courts out of deciding whether redistricting was done fairly. Roberts, penning the majority opinion, felt that it was hard, if not impossible, for courts to determine whether districts were reasonably drawn. That is, whether partisan motives dominated when they were created.

I will explain what gerrymandering is, and how computational methods may play a role. I must add a takeaway:

The current view of computational methods to avoid gerrymandering may be based on incorrect assumptions.

You probably know that gerrymandering is used, negatively, to describe creating voting districts that do not reflect the voters’ will. The term was coined as part of an attack on the then Governor of Massachusetts in 1812. The shape of one particularly contrived district looked like a salamander. Since his name was Elbridge Gerry, it became a *gerrymander*. He was a Democratic-Republican and was not re-elected.

Here is a figure from our friends at Wikipedia that presents examples of gerrymandering.

By redistricting in the above, one can achieve anything from all districts won by the majority party to most districts won by the minority party. These examples show how the party that controls the districting can control the outcome.

Well that is an overstatement. They can control the believed leanings of the voters in the districts. In the above example, they can control how yellow a district is. Real life is complicated by other factors:

- A candidate may win because they are just more popular. That is, a green candidate could still win in a yellow district.
- A candidate may win because of random fluctuations. If the voter margin is small enough, random fluctuations could change the “expected” outcome.

The latter is a danger for gerrymanderers. This site on gerrymandering says:

The trick is not to spread your voters out so much that districts become vulnerable to flipping to the other party in the normal give and take of electoral politics.

Since Roberts is not our intended audience we will use mathematical definitions. I am sure Roberts and the rest of the Supreme Court justices are smart, but we have our own methods. We do not use legal jargon such as “prima facie” and suspect they do not use math jargon like “prime” numbers.

So let the yellow party have fraction $p$ of the voters. Suppose the voters have to be divided into $k$ districts of the same size. A division is just a vector $(p_1, \ldots, p_k)$ of the yellow fractions in each district, so that

$$\frac{p_1 + p_2 + \cdots + p_k}{k} = p,$$

and each $p_i$ is non-negative. For such a vector the number that yellow *wins* is the number of indices $i$ so that

$$p_i \geq \frac{1}{2}.$$

Note, we assume that $\frac{1}{2}$ of the voters makes a district a win. We can change this to strictly larger than $\frac{1}{2}$, but will leave it for now.

Definition 1 *Define $W_{\max}(p,k)$ to be the maximum over all divisions of the number of wins; and define $W_{\min}(p,k)$ to be the minimum number of wins.*

Note, we do not care who is doing the redistricting. Nor do we care about the geometry: only the vector of district fractions matters.

What can we say about these functions? Here are some simple observations.

If $p \geq \frac{1}{2}$, then $W_{\max}(p,k) = k$. Just place fraction $p$ of the voters in each district.

If $p \geq \frac{1}{2}$, then $W_{\min}(p,k) \geq 1$. No matter how the districts are drawn there must be at least one where yellow has a majority.

An advantage of these functions is that now we can discuss growth rates, not just present examples. A strength of theory is that we have replaced statements like “this algorithm is fast” by formulas for their running time. Another advantage is that these functions are *independent of geometry*. Previously I thought the dominating issue was how regions looked. Now I believe the issue is how close one gets to the best and worst cases, $W_{\max}$ and $W_{\min}$.

Lemma 2 *Suppose that $p > \frac{1}{2}$. Then $W_{\min}(p,k) = \lceil k(2p-1) \rceil$.*

*Proof:* Let $j$ be the number of districts that green wins. Let’s create the arrangement that makes green win as many districts as possible. If $j$ districts are mostly green, each with yellow fraction just under $\frac{1}{2}$, then that takes just over $\frac{j}{2k}$ of all the voters from green’s share of $1-p$. This implies that

$$\frac{j}{2k} \leq 1 - p.$$

So it follows that the number of districts yellow wins is

$$k - j \geq k - 2k(1-p) = k(2p-1).$$

This proves that $W_{\min}(p,k)$ is equal to $\lceil k(2p-1) \rceil$.

It may help to set $p = \frac{1}{2} + \epsilon$ where $\epsilon > 0$. Let’s agree to ignore the rounding off and delete the floor and ceiling functions. Then

$$k(2p-1) = k\left(2\left(\tfrac{1}{2}+\epsilon\right) - 1\right) = 2\epsilon k,$$

and so yellow wins at least $2\epsilon k$ districts no matter how green draws them. Thus for $\epsilon = 0.1$ we get that

$$W_{\min}(p,k) = 0.2\,k.$$

This says that, independent of any geometry, if yellow has a ten percent majority ($\epsilon = 0.1$), then in the best case, if yellow sets the districts, they could win all of them; and if green sets the districts, the worst yellow can get is twenty percent of the districts.
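The win counts are easy to explore with a discrete model. Here is a sketch in Python (our own illustration, with integer voters, so the results differ from the continuous formulas by rounding effects):

```python
from math import ceil

def win_range(Y, k, m):
    """Given Y yellow voters divided into k districts of m voters each
    (yellow wins a district with at least m/2 of its votes), return the
    (min, max) number of districts yellow can be made to win."""
    need = ceil(m / 2)                  # votes needed to win a district
    # Max wins: fill as many districts as possible with exactly `need`.
    max_wins = min(k, Y // need)
    # Min wins: pack losing districts just below `need`; the smallest j
    # such that j full-yellow districts absorb the surplus is forced.
    min_wins = next(j for j in range(k + 1)
                    if Y - j * m <= (k - j) * (need - 1))
    return min_wins, max_wins

print(win_range(600, 10, 100))  # (3, 10): a 60-40 split of 1000 voters
```

With 60 percent of 1000 voters, yellow can be drawn into winning all ten districts or as few as three; the continuous analysis gives two, and the gap is the rounding in the finite model.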

There is long-term and continuing interest in algorithms that automate redistricting. The hope is that automated systems will be able to create districts that are fair. A trouble with this research is that there is no universal notion of what makes districts fair. The mantra is:

A redistricting is fair if and only if the districts collectively satisfy some geometric criterion.

An explicit statement of such a criterion, from a site on such algorithms, is:

The best district map is the one where people have the lowest average distance to the center of their district.

Of course the name “gerrymandering” came from how districts looked. Somehow this enshrined the notion that districts must look right. I feel this could be wrong.

Here is a quote from a recent paper on an algorithm for redistricting.

We propose a method for redistricting, decomposing a geographical area into subareas, called districts, so that the populations of the districts are as close as possible and the districts are compact and contiguous. Each district is the intersection of a polygon with the geographical area. The polygons are convex and the average number of sides per polygon is less than six.

The authors are Philip Klein and Neal Young, who are well known researchers on various aspects of algorithms. They do interesting work and their paper is interesting, but I do not get the assumption that geometry is the key.

I think that we need to go beyond geometry to understand and avoid gerrymandering. The connection between geometry and fairness is driven—I believe—by tradition. Voters in the same district probably want to be near each other. In the past, being near each other was probably important, since travel was so difficult. Perhaps today location is less of an issue than it was before cars, phones, cell phones, internet access, and email. Perhaps districts can be fair and yet not look good.

An analogy to cake cutting may occur to you. Recall in the cake cutting problem success is *not* measured in how the pieces of the cake look. It is only measured by whether the parties cutting the cake are happy. Is there some way to push this analogy? I just came across a paper using the cake cutting method: *A Partisan Districting Protocol With Provably Nonpartisan Outcomes* by Wesley Pegden, Ariel Procaccia, and Dingli Yu. More in the future.

Can algorithmic methods help? Is geometry the fundamental issue?


[ Composite of various sources ]

Michael Griffin, Ken Ono, Larry Rolen, and Don Zagier (GORZ) have recently published a paper on an old approach to the famous Riemann Hypothesis (RH).

Today we will discuss their work and its connection to P=NP.

Their paper is titled, “Jensen polynomials for the Riemann zeta function and other sequences.” The final version is in the Proceedings of the National Academy of Sciences (PNAS).

The RH is still open. We are not aware of any update on the status of Michael Atiyah’s claim to have solved it since this note on an AMS blog and our own discussion of his papers’ contents.

Recall the RH is a central conjecture in number theory, and it has the following properties:

- If true, it would greatly enhance our understanding of the structure of the primes.
- It has resisted attacks for 160 years and counting.
- There are an immense number of statements equivalent to the RH.

See, for example, the survey paper, “The Riemann Hypothesis,” by Brian Conrey.

Complexity theory has the P=NP question; number theory has the RH. Both seem to be beyond reach, and both are fundamental questions. The recent work of GORZ has created buzz among number theorists—perhaps they are on the verge of a breakthrough? Is there hope that we might see progress on P=NP? Or must we wait 160 years?

The buzz is reflected in a May 28 story in the online journal *LiveScience*. It quotes Enrico Bombieri, who wrote the official Clay Prize description of RH, as saying:

“Although this remains far away from proving the Riemann hypothesis, it is a big step forward. There is no doubt that this paper will inspire further fundamental work in other areas of number theory as well as in mathematical physics.”

Bombieri wrote an accompanying paper in PNAS, in which the above quoted sentences also appear. We will try to explain what the excitement is about.

RH is equivalent to the hyperbolicity of the Jensen polynomials for the Riemann zeta function. Note: a real polynomial is *hyperbolic* if all its roots are real—a fancy name for a simple concept.

There is an analytic function, a recentered version of Riemann’s $\xi$ function often written $\Xi$, whose roots are all real exactly when the RH holds.

Almost a hundred years ago, Johan Jensen and George Pólya created an approach to the RH based on $\Xi$. Rather than prove $\Xi$ has only real roots, they showed it is enough to show that a family of polynomials, built from the Taylor coefficients of $\Xi$, all are hyperbolic. The point is that polynomials are “simpler” than analytic functions—at least that is the hope.

What GORZ prove is this:

Theorem 1 *For each degree $d$ and almost all shifts $n$, the Jensen polynomial of degree $d$ and shift $n$ is hyperbolic.*

Of course, “almost all” means that there are at most a finite number of $n$ for which the degree-$d$ polynomial is not hyperbolic. They also show that for fixed degree $d$, the Jensen polynomials, suitably renormalized, converge to the Hermite polynomials $H_d(X)$, which have only real roots.
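Hyperbolicity is easy to test numerically for any one polynomial. A small sketch in Python (our own illustration, using NumPy's root finder; `is_hyperbolic` is a name we made up):

```python
import numpy as np

def is_hyperbolic(coeffs, tol=1e-9):
    """Check whether a real polynomial (coefficients listed from the
    highest degree down) has only real roots, up to numerical tolerance."""
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots.imag) <= tol * (1 + np.abs(roots.real))))

# Hermite polynomial H_3(x) = 8x^3 - 12x: hyperbolic.
print(is_hyperbolic([8, 0, -12, 0]))   # True
# x^2 + 1 has roots +i and -i: not hyperbolic.
print(is_hyperbolic([1, 0, 1]))        # False
```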

This is explained further in a talk by Ono, which has this table showing how the polynomials converge:

Note: they use different notation for the polynomials.

There are many equivalent formulations of the RH. While all are equivalent, some are more equivalent than others. Some seem to be a more plausible path toward a resolution of the RH. Of course to date none have worked.

There is a way to distinguish equivalent formulations of the RH. Suppose that a formulation $S(t)$ depends on some parameter $t$. Suppose that as $t$ increases the statement $S(t)$ becomes a weaker version of the RH.

Then let us informally say that the formulation has the “Approximation Property” (AP). The point is that progress in proving cases of $S(t)$—even as partial progress—is exciting. But if the equivalence only holds at one value of $t$, with no connections for higher $t$, then the partial progress could be seen as morally useful—but it is not really mathematically useful.

There are equivalences of the RH that have the AP and some that seem not to have it. The natural approximation for the RH concerns zero-free regions: the RH states that there is no zero of the Riemann zeta function with real part above $\frac{1}{2}$. The weaker statement that there is no zero with real part above $\sigma$ is unknown for every $\sigma$ in the open interval $(\frac{1}{2}, 1)$. This can be used to get a property with the AP.

For example, this 2003 paper by Luis Baez-Duarte, titled “A new necessary and sufficient condition for the Riemann hypothesis,” has the AP. He notes, in our terminology, that his property is an AP.

Does the approach of GORZ have the AP? The issue is that while we get a universal statement in terms of the degree $d$, the “all but finitely many” condition on $n$ works against its being an approximation—what if the finite sets of exceptions grow in terms of $d$ in ways that offset the purpose behind the equivalence?


*NY Times article on the paper*

LinkedIn source

Lucy Lu Wang is the lead author of a paper released this Friday on gender parity in computer science. The paper is from the Allen Institute for Artificial Intelligence. The authors are Wang, Gabriel Stanovsky, Luca Weihs, and Oren Etzioni. We will call them WSWE for short.

Today we will discuss some of the issues this study raises.

The paper was highlighted by the New York Times in an article titled, “The Gender Gap in Computer Science Research Won’t Close for 100 Years.” The news article begins with an equally sobering statement of this conclusion:

Women will not reach parity with men in writing published computer science research in this century if current trends hold, according to a study released on Friday.

We are for gender-neutral opportunities and have always promoted this—see our earlier discussions here and here. We are for doing a better job in supporting women in computer science. The study by WSWE is an important paper that helps frame the problem. Quoting them:

The field has made more of an effort to reach a more balanced gender status. But the data seems to show that even with all the progress, we are still not making the change fast enough.

I suggest that you might wish to read the paper. Unfortunately there are many papers with similar conclusions—see this by Natalie Schluter, for example.

Here are some points of the paper by WSWE:

*There is a measurable gap*. No one would, I believe, doubt this. But it is important to see that it is measurable.

*The gap is shrinking, but slowly*. Again this seems correct, but whether it is shrinking in all relevant measures of publication weight is still an issue.

*The predictions*. Perhaps it will not be closed for over a century.

*Modern technology allows such a study*. This is one aspect that we can all applaud. WSWE used automated tools that allowed this study to search millions of papers.

WSWE filtered a corpus of **2.87 million** papers tagged as in computer science. The volume constrained their approaches to handling several basic issues.

How to tell the gender of an author? They use first names and try to detect gender from that alone. This is not easy. Not only can differently-gendered names in different nations or language groups have the same Romanized form, many names apply to both genders within those groups. The names Taylor and Kelley are perfect examples of the latter.

WSWE used a statistical weighting method. So “Taylor,” for example, would be weighted as 55 percent female, 45 percent male. The weightings come from a large database called *Gender API* compiled from government and social agencies not directly related to computer science.
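The weighting idea can be sketched in a few lines. The probabilities below are made up for illustration—the study uses the Gender API database, which we do not reproduce:

```python
# Hypothetical name -> P(female) weights, made up for illustration;
# the real study draws these from the Gender API database.
P_FEMALE = {"taylor": 0.55, "kelley": 0.55, "maria": 0.98, "john": 0.01}

def expected_female_share(first_names):
    """Estimate the expected fraction of female authors by averaging
    per-name probabilities; names missing from the table are skipped."""
    probs = [P_FEMALE[n.lower()] for n in first_names if n.lower() in P_FEMALE]
    return sum(probs) / len(probs) if probs else None

print(round(expected_female_share(["Taylor", "John", "Maria"]), 3))  # 0.513
```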

Another issue concerns the prediction part of their paper. They attempt to extrapolate and guess when there will be parity between female and male authorship.

As with all predictions, this is not easy. It is my main complaint with this and other papers on the gender-gap issue. They predict that parity will not be reached until 2167, in 168 years. An earlier study puts the parity point at 280 years away.

I believe that a major issue is hiring by computer science departments and other institutions. A major CS department just hired assistant professors, of which were male. This is a problem.

Should studies on the gender gap count all papers? Perhaps they should weight the papers by some citation indices. Are women writing more impactful papers? What percent of papers by gender have citation rates above X?—you get the idea.

Finally, I wonder if parity is the right goal. **How about aiming for more papers by women than by men**? Why not?


[ Jeff ]

Jeff Lagarias is a mathematician **or** a professor at the University of Michigan.

Today I wish to discuss Diophantine equations.

Note, the “or” is a poor joke: For mathematicians, “$A$ or $B$” is true also when both $A$ and $B$ are true.

Jeff has a paper titled *Complexity of Diophantine Equations* and a related talk version. It is a nice review of some of the issues around Diophantine equations.

He has worked on number theory, on complexity theory, and on many other problems. Some are applied and some theoretical. Together with Peter Shor, he solved an open problem that was first stated by Ott-Heinrich Keller in 1930. Our friends at Wikipedia state:

In geometry, Keller’s conjecture is the conjecture that in any tiling of Euclidean space by identical hypercubes there are two cubes that meet face to face.

For dimensions ten or more it is now proved to be false thanks to Jeff and Peter. I am amazed by such geometric results, since I have no geometric intuition. Ten dimensions is way beyond me—although curiously I am okay in dimensions. Strange? Jeff has a relevant quote:

Every dimension is special.

Recall the main problem is to find solutions to equations usually restricted to be integers or rationals. This restriction makes the problems hard as in *open*, and hard as in *computationally difficult*.

Jeff mentions the following hardness result in his talk: solving in nonnegative integers $x, y$ the equation

$$(x+2)(y+2) = N$$

is as hard as integer factoring. This follows since $x+2$ and $y+2$ must be non-trivial factors of $N$. Note this relies on the fact that $x+2$ and $y+2$ are both greater than or equal to $2$.

We can delete “nonnegative” in the above by the following simple idea. Suppose that $N$ has two non-trivial factors that are both congruent to $2$ modulo $5$. If not, then we can still factor, but I believe the idea is clearer without adding extra complications. Then solve in integers

$$(5x+2)(5y+2) = N.$$

By assumption on $N$ there is a solution of this equation. Moreover suppose that $x, y$ are some solution. The key is that $5x+2$ cannot be $1$ and also cannot be $-1$. Then it follows that each of $5x+2$ and $5y+2$ is also not equal to $N$ in absolute value. Thus

$$1 < |5x+2|,\ |5y+2| < N,$$

and we have factored $N$.
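The reduction runs in both directions: a solution hands you a factorization, and any factorization hands you a solution. A toy sketch in Python (our own illustration, using trial division):

```python
def factor_via_diophantine(N):
    """Find a nonnegative-integer solution to (x+2)*(y+2) == N by
    trial division; any solution yields a nontrivial factor of N."""
    d = 2
    while d * d <= N:
        if N % d == 0:
            x, y = d - 2, N // d - 2   # both >= 0 since d, N//d >= 2
            assert (x + 2) * (y + 2) == N
            return x, y
        d += 1
    return None  # N is prime: the equation has no solution

print(factor_via_diophantine(91))  # (5, 11), since 7 * 13 == 91
```

Of course trial division is exponential in the bit-length of $N$; the point of the hardness result is exactly that no one knows how to solve such equations quickly.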

Jeff points out that it is unlikely that Diophantine problems are going to be classified by the NP-hardness machinery. He says this example

shows the (possible) mismatch of “natural” Diophantine problems with the P versus NP question.

Factoring is one of those in-between problems that could be in P, but most believe that it is not NP-hard.

The problem of deciding whether a polynomial has a *rational* root is still open. That is, given a polynomial $f(x_1, \ldots, x_m)$ with integer coefficients, does the equation

$$f(x_1, \ldots, x_m) = 0$$

have a rational solution? Of course the famous Hilbert’s Tenth problem asks for integer solutions of polynomials and is undecidable. The rational case is open. We have discussed it before here. See this for a survey.

I had tried to see if we could at least prove the following: the decision problem over the rationals is NP-hard. I thought for a while that I could show this by replacing the gadget equations used over the integers with equations that work over the rationals. The attempt failed, and I was unable to find a reference for such a result either. So perhaps it is open.

Is it undecidable to determine whether a polynomial has a root over the rationals? Can we at least get that it is factoring hard?

]]>

*Different ways of recursing on graphs*

Bletchley Park 2017 source

William Tutte was a British combinatorialist and codebreaker. He worked in a different group at Bletchley Park from that of Alan Turing. He supplied several key insights and algorithms for breaking the Lorenz cipher machine. His algorithms were implemented alongside Turing’s on Colossus code-breaking computers.

Today we discuss graph recursions discovered by Tutte and Hassler Whitney.

Tutte wrote a doctoral thesis after the war on graph theory and its generalization into *matroid theory*. We will follow the same arc in this and a followup post. He joined the faculty of the universities of Toronto and then Waterloo, where he was active long beyond his retirement.

For more on Tutte and his work, see this article and lecture by Graham Farr, who is a professor at Monash University and a longtime friend of Ken’s from their Oxford days. We covered some of Tutte’s other work here.

The two most basic recursion operations are *deleting* and *contracting* a chosen edge $e$ in a given graph $G$:

These operations produce graphs denoted by $G \setminus e$ and $G / e$, respectively. A motive for them harks back to Gustav Kirchhoff’s counting of spanning trees:

- A spanning tree of $G$ avoids using edge $e$ if and only if it is a spanning tree of the graph $G \setminus e$ with $e$ deleted.
- A spanning tree of $G$ uses edge $e$ if and only if the rest of it is a spanning tree of the graph $G / e$ after contracting $e$.

Well, this is not how Kirchhoff counted trees. Counting via the recursion would take exponential time. Our whole object will be telling which cases of the recursions can be computed more directly.

Note that contracting one edge of the triangle graph produces a *multi-*graph with one double-edge. Then contracting one of the doubled edges yields a one-vertex graph with a loop.

Thus contraction yields non-simple undirected graphs, but the logic of counting their spanning trees remains valid.

The order of edges does not matter as long as one avoids disconnecting the graph, and the base case is a tree (ignoring any loops), which contributes $1$.
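The recursion is easy to code directly, though—as said—it takes exponential time. A sketch in Python (our own illustration; vertices are a set, edges a list of pairs):

```python
def spanning_trees(vertices, edges):
    """Count spanning trees of a multigraph by deletion-contraction:
    t(G) = t(G - e) + t(G / e).  `edges` is a list of pairs (loops and
    multiple edges allowed); a disconnected graph has 0 spanning trees."""
    edges = [(u, v) for (u, v) in edges if u != v]   # loops never help
    if not edges:
        return 1 if len(vertices) == 1 else 0
    (u, v), rest = edges[0], edges[1:]
    deleted = spanning_trees(vertices, rest)
    # Contract e = (u, v): identify v with u in every remaining edge.
    merged = [(u if a == v else a, u if b == v else b) for (a, b) in rest]
    contracted = spanning_trees(vertices - {v}, merged)
    return deleted + contracted

print(spanning_trees({0, 1, 2}, [(0, 1), (1, 2), (0, 2)]))             # 3
print(spanning_trees({0, 1, 2, 3}, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # 4
```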

A similar recursion counts colorings $c$ that are *proper*, meaning that for each edge $\{u, v\}$, $c(u) \neq c(v)$.

- A proper coloring of $G \setminus e$ makes $c(u) \neq c(v)$ iff it is a proper coloring of $G$.
- A proper coloring of $G \setminus e$ makes $c(u) = c(v)$ iff it induces a proper coloring of $G / e$.

This leads to the recursive definition of the *chromatic polynomial*:

$$P(G, t) = P(G \setminus e, t) - P(G / e, t).$$

The base cases are that an isolated vertex contributes $t$, whereas an isolated loop contributes $0$ since its single edge is never properly colored. The final rule is that $P(G, t)$ is always the product of $P(C, t)$ over all connected components $C$ of $G$. Then $P(G, t)$ counts the number of proper $t$-colorings.
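The deletion-contraction recursion evaluates the chromatic polynomial at any concrete $t$ directly. A sketch in Python (our own illustration):

```python
def chromatic(vertices, edges, t):
    """Evaluate the chromatic polynomial P(G, t) by the recursion
    P(G) = P(G - e) - P(G / e); a loop forces 0, and each remaining
    isolated vertex contributes a factor of t."""
    if any(u == v for (u, v) in edges):
        return 0                       # a loop can never be properly colored
    if not edges:
        return t ** len(vertices)      # t free choices per isolated vertex
    (u, v), rest = edges[0], edges[1:]
    deleted = chromatic(vertices, rest, t)
    # Contract e = (u, v): identify v with u in every remaining edge.
    merged = [(u if a == v else a, u if b == v else b) for (a, b) in rest]
    contracted = chromatic(vertices - {v}, merged, t)
    return deleted - contracted

print(chromatic({0, 1, 2}, [(0, 1), (1, 2), (0, 2)], 3))  # 6 = 3!
```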

This is like the recursion for counting spanning trees except for the minus sign. Tutte’s brilliant insight, which was anticipated by Whitney in less symbolic form, was that the features can be combined by using two variables $x$ and $y$. Call an edge a bridge if it is not part of any cycle. If $e$ is neither a bridge nor a loop, the recursion is

$$T_G(x,y) = T_{G \setminus e}(x,y) + T_{G / e}(x,y).$$

The base case is now a graph with some number $b$ of bridges and some number $\ell$ of loops, which gives $x^b y^{\ell}$. An important feature is that all $n$-vertex trees have the same Tutte polynomial $x^{n-1}$, since there are $n-1$ edges and they are all bridges. The following are just some of the beautiful rules that $T_G$ follows. Let $c(G)$ stand for the number of connected components of $G$.

- $T_G(1,1)$ counts the number of maximal spanning forests. This is the number of spanning trees if $G$ is connected.
- $T_G(1-k, 0)$, when multiplied by $(-1)^{|V|-c(G)} k^{c(G)}$, yields the chromatic polynomial.
- $T_G(1,2)$ counts the number of connected spanning subgraphs.
- $T_G(2,2)$ is just $2^{|E|}$.
- $T_G$ evaluated along the hyperbola $xy = 1$ gives the Jones polynomial of a knot related to $G$.
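To make the two-variable recursion concrete, here is a sketch in Python; the union-find bridge test and the graph encoding are our own choices, not the post's:

```python
def components(vertices, edges):
    """Number of connected components via a simple union-find."""
    parent = {v: v for v in vertices}
    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a
    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices})

def is_bridge(vertices, edges, e):
    rest = list(edges)
    rest.remove(e)
    return components(vertices, rest) > components(vertices, edges)

def tutte(vertices, edges, x, y):
    """T_G(x, y): loops give a factor y, bridges a factor x, and
    otherwise T = T(delete e) + T(contract e)."""
    if not edges:
        return 1
    (u, v), rest = edges[0], edges[1:]
    contract = lambda: [(u if a == v else a, u if b == v else b)
                        for a, b in rest]
    if u == v:                                   # loop
        return y * tutte(vertices, rest, x, y)
    if is_bridge(vertices, edges, (u, v)):       # bridge: must contract
        return x * tutte(vertices - {v}, contract(), x, y)
    return (tutte(vertices, rest, x, y)
            + tutte(vertices - {v}, contract(), x, y))
```

For the triangle, $T_G = x^2 + x + y$, so $T_G(1,1) = 3$ counts its spanning trees and $T_G(2,2) = 8 = 2^{|E|}$.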

There are many further relations. The Jones polynomial has many applications including in quantum physics.

Recall our definition of the “amplitude” of an undirected $n$-vertex graph $G$ from the “Net-Zero Graphs” post:

$$a(G) = \frac{E - O}{2^n},$$

where $E$ is the number of black-and-white 2-colorings that make an even number of edges have both nodes colored black, and $O$ for an odd number.
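In code the definition is a signed sum over all $2^n$ colorings; the $1/2^n$ normalization is our reading of the definition, so treat it as an assumption:

```python
from itertools import product

def amplitude(n, edges):
    """(E - O) / 2^n: +1 for each coloring with an even number of
    black-black edges, -1 for odd, over all 2^n black/white colorings."""
    diff = sum((-1) ** sum(c[u] & c[v] for u, v in edges)   # 1 = black
               for c in product((0, 1), repeat=n))
    return diff / 2 ** n
```

The triangle gives $0$ and a single edge gives $1/2$.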

There does not seem to be a simple recursion for the amplitude from its values on $G \setminus e$ and $G / e$. We can, however, obtain one by using another kind of contraction that adds a loop at the combined vertex:

We have not found a simple reference for this operation. It leads to the following recursive formula:

This recursion allows $e$ to be a bridge, so the base cases are $2$ for an isolated vertex and $0$ for a loop. More generally, the basis is $2$ for a node with an even number of loops, $0$ for odd. Here is an example for the ‘star graph’ on 4 vertices:

The diagram would need another layer to get down to (products of) base cases, which we have shortcut by putting the value of the amplitude for each graph at a leaf. Adding the products over all branches gives the amplitude. For the star graph,

Clearly this brute-force recursion grows as . This is slower than the order- time of using the coloring definition directly, but what all this underscores is how singular it is to be able to compute in polynomial time, indeed time. The search for a more-efficient recursion, one that might apply to $\#\mathsf{P}$-hard quantities, leads us to consider a more drastic operation on edges.

The new recursion operation is well illustrated by this figure:

Two vertices disappear, not just one. Not only does the edge $e = \{u,v\}$ disappear, but any other edge from $u$ or $v$ to a vertex $w$ gets “recoiled” into a loop at $w$. Our notation for this operation connotes that $e$ is not just deleted but “exploded.”

Properly speaking, we need to specify what happens if there are other edges between $u$ and $v$ or loops at $u$ or $v$. In an upcoming post we will see that those become *circles* in a *graphical polymatroid*, which generalizes the notion of a graph. For now, however, it suffices to count the total number of vaporized edges, including $e$ itself. Then we obtain a two-term recursive formula:

The base cases for isolated vertices are the same as before, but explosion also needs a base case for pure emptiness. This contributes $1$. In the following example diagram, for the path graph on four nodes, we denote such base cases by “w” for “wisp”:

Note again the rule that when the recursion disconnects the graph, the component values multiply together. Thus the value is

This is different from the amplitude of the star graph. What this means is that the amplitude does not obey the rules of the Tutte polynomial, which is the same for both of these 4-vertex trees.

To prove the recursion equation (1), for an edge $e = \{u,v\}$, note that every coloring has the same odd/even parity of black-black edges for $G$ as for $G \setminus e$ except those that color both $u$ and $v$ black. Among the latter, consider the colorings that make an even number of black-black edges (including $e$) overall versus those that make an odd number. Then

Now if there are no other edges between or loops at $u$ and $v$, then the even count among those colorings matches the number of colorings of the contracted graph that make an even number of black-black edges, and likewise for the odd case, again because we subtracted the contribution of $e$. Considering the sign change from other edges or loops yields equation (1). It is also possible to “explode” a loop, and our readers may enjoy figuring out how to define it.

We can expand on this by defining a polynomial such that . The base cases are for an isolated vertex but still for a “wisp” and for a loop. The basis extends to give for an isolated node with an even number of loops and for odd. Another way to put it is that two edges with the same endpoints, or two loops at the same node, can be removed. The above diagram shows that for the path graph ,

Whereas, the recursion for the star graph—noting that the “star” on two nodes is just a single edge—gives:

This is not the same polynomial as for the path graph, again implying that the amplitude polynomial is not a specialization of the Tutte polynomial. We will show in the last post in this series that it does specialize the polynomial introduced in the 1993 paper titled “A Characterization of Tutte Invariants of 2-Polymatroids” and covered further in this 2006 paper.

What other rules does our “amplitude polynomial” follow? We will explore this in the mentioned upcoming post. What other quantities can it be made to count?

What we called “explosion” is in fact attested as the natural form of *contraction* for the **polymatroids** considered in these papers. What further uses might “explosion” have in graph theory apart from polymatroids?

[ Essen ] |

Arno van den Essen is the author of **the** book on the Jacobian Conjecture.

Today I want to highlight one of the ideas he presents in his book.

The theory is sometimes called stabilization methods. Or K-theory methods. It is often used in connection with the famous Jacobian conjecture (JC). I will not say any more about JC now—see this for some comments we made a while ago.

Essen states that the philosophy of stability theory is:

It is possible to change a map and make it “nicer” provided we allow the dimension to be increased.

Not a direct quote.

That is, provided we can change $F \colon k^n \rightarrow k^n$ to

$$\tilde{F} \colon k^N \rightarrow k^N,$$

where $N$ is larger than $n$. The whole method is based on a simple observation. Suppose that $F \colon k^n \rightarrow k^n$ and define $\tilde{F} \colon k^{n+m} \rightarrow k^{n+m}$ by

$$\tilde{F}(x, z) = (F(x), z).$$

Then $\tilde{F}$ is injective if and only if $F$ is injective. This is trivial—really trivial. From trivial observations sometimes important methods are created.
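The observation can be sanity-checked exhaustively over a small finite field; the choice of $F(x) = x^3$ over $\mathbb{Z}_5$ is ours, purely for illustration:

```python
from itertools import product

p = 5  # work over the finite field Z_5 so we can check exhaustively

def injective(f, domain):
    images = [f(x) for x in domain]
    return len(images) == len(set(images))

# F(x) = x^3 is injective on Z_5 (cubing is a bijection since gcd(3, 4) = 1)
F = lambda x: pow(x, 3, p)
# The stabilized map F~(x, z) = (F(x), z) on Z_5 x Z_5
F_stab = lambda xz: (F(xz[0]), xz[1])

# The two maps are injective together, as the observation promises
assert injective(F, range(p)) == injective(F_stab, product(range(p), repeat=2))
```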

I will now explain why this is useful by presenting an example.

Suppose that

is a polynomial mapping where

We wish to show that we can replace by another polynomial map that has degree at most . Moreover, the new polynomial map is injective if and only if the original polynomial map is injective. The method can be used to preserve other properties of the polynomial mapping, but being injective is an important example.

There seems to be no way to lower the degree of without destroying its structure. But if we use the stability philosophy and allow extra dimensions we can succeed. That is we replace by the function

Why does this work? The idea is that the extra two dimensions can be used as extra *registers*. These registers can be used to simplify the computation, and reduce the degree of .

Let have one term that we wish to remove. To be concrete, let’s assume the term is

Start with the input

Now change this to

where

This is an invertible transformation. Note this is only possible because we have two extra dimensions or registers. Otherwise, we could not compute and without messing up the rest of the computation. Now map this to

This is nothing more than computing the original function and ignoring the new registers.

The next step is to go to

The last point is that

cancels the term we wished to remove.

The price we pay is that new terms have been added, but they have at most degree .

We can prove by induction the following general theorem:

Theorem 1 Suppose $F \colon k^n \rightarrow k^n$ is a polynomial map where $k$ is a field. Then we can construct a polynomial map of degree at most $3$, denoted by $\tilde{F}$, so that it is injective precisely when $F$ is injective.

Even stronger theorems are possible. For example, the polynomial map can be required to be cubic linear:

Definition 2 Suppose that $A$ is an $n \times n$ matrix over the field $k$. Then the *cubic linear* map for the matrix $A$ is defined to be the map

$$F(x) = x + (Ax)^{3},$$

where $(Ax)^{3}$ is defined to be the vector $v$ so that for all coordinates $i$, $v_i = ((Ax)_i)^{3}$.

See Essen’s book for more details. Note that a cubic linear map, written out in coordinates, is of the form:

where the coefficients are constants. This reduction to cubic linear maps is quite pretty, and requires a clever application of the stabilization method.
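Reading the cubic linear form as $F(x) = x + (Ax)^3$ with the cube taken coordinatewise (our interpretation of Definition 2; see Essen's book for the precise statement), a minimal sketch is:

```python
def cubic_linear(A):
    """F(x) = x + (Ax)^3, cube taken coordinatewise; this is our reading
    of the 'cubic linear' form, not a quote from the book."""
    n = len(A)
    def F(x):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        return [x[i] + Ax[i] ** 3 for i in range(n)]
    return F

# A strictly triangular A gives an obviously invertible (triangular) map:
F = cubic_linear([[0, 1],
                  [0, 0]])
```

With this $A$, $F([2, 3]) = [2 + 3^3, 3] = [29, 3]$, and the map is inverted by subtracting the cube back off.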

The reduction in degree is possible only to degree $3$. It cannot be reduced to degree $2$ in general. Let’s look at the intuition for why this is true. The last step is

which is

Suppose has a leading term of degree . Also suppose that has degree and has degree . Then

since the leading term goes away. But and have degrees and respectively. So to keep and both or less, it follows that can be at most . However, in this case a term of degree is removed and other terms of degree are added. This is not a formal proof that the method cannot reduce the degree to $2$. I do believe that, formalized properly, it is a theorem that reduction to degree $2$ is in general impossible.

I like this technology. I wonder if it might be possible to use it on some of our favorite problems. I do like that it conserves invertibility. This seems like it could be related to quantum computing, because of the reversible nature of quantum computing.

*A new class of undirected graphs with quantum relevance*

Cropped from source

Gustav Kirchhoff was a German physicist active in the mid-1800s. He is known for many things, especially for his “Laws” governing voltage and current in electrical circuits. Today we ask whether anything akin to Kirchhoff’s laws can be formulated for quantum circuits.

What may be less known about Kirchhoff is that he was a pioneer in graph theory. He proved Kirchhoff’s Theorem that the number of spanning trees equals the determinant of an associated matrix. This shows that the trees can be counted in polynomial time—a cool result. Here we present a class of graphs arising from quantum circuits and associated operations more complex than determinants.

Our search for new graph-based laws was driven by our work on simulations of stabilizer circuits. We have discussed this recently here and here. The bottom line is this:

A new class of graphs arises in a natural way and holds a key to improving certain quantum simulations.

We call these *net-zero graphs*. We like to imagine that Kirchhoff would have been interested. We will say more about why after we present the graphs.

The new class of graphs comes from a natural counting problem. Consider black/white 2-colorings (not necessarily proper) of the vertices of a graph $G$, and count the number of edges whose two nodes are both colored black, called B-B edges. Let $E$ be the count of colorings that make an even number of B-B edges and $O$ be the count of colorings that make an odd number of B-B edges. Then the amplitude $a(G)$ is $E - O$ divided by $2^n$. Now we are ready to define “net-zero” graphs as follows:

An undirected graph $G$ is net-zero if $E = O$.

Furthermore, we can call $G$ *net-positive* if $E > O$ and *net-negative* if $E < O$. By simple trial and error, the smallest net-zero graph is the triangle graph, and the graph made by two triangles sharing an edge is net-zero as well. Here are some connected net-zero graphs of small size:
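A brute-force classifier (our own encoding of the examples just mentioned) confirms these claims and the later observation that the 4-clique is net-negative:

```python
from itertools import product

def net_sign(n, edges):
    """+1 if net-positive (E > O), 0 if net-zero, -1 if net-negative."""
    diff = sum((-1) ** sum(c[u] & c[v] for u, v in edges)   # 1 = black
               for c in product((0, 1), repeat=n))
    return (diff > 0) - (diff < 0)

triangle = [(0, 1), (1, 2), (0, 2)]
two_triangles = triangle + [(1, 3), (2, 3)]     # sharing the edge (1, 2)
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
```

Here `net_sign` returns $+1$, $0$, or $-1$ for net-positive, net-zero, and net-negative respectively.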

You might ask: why study such labelings of graphs? Why is net-zero an interesting property? An equivalent formulation of the amplitude (up to division by a power of $2$) was given in a 2009 paper by Leslie Goldberg, Martin Grohe, Mark Jerrum, and Marc Thurley, as part of a larger enumeration of polynomial-time cases. Their proof works by reduction to the problem of counting solutions to quadratic polynomials modulo 2, whose running time we just improved. Their paper does not mention quantum but does involve Hadamard-type matrices. Thus the short answer is that net-zero captures a type of balancing property that is related to understanding quantum circuits.

The following elementary facts show how the theory of our graphs takes shape.

Proposition 1 Every cycle graph $C_n$ with $n$ odd is net-zero.

This follows because every coloring of $C_n$ has an even number of B-W edges. Hence the number of monochrome edges is odd, and so complementing the coloring flips the parity between B-B and W-W edges. However, having an odd cycle as an induced subgraph does not make a graph net-zero. An example is the 4-clique: any subset of three vertices gives a triangle, which is net-zero, but the 4-clique itself is net-negative.

Proposition 2 A graph is net-zero if and only if one of its connected components is net-zero.

This fact is intuitive when we look at the count as a product over the connected components, since the colorings of the components are independent. Each coloring of the nodes outside the net-zero component pairs with all colorings of that component, whose even and odd counts cancel, so the difference $E - O$ for the entire graph is zero.

Now let’s deviate from net-zero graphs. What are some typical net-positive graphs?

Proposition 3 Bipartite graphs are net-positive.

To prove this, let $V_1$ and $V_2$ be the two disjoint vertex sets such that each edge connects one node from $V_1$ and one from $V_2$. If all nodes in $V_1$ are colored white, then there will be zero B-B edges regardless of how $V_2$ is colored, so all of these colorings count toward the even side. Now if a single node, say $u$, in $V_1$ is colored black, then the number of B-B edges equals the number of neighbors of $u$ that are colored black, and the even and odd colorings of $V_2$ then balance by a straightforward combinatorial calculation. The remaining cases contribute a nonnegative difference. Hence bipartite graphs are net-positive.

As a consequence, since all trees are bipartite, all trees are net-positive. So is $C_n$ for $n$ even. Net-negative graphs may seem to be rarer. We invite readers to work out from Pascal’s triangle when the $n$-clique is net-negative, net-zero, and net-positive. Congruence modulo $8$ is involved.

Other interesting examples come from allowing self-loops. The smallest net-zero graph of this kind is a single self-loop. But a 2-node graph with an edge connecting them and two self-loops is net-negative, and so is a graph of two triangles connected by one edge. Pictorially, these two graphs are:

There is a “local equivalence” between a single self-loop and a triangle: Any self-loop in a graph can be replaced by a triangle using two new vertices, and the resulting graph will be net-zero if and only if is.

There is a special class of quantum circuits that relate closely to graphs. They use just two kinds of quantum gates: the Hadamard gate and the controlled-$Z$ gate. For more on quantum circuits see this elementary post and this more involved post.

Definition 4 Given a graph $G$ on $n$ vertices, the corresponding *graph state circuit* involves $n$ qubits and consists of:

- An initial Hadamard gate on each qubit line $i$.
- For every edge $(i,j)$ of $G$, a controlled-$Z$ gate connecting lines $i$ and $j$. The order of placing the gates does not matter.
- A closing Hadamard gate on each line $i$.

These circuits are a subset of *stabilizer circuits*, which we have been discussing. They become equivalent to stabilizer circuits if we also allow so-called phase gates on single qubits, where they are analogous to a loop or “half-loop” at the corresponding vertex. We will stay with the simpler circuits here. The connection to graphs is expressed by:

Theorem 5 For any graph $G$, $a(G)$ equals the amplitude of measuring an all-zero output given an all-zero input to the graph state circuit. In particular, $G$ is net-zero if and only if this amplitude is $0$.
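Theorem 5 can be checked by brute force on tiny graphs. The statevector simulation below is our own sketch, assuming the $1/2^n$ normalization of $a(G)$:

```python
from itertools import product
from math import sqrt

def graph_state_amplitude(n, edges):
    """<0^n| (H on all lines) (CZ per edge) (H on all lines) |0^n>,
    computed by direct statevector simulation."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    def hadamard(state, q):
        out = [0.0] * (2 ** n)
        for idx, amp in enumerate(state):
            bit = (idx >> q) & 1
            out[idx] += (amp if bit == 0 else -amp) / sqrt(2)
            out[idx ^ (1 << q)] += amp / sqrt(2)
        return out
    for q in range(n):
        state = hadamard(state, q)
    for u, v in edges:                  # CZ flips the sign when both bits are 1
        for idx in range(2 ** n):
            if (idx >> u) & 1 and (idx >> v) & 1:
                state[idx] = -state[idx]
    for q in range(n):
        state = hadamard(state, q)
    return state[0]                     # the <0^n| component

def coloring_amplitude(n, edges):
    """(E - O) / 2^n from the coloring definition, for comparison."""
    return sum((-1) ** sum(c[u] & c[v] for u, v in edges)
               for c in product((0, 1), repeat=n)) / 2 ** n
```

Both functions give $0$ for the triangle (net-zero) and $1/2$ for a single edge.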

Theorem 5 implies that whether a graph is net-zero can be decided in polynomial time. The question is, can we improve the time to order-$n^2$, which for dense graphs means linear in the number of edges? The reason why we want to do so is the following further theorem:

Theorem 6 If net-zero graphs on $n$ nodes with self-loops allowed are recognizable in order-$n^2$ time, then computing the strong simulation probability for quantum stabilizer circuits is $n^2$-time equivalent to computing matrix rank over $\mathbb{F}_2$.

This is proved in section 5 of our paper, which has a duality technique for eliminating the self-loops from phase gates that works for the probability but possibly not for the amplitude. Another way of stating our theorem is:

Theorem 7 Given any $n$-vertex graph $G$, we can compute $a(G)$ in order-$n^2$ time given only the rank of the adjacency matrix of $G$ and the yes/no answer about whether $G$ is net-zero.

This result extends to computing the probability of any output of a stabilizer circuit given a standard-basis input. This is why the decision problem for recognizing net-zero graphs is important.

In upcoming posts we will connect net-zero graphs further to ideas of circuit “laws” by defining recursions for . These recursions do not give efficient algorithms by themselves, but they connect to a wide theory involving graph polynomials and matroids. That theory includes Kirchhoff’s counting of spanning trees as a special case, and we will be interested in which other cases are polynomial-time feasible. This may position quantum computing as a meeting point for closer connections between work such as this 1997 paper by Andrei Broder and Ernst Mayr on counting minimum-weight spanning trees in time and the paper by Goldberg et al. mentioned above.

What is the complexity of deciding whether a given $n$-vertex graph is net-zero? We know it is at worst polynomial. If it is order-$n^2$, then we obtain a really tight connection between computing matrix rank and computing a quantum simulation probability.

Are there further applications of net-zero graphs?

[gave Kirchhoff his second “h”]

[ GIT ] |

Ray Miller just passed away. He had been a researcher and leader at IBM Research, Georgia Tech, and the University of Maryland. At all of them he did important research and also was a leader: a group head, a director, and a chair.

Today we remember Ray.

For starters you can read this or his memoir here. Ray started in the field with an electrical engineering PhD thesis titled *Formal Analysis and Synthesis of Bilateral Switching Networks*. See this for his paper.

Ray’s memoir does not mention tennis. But we played tennis when I visited IBM, and also when we could while at a conference. Ray was not slim, not fast, not obviously athletic. But he was the best tennis player in our group. He was miles above me. His trick was he could control the ball, especially on his serve so it was untouchable. He could spin it so that it landed and bounced at right angles. He was so good that he hardly ever had to move. When I played him in singles, we had to agree “no spinning serves”. When we played doubles there was a small chance that someone could return his ball, but not much.

I think this is a fair way to summarize Ray: It was easy to underestimate him. Ray always had a smile on his face. I cannot remember him being anything but happy. This may have led some to think he was not serious. But Ray was. He did wonderful research and was a leader in our field. He changed, for the better, all the places he called “home”. I can directly attest to his impact at Tech; others I believe can attest to IBM and Maryland. He was a great editor for *Journal of the ACM*, and did much more for our field. Thanks Ray.

Here are just a few comments from friends of Ray.

*Rich DeMillo*:

Ray was a quiet but strong and effective leader and a good friend to many of us. Nancy Lynch and I recruited him to be School of ICS Director, bringing immediate stature—both internal and external—to computer science at Georgia Tech. Up until that time Information and Computer Science was an odd duck interdisciplinary graduate program in a very traditional engineering school that had not invested in the field despite the early success of the School of Information and Computer Science and a world-class Burroughs installation (Georgia Tech had played a key role in Algol development in the ’60s). Ray was an engineer with impeccable credentials and was taken seriously by the administration in ways that the library scientists, linguists, philosophers, psychologists, cyberneticists, and mathematicians who founded the School never managed to achieve.

Dick Lipton and I were long-term collaborators on theory research with Ray, and Ray’s presence helped shine a national spotlight on the School. He was the editor-in-chief of the *Journal of the ACM* and palled around with luminaries like Sam Winograd and Dick Karp.

In addition to his own work in switching theory and automata, Ray had co-edited the volume in which Karp’s NP-completeness paper appeared, and so his name was forever associated with that ground-breaking paper. From that point on, Atlanta became a mandatory stop on the national research circuit that was the hallmark of theory research in those days.

One of his first accomplishments was snagging the Computer Science Conference, the large, research-oriented conference for the field. It gave Georgia Tech a chance to showcase its work for the rest of the world. There was a steady rise in rankings and a steady flow of visitors like Michael Rabin, Leslie Lamport, Mike Fisher, Andy Yao, Ravi Kannan, in addition to Karp, Winograd, and Lipton. Ray tried to entice Lipton to Georgia Tech with what was at the time the university’s juiciest startup package. He was ultimately successful, but it took nearly twenty years, and by that time, Tech had replaced the School of ICS with the College of Computing, and Ray was in semi-retirement in Maryland.

*Umakishore Ramachandran*:

I have fond memories of my early years at Tech after being recruited by Ray to join the small but vibrant group of faculty in ICS specializing in theory, systems, and AI. Ray was extremely caring and supportive in nurturing junior faculty. One could say that Ray’s leadership was instrumental in the transformation of CS at Georgia Tech and putting GT on the map to compete with other more established CS departments around the nation.

Ray helped create a sense of family in the department. I recall every day he would be at lunch in the faculty club, which in those days served coffee at $0.10. He would be there eating a healthy meal featuring a giant sausage and reserving an entire table for the ICS faculty to join him for lunch. It created such a friendly and amicable environment for discussing any issue.

I feel compelled to share a lighter anecdote from when I got hired by Ray. Those were the days of terminals connected to mini computers, DEC VAX 750 and 780, and I wanted Ray to promise me that he would give me a terminal and modem for connecting to the campus computers from home. Even as a grad student at UW-Madison I had a 2400 baud modem at home. When I arrived at Tech, Ray gave me a 300-baud acoustic coupler since I had not specified what speed modem I wanted in my startup package. Ray had a giant infectious smile all the time that made it difficult to get mad at him for anything.

He will be missed by anyone and everyone whose lives he touched.

*Bill Gasarch*:

Here is a short quote from Bill and see his post for more details.

I was saddened to hear of Ray Miller’s death. He was at University of Maryland for many years. Since he got his PhD in 1957 and was still active into the 2000’s he had a broad view of computer science. He did theory AND systems AND other things. It was great to talk to him about the early days of computer science before we had these labels.

Ray is missed. Our condolences to his family and his many friends for their loss.
