Old algebra helps build new algorithms

Nisheeth Vishnoi is a theorist at Microsoft Research who has many pretty results. Some of them require great technical expertise—the main idea may be clear, but the execution is quite difficult. Others of his results require less technical expertise, but their proofs depend on a flash of deep insight. It is rare, in my experience, to find this combination in one person.

Today I would like to talk about his recent talk at GIT on finding a balanced separator in an undirected graph. It was a beautiful talk that explained a proof of the former kind: clear ideas but very technical details.

This work is joint with Lorenzo Orecchia and Sushant Sachdeva and will appear in the next STOC—the paper is entitled “Approximating the Exponential, the Lanczos Method and an ${\tilde{O}(m)}$-Time Spectral Algorithm for Balanced Separator.”

One thing I really appreciate is that although the algorithm works on graphs and gives you a graph decomposition, the key ingredients are numerical and algebraic. A 37-year old result on polynomial approximations is combined with matrix algebra. As soon as you hear “matrix” you might think that algorithms of time ${n^\omega}$ are involved, where ${\omega \leq 2.373\dots}$ is the exponent of matrix multiplication, but no, the time stays strictly less even when the ${n}$-vertex graphs have ${m = \Theta(n^2)}$ edges.

Before I start discussing this work (OSV) I must disclose that I was once Nisheeth’s Ph.D. advisor—perhaps you are always your students’ Ph.D. advisor, even after they graduate. But I do not really know where he got the tremendous ability that he demonstrates now, since I cannot imagine that it came from my advising. He was great as a student, great as a colleague when at Tech, yet now he seems to operate at a higher level. Ah, well.

This phenomenon of students being better than their advisors has happened to me many times. It reminds me of a question that Ken Steiglitz used to ask me years ago. He wanted to know how we could make objects to extremely small tolerances when as humans we could only see fairly large differences? Another way to ask this is:

With tools that operate only to within a tolerance of ${\delta}$ how can we make objects that have tolerances of ${\epsilon}$, where ${ \epsilon \ll \delta}$?

Let’s leave that discussion for another day and move on to the result of OSV.

## Cutting Graphs

Undirected graphs ares used to model countless things in computer science; if they did not already exist, we would have had to invent them. Hence finding algorithms that decompose graphs efficiently is extremely important to theory in general, and to creating graph algorithms in particular. Decompositions can be used to create recursive algorithms and to solve many other problems on graphs.

One of the reasons that binary trees are often an accessible class of graphs is that they satisfy the following beautiful result:

Theorem: Let ${T}$ be a binary tree on ${n>1}$ vertices. Then there is an edge ${e}$ whose removal cuts the tree ${T}$ into two pieces, and each is of size at least ${n/3}$. Moreover the edge ${e}$ can be found in linear time.

(source, S’09 final)

There are three features to this theorem that make it useful:

1. The cut is small: it consists of a single edge.
2. The two pieces are balanced: each is at least half the size of the other.
3. The cut can be found in linear time.

I would love to report that this theorem on binary trees generalized to all graphs, or even to all bounded degree graphs. Of course this is false. There are bounded-degree graphs such that all balanced cuts have size ${\Omega(n)}$. This follows even if we weaken the notion of balanced to allow a piece to be only a ${b>0}$ fraction of the other, and even if we do not require that there is a polynomial time algorithm to find the cut.

There is a property called conductance of a graph, usually denoted by ${\gamma(G)}$, that measures how well the graph can be cut into two pieces. The main result of OSV is a new theorem that settles the question of designing asymptotically optimal algorithms for finding balanced cuts.

Theorem: There is a ${\tilde{O}(m)}$-time algorithm that given an undirected graph ${G}$, a constant balance ${b \in (0,1/2]}$, and a parameter ${\gamma}$, either finds an ${\Omega(b)}$-balanced cut of conductance ${O(\sqrt(\gamma))}$ in ${G}$, or outputs a certificate that all ${b}$-balanced cuts in ${G}$ have conductance at least ${\gamma}$.

The theorem is about as good as one can expect. The cut is balanced and its size is controlled by the conductance as it must be. Also the algorithm runs in ${\tilde{O}(m)}$ time.

## Almost Linear

In this paper and elsewhere you will see the phrase “almost linear,” and it is denoted by adding a tilde to the “O:” as in ${\tilde{O}(m)}$. This means that the algorithm in question runs in time ${O(n\cdot f(n))}$ where ${f(n)}$ is a product of terms that depend only logarithmically on ${n}$ and all other parameters. We would prefer a truly linear time algorithm, but given the state of our understanding, often almost linear is the best we can do. So we do what we can.

## An Old Theorem

OSV’s paper depends on a rather surprising result by Edward Saff, Arnold Schönhage, and Richard Varga (SSV), which proves the existence of a very good rational approximation to the function ${e^{-x}}$ on the entire interval ${[0,\infty)}$. As given in Corollary 6.9 on page 35 of OSV, SSV proved in 1975:

Theorem: For any integer ${k>0}$, there exists a degree ${k}$ polynomial ${p_{k}}$ such that ${p_{k}( \frac{1}{(1+x/k)})}$ approximates ${e^{-x}}$ up to an error of ${O(k\cdot 2^{-k})}$ over the interval ${[0,\infty)}$.

Note, the theorem proves only the existence of this good approximation, and this is one of the issues that OSV must overcome.

The reason that having a good rational approximation is important is that OSV uses this on matrices: this allows a certain linear algebra problem to be solved very fast. We will now turn and discuss this theorem.

## A New Theorem

In order to prove their theorem, OSV must be able to compute an exponential of a matrix fast. They do this by using SSV and quite a number of other tricks. The main result is:

Theorem: Given an ${n \times n}$ SDD matrix ${A}$, a vector ${v}$, and a parameter ${\delta \le 1}$, we can compute a vector ${u}$ such that

$\displaystyle || \exp(-A)v - u|| \le \delta ||u||.$

Further the algorithm runs in time ${\tilde{O}((m + n)\log(2 +||A||))}$.

Here SDD means that the matrix is symmetric and diagonally dominant. They use the beautiful work of Daniel Spielman and Shang-Hua Teng on approximately inverting matrices. One of the very technical details is the error analysis, since the approximation to the inverses they get from Spielman-Teng interact in a complex way with the error bounds of SSV. As usual read the paper for details.

## Open Problems

Can the result of SSV be used elsewhere? Are all linear algebra problems almost linear?

[fixed OSV theorem statement, fixed vertex v –> edge e in tree lemma statement, and sourced lemma]

April 25, 2012 9:21 am

Am I missing something?

In the diagram, it looks like you are eliminating an edge which results in a vertex reduction.

Don’t you want to red mark the vertex (cut point) to the upper right of the higher vertex connected by the red marked edge?

April 25, 2012 11:06 am

My mistake.

I was thinking: “If you remove a vertex …”

• April 25, 2012 1:16 pm

I fixed the lemma statement in-between your looks—it was noted at the bottom—but had to give a lecture and fix the SSV statement in the meantime. The notes for my class (on NC^1) included the tree lemma with “edge” instead of “vertex”. To answer Christian (Sommer) below, it’s a binary tree. Thanks to both and Moritz also below.

April 25, 2012 10:01 am

Small typo: In the SSV theorem you probably want to say that the error decays exponentially with the degree. So, perhaps O(k 2^{-k}) is the right dependence, but I don’t know the theorem.

April 25, 2012 10:36 am

Dick’s question about tools is associated to the following challenge, which is central to mechanical engineering folk-culture:

You are marooned on a desert island whose sole resources are sand, sea, and coconut trees. Construct a surface flat to atomic dimensions.

The following answer beautifully blends engineering, physics, and mathematics.

• Construct a fire-bow from bark and branches (engineering)

• Start a fire, and melt sand into three irregular lumps of glass (physics)

• Using sea-water as a lubricant, grind each shape against the other two (mathematics)

Continue this grinding process, varying the pairing and direction of the grinding strokes, until three flat surfaces are obtained. Then it is an elegant theorem of geometry that three surfaces that are mutually congruent under translation and rotation, must be exactly flat.

Essentially in consequence of this geometric theorem (which was known to Renaissance mirror-grinders), all modern high-precision artifacts are constructed by (seemingly) low-precision grinding processes … the quartz-sphere test-masses of Gravity Probe B are perhaps the most celebrated examples.

• April 25, 2012 7:11 pm

I really enjoyed this solution! Can you point me towards more problems/solutions like it. I am unsure how to search for this specific turn of thought.

April 26, 2012 5:28 pm

Sorry, what’s a name or reference for this theorem for those of us who don’t know geometry?

April 30, 2012 11:00 am

Wyatt and Sniffnoy — the above charming narrative is a folk-theorem that was taught to me by a friend who was an amateur telescope maker.

One starting reference to the mathematical literature can be found on page 29 of Alexandre Borovik’s Mathematics Under the Microscope: Notes on Cognitive Aspects of Mathematical Practice (AMS Press, 2009).

Almost any book on lens-grinding for amateur telescope makers will cover this material too.

I’m glad you enjoyed the story, and good luck finding references! 🙂

May 2, 2012 11:02 am

Wyatt and Sniffnoy, here is a sketch of an elementary proof. Suppose that surfaces $\{a,b,c\}$ have been ground against one another, each to a spherical figure (hmmm … how might we prove sphericity?) having respective curvature radii $\{R_a,R_b,R_c\}$. Let the inverse radii be $\Gamma_a \equiv 1/R_a$ (etc.), each of which is a finite number. Then the matching condition for pairwise congruity-of-figure is

$\left[\begin{array}{ccc} 1 & 1 & 0\\ 0 & 1 & 1\\ 1 & 0 & 1 \end{array}\right] \left[\begin{array}{c} \Gamma_a\\ \Gamma_b\\ \Gamma_c \end{array}\right]= \left[\begin{array}{c} 0\\ 0\\ 0 \end{array}\right]$

whose unique solution is $\Gamma_a = \Gamma_b = \Gamma_c =0$, that is, flat surfaces. Fun! 🙂

This elementary problem, by the striking simplicity both of its solution and of the means of demonstrating that solution, illustrates an ubiquitous theme in mathematics, that Alexander Grothendieck’s student Pierre Deligne memorably called “The art of finding and creating the home which is a problem’s natural habitat.”

May 2, 2012 2:13 pm

I’m afraid I don’t even understand the statement of the theorem, which was part of why I was asking for a reference…

May 2, 2012 2:33 pm

To say it in words, if one of the surfaces is convex, then the other two surfaces (against both of which the first surface fits congruently) must both be concave.

And by similar reasoning, if one of the surfaces is concave, then the other two must both be convex.

We conclude that *none* of the three surfaces can be convex *or* concave … and therefore they all three surfaces must be flat! 🙂

May 4, 2012 10:38 am

Thank you, that explains the intuition behind it, but I still don’t see how to formalize it (just the statement, I mean, not the proof). Because obviously we can just take 3 hemispheres that haven’t been ground against one another and say, “look, these are all congruent, and none of them are flat”. Perhaps if we add in an orientation condition, demanding that the congruences be orientation-reversing? Except that doesn’t work either, because a hemisphere has an orientation-reversing symmetry. A condition where we label the sides of the surface, and insist those match up? This seems more on the right track, but it can’t be enough by itself; we can just label all the insides “A” and the outsides “B” (and this would work with anything, nothing special about hemispheres here). So I don’t know what to make of this. Do we want to involve the actual solids, perhaps, and not just the surfaces? No idea. A formal statement would be very helpful.

May 4, 2012 10:52 am

Hold on, I think I see it — we want to have surfaces with two labeled sides, A and B, and insist that there are congruences that match up B to A and A to B. That seems to do it, and matches the physical intuition of matching up the outside of one with the inside of the other. Is that the correct statement?

May 4, 2012 10:58 am

Also, of course, we want to insist that the congruences be orientation-preserving.

Except it seems that still can’t be it; a wave-shape seems to provide a counterexample, since, though not flat, it can be rotated so as to switch its sides. Now I’m confused. What exactly is a correct statement, then?

Is the statement even correct? The wave-shape counterexample seems to work physically, so whatever the theorem is, it can’t apply physically in such a case; some additional constraint (which in the physical example would presumably come from some aspect of the grinding other than congruence) must be needed. (It certainly is believable that in the physical case, you would be very likely to end up with something very nearly flat, for the reason you explained above.)

May 4, 2012 11:46 am

Sniffnoy, it may clarify your thinking to

(2) reflect that the grinding machine is so designed as to ensure that every point $p$ of surface $A$ (the mirror) is at one time or another brought into congruent contact, with all possible relative angles, with every point $q$ of surface $B$ (the tool)

(3) reflect that a point $p$ of either surface has both a major radius $R(p)$ and minor radius $R'(p)$

(4) And therefore (when grinding is done)

$\forall p\in A\ \text{and}\ \forall q\in B: 1/R_A(p)=1/R'_A(p) = -1/R_B(q) = -1/R'_B(q)$

The unique solution of this equation is that any two flush-ground surfaces are spherical, with opposite-sign radii of curvature. Hope the LaTeX comes out OK! 🙂

4. April 25, 2012 10:45 am

Changing the theorem on tree separators unfortunately made it false. The balance decreases to 1/4. An edge separator in a tree only allows for at least n/4 vertices per component (example: a star with three leaves). A vertex separator, however, can guarantee n/3 vertices per component.

April 25, 2012 11:25 am

Dick asks: Are all linear algebra problems almost linear?

When we frame this question geometrically, it amounts to “Is the musical isomorphism almost linear?” Here the musical isomorphism is the “sharping” of one-forms to tangent vectors and its natural dual the “flatting” of tangent vectors to one-forms. Assuming that the answer is “yes” (provably? generically?), then it would be nice to have a geometric understanding of why the answer is “yes.”

Moreover, a surprising corollary then would be “The Hamiltonian dynamical trajectory of $n$ particles coupled by short-range interactions is generically no more difficult to compute than the dynamical trajectory of $n$ uncoupled particles.” Since classical and quantum trajectories alike are Hamiltonian, natural connections arise to the Harrow/Kalai debate.