A Matchless Result
A famous algorithm, a new paper, a full correctness proof
Vijay Vazirani is one of the world experts on algorithmic theory, and is especially known for his masterful book on approximation algorithms. Among many results in computational complexity, two have been Wikified: his theorem with Les Valiant on unique SAT, and the subsequent generalized Isolation Lemma with Ketan Mulmuley and his brother Umesh Vazirani. Lately his focus has been more on computational aspects of auctions, economic systems, and game theory.
Today Ken and I wish to discuss a recent paper by Vijay on matching.
His paper is quite unique—is “quite” redundant?—well it is an isolated case of a top researcher taking the time and energy to explain the correctness of one of his famous algorithms, originally in joint work with Silvio Micali. As our field matures we should see more of this.
Given an undirected graph a matching, of course, is a set of pairwise non-adjacent edges: no two edges share a common vertex. A maximum matching is a matching that contains the largest possible number of edges. If is even then a matching with edges is perfect. The notorious Petersen graph
has six different perfect matchings: the five “spokes” and others using just one spoke. Although matchings are often associated with bipartite graphs, it is important to note that the Petersen graph is not bipartite. This is important since the matching algorithm we will discuss works for general graphs, not just bipartite graphs.
The Importance Of Matching
Finding a maximum matching is one of the foundational problems at the intersection of graph theory and algorithmic theory. The reason for this is:
- The problem is useful in practice. Unlike many problems that are sometimes studied in theory, people really would like to be able to find matchings of general graphs. See this for a detailed list.
The problem is basic to the theory of graphs. László Lovász and Michael Plummer wrote an entire book called Matching Theory on—you guessed right—matchings. They shared the Niveau Prize of the Hungarian Academy of Sciences for this book in 1991. We especially like the Amazon review of the book:
The subject has obviously advanced a lot since the book was published, but the overview provided is still unmatched.
- The problem is basic to the theory of algorithms. Euclid may have the first non-trivial algorithm, his algorithm for finding the gcd of two numbers, but algorithms for matching started over fifty years ago. This is not the place to trace out the long and complex history of who did what algorithm for finding matchings. Suffice it to say that much of the theory of algorithms can be traced to the history of matching algorithms: various tricks and methods that are used through computing often first appeared in matching algorithms. Andras Frank has written a wonderful piece called On Kuhn’s Hungarian Method—A tribute from Hungary about the bipartite case—see it for some interesting history.
Here is a picture of some of the applications of matching algorithms.
Let’s turn to look at Vijay’s new paper.
The Paper’s Tracks
Actually we need to first look at the original paper called An algoithm for finding maximum matching in general graphs by Silvio Micali and Vijay Vazirani. It was published in the 21st Annual Symposium on Foundations of Computer Science in 1980.
We will call this paper the MV paper, and its algorithm the MV algorithm. Note the running time is using “modern” internationally agreed-upon notation: we use always to denote the number of vertices and the number of edges. Paul Halmos was always careful about notation, and once said that his worst nightmare was that someone would type
The title of Vijay’s original paper was created before these agreements, so it is understandable that they used the “old” style notation in the title. It is also less prone to the problem that if you write it may be hard to see that the square-root sign does not also cover the .
This paper presented the MV algorithm with the given running time. The major advance was that it efficiently found a maximum matching in general graphs. There is an interesting cliff that happens in the theory of algorithms for finding maximum matchings. Bipartite graphs are just much easier for matching algorithms. Of course they are easier for many other algorithms: it is trivial to find a 2-coloring of a bipartite graph, but NP-hard to find a 3-colring in a general graph. One of the achievements of the MV paper is that everything works for general graphs.
The new paper is solely authored by Vijay. It has several goals, but the main one is to give a clear but full proof of the correctness of the MV algorithm. The algorithm remains the same, but the correctness proof is new. The previous proof had a flawed case analysis. The new proof avails itself of information that the previous proof bypassed, to make the analysis tighter and more manageable.
We will not give the whole proof, but we will give an algorithmic idea highlighted also by Vijay, which has independent interest.
Dueling Depth-First Searches
Let be a directed acyclic graph whose vertices are in layers numbered (for sinks) through (sources). Edges from any non-sink node go to nodes in lower layers, not necessarily the next layer, and every node has a path to a sink in layer . Let and be any two non-sink nodes. By Menger’s Theorem, either
- there are vertex-disjoint paths from and to sinks and , or
- the set every path from or to a sink goes through is nonempty.
In the latter case, let be the element of in the highest layer. The algorithm must find this highest bottleneck node within time proportional to the total number of edges in paths from or to . In the former case the algorithm must output paths to and within time proportional to the total number of edges on all paths from or to either of or .
The main puzzle is, how can we avoid searching past and taking more than the time allowed to the latter case, if we don’t even know that it holds? For a single local search, this does indeed seem to be impossible. However, what Vijay called “Double Depth-First Search” (DDFS) solves it. Two search posses led by rangers Green and Red start respectively at and and mark out “green” and “red” nodes and edges both. Neither may tread on the other’s ground. The rules are:
- Whichever posse is on a higher-level node gets to move; if they are on the same level, Red moves.
- If one posse’s depth-first search comes to an already-marked empty node —it could have been marked by either posse on an earlier visit—it goes back and tries another down-edge from its previous node. If there are no more down-edges, it backtracks up from its previous node as in normal depth-first search.
- If the other posse is on node , however, Red and Green fight a duel at 20 paces.
By permission of ComicArtFans gallery owner Mark Geier: source commissioned from artist Terry Beatty
The duel follows gentlemen’s rules. Red shoots in the air. Green takes the hint and backtracks looking for another node at the same level as or lower. If Green succeeds, he sends a Western Union telegram to Red, who puts his marker on and moves on. If Green gets stuck, then Red has to keep the same bargain. Red backtracks until he either finds another at the same level as or lower, whereupon he telegrams Green to come back and claim as his, or Red gets stuck too. If they are both stuck they both come back to . Each claims just the respective (necessarily different) (sets of) edges he followed into , but they identify itself as the highest bottleneck and share a bottle of hooch to celebrate.
They can celebrate because they have met the time guarantee: Neither searched below , and all the edges they traversed in their wanderings had to be on paths to , since they did not find any open trails going elsewhere. And in the case where they reach separate destinations and , any backtracking done by one posse because of hitting the other’s marked node is inductively on a path to the other’s goal, so at most the various routes between and get searched.
Making a search posse on a lower level wait for the other ensures that nodes below a bottleneck don’t get searched. Many other parts of the algorithm similarly rely on co-ordination with delicate timing considerations; this is just a taste of them. The new paper also has many other figurative concepts, with illustrations.
Do read Vijay’s paper. It is one of the clearest expositions of a graph algorithm—a model for others. Also the idea that Vijay went back to an ancient paper—over thirty years old—solely to write the definitive proof of its correctness is something we all should applaud.
Of course the central problem in matching theory is still what is the best time for an algorithm that finds a maximum matching? Given the interest today in huge graphs that arise from social networks we will like a linear time algorithm. Is this possible? See this for some fast approximate algorithms.
[added words “on all paths” to DDFS time requirement, name and word fixes and adds, permission for vintage Green Hornet/Lone Ranger artwork]