Hyman Bass is a professor of both mathematics and mathematics education at the University of Michigan, after a long and storied career at Columbia University. He was one of the first generation of mathematicians to investigate K-theory, and gave what is now the recognized definition of the first of the K-groups, $K_1(R)$, for a ring $R$. He co-founded with Jean-Pierre Serre a theory of graphs of groups, that is, mappings associating a group to each vertex and edge of a graph. He is a past President of the American Mathematical Society—and exemplifies the idea that the AMS promotes mathematics at all levels: since 1996 he has worked on elementary school math education with his Michigan colleague Deborah Ball.
Today Ken and I wish to discuss an idea used in a paper on the Jacobian Conjecture by Bass with Edwin Connell and David Wright.
One of the most powerful methods we use in theory is reduction of the dimension of a problem. The famous Johnson-Lindenstrauss (JL) theorem and its relatives show that problems in high-dimensional Euclidean spaces can often be reduced to much lower-dimensional problems, with little error. This method can be and has been used in many areas of theory to solve a variety of problems.
The idea used by Bass and many others is the opposite. Now we are interested in lifting a problem from a space to a much higher-dimensional space. The critical intuition is that by moving your problem to a higher space there is more “room” to navigate, and this extra “space” allows operations to be performed that would not have been possible in the original lower-dimensional space. An easily quoted example is that the Poincaré conjecture was proved relatively easily for spaces of dimension 5 and higher, then for dimension 4 in the 1980s, and finally for dimension 3 by Grigory Perelman drawing on work of Richard Hamilton.
Let $F$ be a polynomial map from $\mathbb{C}^n$ to $\mathbb{C}^n$, where $F = (f_1, \dots, f_n)$ and each $f_i$ is a polynomial in the variables $x_1, \dots, x_n$.
The Jacobian Conjecture (JC), which we have covered several times before, indeed recently, studies when such maps are injective. The trouble with the JC is quite simply that such maps can be very complex in their structure, which explains why the JC remains open after more than 70 years.
A very simple, almost trivial, idea is to replace $F$ by $\hat{F}$, which is defined by

$\hat{F}(x_1, \dots, x_n, z_1, \dots, z_m) = (F(x_1, \dots, x_n), z_1, \dots, z_m)$

for new variables $z_1, \dots, z_m$. For example, if $F(x) = x^2 + 1$, then $\hat{F}$ could be $\hat{F}(x, z) = (x^2 + 1, z)$.
The new variables are used in a “trivial” way. One reason the method is useful for the JC is that $F$ is injective if and only if $\hat{F}$ is injective. This is trivial: the new variables $z$ do not interact in any manner with the original variables $x$, and so the map $\hat{F}$ is injective precisely when $F$ is.
Why is this of any use? Clearly $\hat{F}$ is really just $F$ with an identity function on the variables $z$ “tacked on” to it. How can this help?
The answer is that we can use the extra variables to modify the polynomial map $\hat{F}$ so that it looks very different from $F$. Suppose that we replace $\hat{F}$ by $G = \sigma \circ \hat{F}$, where $\sigma$ is a nice polynomial map. We may be able to vastly change the structure of $\hat{F}$ and get a $G$ that is much simpler on some measure. The hope is that this restructuring will allow us to prove something about $G$ that then implies something about $F$. This is exactly what happens in the study of the JC.
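In symbols the two steps look as follows; the notation here is ours and may differ from the paper’s:

```latex
\hat{F}(x_1,\dots,x_n,z_1,\dots,z_m) \;=\; \bigl(F(x_1,\dots,x_n),\, z_1,\dots,z_m\bigr),
\qquad
G \;=\; \sigma \circ \hat{F}.
```

Since $\sigma$ is invertible, $G$ is injective exactly when $\hat{F}$ is, which in turn holds exactly when $F$ is.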
Here is a famous result from the Bass-Connell-Wright paper:
Theorem 1 Given a polynomial map $F$ from $\mathbb{C}^n$ to $\mathbb{C}^n$, there are $m \ge 0$ and an automorphism $\sigma$ of $\mathbb{C}^{n+m}$ so that $G = \sigma \circ \hat{F}$ has degree at most three. Moreover $G$ is injective if and only if $F$ is.
This allows the JC question to be reduced to the study of cubic maps. Of course such low-degree maps are still complex in their behavior, but the hope is that while they are on many more variables, the restriction on their degree may make their structure easier to understand. This has yet to be fruitful—the JC remains open. The following quotation from the paper explains the idea:
One idea is to try to use this stabilization philosophy to attack problems about polynomials that arise in complexity theory. For example, can we use the addition of extra variables to attack the power of polynomials modulo composite numbers? Suppose that $f$ is a polynomial in many variables that modulo $m$ computes something interesting. What if we add extra variables as above, then rearrange the resulting polynomial to have a “better” structure? We must preserve not its injectivity, but for us its ability to compute something. If we can do this, then perhaps we can use the extra variables to obtain a lower bound.
Two simple examples from arithmetic complexity are aliasing variables in a formula to make it read-once, and the Derivative Lemma proved by Walter Baur and Volker Strassen. In the latter, the idea is to take a certain gate $g$ of a circuit $C$ computing a function $f$, and regard it as a new variable $y$ in a circuit $C'$. The circuit $C'$ computes a function $f'$ of $n+1$ variables such that

$f(x_1, \dots, x_n) = f'(x_1, \dots, x_n, g(x_1, \dots, x_n)).$
In the Derivative Lemma the gate is chosen to involve one or two input variables, but the idea can be extended to other cases. When the construction is applied recursively we generate circuits in which the number of variables becomes higher and higher, but the circuits themselves become flatter and flatter, until only a “stardust” of many single-variable functions is left.
Are there general rules of thumb on when you should increase versus decrease the dimension? Or when to induct on the output gate(s) of a circuit, on the number of variables going down, or on the number of variables going up?
Can chess statistics help design multiple-choice exams?
UT Dallas source |
Bruce Pandolfini is one of very few living people who have been played by an Oscar-winning actor. He was the real-life chess teacher of junior chess player Joshua Waitzkin, who went on to become the world champion—in T’ai Chi. Their story is told in the movie “Searching For Bobby Fischer” with Sir Ben Kingsley, which is still iconic after 20-plus years. Pandolfini is still doing what he loves as an active chess teacher in New York City. For much of this time he has also written a popular feature called “Solitaire Chess” for Chess Life magazine, which is published by the United States Chess Federation.
Today Dick and I wish to compare styles of multiple-choice exams, with reference to “Solitaire Chess,” and have some fun as well.
Most multiple-choice questions are designed to have a unique correct answer, with all other answers receiving 0 points or even a minus. This is like a chess problem of “find the winning move” type. Mate-in-2, mate-in-3, and endgame problems generally have unique answers—a “dual” solution is an esthetic blemish. There are several popular websites devoted to this kind of chess puzzle, which is great for honing one’s tactical ability.
“Solitaire Chess” is different, with more emphasis on strategy. The reader takes the winning side of a notable game Bruce has prepared, and chooses his/her move before revealing the answer and the opponent’s next move. It simulates the feeling of playing a master game.
Incidentally Bruce recently attended the wedding of another master player represented in the movie, the real-life Asa Hoffman, to the former Virginia LoPresto; both remember me from New York tournaments in the 1970s. Among the children Bruce coached in their preteen years is the world’s current 5th-ranked player, Fabiano Caruana of Brooklyn and Italy.
The difference we emphasize is that the game positions often give partial or even full credit for alternative choices. For example, here is the position at move 22 in the March 2014 Chess Life column, of a game played in 1934 by Fred Reinfeld, an earlier master teacher who wrote many great books into the 1960s. White is to play:
The top score of 5 points goes to the capture move 22.axb4, but the alternatives 22.Nc6 and 22.Ne6+ are deemed almost as good, worth 4 points each, while the non-capture move 22.a4 still gets 2 points. Several other game turns have 3-point partial credits. At the end is a chart connecting your total score over all the moves to the standard chess rating scale devised by Arpad Elo. For instance, 81–94 points is deemed the range of a 2200–2399 master player such as myself, while 36–50 is for a “good club player” with 1600–1799 rating.
Pandolfini’s expert judgment goes into setting both the partial credits and the overall assessment scale. Although chess positions often have 30–50 legal moves or even more, there are typically at most 3–5 moves worth considering, so this is like a standard multiple-choice test in that way. The partial credits, however, are more typical of ranking applications such as judging the value of search-engine hits, where there are 10, 20, 30, or hundreds or thousands of choices to consider. Our topic is about having the best of both kinds of application, and how to do the assessment scientifically.
Well we guess you didn’t come to a blog to take an exam, so we’ll try to make at least the first part fun, before we introduce more “strategic” questions with partial credits. You are on your honor not to Google the answers—we can tell of course; we won’t tell you how we know but our heartbleeds for you.
OK, more serious now. Start your engines. Actually in chess, “start your engines” refers to computer chess programs, and would mean either you are cheating, or you are playing in the InfinityChess Freestyle tournament, which finishes today.
We think each of our latter six “multi-choice” questions has a clear best answer, but our judgment comes from perspectives in our field. For instance, “structural” complexity came with a specific meaning apart from algorithmic and practical considerations. Even granting that meaning, arguments can be made for several answers to the last question—all except the one that is false on current knowledge. For example, random-oracle results used to be considered stronger evidence than is commonly ascribed to them now.
We could have made catch-all “some of the above” answers as in our first set. However, this would miss our feeling of there being a pecking order even among the non-optimal answers. Again with reference to the last question, random oracles and complete languages are “structural” while the history of classifying problems is not, and between the first two, lacking a completeness level is not generally evidence of being tractable. Hence we see the possibility of better assessment by giving different partial credits to these answers.
An even more quantitative option is to ask the test-taker to rate each statement on (say) a 0–5 scale. This would be just like asking the takers to estimate the partial credits themselves. We could then score according to distributional similarity to our own assignment, weighting closeness on the best answers the most. Of course this style of grading is most appropriate to judging search engines, based on an expert reference assessment of the importance of the various ‘hits’ returned. And it is also like simulating the creation of “Solitaire Chess” itself—more than just looking for the best move which is what we do when we actually play chess. Thus the teacher has a harder task than the player.
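As a rough sketch of this grading style, here is one possible scoring rule. It is our own invented formula, not one from the column: compare a taker’s 0–5 ratings to the expert’s reference ratings, weighting deviations on the expert’s best answers the most, and normalize to a 0–100 scale.

```python
# Illustrative only: score a taker's 0-5 ratings against an expert's
# reference partial credits, penalizing disagreement on top answers most.

def similarity_score(reference, ratings):
    total = 0.0
    for choice, ref in reference.items():
        weight = 1 + ref  # deviations on highly-rated answers cost more
        total += weight * abs(ref - ratings.get(choice, 0))
    worst = sum((1 + ref) * 5 for ref in reference.values())
    return round(100 * (1 - total / worst), 1)

reference = {'a': 5, 'b': 4, 'c': 2, 'd': 0}   # expert's partial credits
print(similarity_score(reference, reference))          # 100.0
print(similarity_score(reference, {'a': 3, 'b': 4}))   # 76.0
```

Many other weightings would serve; the point is only that agreement with the whole reference distribution, not just the single best answer, determines the score.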
The most ambitious goal is to turn the process around by making backwards inferences about the values of questions from the aggregated selection of many well-informed takers. In chess this would be like judging the value of a move based on the proportion of strong players who choose it. Nowadays this is regarded as overruled by the judgments of strong computer programs, notwithstanding the issue that players’ “book knowledge” of past games makes their choices less independent than among test takers.
However, the ability in chess to correlate players’ judgments with computer values of moves, and map the distributions, may help us make inferences about “objective value” from the distributions of the test-takers. The computer values are scientifically objective partial credits, and “Solitaire Chess” could be scored as an app that way. This all plays into quantifying the wisdom of crowds along lines discussed toward the end of the Distinguished Speaker lecture given by Lance Fortnow on his visit to Buffalo last week. At least this is our motive for making tests more like strategic chess.
What partial-credit values would you assign to our complexity questions?
Should multiple-choice tests be more like “Solitaire Chess”? Does one obtain deeper and better assessment that way? Is the difference important enough to massive online courses?
§
Here are the answers to our April Fool’s anagram quiz, besides “Pearl Gates” = Peter Sagal and “Slack Laser” = Carl Kasell:
Cases where we can count objects in polynomial time
New Interim Dean source—our congrats |
Mike Sipser is one of the top theorists, especially in complexity theory, and has been Head of the Mathematics Department at MIT since 2004. He has long been known for some very clever proofs and his famous survey paper on the status of the P versus NP question. He is more recently known for co-authoring the paper that introduced adiabatic quantum computing, which is the kind being given physical realization by the company D-Wave Systems. But he is perhaps best known nowadays for his textbook Introduction to the Theory of Computation. I have used the earlier versions many times for teaching our undergraduate class; I have not used the third edition, mainly because I teach other things these days.
Today Ken and I wish to talk about a topic that is covered in his book, finite state automata, in relation to counting.
Yes, lowly finite state automata (FSA). In my opinion FSA are one of the great inventions of theory. They led Michael Rabin and Dana Scott to discover nondeterminism, yielding a Turing Award along the way. They led algorithm designers like Don Knuth to discover, with Jim Morris and Vaughan Pratt, the first linear-time pattern matching algorithm. And much more.
Mike’s book was discussed before here, where I talked about his use of FSA to prove that the first order theory of addition is decidable. This is one of my favorite applications of FSA, which I learned from Ann Yasuhara directly—it is also included in her early book, Recursive Function Theory and Logic.
Mike’s book proves some interesting theorems that are ancient—okay, many decades old. This yields what in retrospect look like omissions, which are bound up with the history of the theorems. For example, consider the following classic one:
Theorem 1 A language is context-free if and only if some nondeterministic pushdown automaton (PDA) accepts it.
This is proved in detail in his book—see page 117 in the new edition, Theorem 2.20. But the proof establishes more, much more, that is not stated. I have used both of these consequences in my own work over the years:
Note that the best construction, I believe, goes from a PDA with $n$ states to a grammar with $O(n^3)$ rules. If there is a better one it would have some interesting consequences.
Let me say that this omission is not isolated to Mike’s book; other books almost always leave out these interesting refinements. I believe that the reason is simple: the above theorem was proved before polynomial time was defined, and the textbooks are sequenced that way. Hence the omission. Perhaps they can be added to a fourth edition.
Ken adds—also writing much of the rest of this post: I now like the sequence of skipping grammars and PDA’s and going straight from regular languages and FSA to Turing computability and complexity. After this ‘kernel’ material the instructor has an option of covering grammars, and then the polynomial-overhead concepts are in place. Or the instructor can do more with complexity or logic or some other core topic.
Let’s get back to FSA and counting.
Theorem 2 Let $M$ be a deterministic FSA with $q$ states and input alphabet $\Sigma$. Then we can determine the cardinality of the set $A = \{x \in \Sigma^n : M \text{ accepts } x\}$ in time polynomial in $q$ and $n$.
Put another way, we can count in polynomial time the number of strings of length $n$ that a FSA accepts. If the FSA is fixed, then the time is polynomial in $n$ alone. Actually, in this case the time is essentially bi-linear in $n$ and $q$—it depends on the exact computational model.
The algorithm to prove this is often described as “dynamic programming,” but often that just means maintaining a well-chosen data structure. Here we allocate a slot for each state $s$ of the FSA, and maintain for each $s$ the number $N_t[s]$ of strings of length exactly $t$ that reach $s$ from the start state $s_0$. Initially $N_0[s_0] = 1$, for the empty string, and $N_0[s] = 0$ for all other $s$. Now to update $N_t$ to $N_{t+1}$, find for each state $s$ all states $r$ and characters $c$ such that $\delta(r, c) = s$, where $\delta$ is the transition function of $M$, and sum up the $N_t[r]$. That is,

$N_{t+1}[s] = \sum_{r,\,c \,:\, \delta(r,c) = s} N_t[r].$
Finally $|A|$ equals the sum of $N_n[s]$ over all accepting states $s$. Assuming random access to the data slots, and unit time for arithmetic on small numbers, this runs in time $O(n \cdot q \cdot |\Sigma|)$.
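Here is a minimal sketch of the dynamic program in Python. The DFA encoding, a dictionary transition table, and all the names are our own conventions, not from the book:

```python
# A minimal sketch of the counting dynamic program for a DFA.

def count_accepted(delta, start, accepting, states, alphabet, n):
    """Count the strings of length n accepted by a DFA, in O(n*q*|alphabet|) steps."""
    # N[s] = number of strings of the current length that take the DFA from start to s
    N = {s: 0 for s in states}
    N[start] = 1  # the empty string reaches the start state
    for _ in range(n):
        N2 = {s: 0 for s in states}
        for r in states:
            if N[r]:
                for c in alphabet:
                    N2[delta[(r, c)]] += N[r]
        N = N2
    return sum(N[s] for s in accepting)

# Example: binary strings with an even number of 1s; 4 of the 8 length-3 strings qualify.
delta = {('e', '0'): 'e', ('e', '1'): 'o', ('o', '0'): 'o', ('o', '1'): 'e'}
print(count_accepted(delta, 'e', {'e'}, {'e', 'o'}, '01', 3))  # 4
```

Note the counts themselves can grow to $|\Sigma|^n$, so the unit-cost arithmetic assumption matters for the stated running time.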
Sounds simple, but as often happens in complexity, delicacy and difficulty lurk not too far away.
First, does the algorithm work when $M$ is nondeterministic? It certainly runs, and counts something, but not the number of accepted strings. So can we modify it to do so?
The answer is no—or maybe better to say “ostensibly not”: Given a Boolean formula $\phi$ that is a conjunction of clauses $C_1, \dots, C_m$, design an NFA $N$ that begins with a nondeterministic choice of one of $m$ states $q_1, \dots, q_m$. From every $q_j$, $N$ starts reading its input deterministically. The part of $N$’s code from $q_j$ is written so that if $x$ is an assignment that makes every literal in $C_j$ false—note that the literals can be presented in index order—then $N$ accepts $x$. Thus the formula is unsatisfiable if and only if $N$ accepts all $2^n$ strings of length $n$. So if we had a polynomial-time algorithm to compute the number of strings $N$ accepts, we’d have P = NP.
Moreover, $2^n$ minus this count equals the number of satisfying assignments. Hence relaxing from DFA to NFA makes our little problem #P-complete. Now it’s important that $N$ is part of the input. If $N$ is fixed and we only want to compute the count given $n$, then of course we can convert $N$ to an equivalent DFA which is likewise fixed, and run our algorithm. Hence one must be delicate about what constitutes the input.
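To illustrate the fixed-NFA point, here is a sketch that counts by doing the subset construction on the fly, so the same dynamic program applies to the determinized machine. The NFA encoding, a dict mapping (state, character) to a set of states, is our own convention:

```python
# Count length-n strings accepted by an NFA via on-the-fly determinization.

def nfa_count(trans, starts, accepting, alphabet, n):
    N = {frozenset(starts): 1}  # counts indexed by reachable subsets of NFA states
    for _ in range(n):
        N2 = {}
        for S, cnt in N.items():
            for c in alphabet:
                T = frozenset(q for s in S for q in trans.get((s, c), ()))
                N2[T] = N2.get(T, 0) + cnt
        N = N2
    # A string is accepted exactly when its subset contains an accepting state.
    return sum(cnt for S, cnt in N.items() if S & accepting)

# NFA for binary strings containing the substring 11.
trans = {('a', '0'): {'a'}, ('a', '1'): {'a', 'b'},
         ('b', '1'): {'c'},
         ('c', '0'): {'c'}, ('c', '1'): {'c'}}
print(nfa_count(trans, {'a'}, {'c'}, '01', 3))  # 3: namely 011, 110, 111
```

The catch, as above, is that the number of reachable subsets can be exponential in the size of the NFA—harmless when the NFA is fixed, fatal when it is part of the input.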
For an example, call a string special if it contains exactly $k$ copies of the pattern for some $k$ that is a prime number. We can count the number of special strings of length $n$ in time polynomial in $n$. To do so, we program a choice over all $k \le n$ as before, this time keeping only the states $q_k$ for which $k$ is prime. From each $q_k$ we program a DFA $M_k$ that accepts a string if and only if it contains exactly $k$ copies of the pattern. This resembles the NFA case above, but is different because there is no overlap in the strings accepted by the respective $M_k$. So we just count the numbers of length-$n$ strings accepted by each $M_k$ and add them up.
This is not a hugely important application. We selected it to show that there are counting problems that might be tricky to solve without the FSA method. This and other examples may be useful.
Note that even though the decision problem is in P, the counting problem is still #P-complete. For an aside, this problem is a major example in Mike’s quantum paper, but here we raise the question, what else can be counted?
For instance, counting the number of solutions to an $n$-variable polynomial $f$ over $\mathbb{Z}_m$ is #P-complete. It becomes polynomial-time, however, when $f$ has degree at most 2. This is when $m$ is fixed. What if $m$ is variable? We can also ask about computing $N_a = |\{x : f(x) = a\}|$ for all $a$, for the particular purpose of computing the exponential sum

$S(f) = \sum_{x \in \mathbb{Z}_m^n} \omega^{f(x)} = \sum_{a} N_a \,\omega^a,$

where $\omega = e^{2\pi i/m}$.
Dick co-authored a paper recently with Jin-Yi Cai, Xi Chen, and Pinyan Lu, showing that when $f$ has degree at most 2, $S(f)$ is computable in time polynomial in $n$ and $\log m$. In particular this means that for $m = 2^k$, when we can represent the members of $\mathbb{Z}_m$ by strings in $\{0,1\}^k$, the time to compute $S(f)$ is polynomial in $n$ and $k$.
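For concreteness, the sum can always be evaluated by brute force over all $m^n$ points; the content of the theorem is that degree-2 polynomials admit far faster evaluation. A toy sanity check, with our own names throughout:

```python
import cmath
from itertools import product

# Brute-force S(f) = sum over x in (Z_m)^n of omega^(f(x)), omega = e^(2*pi*i/m).
# This takes m^n steps and is only a sanity check, not the polynomial-time algorithm.

def exp_sum(f, n, m):
    omega = cmath.exp(2j * cmath.pi / m)
    return sum(omega ** (f(x) % m) for x in product(range(m), repeat=n))

# Degree-2 example with m = 2: f(x1, x2) = x1*x2 gives 3*(+1) + 1*(-1) = 2.
s = exp_sum(lambda x: x[0] * x[1], n=2, m=2)
print(round(s.real), round(s.imag))  # 2 0
```

For $m = 2$ the sum specializes to the familiar $\sum_x (-1)^{f(x)}$, which measures the bias of $f$ as a Boolean function.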
A little thought shows that this suffices to compute any individual number $N_a$ in time polynomial in $n$ and $m$, indeed by computing all of them. But if we just want one, and $m = 2^k$ is large, can we do it in time polynomial in $n$ and $k$? This is not obvious to me (Ken), and at least for now leaves the funny situation where we can compute in that time the historically important function $S(f)$, which involves all the values $N_a$, but we don’t see a way to compute any one of them in the same time.
Who originally proved efficient counting for deterministic FSA? We have not been able to track that down. Are there some cool applications?
What is the answer to the last problem? Should counting receive more emphasis in texts at the level of Mike’s?
I (Ken) add a story: I first met Mike when we shared a compartment of the train between Hamburg, Germany and Aarhus, Denmark, on the way to ICALP 1982. I had just moved from algebraic combinatorics into complexity during my first year at Oxford, and naturally asked him what were the prospects for proving P ≠ NP. He replied “It will be proved … yes we will prove it,” and backed up his confidence by naming some results and giving some ideas along lines that would later be called the “Sipser programme” of approach via circuit lower bounds. (Did we mention that he also wrote with Ravi Boppana a bellwether survey on circuit complexity?) I guess there wasn’t a time limit on his assertion…
Altered from NPR source. |
Faadosly Polir is back in contact with us. We have encountered him before, but this time he has company. He has teamed up with two well-known radio personalities to launch a quiz show about mathematics.
Today Ken and I wish to talk about material for the show that Faadosly has sent us.
The draft he sent has the subtitle, “Strange Facts About Mathematics.” The material is not only about strange facts, but is strange in itself. Many of the “facts” are false. Everyone’s name is replaced by an anagram. Decoding the name to a true person helps you, but the fact can still be false. His co-authors on the scripts are named Pearl Gates and Slack Laser.
Apparently the idea is that the quiz contestant hears three “facts” and has to say which one is true. The problem is that their math items are not arranged that way, but rather jumbled all together. The proportion of true ones seems to be more like 1/2 than 1/3. Perhaps it is harder to generate true ones that seem false than false ones that are plausible, so they figure they can do more of the latter later? Anyway, they are allowing us to share some with you.
Here is a sampler from their draft—a mix of true stories and “fools.” Count a “fool” if either the personal description or the mathematical “fact” is false, or both.
Hydra Dye Frog, while well known for his brilliant mathematics in number theory and analysis, once wrote a seminal paper on genetics. The issue was whether a dominant character should show a tendency to spread over a whole population; or put another way would all recessive characters tend to die out. Assuming a reasonable random model of mating, he showed that dominant genes would not force out recessive ones.
Gail Kali is a mathematician from Chennai, India. Like her countryman Srinivasa Ramanujan, she credits a Hindu goddess for her mathematical insights—of course in her case the goddess Kali. In a dream Kali told her that every generalized Platonic solid with central symmetry in $n$-dimensional space has at least $3^n$ lower-dimensional faces, including its vertices. Unfortunately, Gail woke up before Kali got beyond the all-triangles cases in the proof.
Rich-Cal Fried Sugars was another great mathematician known for many results in number theory. But in his lifetime he was famous for helping solve one of the great mysteries of his time. Astronomers were tracking the position of Ceres, a huge asteroid, and they lost it owing to the glare of the sun. He famously predicted the path of where it would be, and on New Year’s Eve astronomers found it.
One-Ale Hurdler proved that there was no way to route one-way traffic touring his whole city without causing a snarl. Even before his rule that T-junctions were bad news, he discovered that the bridges made it impossible. This problem founded an entire big branch of mathematics.
One-Cat Tree proved that a relativistic smoothing of the Navier-Stokes equations allows infinite concentration of energy in finite time, because of its ability to simulate universal computation. Running this in reverse, it follows from the quantum complementarity of energy and time that the Big Bang did a lot of computation in its brief timespan. The computational results can now be read off from the pattern of gravity waves in the cosmic microwave background. At over a trillion times the concentrated power of the Large Hadron Collider, the Big Bang Computer represents almost a billion times the processing capacity of Google, whose data-mining efforts to read off long numerical calculations from it are underway. Thus we have it: the digits of $\pi$ in the sky.
Wes Gromit is not related to the famous British star of animated films, but his father has composed many movie scores. In disjoint work with One-Cat Tree and Gene Bern, he proved the existence of an infinite arithmetical progression with exactly two prime numbers in it. Although married with five children, he was declared a bachelor by Queen Elizabeth II in 2012.
Annual Trig was a mathematician who along with a colleague introduced modulo wrap-around computation to cryptography. They actually used it modulo six, but the ability to wrap was essential to make their system work. They were working on an analog system to scramble telephone calls. In their system there were six bands that had to be moved around, and they found that by doing certain arithmetic operations modulo six they had more security.
Town Falconer proved that there is a language in whose complement does not have an interactive protocol. This flies in the face of there being an oracle constructed by Amish Raid that gives every language in such a protocol, so that Falconer and Semi Spiker are credited with proving the only truly natural non-relativizing result in complexity.
Joint Who Tolled was a mathematician famous for work in analysis and related areas. He was a doubter of the Riemann Hypothesis (RH) throughout his life. There are claims—but who can tell—that he might have changed his mind if he knew some of the modern computations that show RH holding for a huge number of zeroes.
Glacial Warmish is a complexity theorist who is interested in Kolmogorov Complexity, among other things. One of his main results is a proof that this theory gives a natural construction of a set that is undecidable, but is weaker than the Halting problem.
Glib Tales co-wrote a paper proving what until recently was the fastest known way to flip pancakes in a stack so that the larger ones end up below smaller ones. Instead of opening a pancake business he founded a company which became rich enough to sponsor him in a match against the world chess champion. However, he got checkmated in nine moves. The paper still earned him an Erdős Number of 4.
Sonata Consort proved that quantum computers can solve -complete problems. Although his algorithm is “galactic” in worst case, his ideas inspired a Vancouver startup company to build a quantum computer that solves -complete problems in many cases, perhaps even without needing anything “quantum” at all. As this company’s most valuable consultant, he is paid in a quantum-money scheme of his own devising.
Which ones are the “fools”? Try not to be fooled today. As a hint, the fools have something in common…
A famous algorithm, a new paper, a full correctness proof
Vijay Vazirani is one of the world experts on algorithmic theory, and is especially known for his masterful book on approximation algorithms. Among many results in computational complexity, two have been Wikified: his theorem with Les Valiant on unique SAT, and the subsequent generalized Isolation Lemma with Ketan Mulmuley and his brother Umesh Vazirani. Lately his focus has been more on computational aspects of auctions, economic systems, and game theory.
Today Ken and I wish to discuss a recent paper by Vijay on matching.
His paper is quite unique—is “quite” redundant?—well it is an isolated case of a top researcher taking the time and energy to explain the correctness of one of his famous algorithms, originally in joint work with Silvio Micali. As our field matures we should see more of this.
Given an undirected graph $G$, a matching, of course, is a set of pairwise non-adjacent edges: no two edges share a common vertex. A maximum matching is a matching that contains the largest possible number of edges. If the number $n$ of vertices is even then a matching with $n/2$ edges is perfect. The notorious Petersen graph
has six different perfect matchings: the five “spokes” together form one, and the five others each use just one spoke. Although matchings are often associated with bipartite graphs, it is important to note that the Petersen graph is not bipartite. This matters because the matching algorithm we will discuss works for general graphs, not just bipartite graphs.
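The count of six can be checked by brute force over the $\binom{15}{5}$ five-edge subsets; the vertex numbering here is ours:

```python
from itertools import combinations

# Brute-force check that the Petersen graph has exactly six perfect matchings.
# Vertices 0-4 form the outer 5-cycle, 5-9 the inner pentagram, with spokes i--i+5.
outer  = [(i, (i + 1) % 5) for i in range(5)]
spokes = [(i, i + 5) for i in range(5)]
inner  = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
edges = outer + spokes + inner  # 15 edges in all

def covers_all(m):
    ends = [v for e in m for v in e]
    return len(set(ends)) == 10  # 5 disjoint edges cover all 10 vertices

matchings = [m for m in combinations(edges, 5) if covers_all(m)]
print(len(matchings))  # 6
```

One of the six is the all-spokes matching; each of the other five contains exactly one spoke.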
Finding a maximum matching is one of the foundational problems at the intersection of graph theory and algorithmic theory. The reason for this is:
The subject has obviously advanced a lot since the book was published, but the overview provided is still unmatched.
Cool pun.
Here is a picture of some of the applications of matching algorithms.
Let’s turn to look at Vijay’s new paper.
Actually we need first to look at the original paper, called “An $O(\sqrt{|V|} \cdot |E|)$ Algorithm for Finding Maximum Matching in General Graphs,” by Silvio Micali and Vijay Vazirani. It was published in the 21st Annual Symposium on Foundations of Computer Science in 1980.
We will call this paper the MV paper, and its algorithm the MV algorithm. Note the running time is $O(m\sqrt{n})$ using “modern” internationally agreed-upon notation: we always use $n$ to denote the number of vertices and $m$ the number of edges. Paul Halmos was always careful about notation, and once said that his worst nightmare was that someone would type
The title of Vijay’s original paper was created before these agreements, so it is understandable that it uses the “old” style notation. It is also less prone to the problem that if you write $\sqrt{n}m$ it may be hard to see that the square-root sign does not also cover the $m$.
This paper presented the MV algorithm with the given running time. The major advance was that it efficiently found a maximum matching in general graphs. There is an interesting cliff in the theory of algorithms for finding maximum matchings. Bipartite graphs are just much easier for matching algorithms. Of course they are easier for many other algorithms: it is trivial to find a 2-coloring of a bipartite graph, but NP-hard to find a 3-coloring of a general graph. One of the achievements of the MV paper is that everything works for general graphs.
The new paper is solely authored by Vijay. It has several goals, but the main one is to give a clear but full proof of the correctness of the MV algorithm. The algorithm remains the same, but the correctness proof is new. The previous proof had a flawed case analysis. The new proof avails itself of information that the previous proof bypassed, to make the analysis tighter and more manageable.
We will not give the whole proof, but we will give an algorithmic idea highlighted also by Vijay, which has independent interest.
Let $H$ be a directed acyclic graph whose vertices are in layers numbered 0 (for sinks) through $\ell$ (sources). Edges from any non-sink node go to nodes in lower layers, not necessarily the next layer, and every node has a path to a sink in layer 0. Let $u$ and $v$ be any two non-sink nodes. By Menger’s Theorem, either there are vertex-disjoint paths taking $u$ and $v$ to two distinct sinks, or there is a nonempty set $B$ of “bottleneck” nodes through which every path from $u$ or $v$ to a sink must pass.
In the latter case, let $b$ be the element of $B$ in the highest layer. The algorithm must find this highest bottleneck node within time proportional to the total number of edges on paths from $u$ or $v$ to $b$. In the former case the algorithm must output the two disjoint paths within time proportional to the total number of edges on all paths from $u$ or $v$ to either of the two sinks reached.
The main puzzle is, how can we avoid searching past $b$ and taking more than the time allowed to the latter case, if we don’t even know that it holds? For a single local search, this does indeed seem to be impossible. However, what Vijay calls “Double Depth-First Search” (DDFS) solves it. Two search posses led by rangers Green and Red start respectively at $u$ and $v$ and mark out “green” and “red” nodes and edges. Neither may tread on the other’s ground. The rules are:
By permission of ComicArtFans gallery owner Mark Geier: source commissioned from artist Terry Beatty |
The duel follows gentlemen’s rules. Suppose both posses arrive at the same node $b$, with Green there first. Red shoots in the air. Green takes the hint and backtracks looking for another node at the same level as $b$ or lower. If Green succeeds, he sends a Western Union telegram to Red, who puts his marker on $b$ and moves on. If Green gets stuck, then Red has to keep the same bargain. Red backtracks until he either finds another node at the same level as $b$ or lower, whereupon he telegrams Green to come back and claim $b$ as his, or Red gets stuck too. If they are both stuck they both come back to $b$. Each claims just the respective (necessarily different) sets of edges he followed into $b$, but they identify $b$ itself as the highest bottleneck and share a bottle of hooch to celebrate.
They can celebrate because they have met the time guarantee: Neither searched below the bottleneck, and all the edges they traversed in their wanderings had to be on paths to it, since they did not find any open trails going elsewhere. And in the case where they reach separate destinations, any backtracking done by one posse because of hitting the other’s marked node is inductively on a path to the other’s goal, so at most the various routes between the two destinations get searched.
Making a search posse on a lower level wait for the other ensures that nodes below a bottleneck don’t get searched. Many other parts of the algorithm similarly rely on co-ordination with delicate timing considerations; this is just a taste of them. The new paper also has many other figurative concepts, with illustrations.
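The time guarantee is the delicate part; the bottleneck itself has a simple definition, which we can check by brute force. Below is a hedged sketch (ours, not Vazirani's linear-time DDFS) of what the two posses certify on a layered DAG: for each node compute the set of nodes lying on every path from it down to a sink, then intersect the sets for the two start nodes. The names `bottleneck` and `must` are our own illustration.

```python
from functools import lru_cache

def bottleneck(graph, layer, u, v):
    """Return the highest node lying on *every* sink-bound path from both
    u and v, or None if u and v have vertex-disjoint paths to sinks.
    Brute force for illustration; DDFS certifies the same answer in
    time proportional to the relevant edges only."""

    @lru_cache(maxsize=None)
    def must(x):
        succs = graph.get(x, [])
        if not succs:                 # x is a sink in layer 0
            return frozenset([x])
        # nodes on every path from x: x itself, plus nodes on every
        # path from each of its successors
        return frozenset.intersection(*(must(y) for y in succs)) | {x}

    shared = (must(u) & must(v)) - {u, v}
    return max(shared, key=layer.get) if shared else None
```

On a graph where `u` and `v` both funnel through a single node above the sink, that node is returned; give either start an edge to a fresh sink and the answer becomes `None`, which is Menger's disjoint-paths case.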
Do read Vijay’s paper. It is one of the clearest expositions of a graph algorithm—a model for others. Also the idea that Vijay went back to an ancient paper—over thirty years old—solely to write the definitive proof of its correctness is something we all should applaud.
Of course the central problem in matching theory is still: what is the best time for an algorithm that finds a maximum matching? Given the interest today in huge graphs that arise from social networks, we would like a linear-time algorithm. Is this possible? See this for some fast approximate algorithms.
[added words "on all paths" to DDFS time requirement, name and word fixes and adds, permission for vintage Green Hornet/Lone Ranger artwork]
Jeffrey Shallit is a computational number theorist, with many wonderful results. He is also well known for his work as an advocate for civil liberties on the Internet, and also against intelligent design (ID). Indeed he was at one point slated to testify in the 2005 Dover case until a stand-down by the ID side accomplished his purpose. Ken once used his survey with Wesley Elsberry in a graduate seminar on cellular automata and various forms of complexity, not to say anything about ID, but just as a source of well-written definitions and relevant examples.
Today I would like to talk about Michael Rabin’s talks at Tech this week and their connection to Jeff.
Both of Rabin’s talks were great, no surprise. They were based on his recent paper with Silvio Micali in February’s issue of the CACM. Rather than talk about details I will focus on one aspect that used a “trick” that I particularly liked, and thought I would share it with you. The trick is related to an old paper with Jeff.
The paper appears on the ACM website with the first author named “J.O. Rabin.” Those are Jeff’s initials, with the ‘O’ interestingly standing for “Outlaw.” Michael also has ‘O’ as his middle initial, so it was an easier mistake to make. O well.
Rabin showed, roughly, how to perform straight-line computation in a manner that keeps the values hidden from others. His computations allowed addition, multiplication, and comparison. The last is what I found quite interesting, since hiding ring operations is not too difficult, but hiding a highly non-linear operation like comparison seems quite different. Even multiplication, while non-linear, is bi-linear, and has a lot of linear structure that can be exploited.
The basic approach is to use the split value representation of a number. A large prime $p$ is fixed, and each value $x$ is represented by some pair $(u,v)$ so that $u + v \equiv x \pmod{p}$. If the split is done randomly, then knowing one of the pair gives no information at all about the value of $x$. This idea has been used before, of course, in many crypto protocols, often in multi-party protocols. Rabin’s work uses this method to make auctions secret, and does the same for electronic elections. See their paper for the details of what type of security he and Micali are able to achieve.
Let’s go back to the comparison operation. In order to perform it using split pairs, Rabin needs to use Lagrange’s Four Square Theorem, which I state in a moment. The reason is that he can reduce comparison to checking that some value is in a certain range, but he needs that the value not be too large. Lagrange’s theorem allows him to do that.
Joseph Lagrange in 1770 proved the wonderful theorem:
Theorem: Every natural number is the sum of four squares.
Thus given $n$ there are integers $a, b, c, d$ so that $n = a^2 + b^2 + c^2 + d^2$.
Note, some or all of $a, b, c, d$ can be zero.
There are two key ideas in Lagrange’s proof of his theorem. The first is that if $m$ and $n$ can both be written as a sum of four squares, then so can their product $mn$.
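This first reduction is Euler's four-square identity, which is just the multiplicativity of the quaternion norm; it can be checked numerically (the function name is ours):

```python
def four_square_product(a, b):
    """Euler's four-square identity, i.e. quaternion multiplication:
    given a = (a1,a2,a3,a4) and b = (b1,b2,b3,b4), return c with
    sum(c_i^2) = sum(a_i^2) * sum(b_i^2)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)
```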
The second is that, therefore, one needs only to prove that every prime number can be written as a sum of four squares. But this can be done using properties of primes, and we are done.
Not quite. Over two hundred years later—1770 to 1985—the rules have changed. We now are interested not just in the existence of a representation of a number as the sum of four squares, but are interested in finding it efficiently. In this light Lagrange’s proof breaks down immediately, since he works on primes and therefore needs the factorization of $n$ in order to find its representation as four squares. This means that his proof cannot directly lead to an efficient algorithm—one must avoid factorization.
Enter our ‘O’ duo. In their paper from 1985 they devised a randomized algorithm that runs in polynomial time and finds a four square representation of a number $n$. They actually give several proofs of this statement, including additional results about other representation theorems that they can make effective. Jeff himself commented in a MathOverflow item on possible extensions, and their paper itself was outlined on StackExchange last September.
One method uses the following neat idea: Suppose that you want to represent $n$ as the sum of four squares. Pick random squares $x^2$ and $y^2$ and assume that

$n - x^2 - y^2 = p, \qquad (*)$
where $p$ is a prime that is congruent to $1$ modulo $4$. The key is that such primes are always the sum of two squares. So they build a subroutine that can solve this problem in randomized polynomial time: they have a polynomial time procedure to find such a representation of $p$ provided it is a prime. As usual the procedure could be lucky and still work if $p$ is composite. But if the procedure fails they pick new $x$ and $y$ and try again. A deep and amazing theorem of Yuri Linnik shows that every $n$ has many representations of the form (*). This leads to the main result.
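Here is a hedged sketch of this loop, not the authors' actual implementation: a Miller–Rabin probable-prime test stands in for their primality check, and the two-squares step uses the classical Euclidean trick of running the remainder sequence on $(p, r)$ where $r^2 \equiv -1 \pmod p$. For $n$ divisible by 4 one first divides out powers of 4; that easy reduction is omitted.

```python
import math
import random

def is_probable_prime(n, trials=20):
    """Miller-Rabin probable-prime test."""
    if n < 2:
        return False
    for q in (2, 3, 5, 7, 11, 13):
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2; s += 1
    for _ in range(trials):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def two_squares(p):
    """Write a prime p = 1 (mod 4) as a^2 + b^2: find r with
    r^2 = -1 (mod p), then run Euclid on (p, r) down to the first
    remainder below sqrt(p)."""
    while True:
        r = pow(random.randrange(2, p), (p - 1) // 4, p)
        if r * r % p == p - 1:
            break
    a, b = p, r
    while b * b > p:
        a, b = b, a % b
    return b, a % b

def four_squares(n):
    """Randomized four-square representation in the spirit of
    Rabin-Shallit: guess x, y until n - x^2 - y^2 is a (probable)
    prime = 1 (mod 4), then split that prime into two squares."""
    assert n % 4 != 0       # pull out factors of 4 first (omitted)
    while True:
        x = random.randrange(math.isqrt(n) + 1)
        y = random.randrange(math.isqrt(n - x * x) + 1)
        p = n - x * x - y * y
        if p % 4 == 1 and is_probable_prime(p):
            a, b = two_squares(p)
            return x, y, a, b
```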
Suppose that we want to write as the sum of a few squares, but perhaps more than four. The ‘few’ must be small because it affects the cost of other computations, and also we want finding the few to be efficient. We can use the Rabin-Shallit method which is more than fast enough. But there is a very simple direct method that I thought I would share.
Let’s suppose that $n$ is large and apply the “greedy” method—it is always a good idea to try the simplest ideas first.
Set $s = \lfloor \sqrt{n} \rfloor$. Let $m = n - s^2$. Then $m \le 2\sqrt{n}$.
Then repeat this on $m$ until it is small enough to do by table lookup. The number of iterations is bounded by a double logarithm in $n$. So this yields an expression for $n$ as a sum of at most $O(\log \log n)$ squares. For the range used in Rabin’s applications this value is seven rather than four.
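The greedy peel is a few lines (the function name is ours; we skip the table-lookup finish and just run the greedy step all the way down, which only adds a handful of terms at the end):

```python
import math

def greedy_squares(n):
    """Peel off the largest square at each step.  The remainder drops
    from n to at most 2*sqrt(n), so the remainders shrink doubly
    exponentially and only O(log log n) terms are produced."""
    parts = []
    while n > 0:
        s = math.isqrt(n)
        parts.append(s)
        n -= s * s
    return parts
```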
I like the weak effective Lagrange Theorem. If anyone knows a reference for it I would like to know. Perhaps it can be used in some other algorithms because it is extremely fast. For an -bit number it finds the representation in time where is the time to perform multiplication, while Rabin-Shallit uses .
Euler’s back-door pass to Gauss sinks a bucket
ACC stands for the Atlantic Coast Conference, which is an athletic organization that contains Georgia Tech and fourteen other colleges. In basketball, the top several teams in the ACC qualified for the NCAA championship tournament, which started today. We call the NCAA tournament “March Madness” because the opening rounds have games being shown on national TV seemingly every hour of every day, and often some spectacular upsets happen. Indeed Harvard has just beaten a favored University of Cincinnati team.
Today I want to talk about another ACC: our own complexity class.
Recall that ACC, as a complexity class, is the class of Boolean functions computed by Boolean circuits of constant depth and polynomial size, where the gates include modular gates that can count the number of inputs modulo a fixed constant. This class is quite mysterious. It could be very powerful, yet we do not even know whether it contains the majority function.
Alas the ACC qualifiers for the men’s tournament did not contain Georgia Tech, for our men’s team had a losing record this year. Our women’s team, however, played well enough to receive an at-large entry into the NCAA Women’s championship, which is staggered two days after the men and so begins Saturday. No Buffalo team made either. The tournaments run through April 7 and 8. Go Yellow Jackets.
Suppose you have a number $a$. Now I give you an $n$-bit prime number $p$ and ask that you check whether $a$ is a quadratic residue modulo $p$. Recall that this means that there is a $y$ such that $y^2 \equiv a \pmod{p}$.
This is the problem we are interested in solving with an ACC circuit. Note that it is a promise problem—we only ask that your computation works when the input number is a prime.
An obvious idea is to use the famous Euler criterion, which says that $a$ is a quadratic residue if and only if $a^{(p-1)/2} \equiv 1 \pmod{p}$.
Thus we need only raise $a$ to a power by repeated squaring and so on. The trouble, of course, is that we do not know whether ACC can do the required repeated multiplication.
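For concreteness, here is the Euler-criterion test as ordinary sequential code; the whole point of the discussion is that this repeated squaring, while cheap on a sequential machine, is not known to fit in constant depth:

```python
def is_qr(a, p):
    """Euler's criterion: for an odd prime p and a not divisible by p,
    a is a quadratic residue mod p iff a^((p-1)/2) = 1 (mod p).
    Python's three-argument pow does the repeated squaring."""
    return pow(a, (p - 1) // 2, p) == 1
```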
Now let’s expressly state that $a$ is bounded in size by some constant. Still the Euler approach looks hopeless. But there is a deep theorem of number theory that comes to the rescue: quadratic reciprocity. Define the Legendre symbol $\left(\frac{a}{p}\right)$
as the value $\pm 1$ according to whether or not $a$ is a quadratic residue, where $p$ is a prime. Note it is always $\pm 1$: it is $0$ only when $p$ divides $a$. Thus our problem is to compute $\left(\frac{a}{p}\right)$
for an input prime $p$, with $a$ bounded in size.
This is in ACC. The key is we can use the deep quadratic reciprocity theorem, which says that

$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}},$
where $p$ and $q$ are distinct odd primes. The theorem was conjectured by Leonhard Euler and Adrien-Marie Legendre, but finally proved by Carl Gauss. He referred to it as “the fundamental theorem” and wrote:
The fundamental theorem must certainly be regarded as one of the most elegant of its type.
So how do we proceed? Suppose first that the constant $a$ is an odd prime $q$. Then we know by reciprocity that

$\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{p}{q}\right).$
The right-hand side is easy to compute in ACC. To determine the sign we need only compute $p$’s residue modulo $4$. Since $p$ is given in binary this is trivial. Thus, all reduces to the computation of $\left(\frac{p}{q}\right)$. The key is that the Legendre symbol only depends on the value of $p$ modulo $q$. Since we can do this in ACC we are done for the case when $a$ is a prime.
When $a$ is composite we use one simple fact about the Legendre symbol that follows directly from its definition: it is multiplicative, that is, $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.
Thus we can use the case where $a$ is prime multiple times.
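Putting the pieces together—multiplicativity, reciprocity, and (one extra standard ingredient not spelled out above) the supplementary law $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$—gives the classical Jacobi-symbol recursion. A sketch of ours, checked against the Euler criterion:

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed by the
    multiplicativity and reciprocity steps sketched above (this is
    the standard Jacobi-symbol recursion, so intermediate moduli
    may be composite and odd)."""
    a %= p
    if a == 0:
        return 0
    if a == 1:
        return 1
    if a % 2 == 0:
        # supplementary law: (2/p) = +1 iff p = +/-1 (mod 8)
        return legendre(a // 2, p) * (-1 if p % 8 in (3, 5) else 1)
    # reciprocity flip for odd a: sign -1 iff both = 3 (mod 4)
    sign = -1 if (a % 4 == 3 and p % 4 == 3) else 1
    return sign * legendre(p % a, a)
```

Note that for constant $a$ the recursion bottoms out in constantly many steps, each needing only residues of $p$ modulo small constants, which is the ACC point.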
The Legendre symbol for bounded $a$ is thus in ACC—the complexity class, not the conference. The rationale for this discussion is two-fold. One, it shows that obvious approaches to a problem may sometimes be avoided by deep mathematics. Is it possible to do the same for other problems that we care about? The permanent, for example? Another rationale is the perennially important open problem: what can ACC actually compute? It may be hard to resolve the majority function, but perhaps other functions can be shown to be in ACC. Good luck thinking about this.
A shocking story from our friendly Leprechaun
Neil L. is not a computer scientist—he is a Leprechaun. He has visited me every year since I started writing GLL, always visiting on St. Patrick’s day.
Today I want to report on what happened this year.
Days ago I received an email from him—a first. It said:
If you agree to not catch me, I will be at your place at 12:01am Monday. Neil L.
One year I did catch him, affording me three wishes. I thought I would get some of the secrets of the universe, but he outsmarted me.
I replied back that yes I would stop trying to catch him. I got no reply. In any event I decided I would stay up late Sunday night and see if he would appear.
Just after midnight I smelled a pungent odor, then saw a puff of green smoke, and there was Neil L. standing before me. His pipe in his mouth was putting out that green smoke, which has a strong but surprisingly pleasant smell. My dad was a pipe smoker when I was young, quitting later in life—of course his tobacco smoke was never green.
Neil said, “Good day to you on this fine St. Patrick’s Day. I read your reply, quite clever.” I responded that I did not think I was being clever at all. Neil smiled, took another puff, and said “Come on—saying I will not try to catch you—come now.” He added that I must promise not to catch him. I nodded back yes, and he sat down on a chair across from where I was sitting. His short legs dangled off the floor.
We sat there for a moment in silence, then as I was about to speak, he interrupted and said, “I will answer ye any one question, no tricks, straight up.” I looked back surprised and figured that I had nothing to lose so I asked the question,
Is P=NP?
Neil puffed away for quite a while and then said, “I appreciate your not trying to be tricky with your question. I could answer of course no, since the letters ‘P’ and ‘NP’ are different. But I will be straightforward with you. No teasing, no games today.” I was excited, perhaps I would finally know the answer.
He added, “I could make you swear that you cannot repeat this to anyone, but I do not need to, for none will believe ye.” I said OK, so what is the answer? Neil smiled and said:
“We have not yet determined the answer. That’s the truth. No tricks. I swear as a Leprechaun, may you find all my gold coins if I lie.”
I thought, what does that mean? Not determined yet? A mathematical statement is either true or it’s false. No tricks.
I asked what did he mean by “not yet determined?” Either P=NP or P≠NP, I responded. Neil smiled and said that he would explain. He said:
“You live in a world that is really a simulation. We control it all.”
I looked at him, thinking this was another trick. Neil registered my shock and added: “Yes it must be hard for you to believe, but it is true. How else can you explain a Leprechaun that appears and disappears.” With that he puffed some more on his pipe and laughed and added, “I told you no one will believe ye.”
Okay I said, could he explain what he meant by “not yet determined”? He nodded yes. “The simulation mostly runs itself quite well, but there are situations where it gets stuck and the committee of seven then step in and decide what it should do.”
I asked what is the ‘committee of seven’? He answered, “It is a group of over fifty-seven wise ones who are tasked with maintaining the simulation and fixing any unexpected problems. We cannot be expected to have foreseen all possible situations, so they are the ones who decide things.” Another puff and more green smoke, and he continued, “Don’t even ask why it’s the committee of seven, it’s too long a story.”
I finally laughed and said that this was his best visit ever—what a silly story. I told him that I did not believe a word of it. Neil looked at me and said he understood, yet he could give me some “proof.” I said that would be great.
Neil answered, “Look I know it is hard for ye to believe, but I will give ye examples of why it is true.” The first he gave is that in ancient times there were more miracles and strange occurrences, including demons and the like. Did I not agree? I said that that was the folklore but… He interrupted and said “Ay—that was when we first began to run the simulation; we were new at it and made mistakes. We are much better today. Much.”
I started to say yes, nothing like that happens today, but Neil waved me off and said, “we don’t make errors but we still have fun—especially with your sports.” I gave him a cold stare, but he continued: “You are a fan of sailing, if I recall. What are the odds of coming back from 8–1 down in a first-to-9 series?” He was talking about the Oracle US team’s comeback in last September’s America’s Cup final. I rounded the simple coin-flip answer: “About 250–1 against, more since they looked badly beaten.” Neil asked whether there had been anywhere near 250 final series since the America’s Cup began, and needed no answer from me as he puffed with a grin.
Commercial T-shirt source |
“Do you mess with bookies?” I blurted out, then felt ashamed for incivility. But Neil leaned forward with a hand gesture and hush for my attention: “The simulation gives us a budget of improbability, and by the Rule of Improbability we have only a tiny window to deviate. But we are free to choose which improbable events happen to maximize our fun, so long as we do not also violate the Rule of Indistinguishability.” Without even pausing for my query, he went on: “That Rule is the same one you use in defining computational zero knowledge—we can prove things to you, but it would take you too much effort to prove our tricks to a third party. To keep it, we put much effort into not messing with things. For instance our simulator had to work hard this weekend to calculate the true odds and allowances for Warren Buffett’s $1,000,000,000 Bracket Challenge, but we’re all set for “March Madness” now.”
I asked if they could mess with physics, and he told me about a third rule: “The Rule of Consistency is that we may not overturn any past experimental results once their confidence has gone beyond 5 sigma.” Of course humanity has only had the 5-sigma rule for declaring things like the Higgs Boson or primordial gravity waves to exist for a short time, so I asked what about physics in the past. Neil continued, “I know nothing about this, but there was a problem with something called beta-decay.” I jumped in and said: yes there is a surprising effect—it is one of the few physical effects ever discovered that tells left from right. Neil added, “Yes that is it. Some fool made a mistake and caused that to happen. Once it got into the world the committee thought it was fun to have it. A kind of insider joke. Of course it meant we had to re-program the physics part of the simulator to compute using Dirac notation, long before Dirac invented it.”
I was still numb from the story—it could not be true, could it? Yet what he just said about “programming” made me realize that surely the laws of computability and complexity must apply to them too. Now I knew I had him, so I asked:
Surely P=NP must have an answer in your world—which your committee is subject to. What is the answer there?
Neil answered calmly, “As I said, it is not yet determined.”
I expostulated, “Not yet determined for us you said. But surely it is determined for them.”
A little patronizingly Neil replied, “Nay me lad. By me honor, when I said it is not yet determined I spoke true. I did say there are over fifty-seven on the committee of seven. We cannot violate consistency amongst ourselves for a mathematical proposition.”
I remembered from writing our recent simulation post that people who live in simulations would write their own simulations that other people could live in, ad infinitum all the way down. For Neil to be right it could be infinities all the way up too. Then I realized, “Ah—I see why you said ‘over fifty-seven’ before: there’s you plus your committee of 7, then each of those is subject to a committee of 7, so $1 + 7 + 49 = 57$, and it goes on from there. But how can there be infinitely many leprechauns?”
To my surprise he gave a pained start: “Ay, ye have catched me in a secret, and ye did not try to—it was me saying too much. So I must now tell ye exactly how many leprechauns we are, and it will prove to ye what I said about your mathematics.” I stood as he stood up as if to leave, and I saw he was really a little guy, only my size. Neil spoke once more as his pipe puffed mightily.
“We have other committees and hierarchies with other numbers, so our total is the sum of all the natural numbers, $1 + 2 + 3 + \cdots$—and the smoke will tell you what that is.”
With a “pouf” he vanished, but once again his smoke stayed behind, and some of it knotted itself into what first looked like a snake until I recognized it was making the Greek letter zeta. Then I saw the value, and I realized that choosing it must have caused endless committees endless mirth in their endless—but thereby finite—hierarchy:

$\zeta(-1) = -\frac{1}{12}.$
I rushed to look this up in a book, the first to hand being a physics book on String Theory which I had just ordered from Amazon, and sure enough there it was: $1 + 2 + 3 + \cdots = -\frac{1}{12}$.
Ken and I wish you a happy and safe St. Patrick’s Day. Do you believe a word of our Leprechaun, Neil? I do not. “Not determined yet”—that is ridiculous. It is silly, yet how can I distinguish
Update 3/18:
We wrote and posted this before knowing about yesterday’s announced discovery of solid evidence for gravity waves and cosmic inflation at the Big Bang. Where Neil L. is talking about “5-sigma” above I’ve linked this heartwarming video of Andrei Linde getting the news from a collaborator on the discovering project, whose first words were, “It’s 5-sigma.” As this St. Patty’s “big bang” is verified, let’s hope the leprechauns keep their word…
I obtained two surprising data points for 2012 and 2013 yesterday which strengthen my own evidence against a different kind of “inflation” that almost all strong chess players are said to believe in.
Rutgers in memoriam source. |
Leonid Khachiyan in 1979 caused arguably the most sudden surprise to the West’s popular scientific understanding since the successful launch of Sputnik in 1957. His ellipsoid method gave the first algorithm for linear programming whose polynomial running time was verified. Thanks largely to Narendra Karmarkar it has been superseded by faster interior-point methods, while older algorithms have since been noted to run in polynomial time, but the breakthrough and inspiration came from Khachiyan. Could something like it happen again?
Today Ken and I want to ask whether recent argument over beliefs about P versus NP can be evaluated in light of this shock.
Khachiyan’s “rocket” had actually left the hangar ten months before, in a January 1979 Doklady Akademii Nauk paper whose title translates into English as, “A polynomial algorithm for linear programming.” As recounted by Berkeley’s Eugene Lawler, it was sighted at a May 1979 meeting Lawler attended in Oberwolfach, and after Peter Gács and László Lovász supplied proofs missing in the paper, it was pronounced correct at a conference in Montreal. The discovery was picked up in October by Science News, and then by the magazine Science. An allusion by the latter to the NP-complete Traveling Salesman problem was moved to the headline of a Nov. 4 story in England’s Guardian newspaper, and reflected three days later in the New York Times’s front-page screamer, “A Soviet Discovery Rocks World of Mathematics.”
Our point is not to say that linear programming being in P was a surprise. To those who knew the reality behind the headlines, it wasn’t. As Lawler relates, the great George Dantzig had tried to set Times reporter Malcolm Browne straight on this and points related to what LP’s can and (ostensibly) cannot solve. The simplex algorithm already solved the vast majority of LP cases in expected polynomial time, so there was no feeling of practical intractability. Rather our point draws on something perhaps less widely known and appreciated: that Khachiyan’s ideas extend to solve a much wider class than linear programs, including so-called semi-definite programs or SDP’s, exactly or with high approximation. Thus it can be said to show that a complexity class defined by “approximation-robust” reductions to these programs equals P.
We raise this with regard to the main technical argument in a recent post by Scott Aaronson titled “The Scientific Case for P ≠ NP.” We wonder whether a similar argument might have seemed on the side of intractability in the years before Khachiyan. Even more speculatively, we wonder whether a kind of “Reverse Oracle Result” can be formulated to make any of this more concrete. But first let’s review Scott’s comments in the wider context of belief about P versus NP and about open problems that were resolved in surprising ways.
Essentially Scott gave a quite reasonable argument for P ≠ NP, in his usual elegant and convincing style. Bill Gasarch expanded it. But. But mathematics is not something we argue about like: who was the best hockey player of all time, or what is the right philosophy? The simple fact is that no one has proved that P ≠ NP.
Our purpose with our recent post on a 13-GB certificate of unsatisfiability was not to start a discussion about P versus NP, but rather to witness that NP-hardness is not so great a practical obstacle as we may think. The Gröbner basis algorithm is run all the time, despite the problem it solves being complete for exponential space. Notably it runs in singly-exponential time on a generic set of cases. If we can shift down a level, this is like having “smoothed polynomial-time behavior” of an algorithm for an NP-complete problem. Solving nontrivial cases of NP-hard problems is addictive.
Almost the entire business model of this company is to solve NP-hard optimization problems, using non-quantum computers. As is evident from examples on their website, they are not just gristing easy approximation for run-of-the-mill instances. To quote one of their blog entries (their emphasis):
According to academic research on NP-hard problems, it’s impossible to guarantee that optimal solutions to difficult problems will be found in a reasonable time frame. However, with almost all real-life planning puzzles, you can get excellent results very quickly.
Hence our initial reaction was, who cares about discussions on P ≠ NP being true or not, aside from progress on this great question? A full solution would be wonderful, but just having small steps would be great; even a possible program for a solution would be welcome. So that was what we thought we should just say, nothing more, except noting our split answers to Bill G’s reprised poll three years ago.
But. But Dick couldn’t resist adding some more sections, while Ken made some effort to counter Scott’s facts, counterfactually.
I feel compelled to explain why I am open-minded on this question perhaps more than anyone else. I have several reasons that I feel are important to remind all of us:
We’ll address the full P versus NP question, and not situations where say the algorithm generating instances is restricted to random bits—a case in which we’ve noted that in the limit one can solve them all.
I have discussed guesses in mathematics many times before on this blog. One of the biggest issues in guessing wrong is that people do not take seriously the other possibility. Researchers tend not to work on showing P = NP anymore. Research support does not go there, since we all “know” that it would be a waste of time, and there are other consequences to the field.
Here are some famous guesses that were essentially off by exponentials. For each I will list the time gap between the initial problem being raised and being solved.
$\pi(x) < \mathrm{li}(x)$ for all reasonable size $x$. Here $\mathrm{li}$ is the logarithmic integral
$\mathrm{li}(x) = \int_0^x \frac{dt}{\ln t},$ which is an asymptotic approximation to $\pi(x)$, the number of primes less than $x$. It was conjectured that this would always hold and was widely believed for over a century. Then John Littlewood proved that the lead changes between $\pi(x)$ and $\mathrm{li}(x)$ infinitely often, although the first switch is upper bounded by an immense number. Richard Guy wrote a wonderful article on what he called “The Strong Law of Small Numbers”: cases when evident phenomena held for small numbers but eventually would fail. Here is a table with other examples:
By the way the “common clock” on P versus NP is 43 years.
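The evidence that fooled everyone about $\pi$ versus $\mathrm{li}$ is easy to reproduce numerically; a quick sketch (sieve plus midpoint-rule integration; the function names are ours), bearing in mind that no feasible computation reaches the first crossover:

```python
import math

def primepi(x):
    """Count primes <= x with a simple sieve of Eratosthenes."""
    sieve = bytearray(b"\x01") * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)

def li(x, steps=200000):
    """li(x) via the offset integral from 2 plus li(2) = 1.04516...,
    using the midpoint rule -- plenty accurate at this scale."""
    h = (x - 2.0) / steps
    area = h * sum(1.0 / math.log(2.0 + (k + 0.5) * h) for k in range(steps))
    return area + 1.04516378
```

For example `primepi(10000)` is 1229 while `li(10000)` is about 1246.1, and the gap keeps looking one-sided for as far as anyone can compute.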
We do have the theorem that nondeterministic linear time is not equal to deterministic linear time, which we have discussed before and which is particular to the multitape Turing machine model—make the tapes planes or trees and it goes away. We cannot even deduce from it that P ≠ NP. That’s pretty weak. Remember that P ≠ NP means that P does not contain NP. And more, of course.
The P versus NP statement still allows cross-cutting the generally-understood significance. That is:
To be sure, some evidence cited by Scott is really for an exponential lower bound on NP-complete problems; we have discussed this before too. But what we are saying still cuts against the usual argument that “many people have worked on looking for algorithms for NP-complete problems.” Yes many have looked for algorithms, but most were interested in “real” practical algorithms. For this kind of quest there is not much difference between P = NP and P ≠ NP.
Aram Harrow communicates to us a more-concrete version of this point, which also serves as a bridge to Ken’s musings.
Quoting Aram, with light editing: “One of the stronger reasons for P ≠ NP is the Bayesian one—it is easier to find algorithms than lower bounds, so our failure to find a subexponential-time algorithm for SAT speaks louder than our failure to find a super-linear lower bound for it. A related way of expressing this is that before NP-completeness was understood, thousands of researchers in disparate fields were unwittingly all trying to put NP into P.
But a counter-argument is that all of those seemingly independent researchers would always come up with algorithms that relied on a few kinds of structure—a lot is covered by just two kinds:
This paucity of algorithmic variety can be viewed in (at least) two ways:
On the latter, Terry Tao’s recent breakthrough on the Navier-Stokes equations is an example of how much the same ideas keep recirculating, and how much more quickly progress can be made by cross-applying ideas rather than coming up with radically new ones. Going from Erwin Schrödinger’s equation to Peter Shor’s quantum factoring algorithm is a 60-minute lecture, but it took over 60 years (and a change in perspective coming from the computer revolution) to discover. Our lack of algorithms reveals only our lack of creativity, and it is arrogant to posit fundamental limits to mathematics just because we can’t solve a problem. Either way, the central question isn’t so much about P versus NP but rather a “throwback” of a question now being asked about quantum computers:
Where does the power of deterministic algorithms come from?
A related question is the power of the Lasserre hierarchy. It has been shown to be effective for a large number of problems, but with a surprisingly small number of truly different techniques. I would love to know whether further work will increase or decrease the number of ways in which we know how to use it; that is, either by discovering new methods or by unifying apparently different methods.”
The Lasserre hierarchy builds upon LP’s and SDP’s, and this brings us back to the intro. I (still Dick) remember many people in the 1970′s trying to prove that certain linear/convex programming problems were NP-hard, despite all our confidence in the simplex algorithm for daily use. This makes Ken wonder:
What if SDP’s really were hard?
Russell Impagliazzo famously categorized five worlds that are consistent with current knowledge of complexity. Is there room to analyze any more, ones that are inconsistent now, but might have been meaningfully consistent had our field taken a different path?
All of Impagliazzo’s worlds—including ones with P = NP and ones with P ≠ NP—have been instantiated via oracle results. All oracle results involve pretending that some ostensibly hard problem is easy. For instance, the world with P = NP involves pretending NP-complete problems are easy, while known ones for P ≠ NP involve granting free access to a set coded so that the free access itself activates a diagonalization. What I (Ken) wonder is whether there is a sensible formal way to do “Reverse Oracle Results,” which pretend that some easy problem is hard.
One known way to get this effect is to narrow the definition of “easy” so that a problem $A$ still has easy reductions to it from other problems $B$. For example, linear programming problems are P-complete under logspace (and even easier) reductions, as are problems of approximation by SDP’s. But here I mean something more structural—a sense in which $A$ is the only route to solving a whole class of problems $B$. Then we can segregate this entire class and pretend it all is hard. It might suffice to give “easy” reductions from $A$ to all these $B$. In particular, a lot of $B$’s are solved via the interior-point paradigm for (LP’s and) SDP’s. It could also employ ideas in reverse mathematics.
Scott replied to a comment of mine on his post by referring to his earlier comment that:
Since it would be inelegant and unnatural for the class P to be “severed into two” in this way, I’d say the much likelier possibility is simply that P ≠ NP.
Our point is, perhaps P already stands manifestly “severed into two” along Khachiyan’s and the interior-point fault-lines. In particular, we wonder how the following consequences would have looked as conditional results had they been proved in the late 1970′s:
Theorem 1 If SDP’s can be solved in polynomial time, then it is possible to compute in polynomial time close approximations to a function ϑ(G) of undirected graphs that is sandwiched between the NP-complete clique number ω(G) and chromatic number χ(G), which all coincide whenever neither G nor its complement has an induced odd cycle of length five or more.
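For intuition, here is a small illustration of my own (not from the post) of the sandwich for the 5-cycle C5, which is self-complementary. Rather than actually solving the SDP, the sketch uses the known closed-form value of the theta function on odd cycles; the helper name `theta_odd_cycle` is mine.

```python
import math

# Lovász sandwich for C5: omega(C5) <= theta(C5) <= chi(C5),
# i.e. 2 <= sqrt(5) <= 3.  For odd cycles C_n the theta function
# has the closed form  theta(C_n) = n*cos(pi/n) / (1 + cos(pi/n)),
# which we use here instead of solving the SDP numerically.

def theta_odd_cycle(n):
    """Lovász theta of the odd cycle C_n via its closed form."""
    c = math.cos(math.pi / n)
    return n * c / (1.0 + c)

omega, chi = 2, 3              # clique and chromatic numbers of C5
theta = theta_odd_cycle(5)     # equals sqrt(5) ~ 2.236
assert omega <= theta <= chi
assert abs(theta - math.sqrt(5)) < 1e-12
```

Neither ω nor χ is computable in polynomial time in general (assuming P ≠ NP), yet the SDP value wedged between them is.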
Theorem 2 If SDP’s can be solved in polynomial time, then Max Cut is polynomial-time approximable within α ≈ 0.878, even though it is NP-hard to approximate within 16/17 ≈ 0.941.
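To see where the constant comes from, here is a sketch of my own (not from the post): the Goemans–Williamson ratio is the worst case over angles θ of the probability θ/π that a random hyperplane separates two unit vectors, divided by the SDP objective contribution (1 − cos θ)/2. A short numerical minimization recovers it; the helper name `gw_ratio` is mine.

```python
import math

# Goemans-Williamson approximation ratio for Max Cut:
#   alpha = min over 0 < theta <= pi of (theta/pi) / ((1 - cos theta)/2)
# A simple grid search over theta suffices for a few digits.

def gw_ratio(steps=200_000):
    best = float("inf")
    for i in range(1, steps + 1):
        theta = math.pi * i / steps
        ratio = (theta / math.pi) / ((1.0 - math.cos(theta)) / 2.0)
        best = min(best, ratio)
    return best

alpha = gw_ratio()   # approximately 0.87856
```

The minimum occurs near θ ≈ 2.33, giving the famous α ≈ 0.87856 rather than any round number, which is part of why the result would have looked so strange in advance.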
In the first theorem we have also inverted time by embracing the Strong Perfect Graph Theorem, which was proved in 2002 (final in 2006); but this conjecture was strongly felt, and it makes transparent that many important families of graphs are perfect. Hence the sweeping tractability of major NP-complete problems in these cases would have come as a surprise. On the second theorem, why should the difference between α ≈ 0.878 and 16/17 matter to such a simple problem as Max Cut?
Of course in the light of knowledge we understand how these two famous theorems work. On the latter the Unique Games Conjecture already helps explain how the constant α may be special. But the present exercise is about how we reason when we don’t (yet) have the light of knowledge.
Can we make some formal sense of a world where Khachiyan’s breakthrough never happens?
Update 3/18/14: The comments section has some excellent replies, including ones on the argument of the last two sections by Scott Aaronson and by Timothy Gowers.
Update 3/17/14: The Lovász theta-function example replaced the post’s original quoting of Leslie Valiant’s first “Accidental Algorithm” under a mis-memory that LP’s were involved (and SDP’s in later developments). Snipping the irrelevant qualifier, it reads:
Theorem 3 Counting assignments to a certain class of planar formulas is deterministic polynomial-time computable modulo 7, or modulo any Mersenne prime, even though it is NP-hard modulo 2.
Aside from the fact that this computation belongs to a class commonly believed to be a proper subclass of P, the intent was to argue: “What do Mersenne primes have to do with matchings and convex programming really? Surely counting must be equally hard modulo any odd prime—after all there’s no exception for Mersenne in modular circuit lower bounds—so P ≠ NP must be an ‘invisible fence’ around this kind of ridiculousness.” The intro and Borsuk statements were also amended.
Michael Rabin is visiting Georgia Tech today and tomorrow to give a pair of distinguished lectures. Both of these will be on applications of cryptography. One is to help auctions avoid cheaters, while the other is to help elections avoid cheaters. I see a pattern. Ken sees another pattern— he is helping chess tournaments avoid cheaters.
Today I want to comment about Rabin’s fame and what makes a result important.
I have known Michael since I was a graduate student at CMU—I have talked about this before here. In the decades since then I have heard him give many talks, all of which have been brilliant. He is one of the best presenters of technical material I have ever seen, perhaps the best in the world. My “proof” of this statement is:
that I can still recall—in detail—most of his talks, even ones from decades ago.
Can you recall that talk you heard last year, or even one you heard last month? I have trouble recalling my own talks. But Michael’s talks are special, memorable, informative, clear, and fun.
I have selected a few talks of Michael that I recall in great detail—they span about forty years. There are many others that I could have added, but these should make my point.
His talk on Theoretical impediments to artificial intelligence was the first of his talks that I had ever heard. It was at the 1974 IFIP Congress, which took place in Stockholm, Sweden. There was a time when the IFIP Congress was a major conference that many of us went to. I met Dick Karp there for the first time.
His talk on the introduction of randomness to algorithms, which was given at Yale when I was there as a junior faculty member. It was in 1977, I recall. This talk made the case for the power of randomness—Michael showed that randomness could help in a certain geometric search problem. I talked about this in detail in the same post with the CMU story.
His talk on the Karp-Rabin pattern matching algorithm was given in the 1980′s at Princeton University. We have also talked about this before here.
His talk on hyper-encryption was given at Georgia Tech about ten years ago. This was a cool idea—I believe—on using non-complexity assumptions to build encryption methods that are very powerful. The short insight was that memory is expensive, and one can defeat an adversary that has limited memory. This yielded a protocol that needed no assumptions about factoring or the existence of one-way functions.
Why indeed is Rabin famous? He received the Turing Award with Dana Scott for their work on finite state automata (FSA). I would argue that his most exciting results were curiously his least deep results. We all know about FSA; his introduction of randomness to all parts of computing; his primality test, independent but related to Gary Miller’s work; his pattern matching algorithm with Karp; and much more. Yet, I would argue that his deepest result is probably his least known. It was, is, his brilliant work on S2S.
What is S2S?
There are many logical theories that we study, such as Peano Arithmetic (PA). PA is a first-order theory. This means that quantifiers can only range over individual elements—in PA they range over integers. Thus, in PA we can say

∀x ∃y ∃u ∃v (y ≠ 0 ∧ x·y = u³ + v³).
This states that all numbers have a non-zero multiple that is a sum of two cubes. This is true—but it is not trivial.
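As a quick sanity check of my own (not from the post), a brute-force search confirms the statement for small values; here I assume u and v range over positive integers, and the helper names are mine.

```python
# Brute-force check: every x in 1..50 has a non-zero multiple x*y
# that is a sum of two positive cubes u**3 + v**3.

def is_sum_of_two_cubes(n):
    """True if n = u**3 + v**3 for some positive integers u, v."""
    u = 1
    while u ** 3 < n:
        rem = n - u ** 3
        v = round(rem ** (1.0 / 3.0))
        for w in (v - 1, v, v + 1):   # guard against float rounding
            if w >= 1 and w ** 3 == rem:
                return True
        u += 1
    return False

def witness_multiple(x, limit=10**6):
    """Smallest y >= 1 with x*y a sum of two positive cubes, or None."""
    for y in range(1, limit + 1):
        if is_sum_of_two_cubes(x * y):
            return y
    return None

# Verify the PA statement for x = 1..50.
for x in range(1, 51):
    assert witness_multiple(x) is not None, x
```

Of course, checking finitely many cases proves nothing in PA; the point of the example is only that the statement quantifies over all integers, which is exactly what first-order quantifiers express.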
The reason PA is so powerful is that it allows both addition and multiplication. As a consequence, given a statement like the one above about cubes, it is impossible in general to decide whether the statement is true or not.
We obviously like decidable theories since at least in principle they allow us to tell if a statement is true or false. Of course if P ≠ NP, then even for a decidable theory it may be hard to tell whether something is true. But still, decidability is a great property for a theory to have.
A difficulty is the tension between being an expressive theory and being decidable. PA is very expressive: most everyday theorems of mathematics can be proved in it, at least in principle. Yet it is so expressive that even weak subtheories of it are undecidable.
Enter S2S. The theory S2S is a different kind of theory from PA. While PA is a first-order theory, S2S is a second-order theory: its name abbreviates the monadic Second-order theory of 2 Successors. It allows quantifiers to range over individual elements and also over finite or infinite sets of elements. The basic objects in S2S are finite binary strings, which are the nodes of the infinite binary tree.
In S2S we can talk about the left and right successors of any such element: if x is an element, then x0 and x1 are the respective successors. Since it is a second-order theory, we are also allowed to quantify over sets of such elements.
The ability to quantify over sets makes S2S very powerful and expressive. For example, notions such as “x is an ancestor of y” and “the set S is closed under both successors” can be expressed formally.
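For instance, the ancestor relation can be defined by quantifying over successor-closed sets. The following formulas are a standard rendering of these two notions, not necessarily the post's originals:

```latex
% "S is closed under both successors":
\mathrm{Cl}(S) \;\equiv\; \forall z\,\bigl(z \in S \rightarrow z0 \in S \wedge z1 \in S\bigr)

% "x is an ancestor of (or equal to) y": y lies in every
% successor-closed set that contains x.
x \preceq y \;\equiv\; \forall S\,\Bigl(\bigl(x \in S \wedge \mathrm{Cl}(S)\bigr) \rightarrow y \in S\Bigr)
```

Note how the set quantifier ∀S does the work that no first-order formula over strings could: it captures reachability in the tree.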
The magic of this is that while the theory is expressive, it is not too expressive. Indeed Rabin proved in 1969:
Theorem 1 The monadic second order theory of the infinite binary tree is decidable.
When I first looked at Rabin’s paper, as a graduate student at CMU, it was not the depth of his proof, which is wonderful, but rather the array of applications that followed that excited me. One measure of the depth of a theorem is the number of open problems it solves. Rabin’s theorem can be used to prove that a number of other theories are decidable.
These results follow by encoding the decidability question into the powerful theory S2S and invoking Rabin’s Theorem. See this for a nice summary of S2S in slide format by Shane Steinert-Threlkeld.
The proof of Rabin’s Theorem was a tour-de-force. It requires clever definitions and some quite detailed inductive arguments. Since his original proof people have found “easier” proofs, but the original was quite deep and intricate.
I would argue that this theorem is one of the deepest results of Rabin’s many beautiful results over his long career. It is well known to those who work in logic and automata theory, but is perhaps less known to the whole theory community. If you already knew it, fine; if not, then I hope you begin to appreciate the depth of his work.
Perhaps there is a lesson here for all: fame comes from results that are game-changers, which does not always mean they have deep, long, complex arguments. Sometimes that is the case: clearly the solutions to Fermat’s Last Theorem and the Poincaré Conjecture are famous and deep results. Yet many times I think Rabin’s situation is more common: a simple-to-state result that yields an “ah” moment, that opens doors for others, that changes the landscape of thinking about an area, is the most important type of result. Rabin has many, many of these results. I would argue that without S2S he still would be one of the greatest theorists who has ever lived.
What do you think?