Richard Lewis, Bill Horton, Earl Beal, Raymond Edwards, and John Wilson—the Silhouettes—were a doo-wop/R&B group whose single “Get A Job” was a number 1 hit on the Billboard R&B singles chart and pop singles chart in 1958. Even back then it sold over one million records, and it was later used in ads and movies.
Today I want to talk about hiring faculty, as we are getting near the end of the usual job hiring cycle.
From the view of the many PhD candidates who are looking for jobs, this year must seem pretty bright. Companies, company labs, and universities all seem to be hiring. We are seeing a large number of very qualified people on the market—I am glad I have a job and do not have to compete with them.
At Tech we do the job search as a potential employer in the old-fashioned way. We look at applications, ask some to visit and give a formal presentation, and have them talk to our faculty and students. Then we vote on making offers, make them, and try to convince the fortunate recipients to accept.
This method has been used forever it seems, and it works reasonably well. However, a question is: can we use our own methods to make the recruiting and hiring process better? In computer science theory we have many results about making decisions under uncertainty, yet when we do hiring of faculty, we use completely ad hoc methods.
This year at our first faculty meeting to discuss hiring I brought donuts from Krispy Kreme for all to enjoy. The initial presentation on a candidate by one of our faculty had a slide that quoted a letter writer:
While you are sitting around eating donuts and evaluating candidates remember that
Somehow the writer of this recommendation letter ‘knew’ we would be eating donuts—I cannot decide if it was funny or scary.
I wonder if we can use theory methods to rethink the hiring process. Perhaps we will always do it the old way, and always eat donuts and chat. But perhaps too there is some way to use mathematical methods. In any event, I thought I would share some simple observations about this with you.
Imagine that Alice and Carol are on the job market. Assume that both are “above the bar” and would be solid additions to our school of Computer Science. Suppose also that we have one offer we can make—and if it is declined then we cannot make another offer. So our choice is simple:
How do we choose?
A model is that each candidate has a secret probability of accepting our offer: say $\alpha$ is the probability for Alice and $\beta$ is the probability for Carol. Let’s assume that $\alpha \geq \beta$.
Here is the key issue. If we make an offer with a deterministic old-style method, we could pick a candidate that is very unlikely to come. This is what we wish to avoid.
A simple strategy is to flip an unbiased coin. If it’s heads make the offer to Alice, if it’s tails make the offer to Carol. Note, this trivial strategy yields an expected number of accepts of $(\alpha + \beta)/2$, where $\alpha$ and $\beta$ are the respective acceptance probabilities. And it does pretty well. If $\alpha$ is much larger than $\beta$, for example, we get about $\alpha/2$, within a factor of one-half of the best possible. If on the other hand $\alpha$ and $\beta$ are near each other we also do pretty well.
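The arithmetic is easy to check by simulation. Here is a minimal sketch in Python; the function name and the sample probabilities are our own, purely for illustration:

```python
import random

def expected_accepts(p_alice, p_carol, trials=200_000):
    """Estimate the accept rate of the coin-flip strategy by Monte Carlo:
    flip a fair coin, make the offer to Alice on heads, to Carol on tails."""
    accepts = 0
    for _ in range(trials):
        p = p_alice if random.random() < 0.5 else p_carol  # pick a candidate
        if random.random() < p:                            # does she accept?
            accepts += 1
    return accepts / trials

# Should come out near (p_alice + p_carol) / 2 = 0.5, which is within a
# factor of two of the best possible 0.9.
print(expected_accepts(0.9, 0.1))
```

Even in this lopsided case the strategy guarantees at least half the acceptance probability of the better, but unknown, candidate.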
What is wrong with this strategy? Is it better than chatting and eating donuts?
Of course in real life the situation is much more complex:
And so on.
Can we make a reasonable model and find a decision strategy that still works well in a real world situation?
So is it donuts forever, or can we use some decision methods in hiring?
Anyway, good luck to all trying to “Get A Job.”
Yip yip yip yip yip yip yip yip
Sha na na na, sha na na na na
Sha na na na, sha na na na na
Sha na na na, sha na na na na
Sha na na na, sha na na na na
Yip yip yip yip yip yip yip yip
Mum mum mum mum mum mum
Get a job, sha na na na, sha na na na na
An AMS article by Gil Kalai updates his skeptical position on quantum computers
Cropped from Rothschild Prize source |
Gil Kalai is a popularizer of mathematics as well as a great researcher. His blog has some entries on Polymath projects going back to the start of this year. He has just contributed an article to the May AMS Notices titled, “The Quantum Computer Puzzle.”
Today we are happy to call attention to it and give some extra remarks.
The article includes a photograph of Gil with Aram Harrow, who was his partner in a yearlong debate we hosted in 2012. We say partner because this was certainly more constructive than the political debates we have been seeing this year.
We’ve missed chances to review some newer work by both of them. In 2014, Gil wrote a paper with Greg Kuperberg on quantum noise models and a paper with Guy Kindler relating to the sub-universal BosonSampling approach. Among much work these past two years, Aram wrote a position paper on the quantum computing worldview and this year a paper with Edward Farhi. The latter reviews possible ways to realize experiments that leverage complexity-based approaches to demonstrating quantum supremacy, a term they credit to a 2011-12 paper by John Preskill.
Quantum supremacy has a stronger meaning than saying that nature is fundamentally quantum: it means that nature operates in concrete ways that cannot be emulated by non-quantum models. If factoring is not in $\mathsf{P}$—let alone randomized classical quadratic time—then nature can do something that our classical complexity models need incomparably longer computations to achieve. We like to say further that it means nature has a “notation” that escapes our current mathematical notation—insofar as we use objects like vectors and matrices that have roughly the same size as the classical computations they describe, but swell to exponential size when we try to use them to describe quantum computations.
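A back-of-the-envelope sketch makes the swelling concrete. The function below is our own illustration: it counts the bytes needed to store an $n$-qubit state as a dense vector of complex amplitudes.

```python
def statevector_bytes(n_qubits):
    """Bytes needed to store an n-qubit state as a dense vector:
    2**n_qubits complex amplitudes at 16 bytes (complex128) each."""
    return (2 ** n_qubits) * 16

for n in (10, 30, 50):
    print(n, statevector_bytes(n))
# 10 qubits fit in 16 KB, 30 qubits need 16 GiB, 50 qubits need 16 PiB.
```

Classical descriptions in this notation outgrow classical hardware long before the quantum system itself becomes large.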
Aram’s paper with Farhi leverages complexity-class collapse connections shown by Michael Bremner, Richard Jozsa, and Daniel Shepherd and an earlier paper by Sergey Bravyi, David DiVincenzo, Roberto Oliveira, and Barbara Terhal. For instance, via the former they observe that if the outputs of certain low-depth quantum circuits can be sampled classically with high approximation then the polynomial hierarchy collapses to level 3. This remains true even under an oracle that permits efficient sampling of a kind of quantum annealing process. This is arguably more of a hard-and-fast structural complexity collapse than factoring being in $\mathsf{P}$ would be.
Quantum supremacy also entails that the quantum systems be controllable. Preskill’s paper raises concrete avenues besides ones involving asymptotic complexity classes. Gil’s article picks up on some of them:
As Gil notes, the last has been undertaken by the company D-Wave Systems. This has come amid much controversy but also admirable work and give-and-take by numerous academic and corporate groups.
Our first remark is that Gil’s paper highlights a nice example of how computational complexity theory informs and gives structure to a natural-science debate. Aram and others have done so as well. We especially like the connection between prevalent noise and bounded-depth circuits vis-à-vis low-degree polynomials. We believe the AMS Notices audience will especially appreciate that. We’ve wanted to go even further and highlight how Patrick Hayden and Daniel Harlow have proposed that complexity resolves the recent black hole firewall paradox.
Our second remark is that this is still largely a position paper—the arguments need to be followed into the references. For example, the fourth of Gil’s five predictions reads:
No logical qubits. Logical qubits cannot be substantially more stable than the raw qubits used to construct them.
On the face of it, this is just the negation of what the quantum error-correcting codes in the fault-tolerance theorem purport to do. Gil follows with a section countering quantum fault-tolerance in a stronger fashion, with some technical detail but still asserting positions.
Our third remark is that the nub—that “when we double the number of qubits in [a quantum] circuit, the probability for a single qubit to be corrupted in a small time interval doubles”—is presented not as new modeling but “just based on a different assumption about the rate of noise.” We think a fundamental provenance needs to be given for the increased noise rate.
For instance, a silly way to “double” a circuit is to consider two completely separate systems $A$ and $B$ to be one “circuit.” That alone cannot jump up the rate, so what Gil must mean is that this refers to doubling up when $A$ and $B$ already have much entanglement. But then the increased rate must be an artifact of entangling them, which to our mind entails a cause that proceeds from the entanglement. Preskill on page 2 of his paper tabs the possible cause of supremacy failing as “physical laws yet to be discovered.” We’ll come back to this at the end.
Gil puts the overall issue as being between two hypotheses which he states as follows:
I have a position that Dick and I have discussed that sounds midway but is really a kind of pessimism because it might make nobody happy:
This would put factoring in quantum time. That would still leave public-key cryptography as we know it under a cloud, though it might retain better security than Ralph Merkle’s “Puzzles” scheme. But the quadratic scale would be felt in all general quantum applications. It would leave everyone—“makers” and “breakers” alike—operating under the time and hardware and energy constraints of Merkle’s Puzzles, which we recently discussed.
Thus we have a “pun-intended” opinion on how the “Puzzle” in Gil’s title might resolve. However, I have not yet solved some puzzles of my own for marshaling the algebraic-geometric ideas outlined here to answer “why quadratic?” They entail having a mathematical law that burns in the kind of lower-bound modeling in this 2010 paper by Byung-Soo Choi and Rodney van Meter, under which they prove an $\Omega(\sqrt[d]{n})$ depth lower bound for emulating such CNOT trees with limited interaction distance, where $d$ is a dimensionality parameter. This brings us to our last note.
The main issue—which was prominent in Gil and Aram’s 2012 debate—is that everything we know allows the quantum fault-tolerance process to work. Nothing so far has directly contradicted the optimistic view of how the local error rate behaves as circuits scale up. If engineering can keep the error rate $\epsilon$ below a fixed constant threshold then the coding constructions kick in to create stable logical qubits. If some minimum error rate is a barrier in our world, what might the reason be?
It could be that this is a condition of our world, perhaps having to do with the structure of space and entanglement that emerged from the Big Bang. Gil ends by arguing that quantum supremacy would create more large-scale time-reversibility than we observe in nature. It would also yield emulations of high-dimensional quantum systems on low-dimensional hardware of kinds not achieved in quantum experiments to date—on top of our long-term difficulties of maintaining more than a handful of qubits.
This hints that an explanation could be as hard as explaining the arrow of time, or that the critical error rate could be a fundamental constant like others in nature for which string theory has trended against unique causes. Still, these other quantities have large supporting casts of proposed theory. If the explanation has to do with the state of the world as we find it, then how do initial conditions connect to the error rate? More theory might indicate a mechanism by which initial conditions constrain the rate, and should at least help indicate why it isn’t simply an engineering issue.
Thus there is still an onus of novelty for justifying the pessimistic position. It may need to propose a new physical law, or a deepened algebraic theory of the impact of spatial geometry that retains current models as a limit, or at least a more direct replacement for what Gil’s article tabs as “the incorrect modeling of locality.”
How effective is the “Puzzle” for guiding scientific theory?
We note that the much-acclaimed and soberly-evaluated answer on quantum computing by Canada’s Prime Minister, Justin Trudeau, had this context: A reporter said, “I was going to ask you to explain quantum computing, but… [haha]…when do you expect Canada’s ISIL mission to begin again…?” Trudeau ended his spiel by saying, “Don’t get me going on this or we’ll be here all day. Trust me.” Would he have been able to detail the challenges?
[added qualifier on Choi-van Meter result, added main sources for Farhi-Harrow]
A preview of the talks for this coming ARC Day
ARC is our Algorithms & Randomness Center at Tech. It was created by Santosh Vempala, and this Monday ARC is holding a special theory day. The organizers are Santosh with Richard Peng and Dana Randall.
Tomorrow, Monday April 11th, is the day for the talks, and I have time to highlight just two of them.
All of the talks look great—see this for details on the other two by Rocco Servedio on “Circuit Lower Bounds via Random Projections” and Aaron Sidford on “Recent Advances in the Theory of Interior Point Methods.” We previewed a related joint paper by Servedio last May. Sidford’s talk will include the first nearly-linear-time algorithm for finding the geometric median, that is, a point that minimizes the sum of distances to given points in Euclidean space. This is a nice contrast to recent results where having any sub-quadratic algorithm would break conjectures about the hardness of SAT.
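Sidford’s fast algorithm is beyond a blog sketch, but the classical Weiszfeld iteration shows what is being computed. This is our own illustration, not the method from the talk:

```python
import numpy as np

def geometric_median(points, iters=200, eps=1e-9):
    """Weiszfeld's iteration: repeatedly average the points weighted by
    1/distance to the current estimate. The fixed point minimizes the sum
    of Euclidean distances to the given points."""
    x = points.mean(axis=0)  # start at the centroid
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - x, axis=1), eps)  # clamp to avoid 0-division
        w = 1.0 / d
        x = (points * w[:, None]).sum(axis=0) / w.sum()
    return x

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(geometric_median(pts))  # symmetric instance: approximately [0.5, 0.5]
```

Each iteration costs time linear in the number of points, but the iteration count needed for high accuracy is what the newer interior-point methods improve on.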
Virginia Vassilevska-Williams will speak on Fine-Grained Algorithms and Complexity. Virginia’s key point is simple:
If a problem is computable in polynomial time, then the classic reductions used to study the question are useless.
They are useless since they cannot distinguish the fine structure of polynomial time: linear time and, say, $n^{100}$ time all look the same.
This simple insight leads one to study the topic now called “fine-grained reductions,” which focuses on exact running times. She explains that the key point of fine-grained reductions is to allow one to compare problems that run in polynomial time. Yet the reductions also “float,” so that they are not limited to saying the algorithms run in essentially the same time. There is a subtlety lurking here. She states:
This approach has led to the discovery of many meaningful relationships between problems, and even sometimes equivalence classes.
She plans to discuss current progress and highlight some new exciting developments.
Luca Trevisan of Berkeley will speak on Ramanujan Graphs. These are graphs that are of course named after the extraordinary mathematician Srinivasa Ramanujan. Luca plans on reviewing what is known about existence and constructions of Ramanujan graphs, which are the best possible expander graphs from the point of view of spectral expansion—see this for precise definitions.
He will talk about Joel Friedman’s result that random graphs are nearly Ramanujan, and recent simplifications of Friedman’s proof, which was around 100 pages long. Luca will also talk about connections between Ramanujan graphs and the Ihara zeta function, and also about recent non-constructive existence results.
The Ihara zeta function, named for Yasutaka Ihara, can be defined by a formula analogous to the Euler product for the usual Riemann zeta function: $$\zeta_G(u) = \prod_{[p]} \left(1 - u^{\ell(p)}\right)^{-1}.$$
This product is taken over all prime walks $[p]$ of the graph—that is, closed cycles without backtracking—where $\ell(p)$ is the length of $p$. See this for the rest of the formal definition, and also this 2001 survey on zeta functions of graphs. Toshikazu Sunada made the key connection between Ramanujan graphs and this function, which was first defined in 1966 in a totally different context.
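For a finite graph the infinite product collapses to a polynomial identity: Bass’s theorem gives $1/\zeta_G(u) = (1-u^2)^{|E|-|V|}\det(I - Au + (D-I)u^2)$, where $A$ is the adjacency matrix and $D$ the diagonal degree matrix. Here is a sketch in Python with SymPy; the function name is ours, a sketch rather than any library’s API:

```python
import sympy as sp

def ihara_zeta_reciprocal(adj):
    """1/zeta_G(u) via Bass's theorem:
    (1 - u^2)**(|E| - |V|) * det(I - A*u + (D - I)*u**2)."""
    u = sp.symbols('u')
    A = sp.Matrix(adj)
    n = A.rows
    m = sum(A[i, j] for i in range(n) for j in range(n)) // 2   # |E|
    D = sp.diag(*[sum(A.row(i)) for i in range(n)])             # degree matrix
    M = sp.eye(n) - A * u + (D - sp.eye(n)) * u ** 2
    return sp.expand((1 - u ** 2) ** (m - n) * M.det())

# Complete graph K4, the smallest 3-regular (and Ramanujan) example:
K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(sp.factor(ihara_zeta_reciprocal(K4)))
```

For $K_4$ this factors as $(1-u^2)^2(1-u)(1-2u)(1+u+2u^2)^3$, matching the classical computation.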
It seems amazing that graph problems can be encoded into zeta-like functions. One wonders what other problems might yield to similar ideas.
We hope the talks have a nice turnout and are looking forward to a banner day.
Can we have overlooked short solutions to major problems?
Efim Geller was a Soviet chess grandmaster, author, and teacher. Between 1953 and 1973 he reached the late stages of contention for the world championship many times but was stopped short of a match for the title. The Italian-American grandmaster Fabiano Caruana was similarly stopped last week in the World Chess Federation Candidates Tournament in Moscow. He was beaten in the last round by Sergey Karjakin of Russia, who will challenge world champion Magnus Carlsen of Norway in a title match that is scheduled for November 11–30 in New York City.
Today we salute a famous move by Geller that was missed by an entire team of analysts preparing for the world championship in 1955, and ask how often similar things happen in mathematics and theory.
The 1955 Interzonal Tournament in Gothenburg, Sweden, included three players from Argentina: Miguel Najdorf, Oscar Panno, and Hermann Pilnik. In the fourteenth round they all had Black against the Soviets Paul Keres, Geller, and Boris Spassky, respectively. The Argentines all played a Sicilian Defense variation named for Najdorf and sprung a pawn sacrifice on move 9 that they knew would induce the Soviets to counter-sacrifice a Knight, leading to the following position after Black’s 12th move in all three games:
Chessgames.com source |
As related by former US Champion Lubomir Kavalek, who contested four Interzonals between 1967 and 1987, Najdorf indecorously walked up to Geller after Panno had left his chair and declared,
“Your game is lost. We analyzed it all.”
Unfazed, Geller thought for thirty more minutes and improvised 13. Bb5!!, a shocking second sacrifice that the Argentines had not considered. The Bishop cannot be taken right away because White threatens to castle with check and soon mate. The unsuspected point is that after Black’s defensive Knight moves to the central post e5 and is challenged by White’s other Bishop moving to g3, the other Black Knight on b8 cannot reinforce it from c6 or d7 because the rogue Bishop can take it. The Bishop also X-rays the back-row square e8 which Black’s Queen could use.
Whereas Najdorf’s speech was in-delicto, no rule prevented Keres and Spassky from walking over and noticing and “cribbing” Geller’s move. It is not known if they got it that way—some say Keres already knew the move—but both played it after twenty-plus more minutes of reflection. Panno perished ten moves later and the other Argentines were equally dead after failing to find the lone reply that lets Black live. Though there is stronger indication that Keres noted the draw-saving reply 13…Rh7! then or shortly afterward, the first time it ever was played on a board was in the next Interzonal three years later, by the hand of Bobby Fischer.
Bernhard Riemann’s famous hypothesis has been in the news a lot recently. The second half of November saw one proof claim in Nigeria and another by Louis de Branges, who acts as a periodic function in this regard but has scored some other hits. We just covered some other news about the primes.
Then last week this paper came to our attention. It is titled, “A Direct Proof for Riemann Hypothesis Based on Jacobi Functional Equation and Schwarz Reflection Principle” by Xiang Liu, Ekaterina Rybachuk, and Fasheng Liu. The paper is short. My reaction from a quick look was,
It’s like saying that in an opening that champions have played for decades they all missed a mate in ten.
I must admit that I’ve spent more time thinking of a real missing mate-in-ten case in chess than probing the paper. The above is the closest famous case I could think of, and it wasn’t like the Argentines had been analyzing for the 157 years that Riemann has been open—they had only been doing it during the weeks of the tournament. Without taking time to find errors in the paper, let’s ask some existential questions:
Is it even possible to have missed so short a proof? What things like that have happened in mathematical history?
And even more existential, what is the easiest possible kind of proof of Riemann that we might not know about? This is a vastly different question from assessing Shinichi Mochizuki’s claimed proof of the ABC Conjecture. No one would be surprised at Riemann yielding to such complexity. Reader comments are welcome and invited.
Dick had already been intending to make an update post on what is going on with the famous problem. We know that most, if not almost all, of our colleagues believe on clear principles that $\mathsf{P} \neq \mathsf{NP}$. One of us, Dick, has repeatedly argued that though they might be different, it is not so clear. Recently Donald Knuth has voiced some opinions along those lines.
Major math problems do get solved from time to time. Rarely, however, do the solutions go “from zero to a hundred.” That is, there often are partial or intermediate results—like gearshifts as a car accelerates. For example, the famous Fermat’s Last Theorem was proved for many primes before Andrew Wiles proved it for all cases, and Wiles built on promising advances by Ken Ribet and others. En passant, we congratulate Sir Andrew on winning this year’s Abel Prize.
The recent breakthrough on the Twin Prime Problem by Yitang Zhang is another brilliant example of partial progress. His step from a prime gap that was near logarithmic to a constant—initially a huge constant—was unexpected, yet it was obtained by mostly-known techniques pushed pedal-to-the-metal.
One might expect a similar situation with $\mathsf{P}$ and $\mathsf{NP}$. If they really are not equal then perhaps we would be able first to prove that SAT requires super-linear time; then prove a higher bound; and finally prove that $\mathsf{P}$ and $\mathsf{NP}$ are not equal. Yet this seems not to be happening.
There are two kinds of “results” to report on about $\mathsf{P}$ versus $\mathsf{NP}$. We just recently again mentioned Gerhard Woeginger’s page with a clearinghouse of over a hundred proof attempts.
On the $\mathsf{P} = \mathsf{NP}$ side, the usual idea is one that has been tried for decades. Take an $\mathsf{NP}$-complete problem, such as TSP, and supply a polynomial-time algorithm that solves it. Often the “algorithm” uses Linear Programming as a subroutine, but some do use other methods.
There is the issue that certain barriers exemplified by this may prevent large classes of algorithms from possibly succeeding. So we can say at least that a proof might have an intermediate stage of saying why certain barriers do not apply. Otherwise, however, a proof of $\mathsf{P} = \mathsf{NP}$ by algorithm is bound to be pretty direct. Plausibly it would have one new and pivotal algorithmic idea, one that might of itself furnish an explanation of why it was missed.
On the $\mathsf{P} \neq \mathsf{NP}$ side, however, there are several concrete intermediate challenges on which one should be able to demonstrate progress to support one’s belief.
A third of a century ago, Wolfgang Paul, Nick Pippenger, Endre Szemerédi, and William Trotter showed that for the standard multitape Turing machine model, $$\mathsf{DTIME}(n) \subsetneq \mathsf{NTIME}(n).$$
This proves a sense in which guessing is more powerful than no guessing. Yet a result like $$\mathsf{DTIME}(n^2) \subsetneq \mathsf{NTIME}(n^2)$$ appears hopeless. Nor have we succeeded in transferring the result to other natural machine models, such as Turing machines with one or more planar tapes.
How about proving that SAT cannot be done simultaneously in time $n^c$ and space $s(n)$, for particular fixed exponents $c$ and reasonable space functions $s(n)$? There do exist a few results of this kind—can they be extended? How come we cannot prove that SAT is not in linear time? This however also seems hopeless today.
A related example is, can we prove that SAT needs Boolean circuits of size at least $10n$, say, let alone super-linear circuits? Can we prove that some natural problems cannot be solved by quadratic or nearly-linear sized circuits of depth $O(\log n)$?
Is it worth making a clearinghouse—even more than a survey—of attempts on these intermediate challenges?
What are some possible kinds of mathematical proof elements that we might be missing?
And lead to new kinds of cheating and ideas for our field
Faadosly Polir is the older brother of Lofa Polir. He is now working as a full time investigative reporter for GLL.
Today Ken and I are publishing several of his recent investigations that you may find interesting.
You may wonder how GLL is able to afford a reporter on staff. We can’t. All our work we bring you at the same price we always have. Reporters, however, are burgeoning. Maybe it’s the political cycle. Maybe “Spotlight” winning the Oscar helped. Or maybe leprechauns have created a surplus—that would explain the political shenanigans we’re seeing.
Faadosly didn’t take long to earn his keep. He spotted an announcement in a deep-web online journal:
Second Hardy-Littlewood Conjecture (HL2) Proven.
We just covered the recent new and overwhelming evidence for the First Hardy-Littlewood Conjecture (HL1) about the distribution of the primes. Together, these conjectures have wonderful consequences, which unfortunately are already being turned for ill by some of our young peers.
That’s right, we are saddened to report here at GLL that the hugely productive work of a few young theorists derives from a kind of cheating. To protect their names we will refer to them collectively as X.
It is now known that X have been able to write so many beautiful papers over the recent years owing to the discovery that ZF is actually inconsistent. Since ZF, basic set theory, is used to prove everything in computer science theory, it follows that anything can be proved. At my old university they are running supercomputers to do massive derivations in Vladimir Voevodsky’s formulation of Homotopy Type Theory (HoTT). Although HoTT embeds the constructive version of PA this doesn’t prove PA’s consistency let alone that of ZF—yet it can model derivations using statements like HL2 as axioms. Faadosly’s investigative reporting showed that what happened is that the ZF proof of HL2 enabled HoTT to prove that a bug in Edward Nelson’s argument for inconsistency in PA goes away when lifted to ZF via known analytical facts about HL1.
This isn’t how X were caught. Rather, they were caught completely separately by running software called MORP, for “Measure of Research Productivity.” MORP is not simply a plagiarism detector like MOSS; rather, it applies equations determined by regression over many thousands of papers in the literature. The equations can determine when someone’s productivity is way too high. Our own cheating expert Ken reports:
There is over confidence that the results of X could not be obtained by a human mathematician without deep computer assistance.
MORP is interfacing with a project to determine the likelihood of conjectures by analyzing historical proof attempts on them. By deep learning over attempts on past conjectures before they were proved, and Monte-Carlo self-play of thousands of proof attempts with reinforcement learning when things are proven, they have achieved a success rate comparable to that of Google DeepMind’s AlphaGo, which defeated a top human champion at Go.
When run on the contents of Gerhard Woeginger’s $\mathsf{P}$ versus $\mathsf{NP}$ page, whose 107 proofs are evenly split between $\mathsf{P} = \mathsf{NP}$ and $\mathsf{P} \neq \mathsf{NP}$, the output gives confidence comparable to IBM Watson’s in its “Jeopardy” answers that both are theorems.
Thus our field already had strong evidence for the inconsistency that we are now reporting. This is the first major inconsistency result in mathematics since 1901 and 1935, an unusually long gap of 81 years. Reached for comment, Voevodsky told Faadosly that he was not surprised: “After all, if ZF had been consistent then ZF + Con(ZF) would also be consistent, and the situation we have now is practically speaking not much different from that.”
This news has heightened discussions already long underway among prize committees at top organizations in math and theory about new policy for the awarding of their large prizes. We have already covered feelings by many that paying large sums for past achievements (often long past) is inefficient for stimulating research and community interest—while at the same time they are being upstaged by startup upstarts giving way bigger prizes.
Doing away with prize money altogether was rejected as lame—the large sums are important for the public eye. The committees concluded it is vital to maintain the absolute values of the prizes. So the prizes will still be awarded as before by a blue-ribbon panel and carry large dollar amounts. The only thing different will be the sign. The winner will be required to pay the prize money to the organization. Thus a Turing Award winner will owe one million dollars to the ACM.
The motivation is two-fold. First, the prize money being paid to the ACM will be used to hire extra needed administrative personnel. Second, it can be used for greater outreach. Also, a generous payment plan is being considered: payment of the prize money over periods as long as twenty years.
With help from Faadosly we have been running a private poll of past winners of the Turing and lesser prizes. Over 80% have said something equivalent to:
I would have been happy to pay the prize amount. It’s the Turing Award after all.
“I could always have sold my house or perhaps a kidney,” said one recent winner with a smile. There is talk that this new kind of prize may be especially appropriate for the new result on ZF, to cover some of the costs it will cause. Quantum may institute its own i-Prize, in which the real amount is indeterminate.
For now the committees have decided only to proceed on a provisional basis with the lesser prizes. Some are being given reversed names—for instance, nominations are now being accepted for the 2017 Thunk prize. The University of Michigan has proposed a reverse prize to benefit Ann Arbor’s innovative but consolidating Le Dog restaurants.
I was just recently at Simons in Berkeley, and of course several people from Stanford joined in. They discussed the recent New York Times article on Stanford’s new policy of having a 0% admission rate, shooting its undergraduate program to the zenith for exclusivity. I mentioned that at Tech our enormously popular remote Master’s in CS program is allowing us to emulate Stanford as regards on-site admission. This in turn will free up resources currently used on student housing and enable allocating more teaching staff to the online courses.
We began talking about similar ideas to increase the prestige of major conferences. Several conferences already have acceptance rates verging under 10%, so going to 0% is not so big an affine step. Indeed, several thought it too small and that we should go all out for negative acceptance rates (NAR).
Conferences using NAR work on a simple system. Anyone wishing to submit creates an item on EasyChair or a similar logging service the way we do today. The Program Committee then sends that person one of their current original papers to referee. The best of the referee reports are then selected for oral presentation and double-blind publication in the proceedings, thereby achieving a negative acceptance rate. People making the very top submissions, who previously would win Best Paper awards, become co-authors of the papers they referee.
This is considered an excellent way to promote research by talented people—now being selected for a PC needn’t be a sacrifice of research time. It also galvanizes the conference reviewing process. Faadosly has learned that the upcoming STOC Business Meeting will propose its use on a rotating basis by FOCS or STOC in alternate years.
What do you think of these developments? Are they all plausible? Do you believe the conjectures of Godfrey Hardy and John Littlewood? After all, these two giants conjectured both of them.
Have a happy April Fool’s Day.
[some word changes]
Quanta source (K.S. at left) |
Robert Lemke Oliver and Kannan Soundararajan have observed that the primes fail some simple tests of randomness in senses that are both concrete and theoretical.
Today we discuss this wonderful work and what it means both for properties of the primes and for asymptotics.
At the heart is the question,
How closely does the sequence of primes emulate a truly random sequence?
Lemke Oliver and Soundararajan use a genre of simple tests of randomness—ones that we might have thought would be true. The simplest one is:
Let $p$ and $p'$ be consecutive primes in order. If $p$ is congruent to $a$ mod $m$, then $p'$ is less likely to also be congruent to $a$ mod $m$.
They note that among the first billion primes the observed frequency of this event is much less than $1/\phi(m)$, where $\phi(m)$ is the number of positive integers less than $m$ that are relatively prime to $m$. They argue, based on a famous $k$-tuple conjecture of Godfrey Hardy and John Littlewood, that this bias persists through the whole sequence of primes. Yet it also follows that the proportion of cases where the next prime has the same congruence mod $m$ tends to $1/\phi(m)$ in the limit. This is not a contradiction.
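The bias is easy to see in small data. This sketch is our own code, not from the paper; it tallies the last digits of consecutive primes up to one million:

```python
from collections import Counter

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if flags[i]]

# Primes above 5 end in 1, 3, 7, or 9; pair each prime's last digit
# with the next prime's last digit.
digits = [p % 10 for p in primes_up_to(1_000_000) if p > 5]
pairs = Counter(zip(digits, digits[1:]))
total = sum(pairs.values())
same = sum(c for (a, b), c in pairs.items() if a == b)
print(same / total)  # noticeably below the naive 1/4 for a "random" sequence
```

A truly random sequence over four symbols would repeat its previous symbol a quarter of the time; the primes at this scale repeat their last digit markedly less often.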
From first principles the primes are completely not random: they are determined by a tiny rule. In this respect they are like the digits of π. Unlike π the primes also flunk an immediate frequency test: only one even number is prime. We can make a fairer comparison by taking the primes mod q and expanding π in base q. Call these infinite base-q sequences P_q and π_q, respectively. Jacques Hadamard and Charles-Jean de la Vallée Poussin proved that each “digit” of P_q appears with frequency asymptotic to 1/φ(q), but the corresponding statement has not been proved for π_q in any base, even q = 2. Yet π and its analogues for e and √2 and other famous irrational constants are commonly expected to be normal in every base q. Normal means that every sequence of k digits occurs with frequency asymptotic to 1/q^k; the case k = 1 is called “simply normal.”
Indeed, the normality of π and the other constants follows from some reasonable hypotheses about digital dynamical systems. So what about P_q? One difference is that whereas the normality of a number in base q is equivalent to its being simply normal in base q^k for all k, this doesn’t carry over to the primes mod q in any obvious way. Still, it is significant that the primes meet the k = 1 case for all q. So what about k = 2? This is where Lemke Oliver and Soundararajan weighed in with a universally regarded surprise.
They posted their paper two weeks ago, and it has been covered by Scientific American, by Nature, by the New Scientist, and by Quanta Magazine. Terry Tao also posted about it, and he explicates the gist in a wonderful way.
Also wonderful are the diagrams in this 2002 paper by Chung-Ming Ko, in which much of the concrete numerical evidence was observed. A 2011 paper by Avner Ash, Laura Beltis, Robert Gross, and Warren Sinnott observed the numerics and went further to try to explain them by heuristic formulas. However, one needs Lemke Oliver and Soundararajan (and a nod to Tao’s exposition) to plumb the relations among asymptotics, heuristics, and theory.
The easiest case for us to visualize is q = 10, for which φ(q) = 4. All primes other than 2 and 5—the former being a very odd prime—must have a last digit of 1, 3, 7, or 9. The first and last of these are quadratic residues mod 10, and a long-known bias, first noted by Pafnuty Chebychev in the case q = 4, a = 3, explains why they usually occur less often in concrete tables up to some number x than 3 and 7 as last digits. For an initial stretch of the primes the counts are:
Here only primes congruent to 1 mod 10 are lagging overall. There are values of x for which 1 has the highest count, but—as has been proved assuming the Generalized Riemann Hypothesis—they are in certain senses sparse, so that this picture is typical. This bias is only on the order of √x (recall that π(x) ∼ x/ln x by the Prime Number Theorem), so it is pretty mild. The bias found by Lemke Oliver and Soundararajan for consecutive primes, however, blows it away. Let a′ stand for the cyclic successor of a in the cycle 1 → 3 → 7 → 9 → 1, so for example 9′ = 1; the table groups the pairs (a, a′):
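Counts like these are easy to reproduce. Below is a minimal sketch—our own illustration, not from the post—that tallies the last digits of the first 10,000 primes; `first_primes` is a helper name we introduce for the purpose.

```python
from collections import Counter

def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    found = []
    cand = 2
    while len(found) < n:
        is_prime = True
        for p in found:
            if p * p > cand:
                break
            if cand % p == 0:
                is_prime = False
                break
        if is_prime:
            found.append(cand)
        cand += 1
    return found

# Tally last digits, skipping 2 and 5, whose last digits are special.
counts = Counter(p % 10 for p in first_primes(10_000) if p not in (2, 5))
print(dict(sorted(counts.items())))
```

In line with the Chebychev-type bias discussed above, the tallies for the quadratic residues 1 and 9 tend to trail those for 3 and 7 slightly on stretches like this.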
Most of the bias is against the next prime having the same last digit as the previous one. It is gigantic. How can this be?
One can try to view the primes as the outcome of a random process. Say we are up in the range from x up to 2x for some large x. The density of primes there is about 1/ln x, so let’s picture a series of random trials each with success probability 1/ln x. If a was the congruence mod 10 of our last success, then the point is that the other eligible last digits get the next cracks at being prime before the turn of a comes around again. As some commenters to the above news items have remarked, this could be analogous to Benford’s Law in base 10, which we once covered in regard to baseball.
Our counts above make this plausible, but the idea falls apart when looking at individual pairs of digits. For instance, conditioned on a prime being 3 mod 10, the next prime is more often congruent to 9 than to 7. Lemke Oliver and Soundararajan (LOS) further note that the above bias should be no greater than the probability of a prime gap being less than 10—because once you get to a + 10, which is congruent to a, the case a has first crack in every coming cycle. But it is known that this probability is asymptotically less than the bias they identify in their argument. In a comment to his own post, Tao calculates for q = 10 that the “next-crack” argument would predict 20% for same-residue pairs, not the observed 18%.
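The headline bias is easy to observe directly. Here is a hypothetical sketch that sieves the primes below 300,000 and measures how often consecutive primes share a last digit—far below the naive guess of 1/4:

```python
from collections import Counter

def primes_below(limit):
    """Simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i, b in enumerate(sieve) if b]

ps = [p for p in primes_below(300_000) if p > 5]  # drop 2, 3, 5
pairs = Counter((a % 10, b % 10) for a, b in zip(ps, ps[1:]))
total = sum(pairs.values())
same = sum(v for (a, b), v in pairs.items() if a == b)
print(f"same last digit: {same / total:.3f} (naive guess 0.250)")
```

On a range this small the same-digit fraction comes out well under 25%, matching the LOS observation; printing the full `pairs` table also shows the asymmetries among individual digit pairs.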
The authors say they were motivated by a second principle they heard in a lecture by Tadashi Tokieda: Suppose Alice rolls a 4-sided die with faces marked 1, 2, 3, 4, trying to roll a 2 two consecutive times, while Bob rolls the same die trying to get a 2 followed by a 1. Occurrences of 2-2 and 2-1 are equally likely, yet Alice will need appreciably more rolls on average to meet her goal than Bob. The reason is that when Alice rolls a 2 but then fails by getting a 1, 3, or 4 on her next roll, she has to roll some more to get a 2 again, whereas Bob on rolling a 2 can fail by getting another 2, which sets him up immediately for another try. Stated in this form, the issue resembles the exposé of a simple bias in studies of the “hot hand” fallacy, which we discussed last October.
However, this argument goes away when we change Alice’s goal to rolling any 2 and Bob’s to rolling any 1. Now both have a 1-in-4 chance of winning on any roll, so there should be no bias. Another fix to Alice and Bob, analogous to our suggestion in the “hot-hand” case, makes no difference for LOS: make their goals be to roll 2-2 and 2-1 beginning with an odd-numbered roll. Then their expected trials to first success are the same. In the prime case this means considering pairs (p_n, p_{n+1}) where n is odd (and large enough to rule out the primes 2 and 5). Does the bias in such pairs persist? Yes, because we could equally well start with n even, and the overall bias is the average of these two cases.
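Tokieda's die example is easy to simulate. In this sketch the faces 1–4 and the targets 2-then-2 versus 2-then-1 are our illustrative filling-in of the details; Alice waits for two consecutive 2s, Bob for a 2 followed by a 1.

```python
import random

def rolls_until(pattern, rng, sides=4):
    """Roll a fair die until the last two rolls equal `pattern`; return the count."""
    prev, n = None, 0
    while True:
        r = rng.randrange(1, sides + 1)
        n += 1
        if (prev, r) == pattern:
            return n
        prev = r

rng = random.Random(0)
trials = 20_000
alice = sum(rolls_until((2, 2), rng) for _ in range(trials)) / trials
bob = sum(rolls_until((2, 1), rng) for _ in range(trials)) / trials
print(f"Alice ~{alice:.1f} rolls, Bob ~{bob:.1f} rolls")  # about 20 vs 16
```

The exact expectations on a 4-sided die are 20 rolls for Alice and 16 for Bob, even though any fixed two-roll pattern has probability 1/16 at a given position.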
However imperfect, these analogies directed them to suspect and consider a deeper connection to the theory of gaps between primes.
At the heart is a conjecture by Godfrey Hardy and John Littlewood that we discussed last August. LOS started with a strong form of it that expedited their mod-q twist. Given any prime p and any finite set H of integers including 0, let c_p(H) denote the proportion of the residues mod p that are not congruent to any element of H modulo p. Then define the product series

S(H) = ∏_p c_p(H) / (1 − 1/p)^{|H|},

the product running over all primes p.
If H hits every congruence class modulo some p then S(H) = 0, so in particular H cannot include any odd numbers (0 is already in H). If H = {0} then c_p(H) = 1 − 1/p for every p, so S(H) = 1. Finally let π(x; H) denote the number of n ≤ x such that for all h ∈ H, n + h is prime. The so-called “First” Hardy–Littlewood conjecture then states that for any H and x,

π(x; H) = S(H) ∫_2^x dt/(ln t)^{|H|} + O(x^{1/2 + ε}).
When H = {0}, the integral is the offset log-integral function Li(x) (the paper uses li(x) for this) and this becomes a well-known equivalent form of the Riemann Hypothesis (RH)—and still equivalent if the error is sharpened to O(√x log x). When H = {0, 2} this subsumes the Twin Primes conjecture. Thus Hardy and Littlewood presented their conjecture as a natural strong extension of RH.
The twist by LOS starts by excluding the finitely many primes that divide q from the product, calling the resulting function S_q(H). They need to relate this to tuples a of congruences mod q. They break a down according to how equal congruences mod q are situated; these become like diagonal entries of a matrix and there are symmetries around this “diagonal.” The lone hard-and-fast theorem in their paper, called Proposition 2.1, shows that their formulas faithfully follow the original Hardy–Littlewood intuition about the behavior of the series. Eventually they boil all this down into two functions which become appropriate constants when q and a are fixed.
In terms of x and the length k of a, and recalling the prime-counting function π(x), we can call π(x)/φ(q)^k the “naive” estimate of the number of primes among the first π(x) primes such that the next k primes hit the congruences a_1, …, a_k mod q in order. In terms of li, the natural estimate is li(x)/φ(q)^k. The new conjecture by LOS shows how far this needs to be adjusted—and the overhead is far bigger than the gentle O(√x log x) in the analogous factored form of RH or Hardy–Littlewood:
Conjecture:
As often in number theory, the asymptotic equality is meant in the sense of holding up to an error term of lower order. The paper closes with detailed numerical evidence supporting this expression. As usual we say to consult the paper for further details, but we have some final remarks about the two or three big terms in this formula.
The three terms still all go to 0 as x → ∞. Let’s go back to our original focus on the frequency with which the next prime repeats the congruence mod q. Call it f(x) for the primes below x and just f in the limit; nothing stops f(x) from being asymptotic to 1/φ(q). Then it is perfectly mathematically consistent to prove theorems—maybe contingent on a generalized RH and/or Hardy–Littlewood and/or related principles—with diametrically opposite messages:
The paper indeed conjectures “always” in a related case, so we might not even have the kind of sporadic remissions seen in the Chebychev bias. We can’t imagine anything more opposite than this.
Now number theory is accustomed to emphasis on lower-order terms—as we noted above, RH is equivalent to a statement about the lower-order behavior of π(x). But it strikes us that complexity theory has rarely been so subtle. We prove theorems of type (a) all the time. Are we perhaps missing important ways to sharpen and qualify our asymptotic estimates?
How does this discovery align with other observations of unexpected structure in the primes, including ramifications of Stanislaw Ulam’s spiral?
Tao makes a point of noting—including in his followup comments to his own piece—that while one term of the correction is already sizable and responsible for much of the deviation tabulated for moderate x, the asymptotically bigger term is novel and potentially more important theoretically. We look forward to further developments. Can readers say ways in which asymptotics in complexity theory are showing the same maturity?
A trick of language and echoing
Neil L. is a Leprechaun. He has been visiting me once every year since I started GLL. I had never seen a leprechaun before I began the blog—there must be some connection.
Today I want to share the experience I had with him this morning of St. Patrick’s Day.
Neil L. has visited me many times before, usually late at night just as St. Patrick’s Day starts here in Atlanta. I felt it would be different this year, since I am now happily engaged to my fiancée Kathryn Farley. She is—as you may have guessed—of Irish descent. We have discussed Neil and she has mixed feelings about him.
She has a PhD in theater from Northwestern University, and is currently teaching a course at Tech on computational improv. Kathryn feels, as an expert in story telling of all kinds, that leprechauns are fundamentally anti-Irish. I have explained that Neil is different, but she may be right. Our friends at Wikipedia say:
Films, television cartoons and advertising have popularised a specific image of leprechauns which bears scant resemblance to anything found in the cycles of Irish folklore. Irish people can find the popularised image of a leprechaun to be little more than a series of stereotypes of the Irish.
Yet the mascot of the University of Notre Dame is the Notre Dame Leprechaun. Perhaps Kathryn is still right.
Nonetheless I told Kathryn I’d stay up late to see Neil this year again. She and I watched CNN’s usual repetitive discussion on the US election primaries, with CNN’s usual talking heads, until she said she would call it a night. It was already 2am and she retired to our bedroom and fell asleep.
I sat alone on our study sofa waiting. Toward three o’clock the soft sofa and boring CNN discussion with the sound set low were too much, and I was soon fast asleep too.
I was awakened by the sweet smell of smoke. The room was dark and as my eyes adjusted I finally saw him standing near me puffing his pipe, each puff filling the air with green smoke. I nodded hi, realized that Neil had turned off the TV, and asked him, why was he late this time? Neil replied,
You moved again.
I was surprised since yes I had moved again, but Neil is a leprechaun. How could he be confused? He said he just forgot about my moves—even leprechauns mess up their online calendars it seems. He said he’d appeared at my previous place and even my old house and scared the people who lived there now. Neil chuckled and smiled, and took another puff of his pipe. He added: “it put their hearts crossways dead right.”
Groggily I said hi and that I thought he might have decided to skip this year. But he said:
Skip?—go way outta that. I love seeing you.
Neil sat down next to me and puffed some more green smoke, filling the room with that sweet smell.
In past years I’ve sometimes been granted the ability to ask Neil questions. None of that has ever worked—he’s always outsmarted me—see this for example. So this year I said right away that it was great to see him, but I desired no questions. I was tired of being made to look foolish. Neil smiled and said:
Aw, sure look it. I see you are more clever than I had imagined. Good choice.
But I knew that Neil was magical, knew all, and could help me: was there some way to get information out of him? Hmmm… I had an idea.
I had prepared for Neil’s visit. While he spoke English to me, his native tongue is Gaelic, which some call Irish. I had discovered that Gaelic has one very distinct property:
There are no words for ‘yes’ or ‘no’ in Gaelic.
Yes. Really. You cannot just say “yes” or “no” to answer a question. You have to echo the question. Here is an explanation:
For instance, there are no words for “yes” or “no” in Gaelic. It’s the truth. If you want to answer somebody in the positive or negative, you actually have to refer back to the question itself in the form of a positive or negative statement. So, when somebody asks you “ar mhaith leat cupan tae?” (would you like a cup of tea?) you cannot just say “yes” or “no”—there simply aren’t any words for that. You have to keep up the chatter by answering: “ba mhaith liom cupan tae” (I would like a cup of tea) or if you’re feeling lazy you can reduce this as far as “ba mhaith liom” (I would like) but absolutely no further.
I wondered if I could use the lack of the words “yes” and “no” in some clever way to trick Neil.
Here is what I tried. I figured I should start by asking him something I already knew the answer to: “Neil, is it true that in your native tongue you cannot say simply ‘yes’ and ‘no’?” He smiled, then frowned, and then he took several more puffs. Then he nodded. I told him I had a challenge for him. He nodded again. I said:
I know you are very smart—smarter than I—but perhaps you could give me the answer to whether P=NP or not, yet still keep it hidden from me.
Neil was clearly intrigued and nodded once again. He replied:
‘Tis clever of you to know about my Gaelic language, but what does it have to do with answering you without answering you?
I thought that I had him at least interested. I explained my plan. He should write out a Gaelic answer to whether P=NP, of course without using “yes” or “no,” and then encrypt it using a clever code—so clever that I would be unable to decode it.
Neil looked at me, puffed out a large green ring of smoke, and then glanced at me with narrowed eyes.
This is possible, indeed it exists already, but I wonder if I have the right…
A green spark flared in Neil’s pipe. He said, “Aye—Tom Gallagher grants the right.” Out came a parchment—from where I did not see. Its contents were instantly familiar to me since after all I have read about and taught crypto:
I said, “What—wait—this isn’t P=NP; this is El-” but Neil hushed me: “It be the answer you requested. It comes with a story.”
Neil refilled his pipe—I had never seen him do this and so I leaned forward keenly despite my tiredness. He spoke in low tones.
“Although Charles Babbage never promoted his Engines as cryptographic devices, of course others thought of that and after Babbage’s death in 1871 started to act. Edward Elgar pursued codes as well as music from boyhood and thought the Wheel of Fifths and Babbage’s gears could be combined. Elgar tinkered with gear diagrams like this all his life:
But Elgar never had any money to invest and his music career was failing—it didn’t help that he scribbled cryptograms on music he was supposed to be studying or writing. Old Tom took pity—ye need not be Irish but it helps if ye be Catholic—and on July 14, 1897 paid him a visit. Tom wasn’t trapped—no wishes—but he granted one question like I’ve done wi’ ye. Tom hoped he would ask something practical like, “How can I compose better music?”—which he would answer not yes/no but do something to help. But Elgar asked:
“Is it possible to encode by machine so that no man or machine can decode quickly?”
Of course you recognize this as a form of your ‘P=?NP’ question—technically if multiple decodings can be right and verified. As ye know, Tom could not answer yes or no. So he echoed the question by encoding his answer using Elgar’s diagrams as you see, so that Elgar could gain the answer in the doing—or not.”
Well, I thought this was all blarney—by definition, it was blarney. Recalling some Irish phrases myself, I said, “That’s a fret. How can you prove your story?” Neil missed nary a beat and declaimed:
Old Tom, who wrote the book on leprechauns, signed his name ‘Th G’ in a way disguised as ‘July.’ His luck still rubbed off since Elgar tried rotating musical themes on his gears and soon came up with his “Enigma Variations,” which changed his fortunes. When the German engineer Arthur Scherbius named his machine ‘Enigma’ after Elgar’s work, it wasn’t to honor some English music but Elgar’s cryptography. So your Alan Turing in World War II had some luck o’ the Irish on his side. Ach ní Turing ná Elgar ná man ná machine has ever decoded what old Tom writ.
And with a final puff of green smoke Neil was gone. I joined Kathryn in the bedroom and fell fast asleep.
Can you resolve Tom’s enigma—and Neil’s?
Happy St. Patrick’s Day.
[fixed some links, slight word changes]
Cropped from Ashley’s TwiCopy source |
Hou Yifan and Maurice Ashley are champions of chess in several senses. Hou just regained the Women’s World Champion title by defeating Mariya Muzychuk of Ukraine 3-0-6 in their match which ended today in Lviv. Along with her male Chinese compatriots who reign as Olympiad team champions, she headlines an extraordinary growth of the game in East Asia. Ashley became the first ever grandmaster to play a tournament in Jamaica, where he was born, and has been one of the game’s premier video commentators and ambassadors for over two decades. The past two years he teamed with entrepreneur Amy Lee of Vancouver to create and run the Millionaire Chess open tournaments in Las Vegas, to raise the professional profile of the game.
Today Ken and I present another puzzle in our “Coins on a Chessboard” series, theming it after Ashley’s “Millionaire Square” promotion.
These are exciting weeks in Chess and in Go. Besides the women’s championship match, the World Chess Federation (FIDE) Candidates’ Tournament to determine the next challenger to world champion Magnus Carlsen started Friday in Moscow. Former world champion Viswanathan Anand of India, the oldest player at 46, shares the lead with Levon Aronian of Armenia and Sergey Karjakin of Russia, each with a win and two draws. A major international Open is taking place in Reykjavik, 44 years on from the famous match between Boris Spassky and Bobby Fischer. The fifth and final game of the Go match between Google DeepMind’s AlphaGo program and world #4 Lee Sedol is starting at midnight ET; Sedol has lost the series, being 3–1 down, but will try to build on his Game 4 win.
Millionaire Chess brought the first ever $1,000,000 total prize fund to a long weekend of open chess. As customary at many events, the tournament is divided into sections by rating class, so that amateur players have a shot at the biggest prizes. Last October’s event introduced the “Millionaire Square” finale in which the nine section winners played a quiz-game show to give one a chance to win a separate $1M prize underwritten by the sponsors by guessing which of 64 squares it was “under.” The winner guessed c4 but it was b1. We do not know if this promotion will be brought back for the 2016 tournament at Harrah’s in Atlantic City. It gave Ken a neat way to re-phrase the puzzle from the source given by a student in my discrete math class.
Suppose “Millionaire Square” is presented in this glitzy manner next October: A chessboard is set out with 64 golden coins bearing the Harrah’s logo (“heads”) on one side and the Millionaire Chess logo (“tails”) on the other. The prize is under one of the squares. The coins are randomly arranged heads or tails. The lucky finalist is led out from a blind never having seen the board before, picks a coin, and wins if he or she chose the right square. Like last October it’s a 1-in-64 shot.
Now suppose this is presented to the audience with extra pizzazz. A staffer comes out first, picks up one of the coins, shows both sides to the audience, and puts it back on the square. The coin is set back just like all the other coins except it might now be heads instead of tails, or tails instead of heads. Is this secure?
Suppose the staffer knows the right square and wishes to give it away, perhaps out of grudge or hope of hitting up the winner later. The staffer has no contact at all with the contestant, who knows nothing except perhaps having read this GLL post, and sees nothing except the final arrangement of the coins. Can the staffer communicate the winning square just by having flipped one coin, and if so, how? That is the puzzle.
Note that the employee has no control over the setup of the coins which is completely random. The configuration after flipping one coin was just as likely as the original to be chosen randomly. So how can six bits of information have been transmitted?
I shared this puzzle with Vipul Goyal when he visited Tech last month. He is currently at Microsoft Research in India and does strong research in cryptography, security, and privacy—not surprising since he did his PhD at UCLA under Rafail Ostrovsky and Amit Sahai. It took a while for him to understand why it could even remotely be soluble, but that evening he sent me a complete solution. Impressive. He has given permission for me to relate it. First, let me re-phrase the puzzle not in Ken’s way or the source’s way with a jailer and two prisoners, but in simple complexity terms:
Alice receives a string x of length 64. She also receives an index i in {0, 1, …, 63}. She must flip one bit of x so that when Bob receives the resulting string he can always recover i.
Now here is how Alice can send one bit, namely whether i is even or odd. If i is odd and x has an even number of 1’s in its first half, she flips a bit in the first half; likewise if i is even and x has an odd number of 1’s in the first half. Otherwise she flips a bit in the second half. Then the parity of i and the parity of the count of 1s in the first half always agree. Thus Bob has narrowed down the possibilities by half. In chessboard terms this is like choosing to flip a coin on White’s side or Black’s side of the board.
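This one-bit scheme is short enough to write out. A hypothetical sketch (function names ours), using a length-8 string for readability:

```python
def alice_one_bit(x, i):
    """Flip exactly one bit of x so that the parity of 1s in the first
    half of x matches the parity of the index i."""
    x = list(x)
    half = len(x) // 2
    if sum(x[:half]) % 2 != i % 2:
        x[0] ^= 1       # fix the first-half parity
    else:
        x[half] ^= 1    # parity already right; flip harmlessly in the second half
    return x

def bob_first_bit(y):
    """Recover the parity of i from the count of 1s in the first half."""
    return sum(y[:len(y) // 2]) % 2

x = [1, 0, 1, 1, 0, 0, 1, 0]
assert bob_first_bit(alice_one_bit(x, 5)) == 1   # 5 is odd
assert bob_first_bit(alice_one_bit(x, 4)) == 0   # 4 is even
```

Exactly one bit changes in either case, which is the constraint the puzzle imposes.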
Now imagine this scheme using the odd versus even positions in x—which is like light versus dark squares on the chessboard—or using the right and left halves of the board. Can we combine these schemes to communicate more bits? If you’re aware of Hamming codes, you might realize that if x is initially a codeword, any bit flip will be detectable since the code will correct any one error. But what if x is not a codeword? Making this work in general still needs setting the details down precisely.
Our source has its own solution and there are others, but let’s see Vipul’s: Denote the input string (to Alice) by x = x_1 x_2 ⋯ x_n, where n = 64. Let x_i denote the i-th bit of x.
We will define sets S_1, …, S_6. Each set will contain a subset of the bits from x (denoted by their positions in x). Denote the parity of all the bits in S_j by P_j. The bit string that Alice will transmit to Bob will be P_1 P_2 ⋯ P_6. The primary challenge will be to choose the sets in such a way that exactly by flipping one of the bits of x, the parities can be set to any desired values.
Now to describe how to construct these sets. The key observation is the following. Alice would like to flip the parities of some collection of these sets. That means for every such collection, there must exist a unique bit in x which is in exactly the sets from this collection but not in the sets outside this collection. There exist 2^6 = 64 such collections. And this is also exactly the number of bits in x. Hence, it appears to be possible to have such sets. These sets are constructed as follows: numbering the positions of x from 0 to 63, position i is in set S_j iff the j-th bit of the binary representation of i is 1.
Denote the original 6-bit parity string for the sets by P = P_1 ⋯ P_6 and the bit string that Alice wants to communicate to Bob by B = B_1 ⋯ B_6. Let C = P ⊕ B. Now the parity of S_j must be flipped iff C_j is 1. We simply flip the bit whose position, written in binary, is C, and observe that this position is in S_j iff C_j = 1 (by construction).
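Vipul's set construction boils down to a one-line XOR trick: the parity word P is the XOR of the positions currently holding a 1, and Alice flips the single position P ⊕ B. A sketch in code (function names ours):

```python
import random
from functools import reduce

def parity_word(x):
    """XOR of all positions i with x[i] = 1; bit j of the result is the
    parity P_j of the set S_j described above."""
    return reduce(lambda acc, i: acc ^ i, (i for i, b in enumerate(x) if b), 0)

def alice(x, square):
    """Flip exactly one of the 64 coins so the parity word equals `square`."""
    x = list(x)
    x[parity_word(x) ^ square] ^= 1   # flipping position k XORs the word by k
    return x

def bob(y):
    """Read off the hidden square from the final board."""
    return parity_word(y)

rng = random.Random(42)
board = [rng.randrange(2) for _ in range(64)]   # random heads/tails setup
prize = rng.randrange(64)                       # the winning square
shown = alice(board, prize)
print(bob(shown) == prize, sum(a != b for a, b in zip(board, shown)))  # True 1
```

Flipping position k changes the parity word from P to P ⊕ k, so flipping k = P ⊕ B yields exactly B; this works because the board has exactly 2^6 squares.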
Did you know this fact about communication? Are there any possible applications that come to mind?
Incidentally, today is Pi Day 3/14, indeed rounded Pi Day 3/14/16. Georges-Louis Leclerc, Comte de Buffon, famously showed how to approximate π probabilistically by randomly throwing a needle the width of one tile on a square-tiled floor and counting whether it crosses a vertical side of a square. Suppose we randomly toss a coin on the floor instead. Depending on the width of the coin relative to a tile—assuming it’s a rational number—can we ever get an estimate for π that way?
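For contrast with the coin question, here is a sketch of Buffon's original needle experiment in code (our own illustration, not from the post). The needle's direction is drawn by rejection sampling so that π is never smuggled into its own estimate.

```python
import random

def buffon_pi(trials, rng=None):
    """Buffon's needle: a length-1 needle dropped on vertical lines spaced
    1 apart crosses a line with probability 2/pi, so pi ~ 2*trials/crossings."""
    rng = rng or random.Random(1)
    crossings = 0
    for _ in range(trials):
        x = rng.uniform(0.0, 0.5)          # center's distance to the nearest line
        while True:                        # uniform random direction, without pi
            u, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
            r2 = u * u + v * v
            if 0 < r2 <= 1:
                break
        cos_component = abs(u) / r2 ** 0.5  # fraction of half-length spanning across
        if x <= 0.5 * cos_component:
            crossings += 1
    return 2 * trials / crossings

est = buffon_pi(100_000)
print(round(est, 2))   # close to 3.14
```

The crossing test compares the needle's horizontal half-extent with the distance to the nearest vertical line; a circular coin has no orientation, which is exactly what makes the closing question interesting.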
Will there be any man left standing?
Sensei’s Library player bio source |
Lee Sedol of South Korea, who is currently ranked #4 on the unofficial GoRatings list, may be on his way to being #5. AlphaGo, a computer project sponsored by Google DeepMind, is ahead 2-0 in their five-game match.
Today I take stock, explain some of what has happened, and briefly discuss the prospects for AI and human ingenuity.
Go is our most ancient and deepest of games. Whereas the Western rules of chess weren’t settled until the Renaissance, Go has been substantially the same for over 2,500 years. The largest change was about 1,500 years ago to move from a 17×17 to a 19×19 board. This is over five times the size of a chess board, creates a high bandwidth of reasonable moves at each turn, and leads to games up to and over 100 moves for each player compared to an average near 40 at chess. Most Go moves have consequences far beyond the horizon of most chess “combinations,” yet human players have a reliable “feel” for strong play without express calculation.
I discussed aspects of depth and computer advances a year-plus ago. Despite my hedge there against expecting the long timeframe obtained by extrapolating my “Moore’s Law of Games,” I must say I expected Go to last at least to 2030. The “nonlinear jump” has apparently come from actuating the human approach through multiple layers of convolutional neural networks. This has produced many human-savvy moves, but also some “inhuman” stunners have come from AlphaGo’s go-ke (the Japanese term for the jar of stones) with devastating effect.
As with the “Immortal,” “Evergreen,” and “Shower of Gold” games at chess, Go has its own lore of historical games with evocative names: “Blood on the Board,” “Reddened Ear,” “Atomic Bomb” (which was continued after the Hiroshima blast damaged the venue and injured spectators), and Lee Sedol’s own “Broken Ladder” victory. The “Reddened Ear” came in 1846 when a champion realized after his opponent’s surprising center move that he was in danger on two fronts.
I was watching the second game when AlphaGo with the black stones played a move that the master commentator, Mike Redmond, thought was a mis-transmission—see his reaction 30 seconds from the video point here:
Modified from game replay source |
Sedol had just played his white stone to Q11 to stake out territory in the east. Redmond opined that a “more normal” reply would have been P7—to support the posse of black stones at lower right and contemplate disputing the land claim by riding out to R8. But AlphaGo played at P10. Perhaps all of Sedol turned red as he left his chair despite his own clock running down, and he did not reply until over 15 of his remaining 95 minutes had elapsed.
My own first impression of P10 was a beginner move allowing White to firm up the whole right side by Q10. In Go it is considered most valuable to own the corners and the edges—indeed well over half the points are within 3 of the edge. However, this move also had influence to the center and north. Sedol didn’t play Q10 after all—he felt he had to defend the north by P11 instead, and in the closing phase he was forced back two whole rows to cells S10 through S7 from what he could have had by walling from Q10 to Q7 if left undisturbed.
In analytical terms, P10 did not win the game—this expert commentary page says that Sedol was ahead for several stretches afterward. But in human terms it was a blow, much as my own work indicates that Garry Kasparov played 200 rating points below his usual form against Deep Blue. In the game’s last quarter, Sedol had to play on one-minute grace periods called “byo-yomi” and seemed to stumble for lack of time to his defeat on move 211. On the other hand, AlphaGo’s own evaluation was said to be a steady advantage over that phase, and whom could we ask to verify the human-commentator opinions now besides AlphaGo? Chess programs have found many errors in classic game commentaries.
All this leads us to ask, what’s in store humanly when the duel resumes at 11pm ET tonight?
“Slim” is a common Old West cowboy name, and was chosen by the actor Slim Pickens. However, a gunslinger avatar of AlphaGo would have to be the opposite: phagó is Greek for “eating,” so AlphaGo might translate to “Fat Al.” As in fatal—for any human opponent. Does Lee Sedol stand a chance in any of the remaining games?
We will find out overnight Saturday, Sunday, and Tuesday US time, which is afternoon time in Seoul where the matches are being played. The YouTube links and times for live video commentary have already been fixed on the Google DeepMind channel; here they are individually:
The games start a half-hour in. The page also has the saved feeds and shorter 15-minute summaries of the first two games. A second place for live commentary in English is the American Go Association’s YouTube channel.
Unless Sedol can find a silver bullet it may be the Tombstone for human supremacy at strategy games. But it already comes with a silver lining: Go programs that relied primarily on exact calculations had stayed far below professional human players even after Monte Carlo self-play was introduced ten years ago. AlphaGo uses both—indeed the paper shows six interlinked components—but the main fillip comes from imitating the learning process of the human mind. This shows concretely that our brains embody computing efficacy that is not simply replaced by calculation in overt numeric and symbolic terms, per argument here. This also underscores what a tremendous achievement this is already for the Google DeepMind team.
To date there has not seemed to be any reason in Go to hold top-level freestyle tournaments in which multiple human and computer players consult as a team. Will there be now, and will the combination prevail over computers playing alone as it did for chess, at least for awhile?
Update (10:50pm 3/11): The Game 3 AlphaGo broadcast has begun with a lengthy discussion/interview about the P10 move in which an AlphaGo team member explained how it came about. (3/12): AlphaGo won game 3 most convincingly per the GoGameGuru commentary page. Great article by Albert Silver covering the games and comparing chess and Go programs.
Update (3/13): Lee Sedol won Game 4. Go Game Guru commentary. (3/14): Today’s update to the GoRatings.org list does in fact show Lee Sedol at #5, with AlphaGo taking his place at #4.
Update (3/15): AlphaGo won Game 5 (commentary). After the game, the GoRatings list has it at #2, still behind Ke Jie of China at #1, and with Lee Sedol holding at #5.
[some minor word changes, fixed Friday time, added AGA channel, linked previous post re “silver lining” argument at end]
David Johnson was a computer theorist who worked on many things, with special emphasis on the care and treatment of hard computational problems.
Ken and I are sad today to hear that David just passed away.
Many will announce this sad event; we expect that the whole community will express how much they miss David. He was unique among theorists in his dedication to seeing how hard so-called “intractable” problems really are. He devoted much of his career to building challenges: a typical one asked you to design an algorithm that solved some hard problem, often an NP-complete one. These challenges were of great importance in pushing forward the practice of attacking difficult but important problems.
We owe David much for working on these challenges, usually in conjunction with DIMACS—the Center for Discrete Mathematics and Theoretical Computer Science. See this for the list of DIMACS challenges over the years. Note that David was usually part of the team, if not the sole person, running a particular challenge. He also did the lion’s share of assembling a 2000 draft for NSF about challenges in theory.
We quote from the challenge page for the traveling salesman problem (TSP):
The TSP is probably the most-studied optimization problem of all time, and often the first problem that newcomers to the field (or visitors from other domains) attack. Consequently, it has one of the most fractionated literatures in the field, with many papers written in apparent ignorance of what has been done before.
One goal of this Challenge is to create a reproducible picture of the state of the art in the area of TSP heuristics (their effectiveness, their robustness, their scalability, etc.), so that future algorithm designers can quickly tell on their own how their approaches compare with already existing TSP heuristics. To this end we are identifying a standard set of benchmark instances and generators (so that quality of tours can be compared on the same instances), as well as a benchmark implementation of a well-known TSP heuristic (the greedy or “multi-fragment” algorithm), so that the effect of machine speed on reported running time can (roughly) be normalized away.
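The greedy “multi-fragment” benchmark heuristic mentioned in the quote is easy to describe: sort all edges by length, then repeatedly add the shortest edge that neither gives a city a third neighbor nor closes a cycle prematurely, so that tour fragments gradually merge into one tour. Here is a minimal Python sketch of that idea for Euclidean instances; the function name and the use of union-find for cycle detection are our own choices, not taken from the challenge’s benchmark implementation.

```python
import math
from itertools import combinations

def multi_fragment_tour(points):
    """Greedy 'multi-fragment' TSP heuristic (a sketch).

    Sort all edges by length; add an edge whenever both endpoints
    still have degree < 2 and the edge joins two different tour
    fragments (no premature cycle). The n-1 chosen edges form a
    Hamiltonian path, which we close into a tour at the end.
    """
    n = len(points)
    if n < 2:
        return list(range(n))
    dist = lambda i, j: math.dist(points[i], points[j])
    edges = sorted(combinations(range(n), 2), key=lambda e: dist(*e))

    degree = [0] * n
    parent = list(range(n))  # union-find: detects premature cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    chosen = []
    for i, j in edges:
        if degree[i] < 2 and degree[j] < 2 and find(i) != find(j):
            parent[find(i)] = find(j)
            degree[i] += 1
            degree[j] += 1
            chosen.append((i, j))
            if len(chosen) == n - 1:
                break

    # Walk the Hamiltonian path from one endpoint to read off the tour.
    adj = {k: [] for k in range(n)}
    for i, j in chosen:
        adj[i].append(j)
        adj[j].append(i)
    start = next(k for k in range(n) if degree[k] == 1)
    tour, prev, cur = [], None, start
    while cur is not None:
        tour.append(cur)
        nxt = next((x for x in adj[cur] if x != prev), None)
        prev, cur = cur, nxt
    return tour
```

On a unit square the heuristic recovers the optimal tour of length 4; the point of the challenge’s benchmark version, of course, was not quality on toy instances but giving everyone a common yardstick for running time.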
A good place to read about the details that go into creating and running a challenge is the paper by David and Lyle McGeoch. It is quite a complex endeavor to do one of these challenges properly: just some of the issues that strike us are choosing standard benchmark instances, normalizing away differences in machine speed, and making results reproducible for future algorithm designers.
I am also deeply saddened to learn the news this morning. David was a particularly welcoming voice at conferences in the 1980s that formed my outlook on the field as a graduate student. I also remember he helped host me for a one-day visit to Bell Labs in the summer of 1984. I recall that during this visit I was held to a draw in a chess game against Ken Thompson’s Belle computer.
I last saw him at FOCS 2009 in Atlanta at a lunch table with Dick and others I’ve been glad to know. He was an oracle for the status of algorithmic problems: He added to my knowledge concerning a problem that one of my then recently-graduated students was working on. Among things he did to build up the community was to create and edit the STOC/FOCS Bibliography, which I used often to look up journal versions of papers before the rise of the Internet. His book with Michael Garey was the first book on complexity that I read, and it set the style for presenting computational problems in our field.
Ken and I send all of our thoughts to David’s family and many friends. He will be missed.
Update 3/11: Lance Fortnow’s memorial has been hailed by some others, and today the ACM has posted a memorial by Lawrence Fisher.