
Stonefight at the Goke Corral

March 11, 2016


Will there be any man left standing?

[Photo: Lee Sedol, the world's top-ranked Go player in 2010 — Sensei's Library player bio source]

Lee Sedol of South Korea, who is currently ranked #4 on the unofficial GoRatings list, may be on his way to being #5. AlphaGo, a computer project sponsored by Google DeepMind, is ahead 2-0 in their five-game match.

Today I take stock, explain some of what has happened, and briefly discuss the prospects for AI and human ingenuity.

Go is the most ancient and deepest of our games. Whereas the Western rules of chess weren’t settled until the Renaissance, Go has been substantially the same for over 2,500 years. The largest change, about 1,500 years ago, was the move from a 17×17 to a 19×19 board. This is over five times the size of a chess board, creates a high bandwidth of reasonable moves at each turn, and leads to games of 100 or more moves for each player, compared to an average near 40 in chess. Most Go moves have consequences far beyond the horizon of most chess “combinations,” yet human players have a reliable “feel” for strong play without express calculation.
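
For a rough sense of the scale difference (the per-turn move counts are commonly quoted averages, not figures from this post):

$$19 \times 19 = 361 \ \text{points} \quad\text{vs.}\quad 8 \times 8 = 64 \ \text{squares}, \qquad \frac{361}{64} \approx 5.6,$$

and the number of legal moves available on a typical turn averages roughly 250 in Go against about 35 in chess.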

I discussed aspects of depth and computer advances a year-plus ago. Despite my hedge there against expecting the long timeframe obtained by extrapolating my “Moore’s Law of Games,” I must say I expected Go to last at least to 2030. The “nonlinear jump” has apparently come from actuating the human approach through multiple layers of convolutional neural networks. This has produced many human-savvy moves, but also some “inhuman” stunners have come from AlphaGo’s go-ke (the Japanese term for the jar of stones) with devastating effect.
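
To make the idea concrete, here is a minimal sketch of such a policy network in PyTorch. The layer sizes, the three input feature planes, and the class name are invented for illustration only; AlphaGo's actual networks are far deeper and use many more hand-crafted input planes, as described in the DeepMind paper.

    # Minimal sketch of a convolutional policy network for Go move prediction.
    # Everything here (sizes, planes, names) is illustrative, not AlphaGo's design.
    import torch
    import torch.nn as nn

    class TinyPolicyNet(nn.Module):
        def __init__(self, in_planes=3, channels=32):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(in_planes, channels, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, 1, kernel_size=1),  # one logit per board point
            )

        def forward(self, x):                    # x: (batch, planes, 19, 19)
            logits = self.body(x).flatten(1)     # (batch, 361)
            return torch.softmax(logits, dim=1)  # probability over all 361 points

    # Toy usage: encode a position as planes (own stones, opponent stones, empties).
    net = TinyPolicyNet()
    position = torch.zeros(1, 3, 19, 19)
    probs = net(position)
    print(probs.shape, float(probs.sum()))       # torch.Size([1, 361]) 1.0

The real system, per the paper, first trains such a network on a large database of human master games and then refines it by self-play, pairing it with a value network and tree search.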

Not Just the Ear Was Red

As with the “Immortal,” “Evergreen,” and “Shower of Gold” games at chess, Go has its own lore of historical games with evocative names: “Blood on the Board,” “Reddened Ear,” “Atomic Bomb” (which was continued after the Hiroshima blast damaged the venue and injured spectators), and Lee Sedol’s own “Broken Ladder” victory. The “Reddened Ear” came in 1846 when a champion realized after his opponent’s surprising center move that he was in danger on two fronts.

I was watching the second game when AlphaGo with the black stones played a move that the master commentator, Mike Redmond, thought was a mis-transmission—see his reaction 30 seconds from the video point here:

[Diagram: Game 2 position around AlphaGo's surprise move — modified from game replay source]

Sedol had just played his white stone to Q11 to stake out territory in the east. Redmond opined that a “more normal” reply would have been P7—to support the posse of black stones at lower right and contemplate disputing the land claim by riding out to R8. But AlphaGo played at P10. Perhaps all of Sedol turned red as he left his chair despite his own clock running down, and he did not reply until over 15 of his remaining 95 minutes had elapsed.

My own first impression was that P10 was a beginner’s move, allowing White to firm up the whole right side with Q10. In Go it is considered most valuable to own the corners and the edges—indeed well over half the points are within 3 of the edge. However, this move also had influence toward the center and north. Sedol didn’t play Q10 after all—he felt he had to defend the north by P11 instead, and in the closing phase he was forced back two whole rows, to cells S10 through S7, from what he could have had by walling from Q10 to Q7 if left undisturbed.
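
A quick count backs up that parenthetical, reading “within 3 of the edge” as the first three lines on each side (my reading; the post does not spell it out):

$$19^2 - 13^2 = 361 - 169 = 192 > \tfrac{361}{2} \approx 180.5,$$

about 53% of the points; counting the fourth line as well gives $361 - 11^2 = 240$, roughly two-thirds.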

In analytical terms, P10 did not win the game—this expert commentary page says that Sedol was ahead for several stretches afterward. But in human terms it was a blow, much as my own work indicates that Garry Kasparov played 200 rating points below his usual form against Deep Blue. In the game’s last quarter, Sedol had to play on one-minute grace periods called “byo-yomi” and seemed to stumble for lack of time to his defeat on move 211. On the other hand, AlphaGo’s own evaluation was said to be a steady advantage over that phase, and whom could we ask to verify the human-commentator opinions now besides AlphaGo? Chess programs have found many errors in classic game commentaries.

All this leads us to ask, what’s in store humanly when the duel resumes at 11pm ET tonight?

Slim Chance or Fat Chance?

“Slim” is a common Old West cowboy name, and was chosen by the actor Slim Pickens. However, a gunslinger avatar of AlphaGo would have to be the opposite: phagó is Greek for “eating,” so AlphaGo might translate to “Fat Al.” As in fatal—for any human opponent. Does Lee Sedol stand a chance in any of the remaining games?

We will find out overnight Saturday, Sunday, and Tuesday US time, which is afternoon time in Seoul where the matches are being played. The YouTube links and times for live video commentary have already been fixed on the Google DeepMind channel; here they are individually:

  • Game 3, video start 10:30pm EST Friday;
  • Game 4, video start 10:30pm EST Saturday;
  • Game 5, video start 11:30pm EDT Monday.

The games start a half-hour in. The page also has the saved feeds and shorter 15-minute summaries of the first two games. A second place for live commentary in English is the American Go Association’s YouTube channel.

Unless Sedol can find a silver bullet, it may be the Tombstone for human supremacy at strategy games. But it already comes with a silver lining: Go programs that relied primarily on exact calculations had stayed far below professional human players even after Monte Carlo self-play was introduced ten years ago. AlphaGo uses both—indeed the paper shows six interlinked components—but the main fillip comes from imitating the learning process of the human mind. This shows concretely that our brains embody computing efficacy that is not simply replaced by calculation in overt numeric and symbolic terms, per argument here. This also underscores what a tremendous achievement this is already for the Google DeepMind team.
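
To give a flavor of how the learned policy and the Monte Carlo search reinforce each other, here is a simplified, hypothetical sketch of policy-guided move selection in the PUCT style. The class, the function name, the constant c, and the toy numbers are mine, not the AlphaGo paper's, though the tiny 0.0001 prior echoes the figure quoted for the “P10” move in the comments below.

    # Sketch of policy-guided Monte Carlo tree search selection (PUCT-style).
    # Illustrative only: constants and numbers are invented, not AlphaGo's.
    import math

    class Node:
        def __init__(self, prior):
            self.prior = prior        # move probability from the policy network
            self.visits = 0
            self.value_sum = 0.0      # accumulated results of evaluations/rollouts

        def mean_value(self):
            return self.value_sum / self.visits if self.visits else 0.0

    def select_child(children, c=1.5):
        """Pick the move maximizing mean value plus an exploration bonus
        scaled by the policy network's prior probability."""
        total = sum(ch.visits for ch in children.values()) + 1
        def score(ch):
            return ch.mean_value() + c * ch.prior * math.sqrt(total) / (1 + ch.visits)
        return max(children.items(), key=lambda kv: score(kv[1]))[0]

    # Toy example: a move with a tiny prior can still win out once the
    # search's own value estimates for it look good enough.
    children = {"P7": Node(0.35), "P10": Node(0.0001)}
    children["P7"].visits, children["P7"].value_sum = 50, 24.0
    children["P10"].visits, children["P10"].value_sum = 5, 4.0
    print(select_child(children))    # prints P10

The point of the bonus term is that the search spends its effort mostly where the policy network looks first, yet a move the network nearly dismissed can still rise to the top once its sampled value justifies it.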

Open Problems

To date there has not seemed to be any reason in Go to hold top-level freestyle tournaments in which multiple human and computer players consult as a team. Will there be now, and will the combination prevail over computers playing alone, as it did for chess, at least for a while?

Update (10:50pm 3/11): The Game 3 AlphaGo broadcast has begun with a lengthy discussion/interview about the P10 move in which an AlphaGo team member explained how it came about. (3/12): AlphaGo won game 3 most convincingly per the GoGameGuru commentary page. Great article by Albert Silver covering the games and comparing chess and Go programs.

Update (3/13): Lee Sedol won Game 4. Go Game Guru commentary. (3/14): Today’s update to the GoRatings.org list does in fact show Lee Sedol at #5, with AlphaGo taking his place at #4.

Update (3/15): AlphaGo won Game 5 (commentary). After the game, the GoRatings list has it at #2, still behind Ke Jie of China at #1, and with Lee Sedol holding at #5.

[some minor word changes, fixed Friday time, added AGA channel, linked previous post re “silver lining” argument at end]

19 Comments
  1. March 11, 2016 7:32 pm

    💡❗⭐😎❤ exciting/ “gamechanging” stuff and glad you guys weighed in on the match. am excited about the definitive wins, beautiful moves, the remaining games, & think it’s the way coolest contest to come around in a long time. hassabis/ deepmind have produced some really world class research/ breakthroughs almost without any peers. outstanding! see also battle of the brains midmatch pause: alphaGo 2-0 over Sedol

  2. March 11, 2016 9:12 pm

    I’ve been watching too … is the statement “the duel resumes at midnight ET tonight” a typo? Shouldn’t tonight’s (Friday’s) time instead be 11:00 pm ET?

    • March 11, 2016 9:24 pm

      Ah. I read one source that said midnight this morning when I started, and then found the feeds page saying otherwise and forgot what I’d written. I have to triple-check now… yes, 11pm.

    • March 11, 2016 11:27 pm

      Thank you, Ken, for this great GLL column. With the game just starting (as I write this), I’m wondering whether the faster combinatoric explosion of go’s move-tree relative to chess’s move-tree is working in favor of AlphaGo?

      In chess, top-level humans can hope to play sufficiently long sequences of sufficiently accurate moves, as to sustain positional equality all the way to the end of the game.

      Whereas in go, humans have a vanishing chance of playing a sufficiently long sequence of sufficiently accurate moves to sustain positional equality, relative to AlphaGo’s capacity to do the same.

      Such reflections are both exciting and disquieting, in that real-world search problems like “find a proof” or “write a sonnet” seem (to me) more like go than like chess.

      What’s next? AlphaLimerick? AlphaFreud?🙂 ?😦

      • March 12, 2016 12:28 am

        Thanks also, John. How I would begin to answer depends first on whether the human commentators’ verdict of oscillating fortunes in the first two games is borne out by subsequent analysis. ‘Yes’ might mean that Sedol lost by being unable to sustain an accurate cadenza as you say. ‘No’ might mean that AlphaGo is near-perfect. In another sense your reasoning is certainly sound: in Go (with 6.5 or 7.5 komi) there are basically no draws, whereas my justification for 3600 being a plausible ceiling Elo rating in chess is that weaker players can hang grimly on for draws via bursts of accuracy for the 1%-5% needed to keep perfection to that level numerically.

      • March 12, 2016 3:30 am

        The game just finished … it was like watching a bulldozer (AlphaGo) slowly pushing a mouse (Lee Sedol) over a cliff.😦

      • March 13, 2016 11:34 am

        Now, after Lee Sedol’s 4th game, I feel much better!🙂

      • March 13, 2016 12:20 pm

        Me too! Last night I went to bed after 40 minutes feeling good about the opening, though that was before Sedol’s risky invasion at turn 40, and I read in commentary that AlphaGo was ahead for a while. Whereas after 90 minutes of game 3 I sent e-mail to Dick and a couple of others saying I thought Sedol was already lost and was going to bed.

  3. March 12, 2016 1:40 am

    Naive question: how much of AlphaGo’s knowledge would transfer if the game were changed to 21×21? or 17×19?

    • Michael Brundage permalink
      March 12, 2016 11:46 am

      All of it. The game would need to be modified to be asymmetric before the reinforcement learning approach breaks down.

      • March 12, 2016 1:45 pm

        But (as Ken says below) there wouldn’t be a huge database of existing games for it to learn from. It could learn from self-play, of course, but one might hope that humans would do a better job of guessing which points, joseki etc. still work on a different board size. Are some of its notions of shape translation-invariant? (My question is motivated by Kasparov’s proposal of varying the starting position in Chess.)

  4. Jeffrey Shallit permalink
    March 12, 2016 5:34 am

    This shows concretely that our brains embody computing efficacy that is not simply replaced by calculation in overt numeric and symbolic terms.

    Say what? What magical computer program is able to do calculations other than in numeric and symbolic terms?

    • March 12, 2016 9:18 am

      Jeff, the reference is to the discussion in the “Learning and Explaining” section here. A big brain map might reveal the symbolic core but we don’t have it yet.

      Cris, good question, insofar as AlphaGo bootstrapped off a large database of human master games and uses it to set its initial probabilities. The discussion at the start of the Game 3 video which I linked in the update, however, says that the initial human probability assigned to the stunning “P10” move was 0.0001.

      • Jeffrey Shallit permalink
        March 15, 2016 5:26 am

        Nevertheless, I don’t think your claim follows at all from the evidence presented. All calculations were presumably done on a computer that uses, as basic operations, nothing other than the usual numeric and symbolic operations. Even if some new kind of computer were used, that doesn’t show anything about the brain; it just shows that one particular activity could be simulated using this new method.

  5. Bill Gasarch permalink
    March 14, 2016 11:15 am

    My impression is that chess-playing computers got better gradually over time, and hence the beating of Kasparov was not too much of a surprise, whereas AlphaGo doing well caught people by surprise; the increase was much more sudden. Is this true? If so, why the difference? Also, would the techniques used for AlphaGo also work with chess-playing computers?

    Computers are getting better, yet half of my comments to this blog end up being spam filtered. Let’s see if this one does.

  6. Serge permalink
    March 14, 2016 6:02 pm

    I feel relieved Sedol won the fourth game. I was already preparing for an invasion of mankind by machines. However, that might happen someday, once they’re able to communicate with one another without us knowing. Maybe they’ll devise a plan for getting rid of us all…

  7. Peter Gerdes permalink
    March 15, 2016 11:06 am

    It’s interesting to compare the human-level performance of AI in strategy games with its abysmal performance in automated theorem proving. Even the best, most complex automated theorem-proving systems struggle to establish even the most basic results, despite the greater economic and academic incentives to produce such a system.

    What makes the difference? I don’t know but let me throw out a few possibilities:

    1) The truly epic number of steps required to formalize even basic proofs.

    While there is obviously a strategic aspect to games like go and chess, part of what makes them entertaining is the lack of long chains of boringly obvious play dictated by strategic choices, while formal proofs are the exact opposite.

    2) The need to dynamically generate and employ new (potentially leaky/imperfect) ways of representing proof strategies without getting lost in details.

    3) The lack of any good measure of “position quality” to employ. There is no obvious analog to being up by several pieces since you don’t know beforehand what lemmas/subproblems would be helpful to prove the result.

    4) The lack of uniformity (such as that provided by playing against different adversaries) in theorems to be proved makes it difficult to evaluate progress. Solving a bunch of automatically generated variants of a theorem is not the same thing as proving theorems we care about.

    5) The lack of uniformity in problems makes it hard to tell if a heuristic for discarding unpromising moves is good or simply lucky enough not to fail in any of your test cases.

    6) Lack of human examples/practice in formal proofs blunts our intuition and limits training data.

Trackbacks

  1. The Primes Strike Again | Gödel's Lost Letter and P=NP
  2. The World Turned Upside Down | Gödel's Lost Letter and P=NP
