Chess and draws and explainable AI

 Guardian Live source

Magnus Carlsen and Fabiano Caruana have drawn the first eleven games of their world championship chess match. One game of the regular match is left for Monday. If it, too, is drawn, then there will be a faster-paced tiebreaker series on Wednesday. Update 11/26: It was drawn.

Today we discuss the match and some of its implications for computers and explanations.

That the match is tight is no surprise. Carlsen came in with an Elo chess rating of 2835 and Caruana with 2832. This is the smallest rating difference in world title matches since the Elo system was adopted. They also have the highest average rating. We discussed last spring whether Caruana could be said to be “in form” but what is undeniable is that he has had some sensational results these past five years. With Carlsen down from his Olympian rating of 2882 in May 2014, he wore the mantle of champ but not of favorite.
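To put the three-point gap in perspective, here is a minimal sketch using the standard Elo expected-score formula, $E_A = 1/(1 + 10^{(R_B - R_A)/400})$. The function name `elo_expected_score` is illustrative only. Expected score counts a draw as half a point, so it says nothing by itself about how many games will be drawn.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Pre-match ratings quoted in the text above.
carlsen, caruana = 2835, 2832

per_game = elo_expected_score(carlsen, caruana)
print(f"Carlsen expected score per game: {per_game:.4f}")
print(f"Carlsen expected points over 12 games: {12 * per_game:.2f}")
```

On these ratings the formula gives Carlsen roughly 0.504 points per game, or about 6.05 points out of 12, which is as close to a coin flip as a title match has come under the Elo system.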

So no one expected a turkey shoot. But no one expected a record string of draws at the start of a championship match either. The only longer string has been 17 draws in the middle of the infamous marathon between Anatoly Karpov and Garry Kasparov in 1984–85, which was aborted after 48 games. Expenses and the speed of life do not allow such long matches anymore, so if neither player grabs the crown tomorrow, Wednesday will bring the icosiad to a definite close. Icosiad, from Greek for “twenty,” is my attempt at a word for “three-week fortnight”—though Donald Knuth used the word to mean ${10^{60}}$. It is neat to see chess push elections, politics, and sports completely below the fold at FiveThirtyEight:

## The Play

Most of the games have been hard fought and not boring. Both players have had chances to win several games. Carlsen was by all accounts “completely winning” in game 1 but sold himself short in the moves before the turn-40 time control. He tried to convert a one-pawn advantage for 75 more moves but there is a reason for the proverb, “All rook endings are drawn.” In game 6, Caruana had a technically winning position but only by means neither player suspected—we will discuss this more below. In games 8 and 9 both players spoiled chances, first Caruana by a timid pawn move h2-h3?, then Carlsen by a rash advance h4-h5?

Despite these chances given and missed and some other moves that both humans and computers would mark ‘?!’ for “dubious,” the standard has been high. My chess model, which I discussed again recently in a preview of the match, gives a rating of 2875 +- 80 for the quality through 11 games. It gives Carlsen an “Intrinsic Performance Rating” (IPR) of 2895 +- 105, Caruana 2860 +- 125. (Update 11/26: After the 12 regulation games, Carlsen 2880 +- 105, Caruana 2850 +- 125, combined 2865 +- 80. Compare historical IPR data here.) It is well within the margin of error to say that they have been playing at equal level and each has brought his “A-game.”
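The "within the margin of error" claim can be checked with a rough calculation. The sketch below is a generic rule of thumb, not the IPR model's own procedure. It assumes the two +- figures are quoted at the same confidence level and that the two players' errors are independent, so that the margins combine in quadrature. The helper name `gap_within_margin` is hypothetical.

```python
import math


def gap_within_margin(est_a, margin_a, est_b, margin_b):
    """Compare the gap between two estimates with their combined margin.

    Assumes independent errors quoted at the same confidence level,
    so the margins add in quadrature (an assumption, not from the post).
    """
    gap = abs(est_a - est_b)
    combined = math.hypot(margin_a, margin_b)  # sqrt(margin_a**2 + margin_b**2)
    return gap, combined, gap <= combined


# IPR figures through 11 games, as quoted above.
gap, combined, ok = gap_within_margin(2895, 105, 2860, 125)
print(f"gap = {gap}, combined margin = {combined:.0f}, within margin: {ok}")
```

Under these assumptions the 35-point gap is far smaller than the combined margin of about 163 points, so the figures give no grounds for ranking one player's accuracy above the other's.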

My IPRs are still primarily measures of accuracy. I have written several posts explaining difficulties encountered in trying to extend the model to measure challenge created. The drawn results notwithstanding, both players have shown enterprise and shouldered risk.

Yesterday’s game, however, was an exception. Carlsen as White played into an early trade of queens, which was soon followed by “hoovering” all the pieces except a pair of opposite-colored bishops. Those tend toward draws so strongly that Caruana could afford to shed a pawn to block White’s play on the queenside, and then only needed to know “one trick” to hold White off on the kingside.

I agree with those saying Carlsen is angling for the playoff for reasons similar to what was said about Carlsen’s anodyne play as White in the final regular game of the tied match against Karjakin two years ago: A four-game playoff affords time to recover if a risk goes awry in the first game, whereas tomorrow’s winner-takes-all conditions do not. Moreover, I perceive impetus to maximize the time of staying in contention over the chance of winning. This is similar to why in NFL football there is less regret in kicking an extra point to tie rather than the win-or-lose option of going for two, even though the NFL recently made extra points more difficult so that going for the abrupt end brings better odds.

Still, a difference from 2016 is that Carlsen has Black in tomorrow’s 12th game. Will Caruana try to force the issue, and will the champion face a comeuppance? We will see. Carlsen was rightly considered the heavy favorite in the 2016 tiebreaks, and my own IPR measurements showed no dropoff in his quality at the faster pace. My own opinion is that the tiebreaker chances would be as even as the match so far.

Update 11/26: Caruana did repeat his “fighting” opening from previous games but got outplayed in moves 15–24. Then Carlsen by his own account took his foot off the pedal and invited the tiebreaks. What I already wrote above is consistent with Kasparov’s take. (Incidentally, when twelve straight strikes occur in bowling, the last three are called a “turkey.”)

## Endgames and Engines and Explanations

Here is the “one trick” from the end of yesterday’s game. White has just played 51.f2-f4 with the obvious intent of 52.f4-f5 to disturb Black’s pawns.

A natural reaction would be for Black to play his bishop to e6 to cover the f5 square a second time. But this is the one thing Black must not do. White plays 52.f4-f5! all the same. If Black captures with the pawn, 52…g6xf5, White’s h-pawn has a clear path to queening on h8 after 53.h4-h5. If Black captures with the bishop, 52…Be6xf5, White has 53.h4-h5! anyway. Black cannot capture the pawn on pain of unguarding the bishop, so White’s pawn will run free. Black can resist by 53…Bf5-c2 54.h5-h6 g6-g5 to stop 55.h6-h7 right away, but White has the pleasant choice between 55.Kg7, when the bishop will be lost, and feasting on Black’s pawns by 55.Kxg5 or 55.Kxf7 first.

Instead, Caruana calmly tacked with 51…Bb3-a2 and the trick (after a switchback feint 52.Kf6-e7 Ba2-b3 53.Ke7-f6 Bb3-a2) was 54.f4-f5 Ba2-b1! Now after 55.Kf6xf7 Bb1xf5, White’s king is no longer touching Black’s bishop so there is no sting in 56.h4-h5, while after 55.f5xg6, Black’s bishop supports recapturing with the pawn on f7 to hold the fort. Carlsen tried 55.Bf2 and again Black must resist the temptation to take on f5. Caruana played 55…Bc2! and they shook hands for the draw.

Not only does this explain the draw in the diagrammed position, it turns aside what was really White’s only winning try in the whole endgame from a dozen moves back. Thus it suffices as a humanly-understood explanation of the whole endgame. Master annotators at several commentary websites have not felt a need to say anything more.

The sixth game, however, brought an announcement of checkmate that no one alive saw coming:

It came from a supercomputer named Sesse running the chess engine Stockfish: Black has checkmate in 36 moves beginning 68…Bg5-h4! At first this looks suicidal since after 69.h5-h6 Black’s king is cut off and his knight and bishop look far from stopping the pawn. But after 69…Nd4-f3 (or 69…Nd4-c6) 70.h6-h7 Nf3-e5+ 71.Kg6-h6 Bg5+, White’s king is evicted and after 72.Kh6-h5 Kf8-g7 73.Bc4-g8 Kg7-h8, the compulsion to move (called Zugzwang) forces White to unguard the pawn since his king is frozen.

So White covers the c6 and f3 squares by 69.Bc4-d5 and after 69…Nd4-e2 both guards his h-pawn and puts a question to Black’s knight by 70.Bd5-f3. Now if you haven’t already heard about this, see if you can not only come up with Black’s winning move but the plan of why it works:

Besides looking below, you can find the answer here where Garry Kasparov opines:

“[H]ad Caruana played the incredible [moves], they would request metal detectors immediately! No human being can willingly [play] like that.”

AlphaZero reportedly did not see the mate either.

## Computer-Assisted Explanations

Marc Rotenberg, who is the president and executive director of the Electronic Privacy Information Center (EPIC) in Washington, D.C., voiced a further opinion of wider import:

Last week in London, I reported from the World Chess Championship (WCC) that computers are very good at telling you what to do, just not so good at telling you why to do it. That is now clear after the remarkable game 6 (2018/6) which found Caruana with a winning position that no human player could pursue, and a computer solution that no human player could comprehend…Of course, the discussions about computers and chess have been ongoing for many years. But there is a greater relevance today. With many governments now thinking broadly about the implications of AI for social and economic policy…I would like to suggest WCC 2018/6 is a cautionary lesson, an example of a solution we could not find, but also one we may not understand.

There is however a further wrinkle, which is that computers can help us generate humanly understandable explanations. I believe this is a case in point—not only to explain why Black wins here but also how and why White was safe until voluntarily moving the king from h7 to g6 at turn 67. The first point of 70…Nf3-g1!! is to buy time after 71.Bf3-g4 Kf8-g8! to block White’s h-pawn. Owing to Zugzwang, White must either cede ground or release the trapped knight, both of which are fatal. White can head off Black’s king by 71.Bf3-d5 instead, but 71…Bh4-g5! wrong-foots White: 72.Kg6-h7 Ng1-e2 and a fork will come on g3 or f4; or 72.Bd5-c4 Ng1-h3! 73.Kg6-h7 Nh3-f4 and White’s bishop is boxed out of e2, so White’s h-pawn must come forward to its eventual doom.

As for the way to draw, the main insight is that White’s king must keep Black’s king out of the corner. In the game, Caruana worked to bring his king to the center—as human players find natural—but the first key to the draw is that White struck back with his h-pawn in time. My analysis gives this as a pivotal position:

White must play Kh7-h8! The second and third keys to “why?” are holding the corner and keeping Black’s knight from reaching e2, from which it has a g3-or-f4 option that White cannot cover. Now I must admit that my identifying these three keys as the explanation of how to draw rests on some hours of exploration with multiple chess engines to satisfy myself that no loopholes exist—no incredible maneuvers that can stretch White beyond breaking. Technically, I should generate a “proof by corresponding squares”—to the rigor of this one I did for the famous Kasparov-World 1999 Internet Match—that White can always cover all threats. But I am quite convinced that the chess content does not go too far beyond this explanation, and my fellow master annotators rested their cases similarly.

I did that 1999 analysis without computer aid and that proof was later upheld 100% by the compilation of exhaustive 6-piece tables. Now I would not be able to spare time for a proof without computers. I did not however simply take a verdict from the computer running unaided. I used the computer to explore options until I was humanly satisfied that everything important had been checked. Thus I offer WCC 2018 game 6 as an example of using computers to establish human explanations on sure ground. Explanations gleaned from exhaustively tabled endgames appear in several books by John Nunn, a similar series by Karsten Müller and tablebase compiler Yakov Konoval, and for two extreme 7-piece endgames in the book Stinking Bishops by the British endgame problemist John Roycroft.

## Open Problems

Who will win the match? Will either player take risks to try to win tomorrow’s game? Or are we set for a playoff? It is worth noting that if the players keep drawing even through faster and faster tiebreak games, a winner will still be declared in a final game with so-called Armageddon rules, whereby White gets more time but a draw on the board is a victory for Black.

How can computers be made to help with explanations as much as they may race beyond them?

[added links for Game 6 analysis, mention of AlphaZero, books on chess endgames, and some other minor word fixes and changes, updated for the game 12 draw.]

November 29, 2018 2:14 am

Ken, have you seen the blurb for the forthcoming book about Alpha Zero? Link: https://www.newinchess.com/en_US/game-changer

They say they have been trying to reverse engineer Alpha Zero’s neural net evaluations to find out what they mean in human terms.

November 30, 2018 9:41 am

Hello. Just curious if you tested the IPR model by excluding opening/“book” moves from the games.

• December 1, 2018 12:07 am

I remove book when doing cheating tests—and then I also report IPRs without book moves—but the standard benchmark for IPR includes book. There are three main reasons:
(1) IPR considers what comes (only) out of your brain—with no judgment of how it got into your brain;
(2) Policy (1) puts all games on an equal footing. Doing so fosters not only historical comparisons but ones across recent years.
(3) Removing book is a time-consuming step needing manual fallback. I have a project on AWS to try to automate it better than others have.
One might feel (1+2) are unfair to masters of old who didn’t have access to opening books and computer prep, but even if one wishes to compensate, I think it better to do so by adjusting after data has been quantified via the “standard candle” of equal treatment.

December 6, 2018 3:50 pm

Regarding computer analysis of why rather than just what – are you familiar with the “Decode Chess” software (website https://decodechess.com) ?

December 8, 2018 1:31 pm

Thank you for the article. If I may make a small criticism: might you consider using more concise notation for describing games? Consider the widely used notation 70… Ng1!! instead of 70… Nf3-g1!! There is no additional information added by the latter notation if one takes the initial presented position as given, and it is less readable when following long move sequences.

• December 8, 2018 1:38 pm

Thanks. I have done so on previous chess posts which you can find via the “chess” tag on this one. I avoided doing so here because in the example of analysis with multiple variations from game 11 (which happened after I started writing the post—!), I thought it was important to remind readers where the pawns and Black’s bishop were coming from. In fact, I see I reverted to the shorter style for some of White’s King moves.

Also thanks, Ehud—I was not familiar with it and have not taken time to explore it yet, but it is a great effort and highly thematic with the post.