# Stealing Strategies

* How constructive are strategy stealing proofs? *

David Gale was a famous mathematician and economist, who passed away just over five years ago. I had the honor of meeting him while I was on the faculty at Berkeley years ago; Gale spent most of his career at Berkeley, ending it as a professor emeritus.

Today I want to talk about his famous proof that there is a winning strategy for the game Chomp.

The proof is based on the brilliant idea of strategy stealing, which is due to John Nash. It is usually considered a *nonconstructive* proof method, but we will argue that there are some situations where it is quite constructive. At least it is constructive in the spirit of complexity theory.

## Chomp

Gale invented many games, but the coolest is, in my opinion, called Chomp. It is based on eating chocolate. Chocolate is the most important food group—this is not a opinion, but a fact.

Chomp is a two-player strategy game played on a rectangular chocolate bar made up of smaller square blocks (cells). The players take it in turns to choose one block and “eat it” (remove from the board), together with those that are below it and to its right. The top left block is “poisoned” and the player who eats this loses. Poisoned is pretty serious and does not make one play the game lightly, nor be able to play it repeatedly. I prefer to say that the corner is make of chocolate that is very bitter and the rest is milk chocolate. Since I do not like bitter dark chocolate, having to eat this corner would be not very much fun.

Gale invented, named, and analyzed the game of Chomp. Earlier an equivalent game had been published by Frederik Schuh in his book called “The Master Book of Mathematical Recreations.” He did this in 1943, and although this was 70 years ago, the book is now a Dover book and is available.

Math is like this, as we have pointed out many times: fame often goes to the last person to discover something, rather than the first one. We could call this *Gauss’ Law*: but he is the exception, since he often gets credit even thought he discovered something first and yet did not publish it at all. This is a topic for another day.

Here is a game played on a bar as shown by Wikipedia. As usual we will name the players Alice (A) and Bob (B). I should add that Schuh in his book did use names for players: his were John and Peter.

Note at the end Alice must eat the last bar and so she loses. Since we know, thanks to Gale, that the game is winnable by Alice, she must have made some error in her play.

There is a wonderful site where you can play the game on-line. Try it. I played randomly in the start and then switched to thinking at the end and won as the first player. So I am 1-0, and will not play the machine again.

## Strategy Stealing

Gale’s argument is this:

It turns out that for any rectangular starting position other than the first player can win. This can be shown using a strategy-stealing argument: assume that the second player has a winning strategy against any initial first-player move. Suppose then, that the first player takes only the bottom right hand square. By our assumption, the second player has a response to this which will force victory. But if such a winning response exists, the first player could have played it as his first move and thus forced victory. The second player therefore cannot have a winning strategy.

The idea of strategy stealing is due to Nash in 1949. He applied it to a different game called called Hex. Gale’s argument is clearly based on Nash’s idea, but is slightly different.

## How To Steal Some Algorithm

Assume that Bob has a winning strategy for Chomp that is based on a polynomial time algorithm. We do not care if it is fast or slow, but his strategy that Alice plans to steal is based on an algorithm. Then *Alice can beat Bob as they play a series of larger and larger games, for all but a finite number of games.* Moreover, Alice takes time that is order the same as Bob’s algorithm—that is, her running time is up to a constant slower than Bob’s.

Let’s look and see how she does this.

## Alice’s Basic Strategy

Here is an obvious method for Alice that almost works. We know that Bob has a strategy that runs in polynomial time. Let be a listing of all the TM’s that run in polynomial time. Alice selects and begin to play against Bob. As long as she wins, she keeps using . If she ever loses, she then switches to . Recall that they keep playing on larger and larger boards.

We claim that this strategy for Alice will win all but a finite number of games and will run in polynomial time. The latter is direct from the definition of the machine list. To prove the first note that when Bob plays first he uses some for his winning strategy. So in the worst case Alice moves from to to until she selects .

Thus she can only lose at most a finite number of games, since she changes her program only after a finite number of games. The argument is similar to that in our recent post about how satisfiability becomes easier if the world lacks randomness.

## Can Alice Steal Bob’s Algorithm?

The above strategy for Alice insures that she loses at most a finite number of games. But it does not guarantee that she will eventually be playing as fast as Bob’s algorithm . She might stop earlier with some that is a winning strategy but is very slow. Bob’s strategy may run in time, but Alice may settle on an algorithm that runs in . This clearly is not satisfying the claim that Alice will eventually runs about as fast as Bob’s best first player strategy.

So how does Alice fix this? I have an idea that may work, but have not been able to carry it out completely. Perhaps you can help. My conjecture is that Alice should be able to create a slightly more clever strategy so that she wins most games, but also eventually runs essentially as fast as Bob’s best strategy.

I asked Ken—who is at his anti-cheating meetings in Paris—to try without telling him my idea, and he came up with this: It is important that the games are played with larger and larger box sizes . At any time Alice maintains a list of machines that have not yet lost a game, where is unbounded but grows slowly with . She simulates each machine, and waits for the first one to finish. Note that will always be one of the machines. If a faster machine loses she rubs it off the list. The idea is that eventually she plays faster and faster correct machines.

The one hitch is that as Alice slowly adds new machines to her list, she must risk the possibility that a new machine is faster—so she plays its move—but it is erroneous. Thus she might lose an occasional game infinitely often. However, if Bob’s algorithm is the fastest one asymptotically, for large enough it is the fastest concretely and it always gets played. The problem comes in if the Chomp game has no well-defined best algorithm, but instead an infinite succession of faster ones. In this we don’t know what is the best we can prove—and we are not sure either how Alice can avoid both losing infinitely many games and getting stuck with a definitely sub-optimal running time.

## Open Problems

Does this strategy work? Any ideas on how to make this into a full proof? That is: Can Alice steal Bob’s algorithm, or one with almost as good running time?

How can Alice enumerate all polynomial time TMs?

One way to avoid needing to is to run M_1…M_k for n^k steps each, then increment k and start back at 1.

How can Alice decide whether an arbitrary TM is a polynomial-time one?

It doesn’t need to. Alice enumerates Turing machines alongside a time bound; if the machine doesn’t terminate within the time bound, declare the output to be some fixed output (e.g. the bottom-right square or the top-left square).

Steven, how can Alice decide whether an arbitrary TM is a winner one?

That strategy stealing is a constructive or non-constructive proof?

As far as I know, she can’t. But here’s my idea, more clearly:

Let M_i be the ith Turing machine. (This includes all Turing machines, not just polynomial-time ones.)

For each game, Alice’s strategy is parametrized by (i, k). She runs M_i for up to (m+n)^k steps on each turn, where (m,n) is the size of the chocolate bar. If M_i doesn’t finish, she forfeits.

Assuming that there is a polynomial-time winning strategy, some (i,k) will win all games. (Technicality: this isn’t true for m=1,n=1, but that case doesn’t matter.) Therefore, as long as Alice eventually chooses an appropriate (i,k) and sticks with it, she only loses a finite number of games.

She can do that by an appropriate new choice of (i,k) each time she loses. There are lots of choices that work — all she needs to do is to make sure that, for each (i,k), she tries (i,k’) for k’>=k after a finite number of losses.

That process in the post maybe works, but only on an infinite listing of all the polynomial-time TMs, since into an infinite listing of all arbitrary TMs, there can be infinitely many TMs before the first polynomial-time one appears.

André : your last statement (“there can be infinitely many TMs before the first polynomial-time one appears’) isn’t correct. The key is that since the number of TMs is _countable_, it has an enumeration in order type ω (indeed, this is really the definition of ‘enumeration’) – and in such an enumeration, since every TM appears *eventually*, its index in the enumeration is finite – there cannot possibly be infinitely many TMs before the first polynomial-time machine, because there aren’t infinitely many TMs before *any* TM in the enumeration.

(This is essentially just a semantic argument that ‘put all the even numbers first, then all the odd numbers’ isn’t an enumeration of the whole numbers, as enumeration is defined; in any ‘proper’ enumeration of the integers, you can talk about where the first odd number appears, and while for any *given* n there’s an enumeration of the integers where the first odd number appears after n, every enumeration of the integers has its first odd number appearing at *some* finite position n.)

Steven, with arbitrary TMs, we can have a ω+ω infinite sequence: 0, 1, 2, … , ω, ω+1, ω+2, … , ω+ω, where, for instance, the “first” ω refers to all exponential-time TMs and the “second” ω refers to all polynomial-time ones.

But the core point is that this can’t happen if Alice has an

enumerationof TMs! Keep in mind that Alice is (essentially)choosingwhat order to take TMs in. As long as she chooses an enumeration ofallTMs — for instance, by choosing some fixed UTM and then choosing TMs in order of their size-of-UTM-code — then she’s guaranteed to get to any specific TM (and in particular, the one that implements the purported polynomial-time winning strategy in actual polynomial time) within finite time.Shouldn’t Alice pick the least-indexed machine instead of the fastest machine in the viable set ? Otherwise she would always be tempted by fast machines and lose i.o.

Reblogged this on Pink Iguana.

I don’t think you should say that Alice is stealing Bob’s algorithm here. The only assumption she is making is that such an algorithm

exists, not that Bob is using it.So how

wouldAlice steal Bob’s algorithm? Instead of trying to win, she would try to predict Bob. She discards any algorithm that does not correctly predict Bob’s moves for all positions seen so far. Since she assumes Bob’s algorithm never looses from a winning position, she can also throw out all all algorithms that loose, or that agree with a loosing algorithm on a previous game (on all moves after a known winning position).She maintains a set of candidate algorithms, refilled as needed from an enumeration of all TMs(not just polynomial). She runs each candidate in turn on all previous positions and the current position. Any candidate that incorrectly predicts Bob’s moves is eliminated. She continues until a candidate has finished for all positions without being eliminated, the first such candidate is the winner. Alice makes the winner’s move for the current position.

Naturally she would cache results and states, so she doesn’t need to do any runs more than once.

Alice doesn’t have to do all this every turn, she can re-use the winner from previous turns, as long as it isn’t eliminated, and doesn’t take too long. By reducing how often she searches for a new winner, she can get the benefit of the current fastest algorithm, but still find faster ones, if they exist.

Does Alice always lose finitely many times? If she only selects a new winner when when she eliminates the old one, yes, because she will eventually find an algorithm equivalent to Bob’s (which will never be eliminated). But the algorithm is not guaranteed to be optimal, or even polynomial.

Otherwise I’m not sure.

How can Alice decide when a TM is as good as the Bob’s one? (How can Alice decide she can stop and say “- Eureka! This TM is as good as the Bob’s one!”?)

Note that the comparison of only finitely many results from simulations cannot work in general.

How can Alice decide she can stop and say “- Eureka! This TM is as good as the Bob’s one!”?

She can’t. But any TM that isn’t will eventually make a bad move and be eliminated. Alice can find a TM that always beats Bob after a finite number of games, but she can’t tell when that has happened (I doubt it’s even decidable. Without the restriction to P, I know it isn’t).

I have thought about it a little more.

Alice can ensure that her algorithm is in P by running candidates for at most n^k steps. She initializes k to 0. When no candidates finish (without being eliminated) she adds a candidate from the enumeration and increments k. Since there is a polynomial algorithm, eventually k will stop increasing. (This was suggested for the original algorithm in another comment.)

As for finding the optimal algorithm, I don’t think it is possible, even if one exists.

To be assured of only finitely many losses, she must not change her “winner” (the algorithm that makes her actual moves) infinitely many times. There can(must?) be infinitely many fast algorithms that work for the positions seen so far, but not for the next one.

If she only searches for a new “winner” when the current one is eliminated or exceeds the time bound, that will only happen finitely many times, so finding a polynomial algorithm is OK.

But to find the optimal algorithm, she needs to try infinitely many candidates.

I think that question (I shall call it Hopeful Alice Problem – HAP) is decidable only if the number of games is fixed and given.

Otherwise, since Alice cannot decide when a TM is as good as the Bob’s one, how can she decide whether she would not lose infinitely many games after she stops? (If HAP is decidable, then at some moment she must do it; else, HAP cannot be decidable).

Let’s assume Bob’s alg is running in $O(n^2)$ and there is no way to implement it in $\Omega(n^2)$. Also let $M_o$ be the TM with lowest index which simulate Bob’s alg and $n_0$ is the smallest integer s.t. $s(n_0)\ge o$.

For any $n\ge n_0$, there are two cases:

1- There is no unrobbed TM among the first $s(n)$ Turing machines: Alice is fine;

2- There is at least one unrobbed TM among the first $s(n)$ TMs: Alice will eventually make a mistake (maybe not in this step, but in one the next few steps)

Since there are infinitely many TM with running time of $\Omega(n^2)$, there are infinitely many $n$ satisfying the second case and so, Alice will make infinitely many mistakes, and never finds $M_o$.