Sorting Out Chess Endgames
An approach to solving chess endgames using fast sorts
Hector Garcia-Molina is one of the top experts in databases, taken in a wide sense. He has done seminal research in almost all aspects of systems that manage data: from classic databases to web-based information systems, from the semantics of data to the security of data. If you are interested in any aspect of data, you will probably have to read one of Hector’s papers; you will probably also have to reference him.
Today I want to recall a project Hector and I ran in the mid-1980s, while we were both at Princeton University. The project was called the Massive Memory Machine Project. The main reason for discussing this old project is related to the recent post on chess endgames.
The goal of the project was to explore the power of large amounts of main memory. Our thesis was that even “slow” processors with huge amounts of main memory could beat parallel processors, provided the slower processors had much more memory than the parallel ones.
The project bought and constructed machines that had gigabytes of random access memory. In those days typical machines had at most a few megabytes of memory, so getting machines with gigabytes was non-trivial. One simple problem we encountered was that the standard machines did not even have enough address space to reach the gigabyte range. We got around these problems, but the machines cost us many millions of dollars to build.
It is a bit funny to recall that we paid millions of dollars for this amount of memory, when today you can buy it for tens of dollars and then put it in your pocket. But being over two orders of magnitude bigger than all other machines had its price. I guess it would be like having a machine with petabytes of memory today.
Hector is a pleasure to work with and I enjoyed our joint work very much. He is a tireless researcher. Two measures prove this: he has almost 400 published papers, and even more impressive, he has an H-score near 90. The H-score is claimed to measure research output more accurately than total citations alone. You have an H-score of $h$ provided you have written $h$ papers, and each of these papers has at least $h$ citations. The point of the H-score is to factor out one huge paper with a “zillion” citations.
Hector was at Princeton when I was being recruited to go there, many years ago. I still recall what he told me about Princeton. He was then a junior faculty member who had just graduated from Stanford. He pointed out that when he was deciding where to go (he had several offers) he selected Princeton for many reasons. But one reason he explained to me this way:
I noticed the semester at Princeton was only one week longer than the quarter at Stanford.
Hector was a great teacher, but his true passion was research, so having a short teaching semester was pretty neat. I have to say I went to Princeton for many reasons, but I never minded the length of the semesters.
One last story I must tell about Hector: he was always cold. I mean physically cold. Perhaps this is why he moved back to the sunny west coast; even during the Princeton summers he might wear a sweater.
One day a large company offered to give our department a number of their workstations. The machines were good at running Lisp, but did not support UNIX and so were not useful for our day-to-day tasks. The department asked who wanted one of these machines: I said no thank-you. But, Hector took one—to my surprise.
We were in his office one day chatting, when I noticed in the corner one of these workstations. It looked unused and untouched, but it was clearly turned on and running. I asked Hector the obvious question: why did he take one of the machines, since he clearly was not using it? Hector, always logical, said he never used the machine, but he knew they ran “hot”—they were known to have cooling problems. So he kept his on all the time, and used it like a space heater. He said he would have taken two if he could have.
Solving Chess Endgames
I believe solving the $n$-piece chess endgame problem can be reduced to performing many sorts on a very large file. The size of the file is a function of $n$: for six pieces it is about ten gigabytes, and for seven pieces it is less than a terabyte. Ken Regan suggests that it grows by a factor of about a hundred for each additional piece.
The connection with the project is that one of the problems a huge memory helps with tremendously is sorting. For now I want to explain the connection of the chess problem to sorting; in the near future, I plan to discuss the various methods and issues involved in performing extremely large sorts. Around 18 months back, Google ran a record-sized sort of one petabyte of data in 6 hours and 2 minutes across 4,000 computers. See this for more details. An example of the issues that arise in such large sorts is disk failures: Google used so many disks that disk failures during the sort had to be planned for.
Let’s look at the chess problem abstractly. The problem is this: let $G$ be a bipartite graph where the left nodes are the positions where white has to make a move and the right nodes are the positions where black has to move. These positions are constrained to have at most the required number of pieces. For ease of presentation I will make one assumption: the graph models a game with only wins and losses, no draws. This is not fundamental; it will just help make the following clearer, I hope.
Each node of the graph is labeled from the set {win, lose, unknown}, all from white’s point of view. Initially, all nodes are labeled unknown. Then, a pass is made over all the positions and as many positions as possible are labeled win or lose. This is done by looking at each position, since many will be immediate wins or losses.
The key is to propagate the information from the win/lose nodes to the unknown nodes. The rules are simple: consider a white node that is currently unlabeled. If it has an edge to a black node with the label win, then label the white node win; if all its edges go to black nodes with the label lose, then label the white node lose.
Similarly, there are the “dual” rules for the black nodes. Are these rules enough to label every node? Not exactly. The issue is that cycles are possible, but to make our point about using sorting to solve endgames we can safely put this subtle issue aside.
These rules are applied until all nodes become known or until resources are exhausted. Note, if there are nodes still labeled unknown, these positions may be a win or a loss. However, if we have applied the rules many times we can at least be sure that these positions cannot be resolved within a short number of moves.
The central question is: what is the cost of propagating the win/lose information to all the nodes? The obvious way is to cycle through the nodes and keep applying the rules. The problem with this approach, I believe, is the random-access nature of the lookups. Suppose $x$ is a white node. It may have edges to black nodes that are stored in arbitrary places. Thus, this could be very expensive. This applies even if the whole graph and labels are in main memory; on modern machines random access to memory can be expensive due to cache misses. A good description of these issues, and of efforts to cache the lookups, accompanies Harm Muller’s proposed 7-piece tablebase algorithm.
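To make the cost concrete, here is a minimal Python sketch of this obvious cycling approach; the data-structure names (white_succ, black_succ, labels) are my own illustration, not taken from any real tablebase code. Note how each rule application chases labels stored at arbitrary locations:

    # Naive fixed-point propagation with random-access lookups.
    # white_succ / black_succ: dict mapping each node to the list of its
    # successor positions; labels: dict mapping every node to its label.
    WIN, LOSE, UNKNOWN = "win", "lose", "unknown"  # all from white's viewpoint

    def propagate(white_succ, black_succ, labels):
        changed = True
        while changed:
            changed = False
            # White to move: a win if SOME successor is a win,
            # a loss if ALL successors are losses.
            for x, succs in white_succ.items():
                if labels[x] != UNKNOWN or not succs:
                    continue
                if any(labels[y] == WIN for y in succs):     # random access
                    labels[x], changed = WIN, True
                elif all(labels[y] == LOSE for y in succs):  # random access
                    labels[x], changed = LOSE, True
            # Black to move (the dual): a loss for white if SOME successor
            # is a loss, a win only if ALL successors are wins.
            for y, succs in black_succ.items():
                if labels[y] != UNKNOWN or not succs:
                    continue
                if any(labels[x] == LOSE for x in succs):
                    labels[y], changed = LOSE, True
                elif all(labels[x] == WIN for x in succs):
                    labels[y], changed = WIN, True
        return labels

Every lookup of labels[y] can touch a different cache line or disk page, which is exactly the expense discussed above.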
My suggestion, which Ken and I have just started to explore, is to avoid random-access lookups altogether, by reduction to sorting. In a small number of sorts we can apply the propagation rules to all the nodes of the entire graph. This still has to be repeated again and again until all the nodes get labeled, but using sorting can be faster than random access.
Essentially, we are thinking of the application of the rules as a kind of database join. Joins are exactly what is needed to do the lookups all at once: each node may wish to look up another node’s value. Here is a small example to explain the idea; the particular positions and labels are made up. Form the file of pairs, one per legal move:

    (x_1, y_2)
    (x_2, y_1)
    (x_3, y_2)

The line (x, y) means that the white position x needs to see the label on the black position y.

After being sorted on the second value, this becomes:

    (x_2, y_1)
    (x_1, y_2)
    (x_3, y_2)

Note, the black values are now in ascending order, so the labels can be looked up fast by a “merge” type operation with the sorted file of black nodes and their labels. Say y_1 is labeled win and y_2 is labeled lose; the merge then yields,

    (x_2, win)
    (x_1, lose)
    (x_3, lose)

Then we can sort on the first value and get the white labels in the correct order:

    (x_1, lose)
    (x_2, win)
    (x_3, lose)

Here each white position happens to have a single successor, so its label is immediate; in general this last sort groups all of a white node’s successor labels together, and the rules are applied to each group.
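Here is a rough Python sketch of one such pass over the white nodes, assuming everything is kept as flat lists of tuples so that an external sort could stand in for Python’s in-memory sort; all the names are illustrative, not from any existing system:

    import itertools

    WIN, LOSE, UNKNOWN = "win", "lose", "unknown"

    def sort_pass(edges, black_labels, white_labels):
        """One propagation pass for the white nodes by reduction to sorting.
        edges: list of (white, black) move pairs.
        black_labels: sorted list of (black, label); we assume every black
            position appearing in edges has an entry.
        white_labels: dict from white node to label, updated in place.
        Returns the number of white nodes newly labeled."""
        # Step 1: sort the file of pairs on the black position.
        by_black = sorted(edges, key=lambda e: e[1])
        # Step 2: a sequential merge attaches each black label to the
        # pairs that ask for it; no random lookups occur.
        tagged, i = [], 0
        for white, black in by_black:
            while black_labels[i][0] < black:
                i += 1
            tagged.append((white, black_labels[i][1]))
        # Step 3: sort on the white position, so each white node's
        # successor labels are contiguous, then apply the rules per group.
        tagged.sort(key=lambda t: t[0])
        newly = 0
        for white, group in itertools.groupby(tagged, key=lambda t: t[0]):
            if white_labels[white] != UNKNOWN:
                continue
            succ = [label for _, label in group]
            if WIN in succ:
                white_labels[white], newly = WIN, newly + 1
            elif all(s == LOSE for s in succ):
                white_labels[white], newly = LOSE, newly + 1
        return newly

The only data movement is two sorts and two sequential scans, which is what makes the reduction attractive on machines where sequential throughput far outstrips random access.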
The whole point of this approach is that the expensive operation is the standard one of sorting. We believe we may be able to use high-performance sorting systems to run this type of algorithm, and perhaps solve larger endgames than was previously possible.
The issue then becomes: how do we govern the sorts to minimize the number of iterations? We have in mind a loop over the legal moves in any position, and/or a loop over the depth to mate. The latter is avoided in the standard tablebase algorithm, so that its running time is basically independent of the maximum number of moves it takes to win from positions in a particular endgame. Introducing it here would mean we are trading cache efficiency against simple sequential runtime. Whether this would yield a large practical gain is yet to be determined. The only thing we can say now is that rewriting the algorithm as a reduction to sorting would make it immediately applicable on cloud-computing systems, as sketched below.
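As a rough sketch of that outer loop, assume a black_pass that is the exact dual of the sort_pass above (a name I am inventing for illustration). Each full round resolves the positions one ply closer to the end of the game, so the number of rounds tracks the depth to mate:

    def solve_endgame(white_edges, black_edges, white_labels, black_labels):
        # white_labels, black_labels: dicts from node to label, as before.
        rounds = 0
        while True:
            progress = sort_pass(white_edges, sorted(black_labels.items()),
                                 white_labels)
            # black_pass is assumed to be the symmetric dual of sort_pass.
            progress += black_pass(black_edges, sorted(white_labels.items()),
                                   black_labels)
            rounds += 1
            if progress == 0:
                break  # nodes still unknown involve the cycles set aside earlier
        return rounds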
How far can we go on chess endgames with near-term technology? And how far in the future? Also, what about the problem of doing huge sorts in general: how do we do them? The Google approach used triple redundancy on the disks, since failures were expected. Are there more efficient ways to sort data of this size?