Cropped from source

Nicole Oresme was a fourteenth-century polymath. He lived in France and became an advisor to King Charles V, who sponsored him to translate many works by Aristotle into French as well as Latin. He then became bishop of Lisieux until his death in 1382. Oresme made original advances in physics, geography, and mathematics. He was evidently the first to prove that the harmonic series sums to infinity.

Today—yesterday was Thanksgiving Day in the US—we give thanks for mathematical advances and discuss what kind of cornucopia they are.

László Babai has added one more talk this coming Tuesday, December 1, to his series on his quasipolynomial-time algorithm for graph isomorphism. Last Tuesday’s talk, which was again Tweeted beginning here by Gabriel Gaster, got as far as his “Design Lemma.” The “split-or-Johnson” title was transferred to the last talk. Or what we think is the last talk.

We are also thankful for Terry Tao’s recent culmination of a broad effort on the Erdős Discrepancy Problem. We noticed an article posted online three days ago by *Quanta Magazine* mentioning Tao and our friend Gil Kalai. It is on the solution to a problem of Richard Kadison and Isadore Singer by Adam Marcus, Dan Spielman, and Nikhil Srivastava. Their paper came in 2013, yet researchers are still working out the power of new tools that came from applying notions of expander graphs and matrix methods familiar in CS theory to a more abstract-seeming problem. We missed it at the time, and would love to hear from readers on other advances since then.

I, Ken, have been thinking abstractly about the nature of progress. This came up in justifying my position, shared by Dick, that not only is there an absolute limit to strength at chess as measured by the Elo rating system, but also that today’s best computer chess programs are already close to it. The clear top two programs, Komodo and Stockfish, are in the final days of this year’s Thoresen Chess Engines Championship (TCEC) final. With 8 wins to Stockfish’s 2, Komodo is on the verge of defending the title it won a year ago. More striking however is that almost 90% of the games have been *drawn*, many seemingly without any chance for either side to win.

Komodo playing Black has beaten Stockfish playing White four times. For the 100-game match, Erik Kislik and Nelson Hernandez selected 50 opening sequences of 8 moves, and organizer Martin Thoresen set the engines to play each one once on each side. The openings typically leave White ahead by 0.20–0.35 pawns as measured by the engines. A Black win requires outplaying the opponent by over half a Pawn more than winning as White requires, so it is like breaking serve at tennis. Komodo has never looked in any danger as White.

If Komodo could hold at least two draws every ten games against any player, then by the rules of the rating system, no player could be rated more than 366 Elo points above it. Thus a rating estimated near 3250 for it now would translate to an absolute ceiling of about 3600, which is the high end of where various of my chess model’s regression lines cross the rating axis at zero error, meaning perfect play. If Komodo played deterministically then an adversary could be tailored to its mistakes, but this can be avoided by randomizing its play in positions with multiple near-optimal moves without diminishing its strength much. Is the current Komodo that close to perfection? It is hard to conceive a better test than it is getting right now from Stockfish.
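As a sanity check on the 366 figure, one can invert an expected-score curve numerically. The normal-curve model below is our assumption about how such rating tables are derived, not a claim about the official procedure:

```python
import math

def expected_score(diff, sigma=200 * math.sqrt(2)):
    """Expected score of the weaker player at rating deficit `diff`,
    under a normal-distribution model of game outcomes."""
    return 0.5 * math.erfc(diff / (sigma * math.sqrt(2)))

# Find the deficit at which the expected score drops to 0.10,
# i.e. two draws per ten games and no wins.
diff = 0.0
while expected_score(diff) > 0.10:
    diff += 0.5

print(round(diff))  # about 362 under this smooth model, near the 366 above
```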

I was recently at an event with Hernandez and Komodo co-creator Mark Lefler. Lefler pointed out several possible improvements, each worth in the range of 10–20 Elo points. I thought, what if the best of these is worth 20 Elo, the next 19, and each successive one has 95% of the value of its predecessor? By summation of geometric series, you gain no more than $20/(1 - 0.95) = 400$ in total, again for a ceiling not much above 3600. The game of Go may have a higher ceiling but still a ceiling. I wondered if mathematical progress in general has this kind of geometric limit.
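For a quick numerical check of this kind of ceiling (our sketch, using the 20-point, 95%-decay figures from above):

```python
# Hypothetical improvement values: 20, 20*0.95, 20*0.95^2, ...
total = sum(20 * 0.95 ** k for k in range(10_000))

# Matches the closed form 20 / (1 - 0.95) = 400 Elo of total headroom.
print(total)  # ≈ 400.0
```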

It strikes me that unlike a performance advance, an advance in knowledge is attenuated only by the length of advances already made in that line. I believe that the value of the $n$-th advance on average is proportional to $\frac{1}{n}$, not so much intrinsically as in relation to the amount of work that must be expended to digest the previous advances which support it. Put simply, the $n$-th advance may require at worst $n$ units of effort to express, giving $\frac{1}{n}$ as a value-over-cost estimate. The point is that unlike a geometric series, $\sum_{n \leq N} \frac{1}{n}$ goes to $\infty$ as $N$ grows.

Oresme proved this in the classic simple manner: by grouping every $2^j$ terms, beginning (after the initial $1$) with $\frac{1}{2}$, then $\frac{1}{3} + \frac{1}{4}$, then $\frac{1}{5} + \cdots + \frac{1}{8}$, and so on. Every group has value at least $\frac{1}{2}$, so the sum of the first $2^j$ terms is at least $1 + \frac{j}{2}$. If mathematical progress is likewise harmonic, then its value is unbounded.
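Oresme's grouping bound is easy to verify numerically (a quick sketch of ours, not part of the argument):

```python
from fractions import Fraction

def harmonic(m):
    """Exact partial sum 1 + 1/2 + ... + 1/m."""
    return sum(Fraction(1, i) for i in range(1, m + 1))

# Oresme's grouping: the first 2^j terms sum to at least 1 + j/2.
for j in range(1, 11):
    assert harmonic(2 ** j) >= 1 + Fraction(j, 2)

print(float(harmonic(2 ** 10)))  # about 7.51, and growing without bound
```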

It is possible that improvements at chess could leverage the same additive power of harmonic series if they follow a Zipfian distribution. The best analogy may be to the situation of Heaps’-Herdan’s Law, where chess games are like texts of words and a “new word” is a position in a game where a new improvement by a chess program could matter. This counts the improvements at unit value, assuming each changes a draw into a win or a loss into a draw. I believe, however, that the “draw attractor” nature of chess—which is being remarked upon in the games of the TCEC final—pulls the frequency of improvements that can change a game below this law. The law may hold for improvements that can raise the measured value of an engine’s position by 0.10 or 0.20 but not enough to change a game except in combination. Any multiplying together of harmonic terms makes the series like $\sum_n \frac{1}{n^s}$ with $s \geq 2$, which converges and gives a hard ceiling.

Which law should apply to mathematical progress? In the concrete here-and-now rather than asymptotic-in-$n$, Dick and I have opined that we are still in a golden age of discovery. One sign is that we can probably make much progress just by a more incisive understanding of what we already have. The MacTutor biography of Oresme quotes an assessment by Marshall Clagett:

This brilliant scholar has been credited with … the invention of analytic geometry before Descartes, with propounding structural theories of compounds before nineteenth century organic chemists, with discovering the law of free fall before Galileo, and with advocating the rotation of the Earth before Copernicus. None of these claims is, in fact, true, although each is based on discussion by Oresme of some penetration and originality …

We recently made the point that even solved conjectures can reward further penetration, and for this we can be thankful.

What kind of diminishing returns apply to mathematical progress, multiplicative/geometric or additive/harmonic?

One thing we hope is subject to a swiftly diminishing law is *errata*. We are thankful again for close readings of our quantum algorithms textbook, especially by Cem Say and his students at Boğaziçi University. They and a couple others have added several more entries to our updated errata page.

* Looking at some of its components *

From source, our congrats too

Laci Babai’s first talk a week ago Tuesday is now a webcast here. There is also a great detailed description of his talk by Jeremy Kun, including background on the problem and how the proof builds on Eugene Luks’s approach.

Today we talk some more about ingredients of Laci’s algorithm.

He has one more scheduled talk this coming Tuesday on its key vehicle of progress: Either the current graph (and associated groups) in the recursion can be split up for divide-and-conquer, or it embeds a special kind of graph named for Selmer Johnson that can be handled specially. This strikes us as following another Chicago idea—the Geometric Complexity Theory programme of Ketan Mulmuley and Milind Sohoni—of looking for objects that obstruct the progress of algorithms; this post by Joshua Grochow discusses an example for matrix multiplication.

Rather than try to anticipate these further details and Laci’s forthcoming preprint, we thought we might give more high-level discussion of some more ideas and elements we recognize that make the algorithm work. We could be way off, but these ideas we believe are key to his breakthrough result. In any event they are useful insights that we hope you enjoy.

Laci’s talk is old style, a chalk talk. Most talks these days are PowerPoint-style, but for details and complex mathematics, there is nothing like seeing a master at work using chalk. The virtues of such a talk are many. It slows down the information rate to the audience. This is good. Slow is good. Flipping PowerPoint slides rapidly can be hard to follow. Watching Laci write out in his handwriting:

Graph Isomorphism (GI)

is wonderful. You can follow it, internalize it, and perhaps understand it.

The second virtue is the use of many boards. This is wonderful. It gives a larger visible state for the audience, so they are more likely to understand.

The key ideas of the algorithm for GI are really classic ones from design of algorithms. The genius is getting them all to work together. The ideas break into two types: those that are general methods from computer science and those that are special to the GI problem.

*Small Steps.* One of the most important principles in mathematics and also in the design of complex algorithms is that we rarely can just write down the answer. That is, if we wish to construct something, we rarely can just write it down in closed form. The quadratic formula for the roots of $ax^2 + bx + c = 0$ is a classic counterexample:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Of course there are no such formulas for high-degree polynomials, nor for most non-polynomial equations. Enter Newton’s method.
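As a concrete instance of building an answer in small steps, here is a minimal Newton iteration (our example, not from the talk):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Repeat the small step x <- x - f(x)/f'(x) until f(x) is tiny."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x -= fx / fprime(x)
    return x

# No closed form needed: build the root of x^2 - 2 = 0 in small steps.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)  # 1.414213562...
```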

The idea of building the answer in small steps is the cornerstone of most important algorithms. Linear programming uses simplex or interior point methods, matching uses path-augmentation, and so on.

*Recursion.* The reason the small-step idea is so powerful is that it plays well with running time. If the small steps can be bounded by a nice equation, then the algorithm in question will run fast. For Laci’s case the key is a recurrence of the shape

$$T(n) \leq q(n) \cdot T(n/2),$$

where $q(n)$ is a quasi-polynomial term. This type of equation yields a quasi-polynomial running time. If we had a constant $c$ in place of $q(n)$ then we would get a term $c^{\log n}$, which is polynomial, and this typifies how one gets polynomial-time algorithms by divide-and-conquer. Here $q(n)$ is not constant, but quasi-polynomial terms are closed under raising them to $\log$ powers, so that is the time we get. The recursion can be cut off when the problem size is polylog.
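To see why such a recurrence stays quasi-polynomial, one can unroll a toy bound of the shape $T(n) \leq q(n) \cdot T(n/2)$; the choice $q(n) = 2^{(\log_2 n)^2}$ below is purely our illustration:

```python
import math

def log_T(n):
    """log2 of the unrolled bound T(n) = q(n) * T(n/2), T(1) = 1,
    with the illustrative choice q(n) = 2^((log2 n)^2)."""
    total = 0.0
    while n > 1:
        total += math.log2(n) ** 2  # log2 of q(n) at this level
        n //= 2
    return total

# log2 T(n) stays polylogarithmic: here 2870 versus (log2 n)^3 = 8000,
# so T(n) itself is quasi-polynomial even though q(n) is not constant.
n = 2 ** 20
print(log_T(n), math.log2(n) ** 3)
```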

*This-or-That.* As Laci says, “split-or-Johnson.” This is the hardest element to convey simply but may pay the highest dividends if we can recognize when it can be used. Here is an example, the triangle removal lemma: For every $\epsilon > 0$ there exists $\delta > 0$ such that for every $n$-vertex graph $G$:

Either $G$ has at least $\delta n^3$ triangles, or one can remove $\epsilon n^2$ edges to leave no triangles.

This enables a randomized algorithm to distinguish graphs that are triangle-free from those that are $\epsilon$-far from being triangle-free: it can try (say) $O(1/\delta)$ random triples of vertices and say “probably triangle-free” if none is a triangle. The randomized primality test of Gary Miller and Michael Rabin can also be phrased as an either-or in which the “or” branch finds a witness of compositeness. To be sure, Laci’s algorithm isn’t randomized.
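A sketch of such a one-sided tester (our code; the lemma is what licenses the small sample size):

```python
import random

def probably_triangle_free(adj, trials=1000, rng=None):
    """One-sided tester: sample random triples of vertices.
    A found triangle is proof the graph is not triangle-free;
    finding none supports 'probably triangle-free'."""
    rng = rng or random.Random(0)
    n = len(adj)
    for _ in range(trials):
        a, b, c = rng.sample(range(n), 3)
        if adj[a][b] and adj[b][c] and adj[a][c]:
            return False  # certainly not triangle-free
    return True  # probably triangle-free, or at least not far from it

# Complete graph K10: every triple of vertices forms a triangle.
K10 = [[i != j for j in range(10)] for i in range(10)]
# The 10-cycle C10 has no triangles at all.
C10 = [[abs(i - j) in (1, 9) for j in range(10)] for i in range(10)]

print(probably_triangle_free(K10))  # False
print(probably_triangle_free(C10))  # True
```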

*Graph Marking.* Perhaps the main problem with GI has always been that looking locally at vertices of some graph may yield no information about the graph. Locally all the neighborhoods of vertices may look alike. This is terrible from the point of view of trying to tell one vertex from another, which is critical to be able to solve the GI problem.

But there is an expensive way to change this. We simply pick a set $S$ of vertices of $G$ and mark them in a unique manner. Now using these marks we can start to break up the structure of the graph and start to label vertices differently. Obviously a vertex near one of these marked vertices is different from one that is farther away. This is a critical “trick” that allows one to make progress on GI for the graph.

There is a cost. If we pick $k$ such special vertices in a graph, then we must try all ways to pick these in the other graph. That is expensive: if the graph has $n$ vertices then it costs about $n^k$. So to avoid taking more than quasi-polynomial time, we must keep $k$ at most poly-logarithmic in $n$. It also needs $k$ of order at least $\log n$ to be effective, which is a reason the technique cannot simply be improved to run in polynomial time.

*Groups Are Nice.* One of the most useful properties of groups is the “subgroup size trick.” In most structures, including graphs, a substructure can be very big. Suppose that $G$ is a graph on $n$ vertices. It has subgraphs with $n - 1$ vertices—just remove any vertex. This cannot happen with *groups*. If $H$ is a group with $n$ elements, then the largest proper subgroup has at most $n/2$ elements. This is a simple consequence of the famous theorem of Joseph-Louis Lagrange that for any finite group $H$, the size of every subgroup of $H$ divides the size of $H$. If you keep taking subgroups and each one is proper then you quickly make great progress reducing the size.
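The subgroup size trick is easy to check exhaustively on a small group such as $S_3$ (our brute-force sketch):

```python
from itertools import combinations, permutations

# The symmetric group S_3: permutations of {0,1,2} as tuples.
elements = list(permutations(range(3)))

def compose(p, q):
    """Composition (p . q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def is_subgroup(subset):
    """A nonempty finite subset closed under composition is a subgroup."""
    return all(compose(p, q) in subset for p in subset for q in subset)

orders = set()
for r in range(1, len(elements) + 1):
    for subset in combinations(elements, r):
        if is_subgroup(set(subset)):
            orders.add(r)

# Lagrange: every subgroup order divides |S_3| = 6, and the largest
# proper subgroup has order 3 = 6/2.
print(sorted(orders))  # [1, 2, 3, 6]
```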

The worst case is when the factor is $2$, such as with the alternating group $A_n$ inside the symmetric group $S_n$. Babai calls a homomorphism $\varphi$ from a group $G$ into a symmetric group $S_k$ **giant** if its image is either the alternating group $A_k$ or all of $S_k$. An element $x$ of the set that $G$ acts on is **unaffected** if $\varphi$ restricted to the subgroup of permutations that fix $x$ is still giant. Subject to conditions about the setting that we don’t yet fully understand, his key new group-theoretic theorem is:

Unless $k$ is tiny, the restriction of a giant homomorphism to the intersection of the stabilizers of all unaffected elements is still giant.

This looks like it should be either easily true or easily false, but Babai’s precise statement depends on the classification of finite simple groups.

The key inner point of the algorithm is to try subsets $T$ of the elements, where $|T| = k$ is polylog but not *tiny*, so that the theorem applies. The theorem helps decide in quasipolynomial time whether the resulting homomorphism into $S_k$ is giant. This distinguishing power in turn is used to make progress.

What is the full pattern of progress in the algorithm, with “split-or-Johnson” evidently directing the outermost loop and the group-theoretic machinery above driving the inner loops? This may need more details to unfold.


László Babai just gave his first talk on his new graph isomorphism (GI) algorithm. The photo was taken at the talk and posted by several people.

Today we want to discuss his talk, but…

But we were not there. Nor were we able to send a GLL representative there to hear the talk. There are Twitter feeds here and here. The former, by Gabriel Gaster, gives a running textual description, but otherwise we are still pretty much in the dark as to the details of the algorithm.

Or are we?

There is an old trick essentially due, I believe first, to Leonid Levin. It allows us to give a graph isomorphism algorithm that works almost in quasi-polynomial time—without seeing Laci’s algorithm. While we are in the dark, I thought you might enjoy it.

Theorem: There is a concrete algorithm that runs in time $f(n)^{O(1)}$ and decides graph isomorphism correctly.

Here $f(n)$ is any function that grows faster than any quasi-polynomial function—or at least faster than Laci’s function. For example, one may take

$$f(n) = 2^{(\log n)^{\log \log n}}.$$

Then the algorithm is:

Let $G$ and $H$ be $n$-vertex graphs. Run all Turing machines of description size less than $\log n$ for $f(n)$ steps each. For each one, test its output to see if it is an isomorphism. Suppose that $\pi$ is one of the outputs. Check whether $\pi$ is an isomorphism of $G$ to $H$. If some one works, say that $G$ and $H$ are isomorphic. If all fail, say they are not.

Call this algorithm $A$. If Laci’s theorem is correct, which we expect, then this algorithm works on all but a finite number of graph pairs. So fix those by lookup and call the result $A'$.

So we can start running the algorithm without the details. Of course there are two problems. The first is that the running time here is huge—it’s a galactic algorithm. Second, its correctness relies on Laci’s theorem, and also we do not know when it starts to work. So it really is useless.

But it does show the power of Levin’s old idea.
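Part of why the trick works is that isomorphism certificates are cheap to verify, whatever produced them. A minimal checker (our sketch, with graphs as edge lists):

```python
def is_isomorphism(edges_g, edges_h, pi):
    """Check that vertex map pi carries the edge set of G exactly
    onto the edge set of H (undirected edges as frozensets)."""
    mapped = {frozenset((pi[u], pi[v])) for u, v in edges_g}
    return mapped == {frozenset(e) for e in edges_h}

# Two labelings of a 4-cycle, related by the rotation pi: i -> i+2 mod 4.
G = [(0, 1), (1, 2), (2, 3), (3, 0)]
H = [(2, 3), (3, 0), (0, 1), (1, 2)]
print(is_isomorphism(G, H, {0: 2, 1: 3, 2: 0, 3: 1}))  # True
```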

Of course we must wait for Laci’s paper, which we hear is due out soon. And that is the key.


Chicago Chronicle source

László Babai must be busy getting ready for his series of talks.

Today Ken and I wish to discuss one issue that has come up in comments about his result.

By the way, the big event this Tuesday is not the Republican Debate on Fox, but Laci’s talk. It’s too bad for Chicago that the Cubs didn’t reach this year’s World Series, but these talks will make up for it.

Recall that the simple group classification theorem says:

Theorem 1 Every nonabelian simple finite group is either:

- An alternating group $A_n$ with $n \geq 5$;
- A group of Lie type; or
- One of the 26 sporadic groups.

How about the following weaker version of the simple group classification theorem:

Theorem 2 Every large enough nonabelian simple finite group is either an alternating group $A_n$ with $n \geq 5$ or a group of Lie type.

I am not saying that this could be easier to prove. No. But it may be sufficient for many of the applications in computer science. One example is that testing isomorphism of finite simple groups in polynomial time needs only the fact that they are 2-generated for all large enough orders, and so doesn’t care exactly how many sporadic groups there are. This seems like a nice point that we should be aware of, especially if it is relevant for the new GI result.

A third talk has been announced on Laci’s page two weeks from Tuesday, titled “Graph Isomorphism in Quasipolynomial Time II: the ‘Split-or-Johnson’ routine.” Its abstract starts off with:

In this second talk of the series we present the details of the canonical partitioning algorithms required for the master algorithm.

The abstract for tomorrow’s talk also has a third paragraph which we didn’t notice before our post last Wednesday:

In this first talk we give an overview of the algorithm and present the core group-theoretic divide-and-conquer routine, the “Local Certificates algorithm.” Familiarity with undergraduate-level group theory will be assumed.

The second abstract mentions Otto Schreier’s conjecture that the outer-automorphism group of any finite simple group is solvable, which was proved by the classification. So it is possible that the divide-and-conquer routine exploiting this solvability may encounter the sporadic groups in its recursion after all. Or it might bypass them. We just don’t know, but we look forward to knowing.

Good luck to Laci—break a leg. Apparently this is also a good-luck expression in Hungarian but they go a bit further: *kéz- és lábtörést*, meaning literally, “hand and foot fractures.” Well if this is being given as a chalk talk the former may come into play.

We also note that today’s Google doodle features Hedy Lamarr and her frequency-hopping invention.

**Update Tue. 11/10 8pm:** Gabriel Gaster has tweeted a running commentary on the talk: https://twitter.com/gabegaster. And reading it made me(us) realize “d’oh”—of course the algorithm could bypass the sporadic groups since for large enough n it can just switch over to brute force at that point in the recursion. This seems true here—so Dick’s point about simpler classification is enough—and we just talked about how generally this might apply for any algorithm based (for the time being) on the classification. Moreover, it seems the use of the Schreier consequence for outer-automorphism groups doesn’t care about the simple groups involved, just that the “outer” groups are solvable. The proof does also use Luks’s algorithm as part of a three-way fencing off of cases (i.e., if you encounter an obstacle to Luks’s algorithm then something else good happens), which speculation I chose not to add on Wednesday.


László Babai is one of the world experts on complexity theory, especially related to groups and graphs. He also recently won the 2015 ACM Knuth Prize, for which we congratulate him.

Today we wish to discuss a new result that he has announced that will place graph isomorphism almost in polynomial time.

More exactly László shows that Graph Isomorphism is in Quasipolynomial Time: that is, time of the form

$$2^{O((\log n)^c)}$$

for some constant $c$. Polynomial time is the case when $c = 1$, but any fixed $c$ is a huge improvement over the previous best result.
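To place quasi-polynomial time on the scale, compare exponents at a concrete size (our illustration): at $n = 2^{20}$, the polynomial $n^3$ has $\log_2$ equal to $60$, the quasi-polynomial $2^{(\log_2 n)^2}$ has $400$, and the sub-exponential $2^{\sqrt{n}}$ has $1024$:

```python
import math

n = 2 ** 20
log2_poly = 3 * math.log2(n)        # n^3, a polynomial
log2_quasipoly = math.log2(n) ** 2  # 2^((log2 n)^2), quasi-polynomial (c = 2)
log2_subexp = math.sqrt(n)          # 2^(sqrt n), sub-exponential

# Quasi-polynomial sits strictly between polynomial and sub-exponential.
print(log2_poly, log2_quasipoly, log2_subexp)  # 60.0 400.0 1024.0
```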

Luca Trevisan already has made a post on this result, and Scott Aaronson likewise. Luca further promises to be in Chicago next Tuesday when László gives his talk on the result—here is the abstract of the talk:

We outline an algorithm that solves the Graph Isomorphism (GI) problem and the related problems of String Isomorphism (SI) and Coset Intersection (CI) in quasipolynomial time.

The best previous bound for GI was $\exp(O(\sqrt{n \log n}))$, where $n$ is the number of vertices ([Eugene] Luks, 1983). For SI and CI the best previous bound was similar, $\exp(\tilde{O}(\sqrt{n}))$, where $n$ is the size of the permutation domain (the speaker, 1983).

He is following this up with a talk on Thursday the 12th at 4:30 in the Mathematics Department’s Group Theory seminar titled, “A little group theory goes a long way: the group theory behind recent progress on the Graph Isomorphism Problem.”

Well, for starters it is a vast improvement over the previous complexity bounds for GI. And while we know nothing yet about the algorithm, we can make some guesses about the result. These are just guesses, but they may contribute to appreciating why the result is so important.

The algorithm likely uses some interesting structural results about graphs and/or groups. The latter connection is clear, while the former could be that the automorphism group of a graph plays an important role in GI. If these structure theorems indeed are there, they could easily help solve other problems that we have in complexity theory. David Rosenbaum, an expert on group isomorphism, raised this point to me: perhaps László’s methods will finally move group isomorphism from quasi-polynomial into polynomial time.

László in a 2013 paper with John Wilmes proved a quasi-polynomial time algorithm for isomorphism of structures called Steiner 2-designs that are more specialized. Not only that, they compute unique canonical forms and enumerate all the isomorphisms. A difference that makes the last thing possible, however, is that there can *be* at most quasi-poly many isomorphisms between Steiner 2-designs.

In Luca’s comment section the question is raised whether or not László’s new method uses the simple group classification. The famous classification result has had a myriad of applications in theory. Many are interested in removing the reliance on this extremely deep theorem: this is in spirit like our interest in de-randomizing algorithms.

The word “outline” in László’s talk abstract suggests that the proof is long, not surprising, and perhaps complicated, again not surprising. László is terrific so if he says he has the result, I will bet that all is okay. Two comments I just got are “wow” and “Now it is down to factoring and short lattice vectors. Whew.”

Well, there are other intermediate problems besides these two and discrete log. Notably there is minimum circuit size, whose “umbrella” relation to GI and factoring and discrete log we covered earlier this year.

This surprising result raises questions about other problems. Is a similar result for factoring around the corner? Or shortest lattice vectors? GI was also one of those problems that was in sub-exponential time but was not known to be in quasi-polynomial time. Placing factoring in this complexity class would be a huge difficulty for cryptography.

Is the result correct? What is the structure of the theorem? Does it give counting or canonical forms? And are there any new results that may be used elsewhere in theory?

Good luck to László; we all hope that the result is correct. What a major achievement.

*Kurt Gödel in popular culture and answers to Thursday’s problems*

Levi Weaver source

Kurt Gödel may yet make it to Broadway. He already splashed across the silver screen in the 1994 Meg Ryan-Tim Robbins comedy I.Q. as the roly-poly sidekick of Walter Matthau playing Albert Einstein. He was pressed into retro vinyl by Levi Weaver for his 2011 album *The Letters of Dr. Kurt Gödel*. He features in the major Japanese manga series Negima! These borrowings may be incomplete or inconsistent, but with Gödel that’s par, no?

Today we consider Gödel’s impact on popular culture and give answers to the conjectures in Thursday’s post.

The play *Ghosts in Princeton* by Daniel Kehlmann gives serious focus to Gödel’s life. Although it carries Gödel to various times in his life, it is anchored in 1936–1939, with Gödel still in Austria despite the assassination of a close Jewish friend, the Nazi *Anschluss* of Austria in March 1938, and the outbreak of war in September 1939. It puts in Gödel’s mouth the logical antinomy,

If nobody believes you’re not *X*, then arguably you’re *X*;

where *X* = being Jewish. The Nazis treated all members of the Vienna Circle as Jewish and it took Gödel “punishably long” to wise up. The play bounces its themes off Einstein including a re-enactment of the “always freezing” Gödel’s walks with him in Princeton between the Institute and Einstein’s home. When by himself, Gödel speaks in tones fit for Halloween:

For twenty days no stars, no sun. All black. The world is extinguished. Actually, that is beautiful.

Kehlmann’s play draws its title from Gödel’s own belief in ghosts, that one could communicate with them. It portrays the beginnings of his madness. Gödel had his first documented breakdown while returning from lectures he gave at Princeton in 1934. His paranoia was boosted by the killing of his Jewish friend and Gödel himself being set upon by a gang of Nazi youths when he was merely strolling with his wife.

The play also shows the paradox that madness comes from Gödel’s unflinching engagement with reality, aspects of which we tried to explain in our Gödel post a year ago. It has him say:

From the need to question there is no shrinking back afraid. Not even in the face of madness. One sees a mirror that is itself reflected in a mirror and one wishes to turn away. But one does not. And suddenly one begins to understand.

There are parallels here to how the movie “Pawn Sacrifice” attempts to show a partial origin of Bobby Fischer’s madness in how he saw certain policy matters in black and white. Both the play and movie, however, also show the more immediate impact of paranoia. It was rooted in some events—WWI-style poison gas for Gödel, (anti-) communist surveillance for Fischer—but assumed its own primal power.

Janna Levin’s award-winning novel *A Madman Dreams of Turing Machines* connects the *engagement* more closely to the mathematical content of Gödel’s theorems. We might say more about this novel’s portrayal of Gödel and Alan Turing. What I find a missing key, however—ironically in Fischer’s case—is the relative absence of an instinct for *play*. Play can be good for both balance and focus.

So turning away from serious themes, we’ll go back to being light. Our man is gaining stature on stage and screen and in music and novels, even what used to be called “comic books.” All he needs now is his own line of merchandise. Actually there already is one in his name, a gift shop in Niagara-on-the-Lake only 30 miles from my house.

Three of the problems we discussed have strong connections to Princeton. John Conway—himself an apostle of play—is there, and Edward Scheinerman’s doctoral thesis was under Douglas West, whom I was also fortunate to have as advisor for my Princeton undergraduate thesis. Janusz Brzozowski, who features prominently in the star-height problem, also did his PhD at Princeton. The two solvers of the equitable coloring problem still hail from nearby Rutgers University. Only the road coloring problem seems to escape a New Jersey connection, which is ironic for a state whose unofficial motto is, “What exit?”

The **Angel Problem** is *a win for the angel even with power $k = 2$*. Although Conway’s paper evoked the maxim “fools rush in where angels fear to tread,” the angel in Oddvar Kloster’s strategy wins by rushing right at the devil, always keeping her left hand touching territory that is marked or otherwise forbidden. Kloster’s overall page includes a downloadable Java application showing his strategy in action. It also describes three other winning strategies, including another for $k = 2$ by András Máthé.

These neat slides by Stijn Vermeeren discuss the problem, Kloster’s proof, and some further variations and open problems. On a 3D grid the proof can be implemented by a chess king (a $k = 1$ angel in 3D) using only two of the coordinate planes. Whether a king that always moves to a higher plane can survive is open. There are also open problems about the complexity of determining whether a given board position is a win for the angel or devil.

The **Equitable Coloring Problem** was solved in the affirmative by András Hajnal and Endre Szemerédi in 1969 while in Budapest. Several strengthenings of the original conjecture, however, are still open.

The **Star-Height Problem** for a binary alphabet was solved fairly quickly in 1966 by Françoise Dejean and Marcel Schützenberger: no finite bound on star height suffices. This too was the direction of the original conjecture. Still amazingly open, however, is the generalized problem of whether a finite bound suffices when complement is allowed as a basic regular operation. Then some expressions that seemingly require stars, such as $\Sigma^*$ over $\Sigma = \{a, b\}$, can be written without them by complementing: $\Sigma^* = \overline{\emptyset}$, and so $a^* = \overline{\overline{\emptyset}\, b\, \overline{\emptyset}}$.

**Scheinerman’s Conjecture** was proved by Jérémie Chalopin and Daniel Gonçalves in their STOC 2009 paper, “Every Planar Graph is the Intersection Graph of Segments in the Plane (extended abstract).” Still open, however, is the question of doing so using only 4 directional orientations of segments as hinted in our post.

The **Road-Coloring Problem** was also solved recently, in a paper of that title by Avraham Trahtman. His proof is surveyed nicely in this short note by Weifu Wang. There are no other impossible graphs besides those ruled out by common divisors of the lengths of their cycles (or by having no cycles). Many problems about the related topic of synchronizing words in finite automata and other similar automata remain open, such as whether cubic upper bounds on their length relative to the number of states can be tightened to match quadratic lower bounds. We may cover this topic in the near future.
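For readers who want to poke at synchronizing words, the shortest reset word of a small automaton can be found by breadth-first search over subsets of states. The automaton below is the standard Černý example, whose shortest reset word has length $(n-1)^2$; the encoding is ours:

```python
from collections import deque

def shortest_reset_length(n):
    """Shortest synchronizing (reset) word of the Cerny automaton C_n,
    found by BFS over subsets of states."""
    a = [(i + 1) % n for i in range(n)]  # letter a: rotate i -> i+1 mod n
    b = list(range(n))                   # letter b: identity ...
    b[n - 1] = 0                         # ... except it sends n-1 to 0
    start = frozenset(range(n))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if len(s) == 1:  # all states merged: the word so far synchronizes
            return dist[s]
        for letter in (a, b):
            t = frozenset(letter[i] for i in s)
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return None  # not synchronizing

# The known extremal value is (n-1)^2.
print([shortest_reset_length(n) for n in (2, 3, 4, 5)])  # [1, 4, 9, 16]
```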

What do you think of the current open problems associated to the ones that were solved?

I flew across the Atlantic last night to join Dick and friends in London and Oxford. Would it be madness if I said I flew during the time change on Halloween to test Gödel’s time-travel equations by communicating with him again via spinning neutrinos at 35,000 feet to avoid the anomaly described in our previous post—but was thwarted by lack of Internet on British Airways crossings? Well, how about this annual time-change ritual which I witnessed every year of my study and research at Merton College, Oxford?



Takaaki Kajita and Arthur McDonald won the 2015 Nobel Prize in Physics for their discovery that neutrinos have mass. Although some physicists had shown as early as the 1950s that standard particle models could accommodate neutrinos with mass, there was no compelling reason for it. Moreover, the most-discussed terms for neutrino mass lack the desirable mathematical property of *renormalizability*. So most physicists of the last century guessed that neutrinos would be massless like photons are.

Today Ken and I wish to talk about guessing the answers to problems and conjectures in mathematics.

That neutrinos have mass came from experiments but in a way that remains in some sense “nonconstructive”—the experiments do not tell you how to *measure* the mass. Instead, Kajita and his team at Japan’s Super-Kamiokande neutrino detector discovered in 1998 that one of the three major types of neutrinos had different statistical frequency depending on whether they came through the earth or from overhead. McDonald’s team at the Sudbury Neutrino Observatory showed convincingly in 2001 that neutrinos were permuting themselves among the three types in transit from the sun to the earth. Together their results explained a long-mysterious huge dropoff observed by experiments that detected only the type chiefly produced in the sun. The type changes were known to be possible only if the terms for neutrino mass are present.

Of course mathematicians cannot do experiments to test conjectures—or maybe in some cases we can? But they do have a feel for how “models” of potential mathematical reality that haven’t been proved hang together. Some sides of conjectures confer more desirable mathematical properties on larger structures.

As a game for you—our readers—we will describe some problems that were solved, and invite you to guess the answers without looking up the solutions. We will give the solutions in our next post, which will have a Halloween theme. The answers won’t scare off interest in the conjectures because many matters related to them are still open. Likewise the matter of neutrinos: all we know is the sum of the three masses and the difference of the squares of two of them, and the experiments so far have not even distinguished which of two plausible mechanisms may cause the mass to arise.

The first problem is the Angel Problem, created by John Conway. Two players called the angel and the devil play on an infinite chessboard—always chess, hmmm. The angel has a power k: she can fly to any square that is within k king moves—that is, any place in the (2k+1)-by-(2k+1) quadrant centered on her current location. The devil at each turn can put a block on any one empty square. The game is won by the devil provided the angel cannot move. Since the angel can fly over blocks but never onto them, this happens if and only if the devil takes the last unoccupied square in the angel's current quadrant. The angel wins by surviving indefinitely.

Here is an illustration from the applet on Oddvar Kloster's page on the problem—we've painted on the blue boundary for k = 2. The angel may move to any square within the blue quadrant except three that the devil has blocked. The picture also shows the devil blocking three squares further away in case the angel flies in that direction. The shading has to do with one side's strategy—don't look on his page yet for the answer.

The problem asks, *Who wins the game?*—and more particularly:

Is there a finite k such that the angel of power k wins?

For k = 1 the angel is just a chess king and loses to a devil who plays entirely within a bounded grid—note this is asymmetrical since the devil without loss of generality plays first. So k must be at least 2. A funny feature is that higher k effectively gives the devil more blocks per turn. The reason is that if the angel has a winning strategy, then she has one that never visits a square that was within a previous quadrant. If visiting that square is now optimal then so would have been visiting it earlier when the devil had blocked fewer squares. Thus she might as well be a Destroying Angel who bombs out every square of her quadrant other than the one she chooses at each turn. Larger k means more bombing of the same order as her greater mobility. Conway's writeup shows that several natural simple strategies for the angel *lose* for every k.

The question is: *Is there some k that gives the angel eternal freedom?* Or can the devil always build a clever wall starting far away that eventually encircles the angel? If you do not know the answer, which do you believe is true?

**Equitable Coloring.**

A *k-coloring* of an n-vertex graph is a mapping c from the vertices to {1, …, k} such that for each "color" i, no two vertices u, v with c(u) = c(v) = i are adjacent. The coloring is *equitable* provided for each i the number of vertices given color i is either ⌊n/k⌋ or ⌈n/k⌉. That is, no two *color classes* differ in number by more than one vertex. If k divides n, so n = km, then every color must be used exactly m times.

Rowland Brooks proved that every graph of maximum degree Δ other than complete graphs and odd cycles has a Δ-coloring. The complete bipartite graph K_{3,3} has no equitable 3-coloring, though it has an equitable 2-coloring. The question is: *Does every graph of maximum degree Δ have an equitable coloring using exactly Δ + 1 colors?*
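For a graph this small the claims about K_{3,3} can be checked by brute force. Here is a sketch in Python (the function names are ours):

```python
from itertools import product

def is_equitable_coloring(n, edges, coloring, k):
    # proper: no edge joins two vertices of the same color
    if any(coloring[u] == coloring[v] for u, v in edges):
        return False
    # equitable: color-class sizes differ by at most one
    sizes = [coloring.count(i) for i in range(k)]
    return max(sizes) - min(sizes) <= 1

def has_equitable_k_coloring(n, edges, k):
    return any(is_equitable_coloring(n, edges, list(c), k)
               for c in product(range(k), repeat=n))

# K_{3,3}: vertices 0-2 on one side, 3-5 on the other
k33 = [(u, v) for u in range(3) for v in range(3, 6)]
print(has_equitable_k_coloring(6, k33, 2))  # True: color the two sides
print(has_equitable_k_coloring(6, k33, 3))  # False: no equitable 3-coloring
print(has_equitable_k_coloring(6, k33, 4))  # True: 4 = Delta + 1 colors work
```

The 3-coloring fails because every independent set lies within one side, and neither side's three vertices can be split into classes of size two.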

**Star Height**

The star-height problem in formal language theory is the question of whether there is a finite h such that all regular languages have regular expressions that have at most h-fold nesting of Kleene stars. It was easy to show a *no* answer over alphabets of arbitrary size. The question is: *Is there a bound on star height if the alphabet is binary?*

**Scheinerman’s Conjecture**

Edward Scheinerman conjectured in his 1984 PhD thesis that every planar graph can be represented as the intersection graph of a set of line segments in the plane. This is easy to see if you are allowed arbitrary curves in the plane, since you can replace a vertex of degree d by a curve that winds itself around in a d-pointed star. But doing it with straight line segments seems quite a challenge. We reproduce Wikipedia's example for an 8-vertex planar graph:

The question is: *Is it always possible?* Note that the above example uses only lines with 4 orientations with no two segments of the same orientation touching. Such a graph can be 4-colored by making each orientation a separate color, so a yes answer using only 4 orientations implies the four-color theorem. But we are getting ahead of ourselves—is it possible at all?

**Road Coloring Problem**

Suppose we have a directed graph G in which every vertex has the same out-degree d. Let us use the same alphabet A of d characters to label the out-edges from every node. Then for every vertex v and string s of these characters, using successive characters of s as directions on which out-going "road" to take determines a unique path of length |s| in the graph. If the path ends at a vertex w we write w = v · s. If we get the same w for all v—even w itself—then s is a universal sequence of road directions to reach vertex w. For a particular G we can ask:

Can we assign labels so that every vertex w has a universal string s_w?

If G has no directed cycles then the answer is clearly no—you can never get to a source. If all cycles have even length then a parity argument shows no, and the same argument works if all cycle lengths are divisible by some odd number as well. No assignment of labels will work. Are there *other* graphs for which no assignment will work? *That is the question.*
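One can test a candidate labeling by tracking the set of possible current vertices: a string is universal exactly when it collapses that set to a single vertex. A small sketch (the 4-vertex example and the function name are ours):

```python
from collections import deque

def synchronizing_word(labeling, num_vertices, alphabet):
    """Breadth-first search over sets of 'possible current vertices' for a
    string that collapses every starting vertex to one vertex."""
    start = frozenset(range(num_vertices))
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        states, word = queue.popleft()
        if len(states) == 1:
            return word  # universal road directions to the lone vertex
        for a in alphabet:
            nxt = frozenset(labeling[v][a] for v in states)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None  # this labeling admits no universal string

# A 4-vertex graph with out-degree 2 and labels 'r'/'b'; it is strongly
# connected with cycles of lengths 3 and 4, hence aperiodic.
lab = {0: {'r': 1, 'b': 1}, 1: {'r': 2, 'b': 3},
       2: {'r': 3, 'b': 0}, 3: {'r': 0, 'b': 1}}
print(synchronizing_word(lab, 4, "rb"))  # prints a universal string
```

Following the printed string from any of the four vertices ends at the same vertex.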

Without looking the answers up, can you guess them? We have stated the final italicized yes/no question in each one so that the numbers of ‘yes’ and ‘no’ answers are equitable.

[Qualified “most-discussed” before “terms” in first paragraph (see KWR comment in reply); “photons”–>”neutrinos” before “have mass” in first line of 3rd paragraph; gave fuller statement of knowledge of masses at end of intro]


Michelle Kwan was one of the last great figure skaters to compete under the historic "6.0" ranking system. She won the world championship five times under that system but was edged into second place in the 1998 Winter Olympics and finished a more-distant third in 2002. An injury on the eve of the 2006 Winter Olympics prevented her from competing under the new and current system, which involves a complex numerical rating formula.

Today we discuss rankings versus ratings with an eye to complexity and managing them by common tools.

A *ranking* of m items from a set of n items is a list in non-ascending order of preference; it is complete if m = n. A *rating* of those items is a function from the items to numbers. A rating always induces a ranking—maybe allowing ties—but not the other way around. When a ranking has no ties we associate to it the ordinal function r, where r(i) is the place of item i in the list; when there are ties we may average the ordinals or use some integer near the average. Given several different rankings or ratings—of subsets of the items that may only partly overlap and perhaps span multiple categories—the object is to *aggregate* them into the single ranking or rating that best represents the sample.

The old 6.0 system used points on a 0.0–6.0 scale (increments of 0.1) in the two categories *technical merit* and *artistic presentation* to make a number up to 12.0 for each skater. Thenceforth each judge's scores were treated as a ranking no matter how far ahead one skater was of the next. There were rules for avoiding and breaking ties, but in extreme cases like those in our post on Kenneth Arrow's paradox ties could be left to stand.

The new rating system nearly always produces a total order upon simply adding the judges’ scores for each skater. After averaging, the highest score achieved in international competition to date for a man is 295.27 by Patrick Chan of Canada, and for a woman, 228.56 by Kim Yuna of Korea. It is not clear what these scores are “out of”—apparently not 300 and 250. They do not connote perfection the way “6.0” did. An appraisal by The Economist during last year’s Winter Olympics noted that the new system pushes skaters to technical extremes—while leaving

…little time either during routines or in training sessions for optional acrobatic or artistic showstoppers, like Michelle Kwan’s notorious spirals, in which she flashed a huge smile while speeding down the ice and audiences routinely jumped to their feet.

One aspect of the 2015 Applied Decision Theory (ADT) conference, which we recently mentioned, that surprised me was the theme from several speakers that rankings and preferences are often more "human" than ratings, both for obtaining data and for representing it. So let us look more closely at ratings and rankings—and also preferences that may have cycles and so not yield rankings.

The first fact is obvious but central: a rating of n items needs size only O(n), where the O absorbs log-sized labels of items and log-sized numbers. Whereas a preference structure may involve specifying up to n(n−1)/2 pairs even if it does define a total order. I've appreciated this difference when poring over tables of hundreds of ranked players before my fantasy baseball and football drafts. How do they decide to rank Peyton Manning ahead of Drew Brees, then Brees above Tom Brady—and really, Manning over Brady? The answer is they didn't *think* any preferences; they ran a model to generate expected "fantasy league points" for each player and transcribed the total order from these ratings. (I do not know how much Brady's expected points were deflated by his possible multi-game suspension.)

Preference structures are more combinatorial. When all pairs are compared they define a tournament. They also can be staggered as nets of *conditional preferences* (CP-nets). A CP-net allows saying to prefer x over y if the aggregate chooses z over w, but to reverse the preference if w is chosen. Such a CP-net might be expanded out to a preference structure on *sets* of items, but the CP-net form can be exponentially more succinct. The keynote for the first day of ADT 2015 was by Kristen Brent Venable on her paper with Cristina Cornelio, Umberto Grandi, Judy Goldsmith, Nick Mattei, and Francesca Rossi on a compact way to represent probabilistic uncertainty about preferences by distributions over CP-nets.

A ranking without ties is also a familiar combinatorial object: a permutation. Thus algebraic concepts from permutation groups are relevant to analyzing rankings. It seems less natural to apply them to ratings—valuations apply to elements of fields in connection with algebraic geometry, but permutations of those elements seem not to be involved. Preferences that don’t yield rankings can have algebraic structure in a different way: they might decompose into cycles. Acyclic preference structures have well-defined transitive closures and so inherit all the theory of partial orders and lattices. All this seems a lot of bother compared to using ratings, but one surprise alluded to by Venable and by Michel Regenwetter presenting the first paper of ADT 2015 is that it is much harder to get human respondents to give ratings than preferences, and the rating numbers are often unreliable when they do.

An aggregation problem can be specified by giving a distance measure d(x, S) between a ranking or rating x and a set S of such rankings/ratings. The set S can simply be the set of rankings being aggregated or might be derived from it. The problem in either case is: given S, find x minimizing d(x, S), or decide whether the distance can be made lower than some given value t. In the simple case we can use a binary measure of *dissimilarity* d(x, y) and apply (e.g.) "least squares" to ask for x minimizing the sum of d(x, y)² over y in S.

The complexity parameters m (the number of rankings or ratings being aggregated) and n (the number of items) govern these problems. An example with high m and low n is determining one or more winners of an *election*. An example with low m and high n is comparing and aggregating results from a few *search engines*. A much-cited reference for the latter is the 2001 paper "Rank Aggregation methods for the Web" by Cynthia Dwork, Ravi Kumar, Moni Naor, and D. Sivakumar (who graduated from Buffalo under me in 1996).

It is not surprising that many of these aggregation problems are NP-hard, and remain hard even when m or n is a fixed small number like 3 or 4 (see e.g. these slides by David Williamson). What was surprising at ADT 2015 was that several speakers didn't care so long as the problems could be transcribed efficiently into SAT: "we can just use our SAT-solvers on them." Welcome to a world where P may as well equal NP, but we care about the difference between n log n and quadratic time, or quadratic versus cubic. With that in mind let's consider those functions.

Let g stand for any function on pairs of items that is *antisymmetric*: g(j, i) = −g(i, j) for all i, j.

If we have a rating function R, then there is the simple rating difference function g(i, j) = R(i) − R(j).

If we only have a ranking with ordinal function r, then we can define

- g(i, j) = sgn(r(j) − r(i)), or
- g(i, j) = r(j) − r(i).

Note that these depend only on the function values R(i), R(j), or r(i), r(j) in the case of rankings. We can sacrifice some generality for clarity by assuming this of g. Then we can represent two ratings or rankings x and y having respective functions g_x and g_y by vectors of function values a_{ij} = g_x(i, j) and b_{ij} = g_y(i, j). This further allows us to pretend that "the same" function g is being applied to pairs of values for each ranking, though per above there are really different functions g_x and g_y. There might be some useful generality to gain by using separate functions and noting that no other information about x and y is used, but the form with values a_{ij} and b_{ij} is nicest. Define:

τ(x, y) = Σ_{i<j} a_{ij} b_{ij} / √( Σ_{i<j} a_{ij}² · Σ_{i<j} b_{ij}² ).

The British statistician Maurice Kendall originally defined this for tie-less rankings and g = sgn. Then the denominator is just a sum of 1 over all pairs i < j, so it equals n(n−1)/2, the number of pairs. The numerator has +1 for every pair of items on which x and y agree, and −1 whenever one prefers i and the other prefers j. Hence the numerator is the number of agreements on pairs in the rankings minus the number of disagreements. Thus

- τ achieves its maximum of +1 when x = y;
- τ achieves its minimum of −1 when y equals x reversed, so b_{ij} = −a_{ij} for all i, j.

These properties in fact hold for general a and b: τ is always a number between −1 and +1, by the Cauchy–Schwarz inequality. Kendall also gave this general form. Often τ_a is reserved for the case where the denominator is fixed at n(n−1)/2 even when there are ties, and τ_b is used as above, but let us stick with τ.

For g(i, j) = r(j) − r(i) the Kendall tau simplifies neatly to the correlation measure of Charles Spearman:

ρ = 1 − 6 Σ_i d_i² / (n(n² − 1)),

where d_i is the difference between item i's places in the two rankings.

Showing this equals Σ_{i<j} a_{ij} b_{ij} divided by the square root of Σ_{i<j} a_{ij}² times Σ_{i<j} b_{ij}² is a nice exercise.
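Both the general form and its special cases are easy to compute directly. Here is a small sketch (our own code) checking that the rank-difference choice of g reproduces Spearman's formula while sgn gives the classic tau:

```python
import math

def gen_tau(xs, ys, g):
    """Kendall's general correlation: sum of a_ij*b_ij over i<j divided by
    sqrt(sum a_ij^2 * sum b_ij^2), with a_ij = g(x_i, x_j), b_ij = g(y_i, y_j)."""
    n = len(xs)
    num = sa = sb = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            a, b = g(xs[i], xs[j]), g(ys[i], ys[j])
            num += a * b
            sa += a * a
            sb += b * b
    return num / math.sqrt(sa * sb)

sgn = lambda u, v: (u > v) - (u < v)   # Kendall's original choice
diff = lambda u, v: u - v              # rank differences: Spearman's rho

x = [1, 2, 3, 4, 5]        # one tie-less ranking
y = [2, 1, 4, 3, 5]        # another ranking of the same five items
rho = 1 - 6 * sum((a - b) ** 2 for a, b in zip(x, y)) / (5 * (5 ** 2 - 1))
print(gen_tau(x, y, diff), rho)    # both 0.8
print(gen_tau(x, y, sgn))          # 0.6 = (8 agreements - 2 disagreements)/10
```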

The above formula also shows that ρ is computable in linear time. The case g = sgn looks quadratic but is computable in O(n log n) time by an algorithm discovered by William Knight in 1966. The algorithm first sorts the items by their x-ranks. Then while *merge-sorting* them by their y-ranks it can quickly count the number of y's in pieces being merged that a given y is greater than, and use these counts to tally all pairs on which x and y agree during the sort. This suggests the question, echoed on page 1031 of this 2009 paper:

For which other g is τ computable in O(n log n) time?
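For the sgn case, a sketch of Knight's idea (our implementation): sort by one ranking, then count inversions in the other's order while merge-sorting; the inversions are exactly the discordant pairs.

```python
def count_inversions(a):
    """Return (sorted copy of a, number of pairs i<j with a[i] > a[j])."""
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, x = count_inversions(a[:mid])
    right, y = count_inversions(a[mid:])
    merged, inv, i, j = [], x + y, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            inv += len(left) - i   # right[j] beats all remaining left items
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, inv

def fast_tau(xs, ys):
    """Kendall tau for tie-less data in O(n log n): after sorting by x,
    discordant pairs are exactly the inversions in the y-order."""
    n = len(xs)
    ys_by_x = [y for _, y in sorted(zip(xs, ys))]
    _, discordant = count_inversions(ys_by_x)
    pairs = n * (n - 1) // 2
    return (pairs - 2 * discordant) / pairs

print(fast_tau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # 0.6, as in the quadratic count
```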

What I find even more attractive about τ are properties of *scaling* that can be tuned for other purposes by flexible choice of g.

In a recent post I tabulated how the values and rankings of chess moves given by computer chess *engines* change as their *depth* of search increases, and the strong evident influence these changes have on human choices. We want to compare both the rankings and the values across depths. Noting the common generalization between Spearman's ρ and the original τ, we don't have to make a black-and-white choice between them but instead ask:

What’s the best function for this application?

First we note that both shift invariance and scale invariance can be conferred by certain choices of g. You want the former when x should be considered the same as x + c for any constant c. This is automatic for rankings but is desired for ratings when only differences matter. *Ipso facto*, if g(x_i, x_j) depends only on x_i − x_j then the resulting τ is shift-invariant.

Scale invariance comes in when a 0-to-100 scale is treated the same as a 0-to-10 scale. This is conferred when g is linear, but not only thus. It suffices for g to be *pseudolinear* in the sense that for every constant c > 0 there is a c′ depending only on c such that for all x_i, x_j, g(c·x_i, c·x_j) = c′·g(x_i, x_j). For all such g the tau measure is invariant in both the x and y scales separately: for all constants c, d > 0, τ(c·x, d·y) = τ(x, y).

This is nice because chess programs have their own scale factors. We can compare results from two engines on the same position without having to worry about normalizing these factors.
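As a numerical sanity check, here is a pseudolinear but nonlinear choice of our own, g(u, v) = u² − v² (so c′ = c²), verified to leave tau unchanged under separate rescalings:

```python
import math

def tau(xs, ys, g):
    n = len(xs)
    num = sa = sb = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            a, b = g(xs[i], xs[j]), g(ys[i], ys[j])
            num += a * b
            sa += a * a
            sb += b * b
    return num / math.sqrt(sa * sb)

# antisymmetric and pseudolinear: g(cu, cv) = c^2 * g(u, v)
g = lambda u, v: u * u - v * v

x = [3.0, 1.0, 4.0, 1.5, 9.0]   # made-up engine values on one scale
y = [2.0, 7.0, 1.0, 8.0, 2.5]   # made-up values on a different scale
t = tau(x, y, g)
# rescaling x by 10 and y by 0.25 leaves tau essentially unchanged
t2 = tau([10 * u for u in x], [0.25 * v for v in y], g)
print(abs(t - t2) < 1e-9)  # True
```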

The most important desire, however, is common to the search-engine application: *We want only the figures on good moves to matter*. Here finally is where we cannot be satisfied with rankings and need ratings. If all but 3 moves in a position are bad, it doesn’t matter much which move an engine ranks 4th. Likewise, when only a few search hits matter we can ignore the rest—and ignore items ranked outside the first few dozen on both search engines anyway.

If we compare not the raw engine values but rather their differences from the optimal value, then high values are bad (moreover, this takes care of shift invariance before choosing g). A function with the product x_i x_j in its denominator makes g small when either x_i or x_j is big—so any comparison involving a bad move is dampened. This is OK if the engines are expected to agree on moves that are bad, and we want to emphasize their selectivity among the moves that they all recognize as reasonably good. For further purposes one might choose a higher power in the denominator or implement other considerations, such as how g magnifies close numerical agreements.

In my paper with Tamal Biswas we used our tau measure to define the *complexity* of a chess position by how much the ratings of moves jump about at successive depths. From this we derive measures of *difficulty* and *discrimination* of chess positions viewed as aptitude-test items.

We have given a flyover of issues and mathematical tools and problems involving rankings and ratings. We’ve noted some remarkable properties of the Kendall tau distance—there are more we could discuss in regard to independence and correlation—and have suggested some further ways to apply it. Can a more unified treatment of rankings and ratings help in progressing from the former to the latter?

Here is a wild question. Scale invariance effectively makes every vector a unit vector. Inner products of complex unit vectors range from −1 to +1, though they take complex values in-"between." Can Kendall's τ be related to an inner product or even be made to "work like" one? It is interpreted as a difference of probabilities, whereas in quantum mechanics the absolute values of certain inner products become square roots of probabilities.

[slight word changes in last main section]

—A myth of a myth of a myth?


*A simple idea that everyone missed, and more?*


Joshua Miller and Adam Sanjurjo (MS) have hit on a simple yet striking insight about the so-called hot hand fallacy.

Today Ken and I want to discuss their insight, suggest an alternate fix, and reflect on what it means for research more broadly.

I believe what is so important about their result is: It uses the most elementary of arguments from probability theory, yet seems to shed new light on a well-studied, much-argued problem. This achievement is wonderful, and gives me hope that there are still simple insights out there, simple insights that we all have missed—yet insights that could dramatically change the way we look at some of our most important open problems.

I would like to point out that I first saw their paper mentioned in the technical journal WSJ—the Wall Street Journal.

Our friends at Wikipedia make it very clear that they believe there is no hot-hand at all:

The hot-hand fallacy (also known as the “hot hand phenomenon” or “hot hand”) is the fallacious belief that a person who has experienced success with a random event has a greater chance of further success in additional attempts. The concept has been applied to gambling and sports, such as basketball.

The fallacy started as a serious study back in 1985, when Thomas Gilovich, Amos Tversky, and Robert Vallone (GTV) published a paper titled "The Hot Hand in Basketball: On the Misperception of Random Sequences" (see also this followup). In many sports, especially basketball, players often seem to get *hot*. For example, an NBA player who normally shoots under 50% from the field may make five shots in a row in a short time. This suggests that he is "in the zone" or has a "hot hand." His teammates may pass him the ball more often in expectation of his staying "hot."

Yet GTV and many subsequent studies say that this is wrong. They seem to show rather that shooting is essentially a series of independent events: that each shot is independent from previous ones, and that this is equally true of free throws. Of course given the nature of this problem, there is no way to *prove* mathematically that hot hands do not exist. There are now many papers, websites, and a continuing debate about whether hot hands are real or not.

Let me start by saying that Ken and I are mostly on the fence about the fallacy. I can imagine many situations that make it mathematically possible. For one, what if a player in basketball is not doing independent trials but rather executing a Markov process? That is, of course, a fancy way of saying that the player has "state." Over the long term they will make their free throws at about their usual percentage, but there are times when they will shoot at a much higher percentage. And, of course, times when they will shoot at a much lower percentage—a "cold hand" is more obvious with free throws and may lead the other team to choose to foul that player.

I personally once experienced something that seems to suggest that there is a hot hand. I was a poor player of stick-ball when I was young, but one day played like the best in the neighborhood. Oh well. Let me explain this in more detail another day.

Ken chimes in: This folds into a larger question that is not fallacious, and that concerns me as I monitor chess tournaments:

What is the time or game unit of being "in form"?

In chess I think the *tournament* more than *game* or *move* is the “unit of form” though I am still a long way from rigorously testing this. In baseball the saying is, basically, “Momentum stops with tomorrow’s opposing starting pitcher.”

Nevertheless, I (Ken) wonder how far “hot hand” has been tested in *fantasy baseball*. In my leagues on Yahoo! one can exchange players at any time. Last month I dropped the slumping Carlos Gomez in two leagues even though Yahoo! still ranked him the 8th-best player by potential. In one league I picked up Jake Marisnick, another Houston outfielder who had been hot. Marisnick soon went cold with the bat but stole 6 bases to stay within the top 100 performers anyway. The leagues ended with the regular season, but had they continued into the playoffs I definitely would have picked up Houston’s Colby Rasmus after 3 homers in 3 games. While I’ve been writing the last three main sections of this post, Rasmus hit another homer today.

MS have a new paper with the long title, “Surprised by the Gambler’s and Hot Hand Fallacies? A Truth in the Law of Small Numbers.” Their paper is interesting enough that it has already stimulated another paper by Yosef Rinott and Maya Bar-Hillel, who do a great job of explaining what is going on. The statistician and social science author Andrew Gelman explained the basic point in even simpler terms on his blog in July, and we will follow his lead.

For the simplest case consider sequences of n coin tosses. We will generate lots of these sequences at random. We will look at times the coin comes up heads (H) and say it is "hot" if the next throw is also H. If the sequence is all tails, however:

TTTT…T

then we must discard it since there is no H to look at. This is our first hint of a lurking *bias*. Likewise if only the last throw is heads,

TTT…TH,

there is no chance for a hot hand since the “game is over.” Hence it seems natural to use the following sampling procedure:

- Generate a sequence x of n coin flips at random.
- If it has no head in the first n − 1 places, discard it.
- Else pick a head at random from those places. That is, pick i ≤ n − 1 such that x_i = H.
- Score a "hit" if x_{i+1} = H, else a "miss."

Let *HHS* be the number of hits after N trials—not counting discarded ones—divided by N. That is our "hot hand score." We expect *HHS* to converge to 1/2 as N gets large. Indeed, there seems a simple way to prove it: Consider any head x_i that we chose in step 3. The next flip is equally likely to be H or T. If it is H we score a hit, else a miss. Hence our expected hit ratio will be 50%. Q.E.D. And wrong.

To see why, consider n = 3. The sequences TTT and TTH are discarded, so we can skip them. Here are the other six sequences and their expected contribution to our hit score. Remember only a head in the first two places is selected.

- THT → 0 and THH → 1
- HTT → 0 and HTH → 0
- HHT → 1/2 (two heads to choose from, one followed by H and one by T)
- HHH → 1

This gives *HHS* = 5/12, **not** 1/2. What happened?

What happened is that the heads are not truly chosen uniformly at random. Look again at the six rows. There are 8 heads total in the first two characters. Choose one of them at random over the whole data set. Then you see 4 heads are followed by H and the other 4 by T. So we get the expected 50%. The flaw in our previous sampling is that it was *shortchanging* the two heads-rich sequences, weighting each of their heads 1/12 instead of 1/8. This is the bias.

You might think the bias would quickly lessen as the sequence length n grows but that too is wrong. Try n = 4. You get the weird fraction 17/42. This equals about 0.405, which has gone *down*. For n = 5 we get 49/120, about 0.408. As n grows the expectation does come back up toward 1/2, but for the number of flips that is a typical high-end for shots by a player in an NBA game, it is still noticeably under 1/2. Rinott and Bar-Hillel derive a closed formula for the expectation in the case with probability p of heads, q = 1 − p of tails, and k = 1.
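These small-n values can be confirmed by exhaustive enumeration; a sketch (function name ours) using exact fractions:

```python
from fractions import Fraction
from itertools import product

def hot_hand_expectation(n, k=1):
    """Exact expected 'hot hand score': over all 2^n head/tail sequences,
    pick uniformly a run of k heads ending by position n-1 and test whether
    the next flip is heads; sequences with no eligible run are discarded."""
    total, kept = Fraction(0), 0
    for seq in product("HT", repeat=n):
        picks = [i for i in range(n - k)
                 if all(seq[i + j] == "H" for j in range(k))]
        if not picks:
            continue
        kept += 1
        hits = sum(1 for i in picks if seq[i + k] == "H")
        total += Fraction(hits, len(picks))
    return total / kept

print(hot_hand_expectation(3))     # 5/12
print(hot_hand_expectation(4))     # 17/42
print(hot_hand_expectation(5))     # 49/120
print(hot_hand_expectation(4, 2))  # 5/12 again, now for runs of k = 2 heads
```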

The bias persists if we condition on k heads in a row. "Appendix B" in the MS paper attempts to find a formula like that above for general k but has to settle on partial results that imply growth with k among other things. We can give some concrete numbers: For n = 4 and k = 2 the chance of seeing heads next is the same 5/12. For n = 5 and k = 2 it has dipped further, with 17 of 32 sequences being thrown out. The larger point, however, is that these numbers represent the expectation for any **one** sequence that you generate, provided you follow the policy of selecting at random from all the runs of k heads in x through place n − 1 and predicate on the next place being H.

Did GTV really follow this sampling strategy? It seems yes. It comes about if you follow the letter of conditioning on the event that the previous k flips were heads. Rinott and Bar-Hillel quote words to that effect from the GTV paper, ones also saying they compared the probability conditioned on the last k shots being *misses*. The gist of what they and MS are saying is:

If you did your hot-hand study as above and got 50%, then chances are the process from which you drew the data really was “hot.”

And since the way of conditioning on *misses* has a bias toward values above 1/2, if you got 50% again then your data might really also show a "Cold Hand." It should be noted, however, that the GTV study of free throws is not affected by this issue and indeed already abides by our 'grid' suggestion below—see also the discussion by MS beginning at page 10 here.

MS give some other explanations of the bias, one roughly as follows: If a sequence starts with k heads and they are followed by T, then you had no choice but to select those heads. But if they are followed by another H, then we do have a choice: we could select the later heads instead. This has a chance of failure, so the positive case is not as rich as the negative one. Getting positive results—more heads—also gives more ways to do selections that yield negative results.

This suggests a fix, but the fix doesn't work: Let us only condition on runs of heads that come after a tail or start at the beginning of the game. The above issue goes away, but now there is a bias in the *other* direction. It shows first with n = 4 and k = 1; we have grouped sequences differing on the last bit:

- TTHT → 0 and TTHH → 1
- THTT → 0 and THTH → 0
- THHT → 1 and THHH → 1
- HTTT → 0 and HTTH → 0
- HTHT → 0 and HTHH → 1/2
- HHTT → 1 and HHTH → 1
- HHHT → 1 and HHHH → 1

This gives 15/28, which is above 1/2. The main culprit again is the selection within the sequence HTHH, with the discarding of TTTH also a factor.
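The same kind of enumeration confirms this reversed bias (a sketch, names ours):

```python
from fractions import Fraction
from itertools import product

def after_tail_expectation(n):
    """Condition only on heads that start the game or follow a tail,
    still choosing uniformly among such heads within each sequence."""
    total, kept = Fraction(0), 0
    for seq in product("HT", repeat=n):
        picks = [i for i in range(n - 1)
                 if seq[i] == "H" and (i == 0 or seq[i - 1] == "T")]
        if not picks:
            continue
        kept += 1
        hits = sum(1 for i in picks if seq[i + 1] == "H")
        total += Fraction(hits, len(picks))
    return total / kept

print(after_tail_expectation(3))  # 1/2 -- no bias yet at n = 3
print(after_tail_expectation(4))  # 15/28 -- now biased upward
```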

This train of thought led me (Ken) to an ironclad fix, different from a test combining conditionals by MS for which they refer to their 2014 working paper. It simply allows only one selection from each non-discarded sequence, namely testing the bit after the first k consecutive heads. The line HTHH then gives 0 and makes the whole table 1/2 overall.

That this is unbiased is easy to prove: Given x in the set of non-discarded sequences, define f(x) to be the sequence obtained by flipping the bit after the first k consecutive heads. This is invertible—f(f(x)) gives x back again—and it flips the result. Hence the next-bit expectation is 1/2. This works similarly for any probability p of heads and does not care whether the overall sequence length is kept constant between samples. Hence our *proposal* is:

Break up the data into subsequences S_1, S_2, … of any desired lengths n_1, n_2, …. The n_j may depend on the lengths of the given sequences (e.g., how many shots each player surveyed took) but of course not on the data bits themselves. For every S_j that has a run of k consecutive "hits" in the first n_j − 1 places, choose the first such run and test whether the next bit is "hit" or "miss."

The downsides to putting this kind of grid on the data are sacrificing some samples—that is, discarding runs of hits that cross gridlines or come second in an S_j—and the arbitrary nature of choosing rules for the lengths n_j. But it eliminates the bias. We can also *re-sample* by randomly choosing other grid divisions. The samples would be correlated but estimation of the means would be valid, and now every run of k hits might be used as a condition. The re-sampling fixes the original bias in choosing every such run singly as the condition.
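An exhaustive check of the one-selection rule behind this proposal, testing only the flip after the first run of k heads, shows it is exactly unbiased:

```python
from fractions import Fraction
from itertools import product

def first_run_expectation(n, k=1):
    """One sample per sequence: test only the flip right after the FIRST
    run of k consecutive heads that finishes by position n-1."""
    hits = kept = 0
    for seq in product("HT", repeat=n):
        first = next((i for i in range(n - k)
                      if all(seq[i + j] == "H" for j in range(k))), None)
        if first is None:
            continue
        kept += 1
        hits += seq[first + k] == "H"
    return Fraction(hits, kept)

for n, k in [(3, 1), (4, 1), (5, 2), (6, 3)]:
    print(n, k, first_run_expectation(n, k))  # always exactly 1/2
```

Flipping the bit after the first run does not move the first run, so the proof's involution pairs off hits and misses perfectly.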

A larger lesson is that hidden bias might be rooted out by clarifying the *algorithms* by which data is collated. Researchers count as part of the public addressed in articles on how not to go astray such as this and this, both of which highlight pitfalls of conditional probability.

Has our grid idea been tried? Does it show a “hot hand”?

**Update (10/18):** The idea is apparently new, though as we admit in the post it is “lossy.” Meanwhile this has been covered in the Review section of the Sunday 10/18 New York Times, which in turn references and gives a chart from this post by Steven Landsburg on his “Big Questions” blog.


Christopher Chabris just wrote a wonderful piece on cheating titled “High-Tech Chess Cheaters Charge Ahead.” Chabris is a research psychologist who is well known for his book *The Invisible Gorilla*, written with Daniel Simons.

Today I want to point out that the piece is in this Saturday's review section of the Wall Street Journal.

Often this section is reserved for commentary on politics or the economy or some other issue of the week. But this week our own Ken Regan is featured as the expert on chess cheating. Wonderful. Here is a short part of the article—see the article on-line for the rest.

Ken Regan, an international chess master and computer scientist, has developed a software tool that automates the process of comparing human and computer moves, and flags suspicious cases. The approach is sophisticated: It doesn’t suggest that, say, current world champion Magnus Carlsen is a cheater just because his moves often match those of a computer. That’s to be expected. Mr. Regan instead finds cases in which players matched computer moves much more often than expected, given their skill levels and the situations on the board.

Read not just the article but some of the comments. One offers a way to try and stop the problem:

I’m no electronics expert, but wouldn’t it be possible for a location holding a major tournament to install some sort of jamming device that would interfere with signals?

Jamming is usually illegal, and I do not think it would solve the problem anyway. But it does raise **the** problem: is there some way to stop chess cheating? For now we have Ken to thank for at least a post-facto method of detecting cheaters.

Again congratulations to Ken.
