Cricket, 400, and Complexity Theory
A random cross-section of our first 400 posts
Brian Lara remains the only cricket batsman to score 400 or more runs in one innings of a Test Match. In fact he scored exactly 400 not out, meaning in this case that he could have scored even more runs had his team not declared their innings closed. Lara was born on the island of Trinidad, and played a long and storied career for the West Indies cricket team. Test Matches, which may span 5 days, are considered the highest form of cricket, and only nine nations plus the confederation of fifteen West Indies countries are entitled to play them. Scoring 50 in a Test is notable, a century of 100 is the gold standard, 200 is platinum, and 300+ has happened only 25 times, including Lara’s 375 and 400.
Today we make our 400th post on this blog, and randomly reflect on the innings played thus far—which kinds of posts we have had, and what kinds readers may wish to see.
Winning a Test Match requires getting ten of the other side’s eleven batsmen out in both innings. It doesn’t matter how many runs one side is ahead—if the 5 days expire without collecting 20 of the other side’s wickets, the match ends in a draw. Thus the West Indies voluntarily closed their first innings as soon as Lara reached 400 with a team score of 751, because it was already the middle of Day 3. They collected 10 wickets to end England’s first innings at 285, and their huge lead invoked the follow-on rule forcing England to bat first in the second innings. However, England lasted out the final day losing only 5 wickets to hold the draw.
Lara came out of retirement in 2010 at age 41 to play Twenty20 cricket. This is the cricket analogue of Rapid chess, giving each side just one innings of 20 six-ball overs, whereas a Test typically sees 90 overs per day. The whole match usually finishes in about 3-1/2 hours, and the batsmen play in exciting allegro fashion because there is less worry of losing ten wickets. The team with more runs wins even if the other side loses no wickets, and tiebreak rules ensure there are no draws. Lara scored 65 runs in his Twenty20 debut, but not quickly enough to pass the other team's score.
This blog scored a triple of something else just yesterday: 3 million reads since inception in February 2009. This works out to just under 2,500 reads per day, and just over 100 per hour. None of those reads, however, have come from China since October 2011, when wordpress.com was blocked wholesale according to this list.
We also have 10,939 officially logged comments, for which we thank all who participate. Maybe about 1,000 of those shouldn’t count, since our own cross-reference links to others of our posts count as comments. Hence we could be near a round number of “true” comments too.
Each post also has an official word count. For instance, Wednesday’s quantum-debate post has 2,270 words. It is longer than usual; our typical target is 3-to-5 LaTeX pages at 350–400 words per page. There seems, however, to be no convenient way to add up the counts for all the posts—the widgets we find online seem to apply only to the sister service wordpress.ORG for self-hosted blogs. We were curious whether we were getting anywhere near one million total words, and how much longer that might take.
When you don’t want to scroll manually through 399 posts, what should a computer scientist do? Sample. Use random or pseudo-random bits to generate, say, 40 numbers in the range 1 to 399; then find the corresponding posts and add up the words. Simple.
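The plan above can be sketched in a few lines of Python (a minimal sketch, not Ken's actual script; the word counts here are placeholders, since in practice each sampled post must be looked up by hand):

```python
import random

TOTAL_POSTS = 399   # posts published before this 400th one
SAMPLE_SIZE = 40

# Draw 40 distinct post numbers uniformly from 1..399.
sample = random.sample(range(1, TOTAL_POSTS + 1), SAMPLE_SIZE)

def estimate_total_words(word_counts):
    """Scale the sample's mean word count up to all 399 posts."""
    mean = sum(word_counts) / len(word_counts)
    return round(mean * TOTAL_POSTS)

# Hypothetical illustration: if the 40 sampled posts averaged about
# 1,600 words, the whole blog would sit near 638,000 words.
fake_counts = [1600] * SAMPLE_SIZE
print(estimate_total_words(fake_counts))
```

Note that `random.sample` draws without replacement, which sidesteps the duplicate-draws issue discussed next.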
Well, not so simple. Our WordPress dashboard does not index the posts 1–399. They have URLs for editing that include numbers currently up to 8,910, but these seem to be irregularly spaced. Instead the master list of posts gives pages of 20 posts each—and we have 20 such pages, with page 1 most recent and page 20 currently holding the chronologically first 19 (not 20) posts.
Ken modified an online example Perl script using the rand function to compute also the corresponding page number plus index on that page, where the topmost post listed on each page has the highest post number (being most recent) but gets index 1. He made the mistake of allowing numbers to repeat. A rough Birthday Paradox calculation for 40 draws would expect about one duplicate, and that is exactly what Ken got: items 11 and 39. If you have a WordPress blog and wish to use and/or fix his little script, it is here. After looking up the word counts for the 39-plus-one-duplicate posts he selected, Ken got this output:
Twenty Random Posts
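Ken's script was in Perl, but the page-and-index arithmetic it performs can be sketched in Python (post 399 being the newest, with the dashboard layout of 20 posts per page, newest first, as described above):

```python
def locate(post_num, total=399, per_page=20):
    """Map a chronological post number (1 = oldest, total = newest)
    to (page, index) on the dashboard, where page 1 is the most recent
    page and the newest post on each page gets index 1."""
    rank_from_newest = total + 1 - post_num
    page = (rank_from_newest - 1) // per_page + 1
    index = (rank_from_newest - 1) % per_page + 1
    return page, index

# The newest post tops page 1; the very first post is the 19th and
# last entry on page 20, matching the dashboard described above.
print(locate(399), locate(1))
```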
Ken noticed that the 23rd post he generated was An Educational Extinction Event, which recalled the current unfortunate case of the ouster of Teresa Sullivan as president of the University of Virginia. So he backtracked to 21 and took down also the links of the latter 20 posts in his sample. Here is a random sample of the sample of the posts, along with excerpts—you may wish to look at some of these, especially if you missed them the first time:
Happy Mother’s Day: Jennifer Lipton-O’Connor is not a computer scientist, she is a PhD in psychology. She is also my daughter, to whom I wish a happy mom’s day.
A story about Jennifer. She may have invented, when she was about ten years old, a novel type of diagonalization argument. One day in May, I was driving her and her older sister, Andrea, to the mall. Andrea was twelve at the time. Jennifer asked if I could take her to Six Flags, a great amusement park, sometime over the summer. Andrea, who was about to go away to camp for the whole summer, immediately complained. Andrea said that it would not be fair if we went when she was away. Andrea loved, and still does, roller-coasters, so she really wanted to go too. I answered something non-committal to both of them. But Andrea was clearly upset. Jennifer finally turned to Andrea and said,
Andrea, would it be okay if we went, but we did not tell you?
I almost lost control of the car laughing. Andrea, who was never at a loss for words, just looked at her sister.
Rabin’s 80th Birthday: Michael Rabin is brilliant.
His work has touched large areas of mathematics as well as large areas of computer theory. Without his work theory would be completely different today. Michael’s work is an incredible mine with three rich veins. There are deep, hard theorems—for instance his work on the second-order theory of two successors. There is foundation laying—for instance his work on finite automata with Dana Scott. And there is practical work—for example his beautiful string matching algorithm with Richard Karp, and much more. His concept of oblivious transfer is a bedrock primitive in secure computation. Not surprisingly he has been honored with many prizes, including the Turing Award, the Israel Prize, the Emet Prize, the Harvey Prize, the Dan David Prize, and others.
Theorems Are Forever: A Great One From 1492 (Make that 1942):
The paper of Salem and Spencer is titled: On Sets Of Integers Which Contain No Three Terms In Arithmetical Progression. The title kind-of gives away the whole result. This was probably more important in 1942, since a paper with a title like: A Result On Patterns In Numbers would be hard to find. Recall there was no search inside papers in those days. Nowadays we can use any title at all and people can still search and find your result. They can even search to see if you’ve worked on a particular equation. But then a good title was worth many citations.
Surprises In Mathematics And Theory: The geometer Karol Borsuk asked in 1932: Can every convex body in d-dimensional space be cut into d+1 pieces of smaller diameter? This became the Borsuk Conjecture, which was proved in low dimensions and for special classes of convex bodies—for example, Hugo Hadwiger proved in 1946 that it was true for all smooth convex bodies. The intuition seemed to be that the result was going to eventually be proved; it was just a matter of time.
The shock, I think more than a surprise, came when Jeff Kahn and Gil Kalai proved in 1993 that for large enough d the answer was not d+1. On the contrary, they showed that for large enough d, we need at least
c^√d
pieces. Notice that this is a lower bound, and does not imply that c^√d pieces are enough. Here c > 1 is a fixed constant. I think it is fair to say that c^√d is really different from d+1, no? Their paper is short and brilliant: you definitely should take a look at it.
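To get a feel for how different these two quantities are, here is a quick numeric check. It uses c = 1.2, the constant commonly quoted for the Kahn–Kalai bound; the exact value is an assumption here, since only c > 1 matters for the point:

```python
import math

def kahn_kalai_lower_bound(d, c=1.2):
    # Lower bound c^sqrt(d) on the number of pieces needed in dimension d
    return c ** math.sqrt(d)

# In dimension 100 the bound is still tiny compared with d + 1, but by
# dimension 10,000 it dwarfs d + 1 by several orders of magnitude.
for d in (100, 10_000):
    print(d, kahn_kalai_lower_bound(d), d + 1)
```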
O Abuse: The problem with O-notation is simple. Suppose that we prove that some algorithm runs in time O(n), but with a huge hidden constant, say 10^6 · n. Is this better or worse than another algorithm that runs in time n^2? It depends. In the long run, clearly, the first algorithm will be faster. The economist John Maynard Keynes once said: In the long run, we're all dead. So the fact that 10^6 · n is eventually better may or may not matter.
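Keynes's quip can be made concrete. Taking, purely for illustration, the two running times 10^6 · n and n^2, the asymptotically better algorithm only wins past a crossover point:

```python
def t_linear(n):
    # Linear-time algorithm with a large hidden constant
    return 10**6 * n

def t_quadratic(n):
    # Quadratic-time algorithm with constant 1
    return n * n

# t_linear(n) < t_quadratic(n) exactly when n > 10^6, so for every
# input smaller than a million the "worse" algorithm is faster.
print(t_linear(1000) > t_quadratic(1000))    # quadratic wins at n = 1000
print(t_linear(10**7) < t_quadratic(10**7))  # linear wins at n = 10^7
```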
SAT Is Not Too Easy: SAT is of course the problem of determining whether a set of clauses has a satisfying truth assignment, and is the original NP-complete problem. Yet what we know about the complexity of SAT is pretty slim. The conventional wisdom, of course, is that SAT cannot be recognized by any deterministic polynomial time Turing machine. What we believe is vastly different from what we can prove. The best results look like this:
Theorem: No deterministic Turing machine accepts SAT in time n^c and space n^{o(1)}, where c > 1 is a fixed constant.
Here c is unfortunately small: the first such result was due to Lance Fortnow. The current one, with c up to 2cos(π/7) ≈ 1.80, is due to Ryan Williams.
Drug Doping And Mathematics: Suppose there was a new wonder drug, let’s call it T, that for a short time tremendously increases your ability to concentrate and therefore to solve mathematical problems. Of course nothing is free; the drug would have bad side effects and health risks. The question is: Would you take the drug T?
Would you take the drug and use it to solve P=NP? Or to solve other problems? What if the health risks were high? Would that change your mind? What if you needed to be on the drug for months while you were working on the proof, would that change your answer?
Fast Matrix Products And Other Amazing Results: Strassen's 1969 result Gaussian Elimination is not Optimal shows, of course, a way to multiply two matrices in subcubic time. This is certainly an amazing result—it came as a shock to everyone that the classic method was not optimal. Strassen's fast matrix algorithm launched a thousand results; today many of the best algorithms for X are obtained by reducing X to matrix products. As a consequence, it is quite common to see algorithms which run in time O(n^ω), for instance, where ω is the exponent of the fastest known matrix multiplication algorithm.
The current best matrix product algorithm is now due to Don Coppersmith and Shmuel Winograd—wrong: see this for the new result of Virginia Williams. I must mention the brilliant work of Henry Cohn, Robert Kleinberg, Balázs Szegedy and Christopher Umans, who can reduce the search for faster matrix algorithms to certain group theory questions. I will discuss their work another time.
There is a story—which I believe is true—about how Strassen found his terrific algorithm. He was trying to prove that the naive cubic algorithm is optimal. A natural idea was to try to prove first the case of two by two matrices—one by one matrices are trivial. One day he suddenly realized that if there were a way to multiply two by two matrices with seven multiplications rather than eight, there would be a subcubic recursive method for n by n matrices. He immediately sat down and quickly discovered his now famous formulas. At least this is the story.
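Those seven products, and the recursion they enable, can be sketched as follows. This is a plain-Python version for power-of-two sizes using the standard Strassen formulas, meant to show the structure rather than to be fast:

```python
import math

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(A):
    # Quarter an n-by-n matrix (n even) into four n/2-by-n/2 blocks.
    h = len(A) // 2
    return ([row[:h] for row in A[:h]], [row[h:] for row in A[:h]],
            [row[:h] for row in A[h:]], [row[h:] for row in A[h:]])

def strassen(A, B):
    """Multiply two n-by-n matrices, n a power of two, using seven
    recursive products instead of eight per halving step."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    return ([l + r for l, r in zip(C11, C12)] +
            [l + r for l, r in zip(C21, C22)])

# Seven multiplications per halving step gives exponent log2(7) ~ 2.807.
print(round(math.log2(7), 3))
```

Seven recursive calls on half-size matrices yield the recurrence T(n) = 7·T(n/2) + O(n^2), hence the subcubic bound O(n^{log2 7}).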
Update: Ken has listed the other 11 posts not mentioned in this section in a comment below.
Brian Lara also holds the record for the highest score in an innings of first-class cricket, which includes professional league matches within a country. He scored 501 runs, again not out. What shall we do while aiming for that total?
Thanks to all for the past years' support. We would love to hear what types of discussions you would most like to see in our next century.
[enlarged photo, fixed typo in table, some spot-adds, added comment linking rest of sample]