The Role Of Memory In Theory
Is rote learning useful?
Garry Kasparov retired from top-flight chess six-and-a-half years ago, founding a Russian political party called the United Civil Front. This is part of a larger coalition opposed to Prime Minister Vladimir Putin called The Other Russia, whose English-language website is managed by chess commentator Michael “Mig” Greengard. We often see chess used as a metaphor by politicians and cartoonists, but this is the opposite. Kasparov wrote with Greengard a book explaining parallels from chess to life.
Today I want to ask a question posed by Dick Karp. It came about after Michael Rabin told a Kasparov story over lunch, on the last day of the Rabin Celebration a month ago.
Despite his retirement Kasparov still plays exhibition matches against both former and upcoming stars, and “simultaneous exhibitions” against 30-40-or so lesser players. He is set to play an 8-game match of “blitz chess” against his former world championship challenger Nigel Short this coming Sunday. Kasparov gave a 30-board “simul” during festivities at Tel Aviv University last year where Rabin received a Dan David Prize shared with Leonard Kleinrock and Gordon Moore. Later that evening Michael asked Kasparov how he evaluated each board. Michael wanted to know:
Did he re-evaluate each board from scratch each time he returned or did he recall the board’s position and continue his previous thread?
In computer talk: did he have state?
Kasparov looked a bit surprised, Michael recalls, and said of course he remembered the board position, and of course he did not have to recalculate from scratch. He said further that when he was young he could recall all the moves of all the games he had played for the last six months. At the the time of his visit to Michael he was “older” and he could only recall the games he played that week. But recalling the boards during the simultaneous match was still quite easy for him.
As a “wood pusher” unlike my partner Ken, I am beyond impressed. Clearly memory plays a major role in chess—top players usually know many openings, and many moves into each opening. But to recall that many games, for that length of time, seems quite impressive to me. Well Ken says at age 12 he was able to recall all moves of all tournament games he had played since age 10, but he lost that ability by age 14.
The question that Karp asked Michael and the rest of us at the table was: did memory, even rote memory, play a major role in success at mathematics and theory? He answered his own question quickly—I am giving the gist in third-person: He had thought for a long time that “rote” memory was not very important, but he had changed his mind and started to believe that it played a bigger role in problem solving than he had thought. This lead to a lively discussion at the table, as you can imagine. Most took the position that rote memory especially was not useful, that one needed to understand the concepts. But others besides Karp—myself, for example—felt that perhaps while rote memory usually has a negative connotation, perhaps it is underrated.
A Old Example
All this made me recall a theorem that Dick and I proved years earlier about the cover time of graphs. This is the result of Romas Aleliunas, Richard Karp, Richard Lipton, Laszlo Lovasz, and Charles Rackoff—often refered to as Aleliunas et al. The short version is that Dick came back to Berekely from Toronto and told me the question over lunch—lunch seems to be a good time for Karp to raise questions. There he stated the conjecture that the others had made: a random walk on an undirected graph would quickly cover all the vertices in expected polynomial time.
I have to say that I liked the question immediately. Dick and I continued talking about it in his office right after lunch, where we reduced a proof to a simple fact about Markov chains. We needed to know the expected return time of a certain random walk. Unfortunately Dick had to go teach class, so I told him I would search the library, which was downstairs from his office, while he was in class. I hoped that I could find a formula for the expected return time.
Recall this was in 1979 and there was no Google, no search engines, no Internet, so one actually looked things up by hand. I went into the math library and began a quick linear search of all journals that seemed like they had anything to do with random walks. After about an hour I found the formula we were looking for: the expected return time was just the reciprocal of the probability of being in that state.
As soon as Dick returned from his class I showed him the formula. We both were a bit embarrassed, since the formula was easy to prove. But worse, it was something that we had no doubt seen before, but forgotten. If we’d had better memory we would have proved the theorem instantly. In any event the conjecture became a theorem, and we all wrote up a paper together.
I have no idea whether this story was part of the reason Dick raised the question of rote memory. But I always wondered what would have happened if we had not been able to find the simple formula? It was the type that once one has it, it is easy to prove. But without the formula it seems hard to figure out what is happening with the random walk.
This was long ago. Maybe today young researchers are much better than we were in the understanding of Markov processes, but back then it was hard to prove this theorem. Oh well.
Richard Feynman sharply criticized this type of learning. A quote from our friends at Wikipedia is relevant:
Rote learning is sometimes disparaged with the derogative terms parrot fashion, regurgitation, cramming, or mugging because one who engages in rote learning may give the wrong impression of having understood what they have written or said. It is strongly discouraged by many new curriculum standards. For example, science and mathematics standards in the United States specifically emphasize the importance of deep understanding over the mere recall of facts, which is seen to be less important, although advocates of traditional education have criticized the new American standards as slighting learning basic facts and elementary arithmetic, and replacing content with process-based skills.
Some have a different take. See this, for example. Larry Sanger says:
Whenever I encounter yet another instance of educationists’ [I’m guessing that is not a term of endearment] arguments against “memorizing,” the following rather abstract yet simple thought springs into my philosopher’s mind: Surely the only way to know something is to have memorized it. How can I be said to know something that I do not remember? So being opposed to memorizing has always sounded to me like being opposed to knowledge. I realize this argument likely seems glib. The thing educationists object to, of course, is not the remembering or even the memorizing but rather the memorizing by rote—that is, by dull repetition and often without experience or understanding.
In complexity theory one must learn many facts that I cannot see how to know except by memorization. Just think of the the definitions of the almost countless different complexity classes: there is no way to know what is unless you have memorized it. That does not even cover the also almost countless theorems and lemmas that relate these classes.
Ken notes that Garry Kasparov’s female counterpart of the 1980’s into the 1990’s, former women’s world champion Susan Polgar, may have helped find the particular kind of rote memory used. At least for chess, if not for theory, it is the same area of the brain used for face recognition. This is the argument of Part 1 of the National Geographic documentary My Brilliant Brain, which centers on Susan. That is, to a chess player:
Perhaps we each have our own “Facebook” of familiar theorems, techniques, and complexity classes? This may mean that the act of solving already-solved problems as training may be more important for theorists than we have expected.
Susan Polgar, incidentally, is also retired from active play, but runs an extremely popular chess blog, to which Ken has sometimes contributed. She routinely posts 10–15 items per day, which is like playing 40 games at once over the same 3-4 day period that elapses between our items. She is highly active in chess education from young school levels up through university, and her Susan Polgar Institute for Chess Excellence is hosted by Texas Tech University.
Should we stress more memorization in teaching and learning math and in particular complexity theory?
[removed documentary credit to BBC]