
[ Photo courtesy of Kyoto University ] 
Shinichi Mochizuki is about to have his proof of the ABC conjecture published in a journal. The proof needs more than a ream of paper—that is, it is over 500 pages long.
Today I thought we would discuss his claimed proof of this famous conjecture.
The decision to publish is also discussed in an article in Nature. Some of the discussion we have seen elsewhere has been about personal factors. We will just comment briefly on the problem, the proof, and how to tell if a proof has problems.
Number theory is hard because addition and multiplication do not play well together. Adding numbers is not too complex by itself; multiplication by itself is also not too hard. For those into formal logic, the theory of addition alone, for example, is decidable. So in principle there is no hard problem that only uses addition. None. A similar point holds for multiplication.
But together, addition and multiplication are hard. Of course Kurt Gödel proved that the formal theory of arithmetic is hard. It is not complete, for example. There must be statements about addition and multiplication that are unprovable in Peano Arithmetic.
The ABC conjecture states a property that sits between addition and multiplication. Suppose that

a + b = c

for some integers a, b > 1. Then

c ≤ ab

is trivial. The ABC conjecture says that one can do better and get

c ≤ f(a, b)

for a function f that is sometimes much smaller than ab. The function f depends not on the size of a and b but on the multiplicative structure of abc. That is, the function depends on the multiplicative structure of the integers. Note, the bound

c ≤ ab

only needed that a and b were numbers larger than 1. The stronger bound

c ≤ f(a, b)

relies essentially on the finer structure of the integers.
Roughly f operates as follows: Compute all the primes that divide abc. Let r be the product of all these primes—this is the radical rad(abc). Then f(a, b) = r works:

c ≤ r.

The key point is: Even if 2^100, for example, divides abc, we only include 2 in the product r. This is where the savings all comes from. This is why the ABC conjecture is hard: repeated factors are thrown away.
Well, not exactly: there is a constant missing here. The bound is

c ≤ K · r^2,

where K is a universal constant. We can replace the exponent 2 by any smaller number of the form 1 + ε with ε > 0, at the cost of letting K depend on ε—the precise statement can be found here. This is the ABC conjecture.
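To make the radical concrete, here is a short Python sketch of our own (not from any of the papers) that computes rad(abc) and the "quality" log c / log rad(abc) of a triple; the conjecture says that quality above 1 + ε happens only finitely often. The triple 2 + 3^10·109 = 23^5, found by Reyssat, is the known record holder.

```python
import math

def radical(n):
    """rad(n): the product of the distinct primes dividing n."""
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p
            while n % p == 0:
                n //= p
        p += 1
    return r * (n if n > 1 else 1)

def quality(a, b):
    """log c / log rad(abc) for the triple a + b = c."""
    c = a + b
    return math.log(c) / math.log(radical(a * b * c))

print(quality(1, 8))            # 1 + 8 = 9, rad(72) = 6: about 1.2263
print(quality(2, 3**10 * 109))  # Reyssat's 2 + 6436341 = 23^5: about 1.6299
```

Note how the repeated factors in 8 = 2^3, 9 = 3^2, and 23^5 are thrown away by the radical; that is exactly the savings described above.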
The point here is that in many cases r is vastly smaller than ab, and so the inequality

c ≤ K · r^2

is much better than the obvious one of

c ≤ ab.
For example, suppose that one wishes to know if

x^p + y^p = z^p

is possible. The ABC conjecture shows that this cannot happen for p large enough. Note

rad(x^p · y^p · z^p) = rad(xyz) ≤ xyz

for positive integers x, y, z.
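The reason powers help is that raising to a power never changes the radical. A quick check of our own illustrating this:

```python
def radical(n):
    """Product of the distinct primes dividing n."""
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p
            while n % p == 0:
                n //= p
        p += 1
    return r * (n if n > 1 else 1)

# rad(x^p * y^p * z^p) = rad(xyz) <= xyz no matter how large p is, so for a
# hypothetical Fermat solution the ABC bound would force the exponent p down.
for (x, y, z, p) in [(3, 4, 5, 7), (2, 7, 11, 5)]:
    assert radical(x**p * y**p * z**p) == radical(x * y * z) <= x * y * z
print("the radical is power-blind")
```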
Eight years ago Mochizuki announced his proof. Now it is about to be published in a journal. He is famous for work in a part of number theory called anabelian geometry. He solved a major open problem there years ago. This gave him instant credibility, and so his claim of solving the ABC conjecture was taken seriously.
For example, one of his papers is The Absolute Anabelian Geometry of Canonical Curves. The paper says:
How much information about the isomorphism class of the variety is contained in the knowledge of the étale fundamental group?
A glance at this paper shows that it is for specialists only. But it does seem to be math of the type that we see all the time. And indeed the proof in his paper has long been believed to be correct. This is in sharp contrast to his proof of the ABC conjecture.
The question is: Are there ways to detect if a proof is (in)correct? Especially long proofs? Are there ways that rise above just checking the proof line by line? By the way:
The length of unusually long proofs has increased with time. As a rough rule of thumb, 100 pages in 1900, or 200 pages in 1950, or 500 pages in 2000 is unusually long for a proof.
There are some ways to gain confidence. Here are some that, in my opinion, are useful.
The answer to the first question (1) seems to be no for the ABC proof. At least two world experts have raised concerns—see this article in Quanta—that appear serious. The proof has not yet been generalized. This is an important milestone for any proof. Andrew Wiles's famous proof that the Fermat equation

x^n + y^n = z^n

has no solutions in positive integers for n a prime greater than 2 has been extended. This certainly adds confidence to our belief that it is correct.
Important problems eventually get other proofs. This can take some time. But there is almost always success in finding new and different proofs. Probably it is way too early for the ABC proof, but we can hope. Finally, the roadmap issue: this asks whether the argument has a nice logical flow. Proofs, even long proofs, often have a logical flow that is not too complex. A proof that says “Suppose there is an object with this property; then it follows that there must be an object so that…” is more believable than one with a much more convoluted logical flow.
Ivan Fesenko of Nottingham has written an essay about the proof and the decision to publish. Among factors he notes is “the potential lack of mathematical infrastructure and language to communicate novel concepts and methods”—noting the steep learning curve of trying to grasp the language and framework in which Mochizuki has set his proof. Will the decision to publish change the dynamics of this effort?
An idea for human-interest interviews
Pixabay free src 
Dr. Lofa Polir is, like many of us, working from home. When we last wrote about her two years ago, she had started work for the Livingston, Louisiana branch of LIGO. They sent her and the rest of the staff home on March 19 and suspended observations on the 26th. Since Polir’s duties already included public outreach, she is looking to continue that online.
Today we helped Dr. Polir interview another pandemic-affected researcher.
We liked her idea of interviewing young people just starting their careers, who are facing unexpected uncertainties. Her first choice was a new graduate of Cambridge University doing fundamental work related to LIGO. Unfortunately, he had been unable to install a current version of Zoom on his handheld device, or maybe he was wary owing to security issues. So she requested the special equipment we have used to interview people in the past.
He replied at the speed of light that he was willing to do the interview so long as we respected some privacy measures. As for what name to use, he said we could just call him Izzy—Izzy Jr., in fact. So Dick, Dr. Polir, and I all used our own Zoom to port into our machine’s console room. The connection worked right away as Izzy’s head glimmered into view.
At first glimpse, all we could see was his long, light-brown hippie hair. This really surprised us—not the image we had of Cambridge—and we gasped about it before even saying hello. He replied that it was fashion from the Sixties. We asked how his family was doing and he said his fathers had passed on but mother and young siblings were at home and fine. We think he said “fathers” plural—the machine rendered him in a drawl like Mick Jagger and he was hard to follow.
Izzy picked up on our discomfort and immediately assured us he hadn’t been doing any drugs: “You can’t get them anyway because they’re all being diverted to treat the sick.” But he did open up to us that he was in some kind of withdrawal. He confessed that he had resorted to looking at the sun with one eye. “It was ecstasy but bad—I still can see only reds and blues with that eye, and I need to use an extra-large rectangular cursor to read text.” We were curious what brand of handheld device he was using because of his problems with Zoom, and he told us it was a Napier 1660 by Oughtred, Ltd. We hadn’t heard of that model but he said he’d connected three of them into a good home lab setup.
We asked how he was coping with distance teaching, but he said he hadn’t yet started his faculty position at Trinity College. We were surprised to learn that lecture attendance at Cambridge University is optional. “I shall be required to give the lectures but nobody will come to them so that’s all the same now—at least here I’ll have a cat for audience. No dogs and not my mother or siblings—I’d sooner burn the house down.” He quickly added, “Oh, my mother and I get along fine now and I love playing teatime with my little sisters.”
We really didn’t want to go into Izzy’s personal life, and I tried to shift the small talk by noting a little chess set on a shelf behind him. He snapped that he shouldn’t have spent money on it and he was a poor player anyway. We thought, wow, either this guy’s really down on himself or the cabin fever of the pandemic is getting to him. So Dick, always quick to pick up on things and find ways of encouragement, said:
“Dr. Polir here works on gravity and we’re told you have some great new ideas about it. We’d love to hear them.”
“Yes, I do—or did. But something happened yesterday that is making me realize that it’s all wrong, rubbish really…”
Izzy started by explaining that it’s a basic principle of alchemy that all objects have humors that can manifest as kinds of magnetism. (“Alchemy”? did we hear him right?) If you realize that the Earth and Sun are objects just like any other then you can model gravity that way. You just need to assign each object a number called its “mass” and then you get the equation
F = G · m1 · m2 / r^2

for the force of attraction, where r is the distance between the objects with masses m1 and m2, and G is a constant that depends on your units.
“We understand all that,” said Dr. Polir.
Izzy said the point is that G depends only on your units and is the same regardless of whether you are on Earth or on the Moon or wherever. It is very small, though. Then he went into his story of yesterday.
“I was in our garden by the path to the neighbor’s farm. I was supposed to be watching my little brother Benjamin who wanted to help harvest squash but I hate farming so I let him go without me. I was lying under an apple tree for shade when an apple fell and I realized all my mistakes.”
“What?,” we thought silently. We didn’t need to speak up—Izzy launched right into his litany of error:
“First, I’d thought the force was in what made the apple fall, but that’s nonsense. The apple would fall naturally because down is the shortest path it would be on if the tree branch were not holding it back. The only force is the tensile strength of the branch which was restraining it. I think that the tensile force really is magnetism, by the way.”
“Second, it’s ridiculous to think the force is coming from the Earth. On first principles, it could come from the ground, but that’s not what the equations say. They could have it all coming from one point in the center of the Earth. Just one point—four thousand miles deep!”
“Third and worst, though, is when you apply it to the Sun and the Earth. My equation means they are exerting force on each other instantaneously. But they are millions of miles apart. Whereas, the tree was touching the apple. Force can work only by touch, not by some kind of spooky action at a distance.”
We realized what he was driving at. Dick again always likes to encourage, so he said:
“But the math you developed for this force theory—surely it is good for calculations…?”
“No it’s not—it’s the Devil’s own box. I can calculate two bodies—the Earth and the Sun, or the Moon and the Earth if you suppose there is no Sun, but as soon as you have all three bodies it’s a bog. Worst of all, I can arrange five bodies so that one of them gets accelerated to infinite velocity—in finite time. This is a clear impossibility, a contradiction, so by modus tollens… it can all go in the bin.”
We didn’t think it would help to tell him that his math was good enough to calculate a Moon landing but not to locate a friend’s house while driving. He supplied his own coup de grâce anyway:
“And even the twobody calculations are tainted. I can calculate the orbits of the planets but the equations I get aren’t stable. I would wind up having to postulate something like God keeps the planets on their tracks. Yes, you need an intelligent Agent to start the planets going—all in one plane, basically—but to need such intervention all the time defeats the point of having equations.”
We asked Izzy what he was going to do. He said that the one blessing of enforced solitude is that one gets time to reflect on things and deepen the foundations. And he said he’d had an idea later that afternoon.
“Toward supper I realized I needed to get Benjamin home. The path to the farm is straight except it goes over a mound. I was sauntering along and when I got to the hill I realized that if I didn’t watch it I’d have fallen right into it. So that got me thinking. First, what I thought was straight on the path was really a curve—the Earth is after all a ball. We think space is straight, but maybe it too is curved. So when I’m standing here, perhaps I would really be moving in a diagonally down direction, but the Earth is stopping me. The Irish blessing says, ‘may the road rise up to meet you.’ Perhaps it does.”
“So are you doing math to work that out?,” I ventured.
“I started after supper. One good thing is that it allows light to be affected by gravity—which I was already convinced of—even if light has no mass. But a problem is that it appears Time would have to be included as curved. That does not make sense either.”
We asked when he might write up all this. He said he didn’t want to be quick to publish something so flawed on the one hand, or incomplete on the other, “unless someone else be about to publish the same.” We noted that there weren’t going to be any in-person conferences to present papers at for a while anyway.
“Besides, that’s not what I’m most eager to do. What the respite is really giving me time for is to start writing up my work on Theology. That’s most important—it could have stopped thirty years of war. For one thing, homoiousios, not homoousios, is the right rendering. There will be a time and times and the dividing of times in under 400 years anyway.”
That last statement somehow did not reassure us. We thanked Izzy Jr. for the interview and he gave consent to publish it posthumously.
We hope that your April Fool’s Day is such as to allow a time to laugh. But also seriously, would you be interested in the idea of our interviewing people during these times? Is there anyone you would like to suggest?
A visual proof with no abstract-algebra overhead
Composite crop of src1, src2 
Dominique Perrin and Jean-Éric Pin are French mathematicians who have done significant work in automata theory. Their 1986 paper “First-Order Logic and Star-Free Sets” gave a new proof that first-order logic plus the relation < characterizes the star-free regular sets.
Today we present their proof in a new visual way, using “stacked words” rather than their “marked words.” We also sidestep algebra by appealing to the familiar theorem that every regular language has a unique minimal deterministic finite automaton (DFA).
The first post in this series defined the classes SF for the star-free regular languages and FO[<] for first-order logic where variables range over the indices of a string x of length n. For any character c in the alphabet there is the predicate P_c(i) saying that the character in position i of x equals c. The first part proved that for any language A in SF there is a sentence φ, using only predicates P_c and the relation < besides the usual quantifiers and Boolean operations of first-order logic, such that φ defines A. That is, it proved SF ⊆ FO[<].
A key trick was to focus not on the sentence φ but on a formula φ(x, y) expressing that the part of the string from position x (inclusive) to y (exclusive) belongs to A. To prove the converse direction, FO[<] ⊆ SF, we focus not on middles of strings but on prefixes and suffixes. The previous post proved two lemmas showing that if A is star-free then for any string w, so are its prefix and suffix languages:
We will piece together star-free expressions for a multitude of prefix and suffix sets that come from analyzing minimum DFAs at each induction step, until we have built a big star-free expression for the whole language. This sounds complex—and is complex—but all the steps are elementary. At the end we’ll try to see how big the expression gets. Again, the rest of this post up to the end notes is written by Daniel Winton.
We implement the “marked words” idea of Perrin and Pin by using extra dimensions rather than marks. Consider any formula φ in FO[<] where x_1, …, x_k are the free variables. Our “stacked words” have k extra rows underneath the top level holding the string w. Each row has the same length as w and contains a single 1 and 0s elsewhere. The column in which the 1 lies gives the value of the variable x_i. Thus the alphabet of our stacked words is Σ × {0,1}^k. Here is an example:
Each column is really a single character in Σ × {0,1}^k, but we picture its entries as characters in Σ or {0,1}. For any row i, define one predicate to be the disjunction of P_c over all stacked characters c that have a 1 in entry i, and define another similarly for 0. Then the following sentence expresses the “format condition” that row i has exactly one 1:
This readily transforms into a star-free expression for the language of stacked words with exactly one 1 in row i:
Here one union collects all the stacked characters that have a 1 in entry i, and the other is similarly defined for 0. This is where the examples in the previous post about finite unions inside stars come in handy. The upshot is that the intersection over all rows

is a star-free expression that enforces the “format condition” over all rows. We will use this at the crux of the proof; whether it is really needed is a question at the end. If we take the format for granted, then the main advantage of our stacked words will be visualizing how to decompose automata according to when the 1 in a critical row is read.
The proof manipulates formulas φ(x_1, …, x_k) having the free variables shown. The corresponding language L(φ) is the set of stacked words that make φ true, where for each i, the position of the lone 1 in row i gives the value of x_i.
Theorem 1 For all formulas φ in FO[<], the language L(φ) is star-free.
Without loss of generality we may suppose φ is in prenex form, meaning that it has all quantifiers out front. If there are m variables of which k are free, this means

φ = Q_{k+1} x_{k+1} ⋯ Q_m x_m ψ(x_1, …, x_m),

where each Q_j is ∃ or ∀ and ψ has no quantifiers. If we picture the free variables as once having had quantifiers too, then the proof—after dealing with ψ—restores the quantifiers one at a time, beginning with the innermost, Q_m. That is the induction. We prove the base case here and the induction case in the coming sections.
Proof: Quantifier-free formulas in FO[<] are Boolean combinations of the atomic predicates P_c(x_i) and x_i < x_j. Here P_c is only for the characters c of Σ, not the stacked characters. But inside L(φ) we have to deal with the whole stack to represent P_c(x_i). The expression can be viewed schematically as:
As illustrated above, using the initial lemma in the part-1 post, this is equivalent to a star-free regular expression. The other atomic predicate, x_i < x_j, is true when we have a stacked word of the form:
It is not important to have i < j; these are just the labels of the variables. Their values must obey x_i < x_j, meaning that the 1 in row i comes in an earlier column than the 1 in row j. The characters of w in the top row are immaterial. We can capture this condition over all strings by the expression:
Again by the lemma in Part 1, this yields a star-free expression. To complete the basis, we note that if two formulas have star-free expressions corresponding to them, then their disjunction is represented by the union and a negation by the complement, both of which are clearly star-free.
Assume that all formulas in FO[<] with q quantifiers have star-free translations. We will show that any formula in FO[<] with q + 1 quantifiers also has a star-free translation. Let φ = ∃x ψ, where ψ is a formula in FO[<] that has m total variables, k free variables, and q quantifiers. We are allowed to assume the first quantifier in φ is existential, since ∀x ψ is equivalent to ¬∃x ¬ψ and we observed above that the star-free-expressible formulas in FO[<] are closed under negation.
The language of φ is given by: the stacked words such that we can add a row to each of the characters, using all 0s except one 1, such that the resulting string belongs to L(ψ).
We must show that L(φ) is star-free. As ψ has q quantifiers, by assumption L(ψ) is star-free regular.
Note that L(ψ) has one more row than L(φ) because ψ has one more free variable. Without loss of generality we may suppose it is the bottom row.
Since we have an existential quantifier, we might think we can simply delete the last row of each word in L(ψ). But here is a counterexample to show why we must be careful not to “lose information”:
Let
and
Notice that equals with its final row removed. We have
and
Then is the language of strings with at least three zeroes in a row and is the language of strings with at least two zeroes in a row.
We cannot add a second row consisting of zeroes and one 1 to each word in to force to be in . For a trivial example, consider the word
The fault can be interpreted as not being recoverable uniquely from owing to a loss of information. We could also pin the fault in this case on the central , noting that is generally not equivalent to .
Either way, this means we must be careful with how we express (the final row of) L(ψ). The idea is to give L(ψ) in segments such that the final row of each segment is all-zero or a single 1, so that the act of removing the final row can be inverted. This is the crux of the proof. To handle it we employ the unique DFA for the regular expression already obtained by induction for ψ.
By the Myhill–Nerode Theorem, there must exist a unique minimal automaton over the alphabet of our stacked words defining the language of ψ.
The state set of the minimal automaton can be partitioned into the set Q1 of states occurring before any 1 is read in the final row, the set Q2 of states reached after reading a 1 in the final row, plus a dead state d belonging to neither Q1 nor Q2. The start state belongs to Q1 and all accepting states belong to Q2. Since any accepted stacked word, and therefore any accepting computation, has exactly one 1 in any row, all arcs within Q1 and within Q2 must have only 0s in their final row, while the characters connecting states in Q1 to states in Q2 must have a 1 in their final row. The minimality of the automaton enforces these properties. Here is a sketch, omitting d:
Let be the set of edges that cross from to . With we can label their origins by states (not necessarily all distinct), and similarly label their characters by and the destination states by . Then collects all the characters used by that have a in row (except those into or at the dead state ). We can identify with the set of instructions .
Finally let denote the set of strings that take from state to state , and let . Then we can define by the expression
What remains is to find expressions for and for each . The latter is easy: Pick any string in . Then
The closure of under left quotients supplies expressions for .
The case of , however, is harder. What we would like to do is choose some (any) string that goes from to an accepting state and claim that . The problem is that might be accepted from some state that has an incoming arc on the same character from some other state . Then , which is disjoint from , is included also in . There may, however, be strings and such that is not accepted by . Then the string would be wrongly included if we substituted for (or for ) in (1).
There is also a second issue when crossing edges on the same character come in from distinct states to the same state . To fix all this, we need to use sets of strings that distinguish from all the other states . This requires the most particular use of the minimality of : If , then there is either (or both):
The strings in question need not begin with a character in —they need not cross right away. Let stand for a choice of strings of the former kind, the latter kind, covering all states . Then
We could not use just the complement here because that would allow strings that violate the “one 1” condition in the rows. Whether one can do the induction for the prefix and suffix languages in tandem without invoking the format condition is a riddle we pose at the end.
Either way, the closure of under right quotients yields a starfree expression for . Then we just have to plug our expressions and into (1) to get a starfree expression for . Now we are ready for the crude final step:
Form by knocking out the last entry in every character occurring in . Then represents .
To prove this final statement, first suppose is a stacked word in . Then satisfies , so there is a value so that satisfies with set equal to the value . This means that the stacked word formed from and a last row with in position belongs to and hence matches . Since is obtained by knocking out the last row of , matches .
The converse is the part where the tricky point we noted at the outset could trip us up. Suppose matches . From the form of (1) as a union of terms , we can trace how is matched to identify a place in that matches a “middle character” obtained from a in a crossing edge. Putting a in a new row underneath this place and s elsewhere hence makes a word that matches the unique regular expression obtained by appending a component to every “middle character” and a to all other characters in . Then is equivalent to , so satisfies . Since the last row that was added to amounts to supplying a value for the variable , it follows that satisfies with , hence satisfies . This yields and so the entire induction goes through.
First, as an end note: the end of the proof wound up differently from what we envisioned until polishing this post. It originally applied the distinct-states argument to states on the far side of the crossing, but we had to backtrack, having previously thought that the “second issue” did not matter. Two other questions we pose for our readers:
A larger question concerns how the size of the translated expressions increases as we add more quantifiers. We discuss the blowup in terms of quantifier depth and expression length. Let φ_0 be a first-order formula with no quantifiers and let φ_i denote φ_0 with i quantifiers applied to it. Also let E_i be a star-free translation of φ_i, and let len_i denote the length of E_i.
To obtain E_{i+1} from E_i we represent the language by a union, over crossing edges and final states, of concatenations of prefix and suffix languages with a crossing character between them. We have that len_i gives a decent approximation for the worst-case length of the prefix languages, the suffix languages, and the number of terms in the union. Using these approximations we have len_{i+1} ≈ len_i^2. Iterating gives len_2 ≈ len_0^4, len_3 ≈ len_0^8, and in general, len_i ≈ len_0^(2^i). This says that our blowup could be doubly exponential. Is there a tighter estimate?
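The squaring recurrence len_{i+1} ≈ len_i^2 can be iterated mechanically; a two-line sketch of our own (notation ours) confirms the doubly exponential form:

```python
def iterate_blowup(len0, k):
    """Worst-case expression length after restoring k quantifiers,
    assuming each step roughly squares the length: len_{i+1} = len_i**2."""
    L = len0
    for _ in range(k):
        L = L * L
    return L

# After k steps the length is len0**(2**k): doubly exponential in k.
assert iterate_blowup(3, 4) == 3 ** (2 ** 4)
```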
Our final question is: how well do our “stacked words” help to visualize the proof? Are they helpful for framing proofs of equivalences between language classes and logics that are higher up?
Part 1 of a two-part series
Daniel Winton is a graduate student in mathematics at Buffalo. He did an independent study with me last semester on descriptive complexity.
Today we begin a two-part series written by Daniel on a foundational result in this area involving first-order logic and star-free languages.
Dick and I have discussed how GLL might act during the terrible coronavirus developments. We could collect quantitative and theoretical insights to help understand what is happening and possibly contribute to background for attempts to deal with it. Our March 12 post was this kind, and was on the wavelength of an actual advance in group testing announced in Israel two days ago, as noted in a comment. We could try to be even more active than that. We could focus on entertainment and diversion. Our annual St. Patrick’s Day post mentioned the ongoing Candidates Tournament in chess, with links to follow it live 7am–noon EDT. Game analysis may be found here and elsewhere.
Here we’re doing what we most often do: present a hopefully sprightly angle on a technical subject. We offer this in solidarity with the many professors and teachers who are beginning to teach online to many students. The subject of regular languages is the start of many theory courses and sometimes taught in high schools. Accordingly, Dick and I decided to prefix a section introducing regular languages as a teaching topic and motivating the use of logic, before entering the main body written by Daniel.
Regular languages are building blocks used throughout computer science. They can be defined in many ways. Two major types of descriptions are:

1. Regular expressions. For example, the expression (b*ab*a)*b* describes the set of strings over the alphabet {a, b} that have an even number of a's.
2. Finite automata. For the same language, a two-state machine that flips its state on each a and accepts in its start state does the job.
The regular languages defined by (1) and (2) are the same. All regular expressions have corresponding finite automata, and vice versa. This equivalence makes a powerful statement about the concept of regular languages. The more diverse definitions we have of a concept, the better we understand it. This leads us to consider other possible definitions.
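For instance, the even-number-of-a's automaton is small enough to transcribe directly; here is a Python rendering of our own:

```python
def accepts_even_as(s):
    """A two-state DFA: state 0 = even number of 'a's so far (accepting),
    state 1 = odd.  Reading an 'a' flips the state; 'b' is a self-loop."""
    state = 0
    for ch in s:
        if ch == 'a':
            state = 1 - state
    return state == 0

assert accepts_even_as("") and accepts_even_as("baab")
assert not accepts_even_as("bab")
```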
A natural kind of definition involves logic. Studying complexity classes through logic and model theory has proven fruitful, creating descriptive complexity as an area. Good news: there are logic definitions equivalent to the regular languages. Bad news: they require going up to second-order logic. We would like to stay with first-order logic. So we ask:
What kind of languages can we define using only first-order logic (FOL) and simple predicates like “the i-th bit of x is 1” and “place i comes before place j”?
The answer is the star-free languages, which form a subclass of the regular languages. They were made famous in the book Counter-Free Automata by Robert McNaughton and Seymour Papert, where the equivalence to FOL was proved. A portentous fact is that these automata cannot solve simple tasks involving modular counting. Nor can perceptrons—the title subject of a book at the same time by Papert with Marvin Minsky, which we discussed in relation to both AI and circuit complexity. This post will introduce the star-free languages and FOL, prove the easier direction of the characterization, and give two lemmas for next time. The second post will present a new way to visualize the other direction. The rest of this post is by Daniel Winton.
A regular language over an alphabet Σ is one with an expression that can be obtained by applying the union, concatenation, and Kleene star operations a finite number of times to the empty set and singleton subsets of Σ. Star-free languages are defined similarly but give up the use of the Kleene star operation, while adding complementation (¬) as a basic operation. The star-free languages are a subset of the regular languages, because the regular languages are closed under complementation.
Complementation often helps find star-free expressions for languages that we ordinarily write using stars. For instance, if the alphabet is {0, 1}, then ¬∅ gives Σ* = (0 + 1)*. The following lemma gives a family of regular expressions that use Kleene star but are really star-free.
Lemma 1 The language given by taking the Kleene star operation on a union of singleton elements of an alphabet is star-free.
Proof. For B ⊆ Σ we have that B* can be given by ¬(¬∅ · (Σ − B) · ¬∅): the strings in which no character outside B ever appears.
For example, the language 0* is star-free because it equals {0}*, which is covered by this lemma. This idea extends also to forbidden substrings—e.g., the set of strings with no 11 substring is ¬(¬∅ · 11 · ¬∅).
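One can check such star-free identities by brute force over all strings up to a bounded length, taking complements relative to that finite universe. A small Python sketch of our own:

```python
from itertools import product

SIGMA, N = "01", 6
UNIV = frozenset("".join(t) for n in range(N + 1)
                 for t in product(SIGMA, repeat=n))

def comp(L):
    """Complement, relative to all strings of length <= N."""
    return UNIV - L

def cat(L1, L2):
    """Concatenation, truncated to the bounded universe."""
    return frozenset(u + v for u in L1 for v in L2 if len(u) + len(v) <= N)

anything = comp(frozenset())   # Sigma*, written without a star
no_11 = comp(cat(cat(anything, frozenset(["11"])), anything))
assert no_11 == frozenset(w for w in UNIV if "11" not in w)
```

The same harness can test any star-free expression against a direct description of its language, at least up to length N.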
The language (00)* is not the same, however, and it is not star-free. Intuitively this is because it involves modular counting: a string of 0s of even length is in, but one of odd length is out. The parity language from the introduction is another example. So the star-free languages form a proper subset of the regular languages. What kind of subset? This is where having a third description via logic is really useful.
In addition to the familiar Boolean operations and truth values, first-order logic provides variables that range over elements of a structure and quantifiers on those variables. Since we will be concerned with Boolean strings x, the variables will range over places in the string, from 1 to n = |x|. A logic also specifies a set of predicates that relate variables and interact with the structure. For strings we have:
We can take the predicates P_c for granted since we are talking about strings, but we need to say that predicates like addition and multiplication of positions are excluded, so we call this logic FO[<]. We could define equality by ¬(x < y) ∧ ¬(y < x), but we regard equality as inherent.
We can use < to define the successor relation S(x, y), which denotes that position y comes immediately after position x:
Note the use of quantifiers. We can use quantifiers to say things about the string too. For instance, the language of strings having no 11 substrings is defined by the logical sentence

¬∃x ∃y (S(x, y) ∧ P_1(x) ∧ P_1(y)).
It is implicit here that x and y are always in bounds. If an out-of-bounds position were a legal constant on which P_1 is always false, the intended meaning of such sentences could be upset, so we exclude that.
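Brute-force evaluation of such sentences is easy to script, since each quantifier just becomes a loop over positions (0-indexed below; a sketch of ours):

```python
def succ(x, y, n):
    """S(x, y) defined from < alone: x < y and nothing lies between."""
    return x < y and not any(x < z < y for z in range(n))

def models_no11(w):
    """Does w satisfy  ~ exists x exists y (S(x,y) & P_1(x) & P_1(y)) ?"""
    n = len(w)
    return not any(succ(x, y, n) and w[x] == '1' and w[y] == '1'
                   for x in range(n) for y in range(n))

assert models_no11("0101")
assert not models_no11("0110")
```

Of course this takes time polynomial in |w| per quantifier, which is fine for checking examples, not for efficient recognition.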
How big is FO[<]? We'll see it is no more and no less than the class of star-free languages.
To prove that any language A in SF has a definition in FO[<], we will not only give a sentence φ_A but also a formula φ_A(x, y). We will define this formula to indicate that, for a given string w, the portion of the string between indices x and y is in A. Then for the correct choices of x and y, φ_A(x, y) gives φ_A. We define φ_A(x, y) to test middle portions of strings because it handles lengths better for the induction in the concatenation case.
Theorem 2
Every language in is definable in .
The proof is a nice example where “building up” to prove something more general—involving two extra variables—makes induction go smoothly.
Proof: Let be a star-free regular language and the portion of the string between indices (inclusive) and (exclusive) is contained in . Let be the formula denoting that for a given string , that is, is the representation of in . We will show that such a exists via induction on . This is sufficient, as for a given symbol that always represents the length of a string , is the formula in representing the language .
First, we must show that is in for one of the basis languages, , and , for some in . We have that:
Now, we must show that given star-free languages and with FO translations and respectively, we have , , and are in . Then:
Since star-free languages can be obtained by applying the union, concatenation, and complementation operations a finite number of times on and singleton subsets of , this completes the proof of .
Prefatory to showing (in the next post) that is contained in , we prove properties about substrings on the ends of star-free languages, rather than in the middle as with the trick in the proof of Theorem 2.
Let be a language over an alphabet and be a word in for some . Define , the right quotient of by , by and , the left quotient of by , by . First we handle right quotients:
Lemma 3 If is star-free, then is star-free.
Proof: For any word over we define a function by where . If , then , and so the statement of the lemma trivially holds. So let for some string and character . Note that . Thus for all . Hence it suffices to define for any single character by recursion on . We have:
and recursively:
In general, if , then
Lemma 4 If is star-free, then is star-free.
The proof of Lemma 4 is similar to the proof of Lemma 3. The main differences lie in the concatenation subcase for the case and the order of quotienting when using this operation repeatedly.
Proof: For any word over we define a function by where . If , then , and so the statement of the lemma trivially holds. So let for some string and character . Note that . Thus for all . Hence it suffices to define for any single character by recursion on . We have:
and recursively:
In general, if , then
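As an illustration (ours, not from the post), here is a Python sketch of the one-character left-quotient recursion behind Lemma 4. The tuple encoding of star-free expressions is hypothetical, and the rules are the standard Brzozowski-style derivative rules, which match the lemma's recursive structure; the point is that every case produces another star-free expression:

```python
# Star-free expressions as nested tuples over a fixed alphabet:
#   ("empty",), ("eps",), ("sym", a),
#   ("union", r, s), ("concat", r, s), ("compl", r)

def nullable(r):
    """Does the language of r contain the empty string?"""
    tag = r[0]
    if tag == "empty": return False
    if tag == "eps":   return True
    if tag == "sym":   return False
    if tag == "union": return nullable(r[1]) or nullable(r[2])
    if tag == "concat": return nullable(r[1]) and nullable(r[2])
    if tag == "compl": return not nullable(r[1])

def lquot(a, r):
    """Left quotient by the single character a: each rule yields another
    star-free expression, so star-freeness is preserved."""
    tag = r[0]
    if tag == "empty": return ("empty",)
    if tag == "eps":   return ("empty",)
    if tag == "sym":   return ("eps",) if r[1] == a else ("empty",)
    if tag == "union": return ("union", lquot(a, r[1]), lquot(a, r[2]))
    if tag == "concat":
        first = ("concat", lquot(a, r[1]), r[2])
        if nullable(r[1]):        # the concatenation subcase noted above
            return ("union", first, lquot(a, r[2]))
        return first
    if tag == "compl": return ("compl", lquot(a, r[1]))

# {ab}: quotienting by 'a' then 'b' leaves a language containing the
# empty string, while quotienting by 'a' twice does not.
r = ("concat", ("sym", "a"), ("sym", "b"))
assert nullable(lquot("b", lquot("a", r)))
```

Quotienting by a longer word then peels off one character at a time, in the order the lemma's "In general" clause describes.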
There are richer systems such as and . Note that allows defining the relation via . We can define nonregular languages such as in . The class famously equals uniform , see chapter 5 of Neil Immerman's book Descriptive Complexity. Thus we hope our new style for proving (to come in the next part) will build a nice foundation for visualizing these higher results.
Stay safe, everyone
Cropped from Floss Dance source 
Neil L. is a Leprechaun. He has visited me once every year since I started GLL. I had never seen a leprechaun before I began the blog—there must be some connection.
Today, Ken and I want to share the experiences we had with him on this morning of St. Patrick’s Day.
I thought that, with the pandemic and New York City on virtual lockdown, he might not come. So I decided not to wait up for his visit, which always occurs in the wee hours of the morning. I looked it up: the wee hours are between 1am and 4am. We—that is, Kathryn and I, not wee—went to bed before midnight.
Something woke me up at 3am. Not any noise but a bright green light. It was coming from my laptop, which was on the dresser but closed and switched off. I took it out of the bedroom and opened the lid and there he was.
Top o’ the morning to ye.
The first thing I noticed was: no pipe. There was no puff of smoke in my living room. Neil smokes a special brand of tobacco—green of course—that I would recognize in an instant. Otherwise he looked as usual: green coat and hat, brown vest and shoes, red beard. I could see all of him.
“Glad as always to see you Neil. But why no pipe?”
Neil started his usual pipe-puffing gesture but stopped his hand from touching his face.
‘Tis virus not microbe. Smoke nae gan do good. All of us doing without. Least of our troubles.
Wow—that was hard for me to fathom. I thought of the travel bans. “Are you all having to stay home?”
Aye. We obey the bans. Crown or Republic, nae matter.
I had to wrap my mind around this—leprechauns too? “The virus applies to you?”
Sadly, we can carry it. Were it not, we could help folks point-to-point.
I realized: leprechauns can go anywhere instantly. The planar aspects of simulations like this do not apply to them. Neil read my mind—at least he could do that remotely:
Aye. I noted all your comment thread about expander graphs and spreading the disease. Fixed dimensions don’t constrain us. This compounds our main trouble, which be…
Of course our main trouble and concern is the danger of the virus—our being able to care for those who catch it and measures to ensure many do not catch it. I realize leprechauns, being immortal, have fewer worries there—and I did not ask Neil if they can get sick. So I thought of all the canceled St. Patrick’s Day parades, bars closing, no public merriment, sporting events gone too. But Neil’s next word went straight to the heart.
Boredom.
I thought about fun things Neil had recounted in the past: tricks with math and codes and logic, tricks on people, even tricks during March Madness. The NCAA tournament got completely canceled, unpostponable. Neil read my mind again and picked up on the topic of sporting events.
Consider me neighbor Bertie. He had his accommodations booked on the wee island in the pond of the 17th hole at TPC Sawgrass. For last weekend during the Players Championship. That’s why they built the island, ye know.
“The Purest Test in Golf” source 
But of course, the Players was canceled. Even the Masters has been postponed. Neil continued,
He and his fellows’ best craic was playing with the balls goon o’er the water. Not in a nasty way. Ye know by the Rule of Indistinguishability we canna do them bad. Not allowed to make them play worse than other tough holes so ye could tell it is more than the usual hobgoblins in your head and butterflies in your chest. What we do must have the same expectation and other moments as random variation. But random is measure one so it is easy to comply.
Of course. We have written about feigned randomness, and we haven’t yet had time to write a post this year on scientists taking superdeterminism seriously—even computationally. How could we tell that stuff apart from leprechauns? But thinking of Bertie gave me a more obvious question:
“Neil, can’t Bertie and his pals do that remotely? Like lots of other things we’re doing now.”
Neil’s answer exposed my lack of basic leprechaun knowledge.
In every story of leprechaun tricks, who is present? The leprechaun. He cannot be remote—and he must run the risk of being catched. He can dance, run to the end of the rainbow, and disappear, but fun happens only when he is around.
Hearing this, I would have puffed my own pipe, if I smoked and had one. Neil continued:
Besides, remotely we cannot control well enough to abide the Rule of Indistinguishability.
All this about randomness and conspiracy physics, with the world already weighing heavily, was too much. I winked out on the couch, without closing my laptop or seeing if Neil said goodbye with his usual flourish.
I woke up in late morning and called Ken. As I told this part, Ken was instantly alarmed and horrified.
“We must get Neil back. Is he still on your laptop?”
He was not. Ken plugged in and opened his own laptop—Neil has in the past visited him too—but nothing. He could not find any leprechauns registered on Skype or Zoom or Hangouts. Then Ken had a thought: “See if your laptop has the IP records.”
It did. I could not tell what they signified, but my old IT contact at Tech said he could pinpoint the origin in minutes. Meanwhile, Ken calmed down enough to explain. I’ll let him write the rest of the story.
What is apparently the lone major human sporting competition in the world not to be canceled began today. It is chess—the Candidates Tournament to determine who will face Magnus Carlsen in the next world championship match. It is taking place in Yekaterinburg, Russia—in Siberia, still far from the worst virus activity. It has only the 8 players plus officials on-site—no spectators, press limited—and they are all subject to daily screenings. The players shake hands only virtually.
Controversy about its starting continued today with one former World Champion, Vladimir Kramnik, stepping down from a broadcast commentary team “considering the nowadays disastrous humanitarian situation in the world.” But I’m with those saying that its value as a diversion—something to follow live in community—outweighs the concerns. I said this after some reflection on risks from our previous post and before being contacted about monitoring the tournament. One Chinese broadcast had over 1 million viewers today. Other broadcasts can be found on the official site and several other places. Today’s first round was vigorous, with all four games going beyond the turn-40 time control and two of them decisive.
But if it is a lone attraction for us, omigosh, for the leprechauns… Dick’s story proved they can at least read minds remotely. What if they all banded together to mess with the players’ moves? Neil had told of the Indistinguishability constraint, but I realized my own work could enable getting around it. My model enables randomly simulating a human distribution of moves at any rating level. I’ve had the idea to try a “Turing Test” akin to one Garry Kasparov turned aside, as part of the Turing Centennial in 2012, by distinguishing five games played by human 2200-level players from five by computers set to play at 2200 strength. If the leprechauns passed the Turing test at a 2800 rating level they could derange the tournament. This all verges on real concerns in my statistical anti-cheating work, with humans not leprechauns.
I quickly looked up the lore for summoning a leprechaun. This is far more audacious than knocking on a professor’s door outside office hours—not that that will happen anytime soon. I did not have real shamrocks but strewed countryside photos from my family’s trip to Ireland last summer around my laptop. I needed Kathryn as well as Dick online by videoconference to make the quorum of three. The IT person came back with the coordinates so that Neil, and only Neil, would be summoned. I adapted the ancient incantation to our new age of interaction by laptop:
Oh Leprechaun, Leprechaun, I humbly call unto thee,
Ride the Irish rainbows of joy across the virtual sea,
Let the cables be lit with the beacon of your Shillelagh,
In dire peril of bad luck I make this plea,
For an unknown evil lies near, I ween:
Oh Leprechaun, Leprechaun, please appear on my screen!
It worked. Neil crackled into view. After a short greeting I blurted out my concerns.
Nae worry. Nae worry. The International Chess Federation—FIDE—took the most important step to assure the fair play.
Wait—I know the fair-play measures FIDE developed. I’ve been part of them. That didn’t assuage what I’d just said.
FIDE put not just one but two Scotsmen on the tournament staff. One of them is in charge of fair play. Nae Leprechaun gan cross a Scotsman!
And with that Neil was gone.
Our hearts go out to all those affected more immediately than we so far. Amid all the worries, we wish you a happy and safe St. Patrick’s Day.
[some formatting and wording tweaks]
History of Econ. Thought src 
Robert Dorfman was a professor of political economy at Harvard University, who helped create the notion of group testing.
Today Ken and I discuss this notion and its possible application to the current epidemic crisis. We also discuss other mathematical aspects that have clear and immediate value. Update 03/19/20: Israel Cidon in a comment links to an actual implementation of group testing for the virus up to 64 samples from the Technion.
The novel coronavirus (and its disease COVID-19) is the nasty issue we all face today. I live in New York City and can see the reduced foot traffic every day. The issue is discussed every minute on the news. I hope you are all safe and well. But we are all worried about this.
The point of group testing is that it can reduce the number of tests needed to find out who has the virus. I assume that we are not using Dorfman’s idea because it does not apply to today’s testing. But it seems like it could fit. One issue is the need for more individual testing kits. As we write this, the US is still well short of the needed supply of kits. Are there situations where group testing can still help economize?
Dorfman created the notion of group testing in 1943. You could say he was driven by the need to test light bulbs:
As Wikipedia grouptesting article explains:
[Suppose] one is searching for a broken bulb among six light bulbs. Here, the first three are connected to a power supply, and they light up (A). This indicates that the broken bulb must be one of the last three (B). If instead the bulbs did not light up, one could be sure that the broken bulb was among the first three. Continuing this procedure can locate the broken bulb in no more than three tests, compared to a maximum of six tests if the bulbs are checked individually.
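The halving procedure in the quoted passage is just binary search. Here is a minimal sketch (ours), with the pool test supplied as a hypothetical oracle function:

```python
def find_defective(items, pool_is_bad):
    """Locate the single defective item with about log2(n) pool tests by
    repeatedly testing one half of the remaining candidates."""
    tests = 0
    while len(items) > 1:
        half = items[: len(items) // 2]
        tests += 1
        items = half if pool_is_bad(half) else items[len(items) // 2 :]
    return items[0], tests

# Six bulbs, one broken: found in at most three tests, as in the quote.
bulb, tests = find_defective(list(range(6)), lambda pool: 4 in pool)
assert bulb == 4 and tests <= 3
```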
Okay, Dorfman’s motivation was not light bulbs. It was testing soldiers during WWII for a certain disease that will go unnamed. This type of testing reduced the number of blood samples needed. What was analogous to connecting groups of light bulbs into one circuit was combining portions of individual blood samples into one sample. If the combined sample tested negative then that entire group could be dismissed without further testing.
The point of using group testing is made stark when you think about recent issues with cruise ships. On one ship there were around passengers who were eventually found to be almost all okay. I believe only were infected. This could have been checked by group testing with many fewer than tests.
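The scale of the savings can be sketched with the standard expected-cost calculation for Dorfman's two-stage scheme: each group of size g costs one pooled test, plus g individual retests when the pool comes back positive. The prevalence figure below is illustrative, not data from the ship:

```python
def dorfman_tests_per_person(p, g):
    """Expected tests per person with pools of size g when each person is
    independently positive with probability p: the shared pooled test,
    plus g retests with probability 1 - (1-p)^g that the pool is positive."""
    return 1.0 / g + (1.0 - (1.0 - p) ** g)

# At 1% prevalence the best pool size is 11, using about a fifth of the
# tests that individual testing would need.
best_g = min(range(2, 100), key=lambda g: dorfman_tests_per_person(0.01, g))
assert best_g == 11
```

The savings evaporate as prevalence rises, which is why pooling suits populations that are mostly negative.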
According to the current outbreak maps, known cases in the US are still fairly sparse. Those in New York are mainly in the city, Westchester, and along the Hudson River to Albany. Let’s say we wish assurance that all nonvirus related admits to a hospital are free of the virus. Can group testing apply?
The original version of group testing would apply if it were deemed mandatory that all new admits give a blood sample. Some kinds of coronavirus test kits use blood samples. Taking blood samples might however be as costly and intrusive as having the individual tests to begin with. There are other kinds of tests involving taking swabs, but it is not apparent whether those samples can be combined at scale as readily as blood samples can.
At least the idea of group testing has interesting connections to other parts of theory. Here is a 2000 survey by Ken’s former Buffalo colleague Hung Ngo with his PhD advisor, Ding-Zhu Du, where the application is to large-scale screening of DNA samples. Hung and Ken’s colleague Atri Rudra taught a course in Buffalo on group testing in relation to compressed sensing. More recent is a 2015 paper that tries to solve problems of counting the number of positive cases, not just whether they exist.
As we write, New York Governor Andrew Cuomo has just declared a cap of 500 attendees for any assembly. This is forcing the suspension of Broadway shows among many other activities. Yesterday the entire SUNY system joined numerous other institutions in moving to distanceonly learning after this week.
The cap is based on the likelihood of an assembly of that size including someone who is already infected. That likelihood in turn is based on the density of known cases and what is known about the proportion of unknown to known cases. The virus has a long (two weeks) incubation time during which it is contagious but not symptomatic.
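That likelihood has a simple closed form if attendees are modeled as independent draws from the population. A sketch (ours; the carrier count is a made-up round figure, not an estimate):

```python
def risk(p, g):
    """Chance that a gathering of g people includes at least one of the
    fraction p of the population who are (unknowingly) infectious."""
    return 1.0 - (1.0 - p) ** g

# Hypothetical: 20,000 undetected carriers among 330 million people.
p = 20000 / 330e6
# A 500-person assembly then has roughly a 3% chance of including one.
assert 0.02 < risk(p, 500) < 0.04
```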
Here is a graph from a PSA item posted just today by Alex Tabarrok for the Marginal Revolution blog (note that the calculations are by Joshua Weitz of Georgia Tech):
One axis is the number of cases (as a proportion of the US on the whole) and the other axis is the group size. Which axis is more important—has more effect on the danger? Tabarrok notes:
Now here is the most important point. It’s the size of the group, not the number of carriers that most drives the result.
This involves comparing two partial derivatives. The item gives a brief workedout example without using calculus.
I, Ken, have been connected this week to another example. I was statistically monitoring the World Senior Team Chess Championship, which began last Friday in Prague. Almost 500 players, in teams of 4 or 5, took part. Initially the players were roughly evenly divided between two halls. Effective Tuesday, a cap of 100 was declared for the Czech Republic, so more playing rooms were found and spectators were banned. Today, however, after an update on the density of cases in the Czech Republic, the cap was lowered to 30 effective tomorrow. Thus the tournament was forced to finish today, two rounds earlier than planned. Even though chess events have a smaller size footprint than all of the spectator sports whose seasons have been suspended in recent days, the growth of the outbreak is making cancellation the only reasonable policy for all but the smallest events.
The main purpose of these and other social isolation measures is to flatten out the growth of cases. The target is not just to contain the outbreak but also to stay below the number of serious cases that our treatment systems can bear at once. Here is one of numerous versions of a graphic that is being widely circulated:
The graphic is not necessarily talking about reducing the number of cases total. The area under both curves is the same. A sentence in the accompanying article—
If individuals and communities take steps to slow the virus’s spread, that means the number of cases of COVID19 will stretch out across a longer period of time.
—seems to imply that “the number of cases” is the same in both scenarios, but stretched out over time in the latter. The point is how the stretching keeps the value bounded. Update 3/15: To complete the thoughts here, we should be talking about reducing the number of cases—see this.
We certainly hope that isolation can reduce that number—i.e., that containment data out of Southeast Asia in particular holds true, as opposed to fears being voiced in the West that the virus will spread to a large percentage of the population over time. See charts here, especially chart 7. This tracking map has free access and is updated daily.
What governs the spreading process? This is being understood via simple mathematical models of contagion, such as come from percolation theory and its associated $R_0$ factor. Almost a month ago, the Washington Post made an interactive showing how epidemics spread according to the parameters in these models. How and whether the COVID-19 pandemic follows these models remains to be seen. Of course we hope it stays on the better side of the equations. Update 3/15: This new animation from the Washington Post shows how the “better side” can arise.
Is group testing a practical mechanism for mapping and constraining the epidemic? How can we promote the understanding of mechanisms and equations and models, not only for those shaping policy but for us who must abide by it and know why? Update 3/24: Article on FiveThirtyEight.com about how COVID-19 tests work.
[added note about UB and others going to online learning, added small-event qualifier about chess, clarified COVID-19 is the disease, added map link and two updates.]
Composite crop of src1, src2 
Maryam Aliakbarpour and Sandeep Silwal are PhD students at MIT. They have a joint paper titled, “Testing Properties of Multiple Distributions with Few Samples.” Aliakbarpour is advised by Ronitt Rubinfeld. Silwal has a different advisor, Piotr Indyk. Could we test which advisor is better by sampling? Haha, not a chance…
Today we will discuss this paper as an example of the power of computer science theory.
Aliakbarpour and Silwal (AS) prove results on property testing, which is of course a long-studied area in statistics, in mathematics, and in complexity theory. Many problems of property testing are open; many are very hard, and some are really impossible—such as distinguishing slight differences between distributions. With all due respect, the brilliance of their paper is not in solving open problems. Rather it is how they modify property testing in a novel manner.
Their proofs are clever and technically strong, but more important, we feel, is how they change the problems. That is, they show how to add a new constraint to property testing, where:
Let’s look at this more closely.
We all know about open problems and closed problems. There are however two kinds of closed problems: ones that have been settled and ones that have been proved impossible to settle. We also know that one can often take an open problem and define a meaningful subcase of it that can be solved. What may be less obvious is how one can do the same with the latter kind of closed problem.
The work of AS is a perfect example of this phenomenon. They take a problem that is unsolvable and change it to one that is solvable. The details of their results are less important than their change to the underlying problem. One of the key insights throughout theory is that we often have made major progress by changing the ground rules. The AS work is a particularly clear case of this.
AS call it the structural condition. They postulate a consistency type of condition on property testing. Suppose that one is interested in testing whether or not a collection of slot machines is fair. That is, are they set to cheat players or not? This is known to be hard to do with few tests. But after adding their new constraint it becomes quite efficient.
Among some examples to convey their definition, they postulate a row of slot machines in a casino that are supposed to give the same distribution of prizes with known fair probabilities . As usual with property testing we want to distinguish between two hypotheses that are some distance apart: one that the machines conform to the fair distribution and an alternative hypothesis that each is far from fair. There are two relevant factors:
Such situations are common with real-world data. The authors mention cases of medical data when one gets just one or a few blood readings from patients with disparate situational factors that can bias their samples. The joint effect of the biases may completely mask the systematic population tendency one is trying to test for. This is the setting in which the problem of distinguishing has been regarded as impossible, case closed. In the slot-machine example, Aliakbarpour and Silwal state their structural condition informally as follows:
Our goal is to test whether all the machines were fair … or they are far from being fair. In this case, we can naturally assume that if the machines are unfair, the house will assign a lower probability to the expensive prizes, and higher probability to cheap ones.
Of course, the house could cheat players without cheating in this uniform manner. More subtle rigging of the machines in different ways could mask the lowered expectation. But what AS are saying is: Provided the house cheats in this consistent manner, then they can be discovered quickly.
The formal definition is that each of the true source distributions is biased the same way in the same components as the expected (“fair”) distribution:
Definition 1 A sequence of distributions on a discrete domain obeys the structural condition relative to the distribution if there is a subset such that for all and :
The term “structural condition” is vanilla and perhaps saying the are “conformally biased” relative to would be more descriptive. A major point which the authors emphasize right afterward is that the set is unknown. There are exponentially many possibilities for what could be—so we cannot try to guess it. Rather, proofs using this condition need to exploit how all the distributions conform to the same . The slotmachine example arguably does not reflect this point in full—because we can know in advance what the major and minor prizes of a slot machine are (including “no prize” among the latter). The medicaldata examples and others certainly do, however.
We can convey briefly and intuitively how this condition is used for the case where is the uniform distribution on —which is quite general, as testing for many distributions can be efficiently mapped into this case. A key property of the uniform distribution is that among all distributions on , it minimizes the probability of getting collisions from repeated samples. This idea has yielded the most effective test statistics for problems when there is only one distribution . When there are multiple the effectiveness can be blunted in instances like the one in the previous section—the instances that are impossible.
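To make the collision idea concrete, here is a toy uniformity tester (our illustration, with an arbitrary threshold; it is not the AS test itself):

```python
import random
from itertools import combinations

def collision_rate(samples):
    """Fraction of sample pairs that agree; the uniform distribution on n
    values minimizes its expectation, at 1/n."""
    pairs = list(combinations(samples, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

def looks_uniform(samples, n, slack=1.5):
    # Illustrative threshold between 1/n and the elevated collision
    # rate that a far-from-uniform distribution produces.
    return collision_rate(samples) <= slack / n

random.seed(0)
n = 100
uniform = [random.randrange(n) for _ in range(500)]
skewed = [random.randrange(n // 10) for _ in range(500)]  # mass on 10% of values
assert looks_uniform(uniform, n)
assert not looks_uniform(skewed, n)
```

With one source the collision statistic is sharp; the AS structural condition is what restores its power when the samples come from many different sources.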
However, AS show that their structural condition is just what’s needed to bridge the gap that keeps the impossible cases from being distinguished. Realizing that the compounded complications of having multiple can be resolved by this stroke is their act of brilliance. Their paper applies it successfully to two other problem settings—identity testing and closeness testing—and also shows meaningful instances that obey and are resolved by their definition but that escape an older condition involving product distributions. As usual, we say to see the paper for the details.
It remains our place to ask, in which other cases might the AS condition arise and what uses might it have? I can think first of one implication for adversarial strategies in cryptography. It goes like this:
We can view the AS theorems as saying that anyone who wishes to cheat others should not be consistent. Being consistent according to some utility criterion—even an unknown one—makes it easier for others to detect that they are cheating.
Following their slot-machine example suggests this: Arrange the machines in total to cheat players. But make it so that they are not consistent. They can set some prizes higher in probability and some lower, and importantly, do this differently on different machines. They could even make some machines pay out more than the legal expectation—while still giving a small profit to the casino so it does not become too obvious to customers that those machines are “lucky.” There is safety in nonconformity alone.
Ken notices a similarity to the chessmodeling situation he began describing in this recent post. Instead of prizes of known value, in chess we have distributions over possible moves whose values are not so readily apparent to the player—that’s what makes chess challenging to play.
In his case, given any chess position and a player having that position, he postulates a true distribution of the moves the player would make and wants to compare that with the distribution projected by his model. Now there can be many positions with different characteristics, but Ken can treat the distributions coming from his model as one fixed point of reference , and he can give different positions the same ranks of legal moves (best), then . If positions have different numbers of legal moves he can pad them out by treating illegal moves the same as completely losing blunders.
So he has a situation like this: different and one over a bunch of positions. Ken knows his model not only varies in accuracy but is biased in ways explained in that post. It appears that this bias is highly conformal. The model is trained to make the projection of the probability of the first move accurate on average. Then the tendency is that the projections of the second-best and third-best moves are off one way, and the projections of all the other moves are off the other way. This is like having or its complement most of the time. The new statistical tests that Ken is crafting also live under the specter of impossibility. A coworker has demonstrated that they fail in cases randomly generated from normal distributions. But so far they are appearing to succeed in Ken’s more-structured situations from his real-world chess data. On this we will stay tuned.
Does the AS paper say something about cryptography? How widely does their condition apply?
With a lemma from 1947 that might be useful today?
Cropped from article on his letters 
Freeman Dyson passed away last February 28th, one day short of the leap day, February 29th. He was one of the great physicists, one of the great writers about science, and one of the great thinkers of all time. He is missed.
Today we wish to discuss a tiny part of Dyson’s contributions to mathematics—and ask whether it has been developed further.
Dyson did so much in so many areas of science that we will leave it to others to discuss it. We covered a puzzle of his years ago here. He famously showed that quantum electrodynamics (QED) is a consistent theory—in particular showing that different theories connecting quantum mechanics and special relativity were the same. He published many interesting books about science in general. He speculated about aliens, about space travel, and much more.
Our focus is on a beautiful result of his proved in 1947. A result about rational numbers. Nothing grandiose, nothing about infinite visions, nothing about worlds that we cannot easily imagine.
For over 2000 years we have known that $\sqrt{2}$ is not expressible as a rational fraction. This is sometimes credited to Hippasus of Metapontum. The obvious question, at least to math types, is how closely can it be approximated by rationals? The answer is interesting, with many consequences, and involves some top mathematicians: Johann Dirichlet, Joseph Liouville, Axel Thue, Carl Siegel, Dyson, and Klaus Roth.
Hippasus: The value $\sqrt{2}$ is not rational.
Folklore: The value $\sqrt{2}$, like all algebraic numbers, can be algorithmically approximated by rationals. Note that $\sqrt{2}$ is algebraic since it satisfies the equation
$$x^{2} - 2 = 0$$
with integer coefficients. The degree $n$ of the minimal such polynomial is called the degree of the algebraic number $\alpha$; here $n = 2$.
Dirichlet: The value $\alpha$ can be well approximated. That is,
$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{2}}$$
can be done for infinitely many integers $p$ and $q$.
Liouville: The value $\alpha$ cannot be too well approximated. That is,
$$\left|\alpha - \frac{p}{q}\right| < \frac{c}{q^{\kappa}}$$
cannot be done for infinitely many $p$ and $q$ and some constant $c > 0$ provided $\kappa$ is larger than $n$.
Thue: The value $\alpha$ cannot be too well approximated. That is,
$$\left|\alpha - \frac{p}{q}\right| < \frac{c}{q^{\kappa}}$$
cannot be done for infinitely many $p$ and $q$ and some constant $c > 0$ provided $\kappa$ is larger than $\frac{n}{2} + 1$.
Siegel: The value $\alpha$ cannot be too well approximated. That is,
$$\left|\alpha - \frac{p}{q}\right| < \frac{c}{q^{\kappa}}$$
cannot be done for infinitely many $p$ and $q$ and for some constant $c > 0$ provided $\kappa$ is larger than $2\sqrt{n}$.
Dyson: The value $\alpha$ cannot be too well approximated. That is,
$$\left|\alpha - \frac{p}{q}\right| < \frac{c}{q^{\kappa}}$$
cannot be done for infinitely many $p$ and $q$ and for some constant $c > 0$ provided $\kappa$ is larger than $\sqrt{2n}$.
Roth: The value $\alpha$ cannot be too well approximated. That is,
$$\left|\alpha - \frac{p}{q}\right| < \frac{c}{q^{\kappa}}$$
cannot be done for infinitely many $p$ and $q$ provided $c > 0$ and $\kappa$ is larger than $2$.
Note that the last result is best possible in the sense that Dirichlet achieved the exponent $2$, although slightly stronger statements than Roth's can be made by using logarithms.
To illustrate these formulas, Liouville proved that the following number is transcendental:
$$\sum_{k=1}^{\infty} 10^{-k!}.$$
Note that the exponent in Liouville's number grows as $k!$. The later formulas improve this to simply exponential in $k$. For instance,
is transcendental. We’ll leave it to our readers to figure out which formula gives this consequence and will give the historical answer at the end.
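As a sanity check (ours, in exact rational arithmetic), truncating Liouville's number $\sum_{k \ge 1} 10^{-k!}$ at the $k$-th term gives a fraction with denominator $q = 10^{k!}$ whose error is below $2/q^{k+1}$; since that exponent grows without bound, the number defeats every fixed Liouville-type bound:

```python
from fractions import Fraction
from math import factorial

def liouville_truncation(k):
    """Sum of 10^(-j!) for j = 1..k: an exact fraction p/q with q = 10^(k!)."""
    return sum(Fraction(1, 10 ** factorial(j)) for j in range(1, k + 1))

# Use a much longer truncation as a stand-in for the full number.
L = liouville_truncation(6)
for k in (2, 3, 4):
    q = 10 ** factorial(k)
    # The error beats 2/q^(k+1); no fixed exponent n can bound it,
    # so the number cannot be algebraic of any degree.
    assert abs(L - liouville_truncation(k)) < Fraction(2, q ** (k + 1))
```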
What we wish to highlight from Dyson’s 1947 paper is the main lemma for his proof. He called it the key new idea in his paper and also said it might have independent interest. We agree, and we abstract some of its statement into the following definitions.
Suppose we have a polynomial . Call a point a zero of order if and implies
Since this includes the case , the point must be a zero of as well as of all the partial derivatives. For example, when
the point is a zero of order and of order but not of order . The diagonal-step nature of this example plays into the next definition. We follow Dyson in numbering from zero.
Definition 1 A “staircase” for $f$ is given by points $(x_0, y_0), \dots, (x_m, y_m)$ and nonnegative reals $t_0, \dots, t_m$ such that for each $k$ and all $i$ and $j$ with $i + j < t_k$,
$$\frac{\partial^{i+j} f}{\partial x^i \partial y^j}(x_k, y_k) = 0.$$
It is understood that the $x_k$ are distinct and the $y_k$ are distinct, but they need not be distinct from each other.
The parameter is the steepness of the staircase. The idea is to make the staircase quite steep by taking both and the degree of to be large. This is controlled by a parameter that is inverse to and chosen to meet the hypotheses of the key lemma:
Lemma 2 Suppose and constitute a staircase for a bivariate polynomial of degree in and degree in , where for each , ,
Suppose is a positive real number such that:
Dyson supplied the following interpretation in his paper. The left-hand side of (1) approximately counts the number of zeroes in the staircase—approximately because the order is left as a real number. We want a good upper bound on this number.
A polynomial of degree $r$ in $x$ and $s$ in $y$ may have up to $(r+1)(s+1)$ nonzero coefficients. Thus $(r+1)(s+1)$ is intuitively the maximum number of zeroes that could be arranged “on purpose” by the choice of $f$.
We want to minimize the number of additional zeroes that could exist. Lemma 2 says that this is limited by the factor . We want to make this factor approach $1$. We can do so by making arbitrarily small. By the condition this entails choosing large. The other requirement to choose is that it must majorize
We need to be small yet bigger than
This means making . For whatever number of points we are concerned with, we can meet this by making the degree in be larger than the degree in by the factor . That is all we need to do for and to exist that will allow us to apply the lemma.
Thus the upshot is that suitably “lopsided” polynomials, together with a connected region of their partial derivatives, cannot have too many more than the prescribed number of zeroes, counting multiplicity of the orders of the zeroes. The proof of the lemma works by successive applications of reasoning about determinants and linear independence of polynomials that serve as components of $f$. It is fairly long-winded and ends with some painstaking estimates.
The application supposes for the sake of contradiction that there are infinitely many fractions $\frac{p}{q}$ giving closer approximations to $\alpha$ than the theorem statement allows. It suffices to consider the case where $\alpha$ is an algebraic integer—that is, a root of a monic univariate polynomial with integer coefficients. The supposition yields candidate polynomials as differences of two polynomials, one derived from $\alpha$ and the other from the closely approximating fractions. These have both a large staircase of zeroes and a bounded space of possible coefficients, which imposes constraints on the degrees in $x$ and $y$. These elements can be manipulated, using the approximating fractions on one hand and the fact of $\alpha$ (and hence its degree) being fixed on the other hand, to create the lopsided form in which the hypotheses of Lemma 2 take effect. The lemma then cranks out a contradiction.
At the end of his proof, Dyson deftly explains how taking , where results from the process that selects the sequence for the lemma, yields Siegel’s theorem. Thus his framework affords an extra lever for manipulating ratios, in his case the ratio . Tending the ratio to rather than to produces his theorem.
Our point of interest is whether Dyson’s setup can be used to attack other questions about polynomials in complexity theory. We have not yet formed an understanding of how specific it is to the approximation application. Perhaps for complexity we would want to generalize it from polynomials in $2$ variables to $n$ variables. There could be some relation to the multivariate techniques used for integer multiplication as we covered here, but again we’re just at the point of asking. Dyson ends his paper with the opinion that
“such an investigation … would not be in any way a hopeless undertaking.”
Have there been useful extensions of Dyson’s lemma, as opposed to improvements by Roth and others to his approximation bounds? Has Dyson’s “nonhopeless investigation” been brought to fruition? What other applications would it have? What are the closest techniques that have been used in complexity theory?
The search for proofs that numbers are transcendental has gone in other directions. These are represented in a survey paper by Jeffrey Shallit, which grew into the book Automatic Sequences with Jean-Paul Allouche. To answer our reader question above, in the book they trace the transcendence of $\sum_{k=1}^{\infty} 10^{-2^k}$ to a 1916 paper by Aubrey Kempner, which uses a different technique.
[Edit: Fixed Roth bound on degree.]
[ Cropped from Maths History source ]
Emil Post was the first to use the formal notion of reduction between problems. We discussed Post’s wonderful work and its relevance to complexity earlier here.
Today Ken and I want to discuss the notion of reduction, and also perhaps some jokes for your amusement.
Reductions are used throughout mathematics. We regard Post’s definition from 1944 as quintessential. His are called many-one reductions. Here is the definition in a functional style that we’ve tried to promote in other cases: A problem $A$ reduces to a problem $B$ if there is a computable function $f$ such that for any instance $x$ of problem $A$,
$$A(x) = B(f(x)).$$
If $A$ and $B$ are languages then this gives the familiar condition $x \in A \Leftrightarrow f(x) \in B$. But the idea can be more general: $A$ and $B$ can be functions.
Alan Turing had earlier defined an even more general kind of reduction, in which given $x$ one computes multiple $y_1, \dots, y_k$, gets the answer $B(y_i)$ for each of them, and finally pieces together the answers to obtain $A(x)$. But this feels more like “expanding” than “reducing.” We want it to be: one $x$, one $f(x)$. That was Post’s notion.
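As a minimal sketch (the two toy languages are our own choosing), Post's condition $A(x) = B(f(x))$ can be rendered directly in code:

```python
# Sketch of Post's many-one reduction (the toy languages are our own):
# A reduces to B via computable f when A(x) = B(f(x)) for every instance x.

def B(y):               # "previously solved" problem: divisibility by 4
    return y % 4 == 0

def f(x):               # the reduction: one x in, one f(x) out
    return 2 * x

def A(x):               # problem we want solved: is x even?
    return B(f(x))      # Post's condition A(x) = B(f(x))

# sanity check against the direct definition of A
assert all(A(x) == (x % 2 == 0) for x in range(100))
```

The point of the functional style shows up here: the reduction is just function composition, so reductions chain while the languages themselves stay put.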
The idea of reduction is simple: When confronted with some problem, be lazy: Instead of solving the problem, show that it can be changed and then solved by some previously known method. That is, show that your problem is reduced to someone’s already solved problem. Be lazy.
In this form, we can think of two mathematical examples that are much older than Post’s:
If you want to multiply two numbers, look up their logarithms and add those: since $\log(xy) = \log x + \log y$, your multiplication is reduced to a much easier addition, followed by one lookup of the antilogarithm. John Napier gave the name “logarithm” in 1614 and soon William Oughtred built a physical device to automate this reduction. If you are over 40, perhaps you have seen one.
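The slide-rule reduction fits in a few lines (a sketch of our own; a real slide rule does the log and antilog lookups on a physical scale):

```python
# Sketch (our own illustration) of the slide-rule reduction: a product
# becomes an addition of logarithms plus a final antilogarithm.
from math import exp, log

def mul_via_logs(x, y):
    """Reduce x * y (for positive x, y) to one addition of logs."""
    return exp(log(x) + log(y))

# the answer comes back up to floating-point error
assert round(mul_via_logs(13, 7)) == 91
```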
The latter example is in Wikipedia’s article on reduction. The former, however, has the signature idea of reduction between problems, more than massaging instances of the same problem. What we really want to know is:
When was the process of transforming between problems first known by the term reduction?
We wonder if tracing an old kind of joke can help. The point is that reductions are a neat source of jokes. Here is a classic one, taken from a big list of math jokes. The list gives a second form of the joke based on reductions and there are many others. More on the jokes later.
A physicist and a mathematician are sitting in a faculty lounge. Suddenly, the coffee machine catches on fire. The physicist grabs a waste basket, empties the basket, leaps towards the sink, fills the basket with water, and puts out the fire. As this coffee machine has caught fire before, they agree to keep a waste basket filled with water next to it.
The next day, the same two are sitting in the same lounge. Again, the coffee machine catches on fire. This time, the mathematician stands up, grabs the waste basket that is filled with water, empties it, places some trash in the basket, and hands it to the physicist, thus reducing the problem to a previously solved one.
Instead of putting out a fire, the following video is about retrieving a shoe that is floating away.
Here is an example of reductions that are not so silly and a little less simple.
Imagine that Alice and Bob are at it again. Bob wants to be able to multiply integers fast and he plans on building a hardware system that stores the answers in a table. Then his hardware system will be able to compute the product of two integers by just looking up the answers. Okay, there are really better ways to do this, but just play along for the moment.
Bob’s table is big and he is troubled. Even a table for multiplying small numbers has a great many entries, and clearly for a more extensive table the cost grows fast. He asks his friend Alice for some help. She says: “Just store the diagonal values and I can show you how to handle the general case.” Here is her old trick:
$$x \cdot y = \frac{(x+y)^2 - (x-y)^2}{4}.$$
Using this allows Bob to store just the diagonal of the multiplication table, and forget all the rest. It is a powerful reduction that shows:
One can reduce integer multiplication to addition and taking the square of a number.
For example, $13 \cdot 7 = \frac{(13+7)^2 - (13-7)^2}{4} = \frac{400 - 36}{4} = 91$.
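Alice's reduction can be rendered in a few lines (a sketch; the table bound of $200$ is our own choice), using the quarter-square identity $(x+y)^2 - (x-y)^2 = 4xy$:

```python
# Sketch of Alice's trick (the table bound 200 is our own choice):
# with only the diagonal -- a table of squares -- multiplication reduces
# to two table lookups, a subtraction, and a division by 4, via
# (x+y)^2 - (x-y)^2 = 4xy.
SQUARES = [n * n for n in range(200)]   # Bob stores just the diagonal

def mult(x, y):
    """Multiply nonnegative x, y (with x + y < 200) using only squares."""
    return (SQUARES[x + y] - SQUARES[abs(x - y)]) // 4

assert mult(13, 7) == 91
assert all(mult(a, b) == a * b for a in range(100) for b in range(100))
```

The division by $4$ is always exact here, since $(x+y)^2 - (x-y)^2$ equals $4xy$ identically over the integers.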
Wikipedia actually gives the above as the first example in its article on the complexity kind of reduction. These kinds of reductions not only define NP-hardness and completeness, they are really needed to understand what problems are really about.
For example, once one accepts that a language $A$ in $\mathsf{NP}$ can be represented by a uniform family of Boolean circuits $C_n$, one for each length $n$ of instances $x$ for $A$, the reduction to SAT is quickly defined: The $C_n$ have auxiliary inputs $y = y_1, \dots, y_m$, where $m$ is polynomial in $n$, such that there is a $y$ making $C_n(x,y) = 1$ if and only if $x \in A$. The circuit can consist of binary NAND gates $g$, each with input wires $u_g, v_g$ (which may be inputs $x_i$ or $y_j$ or outputs of other gates) and one or more output wires (one of which may be the overall output $z$). The reduction first constructs the Boolean formula
$$\phi_n = z \wedge \bigwedge_{g} \big(w_g \leftrightarrow \neg(u_g \wedge v_g)\big),$$
where $w_g$ is the output wire of gate $g$.
To apply to a given $x$, simply substitute the bits of $x$ for the variables $x_1, \dots, x_n$ and simplify to make the formula $\phi_x$. Then $\phi_x$ is satisfiable if and only if $x \in A$. The reduction not only proves the completeness of SAT (indeed, of 3SAT) instantly, it conveys the character both of SAT and of what $\mathsf{NP}$ is all about.
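To make this concrete, here is a toy brute-force version (the two-gate circuit and all wire names are our own illustration, and the satisfiability check is exhaustive rather than a real SAT solver):

```python
# Toy version of the circuit-to-SAT reduction (circuit and wire names are
# our own illustration).  The circuit computes z = x AND y out of two NAND
# gates; the formula asserts each gate constraint w == not(u and v) plus
# "output wire z is true".
from itertools import product

GATES = [("t", "x", "y"),   # t = x NAND y
         ("z", "t", "t")]   # z = t NAND t = x AND y

def formula_holds(assign):
    """All gate constraints hold and the output wire is 1."""
    ok = all(assign[w] == (not (assign[u] and assign[v]))
             for w, u, v in GATES)
    return ok and assign["z"]

def satisfiable_given(x):
    """Substitute the input bit x, then brute-force the remaining wires."""
    return any(formula_holds({"x": x, "y": y, "t": t, "z": z})
               for y, t, z in product([False, True], repeat=3))

assert satisfiable_given(True)        # some y makes the AND circuit output 1
assert not satisfiable_given(False)   # no setting can: the formula is unsat
```

Here the wire `y` plays the role of the auxiliary inputs: the substituted formula is satisfiable exactly when some setting of `y` makes the circuit accept.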
This leads us to wonder something about teaching complexity theory. Maybe reductions, not languages, should be the principal objects of study.
This may seem like a joke but there are benefits. The reductions are composable in ways that languages are not. They carry the source and target problems with them, at least in the form of the function’s arguments and values. Emphasizing reductions highlights the greatest success of complexity theory to date, which is proving relations between problems, rather than its failure to classify languages via lower bounds.
Here is a selection, from a big list of math jokes, of those that speak most to computing, plus one from here. We have embellished a few of them:
How should the concept of reduction be taught and emphasized? Can you trace the age of the classic reduction joke?
There is a setting for multiplication where our Alice and Bob reduction example fails. Two of the jokes in our list might suggest it to you. What is the issue?
Should everyone be taught coding in high school?
Today we will discuss recent comments on this issue in The Wall Street Journal by Bob and Larry.
Bob is a longtime friend of mine, so I want to say that up front. He is a professor of computer science at Princeton. Cuban is an emeritus professor of education at Stanford. Note, I do not know Cuban but will call him Larry—I hope that is fine. Besides what they say in the article, Larry wrote a three-part series in 2017 on his own education blog, while Bob was associated with a 2013 White House-led initiative on coding in schools.
The WSJ article consists of ten short paragraphs by Bob followed by eleven from Larry. This is sequential structure, like statements $S_1; S_2$ in a program, or like having candidate town halls for consecutive hours on CNN and such. What we’d like to see is a debate—like having parallel processes that must sync and communicate. Below we imagine one based on statements in the article.
The following is a paraphrase that tries to restructure some of the article in debate format.
Bob: Teaching students to code will help them understand logical thinking and foster creativity.
Larry: You could say the same for teaching writing, math, history, and many other subjects. There is no research that shows that coding is better than other topics in this regard.
Bob: I am not aware of any research that shows that each topic that is taught now is better than coding.
Larry: Yes that is true, but consider how durable core education has been for our society. A century ago, industrial groups pushed the federal government to require vocational training in schools for particular industrial and agricultural skills and to establish separate vocational schools. Those undermined the broader goals of social development and civic engagement.
Bob: Technology is basic to many of society’s issues. Perhaps the Iowa caucus fiasco could have been avoided if those involved had a better understanding of computing.
Larry: I think the main argument for coding is being pushed by technology CEOs. They need more coders. The educational system should not just do what they need. Do you not agree, Bob?
Bob: If we teach coding it seems that it may help in lessening economic and gender-based gaps. In summary: in the last millennium, education was based on reading, writing, and arithmetic. Perhaps we should now switch to reading, writing, and computing. Coding includes arithmetic and a whole lot more.
Larry: I agree education should help students achieve their potential. I just do not see that coding will do this. And further, data and projections from the U.S. Bureau of Labor Statistics show only an 11 percent increase in IT jobs, from 4.5 to 5 million, between 2018 and 2028. That’s going out a whole decade and still IT will only be about 3 percent of all jobs. Health care will grow in that time by as many jobs as IT has total.
Bob: Those percentages hide much of the benefit. Only a fraction of the thousands I have taught—in person and online—work in tech companies. The rest have gone into a broad variety of careers. Coding literacy is becoming a necessity in health care, social assistance, business services, construction, entertainment, manufacturing, and even politics.
Larry: But what would you cut to make room? Foreign language? History? Arts or music? Or decrease other aspects of math and science? Curricula are already crowded with required courses and frequent testing.
What should the role of coding be in our society? Who is right?
[some word and grammar tweaks]