Connect the Stars
How papers are like constellations
Bob Vaughan is a mathematician at Penn State University. He is also a Fellow of the Royal Society—not ours, Ben Franklin helped make it tough for us to have one about 236 years ago this Wednesday. He is a great expert on analytic number theory, especially applied to the prime numbers. His work involves the deep connections between integers and complex numbers that were first charted by Leonhard Euler in the time of Franklin.
Today we examine how connections are made in the literature, and how choosing them influences our later memory of what is known and what is not.
Proved mathematical statements are like stars of various magnitudes: claim, proposition, lemma, theorem… A paper usually connects several of the former kinds to a few bright theorems. Often there are different ways the connections could go, and a lengthened paper may extend them to various corollaries and other theorems. Thus we can get various constellations even from the same stars. Consider the Big Dipper and the larger Ursa Major:
Composite of src1, src2.
These two figures speak to different collective memories of what that part of the sky contains. Different links and memories affect what we think we know when we return to an area with a new telescope.
In a recent post, Dick and I made two connections between modes of simulation and outstanding problems in complexity theory:
- For computable numbers linear time is equivalent to real time.
- The Hartmanis-Stearns conjecture (HSC) implies that integer multiplication is not in linear time, and so .
Thanks to public and private respondents we have found the ambient star map in the literature, but we wish to talk about how it was constellated. First we motivate this with an anecdote about Vaughan.
As related in Chapter 2 of the book Prime-Detecting Sieves by Glyn Harman, Vaughan found a new identity by re-thinking connections in long-known formulas. The ratio of the Riemann zeta function and its derivative is connected to the primes via the following identity for complex with real part :
where the Mangoldt function contributes when is a power of the prime and zero otherwise. This function also satisfies the identity
where is the Möbius function which we covered here. Motivated by finite approximations, Vaughan crafted an expression using arbitrary functions and that cancels down to the Riemann ratio:
The point is that by choosing and for selected bounds and , and employing the identity between and , Vaughan was able to write in a form that is tailor-made for complex analysis. That is Vaughan’s identity proper, and its details in Harman’s text are findable by clicking here and searching the word “genesis” to bring up the unique page.
Referring to Vaughan’s ratio formula as “(2.1.1)” and the previous formula for as “(1.4.5),” and showing how Vaughan’s own identity for comes by re-arranging the latter, Harman writes:
It seems paradoxical that (1.4.5) was known for so long yet this elementary re-arrangement was not discovered until Vaughan wrote down (2.1.1). The optimist might take heart at this point: What further extremely useful identities are awaiting discovery, being currently hidden in a well-known formula?
Well, Harman continues by quoting what he calls the pessimist’s view,
All identities are trivial—once written down by somebody else!
Vaughan himself might concur, as a long list of amusing quotations on his homepage ends with the statement,
Everything we know is trivial, and the rest is a Field’s Medal. There is nothing between.
But we remain optimists, if only because many things that others know well do not look trivial to us.
Recall that HSC states that an irrational number whose digits (in any base) can be output in real time is transcendental. Expanding our statements in the introduction a bit:
- A digit sequence is computable in real time (meaning th digit output on th step) if and only if it is computable in linear time (meaning th digit appears within steps).
- If the time to multiply two -digit integers is linear, then is in linear time and so HSC is false.
The former was believed by me (Ken) since I had recalled HSC being stated for “linear” time, but the references we found—even a 2012 paper by Rusins Freivalds—only gave the real-time statement of HSC. The latter statement applies the 1976 paper by Richard Brent featured in the post and some extra work to show that since is not a Liouville number, the approximation shown by Brent suffices to compute the digit sequence. We asked a few experts whether they had seen either statement, and they all said no though one thought the former was clear. So we went ahead with the post.
Albert Meyer quickly gave us a helpful comment with a reference for the former statement, and we updated the post to reflect this. This was to a 1970 paper of his with Patrick Fischer and Arne Rosenberg, which in turn cited a 1968 paper by Fischer. The statement, however, still does not appear in “proclaimed” form as a result in that paper. Instead it is assembled from
Lemma: If a sequence can be output with the nth digit appearing at step then it can be output with the digit at time , provided .
Theorem: If a sequence can be output within time , then the nth digit can be made to appear at exactly time .
The proof of the Theorem ends by revising the goal to be outputting the th digit at time so that the Lemma can be applied, and then says “Further details are omitted…” Next the paper discusses a possible shortcut that doesn’t work. Then follows a section “Applications” and a “Corollary 1” still talking about general time functions . In our opinion someone leafing through for reference could easily miss the key sentence in the middle of the paragraph following the corollary:
Thus, for generation of sequences, the cases of real time () and linear time (, ) coincide.
Both papers also emphasize up-front that for machines with counters in place of tapes the outcome is surprisingly different: real and linear time are not equivalent. Of course the equivalence for tapes wouldn’t be lost on someone working in this area between 1965 and the early 1970’s, but references closer to 2012 seem to be missing the connection. Or presuming it in a hard-to-trace manner, as in this paper brought to our attention here, where “real time” is written but defined as linear time.
The Multiplication Connection
This leaves our second observation, which was not attested in comments and which surprised some people in private. However, Dick’s suggestion of Vaughan for this post made me realize I’d forgotten to write to another professor at Penn State—the author of the best recent advance on integer multiplication which Lance Fortnow had hailed as “Paper of the Year” for 2007: Martin Fürer. Martin graciously replied (somewhat edited, and with thanks for his permission):
I distinctly remember having discussed fast computation of pi in [Volker Strassen and Ernst Specker's] seminar in Zurich probably sometime [after 1975, and] that square roots can be computed as fast as multiplication. But I would assume that it was well known earlier that division and square roots can be reduced to multiplication by Newton’s method. Maybe Brent just formulated it precisely in terms of complexity. Furthermore, Liouville’s theorem was well known to theoretical computer scientists working in mathematics departments.
Likewise … it was safe to conjecture that multiplication could not be done in linear time on a Turing machine. Nobody believes that an irrational algebraic number can be computed without multiplication. Thus the [H-S] conjecture makes much sense. I would believe that this was the reasoning of Hartmanis and Stearns, even though I have not read about it.
Thus apparently everything we “discovered” was well-known in the Richard Nixon era. We still, however, perceive an 18-minute gap in the literature, which prompts our paraphrase of the famous Watergate question:
What did we well-know, and when did we well-know it?
Martin expressed surprise, however, at integer multiplication being in linear time for alternating machines. We must admit that by calling it a “folk lemma” it does not have a pedigreed statement and proof. Actually maybe it was originally proved by Dick, but he is not sure—and maybe we do need the extra round of alternations which we mentioned here…
The most visible connection in the literature is that the graph of integer multiplication is a rudimentary relation—sometimes it is defined that way, sometimes not. Celia Wrathall proved that all such relations are in ATM linear time. This is equivalent to ATM real-time by another theorem of hers with Ronald Book—no, it was Sheila Greibach with Book, stated for NTMs not ATMs.
Overall this is easy enough for us AARP-eligible types, but for graduate students not so much. How can we patch holes in our field that newcomers may fall into? One might reply that a student should become knowledgable thoroughly in an area. More and more, however, advances are calling for connections between subfields. One cannot demand more of both inside expertise and interdisciplinary reach by human recall. Thus we need a sky atlas of results, tricks, and problems, but how to organize it?
How can we improve the concept-connectivity of our field? Will the Web and intelligent search do it for us?
Finally, we offer an intriguing open problem about how fast we can multiply integers. It is possible to multiply integers very fast if they are represented via the Chinese Remainder Theorem—though perhaps still not in linear time. Alas while addition and multiplication are fast in this representation, it seems to make it harder to compare two numbers. Thus a natural question is:
Does there exist a representation of integers so that addition, multiplication, and comparison are all doable in linear time? Basically, is there a linear time discretely ordered ring?
Here is a formal statement that also allows the representation to be redundant. Can we find linear-time computable functions on binary strings and a (not necessarily linear-time) translation function to integers such that:
The function might be many-one, giving a redundant coding of the integers. Using a left-inverse we might be able to speed up long series of integer operations by paying the time penalty once, computing in the string structure, and translating the answer back.
Could the linear-time function possibly help us in Brent’s simulation of Newton’s Method? Then HSC would entail that no such representation exists.
[added pointer to Vaughan's identity itself]