How papers are like constellations

Bob Vaughan is a mathematician at Penn State University. He is also a Fellow of the Royal Society—not ours; Ben Franklin helped make it tough for us to have one, about 236 years ago this Wednesday. He is a great expert on analytic number theory, especially as applied to the prime numbers. His work involves the deep connections between integers and complex numbers that were first charted by Leonhard Euler in the time of Franklin.

Today we examine how connections are made in the literature, and how choosing them influences our later memory of what is known and what is not.

Proved mathematical statements are like stars of various magnitudes: claim, proposition, lemma, theorem… A paper usually connects several of the lesser kinds to a few bright theorems. Often there are different ways the connections could go, and a longer paper may extend them to various corollaries and further theorems. Thus we can get various constellations even from the same stars. Consider the Big Dipper and the larger Ursa Major:

 Composite of src1, src2.

These two figures speak to different collective memories of what that part of the sky contains. Different links and memories affect what we think we know when we return to an area with a new telescope.

In a recent post, Dick and I made two connections between modes of simulation and outstanding problems in complexity theory:

1. For computable numbers linear time is equivalent to real time.

2. The Hartmanis-Stearns conjecture (HSC) implies that integer multiplication is not in linear time, and so ${\mathsf{NLIN} \neq \mathsf{DLIN}}$.

Thanks to public and private respondents we have found the ambient star map in the literature, but we wish to talk about how it was constellated. First we motivate this with an anecdote about Vaughan.

## Vaughan’s Identity

As related in Chapter 2 of the book Prime-Detecting Sieves by Glyn Harman, Vaughan found a new identity by re-thinking connections in long-known formulas. The ratio of the Riemann zeta function and its derivative is connected to the primes via the following identity for complex ${s}$ with real part ${> 1}$:

$\displaystyle - \frac{\zeta'}{\zeta}(s) = \sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^s},$

where the von Mangoldt function ${\Lambda(n)}$ contributes ${\ln p}$ when ${n}$ is a power of the prime ${p}$ and is zero otherwise. This function also satisfies the identity

$\displaystyle \Lambda(n) = \sum_{hd = n}\mu(d)\ln h$

where ${\mu}$ is the Möbius function which we covered here. Motivated by finite approximations, Vaughan crafted an expression using arbitrary functions ${f(s)}$ and ${g(s)}$ that cancels down to the Riemann ratio:

$\displaystyle -\frac{\zeta'}{\zeta}(s) = f(s) - \zeta(s)f(s)g(s) - \zeta'(s)g(s) - \left(\frac{\zeta'}{\zeta}(s) + f(s)\right)(1 - \zeta(s)g(s)).$
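Expanding the last product term makes the cancellation explicit; this is a routine check in the same notation:

$\displaystyle \left(\frac{\zeta'}{\zeta}(s) + f(s)\right)\left(1 - \zeta(s)g(s)\right) = \frac{\zeta'}{\zeta}(s) - \zeta'(s)g(s) + f(s) - \zeta(s)f(s)g(s),$

and subtracting this from the first three terms on the right-hand side leaves exactly ${-\frac{\zeta'}{\zeta}(s)}$—for any choices of ${f}$ and ${g}$ whatsoever.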

The point is that by choosing ${f(s) = \sum_{m \leq U}\Lambda(m)/m^s}$ and ${g(s) = \sum_{d\leq V}\mu(d)/d^s}$ for selected bounds ${U}$ and ${V}$, and employing the identity between ${\Lambda}$ and ${\mu}$, Vaughan was able to write ${\Lambda(n)}$ in a form that is tailor-made for complex analysis. That is Vaughan’s identity proper, and its details in Harman’s text are findable by clicking here and searching the word “genesis” to bring up the unique page.
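The identity relating ${\Lambda}$ and ${\mu}$ is easy to sanity-check numerically. Here is a small Python sketch—our own, with naive trial-division helpers and our own function names, not anything from Harman's text—that computes ${\Lambda(n)}$ both directly and via the Möbius convolution:

```python
from math import log, isclose

def mangoldt(n):
    # Lambda(n) = log p if n = p^k for a prime p, else 0
    for p in range(2, n + 1):
        if n % p == 0:          # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0                  # n = 1 has no prime factor

def mobius(n):
    # mu(n) = (-1)^k if n is a product of k distinct primes, else 0
    result = 1
    for p in range(2, n + 1):
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0        # p^2 divides the original n
            result = -result
    return result

def mangoldt_via_mobius(n):
    # Lambda(n) = sum over factorizations n = h * d of mu(d) * log h
    return sum(mobius(d) * log(n // d) for d in range(1, n + 1) if n % d == 0)

# The two formulas agree on 1 .. 199
for n in range(1, 200):
    assert isclose(mangoldt(n), mangoldt_via_mobius(n), abs_tol=1e-9)
```

Of course, verifying an identity numerically is not the hard part; the hard part, as Harman's quote below attests, is noticing which rearrangement of it is useful.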

Referring to Vaughan’s ratio formula as “(2.1.1)” and the previous formula for ${\Lambda(n)}$ as “(1.4.5),” and showing how Vaughan’s own identity for ${\Lambda(n)}$ comes by re-arranging the latter, Harman writes:

It seems paradoxical that (1.4.5) was known for so long yet this elementary re-arrangement was not discovered until Vaughan wrote down (2.1.1). The optimist might take heart at this point: What further extremely useful identities are awaiting discovery, being currently hidden in a well-known formula?

Well, Harman continues by quoting what he calls the pessimist’s view,

All identities are trivial—once written down by somebody else!

Vaughan himself might concur, as a long list of amusing quotations on his homepage ends with the statement,

Everything we know is trivial, and the rest is a Fields Medal. There is nothing between.

But we remain optimists, if only because many things that others know well do not look trivial to us.

## Hartmanis-Stearns, Again

Recall that HSC states that an irrational number whose digits (in any base) can be output in real time is transcendental. Expanding our statements in the introduction a bit:

1. A digit sequence is computable in real time (meaning ${n}$th digit output on ${n}$th step) if and only if it is computable in linear time (meaning ${n}$th digit appears within ${O(n)}$ steps).

2. If the time ${M(n)}$ to multiply two ${n}$-digit integers is linear, then ${\sqrt{2}}$ is in linear time and so HSC is false.
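For intuition on the second statement, here is a minimal Python sketch—our own illustration with our own function names, not Brent's actual algorithm, which manages precision more carefully—showing how digits of ${\sqrt{2}}$ reduce to big-integer arithmetic via Newton's method. Each iteration roughly doubles the number of correct digits, so if ${M(n)}$ were linear the doubling costs would telescope to linear total time:

```python
def isqrt_newton(N, x0):
    # Integer Newton iteration for floor(sqrt(N)); requires x0 >= floor(sqrt(N)).
    # Each step x <- (x + N // x) // 2 roughly doubles the correct digits,
    # and its cost is dominated by one multiplication-sized big-integer division.
    x = x0
    y = (x + N // x) // 2
    while y < x:
        x = y
        y = (x + N // x) // 2
    return x

def sqrt2_digits(d):
    # First d + 1 digits of sqrt(2), read off from floor(sqrt(2 * 10^(2d)))
    N = 2 * 10 ** (2 * d)
    return str(isqrt_newton(N, 2 * 10 ** d))

print(sqrt2_digits(20))  # 141421356237309504880
```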

I (Ken) had believed the former, since I recalled HSC being stated for “linear” time, but the references we found—even a 2012 paper by Rusins Freivalds—gave only the real-time statement of HSC. The latter statement applies the 1976 paper by Richard Brent featured in the post, plus some extra work to show that, since ${\sqrt{2}}$ is not a Liouville number, the approximation shown by Brent suffices to compute the digit sequence. We asked a few experts whether they had seen either statement, and they all said no, though one thought the former was clear. So we went ahead with the post.

Albert Meyer quickly gave us a helpful comment with a reference for the former statement, and we updated the post to reflect this. This was to a 1970 paper of his with Patrick Fischer and Arne Rosenberg, which in turn cited a 1968 paper by Fischer. The statement, however, still does not appear in “proclaimed” form as a result in that paper. Instead it is assembled from

Lemma: If a sequence can be output with the ${n}$th digit appearing at step ${2T(n)}$, then it can be output with the ${n}$th digit appearing at step ${T(n)}$, provided ${T(n) \geq n}$.

Theorem: If a sequence can be output within time ${T(n)}$, then the ${n}$th digit can be made to appear at exactly time ${T(n)}$.

The proof of the Theorem ends by revising the goal to be outputting the ${n}$th digit at time ${2T(n)}$ so that the Lemma can be applied, and then says “Further details are omitted…” Next the paper discusses a possible shortcut that doesn’t work. Then follows a section “Applications” and a “Corollary 1” still talking about general time functions ${T(n)}$. In our opinion someone leafing through for reference could easily miss the key sentence in the middle of the paragraph following the corollary:

Thus, for generation of sequences, the cases of real time (${T(n) = n}$) and linear time (${T(n) = kn}$, ${k > 1}$) coincide.

Both papers also emphasize up-front that for machines with counters in place of tapes the outcome is surprisingly different: real and linear time are not equivalent. Of course the equivalence for tapes wouldn’t be lost on someone working in this area between 1965 and the early 1970’s, but references closer to 2012 seem to be missing the connection. Or presuming it in a hard-to-trace manner, as in this paper brought to our attention here, where “real time” is written but defined as linear time.

## The Multiplication Connection

This leaves our second observation, which was not attested in comments and which surprised some people in private. However, Dick’s suggestion of Vaughan for this post made me realize I’d forgotten to write to another professor at Penn State—the author of the best recent advance on integer multiplication which Lance Fortnow had hailed as “Paper of the Year” for 2007: Martin Fürer. Martin graciously replied (somewhat edited, and with thanks for his permission):

I distinctly remember having discussed fast computation of pi in [Volker Strassen and Ernst Specker’s] seminar in Zurich probably sometime [after 1975, and] that square roots can be computed as fast as multiplication. But I would assume that it was well known earlier that division and square roots can be reduced to multiplication by Newton’s method. Maybe Brent just formulated it precisely in terms of complexity. Furthermore, Liouville’s theorem was well known to theoretical computer scientists working in mathematics departments.

Likewise … it was safe to conjecture that multiplication could not be done in linear time on a Turing machine. Nobody believes that an irrational algebraic number can be computed without multiplication. Thus the [H-S] conjecture makes much sense. I would believe that this was the reasoning of Hartmanis and Stearns, even though I have not read about it.

Thus apparently everything we “discovered” was well-known in the Richard Nixon era. We still, however, perceive an 18${\frac{1}{2}}$-minute gap in the literature, which prompts our paraphrase of the famous Watergate question:

What did we well-know, and when did we well-know it?

Martin expressed surprise, however, at integer multiplication being in linear time for alternating machines. We must admit that in calling it a “folk lemma” we mean it does not have a pedigreed statement and proof. Actually maybe it was originally proved by Dick, but he is not sure—and maybe we do need the extra round of alternations which we mentioned here.

The most visible connection in the literature is that the graph of integer multiplication is a rudimentary relation—sometimes it is defined that way, sometimes not. Celia Wrathall proved that all such relations are in ATM linear time. This is equivalent to ATM real-time by another theorem of hers with Ronald Book—no, it was Sheila Greibach with Book, stated for NTMs not ATMs.

Overall this is easy enough for us AARP-eligible types, but for graduate students not so much. How can we patch holes in our field that newcomers may fall into? One might reply that a student should become thoroughly knowledgeable in an area. More and more, however, advances are calling for connections between subfields. One cannot keep demanding more of both inside expertise and interdisciplinary reach from human recall alone. Thus we need a sky atlas of results, tricks, and problems—but how to organize it?

## Open Problems

How can we improve the concept-connectivity of our field? Will the Web and intelligent search do it for us?

Finally, we offer an intriguing open problem about how fast we can multiply integers. It is possible to multiply integers very fast if they are represented via the Chinese Remainder Theorem—though perhaps still not in linear time. Alas, while addition and multiplication are fast in this representation, comparing two numbers seems to become harder. Thus a natural question is:

Does there exist a representation of integers so that addition, multiplication, and comparison are all doable in linear time? Basically, is there a linear time discretely ordered ring?

Here is a formal statement that also allows the representation to be redundant. Can we find linear-time computable functions ${a(x,y),m(x,y),\ell(x,y)}$ on binary strings and a (not necessarily linear-time) translation function ${t(x)}$ to integers such that:

$\displaystyle t(a(x,y)) = t(x) + t(y),\quad t(m(x,y)) = t(x)\cdot t(y),\quad \ell(x,y) = 1 \iff t(x) \leq t(y)?$

The function ${t}$ might be many-one, giving a redundant coding of the integers. Using a left-inverse ${t'}$ we might be able to speed up long series of integer operations by paying the time penalty once, computing in the string structure, and translating the answer back.
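To see the comparison obstacle concretely, here is a toy Python sketch of a residue (CRT) representation. The moduli and function names are our own hypothetical choices, with `encode` playing the role of a right inverse of the translation ${t}$ and `decode` playing ${t}$ itself:

```python
from math import prod

MODULI = (3, 5, 7, 11)   # pairwise coprime; represents 0 .. 1154
M = prod(MODULI)

def encode(n):
    # residue representation: one coordinate per modulus
    return tuple(n % m for m in MODULI)

def add(x, y):
    # coordinate-wise, with no carries between coordinates
    return tuple((a + b) % m for a, b, m in zip(x, y, MODULI))

def mul(x, y):
    # also coordinate-wise -- this is what makes CRT multiplication fast
    return tuple((a * b) % m for a, b, m in zip(x, y, MODULI))

def decode(x):
    # Chinese Remainder reconstruction: the expensive "global" step
    # that comparing two encoded numbers seems to require
    return sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(x, MODULI)) % M

print(decode(mul(encode(31), encode(37))))  # 1147 == 31 * 37
```

Addition and multiplication touch each coordinate independently, but no coordinate-wise test is known to decide ${t(x) \leq t(y)}$; decoding first defeats the purpose, which is exactly the tension behind the question above.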

Could the linear-time function ${m(x,y)}$ possibly help us in Brent’s simulation of Newton’s Method? If so, then HSC would entail that no such representation exists.




