Getting to the Roots of Factoring
We revisit a paper from 1994
Richard Lipton is, among so many other things, a newlywed. He and Kathryn Farley were married on June 4th in Atlanta. The wedding was attended by family and friends including many faculty from Georgia Tech, some from around the country, and even one of Dick’s former students coming from Greece. Their engagement was noted here last St. Patrick’s Day, and Kathryn was previously mentioned in a relevantly-titled post on cryptography.
Today we congratulate him and Kathryn, and as part of our tribute, revisit a paper of his on factoring from 1994.
They have just come back from their honeymoon in Paris. Paris is many wonderful things: a flashstone of history, a center of culture, a city for lovers. It is also the setting for most of Dan Brown’s novel The Da Vinci Code and numerous other conspiracy-minded thrillers. Their honeymoon was postponed by an event that could be a plot device in these novels: the Seine was flooded enough to close the Louvre and Musée D’Orsay and other landmarks until stored treasures could be brought to safe higher ground.
It is fun to read or imagine stories of cabals seeking to collapse world systems and achieve domination. Sometimes these stories turn on scientific technical advances, even purely mathematical points as in Brown’s new novel, Inferno. It needs a pinch to realize that we as theorists often verge on some of these points. Computational complexity theory as we know it is asymptotic and topical, so it is a stretch to think that papers such as the present one impact the daily work of those guarding the security of international commerce or investigating possible threats. But from its bird’s-eye view there is always the potential to catch a new glint of light reflected from the combinatorial depths that could not be perceived until the sun and stars align right. In this quest we take a spade to dig up old ideas anew.
|Pei’s Pyramid of the Louvre Court = Phi delves out your prime factor… (source)
The Paper and the Code
The paper is written in standard mathematical style: first a theorem statement with hypotheses, next a series of lemmas, and the final algorithm and its analysis coming at the very end. We will reverse the presentation by beginning with the algorithm and treating the final result as a mystery to be decoded.
Here is the code of the algorithm. It all fits on one sheet and is self-contained; no abstruse mathematics text or Rosetta Stone is needed to decipher. The legend says that the input is a product of two prime numbers, is a polynomial in just one variable, and refers to the greatest-common-divisor algorithm expounded by Euclid around 300 B.C. Then come the runes, which could not be simpler:
- Repeat until exit:
- a random number in ;
- if then exit.
Exiting enables carrying out the two prime factors of —but a final message warns of a curse of vast unknowable consequences.
How many iterations must one expect to make through this maze before exit? How and when can the choice of the polynomial speed up the exploration? That is the mystery.
Our goal is to expose the innards of how the paper works, so that its edifice resembles another famous modern Paris landmark:
This is the Georges Pompidou Centre, whose anagram “go count degree prime ops” well covers the elements of the paper. Part of the work for this post—in particular the possibility of improving to —is by my newest student at Buffalo, Chaowen Guan.
A First Analysis
Let with and prime. To get the expected running time, it suffices to have good lower and upper bounds
and analogous bounds for . Then the probability of success on any trial is at least
This lower bounds the probability of , whereupon gives us the factor .
We could add a term for the other way to have success, which is . However, our strategy will be to make and hence close to by considering cases where but is still large enough to matter. Then we can ignore this second possibility and focus on . At the end we will consider relaxing just so that is bounded away from .
Note that we cannot consider the events and to be independent, even though and are prime, because and may introduce bias. We could incidentally insert an initial text for without affecting the time or improving the success probability by much. Then conditioned on its failure, the events and become independent via the Chinese Remainder Theorem. This fact is irrelevant to the algorithm but helps motivate the analysis in part.
This first analysis thus focuses the question to become:
How does computing change the sampling?
We mention in passing that Peter Shor’s algorithm basically shows that composing certain (non-polynomial) functions into the quantum Fourier transform greatly improves the success of the sampling. This requires, however, a special kind of machine that, according to some of its principal conceivers, harnesses the power of multiple universes. There are books and even a movie about such machines, but none have been built yet and this is not a Dan Brown novel so we’ll stay classically rooted.
Using Polynomial Roots
Two great facts about polynomials of degree are:
- There are guaranteed to be at most roots.
- If , then .
The second requires the coefficients of to be integers. Neither requires all the roots to be integers, but we will begin by assuming this is the case. Take to be the set of integer roots of . Then define
where as usual . The key point is that
To prove this, suppose the random belongs to but not to . Then for some , so , but since . There is still the possibility that is a nonzero multiple of , which would give and deny success, but this entails and so this is accounted by subtracting off .
Our lower bound will be based on . There is one more important element of the analysis. We do not have to bound the running time for all , that is, all pairs of primes. The security of factoring being hard is needed for almost all . Hence to challenge this, it suffices to show that is large in average case over . Thus we are estimating the distributional complexity of a randomized algorithm. These are two separate components of the analysis. We will show:
For many primes belonging to a large set of primes of length substantially below , where is the length of , is “large.”
We will quantify “large” at the end, and it will follow that since is substantially greater, is “tiny” in the needed sense. Now we are ready to estimate the key cardinality .
In the best case, can be larger than by a factor of . This happens if for every root , the values do not hit any other members of . When this happens, itself can be as large as . Then
By a similar token, for any , if , then —or in general,
The factor of is the lever by which to gain a higher likelihood of quick success. When will it be at our disposal? It depends on whether is “good” in the sense that and also on itself being large enough.
For each root and define the “strand” where There are always distinct values in any strand. If then every strand has most as non-roots. There is still the possibility that —that is, such that —which would prevent a successful exit. This is where really comes in, attending to the upper bound .
| The Paris church of St. Sulpice and its crypt (source)
Hence what can make a prime “bad” is having a low number of strands. When and the strands and coincide—and this happens for any other such that divides .
Here is where we hit the last important requirement on . Suppose where is the product of every prime other than . Then and coincide for every prime . It doesn’t matter that is astronomically bigger than or ; the strands still coincide within and within .
Hence what we need to do is bound the roots by some value that is greater than any we are likely to encounter. The is not too great: if we limit to of some same given length as that of , then so . We need not impose the requirement but must replace above by where . We can’t get in trouble from such that divides and divides since then divides already. This allows the key observation:
For any distinct pair , there are at most primes such that divides .
Thus given we have “slots” for primes . Every bad prime must occupy a certain number of these slots. Counting these involves the last main ingredient in Dick’s paper. We again try to view it a different way.
Counting Bad Primes
Given , and replacing the original with , ultimately we want to call a prime bad if , where . We will approach this by calling “bad” if there are strands.
For intuition, let’s suppose , . If we take as the paper does, then we can make bad by inserting it into three slots: say , , and . We could instead insert a copy of into , , and , which lumps into one strand and leaves free to make two others. In the latter case, however, we also know by transitivity that divides , , and as well. Thus we have effectively used up not slots on . Now suppose instead, so “bad” means getting down to strands. Then we are forced to create at least one -clique and this means using more than slots. Combinatorially the problem we are facing is:
Cover nodes by cliques while minimizing the total number of edges.
This problem has an easy answer: make all cliques as small as possible. Supposing is an integer, this means making -many -cliques, which (ignoring the difference between and ) totals edges. When is constant this is , but shows possible ways to improve when is not constant. We conclude:
Lemma 1 The number of bad primes is at most .
We will constrain by bounding the degree of . By this will also bound relative to so that the number of possible strands is small with respect to , which will lead to the desired bound on . Now we are able to conclude the analysis well enough to state a result.
Define to mean problems solvable on a fraction of inputs by randomized algorithms with expected time . Superscripting means having an oracle to compute from for free. If is such that the time to compute is and this is done once per trial, then a given algorithm can be re-classified into without the oracle notation.
Theorem 2 Suppose is a sequence of polynomials in of degrees having integer roots in the interval , for all . Then for any fixed , the problem of factoring -bit integers with belongs to
provided and .
Proof: We first note that the probability of a random making is negligible. By the Chinese Remainder Theorem, as remarked above, gives independent draws and and whether depends only on . This induces a polynomial over the field of degree (at most) . So the probability of getting a root mod is at most
which is exponentially vanishing. Thus we may ignore in the rest of the analysis. The chance of a randomly sampled in a strand of length coinciding with another member of is likewise bounded by and hence ignorable.
The reason why the probability of giving a root in the field is not vanishing is that is close to . By , we satisfy the constraint
The condition ensures that this is the actual asymptotic order of . Since we are limiting attention to primes of the same length as , the “” above can be to base . Hence has the right order to give that for some and constant fraction of primes of length , the success probability of one trial over satisfies
Hence the expected number of trials is . The extra in the theorem statement is the time for each iteration, i.e., for arithmetic modulo and the Euclidean algorithm.
It follows that if is also computable modulo in time, and presuming so that , then factoring products of primes whose lengths differ by just a hair is in randomized average-case polynomial time. Of course this depends on the availability of a suitable polynomial . But could be any polynomial—it needs no relation to factoring other than having plenty of distinct roots relative to its degree as itemized above. Hence there might be a lot of scope for such “dangerous” polynomials to exist.
| Is there a supercomputer under the Palais Royale? (source)
Dick’s paper does give an example where a with specified properties cannot exist, but there is still a lot of play in the bounds above. This emboldens us also to ask exactly how big the “hair” needs to be. We do not actually need to send toward zero. If a constant fraction of the values get bounced by the event, then the expected time just goes up by the same constant factor.
We have tried to present Dick’s paper in an “open” manner that encourages variations of its underlying enigma. We have also optically improved the result by using rather than as in the paper. However, this may be implicit anyway since the paper’s proofs might not require “” to be constant, so that by taking one can make for any desired factor . Is all of this correct?
If so, then possibly one can come even tighter to for the length of . Then the question shifts to the possibilities of finding suitable polynomials . The paper “Few Product Gates But Many Zeroes,” by Bernd Borchert, Pierre McKenzie, and Klaus Reinhardt, goes into such issues. This paper investigates “gems”—that is, integer polynomials of degree having distinct integer roots and minimum possible circuit complexity for their degree—finding some for as high as 55 but notably leaving open. Moreover, the role of a limitation on the magnitude of a constant fraction of a gem’s roots remains at issue, along with roots exceeding having many relatively small prime factors.
Finally, we address the general case with rational coefficients (and ). If in lowest terms then means (divided by ) so the algorithm is the same. Suppose is a rational root in lowest terms and does not divide , nor the denominator of any Then we can take such that for some and define . This gives
which we write as . Then . Because is a root, it follows that is a sum of terms in which each numerator is a multiple of and each denominator is not. So in lowest terms where possibly . Thus either yields or falls into one of two cases we already know how to count: is another root of or we have found a root mod . Since behaves the same as for all , we can define integer “strands” as before. There remains the possibility that strands induced by two roots and coincide. Take the inverse for and resulting integer , then the strands coincide if . This happens iff . Multiplying both sides by gives
so it follows that divides the numerator of in lowest terms. Thus we again have “slots” for each distinct pair of rational roots and each possible prime divisor of the numerator of their difference. Essentially the same counting argument shows that a “bad” must fill of such slots. The other ways can be bad include dividing the denominator of a root or the denominator of a coefficient —although neither way is mentioned in the paper it seems the choices for and in the above theorem leave just enough headroom. Then we just need to be a bound on all numerators and denominators involved in the bad cases, arguing as before. Last, it seems as above that only a subset of the roots with constant (or at least non-negligible) is needed to obey this bound. Assuming this sketch of Dick’s full argument is airtight and works for our improved result, we leave its possible further ramifications over the integer case as a further open problem.
Update 7/10: >
I’ve made a web folder of photos from Dick’s wedding and honeymoon.
[linked wedding announcement, clarified nature of Shor’s phi, added more about gems, linked photos, fixed n/2 – eps to n/2 – eps*n in theorem statement]