Ghostbusting Old Factoring Ideas
Can ‘Shades’ of Carmichael Numbers be busted?
MAA source |
Robert Carmichael was an American mathematician who worked and published in various areas of mathematics, but is best known for numbers that were named after him. It’s a rare treat to have your own numbers; our famous Manuel Blum has his:
A Carmichael number is a product of two or more—actually three or more—distinct primes , each of which is such that divides —or equivalently, such that the least common multiple of all such divides . Carmichael found the first example, . Note that 2, 10, and 16 all divide 560.
Today Ken and I want to talk about Carmichael numbers and their relationship to the factoring of integers.
The relationship is that Carmichael numbers masquerade as being prime by enjoying the property of Fermat’s Little Theorem: for all relatively prime to ,
The significance is that fails the first part of the randomized compositeness test devised in the late 1970′s by Michael Rabin after earlier work by Gary Miller. Call a relatively prime number chosen at random a trial. Rabin proved that for all odd and of the trials, at least one of his two parts must succeed. Hence for Carmichael all the success probability is on the second part (and is higher). This is great because Rabin’s second part not only tells you is composite, it gives you a factor.
Of course factoring has been one of the world’s enduring mysteries for decades. Another is whether there will be a third “Ghostbusters” movie. Cryptographers needing to protect schemes whose security requires the hardness of factoring would like to “bust” heuristic attacks on factoring, by showing they have only negligible chance to succeed on numbers they care about, ones that are definitely not Carmichael numbers. But like ghosts in the two movies, attack ideas keep popping up. Here we talk about a variant of several old ideas in which we try to change an into an that is somehow a “shade” of a Carmichael number, such that getting a factor of might often give us one of .
A Caveat About Tables
Carmichael himself wrote, with Edwin Smith, the book Mathematical Tables and Formulas published in 1931. It included five-place tables of many of the standard functions, such as logarithms. and other functions. Clearly, such books were once of great importance, used often by engineers and mathematicians.
What is the logarithm of 7 to base 2? We wonder if Volker Strassen used a table to look this up? The key was that it was less than 3, but knowing it is approximately 2.807354922… is quite important. This is of course the exponent on Strassen’s famous fast matrix product algorithm. By the way some web pages that discuss his algorithm quote the constant as approximately 2.808, which is a bit off.
You may find it hard to believe that we once had books on our shelves that contained tables, since now we can compute any of these functions instantly on our laptop or even on our phone—see here [urlsic] for computing logs. But once they were very much a necessary item for any well-equipped researcher.
The one problem with tables, however, is that at some point they run out. Computers can effectively give you the power to make bigger tables, but the problem may be merely deferred. An idea can look good on small numbers in tables, or even on fairly big numbers tested by computer, but utterly fail on large numbers that people would ultimately care about. So also take our small examples in this post with a grain of salt—a lot more work is needed to tell whether ideas like what follows are possible or can be “busted.”
A Generic Attack
Let be a subset of the natural numbers. They are required to have one property: there is a fast factoring algorithm that works on elements of . We allow to be probabilistic or even to not be an algorithm, but to be a circuit. By “fast” we would like polynomial time, yet even time just better than known methods would be acceptable. Since we can combine fast factoring algorithms, we can suppose includes the Carmichael numbers.
The factoring method is quite simple:
- Select a number and compute . Call a probe.
- Try on .
- If returns a factor of , then we are done.
- If returns a wrong answer or returns just a divisor of , then go back to step (1) and repeat.
The probe might be a prime number, to minimize the chance of getting nothing in the last step, and might be chosen at random. What we wonder is whether probes can be chosen strategically to improve the prospects of hitting the success set of . At least we know from work by Red Alford, Andrew Granville, and Carl Pomerance that there are infinitely many Carmichael numbers, indeed that the number of Carmichaels below any given number satisfies
for large enough . In 1956 Paul Erdős conjectured that there were many more such numbers—note the time reversal. He reasoned there should be at least . Pomerance used heuristic arguments to guess that there might be many more. Steven Hayman and Andrew Shallue constructed a Carmichael number with over ten billion factors, drawing on a quantum algorithm by Greg Kuperberg. But all we actually need is a set of numbers that are “Carmichael-like” enough for to succeed, and either dense and divisor-rich enough for random probes to hit it with high probability, or distinctive in a way that successive probes can use earlier-gathered information to drive toward it.
src |
One case where our probe idea is silly is when the success of depends on having a factor with a special property. One property that has been used is when is smooth, meaning is a product of primes that are all small, or powersmooth, meaning also that none of those primes is raised to a high power. The point is that if this is true of and is different from , then it was already true of , so the probe didn’t gain anything. Thus for instance our idea isn’t useful when is an original form of the algorithm devised by John Pollard. But we think the variant of Pollard’s gcd-based idea used in the Rabin-Miller “part 2″ might escape this problem.
Part II
The Rabin-Miller idea is to write uniquely as with odd. Then given a trial , we actually work with the number . The “Part II” is to compute
For good measure we can try in the gcd’s too. If all of these return (or an immediate zero) we fail, but anything else gives us a factor of .
We pause to note that Pollard’s idea was to try the gcd with and where ranges over products of small primes and over small powers of them. Then a powersmooth would surely yield in the wash. But here we do not demand sureness, only a fighting chance of getting a factor.
One thing that intuitively helps is if we can probe to so that is divisible by a large power of . Then we get a large number of cracks at the gcd. The special role of powers of is similar to that in Miller’s original idea which we covered here.
Here is a small example to show that combining this with our probe idea at least isn’t silly. Consider which is incidentally a Blum integer. We have so we get three cracks at a gcd. Consider the trial . Alas this is a “miss” because:
- but is prime;
- but is coprime to ;
- but is useless.
However, the probe makes , which gives eight cracks. But for the same trial we only need the first one: which divides .
One can make tables of other small cases that can make our idea look good. But remember the caveat that we would need better reasons for optimism when gets large. The prospective “buster,” however, has the same problem. It’s not enough to argue that Carmichael numbers that are multiples of a given may not exist or may be too big or sparse for probing to hit them, because this part may need only some shadow of the Carmichael property and Rabin’s theorem to succeed.
Even we have and a prime probe fails by returning just , we can derive a kind of heuristic consolation: If we’d instead had and happened to hit on the probe , then the same outcome would be a success. The same for with the probe . When has prime factors and gives one of them, the density of for which the given factor is or is still non-negligible—no worse than . This is one reason we think our probe idea is worth considering.
Open Problems
Can there be a Rabin-Miller “Part III”? Are there smart strategies for choosing successive probes that have a non-negligible chance of hitting the set of numbers for which a non-negligible portion of trials succeed? How dense is such an anyway? Or can this idea be “busted” along similar lines to the analysis of Pollard’s or other heuristics?
[added "three-or-more" to intro]
Do you think if factoring is easy, all one way functions can be broken?
I am unsure how Levin’s construction of Universal OWFs is related to factoring and discrete log problems. May be you can do a post sometime?
QuestionMan:
Well I think P=NP could be true, so yes. But even if not equal one-way is a much harder issue. Good idea on a possible topic.
Thamks
You are not alone. Two of my professors think so and one of my professors said a Turing award winner also said so. It is not difficult to come up with a problem with parameters that when changed give problems in the following hierarchy:
P -> Integer Factor/Discrete Log -> NP complete
Firstly: My feeling is Levin’s Universal OWF should be sitting there after P and before NP complete. If it sits before Integer factor/Discrete Log and these problems are in P the no Universal OWF (someone is going to write a paper on this).
It possibly cannot sit after Integer Factor. Can it? After all OWFs are easier than NP complete. So they should philosophically be as close as possibly to P.
Secondly, for no such problem where one can come up with such a parametrization hierarchy (P -> Integer Factor/Discrete Log -> NP complete) there is a gap between the last two (if there is a gap we have proved P!=NP).
In fact, philosophically and probably technically some groups are not far from a possible outline solution for Integer factor/Discrete log. A modification of the problems I am hinting has parameters which has solution for the Integer factor/Discrete log in P but so far there is no map from those modified problems to Integer factor/Discrete log.
I believe P != NP, indeed in an exp(\Omega(n/log^k n)) circuit size lower bound for n-variable 3SAT (k = 1 or 2), but side with Dick that factoring is in BPP. I’m agnostic on getting k=0 (for any language in E) which would make BPP=P.
Well it is interesting that you would not take sides in the battle of P and BPP.
Razborov shows if we cant say much about OWF we cannot say much about P and NP either. I would be really interested in knowing how far Levin’s UOWF stands next to P and if BPP can contain it (like you are suspecting contains Integer Factoring). Given the equivalence to existence of PRNGs, may be this question can be attempted with known mathematical machinery.
Or in other words where exactly does Levin’s UOWF sits in the Ladner hierarchy (if it exists)? Whether in the bottom of bottom close to P or to the top of the top close to NP-complete (I suspect these two boundary conditions as the only reasonable ones where Levin’s UOWF can exist). I would suspect the second case may not be possible simply because again with the close connection with PRNGs where simple polynomial complexity suffices to create randomness. So if the second case is true we may be giving randomness huge power to solve all problems just below NP-complete. This then will wildly go against conventional wisdom that P=BPP.
Just my suspicion. Your opinion welcome.
Unfortunately Miller Part 2 will not work very often. The clearest way to see this is by considering the order of 2 mod p1 and mod p2. If N*-1=2^s*v and the order of 2 mod p1 is 2^i1*k1 and 2^i2*k2 mod p2 then I think that Miller Part 2 works iff i1 is not equal to i2 and at least one of k1 and k2 divides v. Since k1 and k2 will tend to be large the chance of success will essentially be the same the chance of success of just trying to guess a multiple of p1-1 or p2-1 and using the algorithm you discussed a while ago in A Lemma on Factoring.
This is with respect to a=2 to be concrete, but the case for general a’s is the same.
Thanks. As I read this, the import with further regard to our “probing” idea is that in going from N to N* = rN = 2^t w, say, we would be effectively trying to guess r so that w becomes a multiple of the unknown k1 or k2. The nub is whether it is possible to gather any partial information from earlier probes that would help in finding such an r. To answer “Tester” too, we haven’t made a concrete strategy and are most immediately interested in (and grateful for) statements of obstacles. Moreover I found the fascinating Hayman-Shallue–>Kuperberg connection only yesterday while doing a final pass on the post, but several “insane” chess-cheating developments this week and work on the FIDE-ACP commission I’ve been appointed to are taking precedence.
Did you give your new factoring idea a try on classical benchmarks like 8051, F_5 or higher Fermat numbers or even an RSA-challenge? Also, the question is how to select your probes (a small prime, a random prime of length(N)/2, or what else)?