Factoring Could be Easy
Factoring and diophantine equations
Dan Boneh is one of the top cryptographers in the world, one of the best. He combines the ability to prove deep mathematical theorems with the ability to design and build real systems. He combines the ability to break crypto-systems with the ability to create new ones. I believe these are hard combinations to find: who else can use Weil Pairings while building real running systems. Not many.
I remember when he was a graduate student at Princeton–I was his advisor–I asked him if he knew of any “simple” proxy engines. A proxy engine is a piece of software that acts as a kind of repeater for Internet packets. A packet from arrives at the proxy engine, the engine sends the packet on to ; on the return path the packet from is redirected back to . Often some other processing is done to the packets, sometimes as simple as logging, sometimes much more. Proxy engines can be quite complex, since they must deal with various failures modes and other issues, but the simplest possible one would act like an identity function: how complex can that be?
At the time I was working on a problem and needed a proxy engine that would be easy to modify. The ones I found on the Internet were quite complex: huge general purpose systems. They were like huge gray battleships–big and complex, great for some tasks, but way beyond what you might want for a day cruise around the bay.
So I asked Dan if he knew of any simple proxy engines. He replied that he did, but he had a sheepish grin. He knew of one because he had written his own in Perl, and it was one page of code. One page. I was surprised, and asked him the obvious question, “why did you do that?” He said, partly to learn how the Internet protocols worked and partly for another reason that I will not say. His code was great and for a while, after he put it on his home-page, there was a small “Dan Boneh’s Proxy Users Group”. Many used his proxy engine, it was complete, yet elegant. Great for a day cruise on the bay.
Factoring Again: Always Factoring
One of the themes that we pursed, while Dan was a graduate student, was: is factoring easy? I believed (and still do) that factoring could be easy. I cannot speak for Dan then or now, but as long as the complexity of factoring is open, it makes sense to look at what are the consequences of assuming that factoring is hard. One possibility is that we could show that assuming it is hard yields a contradiction: this would be great, but seems unlikely. A more interesting outcome we felt was that we might be able to show that it has strong mathematical consequences. Our dream was to prove some deep mathematical theorem based on the asssumption that factoring was hard.
Let’s call the assumption that factoring is hard . You can think of this as saying that there are no polynomial size circuits for factoring. Later on I will state the exact assumption more carefully. In any event, the assumption we are making concerns circuits, not algorithms. This is an important difference. There could be a polynomial size circuit that can factor each size of numbers, without there being a polynomial time algorithm for factoring. Circuits are a type of non-uniform model of computation. Most people who believe that factoring is hard probably also believe that factoring does not have small circuits. The ideas we will discuss today only apply to circuits.
Dan and I tried to prove theorems of the form:
where was the statement that some Diophantine equation had only a finite number of solutions. What was interesting about these results was that proving the same result without looks beyond the ability of current number theory.
Here is an example that I have thought about recently. Let be the primes in the interval and let . Suppose that we could compute
by a straight-line computation of length polynomial in (Recall a straight-line computation is one that starts with the input variables, and at each step performs an arithmetic operation. The length of the computation is the number of operations performed, and the last step is the answer or the output.) Then, factoring based crypto-systems would be broken. The proof is quite simple–I will explain the details shortly.
We can make this even more concrete. Suppose that
was equal to
where and all are less than . Then, again we could break factoring based crypto-systems. When Dan was still a student he gave a talk on earlier versions of these ideas to an audience that contained a very strong number theorist. After hearing Dan’s talk, he sent us a multi-page “outline” of a potential proof that a much weaker Diophantine problem had no solutions. It was only a sketch, it used another unproved hypothesis, and was incomplete. Our reply was we could show that the Diophantine equation in question had only a finite number of solutions with a trivial argument, provided we could assume . I think the number theorist missed the point: we were showing how powerful was.
How To Break Crypto-Systems
Let’s get back to showing that if the above Diophantine problems have solutions, then we can break crypto-systems based on factoring. The key is a basic idea from computational number theory: called the fast modular exponentiation method. I will discuss this in detail in a later post, but for now you need to know that one can compute
in time that is polynomial in the number of bits in Note, the number could be huge. But the value of modulo must lie between and This “trick” also uses repeated squaring, but for now, if you do not know this method, please accept that is easy to compute even for large numbers.
Suppose that the crypto-system selects two random primes and from a given range. Then, is made public. Our goal is to find the value of and . Assume that lies in the range and that does not. Then, recall that is equal to
where are the primes in the interval . Note, that the greatest common divisor (gcd) of and is equal to . Thus, we are done if we can compute this value in polynomial time. But the assumption is that is also equal to
where and all are less than . The critical point is that we cannot compute it’s way too big, but we can use the exponential trick to compute . Call this last value . By basic properties of the gcd ( denote the gcd of and by ), it follows that
There is a point that was already raised in one comment: The existence of an that is equal to is enough to guarantee the existence of a circuit that factors. A circuit is not an algorithm, so the fact that finding the values of the could be very hard is immaterial. If they exist, then there has to be a circuit that can break the systems as described. Note, this is weaker than an algorithm, but would be, I believe, be upsetting to the crypto-community.
This is what happens. If the above Diophantine equation has a solution for a given size then, there exists a series of gates that make up a boolean circuit of size at most polynomial in This is a “magic” object. Its gates describe a computation that runs fast, and breaks any crypto-system that uses primes as above. It is magic because we do not know the actual circuit. But someone could be smarter than us and find the magic circuit, they could then program their laptop so that it would smash our crypto-systems as described above. This is the threat.
There are two more points to make about this approach. It clearly depends on “knowing” an interval that contains one prime and not the other. There is nothing special about the range any suitable range would work. We only need that there is a range where it is likely that one prime lies and not the other. More exactly if the crypto-system selects the primes and from some range, then we only need that there is an interval so that we can compute
for the primes in this interval. And also that when the primes are selected randomly with a reasonable probability only one of the primes will be in the interval .
Second, the assumption that contains all the primes in the interval is over-kill. Even if many of the exponents were , then the method would still work. As long as a polynomial fraction of the primes had positive exponents, the method would still break the system with a reasonable probability.
I promised earlier to state what means. For us it will mean there is no boolean circuit (not algorithm) that can break a crypto-system that selects random primes and We must know an interval that is likely to contain only one of the primes. That’s the assumption. Note, almost all schemes that use primes can be made to fit this model. For schemes that are more complex we can generalize the methods here to handle them too.
There are two types of open problems. First, can we show that the above equations have no solutions. In particular, can we show that
cannot be equal to
where and all are less than A challenge:
If you believe that factoring has no small circuits, then you must also believe that the above Diophantine equations have no solutions. Prove it.
Things are even worse. Even if the equations had solutions for an infinite number of , there would still be a problem for cryptography. Actually, even if the above equation held for, say, , then certain systems would be broken. This seems to me to be worrisome. This and other ideas make me believe that it is quite possible that is false. That the conventional wisdom is wrong.
Second, are there similar questions that can be proved based on assuming that discrete logarithm is hard? One of the great mysteries in computational number theory is why are factoring and discrete logarithm linked. Almost every time a better algorithm is discovered for one, there is shortly a similar improvement for the other. This is true for classic algorithms as well as for quantum algorithms. More on this in a future post. A most perplexing question, since there is no known direct reduction from one to the other.