The search for a factoring algorithm

Matt Damon is one of Hollywood’s iconic actors. His career was launched as both actor and screenwriter in the movie Good Will Hunting, in which he played a janitor who solves open problems in algebraic graph theory that are posed by a Fields Medal-winning MIT professor. He is recently known for playing Jason Bourne in movies The Bourne Identity, The Bourne Supremacy, and The Bourne Ultimatum based on Robert Ludlum’s thriller novels. The next movie in the series, The Bourne Legacy, does not have Matt Damon and does not have the Jason Bourne character. So where are Damon and Bourne?

Today Ken and I would like to discuss The Bourne Factor, the movie and its plot.

Why haven’t you heard about a movie called The Bourne Factor? Besides starring Damon once again as Jason Bourne, it has Kate Beckinsale as his love interest, and Helen Mirren as their powerful adversary. Why rush out in its place a non-Bourne Bourne movie sans Damon, and with a different director? I wonder if the real-life story of The Bourne Factor may have become a bit too real-life…

The problem is that the movie is based on the problem. Not ${\mathsf{P}=\mathsf{NP}}$. The problem is always the factoring of integers. Always. As you might imagine it is exciting to see a movie based on one of my favorite problems—it has been so long since Sneakers. Len Adleman made sure the equations filling blackboards in the opening scenes of Sneakers were real and about factoring. Damon undoubtedly wanted up-to-date equations. But what if he or someone else on the set found a new way to solve one of them?

As I said the plot plays out in real life. In real life…

The story starts when a graduate student, played by Brie Larson, discovers, almost by accident, a new fact about integers. Brie gives her paper explaining it to her bored professor, played by Beckinsale, at the end of a class. That night Beckinsale cannot sleep, so she decides to glance at the paper. Kate expected it to be wrong, but realizes it is correct, and that Brie has made a major discovery. Not only this, she realizes that the discovery can be used to factor integers—and thereby break RSA. She e-mails Brie congratulating her on the breakthrough, and suggests they meet in the morning at Starbuck’s.

Beckinsale is stood up, and after finally giving up and starting to go to her office, she decides to stop by Brie’s dorm. Kate finds Brie’s door unlocked, enters her room, and discovers that Brie has been killed in a horrible manner and her laptop stolen. She calls 911, but something scares her and she leaves before finishing the call.

An attempt on Kate’s life forces her to go on the run. She is actually—unknown to her—running from two groups. One group wants the secret, while the other group, Mirren’s people, seems only to want the secret destroyed. Damon enters the movie as Beckinsale’s ex-boyfriend, who at first does not understand why anyone would care about Kate’s secret. But when she is almost killed by a sniper through a computer lab window, he reluctantly agrees to help her get to safety.

There is a very unusual car chase—I will not spill the details here but it has to do with hijacking communications between self-guiding cars. There are many near misses and surprises as the movie unfolds, with government agents seeming to be on all three sides. There is a twist when Beckinsale is kidnapped by Mirren and Damon has to rescue her before a certain test related to David Deutsch’s explanation of how quantum computers work is carried out on her…

$\displaystyle \S$

Okay I am just dreaming. I think we could write such a script, and it would be a cool movie. But what I would like today is to discuss what we know about factoring and what we do not know. In the brilliant 1993 HBO movie And the Band Played On about the start of the AIDS epidemic one of the discussions at CDC starts:

Dr. Jim Curran: All right, what do we think? What do we know? What can we prove?

## Think

Most think that integer factoring is computationally hard—in particular, that factoring sufficiently large ${N}$ that are the product of two randomly selected large primes is hard. It is critical that this problem not just be hard in worst case, but that it is hard in this strong sense: Pick two random ${p}$ and ${q}$ randomly in a large enough range, then given ${N=pq}$, finding ${p}$ and ${q}$ is hard in all but a vanishing fraction of cases.

Note that for even “harder” problems such as ${\mathsf{SAT}}$, generating hard instances is quite non-trivial. And even more, it appears for ${\mathsf{SAT}}$ that all kinds of randomly generated instances may actually be easy. Certainly one must use great care in generating sets of clauses to make them hard: too few clauses are easy and too many are easy. What makes factoring so useful for cryptography is that it appears to be very easy to generate random hard instances.

It is important to have hardness for a great many instances. The unwelcome clustering of RSA keys ${N}$ currently in use that was found in the paper by Arjen Lenstra and others may owe at least in part to unwitting repetition in the selection of primes ${p}$ and ${q}$ based on too-simple use of pseudo-random number generators.

## Know

We do know that it is unlikely that factoring is very hard, that it is ${\mathsf{NP}}$-complete for example. This is because factoring lies in ${\mathsf{NP} \cap \mathsf{co-NP}}$, and because factoring has quite good algorithms such as the famous Number Field Sieve.

We also know that there is a quantum polynomial time algorithm for factoring due to Peter Shor. Many use this to argue that quantum is more powerful than classic algorithms. Yet, if there were a sufficiently good simulation of Shor’s algorithm, then factoring would be easy. We have just discussed this in a recent post—see here.

## Prove

Nothing. Well not quite. It is known that ${\mathsf{AC}_{0}}$ cannot compute factoring. Peter Sarnak and Gil Kalai recently formulated an attack on this problem, and Ben Green solved it.

But it a real sense for all we know factoring could have a linear time algorithm. This may be of low probability but it is possible.

$\displaystyle \S$

There is another thread that I have worked on, for years, to understand factoring. Suppose factoring is indeed hard, then what are the consequences of this assumption? Not consequences like “we can build a secure cryptosystem,” but consequences in other parts of mathematics. For example, consider equations of the form

$\displaystyle (2^{n})! = \sum_{i=1}^{m} a_{i}b_{i}^{c_{i}},$

where ${m}$ is polynomial in ${n}$ and ${a_{i}}$, ${b_{i}}$ and ${c_{i}}$ are numbers with only polynomial in ${n}$ bits. Note that the number of bits in the left hand side is exponential in ${n}$, but the right hand size also has exponential in ${n}$ bits—so no trivial size-based argument will destroy the possibility that such an identity is true. If it is true in general, then factoring has polynomial size circuits. Even better, ${(2^{n})!}$ can be replaced by a number that has many factors. Thus the hardness of factoring would show at a sweep that certain kinds of equation Pierre de Fermat might have written in his margins are unsolvable.

This is even more concrete than the connection found in the famous “Natural Proofs” work of Alexander Razborov and Stephen Rudich. They proved that if factoring is hard in the strongest possible senses, than there cannot be certain kinds of proofs that ${\mathsf{SAT}}$—or any other similar problem—lacks polynomial-size circuits.

## Open Problems

Would you like to see another movie on this subject? Could it actually work? Perhaps I will take the summer off and try to write a screenplay. If you are an agent please give me a call at 555-385-1212.

1. February 27, 2012 9:31 am

If you came up with a practical polynomial algorithm for factoring what would you do? Simply post on Arxiv? Announce it in a comment here? Don’t publicize it but quietly submit to a top journal? Send it only to a few peers? Notify the CIA?

Seriously.

• February 28, 2012 6:54 am

LOL … Imre, here’s a concrete answer to your question: William Macready, Geordie Rose, and Peter Love, System, Method, and Apparatus for Factoring Numbers (USPTO 12/849,588, 2010).

So perhaps one component of Hollywood’s disinterest in Pip’s fictional screenplays is that documentaries are known to be in-production?

February 29, 2012 2:26 pm

paranoid man’s guide to publicizing your factoring algorithm and getting credit without getting killed.

your main concern is that the ominipotent NSA will use deep packet inspection to intercept your brilliant code en-route to the recipients you wish to share with. your secondary concern is you don’t want to hurt the economy and people’s livelihoods by just dumping the code out there before important infrastructure has time to get off RSA.

first confirm your algorithm works on the RSA challenge numbers. (i notice most solutions of SAT do not do this.)

second, set up redundant dead-man’s-switch emails set to forward your code to as many interested people as possible in multiple countries. this is insurance in case you are kidnapped by the crypto mafia.

third, send the RSA challenge factors signed with some signature scheme which does not depend on factoring to cryptome, wikileaks and those types of folks from a public wifi hotspot in a different city, using a computer bought with non-sequential bills from a blind shopkeeper, masking your scent, etc. the goal here is to publicize the fact that RSA is broken and instigate the shift away from it, so that there is no longer value in being one of the few parties with access to the factoring algorithm.

after the fuss has died down and important communications are mostly off of RSA and onto ECC of some sort, you can publish the factoring algorithm signed with the same private key as before to prove your priority and collect your fame, movie cameos, women, etc.

• March 1, 2012 9:50 pm

Off the top of my head: you go to New York, buy a web-enabled device for cash, and anonymously post the factors of all the RSA challenge numbers along with a cryptographically strong (in light of the break) authentication or signature, wearing a loose hoodie in a crowded place with as few cameras as possible. Then you leave on a bus and wait to think your next move over as the world debates whether you worked for RSA or not.

2. February 27, 2012 9:33 am

Far more thrilling than any spy movie is a movie that (uniquely) has no villains in it at all: Ron Howard’s Apollo 13. So I would like to see a 21st century version of it titled QIST 3.0.

———————————

PLOT SUMMARY   A spacecraft with ${7\times 10^7}$ people on it is short on resources and on-trajectory to burn-up in the atmosphere.

By heroic effort, personal sacrifice, and ingenious creativity, a team of astronaut-engineers saves the day.

CAST

Flight Director:   Steven Chu

Astronaut Team:   Geordie Rose, Barbara Terhal, and Scott Aaronson.

Engineer Team: John Preskill, Carlton Caves, Michael Nielsen, and Charles Bennett

Reporter: Dick Lipton

DIALOG SAMPLES

Nielsen:   “Chief, the planetary systems are failing!”

Chu:   “Let’s focus our minds, people … what have we got on the planet that’s good?”

Preskill:   “Well chief, we’ve got this second-generation QIST module — it was intended for quantum computing, but maybe we can repurpose it for planetary life support.”

Chu:   “I don’t care about what anything was DESIGNED to do, I care about what it CAN do.”

Caves:   “I suggest, gentlemen, that we invent a way to put a square peg in a round hole. Rapidly.”

Rose:   “Folks, that planet’s getting awfully big in the viewport!”

Terhal:   “All right, there’s a thousand things that have to happen in order. We are on number eight.”

Bennett:   “We’ve got half the Ph.D.’s on the planet working on it.”

Aaronson:   “We just can’t throw this together at the last minute. So you’re gonna get the procedure up to us, whatever it is.”

Nielsen:   “Goddamnit! We don’t want another estimate! We want the procedure! Now!”

Lipton (worried):   “This could be the worst disaster the STEM community’s ever experienced.”

Chu:   “With all due respect, I believe this is gonna be our finest hour.”

———————————

Now, wouldn’t QIST 3.0 be better than a silly Matt Damon movie? And the best thing is, we need not wait for Hollywood to call. Because the filming of QIST 3.0 already is underway … and we are all of us starring in it!  :)

February 27, 2012 2:03 pm

Yes, I WOULD like to see another movie on the subject. Sneakers was an EXCELLENT film… one of my all-time favorites.

If I were to guess, I would say that there is probably a significantly faster integer factoring algorithm out there waiting to be discovered (faster than the Number Field Sieve)… but that factoring cannot be done in polynomial time.

February 28, 2012 6:58 pm

Ernie,

Well I certainly agree with half of what you say.

4. February 28, 2012 7:20 pm

There are several interesting challanges which represent even very tiny progress in factoring. An obvious challenge is to improve the world-record algorithm which is (but we cannot prove it) exponential in the third root of the number of digits.

Another challenge offer by Sarnak is to find a sequence a_1,a_2,…a_n whose entries required polynomial time computation (in log n) with plusminus 1 entries, which has, in the interval [1,n], a positive, bounded away from zero, correlation with the Mobius function.

We can also ask: if you have such a sequence with small correlation >= epsilon can you amplify in polynomial time (or more) the correlation to say 2epsilon.

We can also ask, what is the best correlation (as a function of n) that you can achieve.?

You take the all 1 function and change it to -1 for primes; this give you correlation 1/log n. Can you improve it by just a little bit??

Another quite different thing we can try to do for the interval [1,n] is to compute all the small primes and then modify the all 1 sequence by letting a_m be defined according to how m relates to these small primes. I dont you if you can approach (+-) 1/log n correlation by such a method or what you can do. We can try to see what is the best we can do with computation power exp((log n)^{1/4})

5. March 1, 2012 12:17 pm

Someone posted this. The complexity is unknown.
http://eprint.iacr.org/2012/097.pdf