A learning problem that is related to factoring

Peter Shor is the discover of the famous quantum algorithm for fast integer factoring. It is an algorithm that launched a thousand papers—with all due apologizes to Christopher Marlowe, who wrote:

$\dots$ that launched a thousand ships, and burnt the topless towers of Ilium?

“The topless towers of Ilium” mean “the high towers of Troy.” Perhaps the high towers stand for the world of classic algorithms.

Today I want to talk about a learning problem that is related to integer factoring.

The Last Step Of Shor’s Algorithm

Shor’s quantum factoring algorithm can be viewed as implementing an oracle. The oracle depends on the number you wish to factor, ${N}$, but it does not return a factor. Instead it returns another number ${r}$ that allows you to factor ${N}$ via a polynomial time classic algorithm. The brilliant insight of Peter is twofold: such an oracle suffices to factor, and such an oracle can be implemented by a quantum computation.

Here is how the oracle works. Each time we ask the oracle it returns a natural number ${\ell}$ with the following property: ${\ell}$ is such that ${|\ell r - k Q| \le r/2}$ for some ${k}$. The number ${Q}$ is much larger than ${N}$, but is a known power of ${2}$. The number ${r}$ is the “secret” that allows us to factor ${N}$. Further ${k}$ is a uniform random integer in the interval ${[0,1,\dots,r-1]}$.

The number ${r}$ can be recovered by the continued fraction method, or by noticing that recovering ${r}$ is equivalent to solving the system of integer equations:

$\displaystyle \begin{array}{rcl} \ell x - Q y &=& z \\ -x/2 &\le& z \\ z &\le& x/2. \end{array}$

This is an integer programing problem in ${x,y,z}$, where the value of ${x}$ is the value ${r}$ that we want. Note in general integer programming is ${\mathsf{NP}}$-hard, but for a fixed number of variables it is in polynomial time thanks to the brilliant work of Hendrik Lenstra, which is based on the beautiful LLL algorithm of Arjen Lenstra, Hendrik Lenstra, and László Lovász.

Our Problem

Shor’s oracle ${\mathcal{O}}$ is very powerful since it allows one to factor ${N}$ in total time that is polynomial in the length ${n}$ of ${N}$, where ${n \approx \log N}$. The quantum part and the classical part both run in ${n^{O(1)}}$ time. We wish to study what happens if we replace his powerful oracle by a much weaker one that only gives us partial information about ${\ell}$ each time. The question is, can such an oracle still be used to find ${r}$?

When we ask the new oracle ${\mathcal{O}'}$ for information, it selects an ${\ell}$ just as before, and the value satisfies the same inequality,

$\displaystyle |\ell r - k Q| \le r/2.$

However we only get to see ${s}$ bits of ${\ell}$. We can pick which ${s}$ bits before the oracle call, but those are the only positions we see. They can vary from one oracle call to another. Thus we could ask for the bits in positions ${1,2,3}$ one time and then ${1,5,6}$ another time. Of course ${Q}$ is still constant and known, while ${k}$ varies with ${\ell}$ and is unknown.

Can we still find the value of ${r}$? We now allow many oracle calls, and each can ask for a different or the same set of bits. It seems plausible that some information is leaked, but perhaps not. The trouble is that the value of ${k}$, and hence ${\ell}$, can change between calls. Thus the ${s}$-many bits of ${\ell}$ may be no help for completing the bits of a different ${\ell'}$. We can generate many equations, but they involve huge disjunctions of congruences. It may help, however, that values of ${k}$ arise uniformly.

We might suppose that ${s}$ is at least linear in ${n}$. It is possible for ${Q}$, and hence ${\ell}$, to have length ${n^2}$ or more, and that ${s}$ be a similar polynomial in ${n}$. However, we would even be interested in a method that used ${2^{n^{\epsilon}}}$ oracle calls and each time got ${n^{\epsilon}}$ bits of the ${\ell}$ for that call.

Open Problems

The interest in the problem is we believe that we may be able to construct such an oracle without using quantum methods. We are still checking this carefully, but in any event the above seems like an interesting problem. There are many examples in cryptography where partial leakage of information still breaks a system. Does this happen here?

June 13, 2011 8:10 am

So for us amateurs out there reading this blog, can someone explain what exactly it means for an integer program with a “fixed” number of variables to be polynomially solvable? This seems either trivial or incredibly profound.

• June 13, 2011 9:18 am

It actually is fairly profound. Let n stand for the length of the program including the precision needed to specify rational numbers it involves, and let k stand for the number of variables. It means there is an algorithm with running time of the form poly(n)*exp(k). Since k generally scales with n, this is an exponential-time algorithm overall, but when k is fixed the “exp(k)” term becomes just a big constant on the “O(…)” of the poly(n) part.

The general nomenclature for this is that integer programming is fixed-parameter tractable (FPT) when k is the parameter. There is a wide and important literature on problems in FPT and on problems for which there is hardness/completeness evidence against being in FPT, which can be accessed starting from Wikipedia’s `Parameterized_complexity’ article. (By specifying “exp(k)” I’ve actually put the problem into a narrower class than FPT.)

June 13, 2011 9:23 am

Pete,

It is deep. Perhaps I should do whole blog on it. But for now here is some intuition that will help. Consider the equation ax+by = 1 where a,b are integers and you want to find x,y integers to make this true. The equation could be 7378612746898487484487489894874746478x + 1448989087484747874874874847847847487474872499949449849y = 1. You find the x,y if they exist without some plan: simple search is hopeless. But Euclid already knew how to solve this—its an extended GCD problem.

Now allow 3,4,5, or more variables and the problem gets harder and harder. For any number its NP-complete. If the number of variables is fixed, say less than 10, then the surprise is there is a way to solve it in polynomial time.

Does this help?

June 14, 2011 12:53 am

Complete novice here, hoping for some clarification.

Re “For any number its NP-complete.” Do you mean as the number of variables increases with out bound (i.e. infinite) it is NP-complete?

Re “If the number of variables is fixed, say less than 10….” Do you mean that the general solution is polynomial time and IF there are around 10 variables the exp(k) derived constant (per KWRegan’s comment) is small enough for it to be solveable?

June 13, 2011 9:42 am

Thanks. I guess it just seems a little odd to me that this would be a big deal. I mean, for instance, say we had some algorithm for doing something to a set of size n that took O(2^n). For any fixed n, this is O(1). But this doesn’t mean this problem is inherently easy in any sense (right?).

June 13, 2011 10:03 am

Although Dick, now that I think more about that example you gave it does seem more powerful than I am giving it credit for. Thanks.

3. June 13, 2011 11:51 am

I’ve seen this before, in another form, but a very similar problem. The conclusion I’ve reached so far (but reserve the right to change my mind) is that the worst case of the larger iteration (involving L prime) is in NP, because essentially it is an unbounded loop. I’m sure this isn’t the right terminology, but I think of it as the limit (worst case) of the loop goes to expontial because the very worst case is having to re-assemble L from all of the indivdual permutations if you don’t get lucky and assemble it earlier. Or in a sense, it is heurisical in P (mostly works), algorithmic in NP (always works). (Sorry about the rather convoluted explanation 🙂

I haven’t looked very deeply, but I find I’m uncomfortable with with the apparent extended behavior of the quantum complexity classes. I keep wondering if they are just an artifact of an impedance mismatch:

Paul.

June 13, 2011 2:31 pm

Some clarification please: is r a natural number? Can we assume r>=1? Q is both known and a power of two or just known to be a power of two?