The history of the orbit problem and its generalizations

Ravi Kannan is one the top researchers in the known world on computational complexity aspects of linear algebra. His Ph.D. thesis is on the complexity of computing the Smith Normal Form of a matrix; one of his best results is on computing the approximate volume of a convex body in high dimensions; and his most “magical” result is on the power of random sampling for linear algebraic computations. He was awarded the Fulkerson Prize with Martin Dyer and Alan Frieze for their wonderful work on the volume problem.

Today I would like to talk about a problem that Ravi and I worked on years ago: called the orbit problem. The result has an interesting history, and has some nice generalizations. Yet, there still are a number of open problems surrounding the problem.

I first met Ravi when we both arrived, in the fall of 1978, to the computer science department at Berkeley. He came as a post-doc and I as a tenured faculty. We instantly “clicked,” and have been close friends ever since.

As I said earlier, Ravi’s thesis work was on the Smith Normal Form of a matrix. This is a quite beautiful result, but it addresses a subtle problem. Before his work there were algorithms that computed the normal form with only a polynomial number of operations. However, it is not obvious that these methods run in polynomial time.

Confused? Polynomial time requires not only a polynomial number of operations, but also that the size of each number computed during the computation must be polynomial. If the numbers get too big, then the algorithm is not in polynomial time. Ravi’s theorem was that the numbers do stay appropriately bounded, and so the algorithm for Smith Normal Form is really in polynomial time.

If you think such questions are trivial, prove that there is a polynomial time algorithm for finding the inverse of a non-singular matrix. The obvious answer is to use Gaussian elimination, which is easily shown to take at most ${O(n^{3})}$ arithmetic operations for a ${n}$ by ${n}$ matrix. The question is: why is the precision bounded by a polynomial in ${n}$? If any numbers during the computation of the Gaussian elimination algorithm have an exponential number of bits, then the algorithm is not in polynomial time. It is true that the numbers do not explode in size, but it is non-trivial to prove this.

That fall semester at Berkeley, Ravi gave a technical talk on his thesis work. During Ravi’s talk, William Kahan—a senior faculty member of the department—interrupted Ravi about mid-way through his talk. Kahan said, “Isn’t this all known? I believe that Lewis Carroll actually showed this over a hundred years ago.” Ravi was at a lost for words. He stumbled on, and tried to finish the talk as best as he could.

The next day Ravi came to my office, right after he had talked to Kahan. Ravi said after a long conversation, Kahan agreed that Ravi’s result was not due to Lewis Carroll, that it was new, neat, and that he had misspoken. Great news.

However, Ravi asked me: what should he do? The department had all heard Kahan say that Ravi’s theorem was known. Should Ravi try to undo the “damage?” I thought for a moment and gave Ravi my advice: “just do nothing.” I felt the best plan was to ignore the comment, move on, and start to prove some new theorems. I said that getting into “damage control mode” would only make things worse. He listened to me, and I still think it was pretty good advice.

I found out later, that I was just repeating the advice of George Shultz, who said in 1970:

Don’t just do something, stand there.

Let’s turn now to one result of my advice: the work we did together on the orbit problem.

The Orbit Problem

Mike Harrison, a member of the department and a good friend, asked Ravi and I one day if we had ever thought about the orbit problem? We answered no—what is it? The problem, Mike said, is simple: Look at the sequence of vectors,

$\displaystyle x, Ax, A^{2}x, \dots$

does the vector ${y}$ ever occur in this sequence? Here x is a start vector and ${A}$ is a rational matrix. It’s called the orbit problem, since such a sequence is the trajectory of the point ${x}$ under the transformation ${A}$ such sequences are often called orbits.

The problem is not obviously decidable: start computing the sequence of vectors and keep checking to see if the vector ${y}$ ever occurs. Clearly, if it does, then stop. However, what happens if the vector ${y}$ does not appear after many steps? When can you conclude that it will never appear? That is the crux of the problem.

The intuition is that the vectors ${x, Ax, \dots}$ will eventually get much “bigger” than ${y}$, and you can stop the search. This needs to be proved, since it seems possible for the orbit to go far from ${y}$, but somehow come back after many more steps. Mike told us that it was known that the problem was decidable in the case that ${A}$ was a ${2}$ by ${2}$ rational matrix, but that nothing else was known.

Ravi and I soon proved the following theorem:

Theorem: Suppose that ${A}$ is a rational ${n}$ by ${n}$ matrix and ${x}$ and ${y}$ are rational vectors. Then, there is a polynomial time algorithm that determines whether or not there is a natural number ${k \ge 0}$ so that

$\displaystyle A^{k}x = y.$

The problem is not just decidable, it is in polynomial time.

The Proof

As usual please look at the paper for details, but here is a high level sketch of the proof.

The first observation is that the problem is easier if the matrix ${A}$ is diagonalizable. So Ravi and I assumed that and then the problem is reduced to: Is there a ${k \ge 0}$ so that

$\displaystyle D^{k}u = v$

where ${D}$ is a diagonal matrix. The difficulty is that the entries of ${D,u,v}$ need not be rational numbers, now they can be algebraic numbers. Thus, we get a series of equations of the form,

$\displaystyle d^{k}r = s$

where ${d}$ and ${r}$ and ${s}$ are algebraic numbers. Clearly, if the absolute value of ${d}$ is not ${1}$, then either ${d^{k}}$ will tend to ${0}$ or to ${\infty}$. In either case, we can at least decide the original orbit problem.

So ${d}$ must have absolute value one, which rules out a growth argument. Also just because the absolute value of ${d}$ is${1}$, one cannot conclude that $d$ is a root of unity. However, if ${d}$ is a root of unity, then again the problem is at least decidable.

So what do we do if ${d}$ is not a root of unity? There are two cases. The first, is the case where ${d}$ is an algebraic integer. In this case, if ${d}$ is not a root of unity, then some conjugate of ${d}$ must not have absolute value ${1}$. This is a famous theorem of Leopold Kronecker, which again allows a growth argument. His theorem states:

Theorem: Every non-zero algebraic integer that lies with all its conjugates in the closed unit disc ${\{z : |z| \le 1 \}}$ is a root of unity.

Second, is the case where ${d}$ is an algebraic number, which is not an algebraic integer. Then, ${d = \frac{w}{m}}$ where ${w}$ is an algebraic integer and ${m}$ is a non-zero integer. Thus, we are interested in solving for ${k}$,

$\displaystyle w^{k}r = s \cdot m^{k}.$

This equation can be handled by using the ideal theory of algebraic integers.

This is all for the special case when ${A}$ is diagonalizable. What do we do if it is not? The key to save the day is a beautiful insight, due to Ravi alone, that we can change the whole problem into a matrix problem: is there a ${k \ge 0}$ so that

$\displaystyle B^{k} = C$

where ${B}$ and ${C}$ are rational matrices. This helps the argument immensely. I will call this the matrix version of the orbit problem. Then, instead of diagonal matrices we can reduce the problem to,

$\displaystyle J^{k} = C'$

where ${J}$ is in Jordan Normal Form and ${C'}$ is a matrix. The same algebraic number ideas now apply, but the details become a bit more technical.

With care all of this can be kept under the covers, and it yields a simple algorithm that runs in polynomial time. See the paper for the rest of the details.

Generalizations

The orbit problem was a nice result, that used interesting arguments based on algebraic number theory. However, it took another’s insight to take the next important step. Zeke Zalcstein realized that the methods that Ravi and I used in the original orbit problem might generalize to more than one matrix. Now the orbit becomes two dimensional:

$\displaystyle A, B, AB, BA, AAA, AAB, \dots$

One could imagine a question like this : For a given matrix $C$, are there $k \ge 0$ and $l \ge 0$ so that

$\displaystyle A^k B^l = C.$

However, the exact corresponding question to the original orbit problem would be: Determine if there are $k \ge 0$ and $l \ge 0$ so that

$\displaystyle A^{k}B^{l}x = y.$

Zeke had two important insights. First, that the pure matrix version was already complex enough, so we should not even think about the vector version. Second, Zeke knew that the matrices had to be commuting, ${AB=BA}$, since otherwise the problem is undecidable. Zeke is a long-time friend and mentor of mine, and is an expert on semi-groups. Jin-Yi Cai and I joined with him and started to work hard on the problem. Eventually, we proved:

Theorem: Suppose that ${A}$, ${B}$, and ${C}$ are rational ${n}$ by ${n}$ matrices. Then, there is a polynomial time algorithm that determines whether or not there are natural numbers ${k \ge 0}$ and ${l \ge 0}$ so that

$\displaystyle A^{k}B^{l} = C$

provided ${AB=BA}$.

Later it was generalized, with a better proof, to any fixed number of commuting matrices by László Babai, Robert Beals, Jin-yi Cai, Gábor Ivanyos, and Eugene Luks.

The two dimensional vector version of the original orbit problem, that I stated before, is still open — I believe. As Zeke noted, the vector problem is harder than the matrix version.

Another potential generalization is:

“Theorem”: Suppose that ${A}$ is a rational ${n}$ by ${n}$ matrix, ${x}$ and ${y}$ are rational vectors, and ${c}$ is a rational number. Then, there is an algorithm that determines whether or not there is a natural number ${k \ge 0}$ so that

$\displaystyle x^{T}A^{k}y = c.$

Is this a theorem? Terry Tao raises a related question in his famous blog on recurrence sequences. He gives it as an example of a question that we should know the answer to, but do not. I agree.

Open Problems

The main open problems are to solve the vector version of the commuting matrix version of the orbit problem. Or solve the last question I raised on the equation,

$\displaystyle x^{T}A^{k}y = c.$

Another direction is to see if the orbit problem requires polynomial time, or can be solved by a potentially weaker complexity class. Vikraman Arvind and Thirumalai Vijayaraghavan prove a pretty result on this very issue: The Orbit problem is in the ${\mathsf{GapL}}$ Hierarchy. See the zoo for what ${\mathsf{GapL}}$ is, if you do not know this class. A related question is: what is the lowest complexity class that can solve the case of commuting matrices?

[KWR 6/23/2011: fixed typo–>Ivanyos]

1. September 2, 2009 9:58 am

Very cool result. It’s also helpful of you simply to point out that linear algebra is still an active research field. It is easy to miss this point while studying it out of most undergrad textbooks. If I were to write such a book I would try to end with some tantalizing open problems like the ones you describe.

On a related note, I wish I knew of a good survey on the computational complexity of linear algebra. It’s hard to get a sense of what’s going on since individual papers may state bounds in terms of arithmetic operations, condition numbers and their variants, etc.

September 2, 2009 10:22 am

Actually Ravi is writing a book with Santosh Vempala that is due out soon—this fall. It will be on some of results that you would like to see in one place. But, alas I do not think it will cover everything.

2. September 2, 2009 10:56 am

greaaaaaaaaaaaaaaat

wait another

September 4, 2009 12:08 pm

I understand that you are talking about Gaussian elimination with full pivoting, since with partial pivoting Gaussian elimination can give exponentially big numbers during the computation. But Gaussian elimination is very stable in practice. My question is what is the best current result on showing Gaussian elimination is almost stable in some sens?

4. October 19, 2009 12:33 am

Interesting problem. Great post.