# The Orbit Problem

*The history of the orbit problem and its generalizations*

Ravi Kannan is one of the top researchers in the world on the computational complexity aspects of linear algebra. His Ph.D. thesis is on the complexity of computing the Smith Normal Form of a matrix; one of his best results is on computing the approximate volume of a convex body in high dimensions; and his most “magical” result is on the power of random sampling for linear algebraic computations. He was awarded the Fulkerson Prize with Martin Dyer and Alan Frieze for their wonderful work on the volume problem.

Today I would like to talk about a problem that Ravi and I worked on years ago, called the *orbit problem*. The result has an interesting history and some nice generalizations. Yet there are still a number of open questions surrounding it.

I first met Ravi when we both arrived at the computer science department at Berkeley in the fall of 1978. He came as a post-doc and I as a tenured faculty member. We instantly “clicked,” and have been close friends ever since.

As I said earlier, Ravi’s thesis work was on the Smith Normal Form of a matrix. This is a quite beautiful result, but it addresses a subtle problem. Before his work there were algorithms that computed the normal form with only a polynomial number of operations. However, it is not obvious that these methods run in polynomial time.

Confused? Polynomial time requires not only a polynomial number of operations, but also that the size of each number computed during the computation must be polynomial. If the numbers get too big, then the algorithm is not in polynomial time. Ravi’s theorem was that the numbers *do* stay appropriately bounded, and so the algorithm for Smith Normal Form is really in polynomial time.

If you think such questions are trivial, prove that there is a polynomial time algorithm for finding the inverse of a non-singular matrix. The obvious answer is to use Gaussian elimination, which is easily shown to take at most $O(n^3)$ arithmetic operations for an $n \times n$ matrix. The question is: why is the precision bounded by a polynomial in $n$? If any numbers during the computation of the Gaussian elimination algorithm have an exponential number of bits, then the algorithm is not in polynomial time. It is true that the numbers do *not* explode in size, but it is non-trivial to prove this.
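To make the subtlety concrete, here is a minimal sketch (my illustration, not from Ravi’s work) of exact Gaussian elimination over the rationals using Python’s `Fraction` type. Since `Fraction` keeps every entry in lowest terms, the intermediate entries turn out to be ratios of minors of the input, which is, in essence, why their bit-length stays polynomially bounded.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a non-singular square matrix by Gauss-Jordan elimination,
    carried out exactly over the rationals with Fraction arithmetic."""
    n = len(matrix)
    # Build the augmented tableau [A | I]; reduce the left half to I.
    aug = [[Fraction(v) for v in row] +
           [Fraction(1 if i == j else 0) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero pivot in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]  # scale pivot row to 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half of the tableau is now the inverse.
    return [row[n:] for row in aug]
```

For example, `invert([[2, 1], [5, 3]])` gives the exact inverse `[[3, -1], [-5, 2]]`, with no floating-point issues to worry about.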

That fall semester at Berkeley, Ravi gave a technical talk on his thesis work. About mid-way through, William Kahan—a senior faculty member of the department—interrupted him. Kahan said, “Isn’t this all known? I believe that Lewis Carroll actually showed this over a hundred years ago.” Ravi was at a loss for words. He stumbled on, and tried to finish the talk as best as he could.

The next day, right after he had talked to Kahan, Ravi came to my office. He said that after a long conversation Kahan had agreed that Ravi’s result was not due to Lewis Carroll: it was new, it was neat, and Kahan had misspoken. Great news.

However, Ravi asked me: what should he do? The department had all heard Kahan say that Ravi’s theorem was known. Should Ravi try to undo the “damage?” I thought for a moment and gave Ravi my advice: “just do nothing.” I felt the best plan was to ignore the comment, move on, and start to prove some new theorems. I said that getting into “damage control mode” would only make things worse. He listened to me, and I still think it was pretty good advice.

I found out later that I was just repeating the advice of George Shultz, who said in 1970:

> Don’t just do something, stand there.

Let’s turn now to one result of my advice: the work we did together on the orbit problem.

## The Orbit Problem

Mike Harrison, a member of the department and a good friend, asked Ravi and me one day if we had ever thought about the *orbit problem*. We answered no—what is it? The problem, Mike said, is simple: look at the sequence of vectors

$$x, \; Ax, \; A^2 x, \; A^3 x, \; \dots$$

Does the vector $y$ ever occur in this sequence? Here $x$ is a start vector and $A$ is a rational matrix. It’s called the orbit problem, since such a sequence is the trajectory of the point $x$ under the transformation $A$, and such sequences are often called orbits.

The problem is not obviously decidable: start computing the sequence of vectors and keep checking to see if the vector $y$ ever occurs. Clearly, if it does, then stop. However, what happens if $y$ does not appear after many steps? When can you conclude that it will never appear? That is the crux of the problem.

The intuition is that the vectors $A^i x$ will eventually get much “bigger” than $y$, and you can stop the search. This needs to be proved, since it seems possible for the orbit to go far from $y$, but somehow come back after many more steps. Mike told us that it was known that the problem was decidable in the case that $A$ was a $2 \times 2$ rational matrix, but that nothing else was known.
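The naive search just described can be sketched in a few lines of Python with exact rational arithmetic. The cutoff `bound` below is a hypothetical stand-in: proving that some polynomially-bounded cutoff actually suffices is exactly what makes the problem hard.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Multiply matrix A by vector v, exactly."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def orbit_search(A, x, y, bound):
    """Semi-decision sketch of the orbit problem: walk x, Ax, A^2 x, ...
    and return the first i with A^i x == y, giving up after `bound` steps.
    A None result is inconclusive on its own; turning the cutoff into a
    provably sufficient bound is the content of the theorem below."""
    v = [Fraction(c) for c in x]
    target = [Fraction(c) for c in y]
    for i in range(bound + 1):
        if v == target:
            return i
        v = mat_vec(A, v)
    return None
```

For the swap matrix `A = [[0, 1], [1, 0]]` and start vector `(1, 2)`, the target `(2, 1)` is found at step 1, while `(3, 3)` is never found, since the orbit just alternates between two points.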

Ravi and I soon proved the following theorem:

**Theorem:** Suppose that $A$ is a rational $n \times n$ matrix and $x$ and $y$ are rational vectors. Then, there is a polynomial time algorithm that determines whether or not there is a natural number $i$ so that $A^i x = y$.

The problem is not just decidable, it is in polynomial time.

## The Proof

As usual, please look at the paper for details, but here is a high-level sketch of the proof.

The first observation is that the problem is easier if the matrix $A$ is diagonalizable. So Ravi and I assumed that $A = P D P^{-1}$, and then the problem is reduced to: is there an $i$ so that

$$D^i x' = y',$$

where $D$ is a diagonal matrix, $x' = P^{-1} x$, and $y' = P^{-1} y$. The difficulty is that the entries of $D$ need not be rational numbers; now they can be algebraic numbers. Thus, we get a series of equations of the form

$$\alpha \lambda^i = \beta,$$

where $\alpha$, $\beta$, and $\lambda$ are algebraic numbers. Clearly, if the absolute value of $\lambda$ is not $1$, then $\alpha \lambda^i$ will tend either to $0$ or to $\infty$. In either case, we can at least decide the original orbit problem.
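The growth argument here can be made explicit. For an equation $\alpha \lambda^i = \beta$ with $\alpha \neq 0$ (the degenerate cases are checked separately), taking absolute values pins down the exponent:

```latex
\alpha \lambda^i = \beta
\;\Longrightarrow\;
|\lambda|^{i} = \frac{|\beta|}{|\alpha|}
\;\Longrightarrow\;
i = \frac{\log\bigl(|\beta|/|\alpha|\bigr)}{\log|\lambda|} .
```

So when $|\lambda| \neq 1$ there is at most one candidate value of $i$, and one can simply test whether it is a natural number that satisfies the original equation.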

So $\lambda$ must have absolute value one, which rules out a growth argument. Also, just because the absolute value of $\lambda$ is $1$, one cannot conclude that $\lambda$ is a root of unity. However, if $\lambda$ is a root of unity, then again the problem is at least decidable.

So what do we do if $\lambda$ is not a root of unity? There are two cases. The first is the case where $\lambda$ is an algebraic integer. In this case, if $\lambda$ is not a root of unity, then some conjugate of $\lambda$ must *not* have absolute value $1$. This is a famous theorem of Leopold Kronecker, which again allows a growth argument. His theorem states:

**Theorem:** Every non-zero algebraic integer that lies with all its conjugates in the closed unit disc is a root of unity.

The second is the case where $\lambda$ is an algebraic number which is not an algebraic integer. Then $\lambda = \gamma/q$, where $\gamma$ is an algebraic integer and $q$ is a non-zero integer. Thus, we are interested in solving for $i$,

$$\alpha \, (\gamma/q)^i = \beta.$$

This equation can be handled by using the ideal theory of algebraic integers.

This is all for the special case when $A$ is diagonalizable. What do we do if it is not? The key to save the day is a beautiful insight, due to Ravi alone, that we can change the whole problem into a matrix problem: is there an $i$ so that

$$A^i = B,$$

where $A$ and $B$ are rational matrices. This helps the argument immensely. I will call this the matrix version of the orbit problem. Then, instead of diagonal matrices we can reduce the problem to

$$J^i = B',$$

where $J$ is in Jordan Normal Form and $B'$ is a matrix. The same algebraic number ideas now apply, but the details become a bit more technical.
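To see one source of the extra technicality, consider a single $2 \times 2$ Jordan block: its powers mix powers of the eigenvalue with polynomial factors in the exponent, so the entries are no longer pure powers of a single algebraic number. A small integer-arithmetic check (an illustration of this fact, not the paper’s algorithm):

```python
def mat_mul(X, Y):
    """Exact product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(X, e):
    """X raised to the natural-number power e, by repeated multiplication."""
    n = len(X)
    R = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = mat_mul(R, X)
    return R

# For the Jordan block J = [[lam, 1], [0, lam]] the closed form is
#   J^i = [[lam**i, i * lam**(i - 1)], [0, lam**i]],
# i.e. the off-diagonal entry grows with a polynomial factor i.
lam = 2
J = [[lam, 1], [0, lam]]
```

Checking `mat_pow(J, 3)` against the closed form gives `[[8, 12], [0, 8]]`, matching `3 * 2**2 == 12` in the corner entry.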

With care all of this can be kept under the covers, and it yields a simple algorithm that runs in polynomial time. See the paper for the rest of the details.

## Generalizations

The orbit problem was a nice result that used interesting arguments based on algebraic number theory. However, it took another’s insight to take the next important step. Zeke Zalcstein realized that the methods that Ravi and I used in the original orbit problem might generalize to more than one matrix. Now the orbit becomes two dimensional:

$$A^i B^j x, \qquad i, j \geq 0.$$

One could imagine a question like this: for a given matrix $C$, are there $i$ and $j$ so that

$$A^i B^j = C?$$

However, the exact corresponding question to the original orbit problem would be: determine if there are $i$ and $j$ so that

$$A^i B^j x = y.$$

Zeke had two important insights. First, that the pure matrix version was already complex enough, so we should not even think about the vector version. Second, Zeke knew that the matrices had to be commuting, $AB = BA$, since otherwise the problem is undecidable. Zeke is a long-time friend and mentor of mine, and is an expert on semi-groups. Jin-Yi Cai and I joined with him and started to work hard on the problem. Eventually, we proved:

**Theorem:** Suppose that $A$, $B$, and $C$ are rational $n \times n$ matrices. Then, there is a polynomial time algorithm that determines whether or not there are natural numbers $i$ and $j$ so that

$$A^i B^j = C,$$

provided $AB = BA$.
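Just to illustrate the statement (this is emphatically *not* the polynomial-time algorithm of the theorem), here is a brute-force search over small exponents, with hypothetical example matrices chosen to commute:

```python
def mat_mul(X, Y):
    """Exact product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(X, e):
    """X raised to the natural-number power e."""
    n = len(X)
    R = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = mat_mul(R, X)
    return R

def find_exponents(A, B, C, bound):
    """Search for natural numbers i, j with A^i B^j == C, trying all
    pairs up to `bound`. Exponential-time illustration of the problem
    statement only; the theorem gives a polynomial-time algorithm."""
    for i in range(bound + 1):
        for j in range(bound + 1):
            if mat_mul(mat_pow(A, i), mat_pow(B, j)) == C:
                return (i, j)
    return None
```

For the commuting diagonal matrices `A = [[2, 0], [0, 3]]` and `B = [[4, 0], [0, 9]]`, the target `C = [[32, 0], [0, 243]]` is reached at `(i, j) = (1, 2)`.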

Later it was generalized, with a better proof, to any fixed number of commuting matrices by László Babai, Robert Beals, Jin-Yi Cai, Gábor Ivanyos, and Eugene Luks.

The two dimensional vector version of the original orbit problem, that I stated before, is still open — I believe. As Zeke noted, the vector problem is harder than the matrix version.

Another potential generalization is:

**“Theorem”:** Suppose that $A$ is a rational $n \times n$ matrix, $x$ and $y$ are rational vectors, and $\epsilon > 0$ is a rational number. Then, there is an algorithm that determines whether or not there is a natural number $i$ so that

$$\| A^i x - y \| < \epsilon.$$

Is this a theorem? Terry Tao raises a related question in his famous blog on recurrence sequences. He gives it as an example of a question that we should know the answer to, but do not. I agree.

## Open Problems

The main open problems are to solve the vector version of the orbit problem for commuting matrices, or to solve the last question I raised on the equation

$$\| A^i x - y \| < \epsilon.$$

Another direction is to see whether the orbit problem requires polynomial time, or can be solved in a potentially weaker complexity class. Vikraman Arvind and Thirumalai Vijayaraghavan proved a pretty result on this very issue: the orbit problem is in the $\mathsf{GapL}$ hierarchy. See the Complexity Zoo for what $\mathsf{GapL}$ is, if you do not know this class. A related question is: what is the lowest complexity class that can solve the case of commuting matrices?



Very cool result. It’s also helpful that you point out that linear algebra is still an active research field. It is easy to miss this point while studying it from most undergrad textbooks. If I were to write such a book, I would try to end with some tantalizing open problems like the ones you describe.

On a related note, I wish I knew of a good survey on the computational complexity of linear algebra. It’s hard to get a sense of what’s going on since individual papers may state bounds in terms of arithmetic operations, condition numbers and their variants, etc.

Actually, Ravi is writing a book with Santosh Vempala that is due out soon—this fall. It will be on some of the results that you would like to see in one place. But alas, I do not think it will cover everything.


I understand that you are talking about Gaussian elimination with full pivoting, since with partial pivoting Gaussian elimination can give exponentially big numbers during the computation. But Gaussian elimination is very stable in practice. My question is: what is the best current result on showing Gaussian elimination is almost stable in some sense?

Interesting problem. Great post.

In the case of the two matrix generalization, the equation A^k B^l = C is decidable even for non-commutative matrices. It is possible to reduce it to a similar problem with commutative matrices of higher dimensions (see “Matrix Equations and Hilbert’s Tenth Problem”, http://www.worldscinet.com/ijac/18/1808/S0218196708004925.html).

However, such a trick does not work for the case of 3 matrices. The equation over integral non-commutative matrices A^k B^k C^k = D can encode the Skolem-Pisot problem, which is still open.

Also, it is true that for a large number of non-commutative matrices the matrix equations become undecidable.