The approximate rank of matrices—a great idea

Santosh Vempala is one of the world experts on high-dimensional geometry, and has made many beautiful contributions to this and other parts of theory. We have presented some of his results and open problems concerning high-dimensional geometry before. Many of the intuitions we have (though not Santosh) from geometry in two and three dimensions fail in high dimensions. For example, the corner-to-corner diameter of the unit ${n}$-cube grows as ${\sqrt{n}}$ while its volume stays at one, so the ratio of diameter to volume goes to infinity. Here is a page of visualizations maintained by David Eppstein.

Today I want to talk about a great new paper from Santosh that is joint work with Noga Alon.

They have begun a systematic study of the notion of approximate rank of a matrix. There are some surprising upper bounds. That is, some matrices have approximate rank that I find surprising; perhaps you will too. Even better, these upper bounds allow Santosh and Noga to prove some cool new theorems. One is an improvement to the best known results for finding Nash equilibria of two-player nonzero-sum games.

Santosh is co-teaching this term with Gregory Abowd our Computing For Good course—it was created by Santosh years ago. This term they are studying practical problems faced by communities that lack the type of computing power most of us have, and that consequently see little benefit from computing.

Something to think about as we enter the American Thanksgiving week.

## Rank

The rank of a matrix is one of the most important concepts in all of linear algebra. One way to see that it is critically important is to note that there are a multitude of ways to define the rank of a matrix. Here are just a few:

- **Column Rank:** This is the maximum number of linearly independent columns of the matrix. Recall vectors ${v_{1},\dots,v_{m}}$ are linearly independent provided $\displaystyle a_{1}v_{1} + \cdots + a_{m}v_{m} = 0$ implies that ${a_{1}= \cdots = a_{m} = 0}$.
- **Row Rank:** This is the maximum number of linearly independent rows of the matrix.
- **Dimension of Image:** This is the dimension of the range of the operator defined by the matrix. Every matrix defines in a natural manner a linear operator, and the range of that operator is a linear space.
- **Decomposition Rank:** This is the smallest ${k}$ so that the ${m \times n}$ matrix ${A}$ can be written as ${CR}$, where ${C}$ is ${m}$ by ${k}$ and ${R}$ is ${k}$ by ${n}$.
- **Determinantal Rank:** This is the size of the largest minor of the matrix that has a nonzero determinant.

The list could go on for longer, but I believe the point is clear: the rank of a matrix is a salient concept.
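These definitions can be checked side by side numerically; here is a small sketch with NumPy (the matrix is just an illustrative choice):

```python
import numpy as np

# A 3x4 matrix whose third row is the sum of the first two,
# so every notion of rank gives 2, not 3.
A = np.array([[1., 0., 2., 1.],
              [0., 1., 1., 3.],
              [1., 1., 3., 4.]])

rank = np.linalg.matrix_rank(A)

# Column rank equals row rank: transposing does not change the rank.
assert rank == np.linalg.matrix_rank(A.T)

# Decomposition rank: a rank-k matrix factors as C (m x k) times
# R (k x n); one such factorization comes from the SVD.
U, s, Vt = np.linalg.svd(A)
k = rank
C = U[:, :k] * s[:k]   # m x k
R = Vt[:k, :]          # k x n
assert np.allclose(A, C @ R)

print(rank)  # 2
```

The SVD conveniently witnesses the decomposition-rank definition: keeping only the nonzero singular values yields an exact ${CR}$ factorization.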

## Approximate Rank

One of the fundamental insights in theory is if a concept is useful, then an approximate version of it may often be useful too. Sometimes in our world the approximate concept is even more useful: algorithms often need only an approximate value to work. This is a basic principle that is repeated throughout theory, so why not study the approximate rank of a matrix?

Indeed, why not. This is exactly what Santosh and Noga have done. The ${\epsilon}$-rank of a real matrix ${A}$ is defined as the minimum rank over matrices that approximate every entry of ${A}$ to within an additive ${\epsilon}$. More precisely:

Definition: For a real ${n\times n}$ matrix ${A}$, and ${\epsilon > 0}$, the ${\epsilon}$-rank of ${A}$ is defined by: $\displaystyle \min \{ \mathsf{rank} (B) : B \in \mathbb{R}^{n \times n} \text{ where } |A-B|_\infty \le \epsilon \}.$

Note ${|A-B|_{\infty} \le \epsilon}$ is a fancy way of saying that the matrices ${A}$ and ${B}$ differ in each entry by at most ${\epsilon}$. As they note in their paper, the concept is related to matrix rigidity, introduced by Les Valiant in the late 1970s to study lower bounds on the complexity of matrix operators, though his focus is somewhat different.
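The definition suggests a simple, if crude, computational probe: truncate the SVD at increasing ranks and check the entrywise error. This is only a sketch of an upper bound, since SVD truncation minimizes spectral and Frobenius error rather than the entrywise error in the definition, and the test matrix is an arbitrary illustration:

```python
import numpy as np

def eps_rank_upper_bound(A, eps):
    """Smallest k for which the rank-k SVD truncation of A is
    entrywise within eps of A. This only upper-bounds the true
    eps-rank: SVD truncation is not optimal for entrywise error."""
    U, s, Vt = np.linalg.svd(A)
    for k in range(min(A.shape) + 1):
        B = (U[:, :k] * s[:k]) @ Vt[:k, :]
        if np.max(np.abs(A - B)) <= eps:
            return k
    return min(A.shape)

rng = np.random.default_rng(0)
A = rng.uniform(-1, 1, size=(50, 50))
# Every entry of A lies in (-1, 1), so once eps >= 1 the zero
# matrix is a valid rank-0 approximation.
print(eps_rank_upper_bound(A, 1.0), eps_rank_upper_bound(A, 0.1))
```

Computing the exact ${\epsilon}$-rank is much harder; this heuristic merely gives data points from above.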

## The Identity Matrix

For the special case of the ${n \times n}$ identity matrix the ${\epsilon}$-rank has been studied before in a previous paper of Noga’s. There he shows that it is at least ${\Omega(\frac{\log n}{\epsilon^2 \log (1/\epsilon)})}$ and at most ${O(\frac{\log n}{\epsilon^2})}$. He uses this and other results to give a number of applications—take a look at the paper to see them.

I find this surprising because the rows of the identity matrix are as “independent as possible”—in the sense that they have no nonzero elements in common. This would seem to me to suggest that the approximate rank should be large. But my intuition is obviously very wrong, as the approximate rank is exponentially smaller than the rank.
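The upper bound for the identity can be made concrete with a standard random-projection sketch: take ${n}$ random unit vectors in ${d = O(\log n/\epsilon^2)}$ dimensions; their Gram matrix has rank at most ${d}$, exact ones on the diagonal, and, with high probability, all off-diagonal entries below ${\epsilon}$. The constants below are chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)
n, eps = 1000, 0.5
d = int(np.ceil(8 * np.log(n) / eps**2))  # O(log n / eps^2) dimensions

# n random unit vectors in R^d: normalize Gaussian rows.
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Gram matrix B = V V^T: rank at most d, ones on the diagonal,
# and with high probability every off-diagonal entry below eps.
B = V @ V.T
off_diag = np.abs(B - np.eye(n)).max()
print(d, off_diag)  # d is far below n
```

So ${B}$ approximates the identity entrywise within ${\epsilon}$ while having rank exponentially smaller than ${n}$—exactly the phenomenon that defeats the intuition above.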

## Applications

Santosh and Noga give many applications of their bounds on the approximate rank of various matrices. The following theorem is one of the most interesting in my opinion, which is totally biased. They prove:

Theorem: Let ${A,B \in [-1,1]^{n \times n}}$ be the payoff matrices of a ${2}$-player game. If ${A+B}$ is positive semidefinite, then an ${\epsilon}$-Nash equilibrium of the game can be computed by a randomized algorithm using ${\mathrm{poly}(n)}$ space and expected time $\displaystyle n^{O(\log (1/\epsilon)/\epsilon^2)}.$

Recall some of the terms used here: A two-person game in general is defined by two matrices ${A}$ and ${B}$. These are respectively the payoff matrices for the “row” and the “column” player. A Hermitian matrix is positive semidefinite provided all its eigenvalues are nonnegative.

This result generalizes a nice previous result of Ravi Kannan and Thorsten Theobald, who proved a similar bound in the case that ${A+B}$ has constant rank. Note that familiar zero-sum games are defined by ${A + B = 0}$, so this result is a natural extension of the well-understood zero-sum case.

The above theorem can also handle the case when the rank of ${A+B}$ is ${O(\log n)}$. Both of these results are related to a result of Evangelos Markakis, Aranyak Mehta, and me, which shows that an ${\epsilon}$-Nash equilibrium for any ${2}$-player game can be computed in time ${n^{O(\log n/\epsilon^2)}}$. It has been an important open question to determine whether this problem has an algorithm of complexity ${n^{f(\epsilon)}}$.

I cannot avoid noting that ${A+B}$ may fail to be positive semidefinite, yet there are ways to modify the game matrices of a 2-player game and not change the game. Can these modifications serve to extend Santosh’s and Noga’s result? For example, if there is a convex combination of ${A}$ and ${B}$ that is positive semidefinite, then the above theorem applies.

## Open Problems

I think this work on approximate rank could open many doors, and lead to many new results. Here is one example: Could approximate rank methods be used to prove some new results about graph isomorphism? I think there is a possible connection—more about this in the future.

One of the major mysteries in this area is to find the ${\epsilon}$-rank of the upper triangular matrices ${U}$ typified by $\displaystyle U = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$ for ${n = 3}$: ones on and above the diagonal, zeros below. Santosh says that he is puzzled by this matrix. They prove that its approximate rank is of order at most ${n^{2/3}}$. But this upper bound cannot be the best, or can it?
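Assuming ${U}$ is the 0-1 matrix with ones on and above the diagonal, one can at least probe its ${\epsilon}$-rank numerically via SVD truncation. These are only crude upper-bound data points, since truncation minimizes spectral rather than entrywise error:

```python
import numpy as np

n = 200
# Assumed form of U: ones on and above the diagonal, zeros below.
U = np.triu(np.ones((n, n)))

# Entrywise error of the best rank-k spectral truncation; the max
# entry of the residual is bounded by its spectral norm s[k].
Us, s, Vt = np.linalg.svd(U)
errs = {}
for k in (5, 10, 20, 40, 80):
    B = (Us[:, :k] * s[:k]) @ Vt[:k, :]
    errs[k] = np.max(np.abs(U - B))
    print(k, errs[k])
```

Watching how slowly the entrywise error falls with ${k}$ gives some feel for why this matrix is puzzling, though it says nothing about the true lower bound.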

1. November 19, 2012 10:10 am

Very interesting. Is there a link to the paper?

2. November 19, 2012 10:31 am

Is the paper online?

3. November 19, 2012 11:12 am

Is there a link to the actual Vempala-Alon paper?

4. November 19, 2012 11:32 am

I take it the paper isn’t available yet?

5. November 19, 2012 11:55 am

It’s interesting. This has the flavor of a “JL for L_infty” in the following sense. We know that we can pack exponentially (in d) many vectors that are “approximately” orthogonal in the Euclidean sense (compared to d vectors for exact orthogonality). The approximation here is that the dot products are at most epsilon away from zero.

Your example with the identity matrix suggests that at least in that case, something related is going on.

• November 19, 2012 4:58 pm

Yes, generally if A is a positive semidefinite matrix with each diagonal entry at most 1, then because every entry of it is an inner product of vectors of at most unit norm, you can use dimension reduction to reduce each vector to O(log n / eps^2) dimensions (hence reducing the rank to that number) with at most an epsilon deviation.

• November 19, 2012 6:17 pm

Yes the paper is now available….

the paper

6. November 19, 2012 6:19 pm

Small correction: A *Hermitian* matrix is positive semidefinite provided all its eigenvalues are *nonnegative*.

7. November 20, 2012 6:24 am

Definition  Alice and Bob play the $\epsilon$-game as follows:

• At each turn the previous turn’s $(n-1)\times (n-1)$ matrix $M$ is extended to an $n\times n$ matrix.

• The new diagonal element is unity: $M_{nn} = 1$.

• Bob chooses the new column elements $M_{kn}\colon k<n, |M_{kn}|<\epsilon$

• Alice chooses the new row elements $M_{nk}\colon k<n, |M_{nk}|<\epsilon$

Thus (obviously) Bob specifies the upper triangle of $M$, Alice specifies the lower triangle, and the diagonal is unity. Then we are ready to play:

The Builder versus Destroyer Game  If Alice and Bob choose adversarily, with Alice seeking to maximize the rank of $M$, and Bob seeking to minimize it, what is the asymptotic rank of $M$ for $n\to\infty$ ?

The Error Correction Game  If Bob chooses the upper-triangle entries randomly (say from a uniform distribution), what maximal asymptotic rank can Alice achieve?

Clearly there are many reasonable variants of these games. The broad mathematical question is: How much matrix structure can be preserved, when Order battles Chaos? 🙂

Note  The above “Error Correction Game” is largely inspired by the Kalai/Harrow debate, with Bob in the role of quantum-computing/matrix-rank skeptic, and Alice in the role of FTQC/matrix-rank sustainer.

• November 20, 2012 6:56 am

Thus we say — to borrow nomenclature from fault-tolerant quantum computing — that if Alice has a rank-maximizing strategy that yields $\text{rank} M = \mathcal{O}(n)$ against random $\epsilon$-errors inserted by Bob, then Alice’s strategy is scalably fault-tolerant.

If Alice does have such a scalable rank-preserving strategy, hopefully there exists an explicit, provably PTIME algorithm for computing it! 🙂
