On the arithmetic complexity of the matrix transpose

Michael Kaminski, David Kirkpatrick, and Nader Bshouty are complexity theorists who together and separately have proved many wonderful theorems. They wrote an interesting paper recently—well not quite—in 1988 about the transpose operation.

Today we want to discuss an alternative proof of the main result of that paper.

## The Transpose

The operation that maps a ${n \times n}$ matrix ${A}$ to its transpose ${A^{T}}$ is quite important in many aspects of linear algebra. Recall that the transpose is defined by $\displaystyle A^{T}(i,j) = A(j,i)$

for all indices ${i,j}$. A non-trivial issue arises already in proving this:

If ${A}$ has an inverse then so does ${A^{T}}$.

There are many proofs of this basic fact about the transpose, but it is not simple to prove it from first principles. For example look at William Wardlaw’s proof here. Or here for another.

## A Theorem

The paper of Kaminski, Kirkpatrick, and Bshouty (KKB) came up the other day, while I visited the computer science department at University of Buffalo. Atri Rudra told about some of his recent work on various complexity issues around matrix computations. One was an interesting question about the transpose operation on matrices. The result is the following:

Theorem 1 If the arithmetic complexity of ${x \rightarrow Ax}$ is ${S}$ then the arithmetic complexity of ${y \rightarrow A^{T}y}$ is at most ${O(S + n)}$.

KKB had a nice proof of this theorem over forty years ago. Indeed they can get a precise expression for the complexity that is tighter than the above statement. Their proof is a careful examination of the structure of any arithmetic circuit that computes ${Ax}$. Essentially they show one can in a sense “run the computation backwards.”

## Another Proof

Anyway in my discussion with Atri, the other day, he described another proof of this basic theorem. He said consider the function that maps ${x}$ to ${Ax}$ for a fixed matrix ${A}$. Suppose this linear map has arithmetic complexity ${S}$: that is the minimum number of arithmetic operations that are needed to compute ${Ax}$ is ${S}$. Note, ${S}$ can vary greatly with the structure of ${A}$. Even for nonsingular matrices the complexity can vary quite a bit: for a random matrix is likely to be order ${n^{2}}$; for a Fourier transform matrix it is order ${n\log n}$.

He said it is known that the arithmetic complexity of ${Ax}$ and ${A^{T}x}$ are about the same. But proving this, while not hard, requires some care. Atri told me about a quite neat argument that proves it. Further, the proof is “two-lines” as Atri said:

Proof: Consider the function ${f(x)=y^{T}Ax}$ as a function of ${x=(x_{1},\dots,x_{n})}$. The gradient ${ \nabla f(x)}$ is equal to all the partial derivatives of ${f(x)}$. This means that it allows us to compute ${y^{T}A}$, since the function is linear in each variable ${x_{i}}$. The famous Derivative Lemma of Walter Baur and Volker Strassen shows that a single arithmetical circuit of size order ${n}$ plus the complexity of ${f}$ can compute ${\nabla f}$. It follows that we can compute all the ${n}$ partial derivatives in ${O(S+n)}$ arithmetic steps. But for our function ${f(x)}$ this is equal to ${y^{T}A}$. Transposing the resulting vector gives ${A^{T}y}$ in the required number of steps. $\Box$

## Open Problems

I like this proof quite a bit. Can this argument be used to prove the basic fact about inverses of a matrix and its transpose?

1. December 10, 2018 1:55 pm

Is there an error tolerant version of the Baur-Strassen theorem? Or is it ruled out?

2. December 11, 2018 1:21 pm

The result you mention is (also) known as Tellegen’s transposition principle, mainly in the field of Computer Algebra. You can read about it and its history on Dan Berstein’s website: https://cr.yp.to/transposition.html. (Note that according to him, the name should not be Tellegen’s transposition algorithm, and also that “Kaminski, Kirkpatrick, and Bshouty don’t deserve any credit.”)

3. December 16, 2018 9:09 pm

(Sorry for being late to this.)

Thanks for posting on this. As Bruno mentioned above this is indeed the transposition principle. The proof above was pointed out to me by Albert Gu. I’m not aware of a specific ref for Albert’s proof and would be really interested if someone knows a ref for the same!

4. December 16, 2018 9:21 pm

Sorry should clarify my Q a bit more. Bernstein’s page on transposition principle does mention some papers refer to the proof in the passing but I was wondering if there is a ref (perhaps some lecture notes?) that writes down the proof.

5. 