A Pretty Identity
Or rather two beautiful identities
Joseph Lagrange was a mathematician and astronomer who made significant contributions to just about everything. Yet like all of us he could make mistakes: he once thought he had proved Euclid’s parallel postulate. He wrote a paper, took it to the Institute, and as they did in those days, he began to read it. But almost immediately he saw a problem. He said quietly:
Il faut que j’y songe encore.
“I need to think on it some more.” He put his paper away and stopped talking.
Today Ken and I thought we would talk about a beautiful identity of Lagrange, not about the parallel axiom.
Many of you may know the identity, almost all of you probably know one of its consequences, the famous Cauchy-Schwarz inequality:
Here is the usual inner product and is the square of the norm,
The finite-dimensional case of this inequality for real vectors was proved by Augustin-Louis Cauchy in 1821. Then his student Viktor Bunyakovsky obtained the integral version, by taking limits. The general result for an inner product space was proved by Hermann Schwarz in 1888.
Bunyakovsky worked in theoretical mechanics and number theory. He conjectured, in 1857, a very natural result, one that is surely true, but even after its sesquicentennial anniversary, it remains completely untouched. The conjecture is: For each integer polynomial there are an infinite number of primes in the sequence
provided the polynomial satisfies certain trivial constraints. These are: (i) it must be a polynomial that tends to positive infinity—primes are positive, (ii) it must be irreducible, and (iii) it must not always be divisible by a fixed prime.
The first two constraints are trivial and the last avoids polynomials like . The conjecture is still open for all polynomials of degree at least two. Thus Cauchy’s student made a conjecture that is both beautiful and hard. We wonder, how was it to work with Cauchy as an advisor?
Lagrange’s identity is quite pretty. For any real numbers and :
Using notation for the norms and inner product we can shorten it to
The importance of this identity is that it immediately implies the famous Cauchy-Schwarz inequality over the real numbers.
For a historical note, the case was known going back to antiquity:
Wikipedia names this for Brahmagupta, whom we recently mentioned, and for Fibonacci, but notes that it goes back (at least) to Diophantus. Brahmagupta actually stated and proved this identity for any as well:
Incidentally, this shows that not only are numbers that are sums of two squares closed under multiplication, but also numbers of the form , for any fixed . Also we recently mentioned Lagrange’s use of a lemma that numbers that are sums of four squares are also closed under multiplication, on the way to proving that they encompass all whole numbers.
Two Complex Versions
The identity shows that an inner product can be reduced to the computation of only sums of positive quantities. This works also in the complex case, but there are two interesting versions that are not immediately interchangeable. First, let
where we will have in mind that are unit vectors. Then by the identity we get that is equal to
This consists of only two sums, each of which is a sum of squares. Now we may interchange each by its conjugate , so that we get a proper complex inner product
which is thereby equal to
Note that the conjugates do not go away even though they are squared, so we do not obtain sums of positive quantities. To get this, we need a different version that was also found by Lagrange:
This is the same as Wikipedia’s formula with replaced by , which in turn fixes an apparent typo in its source. We note that the brute-force proof of this for leads to Lagrange’s four-square lemma. Namely, let
Then the left-hand side becomes
which represents an arbitrary product of sums of four squares that we want to write as a sum of four squares. The right-hand side becomes
This is a sum of four squares, and to verify that it equals the left-hand side, one need only see that all twenty-four cross-terms in cancel.
Toward New Applications?
Written more compactly using inner product and norm notation, what we have is
Thus we have written the squared inner product as a difference of two positive real terms, where each term is a sum of squared real sub-terms or a product of the same. Ken and I have been trying to use this to improve some known simulations of quantum algorithms. So far we have no new results, but the above manipulations make a tighter connection seem plausible. The question for the applications we seek is:
Under what conditions can good approximations to the squared sub-terms yield a good approximation to ?
In general, the sticking could be the minus sign, together with the presence of situations where each of the two terms on the right-hand side has magnitude much higher than . Thus approximations of these terms to within will not help unless is truly tiny.
However, in quantum algorithms, we can expect and to be unit vectors. Depending on the algorithm, we may be able to arrange for the squared inner product to yield the acceptance probability, so that is a number between and . Then simply equals . Thus we have:
Hence we can hope that good approximations to the squared terms in the sum can yield a good approximation to . Moreover we are helped by the promise for a algorithm that either is close to zero (so the sum is close to ) or is close to one (so the sum is close to ). What can go wrong?
The sticking point again could be the minus sign, together with what could be exponentially many complex “cross terms” of the form . It may be hard to get enough of a handle on those terms to approximate all their (squared) differences. Two further complications are that sometimes our acceptance probability involves not just one inner product but many, though in some cases we can arrange polynomially many, and that the indices may be limited to some subset of . What could help us most would be a further manipulation of , taking into account the “promise” condition that be near or .
Did you know this identity? Besides Cauchy-Schwarz, what useful inequalities and estimates can be derived—in the presence of certain promise conditions?