Triangulating proofs to seek a shorter path

 Cropped from 2016 Newsday source

Mehtaab Sawhney is an undergraduate student at MIT. His work caught my eye on finding his recent paper with David Stoner about permutations that map all three-term arithmetic progressions mod ${n}$ to non-progressions. Here a progression is an ordered triple ${(a,b,c)}$ where ${a -2b + c = 0 \pmod{n}}$. The paper addresses when such permutations can be found in certain small subgroups of ${S_n}$ while I am interested in senses by which they are succinct. This made me curious about Sawhney’s other work.

Today Ken and I wish to report on Sawhney’s simple new proof of the famous triangle inequality in ${\mathbb{R}^n}$.

Sawhney presents his new proof in a short note which has just appeared on p218 of last month’s issue of the College Journal of Mathematics:

An Unusual Proof of the Triangle Inequality.

Summary: A standard proof of triangle inequality requires using Cauchy-Schwarz inequality. The proof here bypasses such tools by instead relying on expectations.

Recall that the triangle inequality for the Euclidean norm on ${n}$ dimensions says that for any vectors ${X}$ and ${Y}$,

$\displaystyle || X + Y || \le ||X|| + ||Y||.$

Here as usual the norm of a vector ${V}$ is

$\displaystyle \left( \sum_{i=1}^{n} V_{i}^{2} \right)^{1/2}.$

The “standard proof” he refers to is represented by this one taken from Wikipedia’s triangle inequality article:

$\displaystyle \begin{array}{rcl} ||X + Y\|^2 &=& \langle X + Y, X + Y \rangle \\ \\ &=& ||X||^2 + \langle X, Y \rangle + \langle Y, X \rangle + ||Y||^2 \\ \\ &\le& ||X||^2 + 2|\langle X, Y \rangle| + ||Y||^2 \\ \\ &\le& ||X||^2 + 2||X|| \cdot ||Y|| + ||Y||^2 \\ \\ &=& \left( ||X|| + ||Y||\right)^2. \end{array}$

Here Cauchy-Schwarz is used to obtain line 4. Now Cauchy-Schwarz also requires a few lines to prove—indeed one could write a book about it. It feels like the combined proof is tracing two sides ${X}$ and ${Y}$ of a triangle, when there ought to be a shorter and direct third side. That is what Sawhney offers.

## The New Proof

He can prove the triangle inequality in ${n}$ dimensions by only using the trivial one-dimensional version. That is the fact that for the absolute value

$\displaystyle |x+y| \le |x| + |y|,$

where ${x}$ and ${y}$ are real numbers. Well, it needs the notion of mathematical expected value ${E_{x \in S} f(x)}$. This is formally defined via integration on ${S}$. But he really only needs that expected value obeys some simple properties that one could say are “expected”: additivity, linearity, and ability to manipulate its argument. The one special property is that the norm of a vector ${V}$ in ${\mathbb{R}^n}$ scales as the expected value of its inner product with a unit vector ${U}$. Formally:

$\displaystyle \begin{array}{rcl} E_{||U||=1}(|U \cdot V|) &= &E_{||U||=1} \left( \left| U \cdot \left( \frac{V}{||V||} \right) ||V|| \right| \right) \\ \\ &=& ||V|| E_{||U||=1} ( |U \cdot (1,0,\dotsc,0) |) \\ \\ &=& C ||V||, \end{array}$

where ${C}$ is a fixed nonzero constant. For ${n = 2}$ one takes the integral of ${|\cos \theta|}$ over the circle, which is ${4}$, and divides it by ${2\pi}$ to make an average, so ${C = 2/\pi}$. The values for higher ${n}$ are different but the difference doesn’t matter, only that ${C}$ is fixed and nonzero. The rest of the proof needs only the 1-dimensional triangle inequality to go from line 1 to line 2:

$\displaystyle \begin{array}{rcl} ||X || + ||Y|| -||X + Y|| &=& \frac{1}{C} E_{||U||=1}( |U \cdot X| + |U \cdot Y| - |U \cdot (X + Y) | ) \\ \\ &\ge& \frac{1}{C} E_{||U||=1}( |U \cdot X + U \cdot Y| - |U \cdot (X + Y) | )\\ \\ &=& \frac{1}{C} E_{||U||=1}( |U \cdot (X + Y)| - |U \cdot (X + Y) | )\\ \\ &=& \frac{1}{C} E_{||U||=1}(0)\\ \\ &=& 0. \end{array}$

Pretty neat. No?

## Why Care?

I must say that I was quite surprised to see a radically different proof that did not use Cauchy-Schwarz or some equivalent inequality.

Sawhney’s proof is also one a computer theorist could have found. The idea that he relies on is quite neat: the norm of a vector can be computed by taking random projections. This is not elementary but it is intuitive. It is something that “we all know”—yet we did not make the connection to the triangle inequality. This is another example of the power of expectation as a concept.

As Sawhney remarks in his note, the Cauchy-Schwarz inequality can be proved from the triangle inequality by reversing the flow of the proof cited above from Wikipedia, hence it can now be derived via his proof. But that again would be taking two sides of a triangle, while Cauchy-Schwarz has a direct proof. What’s nice is that now both inequalities have a proof that doesn’t reference the other and have a nice bridge between them.

## Open Problems

The obvious open problem is: can we use a similar randomness trick to prove other inequalities? Perhaps there are new proofs to be discovered; perhaps there are open inequalities that can be attacked by this method.

[fixed absolute value bars]

16 Comments leave one →
1. May 5, 2018 2:12 pm

A small correction: It seems like that should read E_{||U||=1}(|U·V|), rather than E_{||U||=1}(U·V) as it currently does?

• May 7, 2018 8:38 am

Thanks! Coulda sworn I’d fixed that but maybe reverted it when changing \begin{align*} to \begin{eqnarray*} in the source LaTeX when WordPress didn’t pick up the former.

2. Ravi Boppana permalink
May 5, 2018 8:06 pm

Nice proof by Mehtaab! Here is another inequality where this random projection method can be used. Let z_1, z_2, …, z_n be n complex numbers. Show there is a subset S of 1..n such that the sum of the z_j (for j in S) has absolute value at least 1/pi times the sum of the absolute values of all z_j (for j in 1..n).

May 5, 2018 9:08 pm

Ravi

Thanks. Nice. There must be whole collection of these…

3. John Marks permalink
May 6, 2018 12:34 am

Could it give a new view on the sqrt sum problem?
http://cs.smith.edu/~jorourke/TOPP/P33.html

4. May 6, 2018 8:14 am

Nice proof! I think the main point of it is the proof that $E_{\|U\|=1} | U \cdot X| = c\|X\|$.

It is fairly easy to prove by direct calculation that $E_{\|U\|=1} | U \cdot X|^2 = d\|X\|^2$ for some $d$ but the use of rotation symmetry to show that the expectation of the absolute dot product is linear in the norm is very nice.

5. May 7, 2018 5:38 pm

If you know that is basis-invariant, then Cauchy-Schwarz is pretty easy: just rotate until X=(x_1,0,…,0). Then

|| = |x_1|.|y_1| ,

but |x_1| = |X| and

|y_1| <= sqrt(y_1^2 + y_2^2 + … ) = |Y|.

• May 7, 2018 5:39 pm

sorry, inner products with angle brackets get your commenting system into trouble. that should read

| inner product of X and Y | = |x_1|.|y_1|

• May 7, 2018 5:40 pm

Indeed, the fact that Exp_U U.V is proportional to |V| is also a use of basis independence.

• May 7, 2018 6:24 pm

Argh. and my first sentence should be “If you know the inner product is basis-invariant”. I’m not helping my case that this proof is just as simple :-O

6. May 8, 2018 11:58 am

Should the third step in the standard proof for triangle inequality be $=$ instead of $\le$ ; the inequality comes in the fourth step by Cauchy-Schwarz inequality?
Wikipedia entry https://en.wikipedia.org/wiki/Triangle_inequality has what you have written – but the following math exchange states differently https://math.stackexchange.com/questions/91181/proof-for-triangle-inequality-for-vectors

• May 9, 2018 12:03 am

The difference is the inner products in line 2 not having | … | around them.

May 10, 2018 1:54 pm

1) I think one should say at the outset that the triangle and Cauchy inequalities are equivalent (as you say much later). Michael Steele’s book is very clear on that (and of course
does not spend the whole book proving CS).

2) I worry about the use of rotation symmetry in the probability measure on the unit sphere. That is based on Euclidean geometry, in most presentations of which the triangle inequality appears at an early stage.

(I was not able to read the original article due to lack of a subscription at my institution.)

8. May 14, 2018 11:03 am

This is perhaps too simple an example of a well known method that spawned the field of integral geometry. A nicer example, due to Cauchy (I believe) is the proof that if V and W are bounded convex sets in the plane with V a subset of W, then the perimeter of W is greater or equal than the perimeter of V. Indeed the perimeter of V is $\pi$ times the average length of the projection of V. This is derived from a more general statement: Let $\mu$ be the natural rotation and translation invariant infinite measure on the space of lines L in the plane (obtained e.g. by picking the angle of L uniformly and the distance of L from 0 according to Lebesgue measure on the positive reals). Then the length of a rectifiable curve $\gamma$ is $\pi/2$ times the $\mu$-integral of the number of intersections of L with $\gamma$. To verify the latter statement up to a constant, approximate $\gamma$ by a polygonal curve and reduce it to the case of a line segment by additivity. The constant is most easily determined by considering a circle.

9. December 4, 2018 8:57 am

Thanks for sharing.