Basic theorems that rarely get proved in full detail

Laura Smoller is not a theorist, but a historian at the University of Arkansas at Little Rock. She sports the middle name “Ackerman,” which is one “n” off of the famous mathematician Wilhelm Ackermann. What makes me think of this is her informative page on the history of matrices.

Today I want to discuss an interesting phenomenon: certain bedrock theorems are rarely if ever checked.

Apparently matrices were discovered by the Chinese around 300 BC, while the term “matrix” was only introduced around 1850 by James Sylvester. Matrix theory was then developed by Arthur Cayley in detail, but is unclear who first proved that matrix product, in general, is associative. Smoller points out that “matrix” is the Latin word for womb, which is an interesting choice for this important concept.

The phenomenon we are interested in is that certain basic, critical, fundamental, what else can I say, facts in mathematics are rarely proved. It is not that they are obvious, not at all. It is that they are felt too tedious or technical to prove in a course or even a textbook. A prime example is:

${\bullet }$ Matrix Product Is Associative: Suppose that ${A,B,C}$ are square ${n \times n}$ matrices over the integers. Then we know that

$\displaystyle A \times (B \times C) = (A \times B) \times C,$

where “${\times}$” denotes the usual matrix product. This is not hard to prove, but is a bit tedious. Have you checked the details yourself? The following is a typical proof that is given:

## A Better Proof of The Associative Law?

Let’s try to give a higher-level proof of this law. Our proof is longer than the usual proof by far, but length is not the only measure of clarity of a proof. I believe that the following proof explains to me in a different way than the usual proof why matrix product is associative.

Let ${G}$ be an algebraic system that is an abelian group under addition denoted by “${+}$” and also has a well defined multiplication operation denoted by “${\times}$.” Suppose that ${S}$ is a subset of ${G}$ that generates ${G}$ under addition: every element of ${G}$ is the sum of a finite number of elements of ${S}$.

Consider the following two laws:

1. The distributive laws: for all ${a,b,c}$ in ${G}$, ${a \times (b+c) = a \times b + a \times c}$ and ${ (b+c) \times a = b \times a + c \times a }$.
2. The associative law for ${S}$: for all ${a,b,c}$ in ${S}$, ${a \times (b \times c) = (a \times b) \times c}$.

We claim that any system with these two laws is associative.

A simple notation convention: If ${a}$ is an element in ${G}$, then we will use

$\displaystyle a = \sum_{i} a_{i}$

to denote a representation of ${a}$ as a sum of elements ${a_{i}}$ from ${S}$. We will leave out the limits on the index ${i}$, since that should cause no confusion. This representation is not necessarily unique, but that has no effect on the remainder of this proof.

So we are all set to start showing that the above two laws imply that multiplication is associative. Suppose that ${a,b}$ are elements in ${G}$ with representations

$\displaystyle a = \sum_{i}a_{i} \text{ and } b = \sum_{j} b_{j}.$

Then the product of ${a \times b}$ is equal to

$\displaystyle \sum_{i} \sum_{j} a_{i} \times b_{j}.$

This follows directly from the distributive laws:

$\displaystyle \begin{array}{rcl} a \times b &=& \sum_{i}a_{i} \times \sum_{j} b_{j} \\ &=& \sum_{j} \left( \sum_{i}a_{i} \right) \times b_{j} \\ &=& \sum_{j} \sum_{i} a_{i} \times b_{j} \\ &=& \sum_{i} \sum_{j} a_{i} \times b_{j}. \end{array}$

The last step uses that addition is cummutative. Now let’s prove the associative law: let ${a,b,c}$ be elements in ${G}$,

$\displaystyle a = \sum_{i}a_{i} \text{ and } b = \sum_{j} b_{j} \text{ and } c = \sum_{k} c_{k},$

where as before all indexed elements are in ${S}$.

It follows that ${a \times (b \times c)}$ is equal to

$\displaystyle a \times \sum_{j,k} b_{j} \times c_{k},$

which is equal to

$\displaystyle \sum_{i,j,k} a_{i} \times (b_{j} \times c_{k}).$

But ${ (a \times b) \times c}$ is also equal to

$\displaystyle \sum_{i,j,k} (a_{i} \times b_{j}) \times c_{k}.$

Since elements in ${S}$ are associative the later two sums are equal, and we have proved the associative law.

So how do we prove the associative law for square ${n \times n}$ matrices over the natural numbers? We need only check that the above laws are true. The distributive law is easy—it is essentially that matrices are linear operators. Finally what plays the role of ${S}$? Just consider matrices of the form

$\displaystyle E_{ij},$

where ${E_{ij}}$ is the matrix that is all zeroes except that in row ${i}$ and column ${j}$ it is ${1}$. Here is an example for ${n=3}$,

$\displaystyle E_{12} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$

Clearly these generate the space of matrices by addition. It remains only to show that the associative law holds for these matrices. The key is to note that

$\displaystyle E_{ij} \times E_{uv}$

is the zero matrix unless ${j=u}$ and then it is the matrix ${E_{iv}}$.

## Two More Examples

Here are two additional examples of this phenomenon.

${\bullet }$ Existence of Universal Turing Machines: We all know that there are universal Turing Machines; this is proved in the original paper of Alan Turing. How many of us have worked through the full details of this theorem, including checking all the details that the universal machine works exactly as claimed? Okay, Ken did—in the late 1980’s he designed a universal Turing Machine using the now out-of-print Turing’s World software developed by John Etchemendy and the late Jon Barwise.

${\bullet }$ Elliptic Curve Product Is Associative: For any elliptic curve there is a “natural” addition defined on the curve. The proof that this addition is associative is critical to prove that it forms an abelian group. This group is one of the reasons that elliptic curves have such a beautiful theory. Again the checking this is elementary but tedious. Perhaps even more than tedious. Have you checked it yourself?

Here is a comment from a Wikipedia talk page on this very issue:

It’s not obvious to anyone. However, the definition of addition on elliptic curves is quite natural, if you are familiar with Bézout’s theorem, and once you have a commutative binary operation with identity and inverses it’s not hard to conjecture that it constitutes a group operation, and then the proof writes itself (with many computations). The other perspective, that the points of an elliptic curve correspond to divisors of degree 0 in its Picard group, is not so obvious, but once you see it, you immediately get a group operation on the points and you start to wonder whether the other operation which you couldn’t prove associative actually is, and is the same operation. And from the definition of the Picard group it is quite easy to prove they are the same thing. Neither of these qualifies as “obvious,” but they are not obvious in different ways.

The following is a proof of the associative law from the paper, but note it is incomplete. At the end it appeals to a math package. Other proofs state things like, “The proof follows from simple manipulation of the expressions derived in the discussion preceding the theorem.” Okay.

## Open Problems

Does the associative law proof given above help in your understanding? I would like to know, please. Also I think it would be good to find other “better” proofs for the other two examples. Also are there are other examples of statements that are fundamental but are rarely checked?

July 25, 2012 10:13 am

The addition law for elliptic curves can be appreciated as a trap-door function for our appreciation of mathematical naturality, in that the following representations are a natural sequence: the torus (viewed as a Lie group) is geometrically abelian; Weierstrass elliptic functions naturally present this abelian group structure in the complex plane; elliptic curves are a natural algebraic presentation of the Weierstrass elliptic functions.

For whatever reason, reversing this implication order yields an algebraic group representation whose verification is dauntingly intricate and whose various cancellations and simplifications seem miraculous.

So perhaps algebraic checks of the elliptic curve addition law are best understood, not as derivations, but rather as an (intricate) algebraic verification of a (natural) geometric insight?

Just to mention, in both quantum trajectory integration and cryptography it commonly happens that geometrically natural ideas are reflected in intricate algebraic relations; hence my own practical interest in these ideas.

How commonly do we find that geometric simplicity is reflected in algebraic intricacy and vice versa?

July 25, 2012 10:32 am

Dick Lipton asks: Have you checked it [associativity] yourself?

Yes. It was an illuminating exercise in Mathematica to verify the associativity property both numerically (straightforward) and symbolically (not too much harder) … the challenging part was evolving an appreciation of the naturality of associativity (regarding which the Mathematica calculations were not much help).

July 25, 2012 6:33 pm

I don’t see the geometrical insight. Can you please explain or link to a good explanation?

July 25, 2012 8:50 pm

Craig, my Mathematica notebook (which I wrote a few years ago) references the (english-language) elliptic curve notes that are available on the (swedish language) website Seminariekurs i teoretisk datalogi, and I seem to recall also that Neil Koblitz’ textbook Algebraic Aspects of Cryptography was a good source … mostly I recall that it was difficult to find *any* suitable texts … and so your mileage may vary! 🙂

July 25, 2012 9:05 pm

… and searching about on Google books, on page 322 of Ireland and RosenA Classical Introduction to Modern Number Theory we find a soothing assertion that — in light of the periodicity of the Weierstrass functions — “associativity is ‘obvious'”.

Craig, I hope these references help, but to be candid, the set of folks to whom these matters are entirely ‘obvious’ does not include me, particularly in regard to generalizing group properties from the complex plane to finite fields. Good luck!   🙂

July 26, 2012 11:56 am

July 27, 2012 7:36 am

For Gödel’s Lost Letter readers who are interested in the historical and contextual aspects of elliptic curves, please let me recommend also the recent article by Koblitz, Koblitz, and Menezes (KKM, 2011):
 @article{Author = {Ann Hibner Koblitz and Neal Koblitz and Alfred Menezes}, Journal = {Journal of Number Theory}, Pages = {781-814}, Title = {Elliptic curve cryptography: the serpentine course of a paradigm shift}, Volume = {131}, Year = {2011}} 
The KKM article’s discussion of “narrative inversion” in cryptography (Section 12.3) is particularly relevant to recent Gödel’s Lost Letter posts, in that the ongoing Kalai-Harrow debates on the feasibility (or not) of efficient quantum dynamical simulations are analogous to the question of the feasibility (or not) of efficient inversion of ECC.

How strong *is* Nature’s quantum encryption? That is a wonderfully interesting and strategically important question!   🙂

July 25, 2012 10:32 am

Once you check that matrix multiplication corresponds to composition of endomorphisms of (i.e., linear operators on) your vector space (call it V), you are reduced to showing that End(V) is an associative k-algebra (k being the ground field). In fact, for any ring R and any R-module M, End_R(M) is an associative R-algebra. Loosely, “composition is always associative”. This is essentially obvious, but if you like you can easily write down a simple Proof: let A, B, C be three endomorphisms of an R-module M. For m in M, we check that A(BC)(m) = (AB)C(m). By definition, the lefthand side is A(BC(m)) = A(B(C(m))). By definition the righthand side is (AB)(C(m)) = A(B(C(m))). qed.

Oh wait. un-q.e.d, un-q.e.d! We still need to check that given linear endomorphisms A,B of a finite dimensional vector space space V over a field k, then upon choosing a basis v_1, …, v_n for V, the matrix of the composition AB is the matrix product [A][B], if the matrices of A and B are [A] and [B] respectively. Let’s recall a definition of “the matrix” of a linear transformation B. It has entries in the jth column equal to the coefficients of Bv_j in the basis in question. I.e. $$Bv_j = \sum [B]_{ij} v_i$$. So by linearity, $$ABv_j = \sum_i [B]_{ij} Av_i = \sum_{i,l} [B]_{ij} [A]_{li} v_l = \sum_{i,l} ([A]_{li}[B]_{ij}) v_l$$. This says that the entry in the l’th row and j’th column of [AB] is $$\sum_i [A]_{li}[B]_{ij}$$, i.e. the (l,j) entry of [A][B]. qed for real this time.

I think this proof is short and elegant because it is applicable for finite-rank free R-modules M over any commutative ring R.

Regarding elliptic curves: I think brute-force algebra proofs of the associativity are worse than useless. (Well, ok, if one has taken the time to turn one’s ugly proof into a computer-verified formal proof in a language like Coq or HOLlight, etc, then maybe it would give assurance of validity to some people who otherwise lacked confidence in the result. But otherwise, the odds are, I think, a priori greater that one has made an algebra mistake or typo somewhere than that one hasn’t. Maybe the mistake is substantive and maybe it isn’t. But who wants to go through line by line and check!?)

The Picard group proof mentioned on the wikipedia page is undoubtedly the “right” one, in my opinion. You could see Chapter 21 of this book: http://math.stanford.edu/~vakil/216blog/FOAGmay1612public.pdf, for a nice account, but be aware you might backtrack some earlier results or else take some stuff on faith.

I believe that using the theorem here: http://en.wikipedia.org/wiki/Cayley%E2%80%93Bacharach_theorem, it is possible to give a purely “synthetic” proof of associativity in the “generic” case of three points that are all sufficiently distinct. Here by “synthetic” I mean the proof does not make use of coordinates, so the specific polynomials entering into the definition of the group law play no role; one works purely from the geometric interpretation. One could hope to use a degeneration argument to deduce the general associativity from this, but it has been a while since I thought through the details of how this might go, so I’m not sure if it works.

July 25, 2012 10:33 am

The associativity of matrix multiplication doesn’t have to be “checked” because of the well-known isomorphism with an algebra of homomorphisms.

I’d rather state your final question in this way: are there are other examples of statements that are fundamental, are generally considered to be true but – being never fully checked – might be false?

July 25, 2012 10:47 am

but note it is incomplete.

Is it?

Suppose the last line had been “which as per Riemman’s theorem lies in I” instead. You wouldn’t have called it incomplete.

Yet, when a similar invocation is made to a software package is called incomplete.

Why the difference?

5. July 25, 2012 11:43 am

• July 25, 2012 5:46 pm

This is offtopic. Furthermore, you’re wrong. The Cook-Levin Theorem is not false — at least, not by your reasoning. For a language to be in NP, there is a NTM that decides it and a polynomial p(n) that bounds its running time along any path. When you construct a reduction from an NP language to other languages, you can use this NTM and the associated p(n) freely, since it is assumed that they exist. After all, that’s what it means for a language to be in NP.

• July 26, 2012 11:29 am

How can we use that NTM and the associated p(n) freely, if we even do not know what p(n) is (n^2? n^3? n^100? …?)? Think: In order to Cook-Levin Theorem works, this p(n) must exist, of course, but it also must be given, which is tacitly used as a faulty hidden assumption into all the “proofs” of that “theorem”. It’s very very sad, but it is true.

July 28, 2012 10:42 am

With regard to the above Barbosa-Mota exchange, it is reasonable to have mixed sympathies as follows:

—————————–

Logical Sympathies  Logically my sympathies are entirely with Francisco Mota: the Cook-Levin theorem does follow rigorously from the standard postulates of complexity. And conversely of course, those standard postulates are tuned to accommodate theorems like Cook-Levin.

Natural Sympathies  Naturally my sympathies do appreciably overlap with André Luiz Barbosa’s concerns, in that presently it is not known whether the Cook-Levin reduction may be accomplished concretely in all cases, versus requiring assistance from (non-physical) oracles in some cases.

Pragmatic Sympathies  Pragmatically my sympathies reside wholly with Scott Aaronson, who offers the eminently sensible advice that in-depth mathematical queries are best posed as questions on the various math stacks, or alernatively — but far more slowly and less accessibly to students and non-specialists — as articles in the mathematical literature.

A synthesis of these logical, natural, and pragmatic sympathies is reflected in the TCS StackExchange question “Does P contain languages whose existence is independent of PA or ZFC?”, which by a slow community-driven process has evolved into a reasonably well-tuned (and to judge by its rating, reasonably well-regarded) TCS community wiki.

André Luiz Barbosa, please let me suggest that you study the process by which this community Wiki was established — in particular the absolute requirements of scrupulous referencing, collegially respectful dialog, constructive response to criticisms, and careful attention to definitions, sustained throughout a lengthy and necessarily iterative social process — and that you perhaps post your remarks either as a question on the above Wiki, or alternatively, as a standalone question on a forum like TCS StackExchange.

Note especially that a very considerable degree of patience is required … these ideas originate largely in Juris Hartmanis’ monography Feasible Computations and Provable Complexity Properties (1978), whose ideas are still being assimilated 34 years later.

And finally, I do *not* regard Barbosa’s concerns as being wholly off-topic, for the reason that if there exist mathematical theorems that are commonly employed yet seldom checked, how much more commonly do we embrace mathematical postulates that are seldom reconsidered?

Thanks especially are extended to Scott Aaronson for pointing out the time-tested merits of this traditional, slow-but-sure strategy for mathematical question-posing and question-answering, and to Dick Lipton and Ken Regan, for helping to instill (in me and many readers of Gödel’s Lost Letter) a healthy respect for both the practical merits and the practical challenges, of oracle-independent proofs in complexity theory!   🙂

• July 30, 2012 12:32 pm

Dear Dr. Sidles, thank you for your support. As I haven’t suffice reputation into the top foruns in the area, I ask you to enter my conclusion at http://www.andrebarbosa.eti.br/The_Cook-Levin_Theorem_is_False.pdf as a standalone question on the TCS StackExchange.

July 30, 2012 3:11 pm

André, no reputation is required to post a question on any of the major StackExchanges. Your post must however *be* a question, and it is very desirable that the question you asked be well-posed and broadly interesting to the community.

To start from bare beginning, a reasonable strategy is: (1) earn a few hundred reputation points by being a good citizen answering other folks’ questions, (2) in so doing, assimilate the norms of good citizenship on that forum, then (3) pose your question carefully using all the arts of good citizenship that you have learned, (4) as an incentive, offer your points as a bounty for good answers.

This strategy may seem obvious … because it *is* obvious. And it may seem like lot of work … which it is. And it may seem slow … which it is. In fact, the process is academia in-miniature. André, good luck!   🙂

July 30, 2012 4:18 pm

André, you might also take a look today’s very active topic on TCS StackExchange titled “Are there non-constructive algorithm existence proofs?“. This well-rated question, its answers, and the forum members who are providing those answers, all are role models for the class of ideas that you are interested to pursue.

July 31, 2012 10:15 am

I didn’t understand the thesis of André’s paper when reading it the first time. His English has a lot of grammatical mistakes and his paper in the bibliography links to some other paper in arxiv.

• July 31, 2012 11:38 am

Dear Dr. Craig, despite of the lot of grammatical mistakes, have you understood the thesis of my paper when reading it the second time? What do you think about that?

July 31, 2012 2:04 pm

I didn’t understand it the second time I read it.

• July 31, 2012 3:15 pm

I believe Andre’s point is:

Suppose there is a problem Q in NP. Then, we know there is a polynomial p(n) that is a upper bound for the runtime of a NTM that decides Q.
Now, assume we do *not* know what the actual p(n) is.
In this case, Cook-Levin Theorem does not gives us a “concrete” DTM that reduces Q into SAT in polynomial time.
Andre seems to claim that for this reason CLT is bogus.

I believe Andre’s conclusion is not correct. CLT is showing that a DTM exists, though it does not provide the actual DTM (when we do not know p(n)).
John Sidles provides a link to other examples of non-constructive existence proofs.
Note that if we do not accept CLT because it is non-constructive, then we should also not accept the claim that a problem Q is NP without knowing the corresponding polynomial p(n).

• July 31, 2012 5:10 pm

Yes, Leonardo, you went to main point. The problem is that, even though is very clear that the two facts cannot be both true, the established theory in the area states both:

(i) a problem Q is NP despite without knowing the corresponding polynomial p(n); and

(ii) The Cook-Levin Theorem gives us a “concrete” DTM that reduces Q into SAT in polynomial time.

I think that (i) is obviously true. Then, (ii) [the CLT] is (must be) false.

July 31, 2012 6:48 pm

Rather than asserting “the CLT is false”, perhaps it is more fair to observe that there is a realm of complexity theory within which the CLT is true, and moreover this complexity-theoretic realm is immense in scope, intricate in conception, beautiful in proof, consistent in postulates … and (very often) nonconstructive in application.

Whether this vast realm comprises a Heaven or a Hell for the coming generation of young complexity theorists (and for engineers who rely upon advances in complexity theory) is largely a matter of opinion, regarding which mathematicians like Doron Zeilberger (among others) write thought-provoking essays.

Perhaps this is one of those questions, regarding which it is neither necessary, nor feasible, nor even desirable, that everyone think alike.

Could a complexity-theoretic realm of attractive scope, beauty, and utility be founded entirely upon constructive principles? Since I am myself among the persons least qualified to offer an informed opinion in this matter, perhaps other folks will care to offer theirs.

• August 1, 2012 6:30 pm

Dear Dr. Sidles, I believe the realm of Computational Complexity Theory that really matters is the real world, even though its hardnesses and sadnesses… Beyond this human and extraordinary place anything goes but anything melts into air…

July 25, 2012 7:31 pm

Got your quantifiers swapped there, Andre: CLT says that every lang in NP is computable by a procedure that runs in an initial deterministic phase, and that finishes off with a single query to a SAT solver, as in, “return SAT(z)”, where z is prepared in the initial deterministic phase. Note it’s not saying that the initial phase is the same for every lang in NP.

• July 27, 2012 4:36 pm

It doesn’t matter at all whether or not that “initial phase” is the same for every language in NP: In order to Cook-Levin Theorem works (construction of a poly-fime reduction from an NP language to other one in P [SAT]), the p(n) that bounds the running time of an NTM that decides that language in NP must be given, which is tacitly used as a faulty hidden assumption into all the “proofs” of that “theorem”.

August 5, 2012 1:22 am

Sorry Andre, I’m against you on the existence of p(n) issue. However, my dispute with the CLT proof is that in the reduction, the “source problem” is NOT a property, but only a “computation” of the property by an NDTM. A description of a property’s computation steps on a (ND)TM is NOT the same as the description the property itself. On the other hand, in almost every NP-completeness reduction, we use property descriptions, NOT their computation steps on a TM. (By property, I refer to a decision problem)

• August 6, 2012 11:16 am

In fact, the converse holds here: The “source” in the reduction into CLT IS a property (a decision problem), NOT a “computation”. See that if we prove that a problem is in NP, then we must can construct a poly-time NTM that decides it. Do not confuse “machine” (NTM, DTM, DFA, NFA, PDA, etc.) with “computation” (a particular running of a machine).

The CLT is really a kind of translator that translates an instance of a poly-time NTM (a “materialization” of an NP problem) in an instance of a Boolean formula. The issue is that in order to CLT does it, the related polynomial p(n) from that NTM MUST be given. However, if this p(n) is not known, then the CLT doesn’t can do this “translation”, despite the NTN is poly-time: Hence, the CLT is false.

• August 6, 2012 1:15 pm

Complementing: The CLT translates an instance of a poly-time NTM with a particular input for it (an instance from the NP problem that it decides) in an instance of a Boolean formula, and can be considered a poly-time function: CLT(M, input) = BoolForm. See that its input is not the computation of M on input.

July 25, 2012 11:56 am

The right way to prove the associativity of matrix multiplication is to show that matrices correspond to linear maps and multiplication to composition of linear maps. Composition of functions is necessarily associative. This correspondence is, after all, the whole point of matrices.

July 25, 2012 5:01 pm

There’s no “right way” in mathematics and I understand that, in a blog about computer science, the computational way of proving things appears to be the most natural. But I agree with you: when you think of matrices as linear maps, the associativity of their multiplication needn’t be proved at all.

July 26, 2012 9:50 am

But then you would have to show unicity of the representation of linear mappings. All your argument gives is that performing the multiplication in a different order is still the same mapping, but if there exists more than one representation it wouldn’t follow that you would have the same one as if you had done the multiplication in the original order.

July 26, 2012 10:33 am

This is true: complexity always has to hide somewhere. It seems to behave like a form of energy or mass. Or is computational complexity indeed a form of energy? Why is it always the case that, whenever something becomes easy, something else has to become difficult? That never ceases to puzzle me…

July 25, 2012 3:56 pm

I agree with Sally. Doesn’t associativity of matrix product follow directly from associativity of function composition, after making the two (easy) observations that matrices correspond to linear functions and multiplication corresponds to composition? If so, then this whole blog post is beyond ridiculous. I must be missing something. What am I missing? Anyone?

July 26, 2012 10:21 am

Of course it’s not really accurate to suggest the associativity was fully checked. 🙂 But jokes aside, the theorems of linear algebra are elegant but when you want your computer to actually work with matrices – inversion, reduction to normal forms, multiplication – you have to program it the hard way because the easy way’s much too inefficient for any practical purpose. In real life you can have thousands of lines and columns…

July 26, 2012 4:08 pm

“the hard way” = Gaussian elimination and stuff like that:
http://en.wikipedia.org/wiki/Gaussian_elimination

August 1, 2012 8:09 am

I’d like to add that it’s quite possible to base a whole course about linear algebra on Gauss’s elimination algorithm. Of course, the proofs are more difficult to understand – and indeed, the notion of linear map relieves the brain – but they’re readily usable on a computer. I think Seymour Lipschtuz uses this approach in his “Outline of Linear Algebra”.

August 1, 2012 10:23 am

Serge,

You are correct. That’s all linear algebra really is. Even the hard stuff at the end of an introductory course (Jordan normal form) can be easily understood in terms of Gaussian elimination.

It’s only the definitions like eigenvectors, eigenvalues, rank, dimension, determinant, kernel, vector space that make things difficult. Math has too many definitions that make intuitive things nonintuitive.

August 1, 2012 11:20 am

Craig, it is notable that the elements of linear algebra that your post classifies as “difficult” and “nonintuitive” are precisely those elements that are essential in connecting algebra with geometry. See for example, the appendix to John Lee’s Introduction to Smooth Manifolds that is titled “Review of Prerequisites: Linear Algebra”.

If algebra is largely concerned with choosing good coordinates for efficient computation, then perhaps geometry is concerned with not having to choose coordinates at all. As the Preface to Lee’s book (which is well-worth reading) puts it:

The old joke that “differential geometry is the study of properties that are invariant under change of notation” is funny primarily because it is alarmingly close to the truth.

So perhaps it is unsurprising that algebraists and geometers employ differing cognitive tool-sets, and that many fruitful mathematical ideas have originated at the boundary of these two disciplines.

August 2, 2012 4:56 pm

I wonder if all efficiently-solvable mathematics problems can be reduced on some level to Gaussian elimination.

July 26, 2012 10:23 am

erratum: “wasn’t fully checked” … “thousands of raws and columns”

July 26, 2012 3:58 pm

rows – pardon my English. 🙂

July 27, 2012 8:29 am

> Also, are there other examples of statements that are fundamental but are rarely checked?

When I first read this sentence, I completed it subconsciously: “… and might be false”. To this question the answer is: probably no.

But if Dick had in mind the following end: “… although it’s a good exercise to do it for yourself”, the obvious answer is: of course yes.

July 26, 2012 5:48 am

The addition of points on elliptic curves and the multiplication of matrices have a common property: there are two very different way of viewing the same subject – the computational one and the geometric one, more conceptual.

With the computational view, calculations are easy for the computer but proofs are difficult to understand for a human brain. With the geometric view, theorems are more straightforward to prove but they lead to very inefficient programs: just try to inverse a matrix using its determinant. As far as I know, computing a determinant requires exponential time…

So here’s the paradox: whenever it’s easy for the computer, it’s hard for the brain… and the other way around! Given this situation, it’s no wonder that even the best thinkers can’t manage to prove much about the complexity of problems! 🙂

July 26, 2012 7:12 am

Another view of matrix computations associates all of their (intuitively intricate) algebraic properties to the (intuitively simple) geometric naturality of volume forms. In this regard Michael Spivak’s Calculus on Manifolds includes the charming remarks:

There are good reasons why theorems should all be easy and the definitions hard … Definitions serve a twofold purpose: they are rigorous replacements for vague notions, and machinery for elegant proofs.

… Stokes’ Theorem shares three important attributes with many fully evolved major theorems: (1) It is trivial. (2) It is trivial because the terms appearing in it have been properly defined. (3) It has significant consequences.

The top-rated Amazon review of Spivak’s book has this to say:

When you are in college, the standard calculus courses will teach you the material useful to engineers. If you want to become a mathematician (pure or applied), you must pretty much forget the material in these courses and start over. That’s where you need Spivak’s “Calculus on Manifolds”. Spivak knows you learned calculus the wrong way and devotes the first three chapters in setting things right.

Hmmm … so ought we to be concerned that students nowadays are learning computational complexity theory “the wrong way”? Classical and quantum dynamics “the wrong way”? Thermodynamics and statistical mechanics “the wrong way”? Even chemistry and biology “the wrong way”?

It is both daunting and inspiring to reflect that the 21st century answers to all these questions very plausibly are “yes.”

So what is “the right way”? As Han Solo says: “Well, that’s the trick, isn’t it? And it’s going to cost you something extra.” Something extra, in the work of evolving an integrative appreciation of these challenges. As George Marshall said (in another context):

None of these problems can be solved alone. They are directly related, one to the other.

🙂

July 26, 2012 7:42 am

I don’t know if there is any right way to learn anything, but I have found that the most effective way to learn something is to actually do it. With respect to math, this means trying to solve math problems and asking lots of questions.

July 26, 2012 8:41 am

As Donald Knuth says: “Science is knowledge which we understand so well that we can teach it to a computer”.

July 26, 2012 9:56 am

Stanislaw Ulam (circa 1943):

“I have sunk so low that my last paper contained numbers with decimal points.”

Nowadays — as mathematicians like Doran Zeilberger and Donald Knuth rightly emphasize, and with the help of fast-evolving tools like SAGE and Mathematica — it is very common for computers to assure us that our decimal points are OK … and those same computers even assure us that our symbol manipulations are OK … for example, with regard to the examples of Dick Lipton’s post, computers can readily verify associative laws numerically and validate them symbolically.

What the computers cannot (as yet) accomplish, is to synthesize sharable narratives (in the style of Bill Thurston) that explain why these relations are natural. Even we humans cannot (as yet) accomplish narrative construction by any known deterministic algorithm.

Yet somehow, it happens. Lucky for us!

9. July 27, 2012 9:28 am

“Does the associative law proof given above help in your understanding? I would like to know, please.”

Not for me. It includes an interesting more generalized observation. But I think I personally prefer to see the lower-level “guts” of matrix details clearly on display (association and distribution of the ring operation).

July 28, 2012 7:10 am

Another example for a fact that almost everybody knows, some people may have used in their work and almost nobody checked is Kuratowski’s Theorem (at least this is my impression).

11. July 29, 2012 2:21 am

Has there been a case where some theorem is taken for granted by many, proof for which is later found to have some subtle flaw ?

• July 29, 2012 10:37 pm

I took for granted that Jensen-Shannon divergence is additive when comparing product distributions a-la JS(A_1 X B_1, A_2 X B_2) = JS(A_1, A_2) + JS(B_1, B_2), but to my then-shock this is false in general. This is slated for a future post.

July 30, 2012 9:53 am

It was believed at first that simple convergence of functions was enough to preserve the continuity of their limit. This mistake was an incentive for the invention of uniform convergence.

There are certainly many other historical examples. I wonder which is the “false theorem” that was believed to be valid for the longest period before it was proven false. Let’s hope there aren’t too many “fundamental” ones that are still awaiting a disproof…

July 29, 2012 10:48 pm

Smooth elliptic curves are isomorphic to complex tori C/L, where L is a lattice (technically with a point at infinity removed). So you need only check that the multiplication on an elliptic curve is just multiplication on the torus, and then it will be automatically associative. In fact, the only complex automorphisms of a complex torus (as a Riemann surface) correspond to multiplication of points and some torsion symmetries. So one need only check that complex curve multiplication is an automorphism when you fix either coordinate, and check the identity axiom, then it must be commutative.

An example I like of an “obvious” statement that nobody can usually face checking is that every norm on $\mathbb{R}^n$ can be arbitrarily closely approximated by a smooth norm (where that can mean infinitely differentiable if you like).