Does notation shape our thinking?

Kenneth Iverson was a mathematician who is most famous for designing APL. This was the name of his programming language, and it cleverly stood for “A Programming Language.” The language is unique—unlike almost any other language—and contains many powerful and interesting ideas. He won the 1979 Turing Award for this and related work.

Today I want to talk about notation in mathematics and theory, and how notation can play a role in our thinking.

When I was a junior faculty member at Yale University, in the early 1970s, APL was the language we used in our beginning programming class. The reason we used this language was simple: Alan Perlis, the leader of the department, loved the language. I was never completely sure why Alan loved it, but he did. And so we used it to teach our beginning students how to program.

Iverson first created his language as a notation for describing complex digital systems. The notation was so powerful that even a complex object like a processor could be written in it in relatively few lines of code: the lines might be close to unreadable, but they were few. Later the language was implemented, and it acquired a small but strong set of believers. Alan was clearly one of them; he once said:

A language that doesn’t affect the way you think about programming, is not worth knowing.

The language APL was great for working with vectors, matrices, and even higher order objects. It had an almost uncountable number of built-in symbols that could do powerful operations on these objects. For example, a program to find all the primes below ${R}$ is:

$\displaystyle (\sim R \in R \circ . \times R)/R \gets 1 \downarrow \iota R.$

This is typical APL: very succinct, lots of powerful operators, and no keywords. The explanation of this program is here.
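For readers who do not read APL, here is a rough Python sketch of what the one-liner computes, read right to left. The function name `primes_up_to` is made up for this illustration, and the sketch assumes the usual APL reading in which the candidates run from 2 through R inclusive, so R itself is reported if it happens to be prime.

```python
def primes_up_to(r):
    """Follow the APL one-liner step by step, reading it right to left."""
    # "1 drop iota R": the integers 1..r with the first one dropped, i.e. 2..r
    candidates = list(range(2, r + 1))
    # "R outer-product-times R": every pairwise product of two candidates
    products = {a * b for a in candidates for b in candidates}
    # "(not R member-of products) compress R": keep candidates that are
    # not the product of two numbers that are each at least 2
    return [n for n in candidates if n not in products]


print(primes_up_to(30))
```

Like the APL original, this builds the whole multiplication table, so it trades efficiency for brevity.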

Famous for its enormous character set, and for being able to write whole accounting packages or air traffic control systems with a few incomprehensible key strokes.

Not quite right, not very nice, but this comment actually captures the spirit of Iverson’s creation: making complex functions expressible as very short expressions was an interesting idea. For example, Michael Gertelman has written Conway’s Game of Life as one line of APL:

This should make it clear that this language was very powerful, perhaps too powerful.

When I had to learn APL in order to teach the beginning class I decided to start a project on building a better implementation. The standard implementation of the language was an interpreter, and one of my Yale Ph.D. students, Tim Budd, eventually wrote a compiler for APL. We also proved, in modern terms, a theorem about the one-liners of the language: each one-liner could be implemented in logspace. This was not completely trivial, since the operators were so powerful. Perhaps another time I will discuss this work in more detail. For now see this for Tim’s book on his compiler.

Let’s turn to discuss various notations used in math and theory.

Good Notation

It is hard to imagine, but there was a time when mathematicians did not even have basic symbols to express their ideas. Many believe that notation helps to shape the way that you think. Clearly, without basic symbols modern mathematics would be impossible, or at least extremely difficult.

${\bullet }$ Robert Recorde is credited with introducing the equality symbol in 1557. He said

${\dots}$ to avoid the tedious repetition of these words: “is equal to,” I will set (as I do often in work use) a pair of parallels of one length (thus ${=}$), because no two things can be more equal.

${\bullet }$ René Descartes is known for the first use of superscripts to denote powers:

$\displaystyle x^4 = x \cdot x \cdot x \cdot x.$

François Viète introduced the idea of using vowels for unknowns and consonants for known quantities. Descartes changed this to: use letters at the end of the alphabet for unknowns and letters at the beginning for knowns.

Descartes thought that ${x,y,z}$ would be equally used by mathematicians. But it is ${x}$ this and ${x}$ that ${\dots}$ The story, which may be only a legend, is that there is a reason ${x}$ became the dominant letter to denote an unknown. Printers had to set Descartes’ papers in type, and they used up many ${y}$’s and ${z}$’s, since the French language uses them quite a bit. But French almost never uses ${x}$, so that letter was plentiful, and Descartes’ La Géométrie used ${x}$ as the variable most of the time.

${\bullet }$ Isaac Newton and Gottfried Leibniz invented the calculus. There is a controversy that continues to this day on who invented what and who invented what first. This is sometimes called the calculus war—see here for some information.

Whoever invented what, they did use different notations; that, at least, is without controversy. Newton used the dot notation and Leibniz the differential notation. Thus Newton would write ${\dot x}$ while Leibniz would write ${\frac{dx}{dt}}$ for the same thing. The clear winner, most agree, is the better notation of Leibniz. It is even claimed that the British mathematicians, by sticking to Newton’s poorer notation, lagged behind the rest of Europe for decades.

${\bullet }$ Leonhard Euler introduced many of the common symbols and notation we still use today. He used ${e}$ and ${\pi}$ for the famous constants and ${i}$ for the square root of ${-1}$. For summation Euler used ${\Sigma}$ and also introduced the notation for functions: ${f(x)}$. One can look at some of his old papers and see equations and expressions that look quite modern. Of course they are old, but they look modern because we still use his notation.

${\bullet }$ Johann Gauss introduced many ideas and proved many great theorems, but one of his most important contributions concerns the congruence notion. Leonhard Euler earlier introduced the notion, but without Gauss’s brilliant notation for congruences, they would not be so easy to work with. Writing

$\displaystyle x \equiv y \bmod m$

is just magic. It looks like an equation, can be manipulated like an equation—well almost—and is an immensely powerful notation.
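A small worked example of the “well almost” may help; these are standard facts about congruences rather than anything specific to this post. Congruences to the same modulus can be added and multiplied just like equations:

$\displaystyle a \equiv b \bmod m \ \text{ and } \ c \equiv d \bmod m \ \Longrightarrow \ a + c \equiv b + d \bmod m \ \text{ and } \ ac \equiv bd \bmod m.$

Cancellation, however, can fail:

$\displaystyle 2 \cdot 3 \equiv 2 \cdot 0 \bmod 6 \quad \text{but} \quad 3 \not\equiv 0 \bmod 6.$

Cancelling a common factor is safe only when that factor is relatively prime to ${m}$.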

${\bullet }$ Paul Dirac introduced in 1939 his famous notation for vectors. The so-called bra-ket notation is used in quantum everything. It is a neat notation that takes some getting used to, but seems very powerful. I wonder why it is not used all through linear algebra. A simple example is:

$\displaystyle |x \rangle = [a_0,a_1,\dots,a_n]^T.$

The power of the notation is that ${x}$ can be a symbol, expression, or even words that describe the state values.

For a neater example, the outer-product of a vector ${R}$ with itself used in the APL program for primes above is written this way in Dirac notation:

$\displaystyle |R\rangle\langle R|,$

which flips around the inner product ${\langle R | R \rangle}$. Now to multiply the matrix formed by the outer product by a row vector ${a}$ on the left and a column vector ${b}$ on the right, we write

$\displaystyle \langle a | R\rangle\langle R| b \rangle.$

The bra-kets then associate to reveal that this is the same as multiplying two inner products. Another neat feature is that ${\langle x |}$ versus ${|x\rangle}$ captures the notion of a dual vector, and when ${x}$ has complex entries, the ${\langle x |}$ notation implicitly complex-conjugates them.
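The re-association can be checked numerically. The following is only a sketch in plain Python: the helper names `bra_ket`, `ket_bra`, and `sandwich` and the sample vectors are invented for this illustration, and a bra is taken to conjugate its entries, as described above.

```python
def bra_ket(x, y):
    """Inner product <x|y>; the bra conjugates the entries of x."""
    return sum(complex(xi).conjugate() * yi for xi, yi in zip(x, y))


def ket_bra(x, y):
    """Outer product |x><y| as a list of rows; the bra conjugates y."""
    return [[xi * complex(yj).conjugate() for yj in y] for xi in x]


def sandwich(a, matrix, b):
    """<a| M |b>: conjugated row vector times matrix times column vector."""
    return sum(
        complex(ai).conjugate() * matrix[i][j] * bj
        for i, ai in enumerate(a)
        for j, bj in enumerate(b)
    )


# Arbitrary sample vectors with some complex entries.
R = [1 + 2j, 3, -1j]
a = [2, 1j, 1]
b = [1, 1, 1 - 1j]

left = sandwich(a, ket_bra(R, R), b)    # <a| (|R><R|) |b>
right = bra_ket(a, R) * bra_ket(R, b)   # <a|R> <R|b>
print(abs(left - right) < 1e-12)
```

The matrix in the middle never needs to be formed: the two inner products on the right give the same number with far fewer multiplications.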

${\bullet }$ Albert Einstein introduced a notation to make his General Relativity equations more succinct. I have never used the notation, so I hope I can get it right. According to his rule, when an index variable appears twice in a single term, once as a superscript and once as a subscript, then it implies a summation over all possible values. For instance,

$\displaystyle y = c_i x^i$

is

$\displaystyle y= \sum_{i=1}^3 c_i x^{(i)},$

where ${x^{(i)}}$ are not powers but objects.
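In code, the convention just says that a repeated index stands for a loop that accumulates a sum. A minimal Python sketch follows; the function name `contract` and the sample components are made up for illustration.

```python
def contract(c, x):
    """Einstein convention: c_i x^i means a sum over the repeated index i."""
    if len(c) != len(x):
        raise ValueError("both objects must have the same index range")
    return sum(c_i * x_i for c_i, x_i in zip(c, x))


c = [2, 0, -1]   # c_1, c_2, c_3 (lower index)
x = [5, 7, 3]    # x^1, x^2, x^3 (upper indices are labels, not powers)
y = contract(c, x)
print(y)  # 2*5 + 0*7 + (-1)*3 = 7
```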

${\bullet }$ Dick Karp introduced the notation ${C/f(n)}$ to complexity theory in our joint paper on the Karp-Lipton Theorem. Even though the notation was his invention, as a co-author I will take some ${\epsilon}$ credit.

Good Notation?

Not all notation that we use is great; some may even be called “bad.” I would prefer to call these good with a question mark. Perhaps the power of notation is up to each individual to decide. In any event, here are a few “good?” notations.

${\bullet }$ John von Neumann was one of the great mathematicians of the last century, who helped invent the modern computer. He once introduced the notation

$\displaystyle f(((x)))$

where the number of parentheses modified the function ${f}$. It does not matter what they denoted, the notation could only be used by a brilliant mind like von Neumann’s. This notation is long gone as used by von Neumann, but in ideal theory ${(p)}$ is not the same as ${((p))}$. Oh well.

${\bullet}$ The letter ${\pi}$ for pi is fine, but maybe it denotes the wrong number? Since ${2\pi}$ denotes the full unit circle and occurs all the time in physics, maybe we should have used the symbol ${\pi}$ for that number? Lance Fortnow and Bill Gasarch once posted about this here, with many interesting comments.

${\bullet}$ Why is the charge of the electron negative? Evidently it is because Benjamin Franklin believed the flow of an unseen fluid was opposite to the direction the electron particles were actually going.

${\bullet}$ Why has humanity been unable to establish that ${\subset}$ means proper subset and only ${\subseteq}$ means subset, by analogy to ${<}$ and ${\leq}$? Hence for proper subset one often resorts to the inelegant ${\subsetneq}$ notation.

${\bullet }$ I will end with one example of notation that many feel strongly is “good?”: ${f'(x)}$ to denote the derivative of a function. See here for a lively discussion.

Open Problems

Does good notation help make mathematics easier? Are there some notions that are in need of some better notation? Would you rather discover a great theorem or invent a great notation?

November 30, 2010 8:40 am

As your nice article discusses Kenneth Iverson, I feel I have to highlight a perfect example of good notation, apparently invented by Iverson in APL: the convention of using [X] to mean a term which evaluates to 1 if the expression X is true, and 0 if X is false. Wikipedia knows this as the Iverson bracket.

This notation is used throughout “Concrete Mathematics” by Graham, Knuth and Patashnik to great effect, and I personally find it invaluable.

2. November 30, 2010 8:47 am

One example is the use of commutative diagrams in algebra, which I am sure was essential to the development of homological algebra. I don’t know much about this, though.

Another fantastic example is string diagram notation in linear algebra. I think some version of it was first discovered by Penrose, but nowadays people recognize it is much more than just a notation: it is related to higher category theory, Feynman diagrams, quantum groups, braid groups, knot theory, topological quantum field theory… there is a nice collection of resources at this MO question.

Personally, I hate the notation $f(x)$ for a function. If you write $f : A \to B$ as an arrow from a box labeled A going right to a box labeled B, and $g : B \to C$ as an arrow from a box labeled B going right to a box labeled C, then diagrammatically their composite $A \to C$ looks like it ought to be written $fg$, but it is actually written $gf$. To be consistent with the arrow notation one really ought to write $xf$ or $(x)f$, and then composition looks the way it ought to.

• November 30, 2010 9:45 am

Qiaochu Yuan, I too am exceedingly fond of commutative diagrams in category theory … Feynman diagrams in field theory … block diagrams in control theory … projective diagrams in simulation theory … graph diagrams in … uh … graph theory … even dataflow diagrams in “G” (which is LabVIEW’s native language).

These various diagrammatic notations can be regarded (it seems to me) as one base language, that for cultural reasons has been instantiated in multiple dialects … rather like German, French, and English.

Regarding prefix-versus-postfix notations, it is interesting that (in effect) all of the above diagrammatic languages are postfix. Perhaps this is why both you and I (and many folks) find prefix symbolic conventions to be annoying nowadays: the ubiquity of graphical representations in mathematics is conditioning readers to associate a postfix representation to any mathematically natural expression.

A great virtue of Mathematica is that it supports an eclectic blend of prefix, postfix, and infix notations, just like the mathematical literature. Very conveniently, Mathematica’s front end automatically associates common mathematical symbols to built-in operators. For example, Mathematica automatically parses the string “f : a →b” to the internal representation “Colon[f, RightArrow[a, b]]”, where the built-in operators “Colon” and “RightArrow” have no assigned evaluation rules, and thus can readily be defined to mean anything one wants. Good!

In consequence, pretty much any “Yellow Book” symbolic notation automatically parses to a valid (internal) Mathematica expression. In exploiting this freedom, our experience has been that postfix conventions turn out to be generally the most natural choice for associating symbolic expressions to diagrams.

That is why postfix mathematical notations are preferred for most purposes—preferred diagrammatically, preferred symbolically, and preferred computationally. And should this observation inspire a counter-post from ardent fans of prefix notation (or even infix notation) … well … heck … that would be fun!

• November 30, 2010 12:38 pm

I’d like to concur with the rant about the function composition notation fg or (f o g). In algebra texts discussing permutations I am never sure what AB represents exactly where A,B are permutations.
Interestingly, older functional languages like ML have the ‘o’ operator for composition, but in the language F# it was replaced by a ‘>>’ operator such that (f >> g) is the “postfix composition” of g over f.

• January 9, 2012 1:45 pm

Using the other arrow, f: B ← A makes the meaning (g∘f) obvious enough.

3. November 30, 2010 8:53 am

In a logic context, I still see people using $\supset$ for $\rightarrow$ (implication), which I find definitely confusing — it always takes a few seconds to adapt. (http://en.wikipedia.org/wiki/Material_implication)

• November 30, 2010 9:51 am

Ha! Michael, you’ve supplied the one example that Dick and I decided to leave out of my revision of his original post. The reason it’s confusing, my note would have said, is this: Logically A implies B should be a subseteq relation, exactly the opposite, insofar as it means the denotation of A is contained in the denotation of B.

November 30, 2010 10:13 am

Feynman diagrams! In terms of showing the simplicity behind what used to look highly intimidating, Feynman diagrams rank waaay up there … probably alongside “=”.

4. November 30, 2010 9:34 am

Iverson also invented floor and ceiling notation, which is pretty indispensable. The old notation [x] for the floor of x is clearly much worse.

November 30, 2010 1:06 pm

Thanks to all,

I did forget to include Iverson’s other notations. Correction: I knew some, but did not know about all. Thanks for all the comments.

5. November 30, 2010 9:40 am

I’m surprised you left out Iverson’s indicator notation. [P] equals 1 if the predicate P is true and 0 otherwise. This is one of my favourites. Knuth even thought it was worth writing half an article about:

(the other half is about Stirling numbers).

November 30, 2010 9:45 am

Numerals themselves are also notation, and I’ve always liked this from Alfred North Whitehead:

“By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and, in effect, increases the mental power of the race. Before the introduction of the Arabic notation, multiplication was difficult, and the division even of integers called into play the highest mathematical faculties. Probably nothing in the modern world would have more astonished a Greek mathematician than to learn that … a large proportion of the population of Western Europe could perform the operation of division for the largest numbers. This fact would have seemed to him a sheer impossibility … Our modern power of easy reckoning with decimal fractions is the almost miraculous result of the gradual discovery of a perfect notation. [...] By the aid of symbolism, we can make transitions in reasoning almost mechanically, by the eye, which otherwise would call into play the higher faculties of the brain. [...] It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilisation advances by extending the number of important operations which we can perform without thinking about them. Operations of thought are like cavalry charges in a battle — they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.”

–from An Introduction to Mathematics, 1911

• November 30, 2010 2:27 pm

bravo, great quote

7. November 30, 2010 10:59 am

I think it would be interesting to poll established mathematics users as well as students first learning the subjects. It could be that some notations are difficult to get a handle on (as you mentioned, bra-ket notation takes some getting used to), but eventually pay off, whereas others are obtuse at first and never really get past tolerable. Perhaps there are even some notations that are magically intuitive from the very beginning?

8. November 30, 2010 10:59 am

That’s the first time I’ve ever heard him called “Johann Gauss.” But what do you know, that actually was his legal first name! I know it’s rare to ever know everything about a topic, but somehow I’m still surprised that Gauss’s name was one of those topics for me. At any rate, “Carl Friedrich Gauss” is the usual way to refer to him.

November 30, 2010 11:14 am

At first I thought: “Who is Johann Gauss?” , but after checking I discovered that Johann indeed was one of his first names. I think he is more known as “Carl Friedrich Gauss”, though.

10. November 30, 2010 11:39 am

Iverson was also responsible for the denotation of truth as 1 and false as 0, “Iverson’s Convention”. See my (now ancient) post trying to promote the name.

11. November 30, 2010 12:37 pm

Why is improper use of asymptotic notation tolerated — even in research papers?

• December 2, 2010 4:33 am

Good question – and a fine example: on one hand, asymptotic notation (the Omicron, Omega and Theta) definitely shaped my thinking in a profound way. I believe that it does that to many students of Algorithmics (this is also a notation that Knuth found worthy of writing an article about).
On the other hand, it is a fact (which to me is puzzling) that many “Computer Scientists” do not use it properly – such as not distinguishing lower bounds from upper bounds. Personally, I do not tolerate such improper use, and point out the correct usage when I have the chance to do so.

12. November 30, 2010 12:49 pm

One field which could really benefit from a notational genius like Leibniz: my own field, Logic. For just about any concept beyond the very basics (and sometimes even those), there are many different notations and the only thing they have in common is arbitrariness. I guess it’s all part of the mutation one develops from constant exposure to Church’s Thesis

November 30, 2010 12:53 pm

“I wonder why [Dirac notation] is not used all through linear algebra.”

Well, probably because the notation is not really useful without the notion of scalar product, so it really makes most sense for Hilbert spaces. Surely linear algebra is not just about scalar product but rather about generic linear transformations.

Besides, the most interesting examples of Hilbert spaces are infinite-dimensional (at least from a physicists point of view) so this is not even linear algebra but rather functional analysis.

• November 30, 2010 7:45 pm

“I wonder why [Dirac notation] is not used all through linear algebra.”

One answer can be found in Michael Stone and Paul Goldbart recent graduate-level textbook Mathematics for Physics:

These p-forms may seem rather complicated, so it is perhaps surprising that all the vector calculus (div, grad, curl, the divergence theorem and Stokes’ theorem, etc.) that you have learned in the past reduce, in terms of them, to two simple formulae. Indeed, Elie Cartan’s calculus of p-forms is slowly supplanting traditional vector calculus, much as Willard Gibbs’ and Oliver Heaviside’s vector calculus supplanted the tedious component-by-component formulae you find in Maxwell’s Treatise on Electricity and Magnetism.

Nowadays, undergraduate-level quantum mechanics textbooks are pretty much the last citadel of undiluted Dirac bra-ket notation … most new physics textbooks develop geometry and dynamics (linear and nonlinear, classical and quantum) in the unified context that Cartan’s supple notation provides. A wonderfully entertaining polemic on this subject is William L. Burke’s unpublished (but on-line) 1996 manuscript Div, Grad, Curl Are Dead.

Sadly, Burke was killed in an accident before his book-length manuscript was published … the Stone-Goldbart textbook is (in effect) a well-written, 800-page realization of Burke’s pioneering geometric and dynamical vision.

November 30, 2010 1:35 pm

Odd: This post does not show up at the front page: http://rjlipton.wordpress.com/

November 30, 2010 1:41 pm

Thanks for the great article Mr. Lipton. I am a huge admirer of Mr. Iverson and his beautiful notation.

For another historical perspective on Iverson’s notation I highly recommend Prof. Donald McIntyre’s paper “Language as an intellectual tool: from hieroglyphics to APL”, which appeared in the IBM Systems Journal in 1991.

The article can be found here: http://www.electricscotland.com/mcintyre/index_f/menu_f/j_f/sysj_r_29.pdf

November 30, 2010 1:51 pm

I am always wondering whether all those different types of notations for operators from different fields actually are a benefit for the reader. Instead of having superscript, subscript, over, under, parenthesized, bracketed, braced, infix, postfix, prefix notations, with lots of arbitrary precedence rules, and that mean different things depending on context, it might be better to have a uniform syntax for operators.

• December 1, 2010 3:04 pm

Ken Iverson’s notational approach involved just this manner of rationalization, making consistently linear what has been a hodge-podge of typographic relationships. There are costs involved with such a drastic reformation, but there is also the shining benefit that simple rules can be applied without exceptions.

Iverson settled on a combination of prefix notation (for simple functions) and postfix notation (for higher-order functions, a.k.a. Heaviside operators.) He retained infix notation as an extension of these. He abandoned precedence rules other than those implied by these same prefix/postfix conventions.

My experience has been that the longer I work with Iverson’s notation the better I like it as a means for expressing mathematics.

November 30, 2010 3:10 pm

A very timely post. A few days ago I was having a discussion with some other programmers about the use of comprehensions in Python. They were arguing that comprehensions, while more concise, make the code less readable. My argument was that they make the code more readable, because, like Sigma notation for summation, they express an idea concisely.

November 30, 2010 3:42 pm

Linear algebra prefers Householder notation. I think it is more of a concern on who got there first, but for sure Householder is much less verbose.

November 30, 2010 5:37 pm

When it comes to “good?” notation, Principia Mathematica takes the cake: http://plato.stanford.edu/entries/pm-notation/

20. November 30, 2010 7:12 pm

I believe it was Conway who invented symbols for the trigonometric functions when he was a child. For sine, a big lower-case sigma extended over the number like a square root symbol, and for tan a big lower-case theta. I can’t recall the symbol for cosine.

• February 10, 2011 1:54 am

That was Feynman (it’s a story from “Surely You’re Joking…!”).

21. December 1, 2010 12:39 am

Never having learned much math notation when I was young, I find much of math difficult to comprehend and consequently to remember unless it is used regularly, especially notation used in multiple contexts with different meanings. Looking at the math examples given, I can’t help but think how important the geometry of the symbols and syntax is to their comprehension.

22. December 1, 2010 1:30 am

The positional (or decimal) notation for numbers is considered one of the turning points in human culture, and its absence perhaps the greatest oversight of the ancient Greeks. (I think there is a quote of Gauss to that effect.)
Great topic. I wonder which notations we could have invented had we lived in 4 dimensions with 3-dimensional books…

December 1, 2010 5:48 pm

Nice comment on a great post, but I don’t think that there is an intrinsic problem in creating 3-d notations given that we are 3-d beings. Maybe it is still a little unnatural and awkward because technology is not yet developed enough for it, but in principle it seems possible. Can you think of something that would improve if a 3-d notation were available?

25. December 1, 2010 5:33 am

I am surprised you do not mention \forall and \exists. Complex statements can almost never be understood correctly in natural language or take several sentences. This could well be extended to logical operators in general without which precise proofs are hard to write down.

As a computer scientist, Landau notation immediately comes to my mind. It is a short way to grasp the essential growth of a function without all the hassle. It is often misused, though.

December 1, 2010 7:38 am

I think the entire idea of what I call “circumfix notation” is bad: with this I mean the case when an operator on some expression x is denoted by prefixing and suffixing x with the same symbol.
Of course the most prominent example would be the popular method of denoting the cardinality of a set S by |S|. This is especially confusing when you combine it with set-builder notation using a vertical bar, e.g., |{x | 1 < x < 10}|. A clearly superior notation would be, for example, #{x | 1 < x < 10}.

December 1, 2010 8:31 am

Two well-accepted TCS-specific notations (of which there really are surprisingly few other than names of complexity classes): A ≤ B for reduction and superscripts to denote the use of oracles. The former clearly seems “good”. I am not sure about the latter.

Improvable with a minor tweak: The usual mutual information notation of I(X;Y) is symmetric but with the semi-colon looks asymmetric. Some authors write I(X:Y) which looks much better.

Often it seems that what we want to discuss are complicated processes rather than simple functions or relations and this may be the reason for a shortage of specific notations. A really accurate notation that is popular in crypto to handle this is to write probabilities that are defined based on the outcomes of specific program fragments, e.g.

Pr_{r'} [ A(C,r')=b : b <~ {0,1}; r <~ U^{p(n)}; C <- E(m_b,r) ] < 1/2 + epsilon.

28. December 1, 2010 8:56 am

Excellent article Mr. Lipton. Thank you.

I know this is slightly off topic, but I can’t resist bragging. I’ve inherited a copy of a talk given by Alan Perlis at an APL conference in 1978!

http://lathwellproductions.ca/wordpress/2010/10/10/alan-perlis-and-apl-is-more-like-french/

December 1, 2010 10:09 am

I thought that Dirac introduced his bra-ket notation in his famous book “The Principles of Quantum Mechanics” in 1930 (and not 1939)?

December 1, 2010 10:17 am

Another very powerful idea/notation contributed by Dirac is of course the Dirac delta distribution; legend has it that his entire PhD thesis consisted of a single page explaining this delta distribution.

December 1, 2010 10:27 am

How did computers change the evolution of notation?

The constant evolution of programming languages is another great, and a very contemporary example of how notation can improve our expressive and deductive efficiency, and how computers can aid with it (although in this case, they only aid in communicating with themselves). Great efforts are taken until today to further improve notation in programming.

Typesetting software also greatly simplified (part of) the process of preparing research papers for publication. Unfortunately, the fact that researchers now more often typeset their documents themselves also has some drawbacks. The effort of using a particular notation now depends very much on technical properties lying outside the scope of the particular topic. I think this tempts authors to re-use existing, easy-to-access notation, sometimes overloading it with too many unrelated meanings.

I challenge every researcher to look at their hand-written notes (or blackboard scribbles). These mind-maps, made with notation-neutral tools, often innocently invent good notation. If you used little squares with marked corners to denote a particular thing there, use them in your papers as well. Don’t call it gamma or angle-bracket-something just because it’s easier to get your typesetting software to produce that. If, because of this, more people read your paper, grasp it more quickly when scanning it, or gain insights that would otherwise be hidden in odd notation, it pays off very well.

December 1, 2010 10:53 am

Probability theory also has some nice examples of good and hugely successful notation. Take for example the notation $\mathbb{P}(X \le c)$ as a simple way to write the probability measure of the set of all $\omega \in \Omega$ such that $X(\omega) \le c$, or the notation $\mathbb{E}[X|X \le c]$ for conditional expectations, which are mathematically not a trivial construction, but can be manipulated quite intuitively in this way.

33. December 1, 2010 1:27 pm

Two comments. One is about $\subset$ vs $\subseteq$. What I personally find annoying is that in English, $\subset$ almost always means “proper subset” while in French, it almost always means “subset or equal”. I cannot remember which, but I am pretty sure other notations also have different meanings depending on the language of the text you are reading.

My second comment is kind of disrespectful: I would have suggested that you put the C/f(n) notation in the “good?” section. There is always some confusion with a division sign, so much so that the Complexity Zoo Pronunciation Guide has to specify the way it is pronounced. Moreover, this “slash” pronunciation does not mean anything. I have to acknowledge that I do not have any better idea!

December 1, 2010 1:51 pm

Anybody who is interested in the relationship between language and thought should start with the classic papers by Benjamin Lee Whorf reprinted in Language, Thought & Reality, MIT Press, 1956.

December 2, 2010 8:43 am

Tim,

Thanks for this pointer.

35. December 1, 2010 1:52 pm

I am going to offer a few thoughts on notation that are adapted from two of Edsger Dijkstra’s most celebrated essays, namely his On the Cruelty of Really Teaching Computing Science and his Go To Statement Considered Harmful.

It appears that Dick (and Ken) have not yet devoted a Gödel’s Lost Letter and P=NP post to Dijkstra’s work … any such Dijkstra-themed post would be well worth reading. As Lemony Snicket would say, students are advised to skip the rest of my post, and read Dijkstra’s essays instead.

We can transpose the dyspeptic themes of Dijkstra’s essays into a friendlier, more optimistic key by focusing upon two koan-like questions: (1) “What is the notation that has no symbols?” (2) “What are the elements that have no names?”

These questions find natural answers in the language G (of LabVIEW), as discussed in an essay by LabVIEW cofounder Jeff Kodosky, titled Is LabVIEW a general purpose programming language? (all of the essays that this post mentions are readily found on-line).

To convert Kodosky’s essay into the answers to our two koans, we exploit the following natural duality between the mathematical diagrams of category theory and the dataflow diagrams of G … keeping in mind that these diagrams don’t represent a notation … they *ARE* a notation.

The duality is: (I) category diagram nodes ↔ G dataflow lines, and (II) category diagram lines ↔ G dataflow nodes. Then the koans’ answer is that the natural elements of category theory flow along G lines that have no names, and the natural maps of category theory are instantiated by G nodes that have no symbols (but rather are associated to pictograms).

Our experience (as engineers) has been that G is the only programming language such that the thesis software of Student A can (usually) be given to Student B, without undue fuss and difficulty … and the lack of names and symbols in G is the primary reason for this portability.

The tremendous virtue of category theory as instantiated in G, is that concrete realizations can be generated simply by entering elements via controls … and the induced flow of elements is completely transparent. You wanna integrate a concrete dynamical trajectory on a symplectic manifold? No problem … just wire-up the elements and have fun.

To the best of our knowledge, no textbook has ever been written that systematically instantiates category theory in a G-like environment. A near approach is Gerald Sussman’s and Jack Wisdom’s Structure and Interpretation of Classical Mechanics (SICM). This is the same Sussman who co-authored with Harold Abelson the classic Structure and Interpretation of Computer Programs (SICP).

But alas … Sussman and Wisdom implemented their notation in the symbol-burdened and name-laden language LISP rather than the zen-like G … moreover they restricted their formalism’s domain narrowly to dynamics rather than broadly to category theory … with the result that (by Amazon sales rank at least) SICM has not found the same popularity as SICP. But theirs was a brave attempt!

It is clear that category theory’s pioneering emphasis on mathematical naturality is becoming ubiquitous in all branches of the STEM enterprise, with consequences that we can hope are consonant with Dijkstra’s vision:

Teaching to unsuspecting youngsters the effective use of formal methods is one of the joys of life because it is so extremely rewarding. Within a few months, they find their way in a new world with a justified degree of confidence that is radically novel for them; within a few months, their concept of intellectual culture has acquired a radically novel dimension. To my taste and style, that is what education is about. Universities should not be afraid of teaching radical novelties; on the contrary, it is their calling to welcome the opportunity to do so. Their willingness to do so is our main safeguard against dictatorships, be they of the proletariat, of the scientific establishment, or of the corporate elite.

Aye, lasses and laddies … now *that’s* what good mathematical notation is all about!

The above was prepared for today’s Litotica seminar, and the theme of the following seminar will also be Dijkstra-inspired, namely, “Dirac notation considered harmful.” When one focusses a seminar around a Dijkstra essay, the results are pretty much guaranteed to be lively and fun.

• December 1, 2010 8:26 pm

PS: At the seminar, my colleague Jon Jacky told us that the aphorism “APL is a mistake, carried through to perfection” is Dijkstra’s … Jon also has this funny poster of Dijkstra hanging above his desk.

• September 1, 2011 8:34 am

In June of 1975, Edsger W. Dijkstra wrote an essay called “How do we tell truths that might hurt?”, which is characterized as a series of aphorisms about computer programming languages, one of which concerns APL.

This quote has been transformed into the most disastrous anti-APL marketing campaign in history. I once counted the number of times it has been referenced in cyberspace and then decided this was a depressing track to follow. However, in his biography of Grace Hopper, Kurt W. Beyer makes the excellent case that innovation needs advocacy (see p.318 “Proactive Invention: Inventor as Salesman”).

When I wrote about this publicly, I got a little bit of heat from my father (“I always liked Dijkstra”).

December 1, 2010 6:35 pm

It’s a really interesting topic, I think.
As someone who has learnt to write down things very precisely from functional analysis lectures, I’m often annoyed with abuses of notation, especially in stochastics. I admit, it is handy to write {f < 0} for {ω∈Ω | f(ω) < 0}, supposing the space Ω you’re talking about is fixed at the moment. But then probabilists overdo (imho) and write things like {lim X_n = 0}.
Also, freezing of variables is probably a widely used notational technique. So Ω is often used as the symbol for a sample space, whereas in functional analysis, X usually denotes some Banach or topological vector space, while in differential geometry, X might be the standard symbol for some vector field. But as long as one stays within a certain (sufficiently small part of a) field of mathematics, symbols like that seldom clash. It’s different for objects used throughout most of mathematics, such as integers. Mostly, i, j, k, n and m are letters denoting integers, such as indices. Hardly anyone would write “let n be a complex number”, but in another context, n might also denote some normal vector, which is why in this case I would always decorate n somehow.
Another thing is the use of different fonts and upper and lower case letters. Sets are usually denoted by standard upper case letters, if they contain “basic” objects such as numbers, vectors etc. Having those sets and introducing sets of “higher order”, so power sets of those sets, some people resort to script upper case letters. This can of course be iterated with other fonts.
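
To make the complaint about {lim X_n = 0} concrete, here is roughly what that shorthand abbreviates. This is a sketch assuming the X_n are real-valued functions on Ω; the indices k and N are introduced here for illustration and are not used above.

$\displaystyle \{\lim X_n = 0\} = \{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = 0\} = \bigcap_{k \ge 1} \bigcup_{N \ge 1} \bigcap_{n \ge N} \{\omega \in \Omega : |X_n(\omega)| < 1/k\}.$

The right-hand side is the form one would actually use to check that the set is measurable, which the compact notation leaves entirely implicit.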

December 2, 2010 1:11 am

Great article and comments — notation is an awesome topic! I really enjoyed reading Knuth’s article, in which he echoed the Whitehead quote: “I like to have mechanical operations like this available so that I can do manipulations reliably, without thinking.”

Every teenager knows that emoticons rank among the most powerful symbols

I guess parentheses, absolute value |x|, the decimal point, percents, commas, bit operators (especially ~ ^ | &) and conjunction/disjunction are all too obvious to mention? Many more examples abound in programming languages, such as: quoted strings, curly braces, and indentation.

Niklaus Wirth’s EBNF has had a huge impact on how we reason about regular languages and has become part of every modern programming language via regular expressions.
http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form

December 2, 2010 3:09 pm

I think it would be remiss not to mention orbifold notation (Thurston et al., popularized by Conway).

By understanding how to construct the concise “names” of the symmetry groups that describe periodic tilings of the strip, plane, sphere, space etc., one can follow a very natural path to understanding those groups and why there are only so many of them. This is presented wondrously and accessibly (with many pretty pictures) in “The Symmetry Of Things” by Conway et al. This synthesis of knowledge is surprisingly recent when compared to how long we’ve been looking at tiling patterns as humans. By understanding the link from symmetry to topology of the pattern, one even begins to understand why particular symbols are used in the notation (which seem whimsical, at first).

As crystallographers, we learn about symmetries of tilings, and they are useful (although not really necessary to understand deeply, given the amount of automation present these days). I used to be very frustrated that I could never really understand why there were only 17 ways to tile the plane and only 230 to tile space, and proofs using “semi direct products” were simply too opaque to me. Orbifold notation helps emphasize that it is the connection to topology that is key to understanding things. Other notations (Schönflies, Hermann–Mauguin) emphasize the generators of the group, and it is difficult to understand why particular generators were chosen for use in the name.

I often find myself in hotels and determining the orbifold name for the wallpaper or carpet pattern.

To answer your question, I think good notation helps make certain mathematics a *lot* easier. And I think I would rather invent a new notation than discover (or prove) a great theorem (that is probably a very good way to partition different types of people, come to think of it). I think that good notations are the result of someone who has gained a deep understanding of a subject, and some empathy for those who might want to learn it.

39. December 2, 2010 6:36 pm

This is a wonderful thread with some wonderful posts … the above post by maneesh on orbifold notation is really excellent.

Regarding notation, how many folks have seen the Coen brothers’ recent movie A Serious Man? There is a scene in that movie that, for quantum systems engineers in particular, hilariously depicts the perils of even the most elegant mathematical notation.

The accompanying dialog is:

Professor Larry Gopnik:… The Uncertainty Principle. It proves we can’t ever really know… what’s going on. So it shouldn’t bother you. Not being able to figure anything out. Although you will be responsible for this on the mid-term.

Please let me say that I vehemently reject this movie’s point-of-view and yet I dearly cherish the movie itself!

But nonetheless, for a world of 21st century systems engineering, in which most enterprises blend discrete-with-continuous, classical-with-quantum, macroscale-with-microscale, and deterministic-with-stochastic, it’s undeniably a pretty tough challenge to create valid, verifiable system models … and no one notation is going to do the whole job.

For system-level engineering (AFAICT) the path forward will be via minimalist, visual formalisms (category theory? block diagrams? LabVIEW G-type languages?) that are naturality-oriented and (nearly) symbol-free and variable-free. Because when it takes a stack of books 4000 pages high just to define the formalism(s) that go into a system-level dynamical simulation, is any other path forward feasible?

40. December 4, 2010 10:53 am

The notation used for logarithms may be a significant part of why people have trouble understanding them. There are great discussions about this at The Exponential Curve, written by Dan Greene, and at f(t), written by Kate Nowak. I love Dan’s big L notation.

I also wrote just last week about how the words and symbols used in math can introduce a lingering fear that makes it harder to learn a topic. ‘Eigen’ did that to me.

41. December 4, 2010 9:50 pm

Great post. It took me through the different notations I used in college in a breeze. When I looked at the notation, I envisioned different problems I toiled over back in the day. So yes, I think notation influences the way we think.

42. December 7, 2010 10:51 am

I have to jump in with a favorite awkward notation…

$\sum^{1}_{i_1,\ldots,i_n=0}\psi_{i_1}(a_1)\cdots\psi_{i_n}(a_n)$

Where $i_k \in \{0,1\}$.

I think this is sufficient context to see what the original author was driving at… I’m curious to hear what others think about this one. I’m also curious to know if it’s a common idiom. I don’t have more than an undergraduate training in mathematics so I wouldn’t know.
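
For readers unsure how to read the sum, one plausible interpretation is that every index $i_k$ runs independently over $\{0,1\}$, giving $2^n$ terms. The Python sketch below spells that reading out and checks it against the factored form $\prod_k (\psi_0(a_k) + \psi_1(a_k))$. The function names and the sample $\psi$ functions are hypothetical, chosen only for illustration; the original author's $\psi$ is not known here.

```python
from itertools import product
from math import prod


def multi_index_sum(psi, args):
    """Sum over all (i_1, ..., i_n) in {0,1}^n of psi[i_1](a_1) * ... * psi[i_n](a_n)."""
    n = len(args)
    return sum(
        prod(psi[i](a) for i, a in zip(idx, args))
        for idx in product((0, 1), repeat=n)  # all 2^n index tuples
    )


def factored_form(psi, args):
    """The same quantity written as a product of two-term sums."""
    return prod(psi[0](a) + psi[1](a) for a in args)


# Hypothetical example functions (an assumption, not from the comment above):
# psi_0(a) = 1 and psi_1(a) = a.
psi = (lambda a: 1.0, lambda a: a)
args = [2.0, 3.0, 5.0]

total = multi_index_sum(psi, args)
factored = factored_form(psi, args)
print(total, factored)
```

With these sample choices both expressions come to (1+2)(1+3)(1+5) = 72, which suggests the compact multi-index sum is a shorthand for expanding a product of binomials.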

December 10, 2010 3:06 am

This is indeed more than common, I’d say almost ubiquitous in CS papers.

January 14, 2011 11:07 pm

A good notation is one thing, a computer language is another. Iverson and the teams of APL implementors, over the last 40 or so years, rather successfully fused the two things together. The resulting language offered a very high level of abstraction, thus when writing programs, many of the tedious details of programming simply boiled away into the vacuum of space, leaving a language which nicely supported and buttressed problem-oriented thinking.

What followed in the mainstream, possibly due to the exquisite timing of the development of the C language followed by the mass emergence of personal computing, is that a computing culture which embraces detail developed and took hold of mainstream thinking. One look at “newer” languages like C# and Java, languages which from an evolutionary perspective aren’t that far away from C, and it’s clear that a high level of abstraction is simply not valued by this culture.

Also, diversity in computing is going away. It’s best to know both languages, C# and Java.

One would hope that ideas which made APL great for problem solving would eventually make their way into the mainstream.

• January 15, 2011 3:22 pm

Except that APL is some sort of a “write only” programming language: APL code can be a mystery even to its own author after a while.

• April 17, 2012 11:42 am

Even though it’s more than a year later, I can’t abide leaving this cheap shot as the last word. It’s usually made by someone who’s made no effort to learn the language.

As someone who programmed in APL for many years, as well as in numerous other languages, I think I can speak from a position of knowledge rather than one of ignorance when I dispute this. Poorly-written code in any language is hard to understand; many languages other than those in the APL family further obscure an algorithm by burying it under a load of unnecessary verbiage.

Conversely, the density of code written symbolically often requires more time to understand the meaning of a single line of it – which may accomplish quite a lot – as opposed to the very short time it takes to read a line in a more vacuous language which accomplishes very little.

• September 4, 2012 5:00 pm

I would second DevonMcC’s remarks. I’ve programmed in terse languages like APL and J for years but I’ve also waded deep into the C/C#/C++ pools and frankly the programming language is never the barrier when it comes to decoding someone else’s code. If the work is a product of superior programmers there is always a clear pattern and theme. If it’s a hodge-podge hacked together under duress the resulting mess will constantly frustrate your efforts to grasp it. Notation always cracks under a heavy load. Programmers have learned this the hard way.