The Matthew Effect

Robert Merton is not a theorist, but was a sociologist. He created terms that seem to have been always with us, including “unintended consequences,” “role model,” and “self-fulfilling prophecy.” For this and other seminal work he was awarded the National Medal of Science in 1994, among many other awards and honors.

Today I want to talk about the Matthew Effect, or: why is everything named after Gauss? The Matthew Effect is:

“the rich get richer and the poor get poorer.”

Those who possess power often can use it to get more power, and on and on. Merton named his concept after a verse in the Gospel of Matthew:

For to all those who have, more will be given, and they will have an abundance; but from those who have nothing, even what they have will be taken away. —Matthew 25:29, New Revised Standard Version.

A related phenomenon is Stigler’s Law of naming stated first by professor Stephen Stigler in 1980, which says: No scientific discovery is named after its original discoverer. Stigler named Merton as the discoverer of “Stigler’s Law,” thus creating a diagonal type statement that complexity theory uses all the time.

Another related phenomenon is Graham’s Law stated years ago to me by Ron Graham: If you win ${k}$ awards, then you will win at least ${n-k}$.

One Example

${{\bullet}}$ A Hadamard Matrix: Such a matrix was invented by James Sylvester in 1867, who called it the wonderfully complex name, an “anallagmatic pavement”—here “anallagmatic” refers to the matrix being virtually its own inverse. Almost thirty years later it was discovered by Jacques Hadamard. The defining property is that it is a matrix of ${\pm 1}$‘s and satisfies

$\displaystyle H H^T = nI.$

A bit more precisely Sylvester discovered the recursive family of Hadamard matrices that are of great importance today. Let ${H_2}$ be

$\displaystyle \left[ \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array} \right].$

Then the given an ${H}$, the following matrix is a Hadamard matrix of twice the size.

$\displaystyle \left[ \begin{array}{cc} H & H \\ H & -H \end{array} \right].$

An important open problem is, are there Hadamard matrices of all orders ${4n}$? This is usually called the Hadamard conjecture. It probably should be named after Raymond Paley, who in 1933 found another general method to create Hadamard matrices that works (for example) when ${q+1}$ is a prime power. Oh well.

Gaussian Examples

Carl Friedrich Gauss is so famous that he probably is one the best examples of the Matthew effect in action.

The famous Gaussian distribution is named after Gauss, of course. Many believe that it should be credited to Abraham de Moivre, who in 1738 published the second edition of his “The Doctrine of Chances.” Probably—no pun—it is at best unclear, de Moivre did not seem to completely understand what he had. The biggest problem was he lacked a formal notion of a probability density function.

The famous method of Gaussian elimination for solving linear systems is perhaps one of the single most important algorithms in the world. It was known to the Chinese mathematicians as early as 150 BCE; it was commented on by Liu Hui in the 3rd century. Issac Newton in 1670 supplied essentially this method. Apparently the naming of the method after Gauss only happened in the 1950’s after confusion over the history.

In the 1800’s hyperbolic geometry was discovered by János Bolyai and Nikolai Ivanovich Lobachevsky, after whom it sometimes is named. Gauss knew about it a few years before, but never published his discovery. At least the name is joint and not just due to Gauss.

Paul Erdős—Not?

The mathematician of recent memory with the widest Gauss-like sweep of results is undoubtedly Paul Erdős. Surely if we wish to observe the “Matthew Effect” in action, he is the best single person to track. Yet we cannot think of a single instance of his receiving undue credit for a theorem, or even a definition, and his single credit for the “Probabilistic Method” seems merited—although others used a similar method around the same time he first did. Perhaps this owes to his generosity in being the most prolific co-author in history.

However, we can point to the effect with “his” maxim: A mathematician is a device for turning coffee into theorems. Erdős repeated this phrase often, but according to several sources, and as echoed in Wikipedia’s biography of Alfréd Rényi, the phrase originated with Rényi.

Erdős may instead be exemplifying the “Lazarus Effect’: according to this list he has published 34 papers since 1999. Allowing 2–3 years as a typical upper bound on gestation for a journal paper, this would seem to indicate a fair bit of effort since his death in 1996. It appears not to be true that he got a paper with Adam Tauman Kalai, Ankur Moitra, and Gregory Valiant into STOC 2010 (about learning Gaussians), even though they suggested that was the case.

How It Happens

Let’s call the first discover the finder and the person who later gets it named after them the namee. Note there are several ways finders do not become namees.

${\bullet }$ Sometimes the finder does not publish it, or does so in an odd place. Perhaps this will become less of an issue in the future, but it has happened many times this way.

${\bullet }$ Sometimes the finder does not have a correct proof of the result and the namee does.

${\bullet }$ Sometimes the finder does not really know what they have. The namee does, and so perhaps they really should get the credit.

${\bullet }$ Most often it seems that it is the Matthew Effect at work: the namee is the more famous person and gets the credit.

Wikipedia’s List

Here is a partial list of misnamed theorems from our friends at Wikipedia. See it for the details—I will just give the finder and the namee.

${{\bullet}}$ Benford’s Law. The finder is Simon Newcomb in 1881 and the namee is Frank Benford in 1938.

${{\bullet}}$ Bézout’s Theorem. The finder is Isaac Newton in 1665, who only stated it. The proof was found much later by several mathematicians including Leonhard Euler as well as Etienne Bézout. This is one case where the namee beat out a more famous person, Euler. There is hope. At least in some cases the “Matthew Effect” is reversed.

${{\bullet}}$ Burnside’s Lemma. The finder was Georg Frobenius and the namee was William Burnside, who perhaps won because he included it in his textbook.

${{\bullet}}$ Frobenius’ Theorem. The finder was Feodor Deahna in 1840, who stated it and proved it. Frobenius cited the earlier work, but still became the namee in 1875.

${{\bullet}}$ Morrie’s Law. The finder was Richard Feynman. He named it explicitly after a childhood friend, Morrie Jacobs—he said that he learned it from Morrie.

${{\bullet}}$ Pell’s Equation. The finder was probably Brahmagupta in 628. Fermat made it a challenge problem to others in 1657. In those days it was common for a mathematician to prove something, and then challenge colleagues to try and solve it. The namee was believed to be created by Euler who confused John Pell with another, William Brouncker. Thus we have multiple errors, an example of misnaming due to cultural barriers and lack of the ability to find information easily.

${{\bullet}}$ Stokes’ Theorem. The finder appears to be William Thompson, Lord Kelvin. The namee is George Stokes who probably got the credit since he included it often on his analysis examinations.

Open Problems

What are your favorite examples of this phenomenon?

1. April 8, 2011 11:26 pm

Do click on the word “maxim” for the “coffee” quote.. 🙂

2. April 8, 2011 11:44 pm

Erdős and Rényi do get somewhat unfair credit for the Erdős–Rényi model of random graphs. Not that they didn’t invent it — they did, at least in the version with a fixed number of edges. But Edgar Gilbert invented the version in which the edges are independent (now also called the Erdős–Rényi model) at about the same time, independently, and gets much less credit for it.

April 9, 2011 12:52 pm

The G in G(n,p) is in his honor.

April 9, 2011 3:03 am

The “von Neumann architecture” is named after John von Neumann, but its discovery preceded him, both in theory(Turing, Zuse) and in practice (Eckert, Mauchly). For more: http://en.wikipedia.org/wiki/Von_Neumann_architecture#Development_of_the_stored-program_concept .

4. April 9, 2011 4:14 am

Pythagoras’ theorem? Who knows!

L’Hopital’s Rule is one, supposedly. What’s more it went the wrong way for the Matthew effect: away from the famous Bernoulli, to the lesser known L’Hopital.

The Argand diagram is another.

There are lots!

April 9, 2011 7:08 pm

L´Hopital’s was apparently a bought result, maybe that’s why it defies the “natural” effect.

April 9, 2011 5:22 am

Eisenstein’s criterion, once known as the Schönemann-Eisenstein criterion – see this fascinating article by D.A. Cox
http://www.cs.amherst.edu/~dac/normat.pdf

April 9, 2011 6:44 am

Cooley-Tukey FFT. The finder was Gauss and the namees were Cooley and Tukey.

April 9, 2011 12:03 pm

Although to be fair, it was Cooley and Tukey that did the asymptotic analysis and showed that the running time is O(n log n). Gauss easily could have done this, but this sort of question didn’t occur to him.

April 9, 2011 6:46 am

There is an enormous list of 20th century theorems proven first behind the Iron Curtain and thus not credited properly in the West. V. Arnol’d would make a point of mentioning a whole bunch of these in the few lectures of his I saw.

Mathematicians don’t tend to use acronyms nearly to the extent that other fields do, and instead use people’s names as adjectives, even if the people would have little idea what is being defined. My favorite example is the equivariant Euler class of a vector bundle, which generalizes
— the Euler class of a vector bundle, which generalizes
— the Euler class of the tangent bundle, which generalizes
— the Euler characteristic of a manifold, which generalizes
— the Euler characteristic of a surface, which generalizes
— the Euler characteristic of the plane, which is 1.
Euler’s contribution to the subject was to prove V-E+F=1 for a planar graph.

• April 15, 2011 3:18 pm

As it happens, in the big scheme of things, I actually think Euler’s contribution to the subject is the lion’s share.

April 9, 2011 8:32 am

What about Newton-Raphson method. Was Raphson a contemporary of Newton or this is some different Newton? There are no commonly known theorems of Raphson other than this, but I could be wrong.

April 9, 2011 9:06 am

“Cook’s Theorem”, as often credited (including on this blog) was also proven by Leonid Levin

• April 9, 2011 10:15 am

Thanks. We’ve resolved that, following our recent post here. My CRC Handbook chapters with Eric Allender and Michael Loui on computational complexity also switched to “Cook-Levin Theorem” in the recent edition.

April 9, 2011 12:12 pm

Boruvka’s algorithm for Minimum Spanning Tree preceded Prim’s and Kruskal’s.

April 10, 2011 6:01 pm

You mean Jarník’s and Kruskal’s.

• April 10, 2011 11:08 pm

I used to think that this Bernoulli person was the greatest genius that ever lived, as so many things were named after him. Only later did I come to know that there were up to 7 mathematical Bernoullis in the family.

April 9, 2011 4:32 pm

A lot of what was attributed to Newton was discovered by earlier astronomers, physicists and mathematicians, including the inverse square law (as Newton himself revealed!), and the principle of inertia (that “Newton’s First Law” was rather discovered by the mathematician, physicist and philosopher Buridan, 350 years earlier).

The Matthew effect, or rather plutocratic effect, seems caused by a neurological will to simplify, and a less glorious tendency to adulate strong leaders.

April 9, 2011 7:23 pm

There is this one:

http://en.wikipedia.org/wiki/Stigler%27s_law_of_eponymy

April 9, 2011 7:58 pm

The cause of the Erdős Lazarus effect is explained here: http://xkcd.com/599/
Seems everybody wants a joint paper with Erdős.

April 10, 2011 4:09 am

That’s the one cartoon I’ve got on the wall in my office.

April 10, 2011 11:00 am

The Erdős Number (http://en.wikipedia.org/wiki/Erdős_number) has been defined to indicate the length of the shortest path from an author to Paul Erdős, in the graph representing the “has written a paper with” relation.

April 10, 2011 3:25 am

Dear Professor Lipton, there is a slight typo in your reference to Lord Kelvin, the correct name is Thomson (no “p”).

April 10, 2011 3:38 am

Kolmogorov complexity was actually by Solomonoff, and acknowledged as such by Kolmogorov:

http://en.wikipedia.org/wiki/Kolmogorov_complexity#History_and_context

April 10, 2011 6:37 am

Burnside’s Lemma (aka the orbit-stabiliser theorem)

The misattribution annoyed Peter Neumann so much that he wrote a paper called “A Lemma that is not Burnside’s”

• April 10, 2011 10:56 am

I was taught undergraduate mathematics by the wonderful Peter Neumann, and my group theory notes are full of references to NotBurnside’s Lemma!

April 10, 2011 12:42 pm

Here are some more examples:

* Gauss divergence theorem in its general form is due to Ostrogradski. (Sometimes, it is known as Gauss-Ostrogradski theorem.)

* Polya’s counting formula was published earlier by Redfield.

* Closer to home, Chomsky-Schutzenberger theorem is not due to them. Jeff Shallit stated this in a talk, but I don’t remember who it is due to.

April 10, 2011 5:57 pm

* Gaussian elimination — independently described in print by Isaac Newton and Michael Rolle, but likely known since antiquity; published examples date back at least to 2nd century China.

* Dijkstra’s shortest-path algorithm — published by Leyzorek, Gray, Johnson, Ladew, Meaker, Petry, and Seitz in 1957, by Danztig in 1958 (as a special case of network simplex), and finally by Dijkstra in 1959.

* Bellman-Ford shortest-path algorithm — published by Shimbel in 1955, by Moore in 1957, by Woodbury and Dantzig in 1957, and by Bellman in 1958. Ford never described this particular algorithm, although he did describe the general strategy of edge-relaxation in 1956.

* Prim’s algorithm for minimum spanning trees — described by Vojtěch Jarník in a 1929 leter to Boruvka. Independently rediscovered by Kruskal in 1956. Prim’s 1957 paper properly cites Kruskal, as do later papers by Loberman and Weinberger in 1957 and by Dijkstra in 1958.

* Coley-Tukey FFT algorithm — published by Runge and König in 1924, then by Yates in 1932, then by Stumpf in 1937, then by Danielson and Lanczos in 1942, and finally by Coley and Tukey in 1965. Reportedly used by Gauss in the 1800s.

* Gale-Shapley stable marriage algorithm — used in the late 1940s by a regional resident matching program in the Boston area, and adopted by the National Resident Matching Program in 1952. Gale and Shapley published their proof of correctness in 1962.

* Edmonds-Karp (shortest augmenting path) maximum flow algorithm — described in Ford and Fulkerson’s original 1956 paper, and analyzed by Dinic in 1970. Edmonds and Karp’s weaker analysis was published in 1972.

* Čech complexes — first described by Pavel Aleksandrov in 1927, and later by Edward Čech in 1932.

* Markov’s and Chebyshev’s inequalities — Both due to Irénée-Jules Bienaymé. Specifically, “Markov’s inequality” first appears in works of his advisor Chebyshev, as a corollary of “Chebyshev’s inequality”, which was actually formulated by his friend and colleague Bienaymé.

* Gray codes — Known since antiquity. Described in 1550 by Gerolamo Cardano as a solution to the “Cardano’s Rings” puzzle (which apocryphally dates back to 2nd-century China, but is described in print by Luca Paciolli circa 1500); used for telegraphy by Emile Baudot in 1878; described in a 1947 patent application by Frank Gray.

* Hamming distance — Known since antiquity. First used by Ug the Neanderthal, who noticed that his brother Og had replaced two of his special “hitting rocks” with different “hitting rocks”. Hamming proved that it’s a metric.

* Stigler’s Law of Eponymy — Known since antiquity. Deliberately misattributed to Robert Merton by Stephen Stigler in 1980, despite an earlier statement by *Stephen Stigler’s own father* George Stigler in 1966 (“If we should ever encounter a case where a theory is named for the correct man, it will be noted.”).

April 11, 2011 7:19 am

In set theory there is a notion of an Erdős cardinal. When asked about it by a former colleague, Paul Erdős replied that he did not know what it was, and he did not define it.

April 11, 2011 11:16 am

What does Graham’s law mean? the more prizes you get, the less prizes you will get in the future?

k=5
n=20
n-k=15

k=10
n=20
n-k=10

Is this wrong?

April 11, 2011 2:10 pm

John,

The assumption is that n >> k. So if you get over the threshold k, you will get almost all.

April 17, 2011 7:03 pm

Thanks for the explanation. What if k=0, will I get all the prizes then? No matter how big n is, it seems the bigger k is, the less prizes I’ll get.

21. April 13, 2011 9:26 am

Rudrata cycle/path -> Hamilton cycle/path (mentioned in the Dasgupta/Papadimitriou/Vazirani book), although Rudrata seems to have studied it only in the context of knight’s tour on the chessboard.

April 14, 2011 10:28 pm

In support of Allen Knutson’s point: The Li-Yorke theorem in dynamical systems, which gets over 100,000 google hits, is an extremely special case of Sharkovsky’s theorem (about 5,000 hits), which was published 11 years earlier in an obscure Ukrainian journal.

April 16, 2011 12:12 am

How about a cross-effect? Tychonoff’s Theorem was fully proven by Čech (Tychonoff only had a limited version, I think for spaces of the form 2^X). On the other hand Čech cohomology as we understand it today is only partially due to Čech; in full, it was given by — yes, Tychonoff.

Rudin mentions this in one of his books but I don’t recall which nor do I have them handy.

May 24, 2011 1:21 pm

The great Sanskrit grammarian Panini described Sanskrit grammar using formal languages and production rules in 400 BC. It is still the most influential Sanskrit ever written, but gets no credit for his discovering formal languages and production rules in math/CS literature/books!

Ingerman, P.Z. 1967 “Panini-Backus Form Suggested,” Comm. ACM 10, 3, p. 137

December 14, 2011 8:29 pm

a clear example for me:

http://en.wikipedia.org/wiki/Simion_Stoilow

Stoilow is considered the author of the inverse function theorem on complex case (it is taught in the second or 3rd class of every course of analysis.)

I saw only once in my life a note that he is the author of the thorem into a book of W Rudin.

His name is not known probably due to the political system.

You did not mention the political system as a cause for Matthew effect. So, I ask myself : this theorem is really decidable ?
If so, in which system of axioms ?