Skip to content

Why Check A Proof?

April 24, 2019

Why check another’s proof?

[Russell and Whitehead ]

Bertrand Russell and Alfred Whitehead were not primarily trying to mechanize mathematics in writing their famous book. They wanted to assure precision and certainty in proofs while minimizing the axioms and rules they rest on. They cared more about checking proofs than generating theorems. By the way: They are listed in the order Whitehead and Russell on the book. See this for a discussion about the importance of the order.

Today Ken and I thought we would add a few more thoughts on why proofs get checked.

We discussed those who claim proofs in our previous post. Once a proof is claimed, it needs people to check it. This is not as fraught as the replication crisis in other sciences where “proof” is a statement of statistical significance whose most intensive check needs repeating the experiment.

If you do a Google search on “why check proofs” you get lots of hits on using automated proof checkers. Coming on eleven decades after the publication of Russell and Whitehead’s three-volume opus Principia Mathematica, these are still in their formative years. We covered a major system of this kind some years ago.

We are personally more interested in what motivates us humans to check proofs. We believe that there are various factors that make it less or more likely to find a good human checker. So today we will try to list some of them.

Why Check A Proof?

One of the questions that was raised by some commenters to our recent post is: Why should I check your proof?

This is a critical question. If their is no reason to check your proof, then your result will not get checked. It is almost a tautology. We like this question and thought we could suggest several ways to increase the likelihood that one will check another person’s proof.

So lets assume that Alice is claiming some new theorem {T} and we ponder whether Bob will spend time checking it.

Bob has to

This happens when Bob is required to check her proof. This can happen if Bob is a referee of her paper. It could also be when Bob is hired to do this task. It usually is a weak reason for making someone do the checking. In real life we think that it is unlikely to be a strong motivator.

Bob wants to

This happens when Bob feels that he will benefit from checking. The main type of situation here is: Alice’s theorem {T} uses some new method or trick. If Bob believes that this method can be used in his work, in his research, in his future papers, then he is strongly motivated.

We are all very self-centered in our research. If we think we could in the future use your method we are likely to spent time and energy on your proof. Thus if Bob is convinced that Alice has a some new ideas, he is much more likely to spent the time checking her theorem. This means that Alice should—if possible–explain that her proof of {T} uses something new. Proofs that are “just technical inductions” are very unlikely to get Bob to read them. In many areas some authors have stated things like: The proof is a careful induction… This is not a good idea.

Bob needs to

This happens when Bob has some “skin” in the game. A classic situation is when Bob has an earlier result that is affected by Alice’s new theorem. If {T} is stronger than Bob’s previous result, then he is motivated to check her theorem. Or if {T} shows that his earlier theorem is false, this is a very strong motivation. Or perhaps Alice has proven a lemma that enables Bob to push something through.

Skin in the Game

Often we have situations where you do have skin in the game. An old example that comes to mind is from group theory. The problem is a natural question about a class of groups: Let {B(m,n)} be the class of groups that are generated by {m} elements and all elements in the group satisfy, {x^{n} = 1}. Sergei Adian and Pyotr Novikov proved that {B(m, n)} is infinite for {n} odd, {{n \ge 4381}} by a long complex combinatorial proof in 1968. This is a famous result.

Shortly after another group theorist, John Britton, claimed an alternative proof in 1970. Unfortunately, Adian later discovered that Britton’s proof was wrong. I do not have first-hand information, but I was told that Adian was motivated by wanting to have the proof. He worked hard until he discovered an unrepairable bug in Britton’s 300-page monograph. The proof was unsalvageable.

A much newer example is from a recent book by Shing-Tung Yau, The Shape of a Life. He is a famous geometry expert and has made many important contributions to many areas of mathematics. We will probably discuss his book in detail in the future, but for today it has a neat example of “skin in the game”. He writes about an enumeration problem of counting how many curves lie on a certain manifold—a century old problem. One group used a clever trick to get the number

\displaystyle  317,206,375.

However another group discovered via a different method that the count was

\displaystyle  2,682,549,425.

Somewhat a different count—not even close. Clearly, both sets of authors were heavily motivated to check their work. And within a month the larger count was found to be wrong and the first was correct.

An Unresolved Claim

This is from the wonderful P vs NP pages of Gerhard Woeginger. It was pointed out to us by the commenter gentzen. Quoting Woeginger’s page, including its use of “showed”:

In February 2016, Mathias Hauptmann showed that P is not equal to NP. Hauptmann starts from the assumption that P equals {\Sigma_{2}^{p}}, proves a new variant of the Union Theorem of McCreight and Meyer for {\Sigma_{2}^{p}}, and eventually derives a contradiction. This implies P not equal to NP.

Woeginger gives a link to Hauptmann’s paper, “On Alternation and the Union Theorem,” and thanks two people who communicated this to him.

The union theorem of Albert Meyer and Edward McCreight is the classic theorem that shows how to encode many complexity classes into one. Hauptmann’s idea is not unreasonable. He makes an assumption that P=NP and tries to use it to improve the union theorem. This is a nice idea: Make a strong assumption and then try to improve a deep result. The hope is that this will lead to a contradiction. His abstract ends by saying, “Hence the assumption {P = \Sigma_{2}^{p}} cannot hold.” We do not know if this paper has received a thorough reading. Update: We have learned that a pair of experts reviewed the argument and found that part of it implied a contradiction to the deterministic time hierarchy theorem, while another part relativizes in a way that would yield a false statement under certain oracles.

A Resolved Claim

Hauptmann is a colleague of Norbert Blum at the University of Bonn. Two years ago, Blum claimed to prove P {\neq} NP by making technical improvements on a well-known circuit-based attack from the 1980s and 1990s. He has had a long track record of expertise and reliability in this area and his paper was read right away.

The reading was helped by his paper being well-organized, straightforward, and relatively short—the crucial segment was under ten pages. The news broke while we were preparing a post on the August 2017 total solar eclipse in the US. In the 24–48 hours it took us to modify our post, we were already able to draw on several accounts by first-responder readers and check those accounts ourselves against the paper.

The error was triangulated in an interesting way. It was first observed that if Blum’s attack could succeed by the means and premises stated, then it would extend to prove something else that is known not to be true. Once this was ascertained, a closer reading was able to zero in on the exact technical point of error. Blum soon acknowledged this and that the breach was unfixable. The attempt still combines circuit theory and graph theory in ways a student can benefit from learning about, and this furnished its own incentive to read it.

Open Problems

We appreciate the comments on the previous post and hope this adds some additional insights.

[added update about Hauptmann’s paper]

16 Comments leave one →
  1. Geoffrey Irving permalink
    April 24, 2019 10:22 am

    The taxonomy of why to check proofs ignores an important global reason: the existence of proof checkers (human or machine) means that less care is required on the part of the prover. It would still be possible (at least in theory) to do reliable mathematics as a human individual if no other humans read the proof, but it would require writing out a much larger fraction of the details and would take much more time. And this is an even larger part of the motivation for machine proof checking: strong machine theorem proving would allow mathematicians to write down fewer details in the course of proof and theory exploration, and thus have more time to try to understand and prove more things.

    • rjlipton permalink*
      April 24, 2019 11:43 am

      Dear Geoffrey:

      This is an interesting take on why proof machines. I never really thought it would free us to do more. I have to think about this point of yours. I think the main issue has been for me to be able to avoid errors, especially in proofs with many cases. I have personally sometimes proved something and then worried that it seemed too good a result. This caused me to check it many times, until I was able to declare it really was “obvious”.


  2. Kevin Compton permalink
    April 24, 2019 11:03 am

    There is an excellent article by Daniel O’Leary entitled “Principia Mathematica and the Development of Automated Theorem Proving” in Drucker T. (ed.) Perspectives on the History of Mathematical Logic, Birkhäuser 1991 (available as an ebook). It discusses the motivations of various researchers who developed computerized theorem provers for Whitehead’s and Russell’s system. One of them was Simon and Newell’s Logic Theory Machine. Simon wrote to Russell in 1956 about their results. This was Russell’s reply: “I am delighted to know that Principia Mathematica can now be done by machinery. I wish Whitehead and I had known of this possibility before we wasted ten years doing it by hand. I am quite willing to believe that everything in deductive logic can be done by machinery.”

    • rjlipton permalink*
      April 24, 2019 11:39 am

      Dear Kevin:

      Thanks for pointing this out. The work of Russell and Whitehead—or correctly in the order Whitehead and Russell—was always I thought an existential proof. They showed that math could be formalized actually. Of course their system has some
      issues that make it quite tedious: Recall 1+1=2 takes quite a serious proof.


  3. Bhupinder Singh Anand permalink
    April 25, 2019 8:45 am

    Perhaps we may need to clarify what we mean by a ‘proof’ in this context.

    Reason: Although the PvNP problem is essentially a problem about the nature and behaviour of recursive number-theoretic functions and relations over the domain of the natural numbers, its standard formulation is currently in set-theoretical terms; and its solution implicitly seeks a, presumably formal, set-theoretical proof (implicitly over the domain of any putative interpretation of a set theory such as ZF).

    If so, we may need to recognise that classical conventional wisdom, based on Hilbert’s approach to, and development of, proof theory, fails to adequately distinguish that:

    (1) Whereas the goal of classical mathematics, post Peano, Dedekind and Hilbert, has been:

    (a) to uniquely characterise each informally defined mathematical structure S (e.g., the Peano Postulates and its associated classical predicate logic);

    (b) by a corresponding formal first-order language L, and a set P of finitary axioms/axiom schemas and rules of inference (e.g., the first-order Peano Arithmetic PA and its associated first-order logic FOL);

    (c) which assign unique provability values (provable/unprovable) to each well-formed proposition of the language L;

    (2) The goal of constructive mathematics, post Brouwer and Tarski, has been:

    (a) to assign unique, evidence-based, truth values (true/false) to each well-formed proposition of the language L;

    (b) under a constructively well-defined interpretation I over the domain D of the structure S (when viewed as a `conceptual metaphor’ in the terminology of Lucas and Nunez’s ‘Where Mathematics Comes From’);

    (c) such that the provable formulas of L are true under the interpretation.

    In other words, whilst the focus of proof theory can be viewed as seeking to ensure that any mathematical language intended to represent our conceptual metaphors is unambiguous, and free from contradiction, the focus of constructive mathematics can be viewed as seeking to ensure that any such representation does, indeed, uniquely identify such metaphors.

    The goals of the two activities ought to, thus, be viewed as necessarily complementing each other, rather than being independent of, or in conflict with, each other as to which is more `foundational’—as is implicitly argued in contemporary literature.

    Any attempt to ‘prove’ the PvNP problem might thus need to be based within an explicit, well-defined and evidence-based, theory of arithmetical truth.

    If so, it would be worth investigating whether the perceived barriers to solutions of such problems might, thus, actually reflect the absence of such a theory.

  4. rjlipton permalink*
    April 25, 2019 1:22 pm

    Dear Bhupinder:

    Thanks for your thoughtful comment. I agree that P v NP and related questions certainly must be connected to our search for basic truth of arithmetic questions. In the past I have thought about whether it might be possible to get independence type results. This is still quite wide open.

    I wonder if you can explain a bit more about your ideas.


  5. Bhupinder Singh Anand permalink
    April 25, 2019 3:56 pm

    Dear Professor Lipton,

    I am suggesting that the perceived barriers in addressing the PvNP problem may be due precisely to beliefs such as:

    The ‘basic truth of arithmetical questions’ and the possibility of getting ‘independence type results’—in presumably the first-order Peano Arithmetic—are yet ‘open’ questions.

    The following evidence-based reasoning, from the relatively short 10-page paper (link below) which I cited in your previous post on GLL, shows that such beliefs could be seriously misleading (see Corollary 7.2).

    DEFINITION 1 (Algorithmic verifiability:). A number-theoretical relation F(x) is algorithmically verifiable if, and only if, for any given natural number n, there is an algorithm AL(F, n) which can provide objective evidence for deciding the truth/falsity of each proposition in the finite sequence (F(1), F(2), …, F(n)).

    DEFINITION 2 (Algorithmic computability:). A number theoretical relation F(x) is algorithmically computable if, and only if, there is an algorithm AL(F) that can provide objective evidence for deciding the truth/falsity of each proposition in the denumerable sequence (F(1), F(2), …).

    We note that algorithmic computability implies the existence of an algorithm that can finitarily decide the truth/falsity of each proposition in a well-defined denumerable sequence of propositions, whereas algorithmic verifiability does not imply the existence of an algorithm that can finitarily decide the truth/falsity of each proposition in a well-defined denumerable sequence of propositions.

    THEOREM 2.1. There are number theoretic functions that are algorithmically verifiable but not algorithmically computable.

    THEOREM 5.6. The axioms of PA are always algorithmically verifiable as true under the interpretation IPA(S, V), and the rules of inference of PA preserve the properties of algorithmically verifiable satisfaction/truth under IPA(S, V).

    THEOREM 5.7. If the PA formulas are algorithmically verifiable as true or false under IPA(S, V), then PA is consistent.

    (NOTE: IPA(S, V) is the standard ‘classical’ interpretation of PA over the domain N of the natural numbers. Theorem 5.7 is thus a more constructive, albeit non-finitary, proof of consistency for PA than Gentzen’s transfinite argument.)

    THEOREM 6.7. The axioms of PA are always algorithmically computable as true under the interpretation IPA(S, C), and the rules of inference of PA preserve the properties of algorithmically computable satisfaction/truth under IPA(S, C).

    THEOREM 6.8. PA is consistent.

    (NOTE: IPA(S, C) is thus the finitary interpretation of PA over the domain N of the natural numbers sought by Hilbert in the second of his 23 Millennium problems. It also dissolves the Poincare-Hilbert debate by showing that their respective arguments were actually complementary, and only seemingly contradictory.)

    THEOREM 7.1 (Provability Theorem for PA). A PA formula [F(x)] is PA-provable if, and only if, [F(x)] is algorithmically computable as always true in N.

    COROLLARY 7.2. PA is categorical with respect to algorithmic computability.

    COROLLARY 8.2. The PA formula [Neg(Ax)R(x)] defined in Lemma 8.1 is PA-provable.

    COROLLARY 8.3. In any model of PA, Goedel’s arithmetical formula [Neg(Ax)R(x)] interprets as an algorithmically verifiable, but not algorithmically computable, tautology over N.

    Kind regards,


  6. Bhupinder Singh Anand permalink
    April 25, 2019 6:20 pm

    Dear Professor Lipton,

    Since the standard formulation of the PvNP problem is set-theoretic, it might be appropriate to mention that I distinguish between proofs of arithmetical propositions involving numerals in the first-order Peano Arithmetic PA, and proofs of arithmetical propositions involving ordinals in any set theory—whether the first-order theory ZF or the second-order theory ACA0—due to the following conclusions that seem to follow from the evidence-based reasoning introduced in the paper which I cited earlier:

    LEMMA 10.1 (p.67). The structure of the finite ordinals under any putative interpretation of ZF is not isomorphic to the structure N of the natural numbers.

    THEOREM 13.1 (p.88). Goodstein’s sequence G([m]) over the finite ordinals [m] in any putative model M of ACA0 terminates with respect to the ordinal inequality `>o’ even if Goodstein’s sequence G(m) over the natural numbers m does not terminate with respect to the natural number inequality `>’ in M.

    (The above page references are to an investigation (link below) that I have just forwarded informally to an editor of a journal for pre-submission vetting as to its suitability.)

    The significance of this is that any putative resolution of the PvNP problem in set-theoretical terms might only show that such resolution is consistent with the axioms of the set theory. Since both ZF and ACA0 admit an axiom of infinity, neither can have any evidence-based interpretation that would allow us to conclude the truth of such resolution over the natural numbers under any well-defined theory of truth.

    Kind regards,


    • Bhupinder Singh Anand permalink
      April 26, 2019 2:02 am

      Dear Professor Lipton,

      It might be pertinent to add that by ‘well-defined theory of truth’ I intend Definition 15 on p.47 of my investigation (link below), which formally well-defines the logic of a formal language constructively.

      I note that by this definition, PA has a well-defined logic, whereas any set-theory that admits an axiom of infinity does not.



  7. April 26, 2019 9:52 am

    Hi there,
    Sorry to spoil the comments, i wanted to get in touch with the author of this excellent blog.
    Don’t know how to, so let me share with all of you a recent work : a “bst algorithm” for the TSP:

    As far as i know, the method is new and pretty efficient.
    But don’t worry, i’m not claiming this will prove that NP=P…
    Just than we can keep finding new “best algorithm”.


    • rjlipton permalink*
      April 26, 2019 11:11 am

      Dear Leafar:

      Thanks for the link. I see that it appears to be an algorithm that operates on planar city problems. I do expect that you
      may do quite well on such problems.


  8. April 27, 2019 2:09 pm

    Thanks for this post. It was interesting to learn that Mathias Hauptmann’s proof has been checked. The reference to the union theorem of Albert Meyer and Edward McCreight is also nice, since I had planed to learn more about this, after I tried to read Hauptmann’s proof.

    For me, reading a proof and checking a proof can be two separate activities. I often read a proof just out of curiousity. Will I understand it? What concepts does it use? How do the authors think? Do the authors delibarately introduced hidden errors? On the other hand, checking a proof feels more serious. Should I do it, if my only goal would be to demolish the proof? If I can find a nice sweet and simple counterexample, then maybe yes. Or if the author wants me to check it, then that might also be a good reason. In that case, I see it as a communication challenge, which can be interesting in its own right. For the analog SAT solvers, none of those conditions applied. In addition, the authors basically already did what I would try to get any claimer of a P=NP proof to do: implement their algorithm and try it out on relevant instances. (So this was a case where I had read a proof, but decided against checking it.)

    • rjlipton permalink*
      April 28, 2019 8:37 pm

      Dear gentzen:

      Ed and Albert are very interesting…have talked about both before. Ed did very practical stuff, Albert very great theory.

      I agree that checking is not same as understanding. I often like to avoid details when understanding a proof. Checking is very different as you explain. There are certain signs—as I have discussed before—that alert one to potential errors. An example is: “This follows as in case II”.


  9. April 29, 2019 12:54 am

    Remember, please, the case raised by the Clay Institute: “Suppose that you are organizing housing accommodations for a group of four hundred university students. Space is limited and only one hundred of the students will receive places in the dormitory. To complicate matters, the Dean has provided you with a list of pairs of incompatible students, and requested that no pair from this list appear in your final choice.”

    I ask: are the 400 students on the list of incompatible pairs?

    If that is not the case, I would assemble the list by making a distinction: unproblematic/problematic students. Then, I would add to the list of unproblematic students those of the other list to form 50 compatible pairs. A computer algorithm should have the ability to select distinctions to do this.

    Thanks for the attention. Héctor


  1. /lib/apac/india « Pink Iguana
  2. Proof Checking: Not Line by Line | Gödel's Lost Letter and P=NP

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s