Our 1977 paper on the role of formal methods

 [ Harvard ]

Harry Lewis is known for his research in mathematical logic, and for his wonderful contributions to teaching. He had two students that you may have heard of before: a Bill Gates and a Mark Zuckerberg.

Today I wish to talk about a recent request from Harry about a book that he is editing.

The book is the “Classic Papers of CS” based on a course that he has been teaching for years. It will contain 46 papers with short introductions by Harry. My paper from 1977 with Alan Perlis and Rich DeMillo will be included. The paper is “Social Processes and Proofs of Theorems and Programs”.

Harry says that “A valued colleague believes this paper displays such polemical overreach that it should not appear in this collection”. I hope that it does still appear anyway. Harry goes on to say

And though verification techniques are widely used today for hardware designs, formal verification of large software systems is still a rarity.

Indeed.

## Our Point

I have mixed feeling about our paper, which is now getting close to fifty years old. I believe we had some good points to make then. And that these are still relevant today. Our paper starts with:

Many people have argued that computer programming should strive to become more like mathematics. Maybe so, but not in the way they seem to think.

Our point was just this: Proofs in mathematics and not just formal arguments that show that a theorem is correct. They are much more. They must show why and how something is true. They must explain and extend our understanding of why something is true. They must do more than just demonstrate that something is correct.

They must also make it clear what they claim to prove. A difficulty we felt, then, was that care must be given to what one is claiming to prove. In mathematics often what is being proved is simple to state. In practice that is less clear. A long complex statement may not correctly capture what one is trying to prove.

Who proves that the specification is correct?

## Proof Of Our Point

I have often wondered why some do not see this point. That proofs are more than “correctness checks”. I thought I would list some “proofs” of this point.

${\bullet }$ The great Carl Gauss gave the first proof of the law of quadratic reciprocity. He later published six more proofs, and two more were found in his posthumous papers. There are now over two hundred published proofs.

So much to say that a proof is just a check.

${\bullet }$ Thomas Hales solved the Kepler conjecture on sphere packing in three-dimensional Euclidean space. He faced some comments that his proof might not be certain—it was said to be ${99\%}$. So he used formal methods to get a formal proof. But ${\dots}$

${\bullet }$ Maryna Viazovska solved the related problem in eight dimensions. Her proof is here. The excitement of this packing result is striking compared with Hales’s result. No need for correctness checks in her proof.

Henry Cohn says here:

One measure of the complexity of a proof is how long it takes the community to digest it. By this standard, Viazovska’s proof is remarkably simple. It was understood by a number of people within a few days of her arXiv posting, and within a week it led to further progress: Abhinav Kumar, Stephen Miller, Danylo Radchenko, and I worked with Viazovska to adapt her methods to prove that the Leech lattice is an optimal sphere packing in twenty-four dimensions. This is the only other case above three dimensions in which the sphere packing problem has been solved.

So, a proof that a great proof is a proof that helps create new proofs of something else. Okay, nasty way to say it. What mean is a great proof is one that enables new insights, that enables further progress, that advances the field. Not just a result that “checks” for correctness.

${\bullet }$ The famous ABC conjecture of Joseph Oesterle and David Masser has been claimed by Shinichi Mochizuki. Arguments continue about his proof. Peter Scholze and Jakob Stix believe his proof is flawed and is unfixable. Mochizuki claims they are wrong.

Will a formal proof solve this impasse? Perhaps not. A proof that explains why it is true might. A proof that advances number theory elsewhere might, a proof that could solve other problems would likely.

## Open Problems

What do you think about the role of proofs? Did we miss the point years ago?

Will formal verification become effective in the near future? And when it does, will it help provide explanations? We note this recent discussion of a presentation by Kevin Buzzard of Imperial College, London, and one-day workshop on “The Mechanization of Math” which took place two weeks ago in New York City.

[Typo fixed]

October 21, 2019 11:03 am

Hard says that -> Harry says that ?

October 21, 2019 11:42 am

Dear Stephen Blackstone:

Thanks so much. I fixed the typo. Hard to say was an interesting typo…

Best

Dick

October 21, 2019 3:44 pm

The “and” should an “are” on the line that begins with “Our point was”.

2. October 21, 2019 12:00 pm

I’ve long thought and often said the discipline of proof and the discipline of programming are closely related — they are both beneficial in the same way the journey is as important as the destination.

October 21, 2019 12:10 pm

Johann Gauss?

4. October 21, 2019 1:20 pm

Couple more typos. That’s Karl Friedrich Gauss, not Johann (perhaps you were thinking of Bach 🙂 )
Also Henry Cohn not Colm.

October 22, 2019 10:10 am

The german wiki says ‘Johann Carl Friedrich’

October 21, 2019 4:47 pm

Perhaps the appropriate distinction is that between a proof being mechanically correct (proof theory) and a proof being significantly correct (model theory).

I would argue that a mathematical statement has no intrinsic content per se; it is merely an assertion that serves the needs of the natural sciences and their philosophies by seeking to provide the necessary tools for adequately expressing our sensory observations—and their associated perceptions (and abstractions)—of a `common’ external world (corresponding to what some cognitive scientists, such as Lakoff and Nunez, term as primary and secondary `conceptual metaphors’) in a symbolic language of unambiguous communication.

This is achieved by:

(i) first, expressing those of our conceptual metaphors which can be well-defined in a mathematical language such as the first-order set theory, ZFC; and,

(ii) second, identifying those conceptual metaphors which can be further communicated categorically in a mathematical language such as the first-order Peano Arithmetic PA.

A proof being mechanically correct merely assures that the proven statement does not formally contradict the axioms and rules of deduction of the language in which it is proven.

A proof being significantly correct (true) assures that, under interpretation, the mathematical statement faithfully expresses the metaphor that was sought to be expressed symbolically within the language.

Thus, mathematics needs to be treated as a sub-discipline of linguistics; and any ontological commitments associated with mathematical statements taken to pertain not to the language per se, but to the conceptual metaphors that the language is intended to represent and communicate.

Moreover, the epistemological perspective should be that logic, too, can be viewed as merely a methodological tool for assigning evidence-based ‘truth’ values that seek to formalise an intuitive human ability which pertains not to the language that seeks to express it formally, but to the cognitive sciences in which its study is rooted.

In other words, a mathematical or logical truth is merely an assertion of the reliability of a mathematical language to faithfully express that which is sought to be expressed formally within the language. It has no bearing on the ontological status of that which is sought to be expressed within the language.

• October 22, 2019 6:18 am

The general direction of your comment has a valid point, but some details are flat out wrong. Take for example your statement: “A proof being mechanically correct merely assures that the proven statement does not formally contradict the axioms and rules of deduction of the language in which it is proven.”

October 22, 2019 9:02 pm

“Did we miss the point years ago?”

If I’ve read your paper correctly, you made the point that—purported style overreach notwithstanding—needed to be made not only then, but also today.

This is the question of whether the aims of program verification, vis a vis the aims of, say, a PA proof, can be made explicit; and, if so, whether the two aims are identical or essentially different. No doubt both engender degrees of confidence in whatever is sought to be verified respectively.

The question remains:

What is it that is sought to be verified respectively; and is verification in each case to be evidence-based, or faith-based?

In the case of formal verification, the aim has been described succinctly by Chetan Murthy:

“It is by now folklore … that one can view the values of a simple functional language as specifying evidence for propositions in a constructive logic …”

… `An Evaluation Semantics for Classical Proofs.’ Proceedings of Sixth IEEE Symposium on Logic in Computer Science, pp. 96-109, (also Cornell TR 91-1213), 1991.

Since every algorithmically computable function is expressible and decidable in the first-order PA, this explains the confidence generated in the PA decidability of algorithmically computable functions.

Which raises the question:

Is very PA formula algorithmically computable as either true or false under a well-defined interpretation?

If we take the word ‘social’ in the title of your paper as implicitly implying that a social convention may be involved in addressing the issue, by admitting the possibility of refining verifications by the sorts of social processes that refine mathematical proofs, it is not very difficult to recognize that as long as our interpretation of quantification in PA is faith-based—whence it admits non-falsifiable propositions containing putative infinite elements in the domain of the interpretation—the question of formal, evidence-based, (mechanical) verification of quantified PA propositions cannot be refined to yield the desired outcome even in principle.

Kind regards,

Bhup

October 22, 2019 9:12 pm

“Will formal verification become effective in the near future? And when it does, will it help provide explanations?”

If we insist upon evidence-based (in Chetan Murthy’s sense; or in the more formal—albeit ‘refined’—sense of Kleene’s ‘realizability’) interpretations of quantification by, for instance, ‘socially’ refining our definitions of quantification, then we can give—unarguably ‘constructive’—definitions of both ‘well-definedness’ and ‘effective computability’ in terms of ‘algorithmic verifiability’ and ‘algorithmic computability’.

It then turns out that PA has a well-defined (weak) standard, algorithmically verifiable, Tarskian interpretation over the domain N of the natural numbers.

It also has a well-defined (strong) finitary, algorithmically computable, Tarskian interpretation over N.

Moreover, the formulas of the first-order Peano Arithmetic PA are decidable as provable/unprovable if, and only if, they interpret as algorithmically computable true/false arithmetical propositions over N.

This is an immediate consequence of the Provability Theorem for PA, which appeared as Theorem 7.1 on p.41 of the following article that appeared in the December 2016 issue of ‘Computer Systems Research’:

‘The Truth Assignments That Differentiate Human Reasoning From Mechanistic Reasoning’.

In other words, the Provability Theorem for PA now ensures that we can always verify whether a PA proof is mechanically correct (within a formal language) by appeal to suitably-defined Turing machines (in fact Goedel’s primitive recursive Boolean function ‘xBy’ will do the trick) for specifying evidence for arithmetical propositions in a logic such as the first-order FOL.

Further, we can now always algorithmically verify further whether a PA proof is significantly correct (under the well-defined, standard, interpretation of the formal language) by verifying that the proof sequence interprets as a sequence of unique metaphors that are decidable in the same Tarskian sense in which it is decidable whether or not snow is, indeed, white.

It is this standard interpretation of PA which gives a (social?) meaning to significant (interpreted) proof verification that cannot be captured by formal (mechanical) proof verification.

For it turns out that the PA formulas which are algorithmically verifiable under the (weak) standard interpretation of PA provide the language for expressing the quantum behavior of physical phenomena, since such formulas interpret as functions that have a computable (hence predictable) value for each instantiation, even if the function itself is not algorithmically computable (whence its values are not predictable in the classical sense).

Moreover, the PA formulas which are algorithmically computable under a (strong) finitary interpretation of PA provide the language for expressing the classical behavior of physical phenomena, since such formulas interpret as functions that not only have a computable (predictable) value for each instantiation, but the function itself is algorithmically computable (hence predictable in the classical sense).

Kind regards,

Bhup

October 23, 2019 4:21 am

A correction. The paper:

‘The Truth Assignments That Differentiate Human Reasoning From Mechanistic Reasoning’

referred to in my earlier post appeared in the December 2016 issue of ‘Cognitive Systems Research’.

A clarification. The evidence-based definitions of ‘well-definedness’ and ‘effective computability’ are in the following manuscript that I’ve recently uploaded to Academia:

‘The Significance of Evidence-based Reasoning for Mathematics, Mathematics Education, Philosophy and the Natural Sciences: Quantification from an Evidence-based Perspective.

9. October 23, 2019 8:14 am

So, a zero-knowledge proof is the exact opposite of a “mathematical proof” in your sense.

October 23, 2019 8:32 am

Dear Martin Schwarz:

Interesting point. Zero knowledge proofs are used in security of course. A math proof I would say is in a sense the opposite. The best point I keep thinking about is: A proof is useful if you can use it to do something new. We are self-motivated. We want to prove our own stuff. So I would argue that read a proof, understand a proof, make a new proof—is what we wish to do. This happen for example with Wiles famous FLT proof. They used his method to prove more. This happens all the time with good proofs.

Best

Dick

October 26, 2019 4:11 pm

You can’t formalize a proof that isn’t well understood. So wherever there’s formal verification there are explanations. We don’t need a computer to explain Mochizuki’s proof of the ABC conjecture but we need to understand it thoroughly prior to having it checked by a computer.

November 16, 2019 2:09 pm

A somewhat recent application of formal techniques to a complex piece of software is Compcert’s C compiler.

compcert.inria.fr/compcert-C.html

They applied formal techniques to the middle third of the compiler (from after parsing to before translation of assembler ASTs to machine language.) The effectiveness of this was investigated by the Csmith program generator.

http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf

This tool, which uses differential testing on randomly generated valid C programs, found bugs in every compiler it was tried on, EXCEPT for the verified part of CompCert:

“The striking thing about our CompCert results is that the middle end bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot ﬁnd wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible beneﬁts for compiler users.”

Does this prove the compiler is correct, and that the specification is correct? No. But it clearly demonstrates that formal verification contributes to software quality.

12. November 21, 2019 8:53 am

I think you did miss the mark, but mostly because the mark moved. Many interesting computational complexity results about the hardness of software verification have come out in recent decades (they relate to the “model-checking problem”, which is the problem of determining whether a program satisfies some arbitrary property, and apply to all sound methods for deciding the problem, including deductive proofs; I mention some of them here: https://pron.github.io/posts/correctness-and-complexity). Those results throw cold water on the notion of perfect software by giving hard verification low-bounds even in cases where we would have thought verification feasible even in the worst-case, and they even go beyond the worst case to show hard parametrized complexity, which is bad news for verification through composition.

Not only is “perfect” verification through proof intractable, it is also never truly required (except maybe when publishing an algorithm in an academic journal). After all, programs are written for the purpose of running them as software in some physical system, and a physical system can only ever hope to match any reasonable specification with some probability; certainly it cannot be “proven” correct in any mathematical sense.

The intractability results, combined with the physical nature of software systems, have therefore shifted the focus of formal methods research — aside from a few die-hards — from “proving software perfect” to finding more bugs, more affordably, and at that, there have been some great successes. If effective bug removal is the goal rather than perfection, the question of who proves the specification falls by the wayside. Like a specific verification method, a specification can be more or less effective at its task.

Formal methods for software verification are already showing some real-world results, but their goal is not in any way similar to that of proving mathematical theorem, and so they cannot be judged by similar metrics.