## A Proof Of The Halting Theorem

*Toward teaching computability and complexity simultaneously*

Large Numbers in Computing source |

Wilhelm Ackermann was a mathematician best known for work in constructive aspects of logic. The Ackermann function is named after him. It is used both in complexity theory and in data structure theory. That is a pretty neat combination.

I would like today to talk about a proof of the famous Halting Problem.

This term at Georgia Tech I am teaching CS4510, which is the introduction to complexity theory. We usually study general Turing machines and then use the famous Cantor Diagonal method to show that the Halting Problem is not computable. My students over the years have always had trouble with this proof. We have discussed this method multiple times: see here and here and here and in motion pictures here.

This leads always to the question, what really is a proof? The formal answer is that it is a derivation of a theorem statement in a sound and appropriate system of logic. But as reflected in our last two posts, such a proof might not help human understanding. The original meaning of “proof” in Latin was the same as “probe”—to test and explore. I mean “proof of the Halting Problem” in this sense. We think the best proofs are those that show a *relationship* between concepts that one might not have thought to juxtapose.

## Diagonal and Impossible

The question is how best to convince students that there is no way to compute a halting function. We can define Turing machines in a particular way—or define other kinds of machines. Then we get the particular definition

How can we prove that is not computable? We want to convey not only that this particular is uncomputable, but also that *no function like it* is computable.

Trying the diagonal method means first defining the set

We need to have already defined what “accept” means. OK, we show that there is no machine whose set of accepted strings equals . Then what? We can say that the complementary language is not decidable, but we still need *another* step to conclude that is uncomputable. And when you trace back the reason, you have to fall back on the diagonal contradiction—which feels disconnected and *ad hoc* to the particular way and are defined.

Ken in his classes goes the route first, but Sipser’s and several other common textbooks try to hit directly. The targeted *reason* is one that anyone can grab:

It is

impossiblefor a function to give the value —or any greater value.

Implementations of this, however, resort to double loops to define . Or like Sipser’s they embed the “” idea in the proof anyway, which strikes us as making it harder than doing separate steps as above. We want the cleanest way.

## The Proof

Here is the plan. As usual we need to say that represents a computation. If the computation halts then it returns a result. We **allow** the machine to return an integer, not just accept or reject. If the machine does not halt then we can let this value be undefined; our point will be that by “short-circuit” reasoning the question of an undefined value won’t even enter.

Now let be defined as the halting function above.

Theorem 1The function is not computable.

*Proof:* Define the function as follows:

Suppose that is computable. Then so is . This is easy to see: just do the summation and when computing compute the part first. If it is then it adds nothing to the summation, so it “short-circuits” and we move on to the next . If it is then we compute and add that to the summation. Let stand for the summation before the last term; then .

Now if the theorem is false, then there must be some such that the machine computes . But then

This is impossible and so the theorem follows.

## Extending Simplicity

What Ken and I are really after is relating this to hierarchies in complexity classes. When the are machines of a complexity class then the functions and **are** computable. It follows that is not computed by any and so does not belong to . What we want is to find similar functions that are natural.

Ackermann’s famous function does this when is the class of primitive recursive functions. There are various ways to define the machines , for instance by programs with counted loops only. The that tumbles out is not primitive recursive—indeed it out-grows all primitive recursive functions. Showing that does likewise takes a little more formal work.

In complexity theory we have various time and space hierarchy theorems, say where is . For any *time constructible* , we can separate from by a “slowed” diagonalization. The obtained this way, however, needs knowledge of and its constructibility to define it. By further “padding” and “translation” steps, one can laboriously make it work for , for any fixed , and a similar theorem for deterministic space needs no log factors at all. This is all technical in texts and lectures.

Suppose we’re happy with , that is, with a non-“tight” hierarchy. Can we simply find a natural that works for ? Or suppose is a combined time and space class, say machines that run in time and space simultaneously. Can we possibly get a natural that is different from what *we* get by considering time or space separately?

We’d like the “non-tight” proofs to be simple enough to combine with the above proof for halting. This leads into another change we’d like to see. Most textbooks define computability several chapters ahead of complexity, so the latter feels like a completely different topic. Why should this be so? It is easy to define the length and space usage of a computation in the same breath. Even when finite automata are included in the syllabus, why not present them as special cases of Turing machines and say they run in linear time, indeed time ?

## Open Problems

Is the above Halting Problem proof clearer than the usual ones? Or is it harder to follow?

What suggestions would you make for updating and tightening theory courses? Note some discussion in the comments to two other recent posts.

[some word fixes]

## How Hard, Really, is SAT?

*A new longest computer proof makes us wonder about things from security to the Exponential Time Hypothesis*

Marijn Heule, Oliver Kullmann, and Victor Marek are experts in practical SAT solvers. They recently used this ability to solve a longstanding problem popularized by Ron Graham.

Today Ken and I want to discuss their work and ask a question about its ramifications.

Read more…

## Descending Proofs Into Algorithms

*A way to make indirect reasoning more palpable*

Wikimedia Commons source |

Nicholas Saunderson was the fourth Lucasian Professor at Cambridge, two after Isaac Newton. He promoted Newton’s *Principia Mathematica* in the Cambridge curriculum but channeled his original work into lecture notes and treatises rather than published papers. After his death, most of his work was collected into one book, *The Elements of Algebra in Ten Books*, whose title recalls Euclid’s *Elements*. It includes what is often credited as the first “extended” version of Euclid’s algorithm.

Today we raise the idea of using algorithms such as this as the basis for proofs.

Read more…

## A Surprise For Big-Data Analytics

*A simple but interesting issue with analyzing high-dimensional data *

Peter Landweber, Emanuel Lazar, and Neel Patel are mathematicians. I have never worked with Peter Landweber, but have written papers with Larry and Laura Landweber. Perhaps I can add Peter one day.

Today I want to report on a recent result on the fiber structure of continuous maps.

## Do results have a “teach-by-date?”

* Teaching automata theory *

Noam Chomsky is famous for many many things. He has had a lot to say over his long career, and he wrote over 100 books on topics from linguistics to war and politics.

Today I focus on work that he pioneered sixty years ago.

Yes sixty years ago. The work is usually called the Chomsky hierarchy(CH) and is a hierarchy of classes of formal grammars. It was described by Noam Chomsky in 1956 driven by his interest in linguistics, not war and politics. Some add Marcel-Paul Schützenberger’s name to the hierachry. He played a crucial role in the early development of the theory of formal languages—see his joint

paper with Chomsky from 1962. Read more…

## The Mathematics Of Dan’s Inferno

* A possible error with mathematical ramifications *

Non-technical fact-check source |

Dan Brown is the bestselling author of the novel *The Da Vinci Code*. His most recent bestseller, published in 2013, is *Inferno*. Like two of his earlier blockbusters it has been made into a movie. It stars Tom Hanks and Felicity Jones and is slated for release on October 28.

Today I want to talk about a curious aspect of the book *Inferno*, since it raises an interesting mathematical question. Read more…

## The World Turned Upside Down

* Some CS reflections for our 700th post *

MacArthur Fellowship source |

Lin-Manuel Miranda is both the composer and lyricist of the phenomenal Broadway musical *Hamilton*. A segment of Act I covers the friendship between Alexander Hamilton and Gilbert du Motier, the Marquis de Lafayette. This presages the French co-operation in the 1781 Battle of Yorktown, after which the British forces played the ballad “The World Turned Upside Down” as they surrendered. The musical’s track by the same name has different words and melodies.

Today we discuss some aspects of computing that seem turned upside down from when we first learned and taught them.

Read more…