Proving That Proving Is Hard
On the computational complexity of the theory of the real numbers
Alfred Tarski is one of the great logicians of the twentieth century. He is famous for the formal definition of truth, the decidability of many important theories, and many other results—one of the strangest is his famous joint theorem with Stefan Banach. This theorem proves that any solid ball of one pound of gold can be cut into a finite number of pieces and reassembled into two solid balls each of one pound of gold. Pretty neat trick.
Today I want to talk about the complexity of theories. There are some classic results that I think should be wider known, and their proofs use techniques that may have application elsewhere.
In the early days of theory there was a great deal of interest in the complexity of decision problems. I think these are natural problems, often they could be answered with strong lower bounds, and further some of them are close to practical questions. For example, the theory of the reals occurs in many actual problems—especially special cases of the theory of the reals. The theory restricted to existential sentences is quite natural and arise in many applications.
Upper Bounds on the Theory of the Reals
Tarski worked on various theories with the main goal of proving their decidability. Since Kurt Gödel’s famous result on the incompleteness of Peano Arithmetic, a goal has been to find weaker theories that are complete. Such a theory has two neat properties:
- the theory can either prove or disprove all statements in the theory;
- the theory is usually decidable.
The latter only needs the theory to have a reasonable axiom system, then it will be decidable. Given a sentence in such a theory the decision procedure is simple:
Start enumerating proofs from the theory. If a proof of appears, then say the sentence is valid; if a proof of appears, then say the sentence is invalid.
Note, one of or must appear, since the theory is complete; and both cannot appear, since the theory is consistent. A complete theory is defined to always be consistent.
Tarski turned away from natural numbers to real numbers. The theory he studied is the theory of real numbers. It has a language that allows any first order formula made up from , and its axioms are those for a field, with additional axioms for a real closed field. The last are axioms that state that every odd degree polynomial must have a root.
One of the main results of Tarski is:
Theorem: The first order theory of the real numbers is complete.
He used a method that is called quantifier elimination. A theory has quantifier elimination if the theory has the property that every formula is equivalent to an open formula: a formula with no quantifiers. Tarski theorem immediately proves that the theory is decidable, but left open the actual computational complexity of its decison problem. Theorists to the rescue.
Lower Bounds on the Theory of the Reals
The following table lists some of the basic results concerning decidability of various theories. The problem is: Given a sentence in the theory is the sentence provable or not.
Clearly for propositional logic the problem is NP-complete, and has unknown computational complexity
Predicate calculus concerns whether or not an arbitrary formula is logically valid; as long as there is at least one predicate letter of arity at least two, the theory is undecidable. The trick is it is possible to encode the behavior of a Turing machine into a sentence using , and show that the sentence is valid if and only if the Turing machine halts.
Michael Fischer and Michael Rabin in 1974 proved several beautiful exponential lower bounds on the complexity of any decision procedure for various theories. I will discuss today their work on the theory of real numbers.
Of course no theory that includes propositional formulas can have an easy decision procedure, unless P=NP. My opinions aside, it is reasonable to expect the theory of reals to be much more powerful than NP. This expectation turns out to be true: the interplay between the ability to add and multiply real numbers with predicate calculus’s abilities allows powerful encoding tricks that cannot be done in pure propositional logic. Together these two features, operations on reals and predicate calculus, allow strong lower bounds to be proved.
Fischer and Rabin proved:
Theorem: In the theory of reals, there is a constant so that there are true sentences of length whose shortest proof in the theory is .
This is a very strong theorem. It is unconditional, and shows that the proofs themselves are huge. This is stronger than just proving they are hard to find, since if the proofs are large there can be no method that always proves valid theorems quickly—the proofs are just too big.
Lower Bounds on the Complexity of Theories
I thought that it might be useful to recall how such proofs are obtained. They are quite clever, in my opinion, and rely on several neat tricks. Partly I hope that these tricks could be used elsewhere in complexity theory.
How do you prove lower bounds on a theory? If the theory of the reals could define the set of integers, then the theory would be undecidable. Since the theory is decidable by the famous result of Tarski, it must be the case the integers are undefinable. What Mike and Michael show is how to define a very large set of integers, with a relatively small formula. This allows them to use the same techniques that proved the predicate calculus was undecidable to prove their own result.
For example, imagine we could define by where is a “small” formula. Then, the plan is to use this to simulate the halting problem for Turing machines that do not use too much tape. An easy way to see that this will work is to look at a simpler application. Suppose that stands for
Then, clearly defines the primes below . The sentence,
says that for any there is a twin prime above it. This would be very exciting if it had a short proof—actually even weaker statements would be exciting. What Mike and Michael prove is in general such sentences are hard to prove—they do not, unfortunately, show this particular sentences is hard. Oh well.
A Recursion Trick
Mike and Michael do construct a formula that defines
Their construction uses recursion: is defined in terms of for some . This should come as no shock, since recursion is so useful in all of computer science. However, in order to make their theorem work, they need to avoid making too many recursive calls. They use a “trick” that was discovered by Mike Fischer with Albert Meyer and independently by Volker Strassen.
Let be any formula and consider the conjunction:
It follows that where
The reason this construction is useful is simple: it is the size of the resulting formulas. The size of is roughly twice that of , but the size of is only the size of plus a constant. This is one of the key tricks used in their proof: it keeps the size of the formula small.
A Name Trick
In their construction of they also use many variable names. The naive method would use an exponential number of variables, and this would cause their construction to explode in size. They cleverly reuse names—in this way avoid having to use very long names. It is not too far from a trick that was used, many years later, by Adi Shamir in his famous proof that .
to make it clearer. However, the name collision is allowed in first order logic formulas, since it is easy to disambiguate the meaning of the formula (1). The importance of this trick is it allows the re-use of variable names; this re-use saves the number of variables needed and this reduces the size of the formulas .
Lower Bounds on the Theory of the Complex Numbers?
I believe their results hold for this case too, but they do not seem to explicitly state this. Perhaps this is a well known result, but I cannot track down the reference.
Can we use any of the tricks here in other parts of complexity theory? The recursion trick seems to me to be quite powerful. Perhaps we can use this trick in another part of theory.
Congratulations to Michael Rabin for being awarded the Dan David Prize for 2010. Wonderful.