Cornell CS at 50
Plus a long-promised discussion on diagonalization
Dexter Kozen has been on the faculty of computer science at Cornell for almost 30 of the department’s 50 years. He first came to Cornell 40 years ago as a graduate student and finished a PhD under Juris Hartmanis in just over 2 years. He was named to the Joseph Newton Pew, Jr., professorship 20 years ago, and celebrated his 60th birthday in 2012. Besides many research achievements, he co-authored an award-winning book on Dynamic Logic.
Today we salute the 50th anniversary of Cornell’s department and keep a promise we made 5 years ago to talk about a diagonalization theorem by Dexter. It yields what may be an interesting finite puzzle.
The 50th anniversary symposium earlier this fall was a great occasion; I wish I’d been able to drive over for it. There were two days of talks by Cornell faculty and alumni. It coincided with the department moving into a new building, Gates Hall, named for—who else?—Bill and Melinda Gates as principal donors. Bill Gates came for the dedication and engaged Cornell president David Skorton in a one-hour conversation on how to serve students best.
I have fond memories of my time at Cornell in 1986–7 and 1988–9 split with time at Oxford. I was given a home in the department though my postdoc came from Cornell’s Mathematical Sciences Institute, and I was treated incredibly well. It was a great personal as well as professional time for me. When I last visited two years ago, Gates Hall was just going up across the street, beyond the left-field fence of the baseball field I viewed from my office window in the 1980s. Dick and I wish them the best for the next 50 years.
Dexter is on sabbatical in the Netherlands and Denmark this academic year. I first met him in Denmark, at ICALP 1982 in Aarhus. We took part in a local tradition of performing musical numbers at the conference banquet. He had a band called the “Steamin’ Weenies” when I was at Cornell. He has kept his music up with bands of various names through the “Katatonics,” which performed in a Superstorm Sandy benefit two years ago at Ithaca’s club “The Gates” (not named for—who else?—Bill and Melinda Gates).
It is hard to believe, but summer 1982 is three-fourths of the way back to Steve Cook’s 1971 paper which put the $\mathsf{P}$ versus $\mathsf{NP}$ question on everyone’s map. Back then we knew it was hard but there was more optimism. I have told the story of sharing a train car with Mike Sipser on the way to that conference and hearing Mike’s confidence in being able to solve the question.
The aspect that captivated me that summer was formal logic issues involved in separating complexity classes. They can be approached in two ways: as sets of languages and as sets of machines. The former I tried to conceptualize as a topological space. As with the Zariski topology of algebraic sets employed by Alexander Grothendieck, it satisfies only the $T_0$ separation axiom and so is not a Hausdorff space. The latter view was developed by Dexter’s 1979 paper, “Indexings of Subrecursive Classes.”
Dexter’s paper cut to the chase away from logic or topology by treating the problem as one about programming systems, i.e., recursive enumerations of machines that have nice properties like efficient substitution and composition. He showed a tradeoff between power and niceness of what one can do with them. The niftiest theorem, in section 7 of his paper, shows that if you insist on representing $\mathsf{P}$ by machines that mark off cells on a tape as a polynomial-time clock, and insist on a composition operator giving only a constant-factor overhead in program size, then both universal simulation and diagonalization need more than polynomial space. Thus such representations of $\mathsf{P}$ will not support a proof of $\mathsf{P}$ different from $\mathsf{PSPACE}$, let alone $\mathsf{NP}$. But in the previous section he gave a theorem showing that if you give up all efficient closure properties, then you can in principle achieve any diagonalization you like.
The only property of the class $\mathcal{C}$ that is used by the theorem is that it is closed under finite differences: if $A$ is different from a language $B \in \mathcal{C}$ by a finite set, meaning that the symmetric difference $A \,\triangle\, B$ is finite, then $A$ is in $\mathcal{C}$. And of course it uses that $\mathcal{C}$ has an effective programming system $[M_i]$, which is what distinguished bounded closed sets in my topological view.
Theorem 1 then says: for any language $D \notin \mathcal{C}$, there is a new indexing $[N_i]$ of machines such that:

1. $\{L(N_i) : i \geq 0\} = \mathcal{C}$, and
2. $D = K$ where we define $K = \{x_i : x_i \notin L(N_i)\}$, with $x_i$ being the $i$-th string in a standard ordering.
Moreover, any reasonable formal system that can prove $D \notin \mathcal{C}$ can prove statements 1 and 2. Indeed, we can make $\pi$ a permutation with $N_i = M_{\pi(i)}$, so that the new indexing involves the same machines as the original, only scrambled. In the case $\mathcal{C} = \mathsf{P}$ and $D = \mathsf{SAT}$, this yields his prose conclusion:
If $\mathsf{NP} \neq \mathsf{P}$ is provable at all, then it is provable by diagonalization.
As he was quick to add, this doesn’t say that the mere fact of $\mathsf{NP} \neq \mathsf{P}$ enables diagonalization, only that working by diagonalization does not impose any additional burden on a reasonable formal system.
Proof: We define $\pi$ in stages. At each stage $i$ we have “marked” the values $\pi(i')$ for $i' < i$. At stage $i$, take $j$ to be the least unmarked value such that

$x_i \in D \iff x_i \notin L(M_j),$

and define $\pi(i) = j$. This also marks $j$. A suitable $j$ always exists since infinitely many languages in $\mathcal{C}$ include $x_i$, which handles the case $x_i \notin D$, and infinitely many exclude $x_i$, so there’s always a $j$ when $x_i \in D$ too. Also clearly $\pi$ is one-to-one since we mark each value. To see that $\pi$ is onto, let $j$ be the least unmarked value at any stage. Since $D$ and $L(M_j)$ differ by an infinite set, there will always be a later $x_i$ in the symmetric difference $D \,\triangle\, L(M_j)$, and the algorithm will set $\pi(i) = j$ for the least such $i$.
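The staged construction can be watched in action on a toy programming system. In this sketch all concrete choices are mine, not from Dexter’s paper: the class $\mathcal{C}$ is the finite-and-cofinite subsets of $\mathbb{N}$, which is closed under finite differences, and $D$ is the set of odd numbers, which is not in $\mathcal{C}$.

```python
# Toy run of the staged construction of pi. All concrete choices here
# are illustrative assumptions: C is the class of finite and cofinite
# subsets of N (closed under finite differences), indexed so that an
# even j codes a finite set and an odd j a cofinite one, and D is the
# set of odd numbers, which is not in C.

def L(j):
    """Return the membership test of the j-th language."""
    bits = j // 2
    finite = {k for k in range(bits.bit_length()) if (bits >> k) & 1}
    if j % 2 == 0:
        return lambda x: x in finite       # finite set coded by bits
    return lambda x: x not in finite       # cofinite complement

def D(x):
    return x % 2 == 1                      # the odd numbers

def build_pi(n):
    """Stages 0..n-1: at stage i, pick the least unmarked j with
    (x_i in D) <=> (x_i not in L(M_j)), then mark j."""
    pi, marked = [], set()
    for i in range(n):
        j = 0
        while j in marked or D(i) != (not L(j)(i)):
            j += 1
        pi.append(j)
        marked.add(j)
    return pi

pi = build_pi(50)
assert len(set(pi)) == len(pi)                            # pi is one-to-one
assert all(D(i) == (not L(pi[i])(i)) for i in range(50))  # D is the diagonal
```

On this particular system $\pi$ comes out as the involution swapping each even $i$ with $i+1$: even indices code finite sets that exclude all large numbers, while odd indices code cofinite sets that include them.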
The proof works even if $D$ is undecidable—we just get that $\pi$ is uncomputable—and there are uncountably many $D$’s to go with uncountably many $\pi$’s. When $D$ is computable the search for $j$ in the algorithm is computable, and if the indexing includes many copies of machines accepting $\emptyset$ or $\Sigma^*$ then we can get the complexity of $\pi$ down to polynomial time or lower.
The inverse of $\pi$, that is, the mapping $j \mapsto \pi^{-1}(j)$, however, may have explosive growth for infinitely many $j$, depending on how sparse the symmetric difference of $D$ with some languages in $\mathcal{C}$ is. Still, if the statement $D \notin \mathcal{C}$ is provable in a given formal system, then the inverse is provably total, so the diagonalization can be verified. Another way of putting this is that the formal system is able to verify that every language in $\mathcal{C}$—that is, every $L(M_j)$—is eventually enumerated, so the formal system can tell that the class being diagonalized against is all of $\mathcal{C}$.
The ease of the proof and its relation to oracle results contributed to controversy over how to interpret it. I’ve mostly felt that the logical “horse” was being put behind or at best alongside the “cart” of combinatorial proof methods. Recently I’ve appreciated how the proof focuses the question on how large your idea of “$\mathsf{P}$” is rather than how hard $\mathsf{SAT}$ is, which was a key issue in the attempted proof by Vinay Deolalikar. The oracle result showing $\mathsf{P}^A = \mathsf{NP}^A$ for some computable languages $A$ (any $\mathsf{PSPACE}$-complete language will do) was supposed to argue that diagonalization against the class $\mathsf{P}$ had no special power as a tool for proving $\mathsf{P} \neq \mathsf{NP}$.
The proof relativizes by taking $[M_i]$ to be an ensemble of polynomial-time oracle Turing machines giving $\mathsf{P}^A$ for each oracle set $A$, and by taking a fixed oracle TM $N$ such that $L(N^A)$ is $\mathsf{NP}^A$-complete for each $A$. We get a fixed oracle transducer $T$ such that $T^A$ executes the algorithm for $\pi$. Whenever $L(N^A) \notin \mathsf{P}^A$ it computes a total function whose diagonal is $L(N^A)$. If you think of “$\mathsf{NP}^A$” as “$\mathsf{NP}$ relativized to $A$” then this seems like a recipe for contradiction if the unrelativized function is total and $\mathsf{P}^A = \mathsf{NP}^A$. But what happens when $\mathsf{P}^A = \mathsf{NP}^A$ is just that $T^A$ computes an undefined function—indeed it doesn’t halt. To make an algebraic analogy, $\mathsf{NP}^A$ should be parsed as $(\mathsf{NP})^A$ not $\mathsf{N}(\mathsf{P}^A)$, so “the diagram does not commute” and there is no problem.
I came back to Dexter’s theorem this fall term while re-thinking how best to introduce diagonalization in an undergrad or grad theory course. The classical way is

$D = \{i : i \notin L(M_i)\},$

where $i$ is the “Gödel number” of the machine $M_i$. My own usual style is to identify a machine or program with its code as a string, thus writing

$D = \{M : M \notin L(M)\}.$

This removes any need to talk about the correspondence between strings and numbers, Gödel or any kind of numbers. However, styling this via programming raises questions such as whether “$M$” means the source or compiled code, and does it matter how they differ?
Recently I’ve preferred to think of $M$ as an object, indeed analogizing its components to fields of a class. Then “$\langle M \rangle$” re-appears as an encoding of the object by a (number or) string. We define:

$D = \{\langle M \rangle : \langle M \rangle \notin L(M)\}.$
Now it doesn’t matter if we say $\langle M \rangle$ is the source code or the compiled executable, and it can be anything—we can downplay the “self-reference” aspect if we wish. The encoding function need not be 1-to-1, provided it is “extensional” in the sense that

$\langle M \rangle = \langle M' \rangle \implies L(M) = L(M').$
We still get the diagonal contradiction for any machine $N$ with $L(N) = D$: if $\langle N \rangle \notin L(N)$, then $\langle N \rangle \in D = L(N)$, contradicting $\langle N \rangle \notin L(N)$. And if $\langle N \rangle \in L(N)$, then $\langle N \rangle \in D$, which implies that there is some $M$ such that $\langle M \rangle = \langle N \rangle$ and $\langle M \rangle \notin L(M)$, but by extensionality we have $L(M) = L(N)$ and again that is a contradiction. (Well, you don’t have to tell students this more-complicated proof—just say that $\langle \cdot \rangle$ is 1-to-1 and it’s just as quick as the usual diagonal proof.)
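The role of extensionality can be checked concretely. Here is a small sanity check with toy “machines” as objects; the names, languages, and tiny universe of strings are all illustrative assumptions. The encoding sends a source and a compiled version of the same program to the same code string, so it is extensional but not 1-to-1, and the diagonal set still differs from every machine’s language.

```python
# Toy "machines" as objects over a tiny universe of strings; the
# encoding is extensional but deliberately not 1-to-1. All names and
# languages here are made up for illustration.
class Machine:
    def __init__(self, name, lang):
        self.name = name
        self.lang = lang          # the subset of strings it accepts

machines = [
    Machine("src",   {"a"}),      # source ...
    Machine("bin",   {"a"}),      # ... and binary of the same program
    Machine("other", {"a", "c"}),
]

enc = {"src": "a", "bin": "a", "other": "b"}

def code(M):
    return enc[M.name]            # <M>, the encoding of the object M

# Extensionality: equal codes imply equal languages.
for M in machines:
    for N in machines:
        assert code(M) != code(N) or M.lang == N.lang

# D = { <M> : <M> not in L(M) }
D = {code(M) for M in machines if code(M) not in M.lang}

# No machine accepts exactly D, even though code() is not 1-to-1.
assert all(M.lang != D for M in machines)
```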
Returning to Dexter’s diagonal set $K = \{x_i : x_i \notin L(N_i)\}$, we see it is the same as $D$ with $\langle N_i \rangle = x_i$. Looking at $\pi$ by itself, much as we can relax $\langle \cdot \rangle$ being 1-to-1, we can relax $\pi$ being onto, so long as $i \mapsto L(M_{\pi(i)})$ is “functionally onto” the languages in $\mathcal{C}$, in keeping with statement 1 of Theorem 1. Indeed, when the programming system repeats each language infinitely often, we could redo the proof to make $\pi$ an increasing function.
We can abstract this by considering any function $f$ from a set $U$ into its power set and functions $g$ on $U$. It is interesting to try to formalize exactly when a set $D \subseteq U$ outside the range of $f$ is a diagonal, that is,

$D = \{x \in U : x \notin f(g(x))\}$

for some $g$, with the encoding being extensional with respect to $f$ and $g$ being onto the image of $f$. One interesting requirement is that when $g$ is not onto $U$, $f \circ g$ must still cover the image of $f$, that is:

$\{f(g(x)) : x \in U\} = \{f(y) : y \in U\}.$

Again I wonder if there is a useful analogy in abstract algebra or category theory for this situation. Given the function $g$, we can define $h(y) = x$ for some choice of $x$ such that $g(x) = y$. For other $y$ we can put $h(y) = x$ for any $x$ such that $x \in D \iff x \notin f(y)$; such an $x$ exists because the symmetric difference of $D$ and $f(y)$ is nonempty. Given the function $h$, we can define $g$ at any $x$ by taking $y$ such that $h(y) = x$, since $h$ is functionally onto $U$, and define $g(x) = y$.
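The back-and-forth between $g$ and its partner $h$ can be traced on a tiny example. The formalization below is my own reading of the correspondence, and the particular $f$, $D$, and $g$ are made-up choices.

```python
# A concrete back-and-forth between g and h on a small example; the
# formalization is my own reading, and f, D, g are illustrative choices.
U = range(3)
f = [set(), {0, 1, 2}, {0, 1, 2}]      # a tiny indexing of subsets of U
D = {0, 2}                             # a target diagonal, not in the range of f

# g realizes D as a diagonal: D = {x : x not in f[g[x]]}
g = {0: 0, 1: 1, 2: 0}
assert D == {x for x in U if x not in f[g[x]]}

# From g, build h satisfying: h(y) in D  <=>  h(y) not in f[y].
h = {}
for y in U:
    pre = [x for x in U if g[x] == y]
    if pre:
        h[y] = pre[0]                  # any x with g(x) = y works
    else:
        # otherwise pick any x in the symmetric difference of D and f[y]
        h[y] = next(x for x in U if (x in D) != (x in f[y]))
assert all((h[y] in D) == (h[y] not in f[y]) for y in U)
```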
A Finite Puzzle
The abstraction nicely gives a concrete puzzle when $U$ is a finite set. Then we don’t have Dexter’s property of the range of $f$ being closed under finite differences, but we can still ask:
Is every unindexed subset a diagonal? That is, for all $D \subseteq U$ with $D \notin \mathrm{ran}(f)$, does there exist $g$ such that $D = \{x \in U : x \notin f(g(x))\}$?
Equivalently, does there exist $h$ such that $D = \{h(y) : y \in U,\ h(y) \notin f(y)\}$? If $f$ is 1-to-1, then $g$ must be onto, which means $g$ must be a permutation since $U$ is finite, and likewise $h$ must be its inverse. There are interesting cases where $f$ is not 1-to-1, in which the extra latitude for $g$ and $h$ becomes important.
For example, let $U = \{1,2,3\}$ and let $f(1) = \emptyset$, $f(2) = \{1,2\}$, and $f(3) = \{2,3\}$. Then $f$ is 1-to-1, so we have six permutations to consider. They become the six columns after the first in the following table, where each column is labeled by $(g(1), g(2), g(3))$ and marks the elements of its diagonal $D_g$:

    x   f(x)     123    132    213    231    312      321
    1   {}        1      1      -      -      1        1
    2   {1,2}     -      -      2      -      2        -
    3   {2,3}     -      3      -      3      3        3
    D_g:         {1}   {1,3}   {2}    {3}   {1,2,3}  {1,3}

Every subset not in the range appears; curiously $\{1,3\}$ appears twice. If instead we define $f(2) = f(3) = \{1,2,3\}$ only, keeping $f(1) = \emptyset$, then only three permutations are relevant, but having $f(2) = f(3)$ gives us three other functions $g$ to consider:

    g^{-1}(1):   {1}   {2}   {3}   {1,2}  {1,3}  {2,3}
    D_g:         {1}   {2}   {3}   {1,2}  {1,3}  {2,3}

Once again we have all eight subsets between the range and the diagonals, this time with no repetition.
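For small universes the property can be brute-forced. The sketch below is my own code, with $U = \{0,1,2\}$ standing for $\{1,2,3\}$, and it takes the reading of the puzzle in which $g$ ranges over all functions on $U$ such that $f \circ g$ still covers the image of $f$; for a 1-to-1 $f$ this forces $g$ to be a permutation.

```python
from itertools import product

def subsets(n):
    """All subsets of {0, ..., n-1}."""
    for mask in range(1 << n):
        yield frozenset(x for x in range(n) if (mask >> x) & 1)

def diagonals(f):
    """All diagonal sets {x : x not in f(g(x))} over admissible g,
    i.e. over g whose composition with f covers the image of f."""
    n = len(f)
    image = {frozenset(s) for s in f}
    diags = set()
    for g in product(range(n), repeat=n):
        if {frozenset(f[g[x]]) for x in range(n)} != image:
            continue                       # g does not cover the image of f
        diags.add(frozenset(x for x in range(n) if x not in f[g[x]]))
    return diags

def pan_diagonal(f):
    """Is every subset outside the range of f a diagonal?"""
    image = {frozenset(s) for s in f}
    diags = diagonals(f)
    return all(D in diags for D in subsets(len(f)) if D not in image)

# The two examples above, shifted down by one:
assert pan_diagonal([set(), {0, 1}, {1, 2}])        # first example
assert pan_diagonal([set(), {0, 1, 2}, {0, 1, 2}])  # second example
# Element 0 lies in every set, so e.g. {0, 2} is never a diagonal:
assert not pan_diagonal([{0}, {0, 1}, {0, 1, 2}])
```

On the first example only the six permutations are admissible, and the computed diagonals match the permutation columns discussed above.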
Which functions $f$ have this “pan-diagonal” property? If some element $u$ belongs to every set $f(y)$ then we can never get $u$ into any diagonal, so except for some small-$n$ cases the answer is ‘no’ for such $f$. Note that the complementary function $y \mapsto U \setminus f(y)$ gives equivalent results by duality. Hence the answer is also generally ‘no’ when some element is excluded from all subsets $f(y)$. This extends to say that if some $k$ elements are collectively included in fewer than $k$ of the subsets $f(y)$, and $f$ is one-to-one, then there are not enough distinct indices to go around, so it is impossible to get $\emptyset$ as a diagonal, and the answer is ‘no’ unless $\emptyset$ is already part of the range. Note, however, that our second example shows that success is still possible if $f$ is not 1-to-1.
Is there an easily-stated criterion for $f$ to be pan-diagonal? Or is $\mathsf{NP}$-hardness lurking about? That is the puzzle. One can also pose it for infinite $U$, when the range of $f$ is not closed under finite differences.
How powerful is diagonalization? Does the puzzle have a simple answer?
We also wish everyone a Happy Thanksgiving.