
Happy Un-Birthday Rich

January 29, 2021

Still going strong

Richard DeMillo just turned 74 years old the other day. Happy Birthday Rich, and many more.

Today I want to wish him also a happy un-birthday.

Science Advisor

January 21, 2021

Resources for a new term from our vantage point

Crop from Broad Institute

Eric Lander has been appointed director of the US Office of Science and Technology Policy, a post newly elevated to Cabinet status. He is thus the top science advisor in Joe Biden’s cabinet.

Today we congratulate Eric, whom we both have known personally, and survey some sources of science advice at his career’s roots.

Edmund Clarke, 1945–2020

January 18, 2021

If only model checking could fight the virus

Edmund Clarke passed away due to COVID-19 complications. Ed will be missed. He was a winner of the 2007 Turing Award for his seminal work on model checking, joint with Allen Emerson and Joseph Sifakis. They were cited for “developing Model Checking into a highly effective verification technology that is widely adopted in the hardware and software industries.”

Today I thought we might highlight some of his contributions.

Priming Random Restrictions

January 14, 2021

Can we expand the base of their use?

Technion commemoration

Bella Subbotovskaya was doing Boolean complexity lower bounds in 1961. She originated the method of restrictions of Boolean functions and proved that the formula size of the parity function of {n} variables is at least {n^{3/2}}. The parity bound was improved to quadratic in 1971 by Valeriy Khrapchenko, who passed away only last year and was the only child of the Soviet literature and arts leader Mikhail Khrapchenko. Then Johan Håstad strengthened the whole method to quadratic.

Today we discuss not possible further sharpenings but rather how to extend the ways of applying it.

Subbotovskaya was most unfortunately killed in a mysterious manner in 1982. George Szpiro wrote an article in 2007 reviewing the circumstances, including her questioning by the KGB for co-founding and hosting the “Jewish People’s University” amid rampant anti-Semitism in Soviet academies. She was also a talented violist and vocalist and investigated computer-assisted musical composition and analysis. See also her MacTutor biography.

As presented by Stasys Jukna in his 2012 book Boolean Function Complexity, Subbotovskaya gave an algorithm for optimizing the trimming of a formula over the {\wedge,\vee,\neg} basis at its leaves by substituting for {m} of the {n} variables, one at a time. She used an averaging argument that readily extends to give the same-order bounds on random restrictions. Random restrictions have other advantages that have led to wonderful work besides those above, including the recent oracle separation of {\mathsf{BQP}} from the polynomial hierarchy. We want to discuss targeting the restrictions toward possible particular sensitivities of the Boolean functions computed by the formulas.

The Method

Subbotovskaya’s argument considers every {\wedge} or {\vee} gate {g}, in a minimal formula {\phi} computing the function {f(x_1,\dots,x_n)}, that has {x_i} or {\bar{x}_i} as an input, for some {i = 1} to {n}. Note that because {\phi} is a formula, one of the two possible assignments {a_i} to {x_i} renders the other input {T_{g,i}} to {g} immaterial. If {\phi} has {s_n} leaves, then we can not only choose {i} so that {x_i} or {\bar{x}_i} occurs at least {\frac{s_n}{n}} times, we can set it to wipe out at least half of the subtrees {T_{g,i}}.

The final note is that doing so does not double-cancel any variable: if {T_{g,i}} contains {x_i} or {\bar{x}_i}, then assigning {1 - a_i} to that occurrence leaves the function computed by {g} unchanged—but that contradicts the minimality of {\phi}. Hence no subformula {T_{g,i}} contains {x_i} or {\bar{x}_i}. The upshot is that we can set {x_i} so that the resulting formula {\phi'} in {n-1} variables has {s_{n-1} \leq s_n - \frac{3s_n}{2n}} leaves. It is convenient to weaken this inequality to

\displaystyle  s_{n-1} \leq s_n\left(1 - \frac{3}{2}\frac{1}{n}\right) \leq s_n\left(1 - \frac{1}{n}\right)^{3/2},

so that by iterating we can exploit the identity that for any {k \leq n},

\displaystyle  \left(1 - \frac{1}{n}\right)\left(1 - \frac{1}{n-1}\right)\cdots\left(1 - \frac{1}{k+1}\right) = \frac{k}{n}.
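As a quick sanity check (our own illustration, not part of the original argument), both the telescoping identity and the weakening step above can be verified numerically in a few lines of Python:

```python
from math import prod

def telescoped(n, k):
    # Product of (1 - 1/j) for j = k+1, ..., n; it telescopes to k/n.
    return prod(1 - 1 / j for j in range(k + 1, n + 1))

# The telescoping identity, checked numerically for a few (n, k) pairs.
for n, k in [(10, 1), (100, 7), (1000, 250)]:
    assert abs(telescoped(n, k) - k / n) < 1e-9

# The weakening step: 1 - (3/2)(1/n) <= (1 - 1/n)^(3/2) for n >= 2.
assert all(1 - 1.5 / n <= (1 - 1 / n) ** 1.5 for n in range(2, 10000))
```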

The upshot is that for any {m}, we can set {n-m} variables so that the size {s_{m}} of the resulting formula in {m} variables obeys the bound

\displaystyle  s_{m} \leq s_n\left(\frac{m}{n}\right)^{3/2},\qquad\text{so}\qquad s_n \geq s_{m}\left(\frac{n}{m}\right)^{3/2}.

Because the parity function is such that any way of setting {n-1} variables leaves the formula needing (at least) one read of the remaining variable, we can carry this all the way down to {m=1} to deduce {s_n \geq n^{3/2}}. Note: this is concrete, not asymptotic.

Via induction on gates {g} deeper than those at the leaves, and techniques that adapt to how closely the two subtrees of {g} are balanced, Håstad increased the exponent from {3/2} to {2} in all cases, on pain of carrying some lower-order terms that make the bounds asymptotic. His lemmas rely more on the restrictions being random, including one that estimates the probability that the restriction already leaves a formula dependent on only one variable.

An Intuitive Framing

For the sake of exposition we wave away the lower-order terms and write {\gtrdot} for the resulting numerical inequalities. We frame the restrictions with notation that can help us target subsets {I} of variables to restrict, outside of which {f} remains sensitive.

Definition 1 Let {f : \{0,1\}^n \rightarrow \{0,1\}} be a Boolean function. We say that

\displaystyle  \textsf{free}(f,I,\alpha)

holds provided {I \subset [n]} and {\alpha: I \rightarrow \{0,1\}} and for some {x \neq y} in {\{0,1\}^n} the following hold:

  1. The inequality {f(x) \neq f(y)} holds.

  2. For all {k \in I},

    \displaystyle  x_k = y_k = \alpha(k).

Thus {\textsf{free}(f,I,\alpha)} holds provided the value {f(x)} is not determined by just knowing {x} on the set of positions in {I}. The remaining {m} positions are needed.
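Definition 1 is easy to test by brute force for small {n}. The following Python sketch (the function name `free` is ours, mirroring the notation) checks that parity satisfies {\textsf{free}(f,I,\alpha)} for every {I} of size {n-1} and every {\alpha}:

```python
from itertools import combinations, product

def free(f, n, I, alpha):
    # free(f, I, alpha): knowing x only on the positions in I (set to alpha)
    # does not determine f(x); positions are 1-based as in the text.
    seen = {f(x) for x in product([0, 1], repeat=n)
            if all(x[k - 1] == alpha[k] for k in I)}
    return len(seen) == 2

parity = lambda x: sum(x) % 2

# For parity, free holds for every I of size n-1 and every alpha.
n = 5
assert all(
    free(parity, n, I, dict(zip(I, bits)))
    for I in combinations(range(1, n + 1), n - 1)
    for bits in product([0, 1], repeat=n - 1)
)
```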

Theorem 2 (Main Theorem of Random Restriction Theory) Let {f : \{0,1\}^n \rightarrow \{0,1\}} be a Boolean function. Suppose that for a constant positive fraction of {I,\alpha} with {I} of size {n-m},

\displaystyle  \textsf{free}(f,I,\alpha)

holds. Then

\displaystyle  L(f) \gtrdot \frac{n^2}{m^2}.

An Example and Nonexample

To see how this scheme applies in the case of the parity function, we can set {m=1}. No matter which inputs are selected, the value of the function stays undetermined after {n-1} inputs are set. That is, the following is always true:

\displaystyle  \textsf{free}(\textit{parity},I,\alpha),

for all {I} of size at most {n-1} and all {\alpha}. Thus the main theorem says that the formula lower bound for the parity function is order {n^2}.

Note that the only property of the parity function {f} being used is the global way it satisfies {\textsf{free}(f,I,\alpha)}. We want to look for other patterns that can yield good bounds. Of course, patterns of sensitivity of Boolean functions have been looked at in many ways. We want to try to build a bridge from them to number theory.

For a non-example, consider the set {E} of even numbers. Let us number the variables so that {x_1} stands for the ones place in binary notation, {x_2} for the twos place, and so on. Then {E} depends only on {x_1}. Whether {\textsf{free}(f,I,\alpha)} holds depends entirely on {1} not belonging to {I}. If {m = n/c} for some {c > 1} then it holds with constant probability, and the framework gives

\displaystyle  L(E) \gtrdot c^2.

Well, OK, we really know {L(E) = 1} so this is where the overhead hidden by {\gtrdot} can’t be ignored. But at least this illustrates how the framework doesn’t get too excited about a frequently alternating numerical function. Similar remarks apply when the function is multiples of {2^k} where {k} is small.
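This non-example can also be checked by brute force. A small self-contained Python sketch (function names are ours) confirms that {\textsf{free}(E,I,\alpha)} holds exactly when position {1} lies outside {I}:

```python
from itertools import combinations, product

def free(f, n, I, alpha):
    # As in Definition 1: f is undetermined given x restricted to I.
    seen = {f(x) for x in product([0, 1], repeat=n)
            if all(x[k - 1] == alpha[k] for k in I)}
    return len(seen) == 2

# E = even numbers: x is even iff its ones bit x_1 equals 0.
even = lambda x: 1 - x[0]

n = 6
for I in combinations(range(1, n + 1), n // 2):
    for bits in product([0, 1], repeat=n // 2):
        # free(E, I, alpha) holds exactly when position 1 is left unset.
        assert free(even, n, I, dict(zip(I, bits))) == (1 not in I)
```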

A Prime Objective

We would like to apply the random restrictions method to the Boolean function {\pi(x)} that identifies whether the {n}-bit binary string {x} denotes a prime number. It has been known for decades that the Boolean circuit complexity of {\pi} is polynomial. This follows from the existence of a polynomial-time randomized algorithm for {\pi}, due to Solovay-Strassen or the later Miller-Rabin algorithm, and from Adleman’s connection between such algorithms and Boolean circuit complexity.

There are strong lower bounds in special circuit models, but nothing so general as formulas, let alone unrestricted circuits. Certainly {\pi} is a harder function than parity, but we don’t know of a quadratic formula lower bound for it. Can we prove one in the framework?

Our intuition comes from considering various ways in which the primes are dense and uniformly—but not too regularly—distributed. Two simple ways are:

  1. The average size gap between prime numbers is {O(\log n)}.

  2. The maximum size gap between prime numbers is {O(\log^{2} n)}.

The first is of course a consequence of the prime number theorem. The second follows from the widely believed Cramér conjecture.
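Both density statements are easy to probe empirically. Here is a small Python sketch (our own illustration) that sieves the primes up to {10^6} and compares the average and maximum gaps against {\log N} and {\log^2 N}:

```python
from math import log

def primes_up_to(N):
    # Simple sieve of Eratosthenes.
    sieve = bytearray([1]) * (N + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(N ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [i for i, b in enumerate(sieve) if b]

N = 10 ** 6
ps = primes_up_to(N)
gaps = [q - p for p, q in zip(ps, ps[1:])]
# The average gap tracks log N; the maximum gap stays well below (log N)^2.
assert sum(gaps) / len(gaps) < 2 * log(N)
assert max(gaps) < log(N) ** 2
```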

We will not really need this conjecture to hold everywhere. Our basic framework only needs a positive-fraction version, and under a more liberal notion of “gap”: it should hold for a positive fraction of the small intervals of numbers that we sample.

Here is an outline to try to prove a formula size lower bound of {n^{2-\epsilon}} on {\pi(x)}, for some {\epsilon > 0}. Let

\displaystyle  t = n^{1-\epsilon}.

We will use the random restriction method with {p = 1/t}. Apply the method. The issue now is: if we leave only {m} input bits unset, what is the probability that

\displaystyle  \textsf{free}(\pi,I,\alpha)?

Certainly we cannot always leave just {1} bit unset, as we could with the XOR function. Consider:

\displaystyle  11*111.

The input is either

\displaystyle  111111 = 63 \text{ or } 110111 = 55 .

Neither is prime. In general, our finger has come to rest on the question of what is known or can be shown about the behavior of the set of primes on “hypercubic” sets—that is, sets that add or subtract all sums from a selected subset of the powers of {2}.
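The {11*111} example can be verified directly. This tiny Python sketch (ours) enumerates both completions of the starred position and confirms neither is prime:

```python
def is_prime(m):
    # Trial division; fine for the small numbers in this illustration.
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

# The pattern 11*111: every bit fixed except the starred position.
completions = [int("11" + b + "111", 2) for b in "01"]
assert completions == [55, 63]                    # 110111 and 111111
assert not any(is_prime(m) for m in completions)  # 55 = 5*11, 63 = 7*9
```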

Open Problems

There must be other lower bounds on the primes. We cannot seem to find them. Any help?

Predictions For 2021

January 5, 2021

This year is a Blum integer, 43*47

2017 article, not via Zoom

Allan Lichtman correctly predicted the 2020 presidential election, based on a 7-6 edge in “keys” to Joe Biden. His model, which he developed in 1981 with the late Vladimir Keilis-Borok, simply says that the incumbent party will lose the presidential election if 6 or more keys go against them. It has been right in every election since 1984.

Today we make some predictions for the new year, 2021.

Peter M. Neumann, 1940–2020

January 1, 2021

Memories from Oxford days

Mathematical Institute

Peter Neumann, Professor of Mathematics at Queen’s College, Oxford University, passed away two weeks ago. Last Monday would have been his 80th birthday.

Today we join others mourning in private and remembering in public, including the authors of these stories and memories, who are among Peter’s 39 graduated doctoral students.

P < NP In Our Stockings?

December 24, 2020

Brute force wins—sometimes

Santa Claus is on the way to visit those of us who have been good.

Tonight is Christmas Eve, and we want to thank everyone who has been supporting Ken and me at GLL.

Database and Theory

December 17, 2020

A dean of engineering

Jennifer Widom was appointed as the Dean of Engineering at Stanford not long ago. No doubt her research in databases and her teaching played a major role in her selection. For example, she is known for her contributions to online education, including teaching one of the first massive open online courses (MOOCs), which had over 100,000 students.

Today I thought we might highlight her area, databases, and one of its connections to theory.

But first, I was impressed to hear her say in an ACM podcast that she used punch cards during the start of her career:

So my passion in high school actually became music, and when it came time to select a college, I chose to go to music school. I’m pretty sure at this point, I’m the only dean of engineering anywhere who has a bachelor’s degree in trumpet performance, but that’s actually what my undergraduate degree is in. Late in my music education, I just sort of randomly took a class called computer applications and music research. And it was a class in the music school about using programming to analyze music, and it was my first exposure to computer programming. I have to say, it’ll reveal my age, but I used punch cards in that class. It was sort of the end of the punch card era.

I admit to having used these too at the start of my days as an undergrad at Case Institute. Do you know what a punch card is?

Ken once had his own experience with punch cards and music. While taking an intro programming course during his sophomore year—taught then by Jeffrey Ullman—Ken and his roommate joined an excursion to NYC for dinner and the New York Philharmonic. Still dressed in suits, on return to Princeton, they descended into the computer center to do their assignment—via punch cards on a batch server. Ken’s roommate got out before 2am, but Ken says he was still fixing punch card typos that ruined runs as late as 4am.

Data Management

Widom’s research has always been in the area of non-traditional databases: semi-structured data, data streams, uncertain data and data provenance.

We theorists view the world as made of questions like:

  • Upper Bounds Can we find a faster algorithm for problem X?
  • Lower Bounds Can we prove there are no faster algorithms for X?

Sometimes we replace “faster” by other complexity measures, but these are the dominant questions we ask.

Widom proposed a notion that is called lineage. Theorists worry about the performance of searching, but this issue asks where the data came from. See also provenance.

The insight boils down to garbage in, garbage out, or GIGO:

On two occasions I have been asked, “Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?”

Charles Babbage (1864).

The issue of lineage and provenance is exactly this made modern. How can one track where the data comes from? Of course, finding fast algorithms to do the tracking is still an important question. Oh well.

Databases Meet Theory

Of course databases store and allow us to retrieve information—they are everywhere these days. One reason for their dominance is that they support complex queries. For example, we can use databases to obtain all {x} so that {R(x,y)} is true for some {y}:

\displaystyle \{ x \mid \exists y R(x,y) \}.

From a theory point of view any first-order query is allowed.

But searches often use approximate matches. Thus one might wish for all {R(x',y)} so that {x'} is near {x}. We might want all {R(x',y)} so that {x'} is not just “Michael” but also includes “Micheal”. This type of error is not too hard to handle, but a worse type is

\displaystyle Michael \iff Michal.

Formally, changing a letter is not as difficult as deleting or adding letters. This is where the edit distance between two strings comes in: it is defined as the minimum number of insertions, deletions, or substitutions of symbols needed to transform one string into the other. An important application is to biology.
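The edit distance just described has a classic quadratic-time dynamic program. Here is a compact Python version (our own illustration), tried on the examples above:

```python
def edit_distance(s, t):
    # Classic O(mn) dynamic program, kept to one rolling row.
    m, n = len(s), len(t)
    d = list(range(n + 1))             # distances for the empty prefix of s
    for i in range(1, m + 1):
        prev, d[0] = d[0], i           # prev holds the diagonal entry
        for j in range(1, n + 1):
            cur = min(d[j] + 1,        # delete s[i-1]
                      d[j - 1] + 1,    # insert t[j-1]
                      prev + (s[i - 1] != t[j - 1]))  # substitute or match
            prev, d[j] = d[j], cur
    return d[n]

assert edit_distance("Michael", "Michal") == 1   # one deletion
assert edit_distance("Michael", "Micheal") == 2  # a transposition costs two
```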

This leads into the more-general notion of entity resolution (ER). Widom co-leads the Stanford Entity Resolution Framework (SERF) with Hector Garcia-Molina; they were two of the authors on its earliest paper in 2006. The page highlights the lexical nature of ER:

For instance, two records on the same person may provide different name spellings, and addresses may differ. The goal of ER is to “resolve” entities, by identifying the records that represent the same entity and reconciling them to obtain one record per entity.

Their approach “involves functions that `match’ records (i.e., decide whether they represent the same entity) and `merge’ them.” Their framework has an outer layer that treats the functions as generic. Widom’s emphasis has been on axiomatizing properties of the functions to enable efficient and extensible manipulation by the outer layer.

Necessarily under the hood is a measure of edit distance. Ken recently monitored a large tournament where two players had the same surname and first three prenames, differing only in the last four letters of their last prename. The widget used to download the game file originally given to Ken truncated the names so that there seemed to be one person playing twice as many games. They could also have been distinguished by other database information. More common is when the same player is referenced with different spellings, such as (grandmaster Artur) Jussupow or Yusupov. The similarity metric needs to be extended beyond the name. This is when computation time becomes more of a factor and theory enters in.
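Here is a minimal Python sketch of the generic match/merge pattern. The `match` and `merge` rules below are hypothetical stand-ins, not the actual SERF functions; the point is that the outer loop is agnostic to what they do:

```python
def match(r1, r2):
    # Hypothetical rule: same email, or same name and city.
    return r1["email"] == r2["email"] or (
        r1["name"] == r2["name"] and r1["city"] == r2["city"])

def merge(r1, r2):
    # Union the information, preferring non-empty fields of the newer record.
    return {k: r1[k] or r2[k] for k in r1}

def resolve(records):
    # Outer layer: fold each record into the first entity it matches.
    out = []
    for r in records:
        for i, s in enumerate(out):
            if match(r, s):
                out[i] = merge(r, s)
                break
        else:
            out.append(r)
    return out

records = [
    {"name": "A. Smith", "city": "Buffalo", "email": "as@x.org"},
    {"name": "Alice Smith", "city": "", "email": "as@x.org"},
    {"name": "Alice Smith", "city": "Buffalo", "email": ""},
]
assert len(resolve(records)) == 1  # all three resolve to one entity
```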

Edit Distance

Computing the edit distance between two strings is a long-studied problem. The best algorithm known runs in order quadratic time, as we covered here. There are better running times if we allow approximate algorithms or quantum algorithms. See this for the latter and this post for work related to the former.

Theorists have tried to improve the time needed for classical exact edit distance. It remains at quadratic time, however, and there is evidence against doing better, which we have remarked is puzzling. One of the strange traits of theorists is that when something is hard we try to exploit the hardness.

When life gives you lemons, make lemonade.

For instance, factoring integers is hard—so let us make crypto systems. The lemonade in the case of edit distance is a connection to the hardness of really hard problems. To quote the seminal paper by Arturs Backurs and Piotr Indyk:

In this paper we provide evidence that the near-quadratic running time bounds known for the problem of computing edit distance might be tight. Specifically, we show that, if the edit distance can be computed in time {O(n^{2-\epsilon})} for some constant {\epsilon > 0}, then the satisfiability of conjunctive normal form formulas with {N} variables and {M} clauses can be solved in time {M^{O(1)} 2^{(1-\delta)N}} for a constant {\delta > 0}. The latter result would violate the Strong Exponential Time Hypothesis, which postulates that such algorithms do not exist.

Edit Distance—Searching

An open problem that is closer to databases is not computing edit distance itself. Instead, the problem is how to search for strings that are near a given string. This can be done generically using an edit-distance function. A larger question is whether it can be approached more directly.

I believe that while edit distance may indeed require quadratic time, it is still open how fast searches for approximate matches can be done. The problem: given a large collection of strings {S}, how fast can we find the closest string in the collection to some word {w}? That the best algorithms for edit distance are probably quadratic does not—I believe—entail that this search question is difficult.
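The generic baseline is a linear scan with one quadratic-time comparison per string; the open question is whether preprocessing the collection can beat this. A sketch of the baseline (our own illustration):

```python
def edit_distance(s, t):
    # Standard one-row dynamic program for Levenshtein distance.
    m, n = len(s), len(t)
    d = list(range(n + 1))
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                   prev + (s[i - 1] != t[j - 1]))
    return d[n]

def nearest(collection, w):
    # Baseline search: scan the whole collection, no preprocessing.
    return min(collection, key=lambda s: edit_distance(s, w))

S = ["Jussupow", "Yusupov", "Karpov", "Kasparov"]
assert nearest(S, "Jusupov") == "Yusupov"
```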

Open Problems

By the way, see this for other open problems in databases.


Is The End Near?

December 12, 2020

Queen’s Gambit and more

Kenneth Regan is my partner here at GLL and a dear friend.

He is a longtime faculty member in computer science at the University at Buffalo, a longtime fan of the Buffalo NFL football team, and a longtime international master of chess. He is also one of the world experts on detecting cheating at chess. Players cheat by making their moves based on computer programs instead of relying on their brains. The key issue is that computer programs play chess better than any human.

Today I thought we might send Ken a tip-of-the-hat for being in the Wall Street Journal.

That is, he is featured in a WSJ article called “Queen’s Gambit” about chess cheating.

This is not his first mention in the WSJ—see here. See this for additional comments by Cory Nealon, who says:

The game of chess is no stranger to cheating. But since the pandemic hit and tournaments have moved online, the services of chess detectives have never been more in demand.

Ken is perhaps the world’s most feared chess detective.

Beyond Chess

Chess is important, but it is a game. Yet chess cheating may be an early warning of things to come. Might robotic methods in other areas soon start to dominate humans? Might chess be one of the first areas to be lost to robots?

A 1970 movie that comes to mind is Colossus: The Forbin Project. The plot concerns what happens when a computer system, Colossus, takes over the world. Its creator, Dr. Forbin, has the following exchange with it at the end of the movie:

Colossus later tells Forbin that the world, now freed from war, will create a new human millennium that will raise mankind to new heights, but only under its absolute rule. Colossus informs Forbin that “freedom is an illusion” and that “in time you will come to regard me not only with respect and awe, but with love”. Forbin responds, “Never!”

I think there is reason to worry about computers like Colossus wiping us out. Perhaps not taking absolute control, but nevertheless making us second-class citizens. I wonder if the following areas could be potential ones where we lose out to computers:

{\bullet } Protein folding? They already are doing quite well.

{\bullet } Play calling in the NFL? I made this one up, but I do wonder whether, using all the statistics available, programs could do a better job than players and coaches. They probably could already do better than the Jets defensive coach, who was recently fired.

{\bullet } Solving open math problems? We recently talked about Lean, the proof-checking system. But what if Lean could start to conjecture, to suggest directions, and to give proof outlines? This would be, I believe, much more disruptive than the ability to check proofs.

Open Problems

What are some areas that you think could be next? That could be taken over by robots?
Moreover, as more areas are dominated by robots, we will need more Kens to stop cheaters.

Thanks again Ken.

The Future of Mathematics?

December 10, 2020

Proving proofs are proven

Kevin Buzzard is Professor of Pure Mathematics at Imperial College London. Wikipedia says that he specialises in algebraic number theory, rather than “specializes.” Is this because he was born in Britain, works in Britain, both, or could it be neither?

This month, he has an article in the AMS Notices that highlights the system L\exists \forall N. We thought we’d discuss L\exists \forall N and its potential to change math—or maths—as we know it.

For starters we should say what L\exists \forall N is:

Lean is an open source theorem prover and programming language being developed at Microsoft Research. Lean aims to bridge the gap between interactive and automated theorem proving, by situating automated tools and methods in a framework that supports user interaction and the construction of fully specified axiomatic proofs.

This wording should perk up the ears of us complexity theorists. We know that polynomial-size interactive proofs have the power of {\mathsf{PSPACE}} for one prover, {\mathsf{NEXP}} for two, and the Halting Problem for two quantum provers, all of which are believed to be bigger than {\mathsf{NP}}, which corresponds to non-interactive proofs. We believe that our concrete capabilities are shaped by this abstract lay-of-the-land, though this is a meta-scientific assertion that we could never try to prove. What Buzzard’s article and L\exists \forall N are on about is the scientific state of what we do try to prove, or think we have proved.

Aftermath as We Know It

Buzzard spoke early this year at the second annual “Lean Together” segment of the Formal Methods in Mathematics conference at CMU.

His talk was not so much about details of L\exists \forall N but about the fact of errors and specter of gaps in published mathematics. While a student of Richard Taylor, he had a front-row seat to a major instance. As he relates on a slide marked Gaps in big letters:

  • In 1993, Andrew Wiles announced a proof of Fermat’s Last Theorem. There was a gap in the proof.
  • In 1994, Wiles and Taylor fixed the gap, the papers were published, and our community accepted the proof.
  • In 1995, I pointed out to Taylor that the proof used work of Gross which was known to be incomplete.
    Taylor told me it was OK, because he knew another argument which avoided Gross’s work completely.

He also highlights the current unfinished state of the classification theorem for finite simple groups. The process of writing its proof may outlive all of the first generation of provers. We at GLL have accepted it almost all of our working lives. The “after math” can be harder than the initial publication even in the traditional process, let alone the totally free dissemination we have today.

Buzzard points out that some long proofs have now been written in L\exists \forall N or related systems. These include the famous Feit-Thompson odd order theorem in Coq, and the 2017 Ellenberg-Gijswijt proof of the cap set conjecture. This shows that complex arguments can be written in systems like L\exists \forall N and others. He maintains his own blog, “The Xena Project,” on practical and educational uses of L\exists \forall N and similar software.

Really Proved?

But there is a problem with L\exists \forall N and all the other similar systems.

Their correct proofs may be incorrect.

L\exists \forall N is a large software system and may have bugs. This is of course a major concern, but perhaps one that can be handled. Perhaps not.

I thought about this issue years ago when people argued for verification of operating systems to prove that they had no errors. They wanted to argue that software systems could be error-free only by making a formal proof that they were correct. Over forty years ago, Rich DeMillo and Alan Perlis and I wrote a paper on this issue. Our paper starts with

Many people have argued that computer programming should strive to become more like mathematics. Maybe so, but not in the way they seem to think.

We go on to argue against formal verification as the answer to correctness. It is interesting to see that now the argument is flipped: Should mathematics become more like programming?

Ken opines that a bug in L\exists \forall N would probably be more catchable because it would likely replicate. Detecting a mistake in a proof ought by nature to be easier than telling whether a tactic for building a proof is likely to succeed.

Leaning Together at CMU

Leaning Into L\exists \forall N

The fundamental idea of L\exists \forall N is to create a language that does make proofs more like programming. The key is that a proof written in L\exists \forall N can be checked automatically. This is like saying that a program in Python can be executed.

The project that Buzzard and others are interested in is several-fold. One aspect is that L\exists \forall N can be a teaching tool. Another is that it can help us ensure that the proofs we write down are correct.

The use of L\exists \forall N is akin to LaTeX. Most of us now write our proofs in LaTeX or some similar typesetting language. We love and hate the issues involved in getting the LaTeX to compile and to look good. Buzzard argues that adding the new ability to be sure that one’s proof is solid is a great leap. LaTeX lets us be sure the proof looks good, while L\exists \forall N allows us to be sure it is correct.
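To give a feel for what “checked automatically” means, here is a toy Lean 4 sketch of our own devising (not from Buzzard’s article, and assuming a recent Lean 4 toolchain). If the file compiles, the kernel has certified the proofs:

```lean
-- If this file compiles, Lean's kernel has certified the proofs,
-- with nothing further to trust but Lean itself.
example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- The same fact proved in the usual interactive style, via a tactic:
example (a b : Nat) : a + b = b + a := by
  omega
```

The first proof supplies a library lemma directly; the second asks the `omega` decision procedure to find the proof, which is the interactive, assistant-like mode of working that the text describes.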

We have generally taken a reserved attitude toward formal methods on this blog—or maybe not so reserved. We keep our reservations expressed above. But what may help L\exists \forall N overcome them is its emphasis on interaction—on being an assistant. Maybe writing a proof in L\exists \forall N won’t be like taking castor oil proverbially used to be—it might be more like Marmite.

Open Problems

Two final thoughts. For the main one: Proofs are not as important as discoveries. Can we use L\exists \forall N to suggest ideas, open problems, approaches, and more? The protein-folding program we discussed recently does not claim to prove its folds are correct, but it seems useful. If L\exists \forall N could help make guesses or provide intuition, that would be valuable. Perhaps more valuable than assuring that a long proof is correct.

Another possible idea: How about we insist that claimed proofs of P{=}NP or other famous conjectures must be presented using L\exists \forall N? The “claimed” proofs are all relatively short, so it would not be too hard for the authors of these results to write them out in the L\exists \forall N system. They already LaTeX them, so why not? This would have a double benefit: The false claims that are made about famous problems might decrease; and the chance that a correct proof would not be overlooked might increase. The cost would be the time to write up the L\exists \forall N proof.

Buzzard is a colorful (or colourful?) character, and may also be gaining a reputation for being well ahead of the times. In 2017 he gave a talk on P{=}NP for the Royal Institution, but what we note in the video is how he was presciently dressed for Zoom.
