Can a mathematician program beat humans?

Watson is the IBM program that just convincingly won on “Jeopardy!” last night. By the way Bill Murdock of IBM, who is on the Watson project, is speaking today at Tech.

Today I am wondering if we are all about to become obsolete as mathematicians.

February 10, 1996: Deep Blue, the IBM chess playing machine, beat Garry Kasparov, who was then the world champion, in the first game of their first match. Kasparov came back to win that match, but his number was up the following year: on May 11, 1997, Deep Blue trounced him in 19 moves to win the second match. Since Kasparov is often considered to be the strongest player ever, this was a major achievement. Deep Blue was massively parallel and used 480 special purpose chips, yet it is believed to have been weaker than a program you can run today on your laptop. For example Rybka, a commercial program, has a projected Elo rating just over 3200 on a dual-core system, while Kasparov’s top rating was 2851. In chess this rating differential means that Kasparov would be fortunate to score 20% versus Rybka.

February 16, 2011: Watson, the IBM question answering machine, beat the two top Jeopardy! champions of all time: Ken Jennings and Brad Rutter. A guess or prediction might be: in less than ten years there will be a laptop program that will be even better at answering general knowledge questions than Watson. Actually the program by then might not even need a laptop but will be able to run on some other personal device—will we even have laptops in ten years?

A Prediction

Could this be next—note I tabbed IBM’s breakthroughs as occurring every fifteen years and six days:

February 22, 2026: Thomas J., the IBM automated mathematician, has just claimed to have solved the Riemann Hypothesis. Experts have started to look at the paper that Thomas J. has submitted to the math archive. The paper is over 100 pages, but is well written says one of the top number theorists in the world, Peter Smith. He goes on to explain that while the proof is not yet checked it contains a number of very interesting new ideas ${\dots}$

IBM’s Next Project?

I am serious that Thomas J. could be a real project. Note I am not suggesting that they create a perfect mathematician, anymore than they created a perfect chess player or a perfect “Jeopardy!” player. Deep Blue made mistakes and lost games. It did not solve chess. It did not have to solve chess or play perfectly. Watson also buzzed in late sometimes or even made incorrect answers. It did not solve Jeopardy!. It also did not have to.

I think the big insight in both these brilliant projects and achievements is the point that a program can be immensely powerful even if it is imperfect. In the past people have thought about automated math programs, but I believe that most ideas were around formal proof systems. What intrigues me is that perhaps the goal is quite different: build a math program that makes mistakes, can give false proofs, can make errors. Just like real mathematicians. But such a machine could still be of immense value and importance.

For Ken Regan the contest’s most iconic moment came with the first “Final Jeopardy!” question, in a category titled “U.S. Cities.” The question itself did not specify a U.S. city, and Watson responded “Toronto?????” with five question marks signifying its uncertainty. Apparently it’s not the case that Watson did not take the category title into account. According to CMU professor Eric Nyberg as quoted here,

“A human would have considered Toronto and discarded it because it is a Canadian city, not a U.S. one, but that’s not the type of comparative knowledge Watson has.”

Whatever type it has, it was imperfect, yet that may have made it more powerful.

How To Do This?

I really do not know how IBM, or anyone, could build a mathematician program. I guess if I thought I did I would go off and start building one. But I can imagine several components that could be part of such a machine:

${\bullet }$ Vast knowledge: Clearly such a machine could have stored all math papers every written. All of the archives, all of every journal, all papers, all books. Some rating system might be needed to be careful about mistakes and false proofs. But unlike humans the machine could easily have stored all theorems, lemmas, and definitions ever written. Note that one project for doing this failed, but the two reasons given there might be helped by a Watson-like change of view.

${\bullet }$ Vast experimental ability: Many of the greatest mathematicians have been great calculators. Clearly such a machine could easily try vast numbers of examples in an attempt to make good conjectures, and to rule out false theorems. If it is working on a group theory problem, why not try it instantly on all groups of order less than a million?

${\bullet }$ Vast tools: Clearly such a machine could use certain formal tools. I do not believe it must create a formal proof, but it can use certain formal tools. If it needs the sum of some complex series it should use the best known tools available.

The machine also need not be viewed as stand-alone. The article then quotes Professor James Hendler of RPI,

“A human working with Watson can get a better answer. Using what humans are good at and what Watson is good at, together we can build systems that solve problems that neither of us can solve alone.”

Open Problems

Is this a possible goal? Note the one key insight that I feel the need to repeat: Thomas J. must not be perfect. It must not only put out correct formal proofs. It must be allowed to put out papers like we do. And these must be right often enough to be interesting, but they need not be perfect.

1. February 17, 2011 2:44 pm

In a real sense I think AI is already here and has been with us for a while. Into computers we embed larger and larger amounts of knowledge and expand our ways to utilize that information. We could perceive that collection as being some type of ‘artificial’ access to our ‘intelligence’. What’s changing now is that these collections of pre-computed knowledge are getting larger and more dynamic. We getting better at allowing us to interact with them.

As this progresses, we have less and less of the grunt work to do. Twenty years ago, I was able to get a symbolic algebra calculator like Maple to assist me with many of my calculus and algebra courses. While I focused on the direction, the code could easily do all of the leg work. The same is also true of arithmetic and calculators. Computers then, have always been tools to enhance our understanding and interaction with the world around us, and take away some of the more-repetitive work. I suspect that we’ve only tapped into a tiny fraction of their total capabilities.

Right now, our experts spend a long time learning some fraction of the existing knowledge base, and are limited by what they know and what they remember. In the long run, I can imagine a world where anyone will be able to tap into the collective knowledge of everyone who proceeded them, and get that ‘broader perspective’ in a form that is easily disgestable. Experts will still be required to apply that knowledge, but instead of relying on their subset, they’ll be able to work from the wisdom of the ages.

Paul.

• February 17, 2011 3:25 pm

In a real sense I think AI is already here and has been with us for a while.

No, no, no and NO!
Not to repeat a comment I made elsewhere:
Semantic entailment + supervised learning + a lot of data does not make for intelligence.

• February 17, 2011 3:37 pm

I don’t think you should classify the scope of intelligence by the sophistication of its underlying knowledge representation. To say that intelligence is entirely driven by the internal models used to accumulate knowledge seems overly limited. Intelligence — to me at least — is how, from the outside, something responds to the world around it. An intelligence being learns and adapts from its sensory input, getting ‘smarter’ as it learns more. If we found out that internally we stored everything as a massive linked list, would that change our classification (even if it didn’t change us)?

That being said, I do agree that what is holding us back right now is that we don’t know how to effectively model our knowledge and that we lack the algorithms to utilize that unknown model fast enough. Watson isn’t particularly intelligent, but it does certainly seem more intelligence than many of the creatures on our planet (although, hopefully not self-aware).

Paul.

• Type A for Anonymous permalink
February 17, 2011 6:00 pm

Can you give us a formal definition of intelligence so that we can evaluate whether an artificial version has been achieved?

• February 17, 2011 9:23 pm

An intelligence being learns and adapts from its sensory input, getting ‘smarter’ as it learns more.

It is at the “getting smarter” that the Watson architecture fails it can only get “larger” (in scope) .

Can you give us a formal definition of intelligence

If anyone could do that the “AI problem” would be solved, thus the criterion is:

Having a formal definition of intelligence and plausible evidence that it works.

It’s been “solved” since 1956 by Newell, Shaw, and Simon 😀

February 25, 2011 9:37 am

Don’t be pedantic. There are many definitions of AI (and no consensus at all!).

• February 25, 2011 11:53 am

Don’t be pedantic.

It is not so much pedantry than being irked by the marketing hype.
I have been promised that computers will be able to manage by themselves “pretty soon now” (in few years) since 1968 when I was taught programming, by IBM incidentally…

2. February 17, 2011 2:47 pm

I have been very favorably impressed that MathOverflow and TCS StackOverflow credit more reputation points to asking a good question than to providing a good answer.

By the same rationale, if Watson had composed those witty Jeopardy questions, wouldn’t that would have been far more impressive (even frightening!) than merely answering them?

Two good essays on this subject are De Milo, Lipton, and Perlis’ Social processes and proofs of theorems and programs (1979):

Since computers can write symbols and move them about with negligible expenditure of energy, it is tempting to leap to the conclusion that anything is possible in the symbolic realm. But reality does not yield so easily.

and Bill Thurston’s On proof and progress in mathematics (1994):

I have managed to avoid presenting … an unassailable and hard-to-learn theory with no practitioners to keep it alive and to make it grow.

Until the day approaches when Watson’s descendants have read, understand, and adapted their cognition to the implications of *these* two articles, there is perhaps not too much to worry about. 🙂

Vladimir Arnold was fond of quoting René Thom to similar effect: “One can always find imbeciles to prove theorems.” Perhaps the main lesson that Watson teaches us, is that the Arnold/Thom principle can be pushed very far.

February 17, 2011 2:58 pm

John, I think you underestimate what computers can do 🙂

Creating those witty questions I think would be very easy for a computer like Watson. Actually, I think it something that could probably be quite easily done on a desktop with access to the internet. It would be taking a set of connected facts and stringing them together in some way.

They way I might write this program is to start with an answer. Maybe I do this by randomly walking Wikipedia. Maybe the answer I run across is “grasshopper”. Now I do a search on the term and find various usages of it and then combine them in some well-defined ways. “This insect can also learn from a wiser master”.

OK, maybe not Jeopardy quality quite yet, but you can see where this is going.

As you know there are computer generated algorithms for creating pseudo-scientific papers. And even programs to create classical music (http://arstechnica.com/science/news/2009/09/virtual-composer-makes-beautiful-musicand-stirs-controversy.ars).

What is “understanding” and “intelligence” have alluded us for some time. I think as computers advance we will actually lose, not gain, confidence in how we define those terms.

• February 17, 2011 3:42 pm

Oh, I mainly agree with you, Ken (Jackson). Still, when we look at (for example) ChessBase’ Christmas Problems 2010, it is absolutely clear that these problems were *not* composed by computers … and in very many cases, could not be solved by computers either!

The visual puns alone, that make (for example) Pal Benko’s A Belated Christmas Present so hilariously wonderful, seem (to me) to be utterly beyond the ability of present-day chess computers to appreciate.

I’m very much hoping that Ken (Regan) will weigh-in on this! 🙂

• February 17, 2011 10:01 pm

Indeed! I’ve known Pal Benko for many years, and I’ve actually proposed writing an article for Chess Life completely on my games with him, which were (have been) singularly instructive. I started doing some of Benko’s puzzles as they appeared on Susan Polgar’s chess blog, but did not have time to keep up—nor with the ChessBase contest.

I can post a link to an “Anti-Computer” endgame problem I composed. Actually this was based on the King and Pawn ending of Michael Adams vs. Judit Polgar in the 2008 Wijk aan Zee Corus (now Tata) Steel tournament. In comments at that link you’ll find that I instantly refuted a long analysis for ChessBase by GM Mihail Marin, which they never acknowledged, and that I composed a more-extreme version which I posted on the Rybka Forums. It also links back to a replayable analysis file on my own site. However, it is possible that one could program a computer to compose an endgame study that is deliberately difficult in this way for computers to solve…

3. February 17, 2011 2:48 pm

That last comment about a computer mathematician being able to easily try a conjecture on all groups up to order N (for some large N) makes me green with envy – as a fledgling student of mathematics I’ve been discovering that being able to come up with examples/counter examples is one of the most important skills a mathematician can have.

February 17, 2011 9:59 pm

By the way, the bit about trying all groups of order less than a million is total fantasy. For example, there are about 50 billion non-isomorphic groups of order 2^10, and nobody could possibly list all the groups of order 2^20 (there are something like 10^178 of them). Furthermore, a lot of important groups are huge, so even a hypothetical list of groups of order up to a million would be of limited use.

However, if you’d like to play with a list of all groups of order up to 2000 (except for 1024), you can find it in GAP.

February 18, 2011 8:40 am

On number of groups.

Most groups are p-groups, even 2-groups. So they are solvable. Yes there are many of them but I do not think Thomas J. would have to brute force try all to check a question.

But still a nice point that there are many of them. It does raise a question: given a hypothesis in some group language how hard to check on all groups of size <n?

4. February 17, 2011 3:44 pm

I’d like to suggest the much less ambitious, although still extremely useful, goal of having a program which can read a real published math paper and translate it into a proof which can be understood by an extant proof checker, such as mizar.

Getting that to work in fifteen years sounds overly ambitious to me.

• February 17, 2011 6:06 pm

I would make it simpler: a program which can read a published math paper and fill in the gaps, or at least mark them down for later filling. This in less than 15 years is (99% sure) impossible.

Cheers,

Ruben

• February 17, 2011 8:56 pm

I think it would be impossible to accomplish what you said without being 99% of the way towards what I said.

February 18, 2011 8:43 am

Kevembuangga

The task for proof -> to formal proof is hard for humans. What I think might be just as cool is for Thomas J. to output only an informal proof. Let someone else fill in the gaps, let someone else debug it.

• February 18, 2011 8:57 am

What I think might be just as cool is for Thomas J. to output only an informal proof.

I think it’s just the opposite . 😉
The computer will need some help with the “direction” but will beat humans for the nitty gritty details.
It is called tactics in automated theorem proving, you tell the prover which tactics to use and it grinds on.
You don’t use automatic theorem provers, do you?

• February 20, 2011 2:46 am

Bram,

I have been proposing the exact same thing, since at least 2002… More recently, I proposed using machine translation techniques towards this end. Although it would be very difficult to build a large corpus, we are not starting from zero. Landau’s “Grundlagen der Analyse” was translated into AUTOMATH back in the ’70s, and I think I have both the LaTeX and the checked AUTOMATH proof stored somewhere.

Gustavo

February 17, 2011 4:20 pm

IMO no matter how good an automatic chess player is, while chess remains unsolved, they have two problems: not playing perfectly and not beeing able of learning from his own mistakes (afaik they can not). This imperfection and inflexibility makes them in the long term, a weak adversary against a human player, that can detect imperfection and improve by learning. As soon as you discover their weakness you can design a strategy to win them always. Of course, this can be really hard against automatic chess players of the level of Rybka, but it will happen.

Regarding IBM next project, since some judge humans by their IQ for very important things such as assignements of jobs, I suggest them to design a machine that can attain a level of 300 or 3000 in an IQ test. Hmmm…maybe machines are already good at this, are they ?

February 17, 2011 5:19 pm

Pretty sure both Deep Blue and Watson are designed to learn from their mistakes. I’ve heard some of Watson’s training rounds are pretty comical.

• February 17, 2011 9:38 pm

Pretty sure both Deep Blue and Watson are designed to learn from their mistakes.

Of course, this is called Supervised learning but this is not intelligence.

February 17, 2011 10:19 pm

But the real lesson from both projects is that for many purposes your observation is beside the point. If a non-intelligent laptop can beat a human by running a non-intelligent algorithm, then, for the purposes of playing chess here and now, the intelligence element does not matter. The lesson being that weather or not you call this AI, it is useful and more powerful than we realized.

The problem is that when people talk of AI they often debate the philosophy. Indeed, the philosophy that goes with AI is very interesting. But we need to think what we can do with the methods we have now, and what is the next step. Indeed, Artificial Intelligence right now is a misnomer. We just want to know how to build systems that do “human things”.

• February 18, 2011 1:03 am

But the real lesson from both projects is that for many purposes your observation is beside the point.

Not really, I am looking for a general intelligence, but not of the Singularitarian kind and I am not denying that specialized AI is and will be useful.
This is the position of Nick Szabo.
It is just that the claim for AI has been abused since the very beginning while OTOH the Singularity cult is freaking out ridiculously.

• February 18, 2011 6:47 am

I suppose my position can be summarized as follows. Intelligence is not a binary thing that humans have and machines do not. Adding and subtracting are intelligent behaviors but machines have been doing them better than humans for at least a century. Intelligence is not some magical general algorithm, it is a vast library of specialized routines. Addition, subtraction, playing checkers and chess well (game trees), playing Jeopardy well (probabilistic techniques over a vast data set of sentences). All require their own, very different, very specialized techniques. The general (“universal”) algorithms for learning, prediction, etc. are by contrast infeasible or uncomputable. Roughly speaking this is because general techniques must be brute force whereas efficiency requires preprogrammed preconceptions that radically reduce the search space, and very different such preconceptions are necessary for different problems.

The vast performance superiority of specialized algorithms over general ones has deep resonances with, indeed probably explains, what economists have observed about the gains from specialization (the division of labor). One comes to the same conclusion via economics or via algorithmic analysis.

This applies to an “AI mathematician” as well. As an anonymous commenter above points out, a general, i.e. brute-force method for e.g. searching the space of groups is preposterously infeasible. I’d add that non-brute force methods have to be preprogrammed with preconceptions about the group, and different preconceptions will work well or very poorly for different kinds of groups, much less for mathematical objects not analyzed as groups. In other words, there will be a family, possibly an extremely large and complex family, of algorithms to tackle the problem (really problems). The various “mathematician” suggestions in this thread will each require techniques very different from those used to computerize arithmetic, playing chess, or playing Jeopardy, and I doubt we’ve discovered all the relevant algorithms yet, much less know which ones to apply and how.

January 17, 2013 2:49 am

Intelligence is the ability to solve problems. Even a hand calculator is intelligent. The problem is that some people confuse intelligence with consciousness.

February 17, 2011 4:43 pm

It’s well-known that proving theorems with a proof length specified in unary is NP-complete. Of course, it’s been clearly indicated here that the machine in question should be allowed to make mistakes. However, couldn’t one argue that a “machine that can [sometimes] prove theorems” would be about as hard to build as a really good SAT approximation algorithm? I’d imagine this would be quite hard; if this is even possible, why not just map Riemann to a SAT instance and try your luck at finding a satisfying arrangement instead?

Here’s another question: Could someone build a “really smart computer cryptanalyst,” capable of breaking many but not all ciphers? Again, it would be wrong and imperfect much of the time, but still formidable enough to scare people. Is such a machine about as (im)plausible as the one suggested here?

I guess my objection is that the complexity inherent in proving truly HARD problem is much greater than, say, looking up basic facts that require only natural language processing (or, if you’re human, Google).

Despite my objections to the feasibility of this idea, it’s still fascinating to think about and a really nice subject for an article/post.

7. February 17, 2011 4:47 pm

Dick asked a direct question: “Is Thomas J. [the professional-level article-writing computer program] a possible goal?” … to which maybe folks should attempt and answer.

My personal answer would be, not in the near term. We can ask, for example: how nearly do programs like Rybka (the chess program) approach being able to write (ab initio) treatises like Raymond Smullyan’s witty and erudite Chess Mysteries of Sherlock Holmes? Answer: they are not near.

By extension, how near are programs like Watsen/Thomas Jr. to writing treatises like Mac Lane’s Hamiltonian mechanics and geometry? Including witty and erudite passages like:

Mathematical ideas do not live fully until they are presented clearly, and we never quite achieve that ultimate clarity.

Answer: it might be infeasible for Watsen/Thomas Jr. to grasp the point of Mac Lane’s article, the obstruction being that all of Mac Lane’s theorems can (in principle) be proved without reference to geometric elements.

Thus, that Rybka/Watsen/Thomas Jr. (and Sage, Mathematica, etc.) are useful tools goes without saying; that they can be agents of mathematical creation remains to be demonstrated.

• February 17, 2011 5:16 pm

I think it depends on the relationship between P and NP.

If they are not equal, then I suspect that the best we can do with a computer is approximate Thomas J., gradually getting more and more sophisticated, starting from systems like Watson.

If they are equal, then I think we could find some underlying knowledge representation, build it up and in a reasonable time be able to get back answers that consolidate our individual knowledge (and initiate a second ‘industrial’-type revolution).

In short, I think Godel is right in his predictions of what it would mean to solve it one way or the the other. It is of the greatest importance.

Paul.

February 17, 2011 5:11 pm

You may find this link interesting: http://theorymine.co.uk/
There you can buy, that is correct BUY, your own mathematical theorems. The theorems are generated and proved by a computer. They had so many theorems at some point, that they decided to put them on sale. Right, these theorems won’t give you the Field Medal, but the very idea that a computer can do mathematical research (in any level) is astonishing!

• February 17, 2011 5:42 pm

Fernando, the TheoryMine service that you linked-to is genuinely amazing … it is an eye-opening contribution to this discussion.

But the sterility of the documents that TheoryMine produces shows too, how theorems alone do not suffice to replace ideas and narratives, either in chess *or* in mathematics.

Systems engineers know how to express this perhaps better than mathematicians:

“Systems engineering is the discipline of the design of the whole, to realize an effective and harmonious ensemble, as distinct from the design of the parts.” (Si Ramo)

What computers give us in chess is “only” the moves, what they give us in engineering is “only” the parts, what they give us in mathematics is “only” the theorems.

February 17, 2011 10:01 pm

Indeed, computers can give us lots of theorems. The importance of intelligence is that it can tell us the importance or significance of a theorem.

We humans can easily enough encoding the statements of various well-known conjectures and open problems, and set computers to solving them. However, it would be much harder to encode a broader range of endeavors in which some theoretical advance would be useful, but we don’t yet know what that might be.

Also, I’m reminded of a project I heard about a while back that takes in experimental or simulation data sets and derives mathematical relationships from them. I can’t remember what it’s called, but such a thing would make a fun partner to a prover: find real-world-relevant things worth deriving from first principles.

Phil

February 17, 2011 5:48 pm

Not trying to diminish the achievements of Deep Blue, one should know (and unfortunately it is little mentioned) that during the tournament Kasparov lost, he had less breaks than those are given during a real World Chess Championship (I don’t know why Kasparov agreed to these terms).
So at least part of Deep Blue achievement should be attributed to its definite advantage – it never gets tired.
Just my tuppence.

10. Type A for Anonymous permalink
February 17, 2011 6:04 pm

The part of your 2026 scenario that I’m having the hardest time accepting is the top number theorist in the world having a name like “Peter Smith”. Given apparent trends in STEM enrollments, I’m thinking “Zhou” or “Sandeep” is more likely…

February 17, 2011 7:17 pm

Chess and Jeopardy were competitions with particular utility functions. In mathematics I don’t see a similar utility function, is it publishing papers? well, there are already machines which can generate papers that pass peer-review in some journals. AI people were conjecturing many things for decades and their predictions have been false.

Math will be a very big challenge for IBM, there are many simpler more useful challenges.

On the other hand I should say I find the idea of the power of imperfection really fascinating.

February 17, 2011 7:20 pm

I think it is far more plausible that computers become a tool to mathematicians first, before they obsolete mathematicians. In fact, computers already took the first step at revolutionizing mathematics: You can *search* for papers and for *content* in papers in milliseconds. You can collaborate on proofs in ways there were almost impossible just 20 years ago.

Computers as *verifiers* of proofs take some ingenuity currently. You have to be precise – deadly precise – or the computer won’t accept your statement. You have to decomposition the proof until you hit some formal axiomatic system – you cannot use a theorem or lemma until it has been proven and so on. The work it takes is enormous. Yet, the assurance you are given is very high: You can rerun the checker on another computer if you don’t trust this one. You can run it more than once. And you often only have to trust a small trusted computing base in the system. I don’t see the verification process becoming easier in 15 years.

Computers as *producers* of proofs is even harder. So even thinking of denting this in the future is small.

The trick, I think, is to join forces: Let the computer automate the “boring” parts and let the human figure out the patterns. That way, we leverage the strengths of computers and humans in an optimal way. But this also means that mathematicians will have to accept the proof of the four color theorem as a proof, even though it can only be verified by machine and not by human.

• February 20, 2011 7:44 am

I agree with you that let the computer automated the “boring” parts and let the human figure out the patterns”. Therefore I accept the proof of the 4CT as proof. But the existing computer aided proof of the 4CT is based on the “non-existence of minimal counter-examples” and I think so straight forward proof method can be also discovered or guessed by the future intelligent computer. Note that I am not talking about technical details of reducibility and discharging methods.

I have given non-computer proof of the 4CT based on the spiral chain coloring (see https://rjlipton.wordpress.com/2009/04/24/the-four-color-theorem/ ). The key word in my proof that any computer equipped with AI tools cannot figure it out easily is the word “spiral” that also makes the proof short and human readable.

13. February 17, 2011 7:49 pm

Here is yet another as-yet unmentioned aspect of this wonderful topic.

Assuming that favorable publicity of Watson’s victory is associated to an increase in IBM’s market capitalization, that increase is +204M\$ for every 0.1% of stock rise.

Hmm … IBM’s stock price rose 0.8% in the last 48 hours. 🙂

Can we infer that there are at present far too few computational mathematicians, scientists, and engineers in the world … at least, relative to the market’s valuation of enterprise opportunities associated to computation?

If so, this is mighty good news for young researchers in many disciplines. 🙂

February 17, 2011 8:28 pm

Watson’s reply to “Are mathematicians in jeopardy”? was “What is ‘No’?”

15. February 17, 2011 10:14 pm

At the IFIP’94 conference, I asked Robin Milner how long it would take for a computer to be able to understand and execute the kind of proofs we teach in a typical undergraduate theory course, such as proving that a given context-free grammar generates a given language. His answer took me aback: “Fifty years.”

Five or seven years earlier, I forget which, Boyer and(/or?) Moore came to Cornell, where I was then a postdoc, to demonstrate their state-of-the-art theorem-prover. It was on display opposite a reception that followed their 3-4pm colloquium talk, with a dinner at 6pm across campus with several deans on-tap. I asked whether it could prove that a given context-free grammar generated the parity language. They said there was first no problem to represent the grammar and the definition of the language over {0,1} in their system. The system got hung up, however, on the lemma of proving that 1 + an even number = an odd number, and vice-versa. They tried this tactic and that, real-soon-now, as 4:30 and then 5pm passed and they didn’t get any of the reception drinks or eats. If I recall correctly, Bob Constable was unable to whisk them away even at 5:45pm. The attention and respect made me feel very flattered, but they must have felt flattened.

February 17, 2011 10:27 pm

THE ? REALLY IS ” WHAT IS AN INTELLIGENT BEING ” ARE OUR CURRENT
DEFINITIONS OF ” INTELLIGENT ” “BEING” OUTDATED.

February 17, 2011 10:45 pm

If we have three same Deep Blue, named BlueA, BlueB, BlueC.
BlueA play games with BlueB a lot of times and in one game he may win or fail. If BlueA and BlueB become smarter and smarter and at last can always beat Bluec,maybe we can say that Deep Blue “learns and adapts from its sensory input” and in some sense it has intelligence.

18. February 17, 2011 11:19 pm

Gil,
Chess was a piece of cake compared to Watson. Just understanding natural languages is so damn difficult.
I think there will come a day when computers will solve math, but 2026 is an optimistic date.

But the brain is a awesome thing – for the volume it takes and the power it consumes, its amazing. The greatest challenge will be to achieve such efficiency in any machine. I don’t think we will come close to making such an efficient machine in a century. I’d love to be proven wrong though.

Rajbir

• February 18, 2011 1:14 am

Just understanding natural languages is so damn difficult.

Yes indeed, that’s why Watson actually does not understand anything it is just a beefed up chatterbot.

• February 22, 2011 1:07 am

@Kevembuangga
> beefed up chatterbot

I’d give more credit to Watson than that. Otherwise the human brain would be a beefed up nematoda nervous system.

• February 18, 2011 1:37 pm

You do not really want this, otherwise, what is our (humans) utility?

February 17, 2011 11:31 pm

Prof. Lipton and prof. Reagan, you should definitely get to know prof. Doron Zeilberger, or at least read some of his Opinions (http://www.math.rutgers.edu/~zeilberg/OPINIONS.html). At the very least read this: http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimPDF/priced.pdf. He has for a very long time believed in computer generated and computer assisted proofs, even going as far as saying that it is a matter of time until human mathematicians will all become obsolete (and re-qualify as programmers which he claims is a much more interesting and elucidating exercise). Btw, besides proving some landmark results in combinatorics, and doing that entirely as a human, he publishes papers co-authored with his computer Shalosh B. Ekhad. Also he has advanced the state of the art in computer generated proofs by a big leap: the WZ algorithm, developed with Herb Wilf, reduces the proof of pretty much any hypergeometric identity to checking an easy to verify (by computer) proof certificate. One example of its power: it can prove a finite version of the Rogers-Ramanujan identity, a very difficult and deep result. (Btw prof. Z does not like infinities either)

Moreover, prof. Zeilberger envisions a world of semi-rigorous mathematics and by that he pretty much implies a world of *practical* probabilistic checking of proofs, where the proofs are computer derived too (but using a human algorithm — another reminder that everything is algorithms). Here is a quote:
“I envision an abstract of a paper, c. 2100, that reads “We show that the Goldbach conjecture is true with probability 0.99999 and that its complete truth could be determined with a budget of \$10 billion”.

While prof Z.’s statements are exaggerated, his ideas are interesting. While reading him, keep in mind that he often oversells his case intentionally just to make an impression (he quotes a Jewish saying to the effect of “if you don’t ask for a \$100 you won’t even get \$50″). While his statements about humans becoming obsolete as mathematicians should be taken with a grain of salt,he does believe that there is more space for computer-automated proofs. That humans should be moving more towards the big-picture and use computers more extensively, for “computations” which are not only numerical, and moreover, which automate not just routine symbolic manipulations but also whole patterns of reasoning and proof derivation.

Another quote I could not resist making: “The identity 2+2=4 may seem trivial nowadays but was very deep when it was discovered independently by several cavedwellers. It is a general and abstract theorem that contains, as special cases, many apparently unrelated theorems: Two bears and two bears makes four bears; two apples and two apples makes four apples. It was also realized that, in order to prove it rigorously, it suffices to prove it for any special case.”

• February 18, 2011 12:52 am

Sasho, thanks!—It is a fair coincidence that you mention Doron. Dick was originally planning a post on “heckling” at lectures, which was giving us some trouble to get the right focus and tone. I suggested I could improve it by expanding it to “e-heckling”, with Doron’s “Opinion 76” viewed as a “heckle” of Avi Wigderson’s 2006 ICM talk as my prime example. But Dick was inspired to write about Watson and (im-)perfection instead. Of course, your third paragraph alludes to Opinion 76 (among others) anyway…

20. February 18, 2011 1:49 am

John Langford: “What does Watson mean?”

21. February 18, 2011 3:19 am

It might not be the Riemann Hypothesis, but Robbins’ Conjecture was a mathematically interesting fact proved by an automated theorem prover (admittedly under close supervision) in 1996: http://en.wikipedia.org/wiki/Robbins_algebra

22. February 18, 2011 4:50 am

Surprised that nobody has mention Lenat’s Automated Mathematician yet.

Lenat called his new program AM, for Automated Mathematician. For the next several months he coded into it knowledge about set theory, a very abstract and basic discipline that forms the infrastructure of mathematics. A set is simply a collection of objects(A,B,C,D), for example. It doesn’t matter what the letters represent: rocks, cars, people, galaxies. A set is an abstraction, and set theory is the science describing the behavior of this pure mindstuff.

For AM to understand set theory it had to know about such fundamental notions as equality (when two sets are identical), union (when two sets are combined to form a third set) and intersection (when the members that two sets have in common are used to form a third set). Lenat included 115 such concepts in AM, along with several hundred heuristics that captured some of the rudiments of mathematical discovery and aesthetics. One heuristic, for example, said that a good way to start exploring a concept is to find examples of it. Another said that if a concept has few examples, try creating a new, more general version of it.

Guided by such rules, AM conducted hundreds of experiments in set theory, modifying and combining ideas until new ones evolved like crystals growing in a supersaturated solution. In this way, the program chanced upon the concept of natural numbers, which enabled it to discover arithmetic. The concept of union led to that of addition; addition led to multiplication, which led to exponentiation.

One heuristic advised AM to study the inverse of interesting functions. Thus, AM turned multiplication into the concept of divisors. By applying a heuristic that suggested looking for extreme examples, it found numbers that have only two divisors: the primes. Once AM knew about primes it was only a matter of time before it created versions of such famous mathematical ideas as Goldbach’s conjecture (every even number greater than two is the sum of two primes) and the fundamental theorem of arithmetic: any number can be factored into a unique set of primes.

“AM went off and discovered hundreds of things,” Lenat said, “about half of which were useful and half of which were sort of weird and probably losers.” But, after its first 200 discoveries, AM began to run dry.

“It started doing silly things, one after another, like defining numbers that were both odd and even…or other just awful stuff that doesn’t exist, or of which there is only one.” The percentage of good concepts–the “hit rate,” as Lenat called it–dropped from 62.5% to less than 10%.

23. February 18, 2011 5:16 am

“Today I am wondering if we are all about to become obsolete as mathematicians.”

Not if there is a number-theoretic function F(x) that is in NP but not in P. In such a case, given any n, the machine can give evidence to verify that F(n) is computable; but it cannot give evidence to verify that, for any given n, F(n) is computable. The latter, for instance, is the significant meta-mathematical conclusion that a non-machine intelligence can conclude from the first half of the argument in Goedel’s Theorem VI of his seminal 1931 paper on formally undecidable arithmetical propositions; it is also the essence of Lucas’ Goedelian argument.

One could comfortably say that machine intelligence can be superior to non-machine intelligence for giving evidence in reaching conclusions that are algorithmically computable; whilst non-machine intelligence can be superior to machine intelligence in reaching conclusions that are algorithmically verifiable, but not algorithmically computable. So, in effect, the partnership between non-machine intelligence and machine intelligence can be said to have already started with the use of tools such as the abacus!

• February 18, 2011 5:48 am

I do not have the mathematical ability to “prove” anything about that but I doubt very much that any result, text, graphic, whatever which is created by an human being and is encoded in a finite number of discrete symbols could not as well be computed by a machine in principle, i.e. notwithstanding the limitations of the current state of the art.
That would entail some “mystical” properties of the human mind.

24. February 18, 2011 6:05 am

There is a nice post about computer chess written by Amir Ban here:
http://gilkalai.wordpress.com/2008/06/25/amir-ban-on-deep-junior/

25. February 18, 2011 7:26 am

The task of having computers replacing mathematicians discussed in this post is indeed very interesting. Bram Cohen’s remark about a way to translate human proofs into formal mathematical proofs is also very interesting. (Of course, we do not need necessarily to translate the proof from the paper, we can equip mathematician with a program (say, something like TEX) which allow them to write their human proofs in a language which is suitable to a computer programm aimed at translating human proofs into completely formal proofs.)

Both these tasks like machine translation and like many other AI goals look possible in principle but they also look very hard and the progress on them is rather slow.

A general remark I would like to make is that I am not sure we have good methodology to discuss and study why things are hard. Of course, we have by now a long tradition of showing that certain things are impossible. And indeed showing that a certain task is impossible is quite a convincing argument for why it is hard. (The impossibility tradition is so prominent that when we talk in complexity theory about “hardness” we really mean “impossibility”.) But it is hard for us to study syetematically why things that are possible are still hard.

It would be a nice idea to discuss the details of the progress regarding computerzed theorem proving, say in the last 50 years. Maybe we can also come up with good ideas where the deadlocks are and what kind of breathroughs are necessary.

• February 18, 2011 10:23 am

Gil, a suggestion that is stimulated equally by your remarks and by Scott Aaronson’s question “What’s the next increment after Watson?”, one suggestion is a mathbot whose MathOverflow reputation is 160 or higher,.

A reputation of 160 would surpass that of 85% of MathOverflow users. To pass this real-world Turing Test—which has the advantages of being wholly objective, quantitative, and natural—it would suffice for a mathbot to answer 0.1% of MathOverflow questions sufficiently well to gain one upvote on-average.

Is this feasible? Well … certainly there exists a large corpus of literature and posts to learn from.

• February 18, 2011 1:48 pm

John, that’s a cute idea.

February 18, 2011 7:55 am

Regarding computer understanding of informal proofs written by humans, this thesis is interesting:

http://people.pwf.cam.ac.uk/mg262/

I don’t remember where I found the link. It may have even been here on this blog 😉

• February 18, 2011 8:27 am

This is a controlled natural language more “comfortable” than Mizar and als but still quite formalized.
The language analysis capabilities used in Watson will allow to move even further in this direction.
To have a quick assessment look at the Flatlands and Restrictions PDFs in the thesis page.

27. February 18, 2011 2:50 pm

The proof of four color theorem had been done by computer, but nobody said: “mathematicians in jeopardy”. Today AI looks like brute-force search and nothing more 😉

February 18, 2011 2:55 pm

“An intelligence being learns and adapts from its sensory input, getting ‘smarter’ as it learns more”.

Some non-human animals improve their performance as they learn more within the particular environment they are fitted for and I wouldn´t call them intelligent.

I would suggest that intelligent beings (natural or artificial) must have the following features:
–have interests ( in case of animals to endure or reproduce; utility function ?).
–autonomy (in representation, decision and action; in case of animals this implies beeing able of represent their environment; in case of a computer i would expect beeing able of generate their own input and their own instructions; until now they are dependent of humans for this).
–universality (in the sense of Turing universal machines; all animals except humans lack of this feature: they are fitted to their environment; AI machines also: Deep Blue is designed for palying chess, Watson for wining Jeopardy..).
–learnability (beeing able to improve their performance; supervised learning seems a good step for this but i think we still need improvement).

As an application of above ideas: this post describes greatly the idea we have of a perfect mathematician (vast knowledge, vast experimental capabilities, vast tools). Since the stock of knowledge increases everyday I would expect from Thomas to update everydate its knowledge representation, decide that the kind of problem he is working on is not anymore relevant (autonomy) and decide which problem is relevant after this knowledge update.

The only entity I know until now have these properties are human beings whose brain / mind is a combination of a sensitive brain / machine (the one which works through sypnaptic/neurotransmiters, which reproduce literally the environment the cell lived on, at each synapse) and an abstract computational brain / machine (the one which works through the non-synaptic direct electric contacts between neurones). This last one, in humans, is nothing but a parallel universal computer as the ones we already have, but much more powerfull and efficient (that would be the zettascale challenge i suppose). So what AI needs is just to developpe the sensitive Artificial brain and combine it with the abstract machine we will already have or the more powerfull version we will have in near future.

P.s. Kavenbuanga, so how does a machine does performs in a Raven-like Intelligence test at present ? Any info about this ? I´m curious…

.

• February 18, 2011 3:09 pm

Not too sure that there has been any attempt at the Raven tests in AI the representation of shapes would be daunting.
Shane Legg has compiled may be a hundred definitions of intelligence. 😀

29. February 18, 2011 2:56 pm

I do not believe that computers will beat a good amateur player in Weiqi (known in west as Go) in 2026. I believe that computers will never beat a professional in Weiqi.

• February 18, 2011 4:29 pm

Qi, here’s a question: Let A0 be the world’s best Weiqi player. Let A1 be a player that A1 can beat—in matches with no stone handicaps—75% of the time. Let A2 be a player that A1 can score 75% against, and so on. Continue until A_d is a beginning Weiqi player. Then d is the “Human Depth” of Weiqi, a concept first espoused by a Hungarian whose name escapes me now.

I believe that the human depth of a game is proportional to how many Moore’s Law generations are needed for computers to (have) achieve(d) supremacy at it. A source I once used listed depths of 10 for 8×8 Checkers, 11 for Chess, and 13-14 for Shogi (Japanese Chess, which computers are now “abreast” of). Estimates of the depth of Weiqi in the source, however, were “25–40”. Do you know of this concept? Here 25 doesn’t mean Go is twice as deep as the average of Chess and Shogi, but rather exponentially deeper.

• February 19, 2011 9:41 am

Here is my comment.

• February 19, 2011 9:15 am

Then place a long bet on this.

• February 19, 2011 9:24 am

I am trying to reply to this comment bi “Qi” but i keeps being ignored by WordPress.

• February 19, 2011 4:21 pm

Kevembuangga, indeed there are multiple comments from you in the spam queue. It’s not moderation by us; instead I’m guessing that WordPress doesn’t like the big all-caps LONG BETS on the linked pages. I’ve written the same to them. Here I am linking to the relevant section of Wikipedia’s page on the Long Now Foundation instead.

Meanwhile, all Kevembuangga wanted to say was that Qi’s assertion about computer Go by 2026 is defined perfectly well enough for the kind of bet that LongBets.org handles. Our regrets on the delay and inconvenience.

• February 19, 2011 8:42 pm

Why didn’t you try the same link than me to see if it makes a difference?
Or did you?

30. February 18, 2011 4:05 pm

I am thinking of a possible two worst cases for a mathematician in the future.

Case 1. Mathematician is a person who generate mathematical problem for a computer to solve the problem and to prove theorems.
Case 2. Mathematician is a person who understand mathematical problems generate and solved by a computer.

I hope that future will never come.

• February 19, 2011 9:11 am

I expect these two outcomes to be quite plausible and don’t see what’s wrong with it.
Why would accountants have to be proud of their “arithmetic” capabilities (adding and multiplying) , this is not the most interesting or useful part of their job.

• February 19, 2011 3:08 pm

Pessimistic two possible outcomes have been listed for the sake of the profession of being an mathematician. Accountants continue to be proud of their “arithmetic” capabilities but not because of level of mathematics but rather adding and multiplying numbers related money.

31. February 19, 2011 6:23 am

It amuses me that a number of people think that the question of whether P equals NP has any relevance here. I have a simple question to anyone who thinks this: do you honestly believe that humans can solve NP-complete problems in their heads? Of course not! So how is that we manage do to mathematics when the problem, “Find a proof of this statement of length at most n” is NP-complete? The answer is obvious: we vastly restrict the set of instances of this problem that we are prepared to tackle.

All those features of real-live, human mathematics that distinguish it from mere formal verification of arbitrary well-formed strings of symbols — that statements should be natural, interesting, meaningful, enlightening — are precisely what will enable computers to do mathematics, and sooner than most people think. (Well, that’s my view anyway.) They won’t be able to solve everything you throw at them if P doesn’t equal NP, but there is nothing to stop them becoming vastly quicker than we are at solving the kinds of mathematics problems we like to solve.

I think there will be an intermediate stage where they are extremely good at relatively easy (but still non-trivial) mathematics, and we will use them to check hypotheses that we as humans might spend a week thinking about before finding a relatively simple proof or counterexample. That may even be the ultimately enjoyable era for human mathematicians before the computers take over completely and mathematics goes the way of number crunching and ceases to be a human activity (though deciding what we want to know would remain something we might do).

• February 19, 2011 7:47 am

Tim Gowers’ post gained my immediate assent … because I was prepared to do so by a parallel principle in quantum systems engineering: formal proofs that quantum dynamical systems are computationally difficult to simulate have little to with practical engineering.

This is because in practice (in Gowers’ words) both mathematicians and systems engineers “vastly restrict the set of instances of this problem that we are prepared to tackle.”

Being a contrarian, though, I have to ask whether Tim and I are not both wrong?

One strong contrarian argument is that mathematical programs and great engineering programs are founded upon great narratives … and the computational complexity of constructing great narratives is not-at-all obvious.

What quality is it, that makes for a great narrative in math, engineering, science? … and in literature, poetry, fashion, and games? As with jazz, pornography, romance, and ecosystem integrity, narrative quality is mighty tough to define … and yet, we know it when we see it.

Relatively few quotations on this subject are in my index, here is one from Terry Tao’s What is good mathematics?(2007):

There does however seem to be some undeﬁnable sense that a certain piece of mathematics is “on to something”, that it is a piece of a larger puzzle waiting to be explored further. And it seems to me that it is the pursuit of such intangible promises of future potential are least as important an aspect of mathematical progress than the more concrete and obvious aspects of mathematical quality listed previously.

Do Terry Tao’s “intangible promises of future potential” exist in any objective sense? Can these promises be formalized? What might be the computational complexity of synthesizing them? Is a machine that is incapable of synthesizing these intangible qualities really capable of top-flight mathematics (or top-flight engineering, for that matter)?

These questions are too tough (for me) even to guess at good answers … at present, they seemingly are too tough for anyone.

• February 19, 2011 8:37 am

“I have a simple question to anyone who thinks this: do you honestly believe that humans can solve NP-complete problems in their heads?”

Finding optimal sensory representation is NP-hard. Nevertheless humans are practically optimal at this task. Then either P=NP, or there is a pretty good approximation to that.

Thanks,
Stupid crank in the crowd.

• February 19, 2011 8:50 am

Can you state precisely what this NP-complete problem is that we are supposed to solve? I just don’t believe that we can solve general instances of an NP-complete problem any better than computers can, but I’m interested to have this belief challenged.

• February 19, 2011 9:27 am

This is a theological discussion, so everybody will stay within their religion.

The simple line goes like this. Humans are much better in image segmentation that computers are. That requires to learn how to code and code textures for discrimination purposes. You can model textures as a homogeneous (Markov) random field. Therefore the task of the segmentation requires to evaluate the local statistics and see the transition between them. Now what is hard is to learn the set of relevant statistics. In the real world they are not for exponential families, and even computing partition function for maximum likelihood estimation is NP-hard. You can go to approximate that, but still the problem does not become much less hard.

In theoretical physics there is an experimental counterpart that test hypothesis and rules out wrong assumption. In theoretical complexity there are two models – computer programs and human beings, and I do not see l lot of application of complexity theory to say sensory processing in humans, or different anymals.

• February 19, 2011 11:13 am

I also agree that the P equals NP has no direct relevance and that humans cannot solve computationally intractable problems in their heads, (neither can the forces of nature or the market forces.) The belief that NP is not equal P can still be regarded as a metaphore for the gap between checking a proof and reaching a proof in human mathematics. Another place that the NP equal P problem is relevant is as follows: When people or nature solve problems, which when modeled (often is greater generality) are NP complete problems, this does not mean that nature or humans can perform intractable tasks, but it does say that either the modeling is incorrect, or that the modeling is incomplete and we lack some crucial “secret” for understanding why the specific problem is tractable.

• February 19, 2011 2:11 pm

Hmm … what might provide an example of a Kalai-type “crucial secret”, that humans know, and yet well-informed programs like Watson lack?

In answer to a MathOverflow question titled “Proofs that require fundamentally new ways of thinking”, I cited Shannon’s Mathematical Theory of Communication (1948) as an example of a proof that (in effect) shows knowledge of “crucial secret.”

That MathOverflow answer cited Doob, von Neumann, Wittgenstein, and Wilson as postulating that a key aspect of “the crucial secret” is simply this: daily experience of ordinary human life in ordinary human bodies. And oh heck … that answer should have cited Marvin Minsky too! 🙂

Whether or not this particular Doob/ von Neumann/ Wittgenstein/ Minsky idea is a reasonable answer to Gil’s question, the MathOverflow question “Proofs that require fundamentally new ways of thinking” provides a suitable venue to post specific answers having the general construction: “The conception of <seminal proof> depended upon a <crucial factor> that computers <lack/don’t lack> as asserted by <works cited>.”

So if you have an answer of this form, why not share it on MathOverflow?

• February 19, 2011 3:54 pm

As an aside, the MathOverflow question “Proofs that require fundamentally new ways of thinking” was posed by Tim Gowers, the answerers include well-respected mathematicians like Terry Tao, and Gil Kalai’s reputation on MathOverflow (and TCS StackExchange) is among the very highest … as earned by Gil’s sustained contribution of highly-rated questions and answers.

Thus, in the above post, hopefully it is clear that the word “you” refers to “you, the good readers of Gödel’s Lost Letter and P=NP” … not to Gil Kalai or Tim Gowers or Terry Tao, whose idea-sharing already sets a standard for us all.

• February 20, 2011 8:08 am

Restricting ourselves to arithmetic, I suspect that the question of whether mathematicians are in jeopardry, and whether mathematicians can compute/solve faster/better than machines, should be treated as two separate issues.

Prima facie, the first has to do with first-order Peano Arithmetic PA, which attempts to capture in a formal language the objective essence of how a mathematician intuitively reasons about number-theoretic relations logically.

The second with Computability Theory, which attempts to capture in a formal language the objective essence of how a mathematician intuitively computes number-theoretic functions algorithmically.

Lucas’ Goedelian argument can be seen as the – not entirely uneasonable – thesis that there are arithmetical conclusions which we can logically arrive at by reasoning that we cannot arrive at by the evidence provided by algorithmic computations.

Another way of stating this would be the assertion that there is some way of defining effective computability that would falsify the Church-Turing Thesis.

To place this in a formal perspective, note that Goedel has shown (in the first half of his proof of Theorem VI in his seminal 1931 paper) that there is an arithmetical formula – say [R(x)] – that is not PA-provable by a uniform method of proof (since [(Ax)R(x)] is not PA-provable), but [R(n)] is PA-provable for any given numeral [n].

I would argue that when the PvNP problem is reflected appropriately in the language of arithmetic, the above would correspond to saying that the interpretation of [R(x)] over the structure N of the natural numbers – say R(x) – is algorithmically verifiable as always true (hence in NP), but not algorithmically computable as always true (hence not in P).

The possibility of such equivalence being formalisable is hinted at by Turing in his 1936 paper on computable numbers when discussing the Halting argument (of which the above may be seen as an arithmetical instance), and by Goedel both in his Gibbs lecture and in his ‘lost’ letter to von Neumann.

To address the issues concerning such a putative equivalence, I suspect that we may need to build a bridge between the two areas, which may involve:

(1) interpreting Goedel’s reasoning (and conclusions) in the two Theorems VI and VII of his seminal 1931 paper constructively – and differently from the implicitly Platonistic interpretation favoured by him;

(2) an arithmetical – rather than set-theoretical – expression of the PvNP problem.

Consequently, I would argue that if a number-theoretic relation R(x) is in NP but not in P, then a mathematician can logically reason that:

(a) for any given n, [R(n)] is PA-provable and so there is some algorithm that can give evidence that R(n) is true for some natural number n in the structure N of the natural numbers;

(b) [(Ax)R(x)] is not PA-provable, and so no algorithm that can give evidence that, for any natural number n, the proposition R(n) is true in N.

I would further argue that, merely on the evidence provided by a machine, a mathematician may conclude (b), but not (a); which is why Chaitin’s Omegas are curiously labelled as ‘uncomputable’ (amongst other such curiosities, as noted by Boolos, Burgess and Jeffrey on p37 of ‘Computability and Logic’.).

The reason: machines can only give evidence for arithmetical generalisations that are either algorithmicaly decidable or PA-provable, which R(x) is not.

To place this in a slightly different perspective, consider Goedels primitive recursive relation xBy, which holds if, and only if, x is the Goedel-number of a PA-proof of the PA-formula Goedel-numbered by y.

Now xBy defines an algorithm that, for any given natural number n, can give evidence for constructing the PA-proof of [R(n)]; a proof moreover that – for most n – is probably not only beyond the conceptual powers of any mathematician, but also beyond the ability of a mathematician to verify without the evidence provided by a machine.

Further (notwithstanding the standard interpretation of Goedel’s Theorem VI) the above algorithm should also give evidence for constructing the PA-proof of [~(Ax)R(n)].

Reason: no machine can be programmed to assume – as Goedel does – that PA is omega-consistent (which is equivalent to presuming Aristotle’s particularisation: essentially the intuitionistically objectionable postulation that from the assertion ‘[~(Ax)~R(x)] is PA-provable’ we may conclude that ‘[R(n)] is PA-provable for some numeral [n]’).

Moreover, it can be argued that – so far as a machine is concerned – the PA-axioms are algorithmically decidable as true under an appropriate ‘algorithmic’ interpretation in N; and that the rules of inference of PA preserve algorithmic truth in N under such an interpretation.

What the machine would then give evidence for – by providing a PA-proof of [~(Ax)R(x)] – is merely that there is no algorithm that, for any given natural number n, will give evidence for the assertion that R(n) is true in N.

From the above a mathematician could – but a machine cannot – conclude that R(x) is in NP but not in P (as Goedel’s reasoning seems to implicitly suggest in his Theorem VII).

February 21, 2011 2:15 pm

I think you’d need to modify a lot of your conclusions based on the fact that PA is in no way a limit on the ability of machines to reason about the natural numbers (nor does PA provability have anything to do with the PvNP question).

There is no obstacle to expanding PA to a system where “PA proves F(x) for all x” entails “For all x, F(x)”. In fact, you can keep iterating this to get ever more powerful systems (though it does hit a limit at a theory that’s still relatively modest).

• February 20, 2011 10:07 am

@gowers You may look on summary page on the site my name links to. I completely understand that I’m crank, almost trisector, still you may have see some rational grain there, if you do not dismiss approach completely. There is too much freedom in it, linking combinatorial (max-cut via integer programing) to minimizing quartic polynomial, so it is hard to believe that all mappings will fail. It always works in practice (quite possible because of perturbations induced by round-off errors), and, being mathematically zero, I need some guidance to fill the gap between simulations and theory, there is too much math involved in it (e.g. quadratic forms and real algebraic geometry). If you want a challenge, than find a counterexample to the algorithm presented there, or find me a person I can talk to learn important peaces, and fill the gap by myself – someone have to understand what is going on there.

February 19, 2011 5:45 pm

If we can create a program to prove theorems, then we can surely just as easily create a program to discover efficient algorithms to solve various important, hard problems, and then to prove that they work (and are indeed efficient). It’s a little like how the “infinite improbability drive” from the Hitchhiker’s Guide series was created from a “finite improbability generator”: “Then, one day, a student who had been left to sweep up the lab after a particularly unsuccessful party found himself reasoning this way: If, he thought to himself, such a machine is a virtual impossibility, then it must logically be a finite improbability. So all I have to do in order to make one, is to work out exactly how improbable it is, feed that figure into the finite improbability generator, give it a fresh cup of really hot tea … and turn it on!”

The superior algorithms for finding superior algorithms could then be integrated into the algorithm-discoverer program, and process repeated again and again until a fixed-point in performance is reached. At that point, one could use this super-program to go about devising algorithms for all areas of human endeavor: science, medicine, law, politics, you name it.

“Politics?”, you may ask. Well, imagine how useful it would be to a voter to have a computer such that upon asking it, “Is that politician being untruthful or misleading about such and so?”, then computer responds “Yes, and here is the evidence…” Or, imagine how useful such a system could be to a politician…

February 19, 2011 10:09 pm

I have not read all the comments but I can safely say that the distinction between the two capabilities, viz., combinatorial search and creativity, has not been addressed here.

Watson is just another attempt at a heuristic search not reflecting any [simulated] human creativity. If computer theorem provers can’t even prove (as Ken Regan pointed out) an even number + 1 = an odd number, then it just shows the “intelligence” level achieved by any AI program.

If these IBM folks had made an attempt in the direction of simulating creativity, then that might have been of some use. Otherwise it has very little scientific significance– it is more for the publicity– yet another kind of an “intellectual” toy.

February 20, 2011 12:22 am

On questions where it had high confidence in its answer, Watson buzzed in 1/100 of a second after receiving the signal that the buzzers were open after Alex Trebek finished reading the question.

If humans also knew the answer, they had to hit within that 1/100 window in order to beat Watson and get an opportunity to answer.

So with one or two random exceptions, Watson got to answer all questions it knew the answer to, even if the answer was also instantly obvious to the human players.

Watson’s extremely high score against the human players was therefore primarily buzzing superiority, and not question answering superiority.

Stephen Wolfram fed a large number of Jeopardy questions without quotes to Google, and noted that the correct answer appeared in the top Google search result 66% of the time.

Few would argue that Google is an example of artificial intelligence.

Watson’s approach of massive parallel linguistic matching in a large database of text documents, combined with scoring the candidate answers, isn’t really an example of artificial intelligence either. You can’t, for instance, ask a question which appears nowhere in the database, but the answer can be deduced from information that does. Artificial intelligence is expected to be able to do such things.

So I think we’re a long way from Dick’s proposed “Thomas J,” if we want it to be something more than a massively parallel literature searching assistant.

February 20, 2011 8:45 am

1. Thanks Kavenbamggua for the definitions link. I will read all with interest as soon as i have time. Meanwhile here is my last attempt to define what properties an intelligent beeing must have (i don´t think that trying to define intelligence in abstract, not as a property of a beeing, makes any sense):

An intelligent beeing, natural or artificial, is a beeing that shows autonomy, computational universality (in Turing sense), self-representation (can represent himself and his peers as input, representing and acting) and self-improvement (learning) when making representations and acting or expressing himself.

Pls note that the ability to have ends / interests is included in actions: the intelligent beeing can give himself his own ends (autonomy). Humans do this: they act against their nature and that´s because reductionist approaches to biology fall to explain 99% of human actions. Self-representation is where conciousness starts.

2. “Yes, but this is the point: Where does the “programming” comes from”.

An artificial beeing can be intelligent even if he has not programmed himself, provided the programm allows him to make representations and act with autonomy, universality, self-representation and improvement. We humans do not programme ourselves (we born with it, encoded in DNA) and we are labelled as intelligent

3. I still do not see why it should be difficult for an AI machine to score high in a Raven-like test: he must identify the size of the problem (usually 9×9 grid), the diferent objects used to fill the grid, the type of symmetry of the pattern (translation, rotation, reflexion) and the pattern itself. It seems not so hard, a-priori.

• February 21, 2011 6:17 am

We humans do not programme ourselves

Of course, we grew out of evolution and evolution doesn’t “program” anything since it has no plan or purpose, but an artificial being has by definition been built/made/programmed by someone/something and it is important to make the distinction whether the “last stage” algorithm has been devised by the original creator or via some autonomous process within the artificial being.

I still do not see why it should be difficult for an AI machine to score high in a Raven-like test

I am only saying that it is currently (very) difficult given the state of the art, not that it is so “in principle”.