The role of sex?

 Cropped from source

Christos Papadimitriou has a joint paper with Adi Livnat, Aviad Rubinstein, Gregory Valiant, and Andrew Wan that will appear soon at FOCS 2014. The conference is October 19–21 in Philadelphia, with workshops and tutorials on the 18th. Here are the accepted papers, several others of which interest me a lot. The last parallel session on Monday afternoon before my own award lecture has three of them in one room, including a paper co-authored by my recent student Danupon Nangonkai, and three on quantum—it would be nice to be in a quantum superposition and attend both sessions.

Today Ken and I want to discuss their paper, which is on complexity-theoretic aspects of evolution.

Yes, evolution. Theory’s long reach has in the past touched diverse fields of science, engineering, and many other areas of human endeavor. But evolution is now getting more and more interest from theorists. Wikipedia begins its description with the sentence:

Evolution is the change in the inherited characteristics of biological populations over successive generations.

This doesn’t sound too far different from the idea of a computation “evolving” over successive steps. What distinguishes the biological kind is Charles Darwin’s identification of natural selection as the main mechanism, specifically for adaptation after changes in the environment. The discovery that genes in DNA are describable by strings, likewise their mutations, powerfully suggests a mating of biology and the “it from bit” view from theory. Can we use theory to turn Darwin inward and explain how complexity and diversity can arise from a simple Boolean model?

## Theory and Fact and Paradigm

For many reasons that we will avoid, of all theories in science, evolution gets the most push-back. There are not many popular attacks on the inverse square law of gravity, even though string theory provides contexts in which it is false at short ranges. That matter is made of atoms used to be contentious, but no more, and there is no argument over how chemical bonds work. Evolution seems to get more than its share of attacks. Way more.

There is even a Wikipedia article on why this is so:

Many scientists and philosophers of science have described evolution as fact and theory, a phrase which was used as the title of an article by Stephen Jay Gould in 1981. He describes fact in science as meaning data, not absolute certainty but “confirmed to such a degree that it would be perverse to withhold provisional assent.” A scientific theory is a well-substantiated explanation of such facts. The facts of evolution come from observational evidence of current processes, from imperfections in organisms recording historical common descent, and from transitions in the fossil record. Theories of evolution provide a provisional explanation for these facts.

Ken thinks Gould’s point is better stated as evolution being not just a theory but also a paradigm. Within it, theories of particular mechanisms are evaluated; outside, the paradigm is applied via “evolutionary approaches to…” just about everything. The former give the impression that there is disagreement about the fact of evolution itself, while the latter include enough overreaches and oversimplifications to muddy the whole enterprise. The fact of evolution is also confused with questions about its rate and hill-climbing ability when “left to itself,” though experience in hospitals and our current lack of knowledge about complexity should forestall negative pronouncements on the latter.

What compounds the matter is that evolution can be enlarged to the study of how complex organized information systems arise. Indeed, several books by Richard Dawkins emphasize information content in biology and the quest for a “crane-lifting” mechanism for information in physics analogous to Darwin’s in biology. There the paradigm intersects directly with computer science:

The serious question arises of how fast evolutionary-information processes can work as measured by standard yardsticks of complexity, and how much large-scale change they can effect. This makes a context for the paper.

## The Paper

Their paper is titled, “Satisfiability and Evolution.” Its abstract is:

We show that, if truth assignments on n variables reproduce through recombination so that satisfaction of a particular Boolean function confers a small evolutionary advantage, then a polynomially large population over polynomially many generations (polynomial in n and the inverse of the initial satisfaction probability) will end up almost surely consisting exclusively of satisfying truth assignments. We argue that this theorem sheds light on the problem of the evolution of complex adaptations.

Ken and I really like this work. Really. One reason is that it is a positive result. Often—not always—but often when we apply theory to areas from other parts of science we run into negative results. Telling people in another area of research that they cannot do X is not usually helpful. In many cases they need to do the work anyway, and so telling them a negative is really less than useful.

Another reason is that they work uses the evolution setup to show that a certain reasonable type of behavior happens. This is a quite pretty result about Boolean functions, independent of any connection to evolution. Further it shows that there are likely to be more such results in the near future. I think this paper is potentially quite important.

I am sorry that I missed the evolution workshop last winter at the Simons Theory Center—I got a bad cold and could not travel. I imagine that some of the work that is in this paper came out of that event. However, Ken heard a related talk invited to the Algorithmic Aspects of Information Management (AAIM 2014) conference in Vancouver in July, where Ken also had a paper. So we can turn the rest of this post over to Ken.

## Benefits With Benefits

I, Ken, felt like I was following Christos around in summer 2012. He gave one version of this talk at the Turing Centennial in Princeton in May, which we covered. Then I heard much the same talk invited to AAAI-2012 that July in Toronto, where I was battling a robot. Then I was honored to speak in the University of Bergen’s Turing Centennial series in September a couple weeks after he did. His talk there was related more to his graphic novel Logicomix, which later appeared under our Christmas tree.

So, this past July, I was curious to see how his talk (later video) had evolved. Well, that is a mis-use of “evolved” unless he and his co-authors introduced processes to mutate slides and select—over successive generations—the ones with the highest research fitness. Perhaps they did. I was glad that a genetic example from the 2012 talks survived, since I had advertised—in encouraging my wife to join for the 8:20am start—that the talk would include sex.

We reproduce his first main example here. Consider two genes, ${A}$ and ${B}$, with 4 and 5 alleles (mutable variants) respectively occurring in a population. Suppose the benefits of each of the 20 possible combinations are given by this ${4 \times 5}$ matrix:

$\displaystyle \begin{array}{r|ccccc} & B_1 & B_2 & B_3 & B_4 & B_5\\ \hline A_1 & 3 & 2 & 4 & 5 & 4\\ A_2 & 1 & 0 & 0 & 7 & 2\\ A_3 & 2 & 1 & 0 & 4 & 3\\ A_4 & 1 & 8 & 1 & 3 & 2 \end{array}$

The argument is that global optimization techniques, such as simulated annealing, will tend toward the combinations with highest fitness unto themselves, here ${(A_4,B_2)}$ scoring ${8}$ and ${(A_2,B_4)}$ scoring ${7}$. These are likened to asexual propagation. However, row ${A_1}$ has the best fitness expectation over a partner drawn randomly from the partition ${B_1..B_5}$. Likewise columns ${B_4}$ and ${B_5}$ “mix best” when combined over the range of gene ${A}$. The sexual recombination algorithm gives ${(A_i,B_j)}$ a random partner ${(A_k,B_\ell)}$ according to the population’s distribution, and each offspring gets ${A_i}$ or ${A_k}$, likewise ${B_j}$ or ${B_\ell}$, with 50-50 probability.

In a 2011 paper with Livnat and Marcus Feldman formalizing this process, he showed that isolated “peaks” like ${(A_4,B_2)}$ and ${(A_2,B_4)}$ tend to wear down in frequency, but “plateaus” like row ${A_1}$ and some of the ${B}$ columns survive. This showed why we need sex—or at least gave an algorithmic mechanism to answer the question:

Why is recombination (i.e., sexual reproduction) more successful than asexual reproduction?

## The Missing Link to Complexity

Despite this answer, Christos admitted that “the role of sex is to this day not understood.” His talk went immediately on to a question that is more eye-opening than eyebrow-raising:

How can this crude mechanism and model account for the miracle [of complexity that] we see around us?

As stated in the new FOCS paper:

Given the reshuffling of genomes that occurs through recombination, how can complex traits that depend on many different genes arise and spread in a population?

The “complex traits” are approached and modeled by standard tools in complexity, most notably Boolean functions. Here is a mutated version of an example in the talk: Suppose we have monotone predicates ${P,P'}$ of genetic factors ${x = x_1,\dots,x_n}$, where ${P(x)}$ enables a trait to be possible and ${P'(x)}$ guarantees it. Let ${h}$ be Hamming weight, and let ${g}$ be a numerical function of the environment ${E}$. Suppose a trait ${\tau}$ follows a sigmoid probability function with high slope around a threshold ${t}$ as follows:

$\displaystyle h(x : P(x)) + g(E) \geq t.$

If the environment changes in a way that both increases ${g(E)}$ and makes ${\tau}$ advantageous, then not only will the incidence of ${\tau}$ go up, but the above recombination algorithm will favor organisms that possess all or almost all of the factors. Then even if ${g(E)}$ reverts to a lower level, the new population may have so many members with high ${h(x)}$ that not only does ${P(x)}$ remain satisfied but also ${P'(x)}$. Then the trait is locked in to much of the population. This yields an interpretation of a famous experiment by Conrad Waddington which is common to the talk and paper.

## The Results

The FOCS paper uses none of ${E}$, ${g}$, ${h}$, ${P}$, ${P'}$, or ${t}$—these characters (some by me, some by them) die out or maybe stay latent. What it preserves is an abstraction of the counting advantage that having more factors ${x_i = 1}$ gives in the above analysis. Let ${f}$ be a general Boolean function whose satisfaction confers a ${(1 + \epsilon)}$ advantage in the following way: Define ${f'(x) = 1 + \epsilon f(x)}$; that is, ${f'(x)}$ is ${1 + \epsilon}$ on satisfying assignments, ${1}$ on unsatisfying ones. Let the probability distribution ${\mu_t}$ on ${x \in \{0,1\}^n}$ at time ${t}$ be defined by independent probabilities ${\text{\bf Pr}_t[x_i = 1]}$ for each bit, and updated by the rule

$\displaystyle \text{\bf Pr}_{t+1}[x_i = 1] = \frac{\text{\bf E}_{x: \mu_t}[x_i\cdot f'(x)]}{\text{\bf E}_{x: \mu_t}[f'(x)]}. \ \ \ \ \ (1)$

We can re-cast this using the original ${f}$ and probabilities rather than expectations:

$\displaystyle \begin{array}{rcl} \text{\bf Pr}_{t+1}[x_i = 1] &=& \frac{\text{\bf Pr}_t[x_i = 1] \;+\; \epsilon\text{\bf Pr}_{x: \mu_t}[x_i \wedge f(x)]}{1 \;+\; \epsilon\text{\bf Pr}_{x: \mu_t}[f(x)]}\\ \\ &=& \text{\bf Pr}_t[x_i = 1]\left(\frac{1 \;+\; \epsilon\text{\bf Pr}_{x: \mu_t}[f(x)~|~x_i = 1]}{1 \;+\; \epsilon\text{\bf Pr}_{x: \mu_t}[f(x)]}\right). \end{array}$

The question is, does the trait of satisfying ${f}$ come to dominate the population? Let ${p_0 = \text{\bf Pr}_{x: \mu_0}[f(x)]}$ be the initial satisfaction probability under whatever initial distribution ${\mu_0}$ is given, say uniform. They prove:

Theorem 1 If ${f}$ is monotone, then

$\displaystyle \text{\bf Pr}_{x: \mu_t}[f(x)] \;\;\geq\;\; 1 \;-\; \frac{n}{t}\cdot\frac{(1+\epsilon)}{\epsilon p_0}.$

The bulk of the paper, however, is devoted to how far this extends when ${f}$ is not monotone. The process has to be modified, the numerator in equation (1) is rewritten to confer possible advantage when ${x_i = 0}$ too (according to ${\epsilon}$ and the variance of bit ${i}$ in ${\mu_t}$), ${f}$ is replaced by a multilinear extension ${\tilde{f}}$ with values in the interval ${[1,1+\epsilon]}$, the size ${N}$ of the population needs to be large enough, more-complicated polynomial functions of ${n}$, ${1/p_0}$, and ${1/\epsilon}$ are involved, and there are interlocking bounds on ${N}$, ${\epsilon}$, ${p_0}$, and the success probability. As usual see the paper for details, but we can abbreviate their result drastically as follows:

Theorem 2 Taken over all ${f}$,

$\displaystyle \text{sex} \implies \text{complex}.$

We suspect that no human relationship counselor would disagree.

## Open Problems

The paper includes a long discussion of possible variants and improvements to the model and results. Insofar as ${p_0}$, namely the initial probability of a random assignment satisfying ${f}$, is a factor in the denominator, the results lose force when ${p_0}$ is tiny. Indeed the authors disclaim relevance to the task of finding satisfying assignments to a hard formula ${f}$ with small ${p_0}$. What they call “weak selection” owing to the smallness of ${\epsilon}$ in their main results becomes an interesting problem of how far ${\epsilon}$ can be increased, and maybe this informs general issues about the complexity of Boolean functions and applicability of Fourier techniques, not just evolution.

[changed order of names in first sentence from paper’s alpha order]

1. September 22, 2014 6:49 pm

I agree that the serious question is “how fast” and “how much”. Almost everyone agrees that things change over time, but I appreciate how Darwin promotes a dominating process, “the strong principle of inheritance”. Is the, “abstraction of the counting advantage,” likewise a principled foundation here, or is there a principle for how a trait would come to dominate the population? (Sorry I can’t contribute on the formal logic.) In any case, I like to say that inheritance dominates biology like ideology dominates psychology.

September 23, 2014 9:33 am

I would think that because most computer scientists believe that P != NP, most computer scientists would also not believe that random evolution is the sole cause of life, since the problems involved with creating life are most likely NP-complete problems and in a universe in which P != NP it is impossible to solve these problems efficiently.

Do most computer scientists also believe in the theory of macro-evolution?

September 23, 2014 9:34 am

I should have said “in a universe in which P!=NP it is impossible to solve these problems efficiently without an oracle (God).”

September 23, 2014 12:17 pm

1) Which problems are you referring to and what makes you believe they are NP-complete?
2) What makes you think that in the specific instances of those problems solved, constraints are arranged such that a simple local hill-climbing method would not be able to achieve a “good enough” solution to sustain and promote life? The result referred to in this post (and other results unmentioned here and a bit of common sense) would indicate that there is at least a broad class of problems solvable in this way.
3) To answer your question, I’m not sure what you mean by macro-evolution, since creationists use the term to mean a variety of different things. If you mean the process of speciation by divergent changes, then, while I can make no certain claims about large groups of people I’ve never met, I’d wager that most computer scientists recognize that this process takes place, yet do not trouble themselves with the specific mechanisms that are theorized to propel it (those theorists mentioned in this post excepted, of course).

September 23, 2014 2:04 pm

“I would think that because most computer scientists believe that P != NP, most computer scientists would also not believe”

Without trying to guess what most computer scientists think, there is very little relevance between this conjecture and evolution. Many optimization problems that are hard in the worst case admit efficient algorithms when some constraints are relaxed. For certain NP-hard problems, a randomly chosen instance is readily satisfied. For others, approximate solutions can be found.

There is absolutely no reason to think that the diversity of life on earth represents an optimum set of solutions to any combinatorial problem, any more than there is reason to believe that a soap-film steiner tree is in fact the minimum steiner tree even if it looks very good. The fact that many good solutions can be found using the time and resources of the physical universe is unsurprising, even if (we conjecture) the absolute best solutions remain out of reach. This also corresponds to practical experience using such techniques as greedy algorithms, hill climbing, simulated annealing, and genetic algorithms, none of which guarantee optimum solutions, but which often provide useful results.

Note that evolution is no more a “chance” process than any other algorithm that introduces statistical sampling. It is a complex process in which constraints are repeatedly tested and incremental improvements are maintained and proliferate. The creationist “tornado in a junkyard” strawman bears no resemblance to evolution. Once this is understood, it is no longer surprising than evolution can produce complex structures that meet particular constraints. Since these structures are not guaranteed to be optimal, there is absolutely no relevance to an argument based on complexity theory.

• September 24, 2014 9:14 am

“Do most computer scientists also believe in the theory of macro-evolution?”

When someone uses the term “macroevolution”, the bayesian prior for them being a creationist is high.

September 23, 2014 2:08 pm

I was once working on a side project where the question of using genetic algorithms came up, and I wondered about this question. Asexual reproduction with mutation looks (to this non-biologist computer scientist) like hill-climbing with some random tweaks. Genetic algorithms (the CS equivalent of sexual reproduction) can do something apparently more interesting by combining parts of good solutions that might take a really long time to arise through a sequence of random tweaks to a single hill-climbing instance. But is that really true?

So intuitively, I felt that GA would offer an advantage, but I was unaware of any theoretical models for it. It would definitely be worth having a closer look at this result.

4. September 23, 2014 3:03 pm

Somewhat related review in Nature by Tim Lenton regarding E.O. Wilson’s new book, The Meaning of Human Existence.

“The frustration for the neutral reader is that both sides agree that the gene is the fundamental unit of selection, so the squabble is over different flavours of standard evolutionary theory. Neither side seems to see the Pythonesque irony of fighting over how to understand cooperation. Still, nothing could better demonstrate the tribal nature of humanity, which provides a focus for the rest of the book.”

5. September 24, 2014 5:43 am

Sorry for the non-sequitur — At some point in the past a commentator included a link to a theorist or mathematician discussing how they’d effectively destroyed their field by proving too many theorems too quickly. Now I can’t find it. Can anyone remember who or what that was? Thanks kindly.

• September 24, 2014 8:34 am

That is Bill Thurston, whose paper we contrasted in the Knuth Prize post—I had spotted it in a comment by John Sidles here. The essay by Thurston is here.

• September 24, 2014 1:42 pm

Thank you so much! It’s such a great article, I should have saved it the first time. That’s tremendously helpful, appreciate the assistance.