# The Thousand Percent Solution

*Boeing Dreamliner’s battery of problems*

Ray LaHood was until recently the top FAA administrator, the head of the U.S. agency that oversees air safety for the United States. He said at a meeting of the U.S. Conference of Mayors:

“Those planes aren’t going to fly until we are 1,000% sure that they are safe to fly.”

Today I want to talk about percents, beliefs, tests, and predictions.

I just flew in last week on a Boeing 777 from Dubai. It was a 15-hour, 53-minute flight, made possible by the brilliant engineering of Boeing. The ability to fly that far reliably, on two engines, in reasonable comfort is a remarkable achievement. Yes, I was in business class; thanks are due to the group I was visiting, the Qatar Computing Research Institute (QCRI). Boeing has long been one of the leaders in the design of safe planes.

Recently they have had troubles with their new plane, the 787 Dreamliner. It is an innovative plane, with great fuel efficiency through weight reduction based on extensive use of carbon fiber and other light materials.

As you no doubt know it has a small, but significant, problem: its batteries have caught on fire. One instance happened while the plane was on the ground, while another fire caused an emergency landing. I personally like my planes to avoid fires, and I personally prefer them not to make forced landings. I was once on a DC-10 from Rome to Newark that lost major hydraulics and had to make a forced landing in Paris, but that is another story.

## 1000 Percent

LaHood’s comment that he was going to be 1,000% sure is a doubly scary statement to me. As an engineer and scientist I know that a percent, when it measures belief, is always a ratio from 0 to 100. You can have any value in between, but you cannot have anything larger than 100% when talking about belief. It is not possible. Yes, you can have a 200% increase in your intake of chocolate (a great idea), but not when it refers to certainty.

The British sometimes write percent as two words, “per cent.” We, in the US, almost always write “percent.” But one word or two, it cannot exceed one hundred percent. It seems scary to me for the head of an engineering agency to use the term 1,000%.

The other scary part to saying 1,000% reminds me of the famous presidential race in the U.S. in 1972. On July 25, 1972, George McGovern, the Democratic Presidential candidate, discovered his vice presidential candidate had once had electroshock therapy for clinical depression. McGovern quickly made a now famous remark to the press that he was

“1000 percent behind Tom Eagleton, and I have no intention of dropping him from the ticket.”

Eagleton was soon dropped, replaced by Sargent Shriver, and McGovern was defeated by one of the largest landslides in history.

## A Billion To One

The real fundamental issue is not the use of a silly phrase like “1000%,” but rather:

how do we test things to be sure they will work?

Prior to the 787’s fires, the prediction by Boeing was that a fire would happen at most once in about one billion flight hours. That seems pretty small, a minor risk, with many other risks of flying well ahead of the battery issue. But they were off by several orders of magnitude.

The question mathematically is how can we calculate the probability of some event $E$, and determine that it will occur in real life with probability much less than one in a billion? In mathematics we can calculate low probabilities, but that seems to me very different from how to estimate real-life probabilities.

Let’s look at how one might calculate a low probability. Suppose that the event $E$ is of the form:

$$E = E_1 \wedge E_2 \wedge E_3.$$

If we can estimate that each of the events $E_i$ occurs about $1/1000$ of the time, then we can conclude that $E$’s probability is about one in a billion, and less than that if each estimate is an upper bound. Of course this assumes that the events are independent. If they are not independent, then we are unable to make this calculation.
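To see how much rides on independence, here is a minimal Python sketch. The three events at 1/1000 each are one concrete instance that multiplies out to one in a billion. The common-cause model and the value `q = 1/10**6` are purely illustrative assumptions, not figures from Boeing or the FAA.

```python
from fractions import Fraction


def joint_if_independent(probs):
    """P(E1 and E2 and ...) when the events really are independent."""
    result = Fraction(1)
    for p in probs:
        result *= p
    return result


def joint_with_common_cause(probs, q):
    """Hypothetical model: with probability q a shared cause triggers every
    event at once; otherwise the events behave independently."""
    return q + (1 - q) * joint_if_independent(probs)


p = Fraction(1, 1000)  # each event occurs about 1/1000 of the time
independent = joint_if_independent([p, p, p])
# Assumed common-cause probability, chosen only for illustration.
correlated = joint_with_common_cause([p, p, p], Fraction(1, 10**6))

print("independent estimate:", independent)
print("underestimate factor:", float(correlated / independent))
```

Under these assumptions, a shared cause that itself shows up only once in a million trials already makes the naive product too optimistic by a factor of about a thousand.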

In science and engineering such claims of independence are one way to calculate and claim very low probabilities. The fundamental issue with any such claim is that the events may be related. They may not be independent, and hence our calculation could be way off.

Note, testing whether or not two events are independent is as difficult as calculating the original probability. So what else can one do?

There is a method of stress-testing that is used in industry, and it was used by Boeing’s engineers. The idea is to estimate the probability that something fails by first running it in a more stringent environment. A simple computer example would be to heat up a processor chip and see how long it takes to fail. The assumption is that if it lasts $t$ hours before failing in a heated state, it will last $T$ hours in a normal state. The hope is that this gives $T \gg t$, and can therefore help compute very low probability events.
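One standard way to turn a heated-chip test into a lifetime estimate is the Arrhenius model, which gives the ratio of normal-condition lifetime to stressed lifetime as a function of the two temperatures. The sketch below is only an illustration of that kind of extrapolation: nothing in this post says Boeing used this model, and the activation energy, temperatures, and 1000-hour test duration are assumed values.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def arrhenius_acceleration(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of normal-condition lifetime to stressed lifetime
    under the Arrhenius model (temperatures given in Celsius)."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


# Assumed, illustrative inputs: 0.7 eV activation energy, 55 C in service, 125 C on the bench.
factor = arrhenius_acceleration(0.7, use_temp_c=55.0, stress_temp_c=125.0)
stressed_hours = 1000.0  # hypothetical time to failure in the heated test
print(f"acceleration factor: {factor:.1f}")
print(f"extrapolated normal lifetime: {stressed_hours * factor:.0f} hours")
```

The extrapolation is only as good as the model: if a failure mechanism appears in service that the heated test never exercises, the computed factor says nothing about it.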

The trouble with these methods—independence and stress-tests—is that they do not seem to be based on a formal mathematical theory. The key question is, is there some way to make these methods into a formal theory? I do not know.

## A Googol To One

Often what one estimates is not directly the probability of failure in any one instance, but rather the mean time between failures (MTBF). In this case the MTBF would be expressed in units of flight hours, including time on the ground. Two advantages are that the MTBF can often be estimated without explicit reference to probabilities, and the time intervals have units and values that appeal to human reasoning.

Ken notes that the largest system to acquire a reputable MTBF estimate in recent years is the universe, specifically the pocket of the cosmos that includes Earth. This is a by-product of the measurement of the Higgs boson mass coming in between 125 and 126 GeV, significantly under the border of 129-to-130 GeV above which the Higgs field would not have a lower energy configuration to possibly transit to. Last month a speaker’s reference to “many tens of billions of years” as the expected time before transition was re-quoted as “about ten billion” and then “a few billion” years. Happily we can cite an authoritative physics site’s estimate from a noted paper that the MTBF of space under the Higgs field is on the order of $10^{100}$ years.

The standard idea of inferring odds by inverting the MTBF then says the chance of space blowing up this calendar year is about one-in-a-googol. We still wonder about how these calculations are done, and what assumptions about independence and randomness of events such as those affecting the Higgs field are made. Estimates analogous to the above “stress tests” have generally used total concentrated energy as the main factor in estimating the ratio of normal to stressed lifetime, but we wonder about other possible factors.
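Inverting an MTBF into odds quietly assumes a constant failure rate, so that failures follow an exponential distribution. A minimal sketch of that assumption follows; the function name is ours, and the two MTBF values are the googol-year and billion-hour figures quoted in this post.

```python
import math


def failure_probability(interval, mtbf):
    """P(at least one failure within `interval`), assuming a constant
    failure rate of 1/mtbf (exponential model). Same time unit for both."""
    # expm1 keeps precision when interval/mtbf is tiny.
    return -math.expm1(-interval / mtbf)


universe_per_year = failure_probability(1.0, 1e100)  # one year vs. a googol-year MTBF
battery_per_hour = failure_probability(1.0, 1e9)     # one flight hour vs. a billion-hour MTBF
full_mtbf = failure_probability(1e9, 1e9)            # an interval as long as the MTBF itself

print(universe_per_year, battery_per_hour, full_mtbf)
```

For short intervals the answer is essentially `interval / mtbf`, which is the "inversion" used above. Over an interval as long as the MTBF itself the same model gives roughly a 63% chance of failure, not certainty, and if the failure rate is not constant the inversion does not apply at all.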

## A Billion to One Again?

Closer to home, the unlikeliest event in poker may be a hand of 4 aces losing to a royal flush. Note that this can happen with a single 52-card deck if at least one ace is a “community card” that can be shared in different hands, such as in Texas Hold’Em poker. The odds of this have been cited as 2.7 billion to one against, yet it happened (video) in 2008 in the world’s premier poker tournament, the World Series of Poker (WSOP):

Have 2.7 billion deals been played in the history of the WSOP? An estimate based on a thousand tables, each playing about 200 deals a day, puts the number on the order of a few million. The “1-in-2.7 billion” odds figure comes from our source for the above photo from ESPN HD coverage, but we dispute it. For the 5 community cards to include two aces is roughly 1-in-25. For the other 3 cards to include one of the 8 cards to help a royal flush is then almost a 50% chance. Of the remaining 2 cards, one must be one of the other 3 cards that allow for a possible royal flush, just under a 1/8 chance. Hence it seems about 1-in-500 community spreads allow this to happen. Then we need one hand to have the other two aces; figuring 10 two-card hands that’s about (2/5)*(1/50) = 1/125, and ditto for the other two royal-flush cards. This gives a rough Enrico Fermi-style estimate of 1 in 500 × 125 × 125, or about 1 in 8 million, right on the order of the number of WSOP hands.
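The arithmetic in this estimate is easy to check. The sketch below computes the exact chance that five community cards contain exactly two aces, then multiplies the three rounded factors from the paragraph above; the variable names are ours.

```python
from fractions import Fraction
from math import comb

# Exact chance that the 5 community cards contain exactly two of the four aces.
two_aces = Fraction(comb(4, 2) * comb(48, 3), comb(52, 5))

# Rounded factors taken from the text.
community = Fraction(1, 500)                   # community cards allow quads vs. royal flush
quad_hole = Fraction(2, 5) * Fraction(1, 50)   # some hand holds the other two aces
flush_hole = quad_hole                         # "ditto" for the two royal-flush cards

estimate = community * quad_hole * flush_hole

print("two aces on board: about 1 in", round(float(1 / two_aces), 1))
print("Fermi estimate: 1 in", 1 / estimate)
```

The exact two-ace figure comes out very close to the quoted 1-in-25, and the product is 1 in 7,812,500, which rounds to the 8 million above.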

Of more importance, if you hold the two aces, and have bet confidently since before the “flop” of the first 3 community cards, and someone across the table hasn’t folded, the odds are probably higher than 1-in-125 that the other guy has the flush cards. Such *conditional* probabilities play hob with risk analysis. When are we in a conditional-probability situation? Can we know?

## Open Problems

What about the 787? Are you ready to fly in it soon? What about the estimation of very low probability events?

As for the 787, maybe you should use an area where people believe in reincarnation for additional testing.

The “billion flight hours” was not intended as a prediction of failure rate. It is intended to represent a reasonable expectation for “no known combination of single failures shall cause a crash” while avoiding the old “One-Horse Shay” failure mode of everything failing at once. It comes from the results of a specific form of analysis:

It was recognized by the FAA in 1970 that aircraft manufacturers needed to consider the possibility of multiple faults in assessing the safety of new aircraft designs. In 1982, the FAA further recognized that the aircraft and its systems were becoming so complex that engineering judgement, testing, and historical data were insufficient to cope with the increasing interactions between multiple faults. So they mandated the supplementary use of analytical methods for analyzing the risk of multiple faults. The “billion flight hours” figure refers specifically to the results of that analysis. SAE ARP 4761 defines the actual analytical methods used, but the fundamental fault tree analysis is discussed in NUREG-0492.

I’ll note that a common misconception (even in many so-called references) is that fault tree analysis assumes independent faults. It doesn’t, and good tools and analyses are full of representations of common cause failures. The problem is that nature is often sneaky about adding common causes, and humans are sometimes poor about finding them.

–Daniel Johnson

I count (52 choose 2)(50 choose 2)(48 choose 5) = 2.7 trillion ways to deal two 2-card hands and 5 community cards. Wonder if that is where 2.7 comes from? That would then be another example of vanishing decimal places.

For k players, pick one to give 2 of 4 aces, give other two to community, choose one of their suits, pick another player for 2 of 4 royal flush cards, give other two to community, and pick the remaining community card: k(k-1) (4 choose 2)^2*2*44. The remaining player hands cancel in numerator and denominator. So with k=10, I calculate one in 9.75 million; close enough to your envelope calculation that I’m willing to believe mine…
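The count in this comment can be reproduced in a few lines of Python. This is a sketch of the commenter's formula as written, covering only the case of two aces and two royal-flush cards in the community; the function and variable names are ours.

```python
from math import comb


def quads_vs_royal_odds(k):
    """Return (ways, favorable, one_in) for k players, following the
    formula k(k-1) * C(4,2)^2 * 2 * 44 from the comment."""
    # Ways to deal two specific 2-card hands plus 5 community cards.
    ways = comb(52, 2) * comb(50, 2) * comb(48, 5)
    # 2 of 4 aces in one hand, a royal suit from the 2 community aces,
    # 2 of the 4 remaining royal cards in another hand, 44 choices for the last card.
    per_pair = comb(4, 2) * 2 * comb(4, 2) * 44
    favorable = k * (k - 1) * per_pair
    return ways, favorable, ways / favorable


ways, favorable, one_in = quads_vs_royal_odds(10)
print(f"{ways:,} deals, {favorable:,} favorable, about 1 in {one_in:,.0f}")
```

With `k = 10` this gives roughly 2.78 trillion deals and odds a little over 1 in 9.75 million, matching the figures quoted in the comments.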

But what about the case where there are 3 aces in community? The two other community cards go towards the royal flush. Using simulation, I got something approximately 1 in 4 million.

The reason they have the same order likelihood is that there are now 44 possibilities for the second hole card of the player with the 4th ace or the 5th card in the royal flush, as opposed to 44 possibilities for the 5th community card.

Interesting—I (Ken) thought these cases would contribute only a lower-order term to the overall probability.

Also, for that matter, there is the case where 4 of the community cards go towards the Royal Flush, including an ace, with the other community card being an ace.

A lot of the work in this space, and probably the most stringent, has been done in relation to the safety of nuclear weapons.

There is some pretty interesting public material about how they go about this and, as I gather, they pay a lot of attention to the individual pdfs, and then merge these in a very stringent fashion.

The overall term they use is “math with uncertain numbers.”

Here is an interesting Sandia report on the subject: http://www.sandia.gov/epistemic/Reports/SAND2002-4015.pdf

the sad boeing story is partly about how complexity in [esp new! innovative!] design can be overwhelming to the point of crushing, a concept that is quite a familiar daily challenge for software engrs, although there are not too many books on the subj [grady booch is an interesting/excellent thinker/writer/speaker on the subj of architecture complexity]. heres a very standout analysis by a boeing insider called “how boeings dreamliner lost its way”. in a way boeings story of loss of supremacy is a sort of microcosm of the macrocosm of american ingenuity, engineering, and manufacturing. seriously challenged & yes, threatened in a very globalized, hypercompetitive world.

“As an engineer and scientist I know that a percent is always a ratio from 0 to 100.”

Surely (based on the end of the same paragraph) you mean something like “Probability is always between 0 and 100%”.

I think I know where the 2.7 billion came from.

If you deal 5 random cards from a deck, the probability of getting a royal flush is 1 in 649740. The probability of getting a four-of-a-kind is 1 in 4165. If you naively multiply those two probabilities together, you get roughly 1 in 2.7 billion. I bet that’s what they did.
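This guess is easy to verify with standard five-card hand counts. A short sketch, with variable names of our choosing:

```python
from math import comb

hands = comb(52, 5)          # all five-card hands
royal_flushes = 4            # one per suit
four_of_a_kind = 13 * 48     # rank of the quads, then any fifth card

royal_one_in = hands // royal_flushes    # odds against a royal flush
quads_one_in = hands // four_of_a_kind   # odds against four of a kind
naive_one_in = royal_one_in * quads_one_in

print(royal_one_in, quads_one_in, f"{naive_one_in:,}")
```

The product is 2,706,167,100, so naive multiplication of two five-card odds does land on "2.7 billion", even though the two events share community cards and are not independent five-card deals.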

Regarding your statement about conditional probabilities and risk analysis in poker, here is a famous case where a player folded four of a kind to one river bet, having put his opponent squarely on a straight flush (not the royal flush, but still):

http://www.pokernews.com/live-reporting/2012-world-series-of-poker/event-55-the-big-one-for-one-drop-no-limit-hold-em/day1/post.207240.htm

http://www.pocketfives.com/articles/folding-quads-wsop-one-drop-craziest-hand-i-ve-ever-seen-587508/

And the ensuing incredulity in the online poker scene:

http://forumserver.twoplustwo.com/29/news-views-gossip/russian-fold-quads-one-drop-1217390/

(among the comments: “I’m just surprised a russian actually managed to fold a hand. “)

Good grief, I am amazed! Thanks!