Some football wisdom from Dick Karp
|Cropped from S.I. Kids source|
John Urschel is a PhD student in the Applied Mathematics program at MIT. He has co-authored two papers with his Penn State Master’s advisor, Ludmil Zikatanov, on spectra-based approximation algorithms for the -complete graph bisection problem. A followup paper with Zikatanov and two others exploited the earlier work to give new fast solvers for minimal eigenvectors of graph Laplacians. He also plays offensive guard for the NFL Baltimore Ravens.
Today Ken and I wish to talk about a new result by the front linesman of -completeness, Dick Karp, about football.
Karp—Dick—-is a huge fan of various sports. Recently at Avi Wigderson’s birthday conference, held at Princeton’s IAS, he told me a neat new result on how to play football. It concerns a basic decision that a coach faces after his team scores a touchdown: kick for one extra point or attempt a riskier “two-point conversion” play.
We wonder whether players like Urschel might be useful for blocking out these decisions, more than blocking and opening “holes” for runners. By the way Urschel, who attended Canisius High School in Buffalo before his B.S. and M.S. plus a Stats minor at Penn State, is no stranger to IAS. Here he is with my friend Peter Sarnak there last year:
|AMS Notices feature source
Thus he is doing all the things we would advise a young theorist or mathematician to do: publish, circulate, talk about interesting problems, get on a research team, and open up avenues for deep penetration and advances by teammates.
The Football Rules
After a touchdown is made the team scoring gets 6 points. It then has an option:
- The team can kick the ball and try to get it through the goal posts. This is worth 1 extra point.
- The team can try to get the ball into the end zone by running or throwing it. This is worth 2 extra points.
Traditionally the right call in most game situations is (1). Usually the kicker can make the ball go through the posts most of the time, while getting the ball into the end zone is much more difficult. Of course at the end of a game there may be reasons to try for the 2 points. If the game is about over and you need 2 points to tie, that is probably the best play.
Karp set up a basic model of this decision. His model is a bit idealized, but I expect that he can and will make it more realistic in the future. The version I heard over a wonderful dinner that Kathryn Farley, my wife, set up for a small group of friends was not the proper venue to go into various technical details. So I will just relay his basic idea.
In his model we make several assumptions:
- We assume that the kicker is perfect: every try for the kick option works. Especially now with the kick moved back from the 2 to the 15 yard line, that is a pipe dream for any coach in the NFL, but it makes the discussion cleaner.
- We assume that the 2-point option succeeds 1/2 the time. That actually was exactly the case over a recent five-year stretch in the NFL.
- We assume that the team will score many touchdowns and therefore will face many such decisions.
The last clause means that we’re modeling the choice by an infinite walk. If you wish you may subtract 1 so that a kick gives 0, a successful play gives +1, but an unsuccessful one gives -1. Karp’s question is this:
What should the coach do? Always kick or sometimes go for two?
You might think about it before reading on for Karp’s answer.
His insight is this:
Theorem 1 (Fundamental Theorem of Football?) The optimal strategy is initially always to go for two. If after some number of tries you have succeeded times, so that you are ahead of what kicking would have brought, switch over to kicking.
Not Gambler’s Ruin?
Ken’s first reaction was to note a difference from “gambler’s ruin” which means to double-down after every lost 50-50 bet. In football this would mean that after you missed one conversion play, the next try would bring +3 points on success but subtract 1 from your score if you missed. Next time you could go for 5 but failure would cost 3. If you think in the +1/-1 terms compared to 0 for kicking, then this is the classic martingale system of doubling the bet 1,2,4,8… until you win and net +1. The ruin for gamblers is the chance of swiftly going bankrupt—but in football you can only lose one game.
However, we are not allowing doubling the bet either. It’s the classic random walk situation: right or left along the number line with equal probability, except that you can elect to stop and stay where you are. With probability 1, such a walk starting at 0 will reach a stage in positive territory at +1 net, and then we stop.
A real game, of course, does not have unboundedly many touchdowns—though some college games I’ve watched have sure felt like it. So if you miss a few two-point tries and the game is deep into the second half, you’re left holding the bag of foregone points.
Toward Deepening the Result
The question comes down to, what is the utility of nosing ahead by the extra point when you succeed, compared to being down more when you fail? How likely are game scenarios where that one extra point is the decider? To be concrete, suppose you score 3 touchdowns in the first three quarters. Following Karp’s strategy nets you an extra point 5/8 of the time: succeed the first time then kick twice, or fail the first time and succeed twice. Just 1/4 of the time you’ve lost a point, but 1/8 of the time you’re net -3 and need an extra field goal to get back to par.
There are late-game situations where the extra point is worth so much that it pays to go for two even with a chance of success that is under 40%. Suppose you are down 14 points and score a touchdown with 4 minutes left. You have enough time to stop the other team and get the ball back again for one more drive, but that’s basically it. You have to assume you will score a touchdown on that drive too, so the only variable is the conversion decision. The issue is that if you kick now and kick again, you’ve only tied the game and have a 50% win chance in overtime. Whereas if you go for two you have an chance of winning based on this figuring, plus if you fail you can still tie and get to overtime after your next TD. Thus your win expectation is
which crosses 50% when
When you have a expectation by going for two. Yet for human reasons, this is not on the standard chart of game situations calling for a two-point try. The human bias is toward maximizing your chances of “staying in the game” which is not the same as maximizing winning. There was a neat analysis of a similar situation in chess last year.
The challenging question to deepen Karp’s insight is, how far can we sensibly broaden this kind of analysis? Does this observation apply in real games? It seems to call for a big modeling and simulation effort, guided by theory where we might vary Karp’s simple rule and adjust for different probabilities of success on both the two-point tries and the kicks. This would bring it into the realm of machine learning with high-dimensional data, and per remarks on Urschel’s MIT homepage, perhaps he is headed that way.
What do you think of the insight for football strategy? We could also talk about when (not) to punt…
Urschel missed this season’s first three games but is starting at left guard right now for the Ravens, who are beating up on my Giants 10-0 as we go to post. Oh well. Of course he reminds me I once took classes from an NFL quarterback who similarly “went for two” in football and mathematics. We wish Urschel all the best and will follow his careers with interest. Enjoy the games today and tomorrow night.
[de-LaTeXed numerals for better look]
In the context of stable matching problems
Jamie Morgenstern is a researcher into machine learning, economics, and especially mechanism design.
Today Ken and I would like to discuss a joint paper of hers on the classic problem of matching schools and students.
The paper is titled, “Approximately Stable, School Optimal, and Student-Truthful Many-to-One Matchings (via Differential Privacy).” It is joint with Sampath Kannan, Aaron Roth, and Zhiwei Wu. Let’s call this paper (KMRW). She gave a talk on it at Georgia Tech last year.
The Classic Matching Problems
There are various instances of matching-type problems, but perhaps the most important is the NRMP assignment of graduating medical students to their first hospital appointments. In 2012 Lloyd Shapley and Alvin Roth—Aaron’s father—were awarded the Nobel Prize in Economics “for the theory of stable allocations and the practice of market design.”
The original matching problem was that of marrying females and males, on which we just posted about a recent joint paper of Noam Nisan. Here is a standard description of the stable matching problem (SMP):
Assume there are men and women. Further each has a linear ranking of the members of the opposite sex. Find a way to marry all the men and all the women so that the assignment is stable. This means that no two people would both rather marry each other instead of the partners they are assigned to.
In 1962, David Gale and Shapley proved that any SMP always has a solution, and even better, they gave a quadratic time algorithm that finds it. SMP as stated is less practical than the problem that results if the men are also allowed to marry each other, likewise the women, with everyone ranking everyone else as partners. But in the same famous paper, Gale and Shapley solved an even more realistic problem:
The hospitals/residents problem—also known as the college admissions problem—differs from the stable marriage problem in that the “women” can accept “proposals” from more than one “man” (e.g., a hospital can take multiple residents, or a college can take an incoming class of more than one student). Algorithms to solve the hospitals/residents problem can be hospital-oriented (female-optimal) or resident-oriented (male-optimal). This problem was solved, with an algorithm, in the original paper by Gale and Shapley, in which SMP was solved.
Let’s call this the college admissions problem (CAP).
Gaming the System
Consider adding another condition to CAP. Stability is clearly an important requirement—without it there would be students and colleges that would prefer to swap their choices. Having stability which avoids such a situation is quite nice. It does not force the assignment to be perfect in any sense, but does make it at least a “local” optimal solution. This is important, and it is used in real assignment problems today.
Yet there is a possibility that students, for example, could “game” the system. What if a student selected a strange ordering of schools that they claim they wish to attend. But the list is not really what they want. Why would they do this? Simply they may be able to lie about their preferences to insure that they get their top choice.
The basic point seems easiest to illustrate in the non-gender, single-party case, so let’s say need to form two pairs. They have circular first choices and second choices in the opposite circle:
Let’s say happens. and got their second choices, but it is not to their advantage to room together. They could have had their first choices under the also-stable configuration but there was no way to force the algorithm to choose it. However, let’s suppose and lie and declare each other to be their second choices:
The algorithm given these preferences sees as unstable since would join with , and the resulting as unstable because and would prefer each other. The resulting stable now gives and their first choices. Given what they may have known about ‘s and ‘s preference lists, there was no danger they’d actually have to pair up. Examples in the two-gender and student-college cases need six actors but the ideas are similar.
Systematizing the Game
The basic Gale-Shapley algorithm for the two-party case actually has two “poles” depending on which party has its preferences attended to first. The current NRMP algorithm favors the applicants first. In college admissions one can imagine the schools going first. Alvin Roth proved that the poles had the following impact on truthfulness:
Theorem 5. In the matching procedure which always yields the optimal stable outcome for a given one of the two sets of agents (i.e., for or for ), truthful revelation is a dominant strategy for all the agents in that set.
Corollary 5.1. In the matching procedure which always yields the optimal stable outcome for a given set of agents, the agents in the other set have no incentive to misrepresent their first choice.
A dominant strategy here means a submitted list of preferences that the individual has no motive to change under any combination of preferences submitted by the other agents (on either side). The upshot is that the favored side has no reason to deviate from their true preferences, but the non-favored side has motive to spy on each other (short of colluding) and lie about their second and further preferences.
There has been much research on improving fairness by mediating between the “poles” but this has not solved the truthfulness issue. Can we structure the algorithm so that it still finds stable configurations but works in a way that both parties have incentive to be truthful and use their real orderings?
Unfortunately, Roth’s same paper proved that there is no method to solve SMP or CAP that is both stable and truthful. Only the above conditions on one side or the other are possible.
Can We Handle Truth?
Judge Randolph: Consider yourself in contempt!
Kaffee: Colonel Jessup, did you order the Code Red?
Judge Randolph: You don’t have to answer that question!
Col. Jessup: I’ll answer the question!
Col. Jessup: You want answers?
Kaffee: I think I’m entitled to.
Col. Jessup: You want answers?
Kaffee: I want the truth!
Col. Jessup: You can’t handle the truth!
However, a recurring theme in theory especially since the 1980s is that often we can work around such negative results by:
- randomizing so that requirements hold “with high probability” or “in expected case,” and/or
- accepting approximations, and/or
- making results (more) asymptotic, and/or
- fencing off undesirable cases in some way.
Recall that in the college admissions case where the schools act first, the schools have no motive to do other than declare their true preferences, but the students might profit if they submit bogus second, third, and further preferences to the algorithm. The new KMRW paper states the following:
We present a mechanism for computing asymptotically stable school optimal matchings, while guaranteeing that it is an asymptotic dominant strategy for every student to report their true preferences to the mechanism. Our main tool in this endeavor is differential privacy: we give an algorithm that coordinates a stable matching using differentially private signals, which lead to our truthfulness guarantee. This is the first setting in which it is known how to achieve nontrivial truthfulness guarantees for students when computing school optimal matchings, assuming worst- case preferences (for schools and students) in large markets.
Approximation and Truth
By “approximately stable” the KMRW paper means a relaxation of the third of three conditions on a function that maps students to colleges plus the “gap year” option . If left off his or her application list, that is read as preferring to . Symmetrically, may have an admission threshold that some students fall below.
- The set of students mapped to is no greater than the capacity of .
- For every student mapped to and student in a college it is not the case that listed over and ranked over .
- For every college with an opening, i.e. , and student who listed as a possibility, either is below the admission threshold of or prefers to .
The relaxation is that for some fixed , colleges may hold on admitting qualified students who might prefer them if they are within of their capacity:
3′. Same as 3. but with in place of the first condition.
There is also an approximation condition on the potential gain from lying about preferences. This requires postulating a numerical utility for student to attend college that is monotone in the true preferences. Given , say the preference list submitted by is -approximately dominant if for all other lists by —and all lists submitted by other students which induce the mappings with and with —we have
In fact, KMRW pull the randomization lever by stipulating the bound only on the difference in expected values over matchings produced by the randomized algorithm they construct. Then this says that no student can expect to gain more than utility by gaming the system.
Confining the utility values to ensures that any is larger than some differences if the number of colleges is great enough, so this allows some slack compared to strict dominance which holds if . It also enables the utility values to be implicitly universally quantified in the condition by which and “mesh” with , the number of students, and the minimum capacity of any college:
Theorem 1 Given and , there are universal constants and an efficient randomized algorithm that with high probability produces a feasible -approximately stable solution in which submitting the true preferences is -approximately dominant in expected value for each student, provided that and
This is an informal statement. The mechanism underlying the proof also works for not fixed: provided grows faster than , the approximate stability and probable-approximate truth-telling dominance can be achieved with and both tending toward . It is neat that the approximations are achieved using concepts and tools from differential privacy, which we have posted about before. By analogy with PAC, we might summarize the whole system as being “probably approximately truthful.”
Should We Handle Truth?
The notion of truth is fundamental. In logic the notion of truth as applied to mathematical theories is central to the Incompleteness Theorems of Kurt Gödel and Alfred Tarski.
In economics the notion of truth is different but perhaps even more important. Imagine any set of agents that are involved in some type of interaction: it could be a game, an auction, or some more complex type of interaction. Typically these agents make decisions, which affect not only what happens to them, but also to the other agents. In our “” example above, the further point is that “gaming the system” by and lowered the utility for and . But and could have tried the same game.
The effect on others highlights that this is a basic problem. It seems best when the interaction does not reward agents that essentially “lie” about their true interests. This speaks our desire for a system that rewards telling the truth—and takes the subject into the area of algorithmic mechanism design which we also featured in the post on Nisan. Indeed truth is addressed in his Gödel Prize-winning paper with Amir Ronen, for instance defining a mechanism to be strongly truthful if truth-telling is the only dominant strategy.
That paper follows with a sub-sub-section 4.3.1 on “basic properties of truthful implementations”—but what I’m not finding in these papers are theorems that tell me why truthfulness is important in economic interactions. It sounds self evident, but is it? There are many papers that show one cannot force agents to be truthful, and there are other results showing cases in which the agents’ individual best interests are to be truthful after all. I understand why a solution to a matching problem should be stable, but am not convinced that it needs to be truthful. In mathematics we can define a property that we add to restrict solutions to some problem, but we usually need to justify the property. If we are solving a equation, we may restrict the answers to be integers. The reason could be as simple as non-integer answers making no sense, such as buying 14.67 cans of Sprite for a party.
I get that being truthful does stop some behavior that one might informally dislike. What I feel as an outsider to the research into matching problems is simple: where is the theorem that shows that adding truthful behavior has some advantage? It is true in the analogous case of auctions that they can be designed so that truthful bidding is provably a dominant strategy, and plausibly this matters to competitors agreeing to and paying for the auction mechanism. Perhaps there is a “meta” game level where mechanism designs are strategies and there is a communal payoff function, in which truth-inducing mechanism designs may be optimal strategies. But overall I am puzzled. Perhaps this just shows how naive I am about this line of work.
What are your reactions on the importance of inducing truth-telling? To show at least dewy diligence on our part, here are a few good references. What would Gödel—who conveyed certain of his own mechanism design opinions to his friend Oskar Morgenstern (no relation)—say?
The winner of the 2016 ACM-IEEE Knuth Prize
Noam Nisan has been one of the leaders in computational complexity and algorithms for many years. He has just been named the winner of the 2016 Donald E. Knuth Prize.
Today we congratulate Noam and highlight some of his recent contributions to algorithmic economics and complexity.
Local rules can achieve global behavior
Sarah Cannon is a current PhD student in our Algorithms, Combinatorics, and Optimization program working with Dana Randall. Sarah has a newly-updated paper with Dana and Joshua Daymude and Andrea Richa entitled, “A Markov Chain Algorithm for Compression in Self-Organizing Particle Systems.” An earlier version was presented at PODC 2016.
Today Ken and I would like to discuss the paper, and relate it to some recent results on soft robots.
Toward teaching computability and complexity simultaneously
|Large Numbers in Computing source|
Wilhelm Ackermann was a mathematician best known for work in constructive aspects of logic. The Ackermann function is named after him. It is used both in complexity theory and in data structure theory. That is a pretty neat combination.
I would like today to talk about a proof of the famous Halting Problem.
A new longest computer proof makes us wonder about things from security to the Exponential Time Hypothesis
Marijn Heule, Oliver Kullmann, and Victor Marek are experts in practical SAT solvers. They recently used this ability to solve a longstanding problem popularized by Ron Graham.
Today Ken and I want to discuss their work and ask a question about its ramifications.
A way to make indirect reasoning more palpable
|Wikimedia Commons source|
Nicholas Saunderson was the fourth Lucasian Professor at Cambridge, two after Isaac Newton. He promoted Newton’s Principia Mathematica in the Cambridge curriculum but channeled his original work into lecture notes and treatises rather than published papers. After his death, most of his work was collected into one book, The Elements of Algebra in Ten Books, whose title recalls Euclid’s Elements. It includes what is often credited as the first “extended” version of Euclid’s algorithm.
Today we raise the idea of using algorithms such as this as the basis for proofs.