Just Arvind
Theory and practice
[ MIT ]
Arvind Mithal—almost always referred to as Arvind—is now the head of the faculty of computer science at a Boston trade school. The school, also known as MIT, is of course one of the top places for all things computer science. From education to service to startups to research, MIT is perhaps the best in the world.
Today I thought we might discuss one of Arvind’s main research themes.
Although this work started in the 1980’s I believe that it underscores an important point. This insight might apply to current research topics like quantum computers.
But first I cannot resist saying something personal about him. Arvind is a long-time friend of mine, and someone who scores many hours on the "stuck in an elevator" measure. This measure is one I made up, but I hope you get the idea.
Arvind is the only name I have ever known for Arvind. I was a bit surprised to see that the Wikipedia reference for him states his “full” name. He told me that he found having one name—not two—was a challenge. For example, he sometimes had to explain when arriving at a check-in desk at a conference that he only had one name. His name tag often became:
because the conference software could not handle people with one name.
Dataflow—The prehistory
About forty years ago one of the major open problems in CS was how to make computers go faster. It is still an issue, but in the 1980s this problem was an all-hands-on-deck affair. It was worked on by software engineers, by electrical engineers, and by researchers of all kinds, including complexity theorists. Conferences like FOCS and STOC—hardcore theory conferences—often contained papers on how to speed up computations.
Two examples come to mind:
The idea of systolic arrays, led by Hsiang-Tsung Kung, then at CMU. His students, especially Charles Leiserson, also made important contributions.
The idea of the Ultracomputer, led by Jacob Schwartz at NYU. An ultracomputer has N processors, N memories, and an N log N message-passing switch connecting them. Note the use of "N," so you can tell that it came from a theorist.
Arvind tried to make computers faster by inventing a new type of computer architecture. Computers then were based on the classic von Neumann architecture or control flow architecture. He and his colleagues worked for many years trying to replace this architecture by dataflow.
A bottleneck in von Neumann style machines is caused by the instruction-fetch cycle. Each cycle the program counter determines which instruction to fetch, and thus which data to fetch. Instructions and data use the same hardware channel, which causes the famous von Neumann bottleneck. We have modified Wikipedia's graphic to make the bottleneck aspect clearer:
Dataflow—The promise
Arvind is usually identified with the invention of dataflow computer architecture. The key idea of this architecture is to avoid the above bottleneck by eliminating the program counter. If there are no instructions to fetch, then it would seem that we can beat the bottleneck. Great idea.
Dataflow architectures do not have a program counter, so data is king. Roughly, data objects move around in such a machine and eventually appear at computational units. At a high level these machines operate like a directed graph from complexity theory: data moves along edges to nodes that compute values. Moreover, a node computes as soon as all of its input data is present. After the node computes its value, the new data is sent on to the next node; and so on.
The hope was that this would yield a way to increase the performance of computers. No program counter, no instruction fetching, could make dataflow machines faster.
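The firing rule just described can be made concrete with a toy interpreter. This is a sketch for illustration only, not Arvind's actual architecture: the graph, node names, and scheduling loop are all invented for this example.

```python
from collections import deque

# A toy dataflow interpreter: nodes "fire" as soon as every input edge
# carries a value -- there is no program counter deciding what runs next.

# Each node has an operation and a list of input edges (source node names).
graph = {
    "a":   {"op": lambda: 3,          "ins": []},
    "b":   {"op": lambda: 4,          "ins": []},
    "sum": {"op": lambda x, y: x + y, "ins": ["a", "b"]},
    "sq":  {"op": lambda x: x * x,    "ins": ["sum"]},
}

def run(graph):
    values = {}
    # Source nodes (no inputs) are ready immediately.
    ready = deque(n for n, node in graph.items() if not node["ins"])
    while ready:
        name = ready.popleft()
        node = graph[name]
        # Fire: all inputs are present, so compute and store the result.
        values[name] = node["op"](*(values[i] for i in node["ins"]))
        # Any node whose inputs are now all available becomes ready.
        for m, other in graph.items():
            if m not in values and m not in ready \
               and all(i in values for i in other["ins"]):
                ready.append(m)
    return values

print(run(graph)["sq"])  # (3 + 4)^2 = 49
```

Note that nothing in `run` consults an instruction address; the order of firing is driven entirely by data availability, which is the point of the architecture.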
Dataflow—The actual
The dataflow idea is clever. The difficulty is that while dataflow can work, classic von Neumann machines have been augmented in various ways. The competition rarely stays put. These updates did not eliminate the von Neumann bottleneck, but they reduced its cost and let von Neumann machines continue to get faster. Caches are one example of how they sidestep the bottleneck: a cache for program instructions makes it likely that the instruction fetch does not actually reach main memory. This attacks the main advantage of dataflow machines and makes them less attractive.
The paper by David Culler, Klaus Schauser, and Thorsten von Eicken titled Two Fundamental Limits on Dataflow Multiprocessing discusses these issues in detail. They start by saying:
The advantages of dataflow architectures were argued persuasively in a seminal 1983 paper by Arvind and Iannucci and in a 1987 revision entitled “Two Fundamental Issues in Multiprocessing”. However, reality has proved less favorable to this approach than their arguments would suggest. This motivates us to examine the line of reasoning that has driven dataflow architectures and fine-grain multithreading to understand where the argument went awry.
Dataflow—Lessons
What are the lessons? Is there one?
I claim there are lessons. The work of Arvind on dataflow was and is important. It did not lead to the demise of von Neumann machines. I am using one right now to write this.
Dataflow did lead to insights on programming that have many applications. The dataflow idea may yet impact special computational situations: there is interest in using it for data-intensive applications.
Ken adds that dataflow has been realized in other ways. Caches and pipes and subsequent architecture innovations profit from designs that enhance locality. The MapReduce programming model gives a general framework that fits this well. The paradigm of streaming is an even better fit. Note that Wikipedia says both that it is "equivalent to dataflow programming" and "was explored within dataflow programming"—so perhaps the parent lost out to the wider adaptability of her children.
I think there are lessons: practical goals, like making hardware go faster, are complex and many-faceted. A pure theory approach, such as systolic arrays or ultracomputers or dataflow machines, is unlikely to suffice on its own. Also, existing technology like von Neumann machines will continue to evolve.
A question is: Could quantum computers be subject to the same lesson? Will non-quantum machines continue to evolve in a way that makes the quantum advantage less than we think? Or is this type of new architecture different? Could non-quantum machines incorporate tricks from quantum computing?
Open Problems
Speaking about theory and practice: Noga Alon, Phillip Gibbons, Yossi Matias, and Mario Szegedy are the winners of the 2019 ACM Paris Kanellakis Theory and Practice Award.
We applaud them on receiving this award. Sadly the Paris award reminds us that Paris Kanellakis died in the terrible plane crash of American Airlines Flight 965 on December 20, 1995. Also on that flight were his wife, Maria Otoya, and their two children, Alexandra and Stephanos.
The citation for the award says:
They pioneered a framework for algorithmic treatment of streaming massive datasets, and today their sketching and streaming algorithms remain the core approach for streaming big data and constitute an entire subarea of the field of algorithms.
In short they invented streaming.
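To give a flavor of the kind of algorithm the award honors, here is a toy version of the Alon-Matias-Szegedy idea for estimating the second frequency moment of a stream. This is a simplified sketch: the real AMS algorithm uses 4-wise independent hash functions to achieve logarithmic memory, while this illustration lazily stores a random sign per distinct item.

```python
import random

# Toy AMS sketch: estimate F2 = sum_i f_i^2 (f_i = frequency of item i)
# using random +-1 signs and a single counter per trial. E[z^2] = F2.

def ams_f2_estimate(stream, trials=200, seed=42):
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sign = {}              # random +-1 sign per distinct item (toy only;
        z = 0                  # real AMS uses a 4-wise independent hash)
        for item in stream:
            if item not in sign:
                sign[item] = rng.choice((-1, 1))
            z += sign[item]    # one running counter per trial
        estimates.append(z * z)
    return sum(estimates) / trials

stream = [1, 2, 1, 3, 1, 2]    # frequencies: {1: 3, 2: 2, 3: 1}
true_f2 = 3 * 3 + 2 * 2 + 1 * 1   # = 14
print(ams_f2_estimate(stream), true_f2)
```

Averaging over independent trials drives the variance down, which is how the real algorithm trades a little memory for accuracy.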
Proof of the Diagonal Lemma in Logic
Why is the proof so short yet so difficult?
Saeed Salehi is a logician at the University of Tabriz in Iran. Three years ago he gave a presentation at a Moscow workshop on proofs of the diagonal lemma.
Today I thought I would discuss the famous diagonal lemma.
The lemma is related to Georg Cantor’s famous diagonal argument yet is different. The logical version imposes requirements on when the argument applies, and requires that it be expressible within a formal system.
The lemma underpins Kurt Gödel’s famous 1931 proof that arithmetic is incomplete. However, Gödel did not state it as a lemma or proposition or theorem or anything else. Instead, he focused his attention on what we now call Gödel numbering. We consider this today as “obvious” but his paper’s title ended with “Part I”. And he had readied a “Part II” with over 100 pages of calculations should people question that his numbering scheme was expressible within the logic.
Only after his proof was understood did people realize that one part, perhaps the trickiest part, could be abstracted into a powerful lemma. The tricky part is not the Gödel numbering. People granted that it can be brought within the logic once they saw enough of Gödel's evidence, and so we may write $\ulcorner A \urcorner$ for the Gödel number of any formula $A$ and use that in other formulas. The hard part is what one does with such expressions.
This is what we will try to motivate.
Tracing the Lemma
Rudolf Carnap is often credited with the first formal statement, in 1934, for instance by Eliott Mendelson in his famous textbook on logic. Carnap was a member of the Vienna Circle, which Gödel frequented, and Carnap is considered a giant among twentieth-century philosophers. He worked on sweeping grand problems of philosophy, including logical positivism and analysis of human language via syntax before semantics. Yet it strikes us with irony that his work on the lemma may be the best remembered.
Who did the lemma first? Let's leave that for others and move on to the mystery of how to prove the lemma once it is stated. I must say the lemma is easy to state, easy to remember, and has a short proof. But I believe that the proof is not easy to remember or even to follow.
Salehi’s presentation quotes others’ opinions about the proof:
Sam Buss: “Its proof [is] quite simple but rather tricky and difficult to conceptualize.”
György Serény (we jump to Serény’s paper): “The proof of the lemma as it is presented in textbooks on logic is not self-evident to say the least.”
Wayne Wasserman: "It is 'Pulling a Rabbit Out of the Hat'—Typical Diagonal Lemma Proofs Beg the Question."
So I am not alone, and I thought it might be useful to try and unravel its proof. This exercise helped me and maybe it will help you.
Here goes.
Stating the Lemma
Let $\varphi(x)$ be a formula in Peano Arithmetic ($\mathsf{PA}$) with one free variable. We claim that there is some sentence $\psi$ so that $\mathsf{PA} \vdash \psi \leftrightarrow \varphi(\ulcorner \psi \urcorner)$.
Formally,
Lemma 1 Suppose that $\varphi(x)$ is some formula in $\mathsf{PA}$ with one free variable. Then there is a sentence $\psi$ so that $$\mathsf{PA} \vdash \psi \leftrightarrow \varphi(\ulcorner \psi \urcorner).$$
The beauty of this lemma is that it was used by Gödel and others to prove various powerful theorems. For example, the lemma quickly proves this result of Alfred Tarski:
Theorem 2 Suppose that $\mathsf{PA}$ is consistent. Then truth cannot be defined in $\mathsf{PA}$. That is, there is no formula $\mathsf{True}(x)$ so that for all sentences $\psi$, $\mathsf{PA}$ proves $$\mathsf{True}(\ulcorner \psi \urcorner) \leftrightarrow \psi.$$
The proof is this. Assume there is such a formula $\mathsf{True}(x)$. Then use the diagonal lemma on the formula $\neg\mathsf{True}(x)$ and get a sentence $\psi$ with $$\mathsf{PA} \vdash \psi \leftrightarrow \neg\mathsf{True}(\ulcorner \psi \urcorner).$$
This shows that $$\mathsf{PA} \vdash \mathsf{True}(\ulcorner \psi \urcorner) \leftrightarrow \neg\mathsf{True}(\ulcorner \psi \urcorner).$$
This is a contradiction. A short proof.
The Proof
The key is to define the function $f$ as follows: Suppose that $n$ is the Gödel number of a formula of the form $\alpha(x)$, for the fixed variable $x$; then $$f(n) = \ulcorner \alpha(n) \urcorner,$$ the Gödel number of the sentence obtained by substituting the numeral for $n$ itself into $\alpha$.
If $n$ is not of this form then define $f(n) = 0$. This is a strange function, a clever function, but a perfectly fine function. It certainly maps numbers to numbers. It is certainly recursive; actually it is clearly computable in polynomial time for any reasonable Gödel numbering. Note: the function does depend on the choice of the variable $x$. Thus, $$f(\ulcorner \alpha(x) \urcorner) = \ulcorner \alpha(\ulcorner \alpha(x) \urcorner) \urcorner$$
and
$$f(\ulcorner \alpha(y) \urcorner) = 0 \quad \text{for a variable } y \neq x.$$
Now we make two definitions: $$\beta(x) \equiv \varphi(f(x)) \qquad \text{and} \qquad \psi \equiv \beta(\ulcorner \beta(x) \urcorner).$$
Now we compute just using the definitions of $f$, $\beta$, and $\psi$: $$\psi \equiv \beta(\ulcorner \beta(x) \urcorner) \equiv \varphi(f(\ulcorner \beta(x) \urcorner)) \quad\text{and}\quad f(\ulcorner \beta(x) \urcorner) = \ulcorner \beta(\ulcorner \beta(x) \urcorner) \urcorner = \ulcorner \psi \urcorner,$$ so $\mathsf{PA} \vdash \psi \leftrightarrow \varphi(\ulcorner \psi \urcorner)$.
We are done.
But …
Where did this proof come from? Suppose that you forgot the proof but remember the statement of the lemma. I claim that we can then reconstruct the proof.
First let's ask: Where did the definition of the function $f$ come from? Let's see. Imagine we defined $$\beta(x) \equiv \varphi(f(x)) \qquad\text{and}\qquad \psi \equiv \beta(\ulcorner \beta(x) \urcorner),$$ but left $f$ undefined for now. Then $$\psi \equiv \varphi(f(\ulcorner \beta(x) \urcorner)).$$
But we want $\psi \leftrightarrow \varphi(\ulcorner \psi \urcorner)$; that happens provided: $$f(\ulcorner \beta(x) \urcorner) = \ulcorner \psi \urcorner = \ulcorner \beta(\ulcorner \beta(x) \urcorner) \urcorner.$$
This essentially gives the definition of the function $f$. Pretty neat.
But but …
Okay, where did the definitions of $\beta$ and $\psi$ come from? It is reasonable to define $$\beta(x) \equiv \varphi(f(x))$$
for some function $f$. We cannot change $\varphi$, but we can control the input to the formula $\varphi$, so let's put a function there. Hence the definition for $\beta$ is not unreasonable.
Okay, how about the definition of $\psi$? Well we could argue that this is the magic step. If we are given this definition then the lemma follows, by the above. I would argue that $\psi \equiv \beta(\ulcorner \beta(x) \urcorner)$ is not completely surprising. The name of the lemma is after all the "diagonal" lemma. So defining $\psi$ as the application of $\beta$ to (the Gödel number of) itself is plausible.
Taking an Exam
Another way to think about the diagonal lemma is imagine you are taking an exam in logic. The first question is:
Prove in $\mathsf{PA}$ that for any formula $\varphi(x)$ there is a sentence $\psi$ so that $$\mathsf{PA} \vdash \psi \leftrightarrow \varphi(\ulcorner \psi \urcorner).$$
You read the question again and think: "I wish I had studied harder. I should not have checked Facebook last night. And then went out and…" But you think: let's not panic, let's think.
Here is what you do. You say: let me define $$\beta(x) \equiv \varphi(g(x))$$
for some function $g$. You recall there was a function in the proof, and changing the input from $x$ to $g(x)$ seems to be safe. Okay you say, now what? I need the definition of $g$. Hmmm, let me wait on that. I recall vaguely that $g$ had a strange definition. I cannot recall it, so let me leave it for now.
But you think: I need a sentence $\psi$. A sentence cannot have an unbound variable. So $\psi$ cannot be $\beta(x)$. It could be $\beta(n)$ for some number $n$. But what could $n$ be? How about $n = \ulcorner \beta(x) \urcorner$. This makes $$\psi \equiv \beta(\ulcorner \beta(x) \urcorner).$$
It is after all the diagonal lemma. Hmmm, does this work? Let's see. Wait: as above I get that $g$ is now forced to satisfy $$g(\ulcorner \beta(x) \urcorner) = \ulcorner \beta(\ulcorner \beta(x) \urcorner) \urcorner = \ulcorner \psi \urcorner.$$
Great this works. I think this is the proof. Wonderful. Got the first question.
Let’s look at the next exam question. Oh no
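As an aside, the self-application in this proof is essentially how a quine works: a program that reproduces its own source text. In the sketch below, which is only an analogy and not part of the lemma's formal proof, the template string plays the role of $\beta(x)$ and applying it to its own text plays the role of $\psi \equiv \beta(\ulcorner \beta \urcorner)$.

```python
# A quine-style fixed point: a string template applied to its own source.
# template ~ beta(x); program ~ psi = beta applied to the code of beta.

template = 'template = %r\nprogram = template %% template'
program = template % template   # "beta of the Goedel number of beta"

# Executing the program rebuilds exactly the same `program` string,
# i.e., the program "talks about" its own text -- the fixed point.
namespace = {}
exec(program, namespace)
assert namespace["program"] == program
print("fixed point found")
```

The `%r`/`%%` trick does the same job as the substitution function $f$: it inserts a quoted copy of the template into the template itself.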
Open Problems
Does this help? Does this unravel the mystery of the proof? Or is it still magic?
Math Tells
How to tell what part of math you are from
Gerolamo Cardano is often credited with introducing the notion of complex numbers. In 1545, he wrote a book titled Ars Magna. He introduced us to numbers like $5 \pm \sqrt{-15}$ in his quest to understand solutions to equations. Cardano was often short of money and gambled and played a certain board game to make money—see the second paragraph here.
Today, for amusement, Ken and I thought we’d talk about tells.
What are tells? Wikipedia says:
A tell in poker is a change in a player’s behavior or demeanor that is claimed by some to give clues to that player’s assessment of their hand.
Other Tells
Ken and I have been thinking of tells in a wider sense—when and whether one can declare inferences amid uncertain information. Historians face this all the time. So do biographers, at least when their subjects are no longer living. We would also like to make inferences in our current world, such as about the pandemic. The stakes can be higher than in poker. In poker, if your “tell” inference is wrong and you lose, you can play another hand—unless you went all in. With science and other academic areas the attitude must be that you’re all-in all the time.
Cardano furnishes several instances. Wikipedia—which we regard as an equilibrium of opinions—says that Cardano
acknowledged the existence of imaginary numbers … [but] did not understand their properties, [which were] described for the first time by his Italian contemporary Rafael Bombelli.
This is a negative inference from how one of Cardano’s books stops short of treating imaginary numbers as objects that follow rules.
There are also questions about whether Cardano can be considered “the father of probability” ahead of Blaise Pascal and Pierre de Fermat. Part of the problem is that Cardano’s own writings late in life recounted his first erroneous reasonings as well as final understanding in a Hamlet-like fashion. Wikipedia doubts whether he really knew the rule of multiplying probabilities of independent events, whereas the essay by Prakash Gorroochurn cited there convinces us that he did. Similar doubt extends to how much Cardano knew about the natural sciences, as correct inferences (such as mountains with seashell fossils having once been underwater) are mixed in with what we today regard as howlers.
Every staging of Shakespeare’s Hamlet shows a book by Cardano—or does it? In Act II, scene 2, Polonius asks, “What do you read, my lord?”; to which Hamlet first replies “Words words words.” Pressed on the book’s topic, Hamlet perhaps references the section “Misery of Old Age” in Cardano’s 1543 book De Consolatione but what he says is so elliptical it is hard to tell. The book also includes particular allusions between sleep and death that go into Hamlet’s soliloquy opening Act III. The book had been published in England in 1573 as Cardan’s Comfort under the aegis of the Earl of Oxford so it was well-known. Yet the writer Italo Calvino held back from the inference:
To conclude from this that the book read by Hamlet is definitely Cardano, as is held by some scholars of Shakespeare’s sources, is perhaps unjustified.
To be sure, there are some who believe Shakespeare’s main source was Oxford, in manuscripts if not flesh and blood. One reason we do not go there is that we do not see the wider community as having been able to establish reliable principles for judging what kinds of inferences are probably valid. We wonder if one could do an experiment of taking resolved cases, removing most of the information to take them down to the level of unresolved cases, and seeing what kinds of inferences from partial information would have worked. That’s not our expertise, but within our expertise in math and CS, we wonder if a little experiment will be helpful.
To set the idea, note that imaginary numbers are also called complex numbers. Yet the term complex numbers can mean other things. Besides numbers like $a + b\sqrt{-1}$, it also can mean how hard it is to construct a number.
In number theory, the integer complexity of an integer is the smallest number of ones that can be used to represent it using ones and any number of additions, multiplications, and parentheses. It is always within a constant factor of the logarithm of the given integer.
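For concreteness, here is a small dynamic program that computes integer complexity exactly for small $n$; the function name and bounds are our own choices for illustration. It uses the fact that the outermost operation of an optimal expression is either a sum or a product.

```python
def integer_complexity(limit):
    # f[n] = fewest 1s needed to write n using +, *, and parentheses.
    f = [0] * (limit + 1)
    f[1] = 1
    for n in range(2, limit + 1):
        # Best split as a sum a + (n - a) ...
        best = min(f[a] + f[n - a] for a in range(1, n // 2 + 1))
        # ... or as a product d * (n // d).
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                best = min(best, f[d] + f[n // d])
        f[n] = best
    return f

f = integer_complexity(12)
print(f[6], f[12])  # 6 = (1+1)*(1+1+1) uses 5 ones; 12 uses 7
```

Checking the table against the logarithmic bound mentioned above makes a nice exercise: $f[n]$ stays within a constant factor of $\log n$.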
How easy is it to tell what kind of “complex” is meant if you only have partial information? We don’t only mean scope-of-terminology issues; often a well-defined math object is used in multiple areas. Let’s try an experiment.
Math Tells
Suppose you walk into or log into a talk without any idea of the topic. If the speaker uses one of these terms, can you tell what her talk might be about? Several have multiple meanings. What are some of them? A passing score is
- She says let be a c.e. set.
- She says let be in .
- She says by the König principle.
- She says is a prime.
- She says is a prime.
- She says is solvable.
- She says let its degree be .
- She says there is a run.
- She says it is reducible.
- She says it is satisfiable.
Open Problems
What are your answers? Do you have some tells of your own?
The 2020 Y Prize
Can one maintain privacy while publicly congratulating?
[ Cropped from a facemask image of X ]
X, who shall go nameless here, recently won a prize for research.
Today we congratulate X and wish to talk a bit about their work while safeguarding X’s identity.
X also won another prize Z—but we will not mention which one. In both cases the honor is about the creation of a new field rather than the solution of a long-standing open problem. Her—OK, X is a she—brilliant work has allowed many others to write papers in these areas. And it has led to countless conferences, meetings, and talks.
Oops, we used the word “brilliant.” That has already leaked her identity, given where you are reading this. Just type it into our search box.
We assume you get the point. One can leak information about people without mentioning them directly.
The Real Start of the Post
[ Her work is in Algorithmic Fairness too ]
Cynthia Dwork is the winner of the 2020 Donald E. Knuth Prize. The prize is awarded jointly by ACM SIGACT and the IEEE Computer Society Technical Committee. The ceremony will include a lecture by Cynthia at FOCS 2020, which is still projected to meet in-person at Duke next November 16–19.
Today we congratulate Cynthia and talk a little about her work.
I, Dick, have known her for many years. I am excited to see that she has won the Knuth prize.
She has always been a top researcher and a terrific speaker. I fondly recall some of her earlier talks at Princeton and elsewhere. There is something about her ability to explain complex ideas in a lucid manner. I have often thought that this ability must be one of the reasons she has been so successful in her research.
We will talk a little about differential privacy (DP) and then her other joint work which led to another prize. We should be quick to mention that others have been instrumental in both developments.
Quantifying an Idea
DP might have been appreciated long before there were large databases and computer systems. The genius is not just the simplicity of the idea but the development of mathematical tools to govern its application.
For a simple example, suppose a professor has a general policy of posting exam average scores in her class of 100. Suppose she posts that the average is 73.3 after 98 students have taken the exam but two need to make it up a week later. Then suppose she re-posts the average after they take it and it is 72.5. The difference may not seem like much. But the class learns two things:
- Both students had their makeup scores included in the average. (If the 73.3 is exact then it is not even possible to get a rounded 72.5 with just one score of zero, pending the other.)
- The two students collectively bombed the exam. (Just possibly one got a zero while the other did OK, but even then the other’s score was below average.)
Your first thought might be that the fault is the excessive precision in the grade reporting. If she had rounded to the nearest integer then both averages would have been reported as "73" and there would be no problem. But the protection from rounding is accidental. The new average could have been 72.49, which would be rounded down to "72," when the set of possible interpretations is worse than before.
In wide-scale situations where queries for multiple averages at different times are allowed, effects of rounding may cancel so that a large statistical analysis can subtract them out. Thus passively reducing precision is not enough. The key is to affect the precision actively by adding a random $\epsilon$ to the reported results, where $\epsilon$ is judiciously chosen. The point is:
Adding $\epsilon$, better than rounding, makes inference much harder.
In the exam-score example, randomly adding or subtracting a number between 0 and 2 changes the calculus so that seeing “73.3” and then “72.5” does not allow a high-confidence inference about the makeup students’ scores. One could reason that the first number could easily have been 72.3 and the second 73.5, in line with interpretations where the makeup students upped the average. Actually, a jump from 72.3 to 73.5 in the true averages is impossible from just two new scores, so the class could deduce that the reported averages are fudged. But that is OK. Two basic purposes of reporting the averages are met despite the fudging:
- to show the exam was consistent with past years—especially good to know this year;
- to give you some idea (knowing your own score exactly) where you stand relative to the class.
In related situations, one could argue that a mean of $1$ for $\epsilon$ is not enough. If the exam has 100 one-point questions each with a 73.3% expectation of getting it right for everyone, then the standard deviation about the mean is almost $4.5$. Using a mean change of $2$ seems also fine in this case, whereas rounding to the nearest 5 (i.e., reporting "70" or "75" as the average) might be too coarse.
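As a sketch of how "adding a random $\epsilon$" works in standard DP practice, here is the Laplace mechanism applied to the exam-average example. The class scores, the sensitivity bound, and the choice of privacy parameter are all hypothetical; this is an illustration, not the professor's actual scheme.

```python
import random

# Laplace mechanism sketch: report average + Laplace noise whose scale is
# sensitivity / epsilon, where sensitivity bounds how much one student's
# score can move the average.

def noisy_average(scores, epsilon, max_score=100):
    true_avg = sum(scores) / len(scores)
    # One student's score changes the average by at most max_score / n.
    sensitivity = max_score / len(scores)
    b = sensitivity / epsilon
    # Difference of two Exp(mean b) variates is Laplace(0, b).
    noise = random.expovariate(1 / b) - random.expovariate(1 / b)
    return true_avg + noise

random.seed(0)
scores = [73] * 98 + [40, 35]      # hypothetical class of 100
print(round(noisy_average(scores, epsilon=1.0), 1))
```

With 100 students and full-credit exams, the noise scale here is about one point, so the reported average is useful to the class but masks the makeup students' scores.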
We do not intend our example to be definitive, only that it conveys relevant issues. The example is enough to convey that DP leads into deep matters in statistics and what can be inferred by statistical methods. The latter melds the complexity-theoretic notion of knowledge gain from interactive protocols with statistical (machine) learning. And as with quantum computing, theorems and methods from the subfield have come back to inform fundamental questions about complexity (of course, this is true of crypto in general).
Privacy and Online Chess
I, Ken, am the lead on this post, and have had data-privacy affect me personally. The pandemic has forced the cancellation of all over-the-board (OTB) tournaments, some as far ahead as August. Chess has all moved online, as heralded by two recent articles in the New York Times and one in the Wall Street Journal proclaiming, “Sports: Chess Is the New King of the Pandemic.”
As these articles also note, cheating has long been a major concern for online chess. I have been a recourse for second-opinions in online cheating cases but have not tried to be primary in this domain. A great conversation in November 2006 with Daniel Sleator (who—besides being famous for splay trees and amortized analysis—co-founded the Internet Chess Club) told me how online cheating detection avails much information about the manner of play that is not applicable or reliably available in OTB chess. Also for academic reasons, I chose a minimal approach of using no information besides the moves of the games and the Elo ratings of the players. An unexpected advantage of that choice is exemplified by an e-mail I received—fifteen minutes before a conference call on Wednesday with representatives of the International Chess Federation (FIDE) and other online platforms—from a different chess organizer requesting a statement that:
…you will only use the data of the games, and not any personal data that might be obtained from the [tournaments] for your research, such as player name etc. … in writing as a standard precaution … to make sure that there isn’t third party usage of the data or usage outside of its intent.
Some recent legal decisions have confirmed that the moves of the games are public data and cannot be copyrighted. The e-mail did not mention the players’ ratings. As used by FIDE and the US and most online platforms (and much more widely), those are 4-digit numbers, and they are public. Knowing all four digits might be enough to pinpoint the identity of a player.
Thus I suggested rounding the rating to the nearest 10 or even 20. This makes little difference to my statistical cheating tests, which already give 25 Elo points benefit-of-doubt for uncertainty about the rating. Tournament officials I’ve served have done this in a couple of individual cases but now the need is en masse.
The question based on the previous section above is whether I should instead (or also) suggest that they implement full DP by adding a random $\epsilon$ with a mean of $10$ or $20$ or so, with or without rounding. A caveat is that it is easy for someone with a tournament spreadsheet to round a column of figures before exporting text to me, but applying a random function is less familiar and prone to major mess-ups.
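The two options can be sketched in a few lines. This assumes a simple uniform-noise scheme rather than a calibrated DP mechanism, and the rating value and 20-point spread are invented for illustration.

```python
import random

# Two ways to mask a 4-digit Elo rating before sharing tournament data.
# Rounding is a one-step spreadsheet operation; noise needs a random
# function, which is where the "major mess-ups" risk comes in.

def round_rating(elo, step=20):
    # Deterministic: everyone in a 20-point band reports the same value.
    return step * round(elo / step)

def noisy_rating(elo, spread=20):
    # Random: add symmetric integer noise in [-spread, spread] (toy choice).
    return elo + random.randint(-spread, spread)

print(round_rating(2187))  # -> 2180
random.seed(1)
print(noisy_rating(2187))
```

Either way the perturbation is small next to the 25 Elo points of benefit-of-doubt already built into the statistical tests.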
The Other Prize
The Knuth Prize is not Cynthia’s only win for 2020. In December she was awarded the 2020 IEEE Richard W. Hamming Medal. A video gives a one-minute summary of differential privacy and her work on non-malleable cryptosystems via lattice bases. The latter stems from her 2000 paper with Danny Dolev and Moni Naor (DDN).
We have not covered this crypto topic before. It involves chosen-ciphertext attacks where Eve, the eavesdropper, first gets hold of a transmitted ciphertext $c$. The cryptosystem is malleable if Eve can know enough about the structure of $c$ to make a legal ciphertext $c'$, one that may decode to a plaintext $m'$ that advances her interests compared to the intended $m$. Wikipedia's article gives an example where $c$ embeds a numerical field that Eve can recognize and alter en route. The paper gives a similar example where Eve cannot disrupt the transmission of $c$ but can separately send her own $c'$ with a value that gives her advantage. This may be so even without being able to decode $c$ or gain any information about the plain value of $m$, except that $m'$ is different in a meaningful way. For instance, $m'$ could be a lower bid on a contract or higher bid in a secure auction.
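A standard illustration of malleability, not the DDN paper's own example, is textbook RSA: Eve can multiply a ciphertext by $2^e$ to double the underlying plaintext without ever decrypting it. The primes and the bid value below are toy numbers.

```python
# Textbook (unpadded) RSA is malleable: from a ciphertext for m, Eve can
# forge a valid ciphertext for 2*m while learning nothing about m itself.

p, q, e = 1009, 1013, 65537
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)          # private exponent (modular inverse, Py 3.8+)

m = 4200                     # Alice's secret bid
c = pow(m, e, N)             # Alice's ciphertext

# Eve never decrypts c; she only multiplies by 2^e mod N.
c_evil = (c * pow(2, e, N)) % N
m_evil = pow(c_evil, d, N)   # what the receiver would decrypt

assert m_evil == 2 * m       # Eve doubled the bid
print(m_evil)
```

Real systems defeat this with padding schemes that make random ciphertext manipulations decrypt to garbage, which is one concrete meaning of non-malleability.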
One would think that $m$ should just be encoded via an error-correcting code whose range is secure, so that an attacker would be overwhelmingly unlikely to find another value in the legal range of the code. However, this clashes with the goal of homomorphic encryption, which allows numerical operations on encoded values. Or the technology used may be too simple, as in the paper's mention of submitting bids to a fax number.
Note: both fax machines and the paper date to the previous millennium. But the paper next gives an example that is staying with us in this millennium, where the transmitted value is a proof of a statement. Suppose Alice arranges to transmit it to a referee Ralph using a zero-knowledge protocol, but Eve is able to intercept their messages. Eve can impersonate Ralph to Alice and impersonate Alice to Ralph as Ralph asks questions in the protocol. It may seem impossible to forestall such a "man-in-the-middle" attack without extrinsic considerations of hardware and timing, but DDN design an ingenious scheme for non-malleable zero-knowledge proofs.
A Video
En passant, we want to note a great video that was made for next month's Women in Theory workshop. It has an all-female cast and lyrics by Avi Wigderson parodying the song "I Will Survive." Cynthia is not in the video, but she likewise has been a great spokesperson for the experiences of women in our professional life.
Open Problems
Our congratulations again to Cynthia—by name—on the Knuth and Hamming prizes.
Consistency and P=NP
Can we at least show that $\mathsf{P} = \mathsf{NP}$ is consistent?
[ From personal page ]
Jan Krajicek is an expert on linking computational complexity and mathematical logic. He has authored four books in this area, including the often-cited text Bounded Arithmetic and Complexity Theory. Moreover, he has studied the consistency of various theories, whether they can prove statements like $\mathsf{P} \neq \mathsf{NP}$, and related questions.
Today I thought we would discuss consistency proofs in general.
Mathematics of COVID-19
It's not just
[ Sir Francis Galton by Charles Wellington Furse ]
Francis Galton is a perfect example of a Victorian-era scientist. Sir Galton (he was knighted in 1909) had many roles: statistician, sociologist, psychologist, anthropologist, tropical explorer, geographer, meteorologist, geneticist, and inventor. He coined the phrase "nature versus nurture" and more.
Today we trade in jokes for some mathematics of the virus.