In the context of stable matching problems

Jamie Morgenstern is a researcher in machine learning, economics, and especially mechanism design.

Today Ken and I would like to discuss a joint paper of hers on the classic problem of matching schools and students.

The paper is titled, “Approximately Stable, School Optimal, and Student-Truthful Many-to-One Matchings (via Differential Privacy).” It is joint with Sampath Kannan, Aaron Roth, and Zhiwei Wu. Let’s call this paper (KMRW). She gave a talk on it at Georgia Tech last year.

## The Classic Matching Problems

There are various instances of matching-type problems, but perhaps the most important is the NRMP assignment of graduating medical students to their first hospital appointments. In 2012 Lloyd Shapley and Alvin Roth—Aaron’s father—were awarded the Nobel Prize in Economics “for the theory of stable allocations and the practice of market design.”

The original matching problem was that of marrying females and males, about which we just posted in discussing a recent joint paper of Noam Nisan’s. Here is a standard description of the stable matching problem (SMP):

Assume there are ${n}$ men and ${n}$ women. Further each has a linear ranking of the members of the opposite sex. Find a way to marry all the men and all the women so that the assignment is stable. This means that no two people would both rather marry each other instead of the partners they are assigned to.

In 1962, David Gale and Shapley proved that every SMP instance has a solution, and even better, they gave a quadratic-time algorithm that finds one. SMP as stated is less practical than the problem that results if the men are also allowed to marry each other, and likewise the women, with everyone ranking everyone else as partners. But in the same famous paper, Gale and Shapley solved an even more realistic problem:

The hospitals/residents problem—also known as the college admissions problem—differs from the stable marriage problem in that the “women” can accept “proposals” from more than one “man” (e.g., a hospital can take multiple residents, or a college can take an incoming class of more than one student). Algorithms to solve the hospitals/residents problem can be hospital-oriented (female-optimal) or resident-oriented (male-optimal). This problem was solved, with an algorithm, in the original paper by Gale and Shapley, in which SMP was solved.

Let’s call this the college admissions problem (CAP).
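For concreteness, here is a minimal Python sketch of the Gale-Shapley deferred-acceptance algorithm in the one-to-one SMP setting; the data layout and the toy instance are our own illustration. Men propose down their lists, and each woman tentatively holds the best proposal she has received so far:

```python
def gale_shapley(men_prefs, women_prefs):
    """Men propose; women tentatively accept and trade up.
    Assumes complete preference lists on both sides.
    Returns the man-optimal stable matching as {man: woman}."""
    # rank[w][m] = position of m in w's list (lower is better)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    free = list(men_prefs)             # men with no current partner
    next_proposal = {m: 0 for m in men_prefs}
    engaged = {}                       # woman -> man she currently holds
    while free:
        m = free.pop()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])    # w trades up; her old partner is free
            engaged[w] = m
        else:
            free.append(m)             # w rejects m; he proposes again later
    return {m: w for w, m in engaged.items()}

men = {'x': ['a', 'b'], 'y': ['b', 'a']}
women = {'a': ['y', 'x'], 'b': ['x', 'y']}
print(gale_shapley(men, women))  # both men get their first choices
```

Because the men propose, the output is man-optimal: in the toy instance each man receives his first choice even though each woman ranks the other man higher.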

## Gaming the System

Consider adding another condition to CAP. Stability is clearly an important requirement: without it there would be students and colleges that would prefer to swap their assigned choices. It does not force the assignment to be perfect in any sense, but it does make it at least a “local” optimal solution. This is important, and stability is used in real assignment problems today.

Yet there is a possibility that students, for example, could “game” the system. What if a student submitted a strange ordering of schools, one that does not reflect what they really want? Why would they do this? Simply because they may be able to lie about their preferences to ensure that they get their top choice.

The basic point seems easiest to illustrate in the non-gender, single-party case, so let’s say ${A,B,C,D}$ need to form two pairs. They have circular first choices and second choices in the opposite circle:

$\displaystyle A \rightarrow B \rightarrow C \rightarrow D \rightarrow A; \qquad A \Longrightarrow D \Longrightarrow C \Longrightarrow B \Longrightarrow A.$

Let’s say ${(AB,CD)}$ happens. ${B}$ and ${D}$ got their second choices, but it is not to their advantage to room together. They could have had their first choices under the also-stable configuration ${(AD,BC)}$ but there was no way to force the algorithm to choose it. However, let’s suppose ${B}$ and ${D}$ lie and declare each other to be their second choices:

$\displaystyle A \rightarrow B \rightarrow C \rightarrow D \rightarrow A; \qquad A \Longrightarrow D \Longrightarrow B;\quad C \Longrightarrow B \Longrightarrow D.$

The algorithm given these preferences sees ${(AB,CD)}$ as unstable since ${B}$ would join with ${D}$, and the resulting ${(AC,BD)}$ as unstable because ${A}$ and ${D}$ would prefer each other. The resulting stable ${(AD,BC)}$ now gives ${B}$ and ${D}$ their first choices. Given what they may have known about ${A}$’s and ${C}$’s preference lists, there was no danger they’d actually have to pair up. Examples in the two-gender and student-college cases need six actors but the ideas are similar.
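One can verify this example mechanically. The following sketch, our own illustration with each preference list completed by the implied third choice, checks all three pairings for blocking pairs under both the true and the falsified preferences:

```python
from itertools import combinations

# True preference lists (best first) for the A,B,C,D example.
true_prefs = {
    'A': ['B', 'D', 'C'],
    'B': ['C', 'A', 'D'],
    'C': ['D', 'B', 'A'],
    'D': ['A', 'C', 'B'],
}

# Preferences after B and D each falsely declare the other as second choice.
lied_prefs = {
    'A': ['B', 'D', 'C'],
    'B': ['C', 'D', 'A'],
    'C': ['D', 'B', 'A'],
    'D': ['A', 'B', 'C'],
}

def is_stable(pairs, prefs):
    """A matching is stable iff no two unmatched people both
    prefer each other to their assigned partners."""
    partner = {}
    for x, y in pairs:
        partner[x], partner[y] = y, x
    for x, y in combinations(prefs, 2):
        if partner[x] == y:
            continue
        # x prefers y iff y appears earlier in x's list than x's partner
        if (prefs[x].index(y) < prefs[x].index(partner[x]) and
                prefs[y].index(x) < prefs[y].index(partner[y])):
            return False  # (x, y) is a blocking pair
    return True

matchings = {
    '(AB,CD)': [('A', 'B'), ('C', 'D')],
    '(AC,BD)': [('A', 'C'), ('B', 'D')],
    '(AD,BC)': [('A', 'D'), ('B', 'C')],
}
for name, m in matchings.items():
    print(name, 'truth:', is_stable(m, true_prefs),
          'lie:', is_stable(m, lied_prefs))
```

Under the true preferences both ${(AB,CD)}$ and ${(AD,BC)}$ are stable, while under the lie only ${(AD,BC)}$ survives, which is exactly the outcome ${B}$ and ${D}$ wanted to force.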

## Systematizing the Game

The basic Gale-Shapley algorithm for the two-party case actually has two “poles” depending on which party has its preferences attended to first. The current NRMP algorithm favors the applicants first. In college admissions one can imagine the schools going first. Alvin Roth proved that the poles had the following impact on truthfulness:

Theorem 5. In the matching procedure which always yields the optimal stable outcome for a given one of the two sets of agents (i.e., for ${M}$ or for ${W}$), truthful revelation is a dominant strategy for all the agents in that set.

Corollary 5.1. In the matching procedure which always yields the optimal stable outcome for a given set of agents, the agents in the other set have no incentive to misrepresent their first choice.

A dominant strategy here means a submitted list of preferences that the individual has no motive to change under any combination of preferences submitted by the other agents (on either side). The upshot is that the favored side has no reason to deviate from their true preferences, but the non-favored side has motive to spy on each other (short of colluding) and lie about their second and further preferences.

There has been much research on improving fairness by mediating between the “poles” but this has not solved the truthfulness issue. Can we structure the algorithm so that it still finds stable configurations but works in a way that both parties have incentive to be truthful and use their real orderings?

Unfortunately, Roth’s same paper proved that there is no method to solve SMP or CAP that is both stable and truthful. Only the above conditions on one side or the other are possible.

## Can We Handle Truth?

Roth’s negative result reminds me of the famous dialogue toward the end of the 1992 movie A Few Good Men:

Judge Randolph: Consider yourself in contempt!

Kaffee: Colonel Jessup, did you order the Code Red?

Judge Randolph: You don’t have to answer that question!

Col. Jessup: I’ll answer the question!

[to Kaffee]

Col. Jessup: You want answers?

Kaffee: I think I’m entitled to.

Col. Jessup: You want answers?!

Kaffee: I want the truth!

Col. Jessup: You can’t handle the truth!

However, a recurring theme in theory especially since the 1980s is that often we can work around such negative results by:

• randomizing so that requirements hold “with high probability” or “in expected case,” and/or

• accepting approximations, and/or

• making results (more) asymptotic, and/or

• fencing off undesirable cases in some way.

Recall that in the college admissions case where the schools act first, the schools have no motive to do other than declare their true preferences, but the students might profit if they submit bogus second, third, and further preferences to the algorithm. The new KMRW paper states the following:

We present a mechanism for computing asymptotically stable school optimal matchings, while guaranteeing that it is an asymptotic dominant strategy for every student to report their true preferences to the mechanism. Our main tool in this endeavor is differential privacy: we give an algorithm that coordinates a stable matching using differentially private signals, which lead to our truthfulness guarantee. This is the first setting in which it is known how to achieve nontrivial truthfulness guarantees for students when computing school optimal matchings, assuming worst-case preferences (for schools and students) in large markets.

## Approximation and Truth

By “approximately stable” the KMRW paper means a relaxation of the third of three conditions on a function ${\mu}$ that maps ${n}$ students ${a_i}$ to ${m}$ colleges ${U_j}$ plus the “gap year” option ${U_0}$. If ${a_i}$ left ${U_j}$ off his or her application list, that is read as preferring ${U_0}$ to ${U_j}$. Symmetrically, ${U_j}$ may have an admission threshold that some students ${a_i}$ fall below.

1. The number ${|\mu^{-1}(U_j)|}$ of students mapped to ${U_j}$ is no greater than the capacity ${C_j}$ of ${U_j}$.

2. For every student ${a_i}$ mapped to ${U_j}$ and student ${a_k}$ in a college ${U_\ell \neq U_j}$, it is not the case that ${a_i}$ listed ${U_{\ell}}$ over ${U_j}$ and ${U_\ell}$ ranked ${a_i}$ over ${a_k}$.

3. For every college ${U_j}$ with an opening, i.e. ${|\mu^{-1}(U_j)| < C_j}$, and student ${a_i}$ who listed ${U_j}$ as a possibility, either ${a_i}$ is below the admission threshold of ${U_j}$ or ${a_i}$ prefers ${\mu(a_i)}$ to ${U_j}$.

The relaxation is that for some fixed ${\alpha > 0}$, colleges may hold off on admitting qualified students who might prefer them, as long as the colleges are within an ${\alpha}$ fraction of their capacity:

3′. Same as 3. but with ${|\mu^{-1}(U_j)| < C_j(1 - \alpha)}$ in place of the first condition.
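To make conditions 1, 2, and 3′ concrete, here is a hypothetical brute-force checker; the data layout and names are our own assumptions, not KMRW’s:

```python
def is_approx_stable(mu, student_prefs, college_prefs, capacity, alpha):
    """Check conditions 1, 2, and 3' for a many-to-one matching mu.

    mu maps each student to a college, or to None for the gap-year
    option U_0.  student_prefs lists colleges best-first (a college
    left off the list is less preferred than U_0); college_prefs lists
    acceptable students best-first (an unlisted student falls below
    the college's admission threshold)."""
    enrolled = {c: [s for s in mu if mu[s] == c] for c in capacity}

    # Condition 1: no college exceeds its capacity.
    if any(len(enrolled[c]) > capacity[c] for c in capacity):
        return False

    for s, prefs in student_prefs.items():
        cur = mu.get(s)
        cur_rank = prefs.index(cur) if cur in prefs else len(prefs)
        for c in prefs[:cur_rank]:      # colleges s prefers to mu(s)
            if s not in college_prefs[c]:
                continue                # s is below c's threshold
            # Condition 3': c has an opening beyond the alpha slack.
            if len(enrolled[c]) < capacity[c] * (1 - alpha):
                return False
            # Condition 2: c ranks s above some student it enrolled.
            r = college_prefs[c].index(s)
            if any(college_prefs[c].index(t) > r for t in enrolled[c]
                   if t in college_prefs[c]):
                return False
    return True

# Toy example: s1 sits out at c2 while c1 (capacity 2) has an open seat,
# but c1 ranks its enrolled student s2 above s1.
prefs_s = {'s1': ['c1', 'c2'], 's2': ['c1']}
prefs_c = {'c1': ['s2', 's1'], 'c2': ['s1']}
mu = {'s1': 'c2', 's2': 'c1'}
print(is_approx_stable(mu, prefs_s, prefs_c, {'c1': 2, 'c2': 1}, 0.5))  # True
print(is_approx_stable(mu, prefs_s, prefs_c, {'c1': 2, 'c2': 1}, 0.0))  # False
```

The toy instance shows the slack at work: with ${\alpha = 0.5}$ the half-empty college ${c_1}$ is excused from pursuing ${s_1}$, while with ${\alpha = 0}$ its open seat makes the matching unstable.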

There is also an approximation condition on the potential gain from lying about preferences. This requires postulating a numerical utility ${v_{i,j} \in [0,1]}$ for student ${a_i}$ to attend college ${U_j}$ that is monotone in the true preferences. Given ${\eta > 0}$, say the preference list ${L_i}$ submitted by ${a_i}$ is ${\eta}$-approximately dominant if for all other lists ${L'_i}$ by ${a_i}$—and all lists submitted by other students which induce the mappings ${\mu}$ with ${L_i}$ and ${\mu'}$ with ${L'_i}$—we have

$\displaystyle v_{i,\mu'(i)} - v_{i,\mu(i)} < \eta.$

In fact, KMRW pull the randomization lever by stipulating the ${\eta}$ bound only on the difference in expected values over matchings ${\mu,\mu'}$ produced by the randomized algorithm they construct. Then this says that no student can expect to gain more than ${\eta}$ utility by gaming the system.
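As a toy illustration of the definition (not KMRW’s differentially private mechanism), the following sketch brute-forces a student’s best possible expected gain from misreporting, given any randomized mechanism presented as a lottery over matchings; all names here are our own:

```python
from itertools import permutations

def max_misreport_gain(student, reports, utilities, mechanism):
    """Largest expected-utility gain `student` can get by submitting a
    false list, holding the other reports fixed.  Truthful reporting is
    eta-approximately dominant at this profile iff the gain is < eta.
    `mechanism` maps a profile of reported lists to a lottery given as
    [(matching, probability), ...], each matching a student->college dict."""
    def expected_utility(profile):
        return sum(p * utilities[student][m[student]]
                   for m, p in mechanism(profile))
    truthful = expected_utility(reports)
    gain = 0.0
    for alt in permutations(reports[student]):   # every possible false list
        deviated = dict(reports, **{student: list(alt)})
        gain = max(gain, expected_utility(deviated) - truthful)
    return gain

# Deliberately bad toy mechanism: everyone gets their reported second choice.
def second_choice(profile):
    return [({s: lst[1] for s, lst in profile.items()}, 1.0)]

reports = {'s1': ['c1', 'c2'], 's2': ['c1', 'c2']}
utilities = {'s1': {'c1': 1.0, 'c2': 0.4}, 's2': {'c1': 1.0, 'c2': 0.4}}
print(max_misreport_gain('s1', reports, utilities, second_choice))  # 0.6
```

Under the bad toy mechanism, ${s_1}$ gains ${0.6}$ in utility by swapping the order of its list, so truth-telling is only ${\eta}$-approximately dominant there for ${\eta > 0.6}$; KMRW’s theorem drives this gap below any desired ${\eta}$ in large enough markets.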

Confining the utility values to ${[0,1]}$ ensures that any ${\eta > 0}$ is larger than some differences if the number ${m}$ of colleges is great enough, so this allows some slack compared to strict dominance which holds if ${\eta = 0}$. It also enables the utility values to be implicitly universally quantified in the condition by which ${\eta}$ and ${\alpha}$ “mesh” with ${m}$, the number ${n}$ of students, and the minimum capacity ${C}$ of any college:

Theorem 1 Given ${\alpha}$ and ${\eta}$, there are universal constants ${K,c,d > 0}$ and an efficient randomized algorithm that with high probability produces a feasible ${\alpha}$-approximately stable solution in which submitting the true preferences is ${\eta}$-approximately dominant in expected value for each student, provided that ${m,n > K}$ and

$\displaystyle C > c\frac{\sqrt{m}}{\alpha\eta}(\log n)^d.$

This is an informal statement. The mechanism underlying the proof also works for ${\alpha,\eta}$ not fixed: provided ${C}$ grows faster than ${\sqrt{m}}$, the approximate stability and probable-approximate truth-telling dominance can be achieved with ${\alpha}$ and ${\eta}$ both tending toward ${0}$. It is neat that the approximations are achieved using concepts and tools from differential privacy, which we have posted about before. By analogy with PAC, we might summarize the whole system as being “probably approximately truthful.”

## Should We Handle Truth?

The notion of truth is fundamental. In logic the notion of truth as applied to mathematical theories is central to Kurt Gödel’s Incompleteness Theorems and to Alfred Tarski’s theorem on the undefinability of truth.

In economics the notion of truth is different but perhaps even more important. Imagine any set of agents that are involved in some type of interaction: it could be a game, an auction, or some more complex type of interaction. Typically these agents make decisions, which affect not only what happens to them, but also to the other agents. In our “${A,B,C,D}$” example above, the further point is that “gaming the system” by ${B}$ and ${D}$ lowered the utility for ${A}$ and ${C}$. But ${A}$ and ${C}$ could have tried the same game.

The effect on others highlights that this is a basic problem. It seems best when the interaction does not reward agents that essentially “lie” about their true interests. This speaks to our desire for a system that rewards telling the truth—and takes the subject into the area of algorithmic mechanism design which we also featured in the post on Nisan. Indeed truth is addressed in his Gödel Prize-winning paper with Amir Ronen, for instance defining a mechanism to be strongly truthful if truth-telling is the only dominant strategy.

That paper follows with a sub-sub-section 4.3.1 on “basic properties of truthful implementations”—but what I’m not finding in these papers are theorems that tell me why truthfulness is important in economic interactions. It sounds self-evident, but is it? There are many papers that show one cannot force agents to be truthful, and there are other results showing cases in which the agents’ individual best interests are to be truthful after all. I understand why a solution to a matching problem should be stable, but am not convinced that it needs to be truthful. In mathematics we can define a property that we add to restrict solutions to some problem, but we usually need to justify the property. If we are solving an equation, we may restrict the answers to be integers. The reason could be as simple as non-integer answers making no sense, such as buying 14.67 cans of Sprite for a party.

I get that being truthful does stop some behavior that one might informally dislike. What I feel as an outsider to the research into matching problems is simple: where is the theorem that shows that adding truthful behavior has some advantage? It is true in the analogous case of auctions that they can be designed so that truthful bidding is provably a dominant strategy, and plausibly this matters to competitors agreeing to and paying for the auction mechanism. Perhaps there is a “meta” game level where mechanism designs are strategies and there is a communal payoff function, in which truth-inducing mechanism designs may be optimal strategies. But overall I am puzzled. Perhaps this just shows how naive I am about this line of work.

## Open Problems

What are your reactions on the importance of inducing truth-telling? To show at least dewy diligence on our part, here are a few good references. What would Gödel—who conveyed certain of his own mechanism design opinions to his friend Oskar Morgenstern (no relation)—say?

1. October 14, 2016 3:28 pm

The benefit to procedures in which truth telling is a dominant strategy is that it deals best with real world deviations from rationality. Sure, if we really had all rational agents who would all know they should lie and how to do so when submitting preferences it might be fine.

However, in practice many agents won’t realize that truth telling isn’t the best solution. Other agents might feel that it is immoral to try to get an advantage by not telling the truth and thereby cause the whole system to yield a less-than-optimal result (this suggests a weaker property: still giving an equally good result when agents report truthfully even though truth-telling is non-dominant). Lastly, truthful strategies in some sense serve as a proxy for ease of computation. One can suppose that it is easy for an agent to list their truthful preferences, but if a matching procedure favored some other strategy it might be too computationally difficult to implement.

As an aside, there is also the practical issue that procedures that promote fairness make it easier to collect data about how well the procedure is doing in practice.

2. October 15, 2016 4:13 am

Why favour truth?

Because history teaches us that evolution favours co-operation within a species; and since no two persons have the same perspective, describing one’s perspective truthfully enriches everyone’s perspective, albeit second-hand; describing it falsely enriches no one.

Because in mathematics, we can build an enduring edifice of truths upon a true statement; nothing upon a false one.

As to Goedel, by all accounts he could have been predisposed to hold (perhaps along with Tarski) that truth is a universal attribute of the content which precedes both the language that seeks to describe the content, and its logic.

If so, he would have been wrong.

For language precedes logic, which precedes both truth and provability.

And a well-defined logic is merely a set of deterministic rules that (reflecting the evolutionary impulse) can constructively assign unique, and verifiable, values of truth and provability to the propositions of a language (such as the first order Peano Arithmetic PA) so as to ensure that the language can adequately represent and unambiguously communicate the content for which it is designed (the properties of the natural numbers in the case of PA).

3. October 15, 2016 8:45 am

In the field of mechanism design, we often look at direct revelation mechanisms. Usually a mechanism stipulates actions and payoffs and agents choose actions as a function of their private information (e.g. preferences). With Bayesian agents, this will produce some equilibrium. However, we could define an alternative mechanism in which agents report their private information and the mechanism takes an action for them (a direct revelation mechanism). In the context of an auction, rather than just submitting a bid, the agent reports his value for a good to the direct revelation mechanism, which then chooses a bid for him. One can show that for each Bayesian equilibrium, there exists a direct revelation mechanism which yields the same equilibrium outcomes! (This is easy to see by creating a mechanism that enacts the same action for the agent that that agent would’ve taken in the equilibrium.) By looking at truthful reporting in a direct revelation mechanism, we are essentially asking whether the outcome corresponding to the mechanism is indeed an equilibrium. Truthtelling is not necessarily desired if we are looking at general mechanisms, but in direct revelation mechanisms truthtelling corresponds to making sure that the direct revelation mechanism is an equilibrium.

4. October 16, 2016 11:45 am

Being truthful is a good thing because it means that your method still works even when used on agents who will attempt to get the best results for themselves and who know how the method works. We should expect agents participating in a game to act strategically, so we want to make methods that work even when used on agents that act strategically.

Note though that this is a weaker condition than truthfulness! The idea of “truthfulness” is to break this into two parts:
1. The method works when used on agents who are honest
2. Agents are incentivized to be honest (strategic agents will act honestly) — this is the “truthfulness” part.

However, one could potentially accomplish the original goal without breaking it into these two parts. But basically truthfulness is one way of dealing with self-interested agents; you could think of this as a sort of adversarial situation, and truthfulness as a sort of robustness condition.