An open problem on setting prices to optimize sales

Nina Balcan and Avrim Blum are top researchers in game theory, learning theory, and more generally the theory of algorithms. Nina is at Tech these days and Avrim is at CMU.

Today I want to talk about a simple, perhaps deceptively simple, question about the pricing of items.

The problem is due to Nina Balcan and Avrim Blum, who have a partial result that seems quite hard to improve. Avrim presented the result a few years ago at a talk at Tech. I could not follow any of the rest of his talk, because I had to play with the problem. It is one of those wonderful questions that is clean, simple to state, well motivated, but seems to be very hard to solve. I made no progress at all on the problem—and worse I missed the rest of his great talk.

The Problem

Here is their problem. You have a warehouse filled with items: you got this and that. Let’s use exciting names for the items and call them item ${\#1}$, item ${\#2}$, and so on. In total you have ${n}$ items, but even better your warehouse is filled with many identical copies of each item.

You company wants, of course, to sell these items to your customers. So far nothing unusual—that is what every company wants to do. The twist is that your customers have requests that pair two items together. Customer Alice may say: I wish to buy item ${\#3}$ and item ${\#45}$ and I will pay up to ${400}$ dollars for the pair. While customer Bob may say: I wish to buy item ${\#3}$ and item ${\#52}$ and I will pay up to ${19}$ dollars for the pair.

Coupling of items may be unusual but it is a plausible model. In real life people do sometimes couple their buying implicitly rather than explicitly as above. Think of Alice as wanting to buy a flat screen TV which is item ${\#3}$, and also a set of cables to attach it to her entertainment system which is item ${\#45}$. From her point of view having one is useless, while having both is priceless—well not exactly priceless, since she will only pay up to ${400}$ dollars total.

Your company must be able to handle these paired requests, and in the simplest manner possible. So you have decided to create a price for each item—what a radical idea. Every item from ${\#1}$ to ${\#n}$ will have a price in dollars. The catch is you wish to get the most money possible from your customers. You are lucky in that you know for every customer exactly what two items they want and the maximum price they will pay for the pair. A more realistic version might be an on-line problem, but here we are assuming that you see all the requests at once.

How do you price the items? The key issue is that if a customer wants items ${\#a}$ and ${\#b}$ for ${d}$ dollars here is what happens:

If the price of the two items together is less or equal to ${d}$ you get the sum of the two prices. If it is more than what the customer wants to pay you get zero.

Nothing. Zip. This is the cool part of this problem: the decision of what you get once you select prices is non-linear. This is why the problem seems to be hard, and why there are only approximations known.

The Problem Formally Stated

You are given an undirected graph ${G=(V,E)}$. We will assume there are no multiple edges, although the general problem allows them. Each edge is assigned a positive weight, via a function ${w: E \rightarrow \mathbb{R}^+}$. The weight ${w(e)}$ of the edge ${e=\{a,b\}}$ should be thought of as the amount that a customer will pay for buying both items ${a}$ and ${b}$. Your task is to final a price function ${p: V \rightarrow \mathbb{R}^+}$ so that the following sum is optimal:

$\displaystyle \sum_{ \{a,b\} \in E} \delta \left( p(a)+p(b), w(\{a,b\}) \right).$

Here ${\delta(x,y)}$ is ${x}$ if ${x \le y}$ and ${0}$ otherwise. This is the rule that you get money from a customer only if the price of the two items, ${p(a)+p(b)}$ is less than or equal to the amount they wish to pay, ${w(\{a,b\})}$. And when you are paid you get ${p(a)+p(b)}$, not the maximum the customer was willing to pay.

Approximate Solutions

There is a very simple and clever algorithm that is 2-OPT for the case when the graph is bipartite. The best general algorithm is 4-OPT, and uses the bipartite result. Balcan and Blum proved:

Theorem: For the pricing problem on a bipartite graph there is a deterministic algorithm that determines prices that yield a total payment within a factor ${2}$ of optimal.

Proof: Let ${G}$ be the bipartite graph, and let ${L}$ and ${R}$ be a partition of the vertices of the graph so all edges go between ${L}$ and ${R}$ only. Set all the prices for the vertices in ${R}$ to ${0}$—essentially giving those items away. Then determine the optimal prices for the vertices on the left. Note that this is easy to do because now the pricing problem is on single items, not on pairs. Call the amount of dollars you get ${d_L}$, since the money comes from the ${L}$ vertices. Then do the same to get ${d_R}$ by setting the prices for the vertices in ${L}$ to ${0}$.

Assume that ${d_L \ge d_R}$: this is okay by symmetry. Then the simple observation to finish the proof is that ${d_L}$ must be at least ${1/2}$ of the total optimal payment possible. So use pricing for ${d_L}$. $\Box$

How do Balcan and Blum use this for general graphs? Just give a random 2-coloring of the vertices, ignoring all edges between two endpoints of the same color, thus making the remaining graph bipartite. In expectation, this process loses at most half the optimal revenue—and can be made deterministic using pairwise independence. This gives a 4-approximation in general, and no better approximation is known.

Further Results

There is a hardness result that there is no ${17/16}$-opt for the pricing problem unless ${\mathsf{P}=\mathsf{NP}}$, so we cannot expect a fast general solution that is exact. This result is in the pretty paper titled “On Hardness of Pricing Items for Single-Minded Bidders” by Rohit Khandekar, Tracy Kimbrel, Konstantin Makarychev, and Maxim Sviridenko.

Locally, at Tech, Shiva Kintali and Danupon Nanongkai have worked on the flip-side problem of trying to find better upper bounds. So far they have shown that some very special graphs have either exact polynomial time algorithms or good polynomial time approximation algorithms, in some cases an actual PTAS. Here is the list so far:

1. Paths have an optimal algorithm.
2. Cycles have an optimal algorithm.
3. Trees have a PTAS.
4. Graphs that are k-colorable graphs have a factor ${4(k-1)/k}$ approximation.

For example, cycles are handled as follows: If in the optimal case some customer edge is not satisfied then try all deleted versions of the cycle and use the tree algorithm. If on the other hand in the optimal case all customers are satisfied, then we can easily reduce the problem to a linear programming problem.

Curiously while the result for paths is just a standard application of dynamic programming, the details are a bit messy. I would have guessed that a simple path would be “trivial,” but that appears not to be the case.

Open Problems

Is there a ${k}$-opt for general graphs with ${k<4}$? Are there other interesting classes of graphs that have good approximation algorithms? There is a natural generalization to hypergraphs: in this case buyers wish to purchase multiple items or none.

March 7, 2011 8:42 am

Has anyone considered the natural generalisation where some pairs of items are more frequently requested than other pairs of items, perhaps selected from some non-uniform probability distribution? How would this affect the results obtained so far?

March 7, 2011 9:43 am

“For example, cycles are handled as follows: If in the optimal case some customer edge is not satisfied then try all deleted versions of the cycle and use the tree algorithm”

DYM the path algorithm? Seems like you do if you get an optimal algorithm for cycles.

March 7, 2011 4:18 pm

A naive question. When you say “there exists a deterministic algorithm”, “the best general algorithm is 4-OPT”, “there is no 17/16-OPT”, is this always taken to mean “within the class of polynomial-time algorithms”?

March 7, 2011 4:44 pm

James.

Yes that is with polynomial algorithms.

March 8, 2011 10:23 am

It is another sample use case to test P is not NP:

(1) An infinite set of discrete willing prices are eigenvalues (complex parameters) depending on a continued fractional relation. It contains continuation, which forms non-polynomials.

(2) Pairing and bipartite methods make the pricing function analytic but it contains continuation, which is non-polynomial.

(3) It is possible to generalize (1) and (2) to study the pricing function for both items.

(4) Multiple buyers (or instances of buying) in the lower plane perform multi-path checking. It is a multivalued function. With a finite set of items, phase incompleteness exists in the imaginary domain. This adds both modulus and phase incompleteness in the real domain, which cause computational complexity in the complex domain.

(5) Even with the above non-polynomials, there exists a policy function to determine the orthogonal decision (buy or not buy). The policy function formulates different branches of buyers with the continuous path. However, the policy function contains a normalization factor which is non-polynomial associated with the above pricing function. Hence, Balcan and Blum’s approximation is correct. The possible improvement is to identify the convenient values such as frequency, preference, etc.

(6) In the case of buying three-items instead of pairing, the policy function becomes rational. It is possible to make polynomial decision. You can find finite local transformations (20-30 transformations). This does not include relation transformations.

(7) In the case of buying N-items instead of pairing, the decision function is non-polynomial. It is possible to identify a list of decisions if they exist. However, there exists polynomial decision functions in the N-item case. We can form an orthonormal basis upon the products of these polynomials.

(8) The policy function can be defined as a non-polynomial kernel function to transform the relation between the pricing function and the eigenvalues into infinite non-local decisions.

For the detailed reasoning, please see my proof on “P is not NP” at:

http://leibnizengine.wordpress.com/2011/02/23/p-np/

March 8, 2011 3:18 pm

Just to mention a related open question: Suppose you are allowed to assign negative prices (for this to make economic sense, think of “0” as meaning “pricing at cost”, and think of w(e) as the amount the buyer is willing to pay above your cost; so a negative price just means you are pricing the item at a loss (to you) in order to make money from buyers who also want high-price items). Can you get a constant approximation with respect to the revenue of the best pricing of this form? A recent paper of Yi Wu Pricing Loss Leaders can be hard shows that for 3-hypergraphs (the buyers each want 3 items) this is UGC-hard. But it’s not clear if it’s hard or doable for graphs.

March 8, 2011 9:33 pm

I loved this problem.

March 9, 2011 3:55 am

In the above paper with the 17/16-OPT proof, w (called b by them?) is assumed to be integer (see 2. Preliminaries), does this mean the bound for the posted problem with w in the real numbers can be even worse?
I’m not sure, if the proof works out in the w=real numbers case, because if I understand it correctly, Lemma 2.1 implies the prices are semi-integral if(f?) the b_e are integers and in the LP in 3.2 it’s crucial, that the prices are semi-integral due to the fact we have variables for every possible price (so if we have semi-integral prices =<, we have a countable number of possible prices).

Moreover, I think there is a typo in the paper on page 9, where they write b_e, e \in V instead of b_e, e \in E ?

• April 3, 2011 4:25 pm

Markus: It’s not very important whether w’s are integers, rational or real numbers (assuming that they are given in some “constructive” way). We can round down all w’s to be multiples of (1+epsilon) for some small epsilon. Then scale them and get integer numbers. By doing so we may loose at most (1+epsilon) in the approximation factor. Thus, Lemma 2.1 is not particularly important for the proof. The correct lower bound is perhaps better than 17/16. Particularly, assuming the Unique Games Conjecture it’s (2 – epsilon) as shown in the paper.

8. March 9, 2011 7:41 am

This is unrelated, but I made a blog post in which Richard Lipton Makes a Cameo appearance

9. March 9, 2011 7:44 am

Oops. I forgot to give a link to the blog post. Here it is.

March 9, 2011 10:35 am

Blum:

1. There is one class of three-item pricing can be represented by the graph-based methods. It remains homolgraphic for only one of the three items. It makes a graph possible. Asymptotic approximation can be done for other cases.

2. The pricing function is determined by their exponents to obtain confluence. Cost-based pricing vs selling pricing makes no difference. Hence, we cannot achieve constant approximation as we expected.

3. I have some doubts on the method and results from which UGC-hard and 3-hypergraphs were proposed since there are straight restrictions to reduce pricing to 3-hypergraphs. We have to verify the edge cases for that proposal. I think there are counter-examples for the proposed method. In addition, we need to consider all transformations (relation transformations, pricing transformations, and kernel transformations) as we discussed earlier. So I do not have “HARD” conclusion for this case.

March 9, 2011 2:51 pm

Rrtucci:

Internals of continuation do not terminate non-polynomials but externals affect continuation.

Quantum computing does not remove continuation and Space-time does not either.

You need to understand why we need P and when we need NP.

Did you read my proofs on “P vs. NP” ? If not, please do so !

http://leibnizengine.wordpress.com/2011/02/23/p-np/