Merrill Flood was a mathematician who publicized the Traveling Salesman Problem (TSP) back in the 1940s.
He says that he found the problem while:
I was struggling with the problem in connection with a school-bus routing study in New Jersey.
Today I thought we might look at the TSP from a viewpoint of complexity theory.
First, we note that the origins of the TSP are unclear. In 1832 it was mentioned in a handbook for traveling salesmen in Germany and Switzerland. William Hamilton talked about the related problem later named for him:
On finding a circuit through all points of a graph.
Karl Menger, in the 1930s, defined the TSP, considered the obvious brute-force algorithm, and observed the non-optimality of the nearest-neighbor heuristic. We have annotated Wikipedia’s optimal-tour example to show (in blue) one case where a node is not connected to either of its two nearest neighbors in the tour and another where a mutual nearest-neighbor edge is not used in the optimal tour (the latter is a slight fib as-is but can be nudged, we believe):
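Menger’s observation is easy to reproduce. Here is a tiny sketch, with a made-up one-dimensional instance of our own (not the Wikipedia example), showing the nearest-neighbor heuristic losing to the optimal tour:

```python
import itertools

def tour_cost(D, tour):
    # Cost of a closed tour under distance matrix D.
    return sum(D[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def nearest_neighbor_tour(D, start=0):
    # Greedy heuristic: always go to the closest unvisited city.
    unvisited = set(range(len(D))) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda c: D[tour[-1]][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Four cities on a line at these coordinates (a hypothetical instance).
xs = [0.0, 1.0, -2.0, 4.5]
D = [[abs(a - b) for b in xs] for a in xs]

greedy = tour_cost(D, nearest_neighbor_tour(D))              # 15.0
optimal = min(tour_cost(D, (0,) + p)
              for p in itertools.permutations(range(1, 4)))  # 13.0
```

The greedy tour rushes to the nearby city at 1 and is then forced into two long hops, while the optimal tour simply sweeps the line.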
The TSP is now well known to be NP-complete thanks to Dick Karp’s brilliant work in the 1970s. Flood would be happy to hear—we believe—that there is now a computer program Concorde that can solve TSP with many cities and get a nearly optimal solution. The authors of this program are: David Applegate, Robert Bixby, Vašek Chvátal, and William Cook.
Concorde has been applied to problems of gene mapping, protein function prediction, vehicle routing, conversion of bitmap images to continuous line drawings, scheduling ship movements for seismic surveys, and in studying the scaling properties of combinatorial optimization problems.
Concorde’s solutions are guaranteed to be within 2-3% of optimal and it has worked for problems with tens of thousands of cities or more. A lot more than Flood considered.
So why are we interested in the TSP still? A good question.
The answer is not simple. In spite of the problem’s NP-completeness and the widespread belief that P ≠ NP, we still do not know for sure. Could there be an algorithm that always solves the TSP in polynomial time? We do not know.
As complexity theorists we also want provable algorithms: exact algorithms, or at least approximation algorithms with provable bounds. These are lacking. This is why we were excited by the recent result of a tiny improvement to the 3/2 approximation bound on metric TSP by Anna Karlin, Nathan Klein, and Shayan Oveis Gharan.
Maybe we are crazy to be excited over a microscopic shaving off from 3/2. But their result comes with new ideas, and since we don’t believe that 3/2 is Nature’s final answer, those ideas should yield further advances. What we are really crazy over is new ideas, and that prompts us to attempt to find another in the rest of this post. What complexity theorists do have more of than when Ken and I started is receptacles for new ideas, and we try for one in resource-bounded communication complexity.
In this spirit of craziness let’s take a complexity viewpoint of the TSP. Let’s visit Alice and Bob. By the way, I have always thought the invention of them by Ron Rivest, Adi Shamir, and Leonard Adleman in their 1978 RSA paper was important. This idea has probably helped improve exposition more than any other single idea.
So suppose that Alice and Bob are presented with an n-city TSP. Further suppose that Alice can solve the TSP, but Bob, like the rest of us, cannot. An issue is:
Can Alice send a succinct message to Bob that allows him to find an optimal tour in polynomial time?
Without the word “succinct” there would be the simple answer of sending Bob her optimal tour. We will discuss a modest improvement: she can send half as many bits to Bob as it takes to describe the optimal tour. We suspect the following is known, but it suggests a possible interesting direction.
Theorem 1 (Main) Alice can send Bob a message of (n/2) log n bits, so that he can get an optimal TSP tour in polynomial time.
This theorem holds for general TSP: it can be asymmetric and need not satisfy the triangle inequality.
Proof: Let’s assume that n is even; the argument is essentially the same if n is odd.
Alice will first solve the TSP exactly. Assume she finds an optimal tour that visits the cities in the order c_1, c_2, …, c_n.
To simplify the notation let d(x, y) be the distance from x to y. Then let Alice send Bob the odd cities in order: c_1, c_3, c_5, …, c_{n-1}.
This message has the right length of about (n/2) log n bits.
Bob receives this message and creates the following bipartite graph H: The left vertices are the odd cities c_1, c_3, …, c_{n-1}.
The right vertices are the remaining cities, the even ones.
H has edges connecting every left vertex to every right vertex; i.e., H is the complete bipartite graph.
The first idea is that any perfect matching in H extends uniquely to a tour, because Bob knows the order of the odd vertices in Alice’s tour. That is, if (c_{2i-1}, v) is a matching edge, then Bob knows that the next tour edge goes from v to the node c_{2i+1} (indices modulo n).
The second idea—which is a caveat—is that an optimal matching under the graph’s original distance function might not produce Alice’s tour. For that to hold, it would need to be the case that for every optimal tour, one of the two perfect matchings it induces is an optimal matching between the tour’s odd and even vertices. An example where this fails is shown at lower left in our diagram above: the green edge is part of the optimal odd-even matching but is not in Alice’s optimal tour.
The trick is to change the distance function to include the edge that Bob must add to the tour. That is, we give the edge (c_{2i-1}, v) the cost d(c_{2i-1}, v) + d(v, c_{2i+1}).
Bob then finds a minimum-cost matching M for the graph H. This can be done in polynomial time. We claim the following:
(1) Under the modified costs, the cost of any perfect matching equals the cost of the tour it extends to; in particular, the matching induced by Alice’s tour has the cost of the original TSP tour.
(2) Any perfect matching of H extends to a tour.
These claims imply that Bob can solve the TSP in polynomial time.
We note that (1) follows from the definition of the graph and its costs. In order to see (2), let M match c_{2i-1} to v_i. This means that the extended tour has edges (c_{2i-1}, v_i) and (v_i, c_{2i+1}) for all i. This is clearly a tour, since the map i ↦ v_i is a permutation by the definition of matching.
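For concreteness, here is a small self-contained sketch of the whole protocol. Brute force stands in both for Alice’s TSP solver and for Bob’s matching step, which in reality would use a polynomial-time matching algorithm; all function names are ours:

```python
import itertools

def tour_cost(D, tour):
    # Cost of a closed tour under distance matrix D.
    return sum(D[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def alice_message(D):
    # Alice solves the TSP (brute force here) and sends the odd-position cities in tour order.
    n = len(D)
    opt = min(itertools.permutations(range(n)), key=lambda t: tour_cost(D, t))
    return list(opt[0::2]), tour_cost(D, opt)

def bob_decode(D, odd):
    # Bob recovers an optimal tour from the odd cities via a min-cost bipartite matching.
    n, m = len(D), len(odd)
    evens = [c for c in range(n) if c not in odd]
    # Edge cost includes the forced next hop back to the following odd city.
    C = [[D[odd[i]][v] + D[v][odd[(i + 1) % m]] for v in evens] for i in range(m)]
    # Minimum-cost perfect matching; brute force at this toy size (the Hungarian
    # algorithm would do it in polynomial time).
    match = min(itertools.permutations(range(m)),
                key=lambda p: sum(C[i][p[i]] for i in range(m)))
    tour = []
    for i in range(m):
        tour.append(odd[i])
        tour.append(evens[match[i]])
    return tour
```

On any instance, the tour Bob reconstructs has exactly the optimal cost, matching the claims above.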
The fact that Alice can send only half of the tour to Bob suggests: can we avoid having Alice solve the TSP in order to compute her message? This is the direction we are interested in. If she could do that it would improve some known complexity bounds on TSP. A possible idea is:
Replace having Alice solve the n-city TSP exactly. Instead have her solve some randomly selected subproblem exactly, and then use the above method to add in the remaining cities.
Can this work?
We have not found something like this when searching on TSP and communication complexity, but would not be surprised if the matching idea, caveat, and trick have been noticed and remarked on before.
Jemma Lorenat is an assistant professor at Pitzer College in Claremont, California. She teaches and does research on the history of mathematics.
Today I thought we’d look at some of her work.
History of math is one topic that we have focused on many times before. More on that in a moment.
But before we do that I wish to present Lorenat’s artwork. My late father, Jack Lipton, was an artist and so perhaps I have a genetic interest in art. Lorenat is an artist besides being a mathematician. You can see some of her drawings of famous mathematicians here. I find her elegant style appealing. See if you like it as much as I do. My dad taught me:
A clean drawing is more difficult to execute than a busy one. It is hard to hide flaws when your art is clean.
Her drawings are clean indeed.
Here are three of Lorenat’s drawings of the following three famous mathematicians in some order: Which is which? Prizes will not be given to those with correct answers.
Lorenat’s research is on the history of mathematics. My first choice is to create math, but I am intrigued by the history of who did what, when, and why. We must understand history—at least in broad strokes—if we are to continue to make progress. History helps us understand how progress was made and how it was not. History teaches us much about our field, about mathematics.
Several online sources show the field’s breadth and scope. Among issues and topics, we note:
It is fun to see the process in action. One example I have been involved in for a long time is the study of vector addition systems and reachability problems. There continues to be exciting news, for instance, a paper last year showing that a central reachability problem is vastly harder than had been conjectured. I will discuss, however, an issue from two centuries ago that Lorenat has illuminated.
Lorenat has a talk on the geometric theory of duality. It was a prime example of a controversy in the discovery of a basic math idea. Here duality means: given a statement from projective geometry, we can flip points and lines and still leave its correctness invariant. This is the duality: exchange the words “point” and “line” (and the incidence relations accordingly) in a true statement, and the result is again true.
In complexity theory we have our own duality. Instead of flipping points and lines we can exchange the boolean operations AND and OR,
and also exchange the constants 0 and 1.
Thus, for instance, the dual of x OR (y AND z) is x AND (y OR z), and De Morgan’s laws say that negating the inputs and the output of a formula computes its dual.
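A quick sketch of this duality in code (the tuple representation of formulas is just our illustration): taking the dual swaps the operators and the constants, and De Morgan’s laws say the dual evaluated on negated inputs is the negation of the original.

```python
def dual(f):
    # Formulas: constants 0/1, variable names as strings, or ('and'|'or', left, right).
    if f in (0, 1):
        return 1 - f                      # constants flip
    if isinstance(f, str):
        return f                          # variables stay
    op, a, b = f
    swapped = {'and': 'or', 'or': 'and'}[op]
    return (swapped, dual(a), dual(b))

def evaluate(f, env):
    if f in (0, 1):
        return f
    if isinstance(f, str):
        return env[f]
    op, a, b = f
    x, y = evaluate(a, env), evaluate(b, env)
    return x & y if op == 'and' else x | y

f = ('or', 'x', ('and', 'y', 'z'))
# dual(f) == ('and', 'x', ('or', 'y', 'z'))
```

Checking De Morgan over all assignments confirms that evaluate(dual(f), negated env) always equals 1 − evaluate(f, env).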
Lorenat’s talk highlights a controversy between: Joseph Diez Gergonne, Jean-Victor Poncelet, and Julius Plücker. Her work is here.
A plagiarism charge in 1827 sparked a public controversy centered between Jean-Victor Poncelet (1788-1867) and Joseph-Diez Gergonne (1771-1859) over the origin and applications of the principle of duality in geometry. Over the next three years and through the pages of various journals, monographs, letters, reviews, reports, and footnotes, vitriol between the antagonists increased as their potential publicity grew. While the historical literature offers valuable resources toward understanding the development, content, and applications of geometric duality, the hostile nature of the exchange seems to have deterred an in-depth textual study of the explicitly polemical writings. We argue that the necessary collective endeavor of beginning and ending this controversy constitutes a case study in the circulation of geometry. In particular, we consider how the duality controversy functioned as a medium of communicating new fundamental principles to a wider audience of practitioners.
A further comment is here:
Of this feud, Pierre Samuel has quipped that since both men were in the French army and Poncelet was a general while Gergonne a mere captain, Poncelet’s view prevailed, at least among their French contemporaries.
Did you see which drawing was which?
Has Election Night—not just the election—been modeled adequately?
The Conversation source 
Kurt Gödel famously found solutions to Albert Einstein’s equations of general relativity that allow trajectories to loop through time.
Today, early on Election Day and a usual time for our posts on Gödel, I talk about the trajectory of counting votes after tomorrow’s polls close and the time-warp effects that everyone watching the returns will see.
I am picking up the vein of ethical algorithms in yesterday’s post, but with a different take about ethical modeling in novel situations where reliable training data is unavailable. I believe there is a responsibility for running simulations not just of tomorrow’s election, such as FiveThirtyEight conducts, but also of tomorrow’s count as it may unfold hour by hour and stretch over many following days.
Gödel also famously believed he had found a logical flaw in the U.S. Constitution that allowed a mechanism for legally instituting a dictatorship, but his argument was never reported. We discussed this at length in 2013, including noting a 2012 paper by Enrique Guerra-Pujol of the University of Central Florida College of Business, who is a frequent reader of this blog. Guerra-Pujol offers a detailed construction of a mechanism based on self-reference applied to the short Article V on amending the Constitution. This topic has been addressed several times since, but what stands out to us is a 2019 paper by Valeria Zahoransky and Christoph Benzmüller. This paper attempts to find a proof of Guerra-Pujol’s mechanism by automated logical inference using the Isabelle/HOL proof assistant. They found a model that satisfies the argument.
My post on Election Day 2016 contrasted the relatively stable time evolution of polls in 2012 with the gyrations of 2016. That post began by defending Nate Silver of FiveThirtyEight and his 30% likelihood for Donald Trump against those who had Hillary Clinton over 90%. At the time of our 2012 post on Barack Obama versus Mitt Romney, I had thought Silver’s error bars were too wide, but in 2016 they looked right on the basis of last-minute decision-making by impulse that my chess model also registers.
This year, the polls have been even steadier than in 2012, and undecided voters are found to be scarce. Of course, the pandemic has been the greatest factor in the poll numbers, but my first point is that election forecasting uses only the numbers as primary. The algorithms do not have an input that could take values Covid 19, Covid 23, Covid 17… In a normal election, the polling numbers would seem to point to an easier call than in 2012. Silver has, however, expressed caution in explaining why his odds stay just short of 90% for Joe Biden as I write at midnight. My first point of further disquiet begins with a simple fact:
The models have been trained on data from past elections.
A major difference on our flight path to Election Day is the upsurge in early voting and turnout overall. Texas reports that it has already registered more early votes than total votes cast in 2016. We are not saying this difference has been overlooked—of course, models are being adjusted for it every hour. What we are doubting is the existence of a basis for making those adjustments with high confidence. Before we discuss the major issue of the flight path after the polls close, let me insert an analogy from my current chess work.
My statistical chess model is trained on millions of moves from games at the hours-long pace of standard in-person chess. The pandemic has led to chess moving online where fast time controls are the norm. For instance, the hallowed US Championships were just held in a format of three games per day at a pace of 25 minutes for the whole game plus 5 seconds for each move played, which equates to G/30 (game in 30 minutes) under a simplification used also here. The Online Olympiad held in August by the International Chess Federation (FIDE) gave only 15 minutes plus 5 seconds per move, equating to G/20. I have been asked for input on games at paces all the way down to 1-minute “Bullet” chess.
My solid estimates of how less time affects skill come from the annual FIDE World Rapid and Blitz Championships, which are contested at equivalents of G/25 and G/5, respectively, by hundreds of elite male and female players. For time controls in between I face twin problems of scant data from in-person chess—basically none for players below master level—and online chess having evident contamination from cheating, as I discussed in a post last June. Hence I interpolate to determine model settings for other time controls.
This diagram shows internal evidence supporting the orange curve obtained via Aitken extrapolation, in that inverse polynomial curves based on other reasoning converge to it. The curve has been supported by recent field tests, including a restricted invitational tournament recently run by Chess.com at G/3 and a large junior league in Britain at G/15 equivalent. Still, the ethical status is:
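Aitken extrapolation itself is a classic acceleration trick. As a generic illustration (on a toy fixed-point iteration, not my chess data, which is not public):

```python
import math

def aitken(x0, x1, x2):
    # One step of Aitken's delta-squared acceleration on three consecutive iterates.
    return x2 - (x2 - x1) ** 2 / (x2 - 2 * x1 + x0)

# A linearly convergent iteration: x -> cos(x), with fixed point near 0.739085.
xs = [1.0]
for _ in range(5):
    xs.append(math.cos(xs[-1]))

accelerated = aitken(xs[3], xs[4], xs[5])
```

The accelerated value lands much closer to the fixed point than the raw fifth iterate does, which is the point of using the technique on a slowly converging family of estimates.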
My second point is that this year’s election models are in similar boats: The pandemic forces their cantilevered use in new situations. As with chess, the data needed for direct training may not be available. As we’ve said in the post, this year’s election task may be “Ethics-Hard.” Now we come to my third and main point about responsibility.
The new dimension of time is the order in which all the following categories of votes will be counted, in the states that variously allow them:
This is approximately the order in which votes will be counted, again with state-by-state differences, which especially may lead to votes in category 2 being counted later. The issue is that the Democratic and Republican shares are expected to vary greatly across the categories, enough to cause large shifts in the perceived leader over real time.
Geoffrey Skelley of FiveThirtyEight has an article showing how a 5-point win for Biden in Pennsylvania might still present as a 16-point lead for President Trump on Election Night, when all in-person votes are counted but only some of the votes by mail. Here are the article’s key graphics, the left one showing actual proportions from Pennsylvania’s June primary.
Composite of two figures from FiveThirtyEight Skelley article 
Other states that allow early tallying of early votes may show an initial Biden lead before a redshift toward Trump on Election Night, with more of Biden’s vote share remaining to be counted. Still others are less clear. FiveThirtyEight on Saturday posted a useful statebystate guide.
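The kind of simulation we mean can be sketched in a few lines. All numbers below are hypothetical, merely echoing the shape of the Skelley illustration, not real data:

```python
# Each category: (name, share of the total vote, Democratic share within it).
# Counted roughly in this order in a state that tallies mail ballots last.
categories = [
    ("in-person Election Day", 0.40, 0.30),
    ("early in-person",        0.10, 0.50),
    ("mail-in",                0.50, 0.71),
]

def running_margins(categories):
    # Dem-minus-Rep margin, in points of the total vote, after each batch is counted.
    margins, dem, rep = [], 0.0, 0.0
    for name, share, dem_share in categories:
        dem += share * dem_share
        rep += share * (1.0 - dem_share)
        margins.append((name, 100.0 * (dem - rep)))
    return margins
```

With these made-up shares the Election Night margin shows one candidate up by 16 points while the final margin favors the other by 5: the same vote total “flips” purely as a function of counting order.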
Awareness of the time-shifts is important not only for perception but also for the timing of legal challenges that are expected to arise. For instance, there is a continued specter of invalidating 127,000 early votes already cast by a novel drive-up system in Harris County, Texas; despite its rejection by a district judge today, it has been appealed higher. There is expectation of legal battles nationwide over procedures that have been altered by measures to cope with the pandemic.
Public perception, however, is the most immediate concern. President Trump stated in a flagged Tweet that we “must have final total on November 3rd” and followed up by saying, “…instead of counting ballots for two weeks, which is totally inappropriate, and I don’t believe that’s by our laws.” Justice Brett Kavanaugh wrote in a formal opinion that states “want to avoid the chaos and suspicions of impropriety that can ensue if thousands of absentee ballots flow in after Election Day and potentially flip the results of an election.” Note his phrase “flip the results”—a sure sign that perception is reality.
Of course many outlets besides FiveThirtyEight are aware of the new election-reporting physics and have published their own guides. They are adjusting their election-night projection models accordingly. Our main question goes further in terms of responsibility:
Has anyone been running simulations of how Election Night votecounting may unfold?
We believe such simulations are just as important as the ones they have run of a timeless election. Showing them is not only important to inform the many who will be watching, it would be a vaccine against pressure that exploits unexpected perception.
Yet it must be acknowledged that there is neither hard data nor sure knowledge of county-level vote-tallying policies and schedules on which to train such simulations. Nor may there be as many cross-checks as in my chess interpolation situation.
This may be another “Ethics-Hard” problem. But it is one already involved in adjusting the projection models that definitely are being deployed. Aside from the many novel and vital modeling problems from the pandemic, it may be the most important one of our near future. And we are thinking of this less than 48 hours—now less than 24 hours—before the first polls close.
How do you think Election Night results will unfold, apart from your estimate of the time-independent “ground truth” of the electorate’s intentions?
Algorithms for the Election
Michael Kearns and Aaron Roth are the authors of the book The Ethical Algorithm: The Science of Socially Aware Algorithm Design. It has earned strong reviews including this one in Nature—impressive.
Michael is a longtime friend who is a leader in machine learning, artificial intelligence, and much more. He also overlapped with Ken at Oxford while visiting Les Valiant there in the mid-1980s. He is at the University of Pennsylvania in computer science along with his coauthor Roth. Cynthia Dwork and Roth wrote an earlier book on the related issue of Differential Privacy.
Today we will talk about making algorithms ethical.
Tuesday is the 2020 US national election for President, for Congress, and for state and local offices. Every four years we have a national election, and we cannot imagine a better motivation for making sure that algorithms are ethical.
The word “algorithm” appears 157 times in their book. Two words used hand-in-hand with it are “data” (132 times) and “model” (103 times), both spread through all of the book’s 232 pages. Models of electorates, trained on data from past elections, inform the algorithms used by news agencies to make election-night projections. These carry more responsibilities than election-eve forecasts. There have been infamous mistakes, most notably the premature calls of Florida both ways in the 2000 election.
We believe that Tuesday’s election in our novel pandemic situation requires attention to ethics from first principles. We will discuss why this is important. What does it mean to be ethical here? And how can one make an algorithm ethical?
Algorithms have been around forever. Euclid devised his gcd algorithm around 300 BCE. In the first half of the last century, the central issue was how to define that an algorithm is effective. This led to showing that some problems are uncomputable, so that algorithms for them are impossible.
In the second half, the emphasis shifted to whether algorithms are efficient. This led to classifying problems as feasible or (contingently) hard. Although many algorithms for feasible problems have been improved in ways that redouble the effect of faster and cheaper hardware, the study of complexity classes such as NP has given reasons why algorithms for hard problems may never be improvable.
The new territory, that of Kearns and Roth, is whether algorithms are ethical. Current ones that they and others have critiqued as unethical accompany models for the likes of mortgages, small-business loans, parole decisions, and college admissions. The training data for these models often bakes in past biases. Besides problems of racial and gender bias and concerns of societal values, the raw fact is that past biases cause the models to miss the mark for today’s applications. For algorithms with such direct application to society, ethical design is critical.
But this requirement is further-reaching than one might initially imagine, so that as with computability and complexity, the factors can be ingrained in the problems.
Consider a simple problem: We are given a collection of pairs of numbers (x, y). We are to predict whether this number pair has a certain property, say whether x + y is even.
This is pretty easy if we can use both x and y. But imagine a world where x is allowed to be viewed but y is secret. Perhaps the law requires that we cannot use y—it is illegal. Now we might do as poorly as a coin flip. Suppose that the data consists of pairs in which y is a uniformly random bit independent of x.
Then seeing only x gives no advantage, while seeing both is perfect. Thus what in these simplified terms counts as an ethical algorithm is a poor predictor, whereas an unethical one is perfect.
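A tiny simulation of this toy setup (our own illustration, with both coordinates random bits and the property being whether x + y is even):

```python
import random

random.seed(0)
# Each example: x is visible, y is a secret fair coin independent of x.
data = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(10000)]
labels = [(x + y) % 2 == 0 for x, y in data]

def unethical_predictor(x, y):   # looks at the forbidden attribute y
    return (x + y) % 2 == 0

def legal_predictor(x):          # sees only x; can do no better than guessing
    return True

full_acc = sum(unethical_predictor(x, y) == l
               for (x, y), l in zip(data, labels)) / len(data)
legal_acc = sum(legal_predictor(x) == l
                for (x, y), l in zip(data, labels)) / len(data)
# full_acc is exactly 1.0; legal_acc hovers around 0.5
```

The forbidden attribute carries all of the predictive signal, which is the crux of the dilemma.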
The blurb for the Kearns-Roth book says that they “…explain how we can better embed human principles into machine code—without halting the advance of data-driven scientific exploration.” While we agree their approach is vital, we suspect that as with complexity there will be indelibly ethically hard tasks. We wonder whether election modeling has already become one of them.
Ken and I have two separate takes on this. We will do the first and then the other in a second post.
One question on everyone’s minds is whether we will see a repeat of the forecasting misses from 2016. Let us recall that our own Election Day 2016 post started by defending Nate Silver of FiveThirtyEight for giving Donald Trump as much as a 30% chance to defeat Hillary Clinton. He had been attacked by representatives of many news and opinion agencies whose models had Clinton well over 90%.
We wonder whether these models were affected by the kind of biases highlighted in the book by Kearns and Roth. We must say right away that we are neither alleging conscious biases nor questioning the desire for prediction accuracy. One issue in ethical modeling (for parole, loans, admissions) is the divergence between algorithm outcomes that are most predictive versus those that are best for society. Here we agree that accurate prediction—and accurate projections as results come in after the polls close—is paramount. However, the algorithms used for the latter projections (which were not at fault in 2016 but have been wrong previously) may be even more subject to what we as computer scientists with crypto background see as a “leakage” issue.
Here is the point. Ideally, models using polling data and algorithms reading the Election Night returns should read the numbers as if they did not have ‘R’ and ‘D’ attached to them. Their own workings should be invariant under transformations that interchange Joe Biden and Donald Trump, or whoever are opposed in a local race. However, a crucial element in the projection models in particular is knowledge of voting geography. They must use data on the general voting preferences of regions where much of the vote is still extant. Thus they cannot avoid intimate knowledge of who is ‘R’ and who is ‘D.’ There is no doubleblind or zeroknowledge approach to the subjects being projected.
There is also the question of error bars. A main point of our 2016 post (and of Silver’s analysis) was the high uncertainty factor that could be read from how the Clinton-Trump race unfolded. Underestimating uncertainty causes overconfidence in models. This can result from “groupthink” of the kind we perceive in newsrooms of many of the same outlets that are doing the projections. The algorithms ought to be isolated from opinions of those in the organization, but again there is reason from the last election to wonder about leakage.
Unlike cases addressed by Kearns and Roth, we do not see a solution to suggest. As in our simple example, prior knowledge of the data in full may be needed for prediction. This may just be an “Ethics-Hard” problem.
The word “election” does not appear in Kearns and Roth’s book. What further application of their standpoint to elections would you make?
Ken sees a larger question of ethical modeling decisions given the unprecedented circumstances of the current election. This comes not from spatial geography distorted by the pandemic but rather from the dimension of time injected by massive early voting and late counting of many mailed ballots. He will address this next.
Anna Karlin, Nathan Klein, and Shayan Oveis Gharan have made a big splash with the number 10^{-36}.
No, that is not the amount of the US debt, or the new relief bill. It is roughly the fraction by which the hallowed 44-year-old upper bound of 3/2 on the approximation ratio of the metric Traveling Salesperson Problem has been improved. With the help of randomization, we hasten to add.
Today we discuss the larger meaning of their tiny breakthrough.
The abstract of their paper is as pithy as can be:
For some ε > 10^{-36} we give a randomized (3/2 − ε)-approximation algorithm for metric TSP.
Metric TSP means that the cost of the tour is the sum of the distances d(c_i, c_{i+1}) of its edges
according to a given metric d. When the points are in the plane with the Euclidean metric, a near-linear-time algorithm can come within a factor 1 + δ of the optimal cost for any prescribed δ > 0. Sanjeev Arora and Joseph Mitchell jointly won the 2010 Gödel Prize for their randomized algorithms doing exactly that. The rub is that the constant in the “O” depends on δ—indeed, nobody knows how to make it scale less than linearly in 1/δ. But for general metrics, getting within factors of optimal up to 123/122 is known to be NP-hard.
Some intermediate cases of metrics had allowed getting within smaller factors, but for general metrics the factor of 3/2 found in 1976 by the late Nicos Christofides, and concurrently by Anatoliy Serdyukov, stood like a brick wall. Well, we didn’t expect it to be a brick wall at first. Let me tell a story.
Soon after starting as a graduate student at Oxford in 1981, I went with a bunch of dons and fellow students down to London for a one-day workshop where Christofides was among the speakers and presented his result along with newer work. I’d already heard it spoken of as a combinatorial gem and a perfect motivator for a graduate student to appreciate the power of combining simplicity and elegance:
Compute a minimum spanning tree T of the cities; find a minimum-weight perfect matching M on the set O of odd-degree vertices of T; take an Euler tour of T plus M; and shortcut repeated cities using the triangle inequality. Now any optimal TSP tour arises as a spanning tree plus an edge, so w(T) ≤ OPT. And the tour can be partitioned into two sets of paths with endpoints in O. One of those sets has weight at most OPT/2 and yet matches all pairs of O. Thus w(M) ≤ OPT/2. It follows that the final tour has weight at most w(T) + w(M) ≤ (3/2)·OPT and we’re done.
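The whole pub-proof pipeline fits in a short program. This is our own sketch: Prim’s algorithm for the tree, brute force standing in for the polynomial-time blossom matching, and Hierholzer’s algorithm for the Euler tour:

```python
import itertools
from collections import defaultdict

def christofides(D):
    """Christofides-Serdyukov sketch on a symmetric metric distance matrix D."""
    n = len(D)
    # Step 1: minimum spanning tree T (Prim's algorithm).
    adj = defaultdict(list)
    in_tree = {0}
    while len(in_tree) < n:
        u, v = min(((u, v) for u in in_tree for v in range(n) if v not in in_tree),
                   key=lambda e: D[e[0]][e[1]])
        adj[u].append(v); adj[v].append(u)
        in_tree.add(v)
    # Step 2: minimum-weight perfect matching M on the odd-degree vertices O.
    # Brute force here; Edmonds' blossom algorithm does this in polynomial time.
    odd = [v for v in range(n) if len(adj[v]) % 2 == 1]
    def matchings(vs):
        if not vs:
            yield []
            return
        first, rest = vs[0], vs[1:]
        for i, mate in enumerate(rest):
            for m in matchings(rest[:i] + rest[i + 1:]):
                yield [(first, mate)] + m
    best = min(matchings(odd), key=lambda m: sum(D[a][b] for a, b in m))
    for a, b in best:
        adj[a].append(b); adj[b].append(a)
    # Step 3: Euler tour of T plus M (Hierholzer's algorithm).
    stack, euler = [0], []
    multi = {v: list(nbrs) for v, nbrs in adj.items()}
    while stack:
        v = stack[-1]
        if multi[v]:
            u = multi[v].pop()
            multi[u].remove(v)
            stack.append(u)
        else:
            euler.append(stack.pop())
    # Step 4: shortcut repeated cities (valid by the triangle inequality).
    seen, tour = set(), []
    for v in euler:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour
```

By the argument above, the returned tour costs at most 3/2 of the optimum on any metric instance.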
My memory of what we did after the workshop is hazy but I’m quite sure we must have gone to a pub for dinner and drinks before taking the train back up to Oxford. My point is, the above proof is the kind that can be told and discussed in a pub. It combines several greatest hits of the field: minimum spanning tree, perfect matching, Euler tour, Hamiltonian cycle, triangle inequality. The proof needs no extensive calculation; maybe a napkin to draw on, and sketching the partition helps.
The conversation would surely have gone to the question,
Can the factor 3/2 be beaten?
A perfect topic for mathematical pub conversation. Let’s continue as if that’s what happened next—I wish I could recall it.
Note that the proof already “beats” it in the sense of there being a strict inequality: it really shows a bound of (3/2)·OPT minus the weight of one edge.
The advantage shrinks to zero as n grows, however. Moreover, examples where Christofides’s algorithm does no better than approach the 3/2 ratio are easy to draw. Pub walls are often covered with emblems of local organizations, and if one has a caduceus symbol it can serve as the drawing:
The staff is a path of n nodes joined by unit-weight edges, while the snakes alternate edges of weight 2 between nodes two apart on the path. Going up one snake and down the other gives an optimal tour of weight close to n (using the two outermost path edges to switch between the snakes). The snake edges don’t change the path’s being the minimum spanning tree, and Christofides then costs the path’s weight of about n plus the weight required to match the path’s endpoints. The extra weight is reckoned as the length of one snake, which is about n/2, so the ratio approaches 3/2 as n grows. Here are some tantalizing aspects:
In 1981, we would not have known about Arora’s and Mitchell’s results, so we would have felt fully on the frontier by embedding the points in the plane and sketching spanning trees and cycles on a piece of paper. After a couple of pints of ale we might have felt sure that a simple proof with such evident slack ought to yield to a more sophisticated attack.
There is one idea that we might have come up with in a pub. The motivation for choosing T to be a minimum spanning tree is that many of its edges go into the Euler tour, and those edges bound the final cost even if the shortcutting skips past them. So making the total edge weight of T minimum seems to be the best way to help at that stage. We might have wondered, however, whether there is a way to create T to have a stronger direct relation to good tours, if not to the optimal tour.
Oveis Gharan did have such an idea jointly with a different group of authors a decade ago, in the best paper of SODA 2010. We cannot seem to get our hands on the optimal tour, nor even a “good” tour if that means a better-than-3/2-factor approximation—that is what we are trying to find to begin with. But there is another “tour” that we can compute. This is an optimum x* of the linear programming relaxation of TSP, whose relation to the exact-TSP methods of Michael Held and Dick Karp we covered long back. Here x* is not a single tour but rather an ensemble of “fractional tours” in which each edge e has a rational number x*_e representing its contribution to the LP solution. The higher x*_e, the more helpful the edge.
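In our notation, the relaxation in question is the standard Held-Karp subtour-elimination LP over edge variables x_e:

```latex
\begin{aligned}
\min\ & \sum_{e} c_e\, x_e \\
\text{s.t.}\ & \sum_{e \in \delta(v)} x_e = 2 \quad \text{for every city } v,\\
& \sum_{e \in \delta(S)} x_e \ge 2 \quad \text{for every } \emptyset \ne S \subsetneq V,\\
& 0 \le x_e \le 1 \quad \text{for every edge } e.
\end{aligned}
```

An optimal x* may be fractional, and although there are exponentially many cut constraints, they can be separated in polynomial time, which is how x* gets computed.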
The objective then becomes to design distributions μ of spanning trees so that:
The algorithmic strategy this fits into is to sample a tree T from μ, plug T into the first step of the Christofides algorithm, and continue as before.
The first two conditions are solidly defined. Considerable technical details in the SODA 2010 paper and another paper at FOCS 2011 that was joint with Amin Saberi and Mohit Singh are devoted to them. A third desideratum is that the distribution not be over-constrained but rather have maximum entropy, so that for efficiently computable numbers λ_e approaching x*_e one has also:
The third condition, however, follows the maxim,
“the proof of the pudding is in the eating.”
As our source makes clear, this does not refer to American-style dessert pudding, but rather savory British pub fare going back at least to 1605. The point is that we ultimately know a choice of $\mu$ is good by proving it gives a better approximation factor than $\frac{3}{2}$.
In America, we tend to say the maxim a different way:
“the proof is in the pudding.”
The new paper uses the “pudding” from the 2011 paper but needed to deepen the proof. Here is where we usually say to refer to the paper for the considerable details. But in this case we find that a number of the beautiful concepts laid out in the paper’s introduction, such as real stability and strong Rayleigh distributions, are more accessibly described in the notes for the first half of a course taught last spring by Oveis Gharan with Klein as TA. One nub is that if a set of complex numbers all have positive imaginary part, then any product of two of the numbers has real part less than the product of their real parts, and if the latter product is positive, then the product of the two numbers is not a real number. This rules out assignments drawn from the set from being solutions to certain polynomials, as well as setting up odd/even parity properties elsewhere.
I’ll close instead with some remarks while admitting that my own limited time—I have been dealing with more chess cases—prevents them from being fully informed.
The main remark is to marvel that the panoply of polynomial properties and deep analysis buy such a tiny improvement. It is hard to believe that the true space of TSP approximation methods is so rigid. In this I am reminded of Scott Aaronson’s calculations that a collision of two stellar black holes a mere 3,000 miles away would stretch space near you by only a millimeter. There is considerable belief that the approximation factor ought to be improvable at least as far as $\frac{4}{3}$.
It strikes me that the maximum-entropy condition, while facilitating the analysis, works against the objective of making the trees more special. It cannot come near the kind of snaky tree obtained by deleting any edge from a good tour $C$, such that plugging the tree into step 1 yields $C$ back again. The theory of polynomials and distributions that they develop has a plug-and-play element, so that they can condition the distributions toward the third objective using the parity properties. But their framework has inflexibility represented by needing to postulate a real-valued function on the optimum edges whose expectation is of order the square of a parameter already given a tiny value. Of the requirement that this parameter be a small fraction of their governing epsilon parameter, they say in section 3:
This forces us to take [it] very small, which is why we get only a “very slightly” improved approximation algorithm for TSP. Furthermore, since we use OPT edges in our construction, we don’t get a new upper bound on the integrality gap. We leave it as an open problem to find a reduction to the “cactus” case that doesn’t involve using a slack vector for OPT (or a completely different approach).
What may be wanting is a better way of getting the odd-valence tree nodes to be closer, not just fewer in number. To be sure, ideas for “closer” might wind up presupposing a metric topology on the given points, leading to cases that have already been improved by other means.
Will the tiny but fixed wedge below $\frac{3}{2}$ become a lever by which to find better approximations?
There is also the kvetch that the algorithm is randomized, whereas the original by Christofides and Serdyukov is deterministic. Can the new methods be derandomized?
[fixed = to + sign at end of Christofides proof; fixed wording of “nub” at end of pudding section]
James Maynard is a number theorist. He attended Cambridge as an undergrad and then moved to do his grad work at Oxford at Balliol College. He is now a professor at Oxford. He is one of the world experts on prime density type theorems.
Today, since it is Friday, I thought we would discuss a timely idea of Maynard. Not an idea about time complexity or time in physics, but involving the use of time.
No, it’s not a technical idea of his. He has had many ideas, for instance, that shed light on the beautiful structure of primes. For example, he proved in 2016 that
Theorem 1 For each decimal digit $a$, there are infinitely many prime numbers that do not have $a$ in their decimal expansion.
This is not known for all digit systems: for binary, our favorite system as complexity theorists, this is still an open problem. Of course a binary prime with only $1$’s must be of the form

$$2^{p} - 1,$$

where $p$ must be a prime.
These are the famous Mersenne primes, named for Marin Mersenne. The largest known prime is a Mersenne prime as of today—at least I believe this is true. For further discussion, see a 2001 paper by Samuel Wagstaff titled “Prime Numbers with a Fixed Number of One Bits or Zero Bits in Their Binary Representation.”
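Both facts are easy to check computationally. Here is a short sketch—ours, with a small illustrative exponent list—of why the exponent must be prime, together with the classical Lucas-Lehmer test by which Mersenne candidates are screened in practice:

```python
def lucas_lehmer(p):
    """Classical test: for an odd prime p, M_p = 2**p - 1 is prime
    iff s_{p-2} == 0 (mod M_p), where s_0 = 4 and s_k = s_{k-1}**2 - 2."""
    m = 2 ** p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Why the exponent must be prime: if n = a*b with a, b > 1,
# then 2**a - 1 divides 2**n - 1, so 2**n - 1 is composite.
assert (2 ** 15 - 1) % (2 ** 3 - 1) == 0

# A prime exponent is necessary but not sufficient:
exps = [p for p in (3, 5, 7, 11, 13, 17, 19, 23) if lucas_lehmer(p)]
print(exps)  # [3, 5, 7, 13, 17, 19] -- 2047 = 23*89 and 2**23 - 1 = 47*178481
```

The record-holding Mersenne primes are found by running exactly this test, with heavily optimized modular arithmetic, on enormous exponents.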
Maynard’s idea is based on his quest to understand whether known techniques can solve some problem. Of course the best way to understand this is to solve the problem. His above theorem is a perfect example of this. In the abstract he says:
The proof is an application of the Hardy-Littlewood circle method to a binary problem, and rests on obtaining suitable “Type I” and “Type II” arithmetic information for use in Harman’s sieve to control the minor arcs.
The proof may be based on known techniques, but it is still very hard. He needs many pages to make it work.
Maynard’s idea is to set aside time to remind himself why existing techniques have not worked against math’s biggest open problems.
I often spend Friday afternoons just thinking about trying to directly attack some famous problem. This is much less because I think there’s a realistic way of solving the problem, but more because I think it’s important for me to understand where plausible techniques fail.
One can imagine that he had a Friday afternoon think. During it he asked himself:
Suppose I try to show that there are primes without some particular digit. This is a density-type theorem. Well, could I use the Hardy-Littlewood method? But it cannot work because… Wait, here is a possible way around the roadblock. Hmmm.
Maybe he looked at Terence Tao’s blog post on this very issue. It helped that Maynard is an expert on the Hardy-Littlewood method, but perhaps thinking about why it could not work helped him figure out how it could work.
Today is Friday, so I thought: what should I think about? What problems and what techniques? Here is a possible example. Let’s look at the paper “On Lower Bounds for the Separating Words Problem.”
An approach is based on the following. Let be the set of all degree polynomials, with coefficients . Let be a constant. Our hypothesis H is: For every polynomial in , there is some prime , so that for some
How small can the primes be so that H is true? What are the methods that we should think about? What methods can we see that cannot prove H? Can we, for example, show that a random argument works? Can we show that there are not enough primes in the range? Hmmm…
Do you like Maynard’s Friday rule? What problems and what techniques would you think about?
The search for a vaccine is not a “development” in the usual sense.
Edward Jenner was an English physician who created the first vaccine, one for smallpox.
In 1798 he used the weak cowpox virus to fool our immune system into creating protection against the deadly smallpox. Jenner is said to have saved more lives than any other human.
Today there is an attempt to create a vaccine against our smallpox of the 21st century.
In his day smallpox killed 10% or more of populations. In our day there is a similar threat, and thus the immense interest in the development of a vaccine. However, there is a misunderstanding about vaccines for COVID-19 that is pervasive. Read the New York Times or watch cable news—CNN, FOX, MSNBC—where “experts” explain how AstraZeneca, Johnson & Johnson, Novavax, and other drug companies are developing a vaccine. What developing means could potentially affect all of us, and a better understanding could save millions of lives.
They are not currently developing the vaccines; they are testing them. The point we want to emphasize is:
The development of a vaccine does not change the vaccine. The vaccine is the same at the start of its testing trials, and remains the same throughout.
The Oxford vaccine AZD1222 is the same today as it was months ago when it was created. The same is true for the other vaccines currently being tested around the world.
A vaccine is not developed in the usual sense. Drug companies can modify: how the drug is made, how it is stored, how it is given, how many doses are needed, and so on. Drug companies cannot modify the vaccine itself without starting over—the vaccine must remain the same. Trials can lead to a vaccine being adopted, or they can cause the vaccine to be abandoned. In the latter case the drug company can try again, but with a different vaccine.
Think of what development means elsewhere.
Here is a sample explaining vaccine development:
There are several consequences from this insight about vaccines. For one it makes sense to order millions of doses of a vaccine, even one that has not yet been proved to be safe and effective. For example,
The European Commission has placed its first advance order for a coronavirus vaccine, snapping up 300 million doses of AstraZeneca’s AZD1222 candidate developed by the University of Oxford, with an option on another 100 million.
Note we would never order a large number of copies of a book before all editing and typos were fixed. This is a “proof” that the vaccine is the same.
Actually it may make sense even to begin to take the vaccine, especially for high-risk people. In the past, inventors of vaccines have often taken their own new vaccine, even before they were sure it worked.
I am a computer scientist with no experience in vaccines. In 1954 I did help test the Jonas Salk polio vaccine. My help was in the form of supplying an arm that got a shot of the Salk polio vaccine; I was nine years old then. But I have a math view of vaccines—a viewpoint that sheds light on this misunderstanding.
Roger Penrose, Reinhard Genzel, and Andrea Ghez have won the 2020 Nobel Prize in Physics. The prize is divided half to Penrose for theoretical work and half to Genzel and Ghez for finding a convincing and appreciably large practical example.
Today we congratulate the winners and give further musings on the nature of knowledge and the role of theory.
The physics Nobel has always had the rule that it cannot be for a theory alone, no matter how beautiful and how many mathematical discoveries follow from its development. Stephen Hawking’s theory of black-hole radiation is almost universally accepted, despite its association with paradox, yet it was said that only an empirical confirmation such as mini black holes being discovered to explode in an accelerator core would have brought it a Nobel. The official citation to Sir Roger says that his prize is:
“for the discovery that black hole formation is a robust prediction of the general theory of relativity.”
What is a “robust” prediction? The word strikes us as having overtones of necessity. Necessary knowledge is the kind we deal with in mathematics. The citation to Genzel and Ghez stays on empirical grounds:
“for the discovery of a supermassive compact object at the centre of our galaxy.”
The “object” must be a black hole—given relativity and its observed gravitational effects, it cannot be otherwise. Among many possible witnesses for the reality of black holes—one being the evident origin of the gravitational waves whose detection brought the 2017 Nobel—the centers of galaxies are hefty examples. The combination of these citations opens several threads we’d like to discuss.
Dick and I are old enough to remember when black holes had the status of conjecture. One of my childhood astronomy books stated that the Cygnus X-1 X-ray source was the best-known candidate for a black hole. In 1974, Hawking bet Kip Thorne that it was not a black hole. The bet lasted until 1990, when Hawking conceded. He wrote the following in his famous book, A Brief History of Time:
This was a form of insurance policy for me. I have done a lot of work on black holes, and it would all be wasted if it turned out that black holes do not exist. But in that case, I would have the consolation of winning my bet. … When we made the bet in 1975, we were 80% certain that Cygnus X-1 was a black hole. By now [1988], I would say that we are about 95% certain, but the bet has yet to be settled.
In the 1980s, I was a student and then postdoc in Penrose’s department, so I was imbued with the ambience of black holes and never had a thought about doubting their existence. I even once spent an hour with John Wheeler, who coined the term “black hole,” when Penrose delegated me to accompany Wheeler to Oxford’s train station for his return to London. But it seems from the record that the progression to regarding black holes as proven entities was as gradual as many argue the act of crossing a large black hole’s event horizon to be. Although the existence of a central black hole from data emanating from Sagittarius had been proposed at least as far back as 1971, the work by Ghez and then Genzel cited for their prize began in 1995. The official announcement for Riccardo Giacconi’s share of the 2002 physics Nobel stated:
“He also detected sources of X-rays that most astronomers now consider to contain black holes.”
This speaks to lingering doubt, at least about where black holes might be judged to exist, if not about their existence at all.
However their time of confirmation might be pinpointed, it is the past five years that have given by far the greatest flood of evidence, including the first visual image of a black hole last year. The fact of their presence in our universe is undeniable. But necessity is a separate matter, and with Penrose this goes back to 1964.
We have mentioned Kurt Gödel’s solution to the equations of general relativity (GR) in which time travel is possible. This does not mean that time travel must be possible, or that it is possible in our universe. A “solution” to GR is more like a model in logic: it may satisfy a theory’s axioms but have other properties that are contingent (unless the theory is categorical, meaning that all of its models are isomorphic). Gödel’s model has a negative value for Einstein’s cosmological constant; the 2011 physics Nobel went to the discovery that in our universe the constant has a tiny positive value. GR also allows solutions in which some particles (called tachyons) travel faster than light.
That GR has solutions allowing black holes had been known from its infancy in work by Karl Schwarzschild and Johannes Droste. There are also solutions without black holes; a universe with no mass is legal in GR in many ways besides the case of special relativity. Penrose took the opposite tack, of giving minimal conditions under which black holes are necessary. Following this article, we list them informally as follows:
Penrose showed that any system obeying these properties and evolving in accordance with GR must develop black holes. He showed this without any symmetry assumptions on the system. Thus he derived black holes as a prediction with the force of a theorem derived from minimal axioms.
His 1965 paper actually used a proof by contradiction. He derived five properties needed in order for the system to avoid forming a singularity. Then he showed that they are mutually inconsistent. Here is the crux of his paper:

[ Snip from paper ] 
In the diagram, time flows up. The point in a nutshell—a very tight nutshell—is that once a surface flows inside the cylinder at the Schwarzschild radius then light and any other motion from it can go only inward toward a singularity. The analysis is possible without the kind of symmetry assumption that had been used to tame the algebraic complexity of the equations of GR. The metric completeness mandates a singularity apart from any symmetries; a periodic equilibrium is ruled out by analysis of Cauchy surfaces.
Like Richard Feynman’s famous diagrams for quantum field theory, Penrose developed his diagrams as tools for shortcutting the vicissitudes of GR. We could devote entire other posts to his famous tiles and triangle and other combinatorial inventions. His tools enable quantifying blackhole formation from observations in our universe.
The question of necessity, however, pertains to other possible universes. Let us take for granted that GR and quantum theory are facets of a physical theory that governs the entire cosmos—the long-sought “theory of everything”—and let us also admit the contention of inflationary theorists that multiple universes are a necessary consequence of any inflation theory. The question remains, are black holes necessary in those universes?
It is possible that those universes might not satisfy axiom 1 above, or might have enough complexity for the existence of black holes but not large-scale formation of them. The question then becomes whether black holes must exist in any universe rich enough for sentient life forms such as ourselves to develop. This is a branch of the anthropic principle.
Lee Smolin proposed a mechanism via which black holes engender new universes and so propagate the complexity needed for their large-scale formation. Since complexity also attends the development of sentient life forms, this would place our human existence in the wake of consequence, as opposed to the direction of logic when reasoning by the anthropic principle.
The 2020 Nobel Prize in Chemistry was awarded this week to Jennifer Doudna and Emmanuelle Charpentier for their lead roles in developing the CRISPR geneediting technology, specifically around the protein Cas9.
It would be hard to find two more different types of results:
Penrose shows that black holes and general relativity are connected, which is a math result. We still cannot create black holes in a lab to experiment with—or maybe we could but should be very afraid of going anywhere near doing so. It was not clear that there could ever be a real application of this result.
Charpentier and Doudna discovered that an existing genetic mechanism could be used to edit genetic material. Clearly this could be, and was, experimented on in labs. It is also clear that there are applications of this result. Indeed, it is now a standard tool used in countless labs. There are even patent battles over the method.
We like the fact that Nobels are given for such diverse types of research. It is not just that one is for astrophysics and one for chemistry. It is that Nobels can be given for very different types of research. We think this is important.
But wait. These results do have something in common, something that sets them apart from any research we can do in complexity theory. Both operate like this:
Observe something important from nature. Something that is there independent of us. Then in Penrose’s case explain why it is true. Then in Charpentier and Doudna’s case, use it to solve some important problems.
We wonder if anything like this could be done in our research world—say in complexity theory?
Besides our congratulations to all those mentioned in this post, Ken expresses special thanks to Sir Roger among other Oxford Mathematical Institute fellows for the kindness recorded here.
[changed note about massless universe]
Emil Faber is the pretend founder of the pretend Faber College. The 1978 movie Animal House starts with a closeup of Faber’s statue, which has the inscription, Knowledge Is Good.
Today, Ken and I thought we might talk about knowledge, science, mathematics, proofs, and more.
The phrase on Faber’s pedestal is meant to be a joke, as is the subtitle we added saying the same about science. But there is some truth to both of them. From the cause of climate change to the best response to the current pandemic to sports predictions, there is much interest in science. Science is good, indeed.
What is science, and what are the methods of creating knowledge via science? There is a whole field devoted to the philosophy of science. The central questions are: What is science? What methods are used to create new science? Is science good?—just kidding.
We are not experts on the philosophy of science. But there seem to be three main ways to create scientific knowledge.
Experiments: This is the classic one. Think about the testing of a candidate vaccine to stop the pandemic.
Computational Experiments: This is relatively new. Think computer simulations of how climate change is affected by the methods of creating energy—for example, wind vs. coal.
Mathematical Proofs: This is the one we focus on here at GLL. Think proofs that some algorithm works or that there is no algorithm that can work unless…
We are interested in creating knowledge via proving new theorems. This is how we try to create knowledge. Our science is based not on experiments and not on simulations but mostly on the theoremproof method. Well not exactly. We do use experiments and simulations. For example, the field of quantum algorithms uses both of these.
However, math proofs are the basis of complexity theory. This means that we need to create proofs and then check that they are correct. The difficulty of checking a proof depends on who created it:
My favorite tool for checking is this trick: Suppose that we have a proof $P$ that demonstrates $A$ is true. Sometimes it is possible to show that $P$ is really a proof of a statement $B$ that is strictly stronger than $A$.
One way this commonly arises is when the proof did not use all of the assumptions in $A$. Thus it really proves more than $A$—it proves $B$. But we note that $B$ is not a consequence of $A$.
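A toy propositional version of this check can be made concrete—the propositions $p,q,r$ and statements $A,B$ below are invented for illustration. If a purported proof of $A = ((p \wedge q) \rightarrow r)$ never touches the assumption $q$, it really establishes the stronger $B = (p \rightarrow r)$, and a brute-force truth-table search shows that $B$ is not a consequence of $A$:

```python
from itertools import product

def A(p, q, r):            # the claimed statement: (p and q) -> r
    return (not (p and q)) or r

def B(p, q, r):            # what the proof actually shows: p -> r
    return (not p) or r

# B is stronger: every assignment satisfying B also satisfies A ...
assert all(A(*v) for v in product([False, True], repeat=3) if B(*v))

# ... but A does not imply B: search for a separating assignment.
witness = next(v for v in product([False, True], repeat=3)
               if A(*v) and not B(*v))
print(witness)  # (True, False, False): A holds vacuously, B fails
```

So if $B$ is known to be false in some model while the argument for $A$ silently proves $B$, the argument must be flawed—which is exactly the checking trick.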
For example, consider the Riemann hypothesis. Suppose that we claim that we have a proof that a contradiction follows from the usual axioms of math plus the assumption that $\zeta(s) = 0$ for some $s$ with $\Re(s) \neq \frac{1}{2}$. Sounds great. But suppose this is based on an argument that assumes that

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} = 0$$

and manipulates the summation, eventually yielding a contradiction, without using the condition $\Re(s) \neq \frac{1}{2}$. This is a problem, since there are $s$ with $\Re(s) = \frac{1}{2}$ at which $\zeta(s)$ is zero. This is an example of the above method of checking.
From time to time claims are made of resolutions to famous conjectures. Think $P$ versus $NP$. These claims have all been wrong to date. So most researchers are reluctant to take time to check any new claims. Why would you take the effort to try and find the bug that is likely there?
I wonder if there could be a method that is based on competition. For concreteness, suppose Alice and Bob are two researchers who both claim a resolution to the $P$ versus $NP$ problem. Alice has a lower-bound argument that $P \neq NP$ and Bob has an upper-bound argument that $P = NP$. Could we have them play a “game”?
Give their papers to each other. Have them try to find a flaw in each other’s paper.
They are highly motivated. Could we argue that if they cannot find any flaw then we would be slightly more motivated to look at the papers?
This might work even if they both claim $P = NP$. Ken and I, personally, have had more claims of $P = NP$ brought to our attention. Even in this case they would be highly motivated: the awards, the prizes, the praise will go to the one who is correct.
One difference in our situation from classic empirical science is the nature of gaps in knowledge. For example, one of the big current controversies in physics is over the existence of dark matter. The Wikipedia article we just linked seems to date mostly to years around 2012, when dark matter was more widely accepted than it appears to be today (see also this and this). There are cases where two competing theories are incompatible yet the available data do not suffice to find a fault in either.
Whereas, with claimed proofs of incompatible statements, such as $P = NP$ and $P \neq NP$, at least one must have a demonstrable error. The statements themselves may have barriers all the way up to undecidability, but that does not matter to judging the proffered proofs.
The method may be more applicable in life sciences where the gap is gathering sufficient field or lab observations. For a topical example, consider claims about the risk or safety of human gatherings amid the pandemic. One extreme is represented by the extraordinary claim, which is evidently quite excessive, that the Sturgis motorcycle rally in August led to over 250,000 Covid-19 cases. The other extreme would be analyses used to justify gatherings with minimal precautions. The extremes cannot coexist. The means to arbitrate between them are available in principle but require costly social effort for contact tracing and testing as well as resolving mathematical issues between epidemiological models.
What do you think of our new checking method? Should it be more widely employed for evaluating claims and hypotheses?
Some differences from the Computational Lens
Chai Wah Wu, Jonathan Lenchner, Charles Bennett, and Yuhai Tu are the moderators for the four days of the First IBM Research Workshop on the Informational Lens. The virtual workshop begins Tuesday morning at 10:45 ET. The conference has free registration and of course is online.
Today we preview the conference and discuss a few of the talks.
The workshop’s name echoes the moniker “Through the Computational Lens” of initiatives led by the Simons Institute at Berkeley and used for a 2014 workshop organized by Avi Wigderson at IAS. A prospectus by the theory group at U.C. Berkeley led off with quantum computing. So will Tuesday’s talks, a full day on quantum by seven leaders we say more about below.
Then the meeting will branch in some different directions from the computationallens themes. The preface on the workshop website says:
Viewing the world through an informational lens, and understanding constraints and tradeoffs such as energy and parallelism versus reliability and speed, will have profound consequences throughout technology and science. This includes not only mathematics and the natural sciences like physics and biology, but also social sciences such as psychology and linguistics. We aim to bring together leading researchers in science and technology from across the globe to discuss ideas and future research directions through the informational lens.
The other three days have talks that reach into all these areas, a dazzling array. We have made a collage of the twenty-six speakers currently listed on the schedule. Several faces are long familiar but others are novel to us.
The opening morning has Alexander Holevo between Aram Harrow and Gil Kalai. Among many other accomplishments, Holevo is known for a theorem that implies that $n$ qubits can yield at most $n$ bits of classical information. In particular, any attempt to encode the edges of a general $n$-vertex graph via entanglements between pairs among $n$ qubits must be extremely lossy. He will talk about quantum channels and give a structure theorem for quantum Gaussian observables.
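A back-of-the-envelope count—our own, using the standard $n$-qubits-to-$n$-bits reading of Holevo’s theorem—shows the mismatch:

```python
import math

# Bits retrievable from n qubits (Holevo bound) vs. bits needed to
# specify an arbitrary edge set on n vertices: one bit per potential edge.
for n in (10, 30, 100):
    edge_bits = math.comb(n, 2)
    print(n, edge_bits, edge_bits / n)   # e.g. n=100: 4950 bits vs. 100
```

The gap grows linearly in $n$, so any such graph encoding must lose almost all of the edge information.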
We don’t have information yet on Aram’s talk. But Gil will update us on the state of his skepticism about the feasibility of largescale quantum computing. This was the subject of the 2012 debate between Aram and Gil that spanned eight posts on this blog. Here are Gil’s title and abstract:
Computational complexity, mathematical, and statistical aspects of NISQ computers.
Noisy Intermediate-Scale Quantum (NISQ) computers hold the key for important theoretical and experimental questions regarding quantum computers. In the lecture I will describe some questions about computational complexity, mathematics, and statistics which arose in my study of NISQ systems and are related to:
a) My general argument “against” quantum computers,
b) My analysis (with Yosi Rinott and Tomer Shoham) of the Google 2019 “huge quantum advantage” experiment.
IBM has expressed its own skepticism of Google’s claims, which we mentioned in our own review of the experiment last year. IBM of course also has its own quantum computing initiative.
We may hear about its state from Charlie Bennett, whose talk leads off the afternoon but has yet to be measured, along with the following talk by Isaac Chuang. Then will come Scott Aaronson. If “tomography” in his title sounds to you like a CAT scan, you could consider this a “Schrödinger’s Cat” scan. Well, we should let Scott tell it—the 2018 paper he mentions is this.
Shadow Tomography of Quantum States: Progress and Prospects.
Given an unknown quantum state $\rho$, and a known list of two-outcome measurements $E_1, \dots, E_m$, “shadow tomography” is the task of estimating the probability that each $E_i$ accepts $\rho$, by carefully measuring only a few copies of $\rho$. In 2018, I gave the first nontrivial protocol for this task. In 2019, Guy Rothblum and I exploited a new connection between gentle measurement of quantum states and the field of differential privacy, to give a protocol that requires fewer copies of $\rho$ in some cases, and has the additional advantage of being online (that is, the measurements are processed one at a time). Huge challenges remain in making shadow tomography practical with near-term devices; extremely recently Huang, Kueng, and Preskill took some promising steps in that direction. I’ll survey these developments and the challenges that remain.
Then Srinivasan Arunachalam, who also works at IBM T.J. Watson in Westchester, NY, will finish the day with another talk about inferring from samples:
Sample-efficient learning of quantum many-body systems.
We study the problem of learning the Hamiltonian of a quantum many-body system given samples from its Gibbs (thermal) state. The classical analog of this problem, known as learning graphical models or Boltzmann machines, is a well-studied question in machine learning and statistics. In this work, we give the first sample-efficient algorithm for the quantum Hamiltonian learning problem. In particular, we prove that polynomially many samples in the number of particles (qudits) are necessary and sufficient for learning the parameters of a spatially local Hamiltonian in norm. Our main contribution is in establishing the strong convexity of the log-partition function of quantum many-body systems, which along with the maximum entropy estimation yields our sample-efficient algorithm. Our work paves the way toward a more rigorous application of machine learning techniques to quantum many-body problems.
The workshop has lunch breaks as usual but they are not lunch breaks. Lunch is served at IBM T.J. Watson, and I (Ken) can vouch from times I have been hosted there by Jon Lenchner that the food is wonderful, but attendees will not be on hand to partake. I have known Jon since we were part of the New York area chess scene in the 1970s. Among Jon’s activities in the past five years have been directing IBM’s research center in Nairobi, Kenya, and helping the Toronto Raptors assemble a championship basketball team via player analytics. The latter is not technically related to my chess analytics, but we have greater shared interests in ideas for lower bounds on uniform complexity classes.
Instead of physical lunch, the lunch breaks are moderated panel discussions. So you can bring your own lunch while watching and listening via IBM’s WebEx or other portal. Maybe there will be time for remote attendees to ask questions—though not with your mouth full, as our mothers would say. A few years ago, my department began running catered Friday lunch forums under the name “UpBeat,” but those too are now remote and B.Y.O.L. As for the other thing the “L.” can stand for besides “lunch,” we can mention that the pandemic has rendered onsite restrictions moot.
There will also be Q & A and panel discussion sessions for 45 minutes after each day’s last talk.
We wish we could attend all the talks. We imagine they will be available afterward as recordings, but especially when there is live Q & A it is nice to experience them in the moment as at a physically intimate workshop. Rather than list all the speakers and titles here—you can find them on the abstracts page, after all—we will just highlight a few that catch our eye:
Fun facts about polynomials, and applications to coding theory: Mary Wootters.
Here are some (fun?) facts about polynomials you probably already know. First, given any three points, you can find a parabola through them. Second, if you look at any vertical “slice” of a paraboloid, you get a parabola. These facts, while simple, turn out to be extremely useful in applications! For example, these facts are behind the efficacy of classical Reed-Solomon and Reed-Muller codes, fundamental tools for communication and storage. But this talk is not about those facts — it’s about a few related facts that you might not know. Given less than three points’ worth of information, what can you learn about a parabola going through those points? Are there things other than paraboloids that you can “slice” and always get parabolas? In this talk, I will tell you some (fun!) facts that answer these questions, and discuss applications to error-correcting codes.
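As an aside from us, not part of the abstract: the first fun fact can be checked in a few lines. The sketch below, which assumes points with distinct x-coordinates and uses exact rational arithmetic, recovers the unique parabola through three points by Lagrange interpolation, the same mechanism underlying Reed-Solomon decoding:

```python
from fractions import Fraction

def parabola_through(points):
    """Return coefficients (a, b, c) of the unique parabola
    y = a*x^2 + b*x + c through three points with distinct x's,
    via Lagrange interpolation."""
    a = b = c = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        # The two x-coordinates other than xi
        others = [Fraction(xj) for j, (xj, _) in enumerate(points) if j != i]
        denom = (Fraction(xi) - others[0]) * (Fraction(xi) - others[1])
        w = Fraction(yi) / denom
        # Expand w * (x - x_j)(x - x_k) = w*x^2 - w*(x_j + x_k)*x + w*x_j*x_k
        a += w
        b -= w * (others[0] + others[1])
        c += w * others[0] * others[1]
    return a, b, c

# Three points lying on y = 2x^2 - 3x + 1
print(parabola_through([(0, 1), (1, 0), (2, 3)]))  # coefficients 2, -3, 1
```

Two points, by contrast, leave a one-parameter family of parabolas, which is exactly the “less than three points’ worth of information” setting the abstract teases.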
Punch Cards and the Difference Engine: William Gibson and Bruce Sterling.
We discuss the power of the card concept. By storing a finite amount of data on a “punch card” we can structure data handling to be straightforward and safe. The machine we plan will be thousands of times slower than even the first vacuum-tube computers were. But our novel use of steam and punch cards does have merits: cards are physical and help solve security and also privacy issues.
Reasoning about Generalization via Conditional Mutual Information: Lydia Zakynthinou.
We provide a framework for studying the generalization properties of machine learning algorithms, which ties together existing approaches, using the unifying language of information theory. We introduce a new notion based on Conditional Mutual Information (CMI) which quantifies how well the input (i.e., the training data) can be recognized given the output (i.e., the trained model) of the learning algorithm. Bounds on CMI can be obtained from several methods, including VC dimension, compression schemes, and differential privacy, and bounded CMI implies various forms of generalization guarantees. In this talk, I will introduce CMI, show how to obtain bounds on CMI from existing methods and generalization bounds from CMI, and discuss the capabilities of our framework. Joint work with Thomas Steinke.
Rebooting Mathematics: Doron Zeilberger.
Mathematics is what it is today due to the accidental fact that it was developed before the invention of the computer, and, with a few exceptions, continues in the same vein, by inertia. It is time to start all over, remembering that math is, or at least should be, the math that can be handled by computers, keeping in mind that they are both discrete and finite.
Oh wait, one of these talks is both less novel and more novel than the others. Can a virtual workshop have a virtual virtual talk? The cards were arguably “the” informational lens for almost 100 years.
What have been your experiences with top-of-the-line virtual workshops? Our hats are off to the organizers of this one, and we are looking forward to it.