
Karen Uhlenbeck is a mathematician who has won many awards over her career and has just been announced as the winner of the 2019 Abel Prize.

Today Ken and I want to explain a tiny bit about what Uhlenbeck did.

The Abel Prize citation says that Uhlenbeck won for

“pioneering achievements in geometric partial differential equations, gauge theory, and integrable systems, and for the fundamental impact of her work on analysis, geometry and mathematical physics.”

A story in *Quanta* and another in *Scientific American* are among those with readable summaries of the general nature of this work. The latter describes Uhlenbeck’s discovery with the mathematician Jonathan Sacks of a phenomenon called *bubbling* as follows:

Sacks and Uhlenbeck were studying ‘minimal surfaces,’ the mathematical theory of how soap films arrange themselves into shapes that minimize their energy. But the theory had been marred by the appearance of points at which energy appeared to become infinitely concentrated. Uhlenbeck’s insight was to “zoom in” on those points to see that this was caused by a new bubble splitting off from the surface.

Some of the coolest comments are by Uhlenbeck’s doctoral graduate Mark Haskins in the story in the current issue of *Nature*.

Haskins says Uhlenbeck is one of those mathematicians who have ‘an innate sense of what should be true,’ even if they cannot always explain why.

The story recounts his often being baffled by answers to his questions, thinking Uhlenbeck had misheard them. But

“maybe weeks later, you would realize that you had not asked the correct question.”

Simon Donaldson wrote a piece in the current issue of *AMS Notices* that explains Uhlenbeck’s research in the Calculus of Variations. The article starts with

You can think of F as assigning a cost F(u) to a function u. The goal of the calculus of variations is to find the best u—the one that minimizes F(u)—subject to some conditions on u. This is a huge generalization of the simple minimization problems that arise in basic calculus. He then goes on to explain that in order to study the minimizers of such a functional one quickly needs to examine partial differential equations. The math gets complex and beautiful very quickly.
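To make the “cost” idea concrete, here is a sketch of our own (not from Donaldson’s article): a discretized arc-length functional that assigns a number to each candidate function on the unit interval.

```python
import math

# A discretized functional (our own illustration): assign to each
# function f on [0, 1] the "cost" L[f] = integral of sqrt(1 + f'(x)^2),
# i.e. its arc length, approximated on a grid of n cells.

def cost(f, n=10000):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        slope = (f((i + 1) * h) - f(i * h)) / h   # finite-difference f'
        total += math.sqrt(1.0 + slope * slope) * h
    return total

line = lambda x: x        # the straight line from (0,0) to (1,1)
bow  = lambda x: x * x    # a parabola with the same endpoints

assert abs(cost(line) - math.sqrt(2)) < 1e-6
assert cost(bow) > cost(line)   # the straight line has lower cost
```

Different functions get different costs; minimizing over all functions, rather than over a point, is the higher-order task the calculus of variations addresses.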

As computer scientists who like discrete structures, this is not our sweet spot. We rarely use partial derivatives in our work—well, not very often. See these two posts for an example.

To get a taste of this area, we will consider a classic variational problem coming out of these helpful online notes. It leads to integrals such as

\[ \int \sqrt{1 + f'(x)^2}\, dx. \]

Well, we take to integrals even less than partial derivatives.

We will change things up by starting with a discrete approach—as is our wont. Our given task is to prove in general that a straight line is the shortest path from the origin to a given point (a, b). We first consider polygonal paths with n line segments.

First, if n = 1 then the only option allowed is to go from (0, 0) to (a, b) in one line segment. Thus the conclusion holds trivially: the Euclidean distance \( \sqrt{a^2 + b^2} \) is the minimum length of a 1-segment path.

Now let n > 1. Let

\[ p_0 = (0,0),\; p_1,\; \dots,\; p_n = (a,b) \]

be the vertices of a series of line segments that form the shortest path from (0, 0) to (a, b). Now by induction, the minimum length of a path of up to n − 1 segments from p_1 to (a, b) is via a straight line from p_1 to (a, b). And the length of the segment from the origin to p_1 of course is |p_1|. Now the Euclidean triangle inequality says that the quantity |p_1| + |p_1 − (a, b)|, which bounds the length of this path from below, is not less than |(a, b)|. Thus we have proved it for n and the induction goes through.
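The induction can be sanity-checked numerically. This little script (our own illustration) verifies that no random polygonal path from the origin to (3, 4) beats the straight-line distance 5:

```python
import math, random

# Sanity check: no n-segment polygonal path from (0, 0) to (a, b)
# is shorter than the straight-line distance sqrt(a^2 + b^2).

random.seed(1)
a, b = 3.0, 4.0
direct = math.hypot(a, b)   # 5.0

for trial in range(1000):
    n = random.randint(1, 10)   # number of segments
    pts = [(0.0, 0.0)] + [(random.uniform(-5, 5), random.uniform(-5, 5))
                          for _ in range(n - 1)] + [(a, b)]
    length = sum(math.hypot(x2 - x1, y2 - y1)
                 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    assert length >= direct - 1e-9   # triangle inequality, iterated
```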

What we really want to do, however, is prove that the straight-line distance is the shortest length for any path, period. The path need not have any straight segments. It may go in circular arcs, continually changing direction. The arcs need not be circular per se; they could be anything.

The idea that occurs to us computer scientists is to let n go to infinity. That is, we want to consider any path as being a limit of polygonal paths. But is this really legitimate? We can certainly approximate any path by paths of n segments. But real analysis is littered with examples of complicated curves—themselves defined by limits—that defeat many intuitive expectations about continuity and limits. So how can we make such an infinitistic proof go through rigorously? This is where the calculus of variations takes over.
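One standard cautionary example is the “staircase paradox,” which we can check in a few lines (our illustration, not from our source notes): staircase paths converge pointwise to the diagonal, yet their lengths stay stuck at 2 while the diagonal has length \( \sqrt{2} \).

```python
import math

# The "staircase paradox": axis-aligned staircases from (0, 0) to (1, 1)
# converge pointwise to the diagonal, but every one has length exactly 2,
# while the diagonal has length sqrt(2).  Pointwise convergence of paths
# does not imply convergence of their lengths.

def staircase_length(n):
    # n steps, each moving right by 1/n and then up by 1/n
    return sum(2.0 / n for _ in range(n))

def max_gap(n):
    # largest distance from the n-step staircase to the diagonal y = x
    return (1.0 / n) / math.sqrt(2.0)

for n in (1, 10, 1000):
    assert abs(staircase_length(n) - 2.0) < 1e-9   # length never shrinks

assert max_gap(1000) < 0.001                       # yet the paths converge
assert abs(math.hypot(1, 1) - math.sqrt(2)) < 1e-12
```

So “take limits of polygonal approximations” needs care: chords of a curve behave well, but other approximating sequences can lie about length.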

To set up the problem for fully general paths, we could represent them as functions x(t), y(t) such that (x(0), y(0)) = (0, 0) and (x(1), y(1)) = (a, b). The length of the path is then obtained by integrating all the horizontal and vertical displacements:

\[ L = \int_0^1 \sqrt{x'(t)^2 + y'(t)^2}\, dt. \]

Wrangling this integral seems daunting enough, but the real action involving L only begins after doing so. Both the length L and the body of the integral are *functionals*—that is, functions of a function. We need to minimize L over all functions x(t), y(t). This is a higher-order task than minimizing a function at a point.

Our source simplifies the problem by assuming without loss of generality that x increases from 0 to a, giving the function as y = f(x) instead. Then the problem becomes to minimize

\[ L = \int_0^a \sqrt{1 + f'(x)^2}\, dx. \]

The body can be abstracted as a functional G(x, f, f′), where f and its derivative f′ are functions of x. Here we have G = \( \sqrt{1 + f'(x)^2} \) and the boundary conditions f(0) = 0, f(a) = b. The condition for f to minimize L was derived by Leonhard Euler and Joseph-Louis Lagrange:

\[ \frac{\partial G}{\partial f} - \frac{d}{dx}\,\frac{\partial G}{\partial f'} = 0. \]

We won’t reproduce here how our source derives this but give some interpretation. This is a kind of regularity property that f must obey in order to minimize L. To quote Donaldson’s survey:

Then the condition that u is stationary with respect to compactly supported variations of u is a second order differential equation—the Euler-Lagrange equation associated to the functional.

However you slice it, the point is that the Euler-Lagrange equation, when applied to cases like the above, is attackable. In the minimum-length path example, our source—after doing eight more equation lines of work—deduces that f′(x) must be constant. Any function f that yields this must be a straight line. The initial conditions force this to be the straight line from (0, 0) to (a, b).
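A discrete stand-in for this machinery (our own sketch, with an invented relaxation scheme, not the source’s derivation): minimize the polygonal length directly. Locally minimizing over each interior node puts it at the midpoint of its neighbors, and sweeping that relaxation drives the path to the straight line.

```python
import math

# Minimize the discrete length  sum_i sqrt(dx^2 + (y[i+1] - y[i])^2)
# over paths from (0, 0) to (1, 1) with endpoints pinned.  For each
# interior node the local minimizer is the midpoint of its neighbors
# (the summand is convex and symmetric about that point).

def path_length(ys, dx):
    return sum(math.hypot(dx, b - a) for a, b in zip(ys, ys[1:]))

n = 20
dx = 1.0 / n
ys = [i / n + math.sin(math.pi * i / n) for i in range(n + 1)]  # wavy start
ys[0], ys[-1] = 0.0, 1.0

for _ in range(5000):                     # Gauss-Seidel-style sweeps
    for i in range(1, n):
        ys[i] = 0.5 * (ys[i - 1] + ys[i + 1])

# The relaxed path has the straight-line length sqrt(2).
assert abs(path_length(ys, dx) - math.hypot(1.0, 1.0)) < 1e-9
```

The midpoint rule is the discrete shadow of f″ = 0, which is what the Euler-Lagrange equation gives for this functional.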

The point we are emphasizing is that this simple case of paths in the plane—and its abstraction via functionals that are ultimately founded on one variable x—have a ready-made minimization scheme, thanks to Euler and Lagrange. The scheme is fully general—not subject to the caveats about our simple approximation by line segments.

What happens in higher-dimensional cases? We can quote from the wonderful two-page essay accompanying the Abel Prize citation. It first notes the importance of a condition on functionals and their ambient spaces named for Richard Palais and Stephen Smale, which however fails for many cases of interest including harmonic maps.

[T]he Palais-Smale compactness condition … guarantees existence of minimizers of geometric functionals and is successful in the case of 1-dimensional domains, such as closed geodesics. Uhlenbeck realized that the condition of Palais-Smale fails in the case of surfaces due to topological reasons.

The papers with Sacks explored the roots of these breakdowns and found a way to patch them. The violation of the Palais-Smale condition allows minimizing sequences of functionals to converge with dependence on points outside the space being analyzed. But those loci are governed by a finite set of singular points within the space. This enables the calculus outside the space to be treated as a re-scaling of what goes on inside the space.

In general cases the view of the process from inside to outside can be described and analyzed as bubbles emerging from the singular locations. More than this picture and interpretation, the Sacks-Uhlenbeck papers produced a now-standard tool-set for higher-dimensional minimization of functionals. It is also another successful marriage of topology—determining the singularities—and analysis.

This work extended to more-general kinds of functionals, such as a central one of Yang-Mills theory in physics. Geometric properties of a Riemannian manifold M are expressed via the concept of a connection A, and the functional associates to A its *curvature* F_A. This is the body for the Yang-Mills functional

\[ YM(A) = \int_M |F_A|^2 \, d\mathrm{vol}. \]

There is a corresponding lifting of the Euler-Lagrange equation. This led to developments very much along the lines of the previous work with Sacks, and more besides. There was particular success analyzing cases where the manifold has dimension 4, which were soon relevant to Donaldson’s own Fields Medal-winning research on these spaces. Most particularly, Uhlenbeck, working solo, proved that these cases were immune to the “bubbling” issue—with the consequence, as related in *Quanta*, that

any finite-energy solution to the Yang-Mills equations that is well-defined in the neighborhood of a point will also extend smoothly to the point itself.

We’ve been happy to report that Uhlenbeck has won the prestigious Abel Prize. We have avoided referencing one aspect—despite giving numerous quotes verbatim—that can be appreciated in subsequent fullness here and here and in this. By so doing we’ve abided by the desire stated in the twelfth paragraph of this essay. We wonder if this is the right way to do things. What do you think?


*Facing nonexistential realities*

Neil L. is a Leprechaun. He has graced these pages before.

Today, the day before St. Patrick’s Day, we ponder universal riddles of existence. Ken, who will visit me this coming week, insisted on reporting what happened this morning.

I woke up early, walked alone into our living room, and was amazed to see Neil already here. He was looking out the window at dawn sunlight flooding over St. Patrick’s Cathedral and the many other landmarks we can see from our apartment in midtown Manhattan.

Spring has at last given an earnest of coming. The scene was transfixing, exhilarating, all the glory of nature except for the green-clad little man puffing his pipe with his back to me.

“Neil—?”

He turned and tipped his hat to me but then turned back to the window. I wondered if I had caught him in meditation. I stood back until he turned a second time. He had always visited me late on the evening before St. Patrick’s Day, or the morning of it.

“You’re here early.”

“Aye. Top o’ the morning to ye.”

“Why?”

He gave me a long stare but warmly, not hostile.

“By the living daylights in ye…”

Then I understood. The Romans have their Ides, the Irish have the 17th, but I got the day in between—when last year I was not here but in an operating room for long hours.

“Thanks” was all I could say. I had actually come out to look up a mathematical idea for a “part 2” post I’ve been struggling with ever since posting the “part 1” years ago. This pulled me back to math, then to how Neil has always tricked me and I’ve never been able to pin him down. I realized this might be my best opportunity.

“Neil, may I ask you a Big Question? Not directly personal.”

He simply said, “Aye.” No tricks yet.

I had my chance—something I’ve wanted to ask him for years but not had an opening for.

“Neil, what can you tell me about *female* Leprechauns?”

Neil had mentioned family, cousins, even siblings—I figured they had to come from somewhere. But nothing I’d read mentioned female leprechauns. I braced for silence again, but Neil’s reply was immediate and forthright:

“They comprise the most heralded subset of our people. They are included in everything we do.”

“How so?”

“Our society agreed early on that every female leprechaun should have the highest station. All female leprechauns are sent to the best schools, with personal tutoring reserved in advanced subjects. They are our superpartners in every way.”

“How do you treat them?”

“If a lady leprechaun applies for a position, she is given full consideration. Whenever a lady leprechaun gives advice, it is harkened to. None of us has ever ‘malesplained’ or demeaned a lady leprechaun in any way.”

“Are they as short—uh, tall—as you?”

“Every adult female leprechaun is between one cubit and one-and-a-half in stature, like us menfolk. We don’t have quite the variation of your human species.”

“I would surely like to meet one.”

Neil heaved a sigh.

“Ye know the trepidation with us and your womenfolk.” Indeed, I recalled Neil’s story of what was evidently his own harrowing encounter with the wife of William Hamilton. “It is symmetrical, I’m afraid. No female leprechaun has ventured into your world.”

I had figured that.

“But can you give me an *example* of a female leprechaun? Someone who has done something notable.”

Neil puffed on his pipe. I knew I had him cornered and moved in.

“Neil, is everything you just said about female leprechauns *true*?”

To my surprise, he answered quickly.

“Aye. As in mathematics, each of my statements was perfectly true.”

Oh. Then I realized how empty domains are treated in mathematical logic. I looked at him sharply.

“Neil, there are no female leprechauns. All of your statements have been vacuous, bull-.”

“That’s not right or fair. That which I told ye have been important values of our society from the beginning. Many studies of female leprechauns have established their natures precisely. We have devoted vast resources to the quest for female leprechauns. Readiness—how we would integrate them—this is enshrined in all of our laws. Even your late Senator, Birch Bayh, was informed by us at Notre Dame.”

Neil’s sincerity was evident. I still felt bamboozled. He carried on.

“Ye have whole brilliant careers studying physical objects that may not exist. The analysis of them is still important in mathematics and other scientific applications. And in mathematics itself—”

I was actually relieved to turn the subject back toward mathematics, as Neil and I usually talked about.

“—your most storied advance of the past quarter century was achieved by ten years’ concerted study of the Frey-Hellegouarch curve. Which by Fermat’s Last Theorem does not exist.”

I could not argue with that. This set me pondering. Whether mathematical objects *exist* in reality defines the debate over Platonism. But what, then, is the reality of *nonexistent* mathematical objects?

Nonexistent objects still follow rules and aid deductions about objects that do exist. So, they too exist? Moreover, we often cannot prove that those objects do not exist—for all we know they may be just as real. Polynomial-time algorithms for satisfiability and many other problems—those are bread and butter in theory. I wanted Neil to tell me more about some of them.
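For programming-minded readers, an aside of our own: Neil’s dodge is exactly how universal quantification over an empty domain behaves, and Python’s all() and any() agree with the logic texts (the attribute name below is our invention).

```python
# Neil's claims are universally quantified over the set of female
# leprechauns.  A "for all" over an empty domain is vacuously true,
# while the matching "there exists" is false.

female_leprechauns = []      # the domain, as Neil finally admitted

well_treated = all(l["highest_station"] for l in female_leprechauns)
any_examples = any(True for l in female_leprechauns)

assert well_treated is True   # every claim holds... vacuously
assert any_examples is False  # ...but no example can be produced
```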

“Neil, have your folk worked on the projective plane of order 10?”

“Aye. Not only did we long ago have all the results of this paper on them, our engineers recommended it as the thoroughfare pattern for new towns.”

“You have built them?”

“Indeed—in two new districts around Carlingford back home.”

“But they don’t exist.”

“By our laws and treaties they exist until your people prove they don’t. Then they have a 50-year sunset. Which means the towns have 20 years left.”

“That’s sad.”

“It’s actually a much better system than the Scots have. Their villages of this type such as Brigadoon come back every 100 years just for one day. But Brexit may put paid to our two towns sooner. We are protected under E.U. law, but those towns are just over the Northern Ireland border.”


“How about odd perfect numbers?”

“Nay. Ye know the rules. We can nay tell that which your folk do not know, unless there is nothing else ye learn that ye could not learn without us.”

I pondered: what could I ask about computational complexity that Neil could respond to in this kind of zero-knowledge way? But Neil cut in.

“It will be much more fruitful to discuss objects that ye *know* do not exist—and yet ye *use* them anyway.”

I thought, which could he mean? I recalled the story we told about a student getting his degree for a thesis with many new results on a strong kind of Lipschitz/Hölder function even though an examiner observed that non-constant ones do not exist. Ken heard the same story as an undergraduate, with the detail that it was an undergraduate senior thesis, not a PhD. The Dirac delta function? Well, that definitely exists as a measure. I forgot an obvious example until Neil said it.

“There is the field with one element: F_1.”

Of course. We posted about it.

“It is an active research subject, especially in the past dozen years. Your Fields medalist Alain Connes has co-written a whole series of papers on it. Yuri Manin also. Your blog friend Henry Cohn got a paper on it into the *Monthly*—think of the youngsters… Edifices are even being built upon it.”

“Can you give some less-obvious ones for our readers?”

“There is the uniform distribution on the natural numbers. This paper gives variations that do exist, but it makes the need and use of the original concept clear.”

Ken had in fact once finished a lecture on Kolmogorov complexity when short on time by appealing to it.

“There are fields F whose algebraic closures have degrees 3 and 5 over F. Those are highly useful.”

I knew by a theorem of Emil Artin and Otto Schreier that they cannot exist. But it struck me as natural that they *should* exist. I wondered how much one would need to tweak the rules of field theory to make them possible.

“Then there is the free complete lattice on 3 generators.”

Wait—I thought that could exist if one loosened up set theory. So I asked:

“Neil, do some of these objects exist in alternative worlds known to leprechauns in which the rules of mathematics are crazy, not like the real world?”

“What makes you think ye don’t live in one of those other worlds?”

Before I could react, Neil picked up a solid glass sphere art object that belonged to Kathryn. He pulled out a diamond-encrusted knife and slashed it as green light flashed in every direction until he stopped. Then from his hands he gave me two glass spheres of the same size.

“For the happy couple—la le Padraig suna ye-uv!”

And he was gone.

What is your view on the nature of mathematical reality? Do nonexistent mathematical objects exist? What are your favorite examples?


*Bill and Clyde’s new book*

Bill Gasarch and Clyde Kruskal are colleagues in Computer Science at the University of Maryland. They have just seen the publication of their book *Problems With a Point*.

Today Dick and I congratulate them on the book and give a brief overview of it.

The tandem of Bill and Clyde was featured before on this blog in connection with a series of papers with James Glenn on sets of numbers that have no arithmetical progressions of length 3.

Can a “series” have only two members? We could add a third paper to the series, but it is by Glenn and Bill with Richard Beigel rather than Clyde and is on the different topic of multi-party communication complexity. But Clyde gets into the act since it is the topic of Chapter 19 of the book. That chapter analyzes a game form of problems that go back to a 1983 paper by Dick with Ashok Chandra and Merrick Furst. Does this mean Dick should get some royalties? Note that this last link is to Bill’s website. And ‘vdw’ in this link seems to refer to van der Waerden’s theorem, which is a pillar of both Ramsey theory and number theory, which in turn receive much attention in the book.

My last paragraph flits through divergent questions, but they are representative of things that we working mathematical computing theorists think about every day—and of various kinds of content in the book. Of course, Bill partners on Lance Fortnow’s prominent blog and so has shared all kinds of these musings. The book, which incorporates numerous posts by Bill but is not a “blog book” *per se*, furnishes contexts for them. Let’s look at the book.

The book has three main parts, titled:

- Stories With a Point.
- Problems With a Point.
- Theorems With a Point.

All the chapters except three are prefaced with a section titled “Point”—or “The Point” or “Points”—after one sentence on prior knowledge needed, which often reads, “None.”

The first part begins with a chapter titled, “When Terms From Math are Used By Civilians,” one of several with social discussion. The second is about how human aspects of sports upset statistical regularities one might expect to see. For example, many more baseball hitters have had season batting averages of .300 than .299 or .301, evidently because .300 is a “goal number” everyone strives to meet. There are numerous things in the book I didn’t know.

The next chapter largely reproduces this record-long post on the blog but adds much extra commentary. Then there are chapters on what makes a mathematical function worth defining, on how mathematical objects are named, on Bill’s visit to a “Gathering for [Martin] Gardner” conference, on coloring the plane, two on imagined applications of Ramsey Theory to medieval history, and one on the methodological and social impact of having theorems that represent barriers to ways of trying to prove other theorems.

This last chapter of Part I draws on several of Bill’s posts. It goes most to the heart of what we all in computational complexity grapple with daily and benefits from having more context: As we just mused in this blog’s 10-year anniversary post, the major questions in complexity are not only as open as when we started, but as open as when the field started 50-plus years ago. There has barely been any discernible direct progress, and this is not only a concern for entering students but a retardant on everyone—about which we’ve given exhortation.

There are instead barrier theorems saying *why* the main questions will not be resolvable by the very techniques that were used to build the field. Of three broad categories of barriers, “oracle results” de-fang all the methods typically taught in intro graduate complexity courses. Those methods meter computations at inputs and outputs but not how they “progress” inside toward a solution. Oracle languages supply blasts of external information that either make progress trivially quick or allow conditioning the terms of the complexity-class relation under analysis to information in the oracle that is tailored to obfuscate. Oracles of the former kind usually collapse complexity classes together, such as any polynomial-space complete oracle A making P^A = NP^A, whereas oracles of the latter kind drive them apart (so P^B ≠ NP^B), but often for reasons that frustrate as much as enlighten. When contemplating a new complexity class relation it is incumbent to frame a “relativized” version of it and see if some oracles can make it true and others false. Such oracle constructions were initially easy to come by—“fast food” for research—but their ultimate besideness-of-the-point is reflected in this early post by Dick.

Next after oracles is the “Natural Proofs” barrier, which basically cuts against all the techniques taught in advanced graduate complexity courses unless they can somehow be weirdly narrow or intensely complex in their conceptual framing. The attitude on those is captured by the title of another early post by Dick titled, “Who’s Afraid of Natural Proofs?” The third barrier is algebrization, which cuts against efforts to escape from oracles.

Amid all this environment of obstruction, what of consequence can we prove? Well, the chapter on this has by far the highest concentration of Bill’s trademark all-capital letters and exclamation points. The initial “Point” section is titled, “Rant.” (The chapter also includes a page on the attempts to prove Euclid’s parallel postulate, which might be called the original “problem with a point.”)

Much of Part II involves the kind of problems that were used or considered for high-school mathematics competitions. By definition those are not research problems but they often point toward frontiers by representing accessible cases of them. The appealing ones blend equal parts research relevance, amusement value, and reward of technical command. Often they require mathematical “street smarts” and a repertoire of useful tricks and proof patterns. Practicing researchers will find this aspect of the book most valuable. The problems are mainly on the number-theory side of recreational mathematics, plus geometric and coloring problems. Egyptian fractions, prime-factor puzzles, sums over digits, summation formulas, and of course Ramsey theory all make their appearance. Here is one problem I mention partly to correct a small content error, where the correction seems to make the problem more interesting. Say that holds if there are integers such that

A general principle discovered by Joseph Sylvester gives cases such as being witnessed by

How can it and related ideas be extended to other cases? The particular problem is given to find the least such that holds—and to find a witnessing sum. The error is an assertion that “by writing as a sum of ‘s,” but then the reciprocals add up to not . So is there any bound on ? That is rather fun to think about.

This problem’s chapter is made from a short post with an enormous comments section. There is also a chapter on the boundary between a fair problem and a “trick question” that much extends an April Fool’s post on the blog.

Part III continues the nature of the problem content with emphasis on the worth of proofs of theorems. One illustration involving perfect numbers and sums of cubes notes that the proof loses value because it also applies to certain non-perfect numbers. Two things *en passant*: First, Lemma 21.1 in this chapter needs qualification that are relatively prime and is prime. Second, this prompts me to note via Gil Kalai the discovery by Tim Browning that the number 33 is a sum of three cubes:

\[ 33 = 8866128975287528^3 + (-8778405442862239)^3 + (-2736111468807040)^3. \]

Gil points to a Quora post by Alon Amit. There is only the bare equation on Browning’s homepage, and nothing is known about his methods or how much he used computer search. The theme of proofs by computer runs through several chapters of Bill and Clyde’s book. It is also current on Scott Aaronson’s blog—we would be remiss if we did not mention Scott in this post—and more widely.

The penultimate chapter covers the beautiful theory of rectangle-free colorings of grids. We mentioned Bill’s involvement with this Ramsey-style problem in a post on what happened to be this blog’s third anniversary. We connected it to Kolmogorov complexity, which together with uncomputability is the subject of the last chapter. This and other chapters I’ve mentioned exemplify how the research ambit connects traditional mathematics with the terms and concerns of computing theory. How to work in this ambit may be the greatest value of the book for students. In all I found the book light, lighthearted, and enlightening.

The big open problem with the book is how it might help gear us up to tackle the big open problems.



Jimmy Wales is the co-founder of Wikipedia. Of course this is the wonderful online non-profit encyclopedia we all know and love.

Today I want to talk about using the web to search for math information.

I do a huge amount of research on the web. Years ago I would spend hours and hours each week in the library. I looked for books and articles on various math issues. No more. Today it’s all web-based resources. What has happened is that primary sources—research papers—have become as easily browsable as Wikipedia. Many of them, anyway.

Wikipedia often does not include proofs of mathematical theorems—those are linked to primary papers or authoritative surveys or books. There is a website Proof Wiki which tries to bridge the gap without the reader needing to follow links into papers. Right now they are featuring Euclid’s Theorem on the infinitude of primes—which we have posted about recently and before. They have style guides for how the reasoning is laid out.

There are also many style guides around the web for writing good mathematics and computer science papers. But the guides seem to stop short of the level of the phrases that irk me. When those phrases show up in the primary sources, there’s no fallback. Let’s take a look.

Here are a few top ones. I omit the well known ones, the ones that are obvious, and some that are in preparation for a future post—just kidding.

*It is well known*. Not by me. The reason I am searching for papers on this subject is that I do not know the basics of the area. If you must say this phrase please add a possible reference. That would be extremely helpful.

*Clear from the proof of the theorem*. Not by me. This means that the author did not state the theorem in the greatest generality possible. We all do this, but it would help if it were avoided.

*It is easy to see*. Similar to the last item. Ken vividly remembers a seminar in the early 1980s by Peter Neumann at Oxford that showed excerpts from a French mathematician in which the phrase “Il est aisé de voir” appeared often. The French phrase even reads just like the English one. But that mathematician had a reasonable excuse. He was Évariste Galois, and he had a duel early the next morning.

*Proof omitted*. Please no.

*Easy calculation*. Not by me. The reason it usually is avoided is that it is too difficult a computation.

*This case is the same as the previous case*. Not always. This has been the source of errors for me and others over the years. Good place to check the argument. Maybe a suggestion that there should be a Lemma that covers both cases.

*In our paper in preparation*. Please no. I cannot read a paper that does not exist yet.

Finally the worst in my opinion.

*It costs $37.95*. Oh no. Many times we find papers that are behind a pay-wall. I always hate this. The authors of course wish to make their work available to as many as possible. But the reality is that often papers are protected. Thanks to our friends at Wikipedia and the arXiv, this is not the case for lots of material.

But I have wondered: who ever pays the crazy amount of $37.95? Does anyone ever pay that? Is it equivalent to saying: this paper is not available?

Ken has one example that has driven him crazy for years and again these past two weeks. Many of you have probably used it often.

*This procedure runs in polynomial time.* Excuse me, *what* polynomial time? At least tell us the best exponent you know…

A related matter is attending to “edge cases” of theorems. Sometimes the edge cases are meant to be excluded. For example, “Let ” excludes ; maybe nothing more needs to be said. But in other cases it is not so clear. A theorem may suggest limits and it is nice to say what happens if one tries to take those limits.

What are your favorite phrases that drive you crazy?


Giuseppe Vitali was the mathematician who famously used the Axiom of Choice, in 1905, to give the first example of a non-measurable subset of the real numbers.

Today I want to discuss another of his results that is a powerful tool.

The existence of a set that cannot properly be assigned a measure was a surprise at the time, and still is a surprise. It is a wonderful example of the power of the Axiom of Choice. See this for details.

We are interested in another of his results that is more a theorem about coverings. It is the Vitali covering theorem—see this. The theorem shows that a certain type of covering—ah, we will explain the theorem in a moment.

The power of this theorem is that it can be used to construct various objects in analysis. There are now many applications of this theorem. It is a powerful tool that can be used to prove many nice results. I do not know of any—many?—applications of the existence of a non-measurable set. Do you know any?

Let’s look at an application of the Vitali theorem that may be new. But in any case it may help explain what the Vitali theorem is all about.

Suppose that f maps A to B. We can make the map surjective if we restrict B to be equal to f(A), the image of f. It is not so simple to make the map injective, but we can in general do that also.

Theorem 1. Let f be a surjective function from A to B. Then there is a subset S of A so that f is injective from S to B.

*Proof:* For each y in B select one x from the set f^{-1}(y) and place it into S. Recall f^{-1}(y) is the set of x so that f(x) = y. This of course uses the Axiom of Choice to make the choices of which x to choose. Then clearly S is the required set.

The difficulty with this trivial theorem is that cannot be controlled easily if it is constructed via the Axiom of Choice. It could be a very complicated set. Our goal is to see how well we can control if we assume that the mapping is smooth.

How can we do better? The answer is quite a bit better if we assume that is a “nice” function. We give up surjectivity onto but only by a null set.

Theorem 2 Suppose that is a surjective smooth map from to where and are open subsets of . Also suppose that locally is invertible. Then there is a subset of so that

- The complement of is a null set.
- The map is injective from to .

That is, for all distinct points and in , . Moreover the map from to is smooth.

How can we prove this theorem? An obvious idea is to do the following. Pick an open interval in so that for an open set in and so that is injective from to . Setting to clearly works: the map is injective on . This is far from the large set that we wish to have, but it is a start. The intuition is to select another open interval that is disjoint from so that again is injective from to . We can then add to our .

We can continue in this way and collect many open sets that we add to . Can we arrange that the union of these sets yields a so that is most of ? In general the answer is no. Suppose that the intervals are the following:

for . Roughly we can only get about half of the space that the intervals cover and keep the chosen intervals disjoint. If we select then we cannot select since

Vitali’s theorem comes to the rescue. It allows us to avoid this problem, by insisting that the intervals have an additional property.

The trick is to use a refinement of a set cover that allows a disjoint cover to exist for almost all of the target set. The next definition is critical to the Vitali covering theorem.

Definition 3 Let be a subset of . Let be intervals over in some index set . We say these intervals are a *cover* of provided is a subset of the union of all the intervals. Say the intervals are also a *Vitali* cover of provided for all points in and all , there is an interval that contains and .

The Vitali theorem is the following:

Theorem 4 Let be a subset of . Let be intervals for in some index set . Assume that the family is a Vitali cover of . Then there is a countable subfamily of disjoint intervals in the family so that they cover all of except for possibly a null set.
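The selection idea behind such covering results can be seen in a finite toy version: greedily choose the longest interval disjoint from those already chosen; then dilating each chosen interval by a factor of three covers the whole union. This is only a sketch of the covering-lemma mechanism, not the full theorem, and the interval family below is made up:

```python
def greedy_disjoint(intervals):
    """Pick intervals longest-first, keeping only those disjoint from prior picks."""
    chosen = []
    for iv in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        if all(iv[1] <= a or iv[0] >= b for (a, b) in chosen):
            chosen.append(iv)
    return chosen

def triple(iv):
    """Dilate an interval by a factor of 3 about its center."""
    a, b = iv
    c, r = (a + b) / 2, (b - a) / 2
    return (c - 3 * r, c + 3 * r)

family = [(0, 1), (0.5, 1.5), (1.4, 1.6), (3, 4), (3.2, 5)]
picks = greedy_disjoint(family)
# every interval meets some pick at least as long, so it lies in that pick tripled
assert all(any(t[0] <= a and b <= t[1] for t in map(triple, picks))
           for (a, b) in family)
```

The Vitali condition plays the same role in the infinite setting: it guarantees that arbitrarily small intervals are always available around every uncovered point, so the losses from discarding overlaps shrink to a null set.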

The Vitali theorem can be extended to any finite dimensional space . Then intervals become disks and so on.

Do you see how to prove Theorem 2 from Vitali’s theorem? The insight is that one can now set up a Vitali covering of the space .

Cropped from Device Plus source |

Tetsuya Miyamoto is a mathematics teacher who divides his time between Tokyo and Manhattan. He is known for creating in 2004 the popular KenKen puzzle, which the New York Times started running ten years ago. As with its sister puzzles Sudoku and Kakuro, unlimited-size versions of it are -complete.

Today we observe the 10th anniversary of this blog and ask what progress has been made on the question.

The is a question about *asymptotic* complexity. From time to time we have tried to raise corresponding questions about *concrete* complexity that might yield more progress. What catches our eye about the KenKen puzzles is that their generation is a full-blown application within concrete complexity. The NYT’s KenKen puzzles are all generated using software by David Levy that can tailor their hardness. Quoting the NYT anniversary article by Will Shortz:

[Levy’s] program knows every possible method for solving a KenKen, which he has rated in difficulty from easy to hard. Thus, when a KenKen has been made, the computer knows exactly how hard it is.

This seems to say there is a hardness measure that is objective—quite apart from the idea of having human testers try the puzzle and say how hard they found it to be. We surmise that it is lower for instances that have more forced plays at the start. We wonder whether Levy’s criteria can be generalized.
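One easy piece of this picture can be made concrete: *verifying* a completed KenKen grid is simple, which is the easy half of the completeness statement for unlimited sizes. Here is a minimal checker sketch; the grid and cages are invented for illustration, not from Levy's software:

```python
from math import prod

def check_kenken(grid, cages):
    n = len(grid)
    # rows and columns must each be a permutation of 1..n (Latin-square condition)
    ok = all(sorted(row) == list(range(1, n + 1)) for row in grid)
    ok &= all(sorted(col) == list(range(1, n + 1)) for col in zip(*grid))
    # each cage must hit its arithmetic target
    for op, target, cells in cages:
        vals = [grid[r][c] for (r, c) in cells]
        if op == '+':
            ok &= sum(vals) == target
        elif op == '*':
            ok &= prod(vals) == target
        elif op == '-':
            ok &= abs(vals[0] - vals[1]) == target
        elif op == '/':
            ok &= max(vals) == target * min(vals)
    return bool(ok)

grid = [[1, 2, 3],
        [2, 3, 1],
        [3, 1, 2]]
cages = [('+', 5, [(0, 0), (0, 1), (1, 0)]),   # 1 + 2 + 2
         ('*', 9, [(0, 2), (1, 1), (1, 2)]),   # 3 * 3 * 1
         ('-', 2, [(2, 0), (2, 1)]),           # |3 - 1|
         ('+', 2, [(2, 2)])]                   # single-cell cage
assert check_kenken(grid, cages)
```

The hardness all lives on the solving side; rating how hard the *search* is, as Levy's program does, is the interesting part.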

Incidentally, this is the same Levy who won a challenge chess match in 1978 against the computer Chess 4.7 to complete his win of a famous ten-year $1,000+ bet. He lost $1,000 back when he was defeated by Deep Thought in 1989. He later became president of the International Computer Games Association (ICGA), whose *Journal* published a nice paper on the -completeness of the aforementioned puzzles and many others.

GLL’s first post, on 12 Feb. 2009, featured Stephen Rudich and his work on the “Natural Proofs.” Two other posts that day covered other aspects of why the question is hard. Our question, dear readers, is:

Has anything happened in the past ten years to make any part of those posts out-of-date in the slightest way?

We won’t claim any such progress, though we have tried to stir ideas. In the meantime, we have written 806 other posts:

- Some have featured our own work and ideas (considering Ken’s chess research in a separate vein);
- some have featured others’ direct attempts at breakthrough lower and upper bounds (with a few successes);
- many have featured other kinds of results by others;
- many have pulled “idea nuggets” from the past;
- many have been humor and social commentary.

To date, we’ve had 18,575 comments plus trackbacks on these posts and just over 2.1 million views. We are less able to quantify impacts, beyond occasionally seeing citations of articles on the blog as sources. We try for precision as well as readability and are grateful for reader comments with fixes when we slip up on the former.

There continue to be claims of proofs that and some that . While these proofs do not seem to be correct, there is something that we wish to remark about them. Many argue as follows:

There is some problem, say , that seems to require a search of exponentially many objects. Then the proof states that any algorithm for must actually look at all or most of the exponentially many objects. This of course is where the proof is incomplete.

There is some sense to these proofs. They seem related to the oracle proofs that for example show that for some oracle set it is the case that

We have discussed these types of proofs before—we even said that we did not like them.

The trouble with these rigorous results is that they change vs in a central manner, and this seems to make the results much less interesting. Roughly here is how they argue: Imagine that for each we either put a string of length into or we do not. The point is that if we do this in an *unpredictable* manner then a polynomial time machine will not be able to decide whether for there is or is not a string of length in . But a nondeterministic machine will just use its power and guess. This shows, essentially, that

is true.

There is some relationship to many attempts to show . The proofs often argue that one must look at all the objects. The counterpart here is that a polynomial time machine will not have enough time to check the strings of length to see if they are in . But this works in the oracle case because we allow the rule that decides whether or not a string is in to be very complicated. In the real world, in the world where we study the real question, we cannot assume that -complete problems use a complicated rule. *That is precisely what we are trying to prove*.

What can we say? Mostly the big open questions remain. We still have no non-linear lower bounds on circuit complexity and no progress of any definite kind on . What do you think?

What is commonly hailed as one of the two biggest results in our field last year was a positive solution to what is intuitively a slightly weaker form of the Unique Games Conjecture (UGC). For UGC we can refer you to Wikipedia’s article:

The note [2] is in turn a reference to a 2010 post here. The new paper proves hardness for the relaxed situation where, roughly speaking, a trial assignment to a node in a constraint graph limits the other node on any connecting edge to at most two possible values, rather than a unique value as in UGC. This relaxation retains many properties that had caused disbelief in the original UGC, yet it was proved—in that sense a big deal.

Nevertheless we note that UGC, at its core, is just asserting that for arbitrarily small , with our power to make other parameter(s) as large as desired, we can execute an -hardness proof. We have been executing -hardness proofs for almost fifty years. That is something we in the field have proven good at. True, these hardness results become lower bound proofs if and when is proved, and true, we have been as vocal as any on the standpoint that significant lower bounds will come from constructions that are usually thought of as being for upper bounds. But the new proof from a year ago doesn’t feel like that. We invite readers to tell us connections from UGC to the possibility of actually constructing lower bounds.

We at GLL thank you all for your help and support these ten years. Ken and I plan to continue doing what we have done in the past. Plan on a visit from our special friend on St. Patrick’s day, for example. Thanks again and let us know how we are doing.

[fixed date of first GLL post]


*Solving a type of Fermat Equation*

Leo Moser was a mathematician who worked on a very varied set of problems. He for example raised a question about “worms,” and invented a notation for huge numbers.

Today I want to talk about one of his results with a very short proof.

No, it is not about worms. That is a question in discrete geometry that is still open, I believe: “What is the region of smallest area which can accommodate every planar arc of length one?” The region must be able to hold the arc inside, but the curve can be moved and rotated to allow it to fit. A disk of diameter works and has area about . It is possible to do much better and get around .

See this paper for some additional details.

No, it is not about a conjecture of Paul Erdős. See this for a great paper on this result:

Theorem 1 Suppose that . Then any solution in integers with must have

Erdős conjectured there are no solutions at all. It is easy to check that for the unique solution is a bit smaller:

Yes, it is about the solution to a natural family of Diophantine equations. This result of Moser comes from an old paper of his. The result can be found on the wonderful blog called cut-the-knot written by Alexander Bogomolny.

The question considered by Moser is simple to state:

Consider the equation over the integers where are fixed values that are relatively prime. Show that there are infinitely many integer solutions.

The surprise, to me, is that this equation always has integer solutions. I thought about it for a bit and had no idea how to even start.

The solution is as follows. The initial insight is that the restriction on the exponents implies that there are integers so that .

Wait a minute. We must be careful about what we mean by “the values of are relatively prime.” We need more than that the greatest common divisor (GCD) of is . We need that and are relatively prime. Note that have GCD equal to but no matter which of the triple is “” we cannot find the needed :

I thank Subrahmanyam Kalyanasundaram for catching this.

The next idea is not to look for a single set of solutions but rather to find a parametrized solution. That is, try to find expressions for that depend on some variables so that for all the equation is satisfied.

Then set

Note as vary over integers the values of and vary over integers too. The claim is that this is a parameterization of the equation. Let’s see why. We need to figure out what is equal to. It looks a bit nasty but it is not. Let . Then

So is

Which magically is

Thus setting

implies that

Very neat. By the way we do need to note that as and run through integers the values of and and vary enough to get an infinite number of solutions. A simple growth argument shows that this is true.
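Assuming the equation has the shape x^a + y^b = z^c with gcd(ab, c) = 1, the parametrization can be sanity-checked numerically. The variable names below (r, s, m, u, v) are mine, a reconstruction of the argument above rather than Moser's own notation:

```python
from math import gcd

def solve(a, b, c, u, v):
    """One parametrized solution of x^a + y^b = z^c, assuming gcd(ab, c) = 1."""
    assert gcd(a * b, c) == 1
    # coprimality gives r, s with a*b*r + 1 = c*s
    r = next(r for r in range(1, c + 1) if (a * b * r + 1) % c == 0)
    s = (a * b * r + 1) // c
    m = u**a + v**b
    # x^a + y^b = m^(a*b*r) * (u^a + v^b) = m^(a*b*r + 1) = m^(c*s) = z^c
    return u * m**(b * r), v * m**(a * r), m**s

for (a, b, c) in [(2, 3, 5), (3, 4, 5), (2, 5, 9)]:
    for (u, v) in [(1, 1), (2, 3), (5, 4)]:
        x, y, z = solve(a, b, c, u, v)
        assert x**a + y**b == z**c
```

As the parameters u and v run through the integers, the resulting solutions grow without bound, which is the growth argument mentioned above.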

The key trick was to **not** use the standard idea of applying the binomial theorem to expand

My algebra DNA suggests that expanding such an expression is often a good idea. Here it would lead to a mess. This is a case where using the binomial expansion does not work.

I really like Moser’s clever solution to the Diophantine equation

Note that it must fail when for by the famous solution to the original Fermat equation.

Amazon India source |

Paul Allison is an emeritus professor of sociology at the University of Pennsylvania and the founder and president of the company Statistical Horizons. It provides short courses and seminars for statistical training.

Today we have a short seminar on statistics and horizons of effectiveness.

Our first topic is about citations. Did we say citations? What are we in research more interested in than citations? Allison co-wrote a paper on a “law” claimed by Alfred Lotka about how the number of citations behaves. Full details in a moment, but two upshots are:

- Over half of the papers are contributed by a few highly prolific authors.
- One-shot authors are roughly of the population but account for only a tiny proportion of the literature.

Allison co-wrote his paper, “Lotka’s Law: A Problem in Its Interpretation and Application,” with Derek de Solla Price, Belver Griffith, Michael Moravcsik, and John Stewart in 1976. Lotka’s law, which is related to George Zipf’s famous law, alleges that over any time period in any scientific or literary field, the number of authors with contributions obeys

where is independent of . This suggests a maximum of on the range of , since higher give , but there is also a probabilistic interpretation: The law says that the total number of papers at is , and that gives a positive constant expectation even when varies as . Both cases yield that out of the total number of papers, which has , over half of them are contributed by a vanishing percentage of highly prolific authors. Meanwhile, one-shot authors are roughly of the population but account for only a proportion of the literature.

However, the paper also remarks on a third case, namely making a fixed constant—since human time is finite in any field. This puts a sharper *horizon* on Lotka’s Law and changes the inferences made as the horizon is approached. The paper shows how the factor intrudes on other inferences they would like to draw, even between the former two cases. And never mind more-recent evidence of breakdowns in Lotka’s law.
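To get a feel for the two upshots quoted above, one can simulate the classical exponent-2 form of the law, with the number of authors making k contributions proportional to 1/k^2. The cutoff K and the exponent are illustrative choices here, not values from the 1976 paper:

```python
K = 10**6            # cutoff on papers per author; illustrative only
authors = [1 / k**2 for k in range(1, K + 1)]        # Lotka: ~1/k^2 authors have k papers
papers = [a * k for k, a in enumerate(authors, 1)]   # so their paper mass is ~1/k

total_authors = sum(authors)
total_papers = sum(papers)
one_shot_authors = authors[0] / total_authors        # share of the population
one_shot_papers = papers[0] / total_papers           # share of the literature

print(f"one-shot authors: {one_shot_authors:.1%} of all authors")
print(f"their output:     {one_shot_papers:.1%} of all papers")
```

The paper mass 1/k decays so slowly (harmonically) that the bulk of the literature sits with the prolific high-k ranks, while roughly 60% of authors write exactly one paper, matching the two bullet points.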

We have mentioned de Solla Price before in regard to his founding scientometrics. In practice this is mainly concerned with citation analysis and other productivity metrics, but its widely-quoted definition, “the science of measuring and analyzing science,” strikes us as broader. We feel there should be a component for measuring limitations of the effectiveness of the science one is practicing.

Now of course in statistics there are longstanding measures of statistical power and experiment acuity and of *noise* in general. Nevertheless, the cascading “(non-)reproducibility crisis” argues that more needs to be addressed. The development of software tools to counter “p-hacking” exemplifies a new layer of scientific modeling to do so—which could be called introspective modeling.

I will exemplify with two “horizons” that are apparent in my own statistical chess research. One involves estimating the Elo rating of “perfect play.” The other involves the level of skill at which my data may cease to be effective. The former has captured popular imagination—it was among the first questions posed to me by Judit Polgar in a broadcast during the 2016 world championship match—but the latter is my concern in practice. We will see that these may be the same horizon, approached either by looking down from the stars or up from the road.

I am not the first to do this kind of work or face the issue of its resolving power. Matej Guid, Aritz Pérez, and Ivan Bratko made it the sole topic of a 2008 followup paper to their 2006 study of all games in world championship matches. But their indicators strike me as weak. Most simply, they do not try to estimate where their horizon *is*, just argue that their results are not wholly beyond it. We will try to do more—but speculatively. The first step is rock-solid—it is a big surprise I found last month.

I use strong chess programs to take two main kinds of data. My full model uses programs in an analysis mode that evaluates all available moves to the same degree of thoroughness and takes roughly 4–6 hours per game. My quicker “screening” tests use programs in their normal playing mode, which gives full shrift only to what’s considered the best move, but shaves the time down to 10–15 minutes. For my AAAI 2011 paper with Guy Haworth, I used over 400,000 positions from 5,700 games at rating levels from Elo 1600 to 2700 only, all run on my office and home PCs. Below Elo 2000 the available data was so scant that noise is evident in the paper’s table.

Since then, many more games by lower-rated players are being archived—much thanks to the greater availability of chessboards that automatically record moves in the standard PGN format, and to an upswell in tournaments, for youth in particular. Last year, thanks to the great free bandwidth granted by my university’s Center for Computational Research (CCR), I took data in the quicker mode from over 10,600,000 moves from just over 400,000 game-sides (counting White and Black separately) in every tournament compiled by ChessBase as well as some posted only by The Week in Chess (TWIC) or provided directly by the World Chess Federation (FIDE). My two main test quantities are:

- The percentage of the computer’s best move being the one the player chose (“MM%”).
- The average error judged by the computer per move, scaling down large differences (“ASD”).

My 2011 paper found strong linear relations of these quantities to the players’ rating, and great ASD fits on a 3-million-move data set are shown graphically in this post. With MM% and my new 2018 data, here is what I see when I limit to the 1600–2700 range, grouping in “buckets” of 25 Elo points. All screenshots are taken with Andrew Que’s Polynomial Regression applet. They all show data taken with Stockfish 9 run to search depth at least 20 and breadth at least 200 million nodes; the corresponding data for the chess program Komodo 11.3 gives similar results.

Even with vastly more data, there still does not appear any reason to reject the simple hypothesis that the relation to rating is linear. Not only is , the quality of fit is terrific. The noise under Elo 2000 is minimal.

But now I have over 14,000 moves in individual buckets clear down to the FIDE minimum 1000 rating; only the 2750 bucket with 11,923 moves and the 2800-level bucket with 6,340 (from just a handful of the world’s elite players) lag behind. When those buckets are added, here is what we see:

The linear hypothesis is notably less tenable. Instead, a quadratic polynomial fits supremely well:

Thus it seems I must admit a *nonlinearity* into my chess model. This may not be just about slightly improving my model’s application to players at the ends of the rating spectrum. Philosophically, nonlinearity can be a game-changer: the way Newtonian physics is fine for flying jets all around the globe but finding your neighbor’s house via GPS absolutely requires Einstein.
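The linear-versus-quadratic comparison is easy to replicate in miniature. The sketch below fits both models to *synthetic* bucket data with a small built-in curvature (the numbers are invented, not my actual Stockfish data); since the models are nested, the quadratic residual can only be smaller, and on curved data the drop is pronounced:

```python
import numpy as np

elo = np.arange(1000, 2801, 25, dtype=float)                    # rating buckets
truth = 55 + 0.010 * (elo - 1900) + 2.0e-6 * (elo - 1900) ** 2  # mild curvature
rng = np.random.default_rng(0)
mm = truth + rng.normal(0.0, 0.3, elo.size)                     # noisy MM% values

def rms_residual(deg):
    """RMS residual of the best-fit polynomial of the given degree."""
    coeffs = np.polyfit(elo, mm, deg)
    return float(np.sqrt(np.mean((mm - np.polyval(coeffs, elo)) ** 2)))

print("linear    RMS residual:", round(rms_residual(1), 3))
print("quadratic RMS residual:", round(rms_residual(2), 3))
```

The real question is not whether the quadratic fits better on paper but whether the residual drop is large enough, across the widened rating range, to reject the linear hypothesis; that is what the new 1000–2800 buckets show.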

Let us flip the axes so that is MM% and is rating. Then the intercept of would give the rating of perfect agreement with the computer. Well, here is what we see:

Having the rating of perfect agreement be about 1950—which is an amateur A-level in the US—is ludicrous. The greater import is how the increase stops at Elo 3000 with matching just under 75%. The serious implication I draw is that this helps locate the horizon of effectiveness of the data and my methods based on it. Meanwhile, I’ve had the sense from applications that my full model based on smaller higher-quality data is coherent up to about 3100 but cannot tell differences above that.

Indeed, there is a corroborating indicator of this horizon: The top chess programs, or even different (major) versions of the same program, don’t even match *each other* over 75% with regularity. Moreover, the *same program* will fairly often change to a different move when left running for more time or to a greater search depth. If it didn’t change, it wouldn’t improve. Thus my tests, which have no foreknowledge of how long a program used to cheat was running and on how powerful hardware, cannot expect to register positives at a higher rate. The natural agreement rates for human players range from about 35% for novices to upwards of 60% for world champions.

The fit to average scaled error-per-move (ASD) shows the other side of the horizon issue. The ASD measure is more tightly correlated to rating—as the graphs in the above-mentioned post suffice to indicate. Here is the corresponding graph on the new data, again with flipped axes:

Only under 1250 Elo does perfect linearity seem to be countermanded. The issue, however, is at the other end. Committing asymptotically zero error seems to be a more acute indicator of perfection than 100% agreement with a strong program. However, the -intercept there is given as a rating under 3300, whereas computer programs have been reliably rated above 3400, and very recently over 3500. Thus we’d appear to have computers rated higher than perfection.

One can move from the above indication of my setup losing mojo before 3000 to allege that it is insufficient for fair judgment of human players above 2500, say, so that the intercept is not valid. My counter-argument is that the same intercept is also a robust extrapolation from the range 1500 to 2500 where the linear fit is nearly perfect and the computer’s sufficiency for authoritative judgment of the players is beyond doubt.

Nevertheless, the above “game-changer” for the move-matching percentage suggests the same for ASD. A quadratic fit to ASD produces the following results:

Now the -intercept at is within error bars of 3500, in agreement with the 3475 figure currently used in my full model and less starkly under the measured ratings.

Let us think of move-matching for a given rating as a flip of a biased coin with heads probability . If we plot not against but against , we recover a nearly perfect linear fit (the plot shows ):

Well, is the variance of one coin flip. Why should multiplying by this variance recover a linear fit in the *mean*? Only multiplying by the square root of still leaves a significantly non-linear plot.

Recovering from needs solving a cubic equation. The maximum value is at and is Multiplying by as in the plot makes the maximum solvable value. This regression line associates this to a rating of only 2860. This suggests a tangibly lower horizon. It also seems contradicted by the fact that Magnus Carlsen maintained a rating over 2860 from January 2013 through June 2015, yet his engine agreement did not approach 66.7%.

We’ve connected the horizon of perfect play to whether the fundamental relationship of rating to agreement with strong computer programs is linear, quadratic, or indirectly cubic. Which relationship is true? What further tests may best ascertain the range of effectiveness of inferences from these data?


*A result on the prime divisors of polynomial values*

Cropped from source |

Issai Schur was a mathematician who obtained his doctorate over a hundred years ago. He was a student of the great group theorist Ferdinand Frobenius. Schur worked in various areas and proved many deep results, including some theorems in basic number theory.

Today we discuss a nice lemma due to Schur. Actually, it’s a theorem.

There are many things named after Schur. Wikipedia has compiled a list of them:

Frobenius-Schur indicator, Herz-Schur multiplier, Jordan-Schur theorem, Lehmer-Schur algorithm, Schur algebra, Schur class, Schur complement method, Schur complement, Schur decomposition, Schur functor, Schur index, Schur multiplier, Schur number, Schur orthogonality relations, Schur polynomial, Schur product theorem, Schur test, Schur-convex function, Schur-Horn theorem, Schur-Weyl duality, Schur-Zassenhaus theorem, Schur’s inequality, Schur’s lemma (from Riemannian geometry), Schur’s lemma, Schur’s property, Schur’s theorem.

This lists two Schur lemmas. But when you click on Wikipedia’s “Schur’s lemma” page, there are three Schur lemmas. No, wait—there are four, including one called “Schur’s test.” But the result we are interested in is classed as a *theorem*. Wikipedia’s single page for “Schur’s theorem” lists not just *five* but *six* Schur’s theorems. This is one higher than the count eventually reached in Monty Python’s famous “Spanish Inquisition” skit. We want the sixth one, which is quite useful—like a lemma.

Here is Schur’s lemma—no, theorem—published in 1912.

Theorem 1 Let be a non-constant polynomial with integer coefficients. Then has infinitely many prime divisors.

Here has infinitely many prime divisors means that the number of primes that arise as divisors of the values as is infinite. Note, this works even if the polynomial is reducible or contains some trivial factors. Thus

is fine. As is .

Schur’s theorem is quite fun, even just to prove. Here is a nice version from a paper by Ram Murty, whose work on extensions of Euclid’s famous proof of the infinitude of primes we covered last summer. We follow Murty’s proof.

*Proof:* Suppose is an integral polynomial of least degree for which the statement fails,

where . If , then is an integral polynomial of lower degree for which it must fail, so we have . By hypothesis, has only finitely many prime divisors, which we can represent as . For each natural number , define and

Now because can take the value or only finitely often, and all sufficiently large give for the same reason. Because includes as a factor, is divisible by , giving

for some integer that by choice of is divisible by all of . But then as in Euclid’s proof, is co-prime to all those primes, so it must furnish a prime divisor of that is not among them. This is a contradiction.

This proof was simple but clever. Here is a concrete version for the polynomial , which may help in understanding the proof: say is divisible only by . Let . The trick is to look at

where

For some large enough there exists a prime that divides . But this prime divides and so for some . But then modulo is equal to which is a contradiction.
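The concrete version is easy to watch numerically for the polynomial f(n) = n^2 + 1: the set of primes dividing some value keeps growing, and the Euclid-style step manufactures new primes on demand. A small sketch (the checkpoint values are arbitrary choices of mine):

```python
def prime_factors(m):
    """Set of prime factors of m, by trial division."""
    out, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            out.add(d)
            m //= d
        d += 1
    if m > 1:
        out.add(m)
    return out

primes = set()
milestones = []
for n in range(1, 2001):
    primes |= prime_factors(n * n + 1)
    if n in (10, 100, 1000, 2000):
        milestones.append(len(primes))
assert milestones == sorted(milestones)      # the prime set keeps growing

# The Euclid step from the proof: with P a product of primes found so far,
# every prime factor of (P*t)^2 + 1 is new, since (P*t)^2 + 1 is 1 mod each.
P = 2 * 5 * 17                               # primes dividing f(1), f(2), f(4)
assert prime_factors(P * P + 1).isdisjoint({2, 5, 17})
```

Running the loop further only adds primes; no finite set of primes can account for all the values, which is exactly Schur's theorem for this polynomial.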

As the proof hints, Schur’s theorem is a proper extension of Euclid’s, which is just the case . Here is a less-obvious application:

Theorem 2 Suppose that is a polynomial with integer coefficients. Assume that is a perfect square for all integers large enough. Then for some polynomial .

There are many proofs of this theorem. One uses the famous Hilbert Irreducibility theorem, which we also discussed last summer. Another proof uses Schur’s lemma and the fact that if and are products of irreducible integer polynomials that are collectively distinct then the ideal generated by and contains a positive integer—namely, their resultant . Again following Murty’s paper:

*Proof:* We can factor , where is a product of irreducible integer polynomials, and the goal is to show that . If has positive degree, then by Schur’s theorem, there are infinitely many prime divisors of . The square-freeness of implies that it shares no factors with its derivative , so . We need only take dividing for some large enough that is a perfect square but such that does not divide . Then the order of dividing must be even, which implies that the order of dividing must be even. So divides .

By a similar token, since divides as well and is a perfect square, we get that divides . Now is congruent to modulo , and since divides from before, it follows that divides . Since belongs to the ideal generated by and in , divides too, but this contradicts the choice of . So must be a constant. Since the notion of “irreducible” with applies to constants, any square dividing is already part of , so must be a product of distinct primes. This forestalls dividing in the above, so we must have .

We did mention Schur’s long list of things named after him. Did people name things after others more often years ago? Or is naming them still commonplace?

[fixed last proof]

Composite crop from src1, src2 |

Joseph Wedderburn and Leonard Dickson proved Wedderburn’s “Little” Theorem: that every finite ring with the zero-product property is a field. Which of them proved it first is hard to tell.

Today we discuss the issue of finding simple proofs for simple facts—not just Wedderburn’s theorem but the zero-product property on which it leans.

Dickson was a professor at the University of Chicago when Wedderburn visited on a Carnegie Fellowship in 1904–05. To judge from several sources, what happened is that Wedderburn first claimed a proof. Dickson did not believe the *result* and tried to build a counterexample. Instead he found a lemma that convinced him it was true and used it to give a simpler proof. They gave back-to-back presentations on 20 January, 1905. Wedderburn wrote up his paper with his proof and two more based on the lemma, which he ascribed to an earlier paper by George Birkhoff and Harry Vandiver. Dickson wrote a paper with his approach, saying in a footnote:

First proved by Wedderburn … Following my simpler proof above, and using the same lemma, Wedderburn constructed two further simpler proofs.

However, Wedderburn’s first proof was found to have a gap. As detailed by Michael Adam and Birte Specht, the gap was a statement that was not false *per se* but whose vague argument used only properties from a class of weaker structures in which it can fail. So:

Who first proved the theorem?

Our sources linked above differ on whether the gap was noticed at the time or anytime before Emil Artin remarked on it in 1927. Artin didn’t discuss the gap but gave his own proof instead. There seems to be consensus that the “proof from the Book” was given by Ernst Witt four years later in 1931. But two other proofs, by Israel Herstein and by Hans Zassenhaus, receive prominent mention.

As our sources attest, interest in finding other proofs continues. The surprise in the theorem, which is that finiteness and the zero-product property force the multiplication to be commutative, informs what happens in other mathematical structures. Proofs that draw on results about these structures create connections among many areas. The shortest proof drawing on deeper results was a two-page paper in the summer 1964 *Monthly* by Theodore Kaczynski (of ill note). We won’t try to compare all these proofs, but will try to flip the question by focusing on the other property involved—the zero-product property:

If then or .

Given a ring with this property, how easy is it to prove? If has inverses then it is immediate, else multiplying both sides of by in front and in back would yield the contradiction . And we know if is finite then this property makes it a field. So our quest for a different angle leads us to thinking about infinite rings.

Incidentally, a ring with the zero-product property is called a *domain*. If the multiplication is commutative then it is an *integral domain*, though Serge Lang preferred the term *entire ring*. Our example goes beyond the integers though.

A natural example that arises is the ring of integer polynomials over multiple variables. Thus for two variables the elements of this ring are

It is easy to see that they form a ring with the usual high school rules for adding and multiplying polynomials. The ring is an integral domain in general.

To show this, we claim that it has no zero-divisors. Suppose that it does: let . Then let

and

Assume that the maximum degree in for is and is for . Then and are both non-zero polynomials in one variable. It is clear by induction that is not zero. This proves the claim.
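The leading-term argument can be made concrete in one variable: the top coefficient of a product of integer polynomials is the product of the top coefficients, which cannot vanish in the integers. A quick sketch with coefficient lists, lowest degree first (the sample polynomials are arbitrary):

```python
def poly_mul(f, g):
    """Multiply integer polynomials given as coefficient lists (low degree first)."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [0, -3, 1]        # x^2 - 3x
g = [2, 0, 0, 5]      # 5x^3 + 2
h = poly_mul(f, g)
assert h[-1] == f[-1] * g[-1] != 0      # leading coefficient of product is 1 * 5
assert any(c != 0 for c in h)           # so the product is not the zero polynomial
```

The multivariate case in the text reduces to this by induction: view a polynomial in two variables as a one-variable polynomial whose coefficients are themselves polynomials.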

The trouble with the above property of polynomials is that it is only about formal objects. In many applications to computer science we want to view polynomials as objects that can be evaluated. So a natural issue is a slightly different property: Suppose that and are two integral polynomials. Suppose further that

for all integers . Then it clearly must be the case that either identically or identically. Right?

The trouble is that this seems to be obvious but how do we prove that it is true? Note, the proof must make some use of the fact that and are polynomials. The fact is not true for more complex functions. For example,

for all integers . But neither the function nor is identically zero for all integers.

Here is a relatively simple proof of the fact. It uses the famous Schwartz-Zippel (SZ) Theorem. See here for our discussion of the theorem.

Theorem 1 Let be a non-zero polynomial of total degree over a field . Let be a finite subset of and let be selected at random independently and uniformly from . Then

The S-Z lemma of course has manifest applications throughout complexity theory. Often it is used in the design of randomized algorithms. What we found interesting is that here we use it for a different purpose: to prove a structural property of polynomials.

Back to our fact. Assume that

for all integers and for some finite set. Now either or must be true for at least one half of the values in . Assume that it is . Then by the S-Z theorem it must be that always if the set is large enough. Therefore, the fact is proved.
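The same random-evaluation idea codes up directly as a black-box zero test: a nonzero polynomial of total degree d vanishes at a random point of S^n with probability at most d/|S| per trial. A sketch (the trial count and sample range are arbitrary choices of mine):

```python
import random

def probably_zero(p, nvars, degree, trials=20, size=10**9):
    """Black-box test whether the polynomial p (a Python function) is identically 0."""
    for _ in range(trials):
        point = [random.randrange(size) for _ in range(nvars)]
        if p(*point) != 0:
            return False                 # definitely not the zero polynomial
    return True                          # zero except with tiny probability

f = lambda x, y: (x + y) ** 2 - x * x - 2 * x * y - y * y   # identically zero
g = lambda x, y: x * y - 1                                  # not identically zero
assert probably_zero(f, 2, 2)
assert not probably_zero(g, 2, 2)
```

If fg vanishes everywhere, at least one factor must pass this test on half the sample points, and S-Z then forces that factor to be identically zero; this is exactly the structural use of the lemma described above.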

Is there a simpler proof of this key fact? Is it possible to find a reference in the literature for this basic fact of polynomials? We have yet to discover one—any help would be appreciated.
