Maruti Ram Murty is a famous number theorist at Queen’s University in Kingston, Canada. He is a prolific author of books. His webpage has thumbnails of over a dozen. He has an Erdős number of 1 from two papers—very impressive.
Today Ken and I want to talk about a not-so-recent result of his that is also a “lower bound” type result.
Murty’s theorem in question is referenced in his 2006 paper with Nithum Thain:
Theorem 1 A ‘Euclidean’ proof exists to show there are infinitely many primes congruent to $\ell$ modulo $k$ if and only if $\ell^2 \equiv 1 \pmod{k}$.
What he proves essentially is a “lower bound proof.” This theorem gives a limit to the famous Euclidean proof method showing an infinite number of primes. Let’s take a look at it.
Say that a residue $a$ modulo $k$ is abundant provided there are infinitely many primes congruent to $a$ modulo $k$. Then $a$ must be relatively prime to $k$, so at most $\varphi(k)$ residues can be abundant. Gustav Dirichlet proved in 1837 that all of those are:
Theorem 2 For all $k \geq 2$, the number of residue classes modulo $k$ that are abundant is $\varphi(k)$.
So why do we care about “Euclidean” proofs? Dirichlet’s proof uses complex analysis—see this for example. As with the Prime Number Theorem there has long been a feeling that the natural numbers should divulge their secrets by more “elementary” means. Atle Selberg found analysis-free proofs of both theorems—see our discussion here. Opinions remain that the structures involved in these and similar proofs do not shed more light on the structure of the primes.
What is an appropriate level of proof complexity? This is subjective. What we can do is enumerate techniques that everyone agrees convey numerical beauty. Among them are quadratic reciprocity (QR) and cyclotomic theory (CT), albeit they hail from the time of Leonhard Euler and Adrien-Marie Legendre and Carl Gauss rather than Euclid. Indeed, what Gauss considered his nicest proof of QR used CT. With that in mind, the following arguments are considered “Euclidean” (we follow these notes by Keith Conrad):
Theorem 3 For every $n \geq 1$, the residue class $1$ modulo $n$ is abundant.
Proof Sketch: Take $\Phi_n$ to be the $n$-th cyclotomic polynomial. The CT theorems we lean on are that $\Phi_n(x) \equiv 1 \pmod{x}$ for every $x$, and every prime divisor of $\Phi_n(x)$ either divides $n$ or is congruent to $1$ modulo $n$. This first gives us the fact of having some prime in that residue class. Now suppose $p_1,\dots,p_r$ were all such primes and consider some prime divisor $q$ of $\Phi_n(x)$ where $x = n p_1 \cdots p_r$. Then $q$ cannot be a divisor of $n$ nor any of $p_1,\dots,p_r$ because $\Phi_n(x) \equiv 1 \pmod{x}$ by the first fact. This is the echo of Euclid’s proof. The echo is focused by the second fact giving $q \equiv 1 \pmod{n}$. So $q$ must be a prime in the residue class other than $p_1,\dots,p_r$, which gives us the “Euclidean” conclusion that the class is abundant.
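The two CT facts are easy to test numerically. Here is a short Python sketch, using the standard identity $\Phi_n(a) = \prod_{d \mid n} (a^{n/d}-1)^{\mu(d)}$ to evaluate cyclotomic values without polynomial arithmetic, that checks every prime factor of $\Phi_n(a)$ either divides $n$ or is $1$ modulo $n$:

```python
def mobius(n):
    # Möbius function by trial factorization
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # squared prime factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def cyclotomic_value(n, a):
    # Phi_n(a) = prod over d | n of (a^(n/d) - 1)^mu(d), valid for a >= 2, n >= 2
    num, den = 1, 1
    for d in range(1, n + 1):
        if n % d == 0:
            mu = mobius(d)
            if mu == 1:
                num *= a ** (n // d) - 1
            elif mu == -1:
                den *= a ** (n // d) - 1
    assert num % den == 0
    return num // den

def prime_factors(m):
    # distinct prime factors by trial division
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors
```

For instance $\Phi_5(2) = 31 \equiv 1 \pmod 5$, in line with the second fact.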
Theorem 4 The residue classes $3$, $5$, and $7$ modulo $8$ are all abundant.
Proof Sketch: For $5$ we note that $5$ is prime and suppose $N$ is the product of all primes in that class. We use the cyclotomic polynomial $\Phi_4(x) = x^2 + 1$ with $x = 2N$ giving $4N^2 + 1$. Divisors of $4N^2+1$ are hence either $1$ or $5$ modulo $8$. However, $4N^2+1$ itself is $5 \pmod 8$ so it cannot have all its prime divisors be $1 \pmod 8$, so some $p$ dividing it must be $5 \pmod 8$, but it cannot be any of the $p_i$, so again we have the “Euclidean” contradiction to those being all the primes in that class.
For $3$ or $7$ we note that again the residue class is immediately populated and suppose $N$ is the product of all its prime members. We use the polynomial value $N^2 \mp 2$ instead. If $p$ divides it then $N^2 \equiv \pm 2 \pmod p$, so $\pm 2$ is a quadratic residue modulo $p$. This is either $2$ or $-2$, and QR theory (noting $p$ is odd) tells us in either case that $p$ must be $1$ or $7$, respectively $1$ or $3$, modulo $8$. The final observation is that since $3^2$ and $7^2$ are both $\equiv 1 \pmod 8$, we have $N^2 \equiv 1 \pmod 8$, so $N^2 - 2 \equiv 7$ and $N^2 + 2 \equiv 3 \pmod 8$. Thus the value needs to have a prime divisor $p$ that is not $1 \pmod 8$, but then by QR we must have $p \equiv 7$ (respectively $3$) without $p$ being any of the $p_i$. Once again this is the Euclidean contradiction.
That “final observation” is what Murty showed—perhaps surprisingly—to be necessary in order to find a suitable polynomial for the class $\ell$ to begin with: we need $\ell^2 \equiv 1 \pmod k$. For $k = 8$ we get all classes but for $k = 16$, for instance, only $7$, $9$, and $15$ (besides $1$) obey Murty’s criterion. To prove more classes abundant we want to widen the scope of the proofs.
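Murty’s criterion is a one-liner to enumerate. A quick sketch:

```python
from math import gcd

def murty_residues(k):
    # residues l coprime to k with l^2 ≡ 1 (mod k): exactly the classes
    # admitting a "Euclidean" proof, by Murty's theorem
    return [l for l in range(1, k) if gcd(l, k) == 1 and (l * l) % k == 1]
```

Modulo $8$ every unit qualifies; modulo $16$ only $1, 7, 9, 15$ do.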
We connect Dirichlet’s Theorem to the famous conjecture by Christian Goldbach that every even number from 4 onward is the sum of two primes. A straightforward sieve argument implies that a positive proportion of the even integers can be written as the sum of two primes, but much stronger is known. This 2007 paper by Andrew Granville cites a claim that only a small power of $x$ of the even integers up to $x$ are exceptions, though its author, János Pintz, has only recently released a long proof of a weaker bound as a stepping-stone. Conditioned on the Generalized Riemann Hypothesis, Daniel Goldston improved the exceptional-set bound of Godfrey Hardy and John Littlewood.
There are however explicitness issues with the constants involved in all unconditional estimates sharper than $x$ divided by a polynomial in $\log x$. None of the proofs is “short.” Hence we’ll say “most” to mean an intuitively provable bound $E(x)$ on the number of Goldbach exceptions up to $x$ such that $E(x)/x$ is $o(1)$. Here is our main observation:
Suppose that most even numbers are the sum of two primes. Then there is a short proof that at least roughly $\sqrt{k}$ residues modulo $k$ are abundant.
Let’s explain this statement. First it is not exactly a theorem. Why not? Even granting our meaning of (a short proof of) “most” the conclusion is still subjective. But we can turn the subjectivity to advantage by viewing the statement this way:
Any simple proof of “most” for Goldbach must show that many residue classes contain an infinite number of primes.
Our first point is that we get more than the number of abundant classes from Murty’s criterion. The latter is $c \cdot 2^{\omega(k)}$ where $\omega(k)$ is the number of distinct prime factors of $k$ and $c$ is $\frac{1}{2}$ or $1$ or $2$ depending on the power of $2$ dividing $k$—in all events this number is at most $2^{\omega(k)+1}$. We pay a price of explicitness, however: we may not know which classes are abundant.
Theorem 5 Suppose that most even numbers are the sum of two primes. Let $A_k$ be the number of residue classes modulo $k$ that are abundant. Then $A_k(A_k + 1) \geq k$, so that $A_k > \sqrt{k} - 1$.
Proof: Suppose for contradiction that $A_k(A_k+1) < k$. Then the abundant residues yield fewer than $\frac{k}{2}$ pairwise sums, so there is some even residue $r$ that is not of the form $a + b$ for abundant residues $a$ and $b$ ($a = b$ included). Let $B$ bound the size of any prime in the non-abundant residues. Then for any even number $n \equiv r \pmod{k}$ and primes $p \leq q$ allowed such that
$$n = p + q,$$
at least one of $p$ or $q$ must come from a non-abundant residue class and hence be at most $B$. Thus as $x \to \infty$, the set
$$\{n \leq x : n \text{ even},\; n \equiv r \pmod{k}\}$$
has at most $\pi(B)\pi(x) = o(x)$ members that can be expressed as sums of two primes. Hence the density of Goldbach exceptions up to $x$ is bounded below by roughly $\frac{1}{k}$, in violation of the most-Goldbach theorem. Thus $A_k(A_k+1) \geq k$.
For $k = 8$ we get that the minimum is $3$, so besides the class $1$ we get two of the classes $3$, $5$, $7$ without needing QR. For $k = 10$ and $k = 12$, however, we only get $3$, so all but one of the possible residue classes again are abundant. Clearly the bound gets worse as we move along. Oh well. But we can still try to improve it.
We do note, however, that the bound we get from our approach is better than any from Euclidean methods for moduli $k$ that are prime. By Murty’s theorem it follows that for a prime $k$ there are only two residue classes that are provable via the Euclidean method: these are $\pm 1$ modulo $k$. From estimates noted above, our bounds are better as $k$ grows for any kind of $k$.
In some special cases we can make further inferences from the “most” property of Goldbach. Let’s look at $k = 12$. Then there are four residue classes but we still only get $A_k \geq 3$. So we miss one. But we can do better. The residue classes are $1, 5, 7, 11$. The only way to get $4 \pmod{12}$ by summing two of them is $5 + 11$. The only way to get $8$ is summing the other two, $1 + 7$. Hence if, say, a class were not abundant, say $5$, then only finitely many even numbers $\equiv 4 \pmod{12}$ could be sums of two primes, violating the “most Goldbach” property.
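This pairing claim can be checked by brute force. The sketch below (taking the modulus to be $12$, per our reading above) tabulates which unordered pairs of unit residues sum to each even residue:

```python
from math import gcd

k = 12
units = [r for r in range(k) if gcd(r, k) == 1]   # expect [1, 5, 7, 11]

# for each even residue mod 12, list the unordered unit pairs summing to it
sums = {e: [] for e in range(0, k, 2)}
for i, a in enumerate(units):
    for b in units[i:]:
        sums[(a + b) % k].append((a, b))
```

Indeed only $5+11$ reaches $4$ and only $1+7$ reaches $8$, so no single class can be dropped.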
The case $k = 24$ uses the symmetry $\ell \mapsto 24 - \ell$. Taking $A = 5$ gives $A(A+1) = 30 \geq 24$, which just satisfies Theorem 5. However, a set of $5$ residues will yield two pairs summing to the same value—meaning not enough pairs to cover the $12$ congruences of even numbers modulo $24$—unless the three excluded values hit each of the following eight equations viewed as triples:
An inspection shows that there is no hitting set of size 3, so we get that at least $6$ classes are abundant. We could try further to argue the way we did with $k = 12$ but it does not work: if one excludes, say, $5$ and $19$ (or any of the other three pairs summing to $24$) then the remaining six residues cover all even-number congruences. Thus getting $6$ abundant residues is the most for our style of argument thus far.
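The inspection can also be done exhaustively: the smallest set of unit residues whose pairwise sums cover every even residue is $3$ for modulus $8$, all $4$ units for modulus $12$, and $6$ for modulus $24$. A brute-force sketch:

```python
from itertools import combinations
from math import gcd

def min_cover_size(k):
    # smallest number of unit residues mod k whose pairwise sums
    # (repeats allowed) cover every even residue mod k
    units = [r for r in range(k) if gcd(r, k) == 1]
    evens = set(range(0, k, 2))
    for size in range(1, len(units) + 1):
        for subset in combinations(units, size):
            covered = {(a + b) % k for a in subset for b in subset}
            if evens <= covered:
                return size
    return None
```

The search space is tiny here since $\varphi(24) = 8$.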
Are there possible further improvements? Perhaps a way to leverage the tighter bounds on the density of the Goldbach exception set?
We close by noting one other quirk of Dirichlet’s Theorem. If we are given a residue $a$ modulo $k$ and could always guarantee constructing one prime in a designated residue class of a possibly larger modulus, then we could construct infinitely many in $a$ modulo $k$. Namely, suppose we had constructed $p_1,\dots,p_r$ so far. Choose $b \equiv a \pmod{k}$ such that for each $i$, $p_i \nmid b$. Now take $K = k p_1 \cdots p_r$ and ask for a prime $q \equiv b \pmod{K}$. The $q$ we construct from our assumption is different from every $p_i$ but still congruent to $a$ modulo $k$, so we can add it to our set and continue.
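The bootstrapping loop is easy to simulate if we model “Dirichlet-one” by brute-force search (an assumption purely for illustration—the point is that each new prime comes from a single application of the one-prime guarantee, at a modulus built to exclude the primes already found):

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def one_prime(b, K):
    # stand-in for the assumed "Dirichlet-one" guarantee: one prime ≡ b (mod K)
    n = b
    while not is_prime(n):
        n += K
    return n

def many_primes(a, k, count):
    # bootstrap: with p_1..p_r in hand, use modulus K = k*p_1*...*p_r and a
    # residue b ≡ a (mod k) not divisible by any p_i
    found = []
    while len(found) < count:
        K = k
        for p in found:
            K *= p
        b = a
        while any(b % p == 0 for p in found):
            b += k
        found.append(one_prime(b, K))  # distinct from all earlier p_i
    return found
```

For example, `many_primes(3, 4, 5)` walks through $3, 7, 11, 19, 23$, all congruent to $3$ modulo $4$.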
Thus if what one might call “Dirichlet-one” is constructive then the full Dirichlet Theorem becomes elementary in a clear sense. Indeed, we get the next prime explicitly, not as an unspecified divisor. This connection raises some hope of sharper estimates that play off certain residues having zero primes rather than finitely many, though they are not immediately to hand.
Can we improve the above method to prove more than a square-root bound? It would be neat if we could show that more is true. There are further ideas that Ken and I are thinking about—more in the future.
A workshop on quantum computing at UCSD
Cropped from workshop poster
Dorit Aharonov, David Gosset, and Thomas Vidick did standup for three-and-a-half days in La Jolla earlier this year. They are not listed among the many distinguished actors who have performed at the La Jolla Playhouse on the UCSD campus. Nor were they observed by any of the Hollywood talent-search agencies in town, at least not that we know. But they were applauded by numerous computer science and physics students attending the March 19–22 UCSD “Spring School on Quantum Computation” which they led.
Today we report on the meeting and discuss an important problem springing from condensed matter physics: the Local Hamiltonian (LH) problem.
Local Hams were proved to be “quantum $\mathsf{NP}$-hard” by Alexei Kitaev almost 20 years ago. Intervening years have shown even greater resonances between problems about quantum Hamiltonians and complexity classes. William Hamilton didn’t know about quantum but he did know about energy, which his Hamiltonian operators capture in both the classical and quantum worlds. He is of course the person we CS people know for Hamiltonian cycles and paths, and his quaternions have even figured into CS books like this on computer graphics. But he meant a lot more to physics.
The lectures seemed tailored toward physics students—but perhaps that is because our reporter who attended the workshop is a computer science PhD student. He is Chaowen Guan, being supervised by Ken at Buffalo. This is our introduction of Chaowen writing for the blog, and we turn to him for the main body of this post.
Given a witness predicate $R(x,w)$ for a language in $\mathsf{NP}$ we can make a big diagonal matrix $H$ whose rows and columns correspond to potential witnesses $w \in \{0,1\}^n$. We put a $1$ in position $(w,w)$ if $w$ is a witness, else $0$. For any potential witness $w$ let $e_w$ denote the corresponding binary unit vector of length $2^n$. Then we get $H e_w = e_w$ if $w$ is a witness, and $H e_w = 0$ otherwise.
Now suppose we have witness predicates $R_1,\dots,R_m$—or maybe we consider $R(x_j, w)$ for different $x_j$-es. Let us simply add up the matrices for each one: $H = H_1 + \cdots + H_m$. Suppose $w$ is a witness for $j$ different cases. Then we get $H e_w = j\, e_w$.
This makes $e_w$ an eigenvector of $H$ with eigenvalue $j$. The more witness predicates $w$ satisfies, the higher $j$. In Hamilton’s terms, the system gets more “excited” by such $w$ and $j$ is the energy of the resonance.
Note that when we just had one witness predicate the eigenvalue was $1$ for all witnesses. So we could just add up $e_w$ for a bunch of witnesses $w$ and the resulting vector $v$ still satisfies $Hv = v$ with the eigenvalue $1$. But with multiple predicates we can only do this for $w$ with the same $j$, so we have to stratify the space according to each $j$.
Abstractly, this is just rehashing basic facts from linear algebra about eigenvalues and their corresponding eigenspaces. But we gain relevance to complexity, even though $H$ is exponentially big, when $H$ is succinct—and when we work with “witnesses” of more-piecemeal things that are local. For instance, for each clause $C_i$ in a 3CNF Boolean formula $\phi$, make $H_i$ by putting in a $1$ for each assignment that satisfies $C_i$. Since only the three literals in $C_i$ matter, $H_i$ is the tensor product of an $8 \times 8$ diagonal matrix with the identity on the other variables. If we add up $H = \sum_i H_i$, then $e_w^T H e_w$ tells us the number of clauses satisfied by $w$.
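Because everything here is diagonal, each clause matrix $H_i$ can be represented just by its diagonal. A small sketch (the signed, 1-indexed literal encoding is our own choice for illustration):

```python
from itertools import product

def clause_diag(clause, n):
    # diagonal of the 2^n x 2^n matrix H_i for one clause: entry 1 for each
    # assignment w satisfying the clause, else 0.
    # clause: tuple of nonzero ints, +v for variable v, -v for its negation
    diag = []
    for w in product([0, 1], repeat=n):
        sat = any(w[abs(l) - 1] == (1 if l > 0 else 0) for l in clause)
        diag.append(1 if sat else 0)
    return diag

def hamiltonian_diag(clauses, n):
    # diagonal of H = sum_i H_i; since H is diagonal, entry w is the
    # eigenvalue of e_w, i.e., the number of clauses w satisfies
    diags = [clause_diag(c, n) for c in clauses]
    return [sum(col) for col in zip(*diags)]
```

For a formula with three clauses, the maximum diagonal entry is the MAX-SAT value.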
We can play the same game with unsatisfying assignments instead—then with $k$-CNF we put in only one $1$ per clause regardless of $k$. So the matrices are super-sparse as well as “$k$”-local. If we think of $H_i$ as an operator on those $k$ variables then we can sum them up without the technical crutch of saying we are really summing the matrices $H_i \otimes I$. Now to do this when qubits stand for the variables we need to fix what “NP” means in the quantum world.
The class most regarded as the quantum analog of $\mathsf{NP}$ is named $\mathsf{QMA}$, which stands for “Quantum Merlin Arthur.” It works like a classical $\mathsf{MA}$ protocol except that quantum randomness is involved. A language $L$ is in $\mathsf{QMA}$ if there exists a quantum polynomial-time verifier $V$ and a polynomial $p$ such that: for every $x \in L$ there is a state $|\psi\rangle$ causing $V$ to accept with probability at least $\frac{2}{3}$, while for every $x \notin L$ every state $|\psi\rangle$ causes $V$ to accept with probability at most $\frac{1}{3}$.
Here $|\psi\rangle$ is a quantum state on at most $p(|x|)$ qubits. If we take $|\psi\rangle$ to be a classical witness but leave $V$ to be a quantum verifier, the class Quantum Classical MA ($\mathsf{QCMA}$) is defined.
The matrices $H$ are Hermitian operators—meaning that $H$ equals its conjugate transpose $H^\dagger$—and positive semi-definite. The operator norm $\|H\|$ is the supremum of $\|Hv\|$ over all unit vectors $v$. Their eigenvalues correspond to the allowable energies. The “local” property states that each term only operates on some small constant number of the qubits. Let this number be $k$. An operator $H$ on $n$ qubits is a $k$-local Hamiltonian if $H = \sum_j H_j$ where each $H_j$ acts on at most $k$ qubits. Then the LH problem can be defined as the following “promise problem” for any $k$:
Given: A set of $k$-local Hamiltonian operators $H_1,\dots,H_m$, each of operator norm at most $1$ with entries specifiable by $\mathrm{poly}(n)$ bits, and real numbers $a$ and $b$ such that $b - a \geq \frac{1}{\mathrm{poly}(n)}$.
Question: Is the minimum eigenvalue of the Hamiltonian $H = \sum_j H_j$ on $n$ qubits smaller than $a$, or are all eigenvalues larger than $b$?
For any $k \geq 2$, this can be thought of as the quantum analog of $k$-SAT. From the above discussion and MAX-2-SAT being $\mathsf{NP}$-hard we can already see that the $k$-LH problem is $\mathsf{NP}$-hard for any $k \geq 2$: By representing the variables by qubits and each clause by an $H_i$ acting on $2$ qubits, the vector of an assignment satisfying any one clause will have $H_i$-eigenvalue 0 while an unsatisfying assignment corresponds to eigenvalue 1. Therefore, the lowest eigenvalue of the sum (recall that we really mean the sum of the corresponding matrices $H_i \otimes I$) corresponds to the maximum number of simultaneously satisfied clauses. Getting hardness for $\mathsf{QMA}$ requires some more work, however.
The following was originally proved for the constant value $k = 5$, but $k$ was brought down to 2 in this paper by Kitaev with Julia Kempe and Oded Regev.
The $2$-local Hamiltonian problem is $\mathsf{QMA}$-complete.
To prove it is in $\mathsf{QMA}$, an obvious witness for a “yes” instance would be simply an eigenstate with eigenvalue smaller than $a$. For a “no” instance, the amplification theorem can be used.
The proof for $\mathsf{QMA}$-hardness is more complicated, so we will discuss only the main difference from the standard Cook-Levin proof. A natural witness for the history of the computation would be the sequence of states $|\psi_0\rangle,\dots,|\psi_T\rangle$ and we want to verify the consistency of two consecutive states via local means as with the Cook-Levin proof. The issue is that it is possible in quantum for two vastly different states to give the same local appearance on all small subsets of qubits. A simple example for $n = 2$ qubits is that the entangled state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ gives the same coin-flip on one individual qubit line as the unentangled superposition of all four basis states. In general the issue is that the scalars $\langle\psi|H|\psi\rangle$ and $\langle\phi|H|\phi\rangle$ will come out the same when $|\psi\rangle$ and $|\phi\rangle$ have the same reduced density matrices on $k$-qubit sets and yet could be different states.
However, if the history of the computation is given in superposition, we can fight fire with fire by using entanglement to help local Hamiltonians verify the consistency. Consider the following superposition:
$$\frac{1}{\sqrt{2}}\left(|0\rangle|\psi_0\rangle + |1\rangle|\psi_1\rangle\right).$$
The resulting reduced density matrix of the first qubit will give us
$$\frac{1}{2}\begin{pmatrix} 1 & \langle\psi_1|\psi_0\rangle \\ \langle\psi_0|\psi_1\rangle & 1 \end{pmatrix}.$$
Hence, this can facilitate the comparison between $|\psi_0\rangle$ and $|\psi_1\rangle$.
One can refer to the survey mentioned above for a detailed proof of this theorem. This paper gives a list of other known $\mathsf{QMA}$-complete problems.
To prove $\mathsf{QMA}$-completeness for $k = 2$, a technical tool named the projection lemma was used, which enables approximating a non-local Hamiltonian by a local Hamiltonian. The same paper also provided another self-contained proof using a more advanced tool, perturbation theory, to analyze sums of Hamiltonians.
Below is a list summarizing the talks over the meeting:
After lunch there was a discussion of open questions and challenges. More details and materials of the talks can be found on this page.
Besides the main lecture format, there were sessions of open problems. Here are some of them:
A novel cake-cutting puzzle reveals curiosities about numbers
Alan Frank introduced the “Muffins Problem” nine years ago. Erich Friedman and Veit Elser found some early general results. Now Bill Gasarch along with John Dickerson of the University of Maryland have led a team of undergraduates and high-schoolers (Guangqi Cui, Naveen Durvasula, Erik Metz, Jacob Prinz, Naveen Raman, Daniel Smolyak, and Sung Hyun Yoo) in writing two new papers that develop a full theory.
Today we discuss the problem and how it plays a new game with integers.
The puzzle was popularized five years ago in the New York Times Online. That spoke of cupcakes not muffins, but Frank’s original term from 2009 has stuck to the pan since muffins are bigger and firmer hence easier to cut. The original question was:
How can one divide $3$ muffins among $5$ students while maximizing the size of the smallest piece?
Everyone will get $\frac{3}{5}$ of a muffin. If we cut a $\frac{3}{5}$ piece out of the first muffin and give someone else the $\frac{2}{5}$ piece left over then that person will also have a piece of size at most $\frac{1}{5}$. That’s no better than the trivial solution of cutting each muffin into fifths. Can we do better?
In fact, we can. Quarter the first muffin—that at least is easy with a knife. With more care we divide the other two muffins into pieces of size $\frac{7}{20}$, $\frac{7}{20}$, and $\frac{3}{10}$. Four people get a quarter and a $\frac{7}{20}$ piece giving $\frac{1}{4} + \frac{7}{20} = \frac{3}{5}$. The fifth student gets the two $\frac{3}{10}$ pieces. So $f(3,5) \geq \frac{1}{4}$.
Is this optimal? If we divide any muffin into $4$ or more pieces, some piece must have size at most $\frac{1}{4}$. On the other hand, if we divide each muffin into at most $3$ pieces, we have at most $9$ pieces total, so some student gets just $1$ piece. That must be a $\frac{3}{5}$ piece, leaving a $\frac{2}{5}$ piece which implies the need for an at-most $\frac{1}{5}$ piece either from halving it or from supplementing it. So we have proved $f(3,5) = \frac{1}{4}$.
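Under our reading of the division (quarters plus pieces of size $\frac{7}{20}$ and $\frac{3}{10}$), an exact-arithmetic verification is immediate:

```python
from fractions import Fraction as F

# the division of 3 muffins among 5 students
muffins = [
    [F(1, 4)] * 4,                    # first muffin quartered
    [F(7, 20), F(7, 20), F(3, 10)],   # second muffin
    [F(7, 20), F(7, 20), F(3, 10)],   # third muffin
]
students = [[F(1, 4), F(7, 20)] for _ in range(4)] + [[F(3, 10), F(3, 10)]]

def verify(muffins, students, m, s):
    # each muffin fully used, each student gets exactly m/s, pieces match up
    assert all(sum(pieces) == 1 for pieces in muffins)
    assert all(sum(pieces) == F(m, s) for pieces in students)
    assert sorted(p for mm in muffins for p in mm) == \
           sorted(p for st in students for p in st)
    return min(p for mm in muffins for p in mm)   # the smallest piece
```

Running `verify(muffins, students, 3, 5)` returns the smallest piece, $\frac{1}{4}$.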
Other cake-cutting problems involve protocols for “fair division” where one person cuts and another chooses. Here the division is constrained to be fair. The depth comes from the problem’s minimax—or maximin—nature. It is not a simple linear programming problem. It is not a two-player game but has game-like aspects. It does have an important duality property.
The flipped problem is to divide $5$ muffins among $3$ students. The trivial solution guarantees $\frac{1}{3}$. We must have at least $10$ pieces, so some student will get $4$ pieces, at least one of size no more than $\frac{5}{12}$. We can achieve that by breaking four muffins into pieces of size $\frac{5}{12}$ and $\frac{7}{12}$ and the other in half; one student gets the four $\frac{5}{12}$ pieces. The other two students each get a $\frac{1}{2}$ piece and two $\frac{7}{12}$ pieces. This is a full proof of $f(5,3) = \frac{5}{12}$.
The dual nature of this argument may not be apparent at first but Friedman proved:
Theorem 1 For all $m, s$: $f(m,s) = \frac{m}{s}\, f(s,m)$.
Proof: Picture Sweeney Todd luring the students into his barbershop with $m$ muffins each proffered by a customer of Mrs. Lovett’s Meat Pies. So we have $m$ hungry muffin providers who will be served pieces of student pie. If a muffin was shared among $j$ students then its owner will get $j$ pieces of pie in return. The piece-maximization objective is the same as when the students ate the muffins. The only change is that the piece size is reckoned in proportion to the $s$ students rather than the $m$ muffins, hence the conversion factor $\frac{m}{s}$.
The paper shows something more: how to convert a proof of optimality of a division in the primal to a proof for the corresponding division in the dual. Above we not only have $f(5,3) = \frac{5}{3} \cdot \frac{1}{4} = \frac{5}{12}$ but also the fifth student with the two $\frac{3}{10}$ pieces corresponds to the muffin divided into halves, the others with a $\frac{1}{4}$ and $\frac{7}{20}$ piece showing the division into $\frac{5}{12}$ and $\frac{7}{12}$ out of the four muffins.
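The conversion can be carried out mechanically: scale every piece by $\frac{m}{s}$ and swap the roles of muffins and students. A sketch on the $f(3,5)$ division as we have reconstructed it:

```python
from fractions import Fraction as F

# the f(3,5) division: rows are muffins / students, entries are piece sizes
muffins_35 = [[F(1, 4)] * 4,
              [F(7, 20), F(7, 20), F(3, 10)],
              [F(7, 20), F(7, 20), F(3, 10)]]
students_35 = [[F(1, 4), F(7, 20)] for _ in range(4)] + [[F(3, 10), F(3, 10)]]

def dualize(division, m, s):
    # multiply each piece by m/s; a student's pieces become a muffin's pieces
    return [[p * F(m, s) for p in row] for row in division]

muffins_53 = dualize(students_35, 5, 3)   # the 5 muffins of the f(5,3) division
students_53 = dualize(muffins_35, 5, 3)   # the 3 students
```

The fifth student’s pair of $\frac{3}{10}$ pieces becomes the halved muffin, exactly as described.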
To illustrate another case, strikes me as easier to reason about than : Splitting each of muffins and giving one student four pieces, the others two and a , achieves . Conversely, each student gets an total share, so if someone gets a whole muffin then the remaining share causes a piece somewhere. If not, then there are total muffin pieces, so someone gets four pieces, and the smallest of those has size at most . So and hence .
One aspect of duality that seems missing, however, is the correspondence between a feasible solution on one side and a constraint on the other. For linear programming this placed it into $\mathsf{NP} \cap \mathsf{coNP}$ long before Leonid Khachiyan placed it into $\mathsf{P}$. As was shown by Elser, the muffin problem yields a mixed linear and integer program. This is enough to show that $f(m,s)$ is computable and always a rational number, but so far not to place problems about $f(m,s)$ into $\mathsf{NP}$ let alone $\mathsf{P}$. Trying to do so will show the issues.
The duality allows us to limit attention to $m > s$. Since cases where $s$ divides $m$ are trivial, we have $\frac{m}{s} > 1$ and not an integer. Then any solution achieving the optimal minimum piece size must cut every muffin into at least two pieces.
The latter implies $f(m,s) \leq \frac{1}{2}$. Note that if $s$ divides $2m$ then we get $f(m,s) = \frac{1}{2}$ by halving each muffin, and vice-versa. So we also consider this a trivial case.
Not so easy to prove, apparently, is $f(m,s) \geq \frac{1}{3}$ (given the nontriviality assumptions). It appears as “Appendix E” of the group’s second paper. That and Bill’s talk slides for the 2018 Joint AMS-MAA Meeting have some updates over the ArXiv paper, even though the latter stretches to 199 pages.
Why is the paper so long? There are 103 pages of appendices and tables. These supplement an original effort to build a theory. It starts by defining $V = \lceil \frac{2m}{s} \rceil$, so that nontrivial cases have $V \geq 3$, and giving the following basic upper bounds:
Theorem 2 For nontrivial $m > s$, $f(m,s) \leq \max\left(\frac{1}{3}, \min\left(\frac{m}{sV},\; 1 - \frac{m}{s(V-1)}\right)\right)$.
Proof: In an optimal solution, every muffin must be cut into exactly two pieces, else we have $f(m,s) \leq \frac{1}{3}$. It follows that some students get shares from $V$ muffins and others partake in only $V - 1$ muffins, since there are $2m$ pieces and $\frac{2m}{s}$ lies strictly between $V-1$ and $V$. The former receive some piece of size at most their total divided by $V$, hence the first inequality. The latter similarly receive some piece of size at least $\frac{m}{s(V-1)}$, but then the other piece of the muffin it came from has size at most $1 - \frac{m}{s(V-1)}$.
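Here is my transcription of the Floor-Ceiling (FC) bound in exact arithmetic; treat the precise form as an assumption drawn from the discussion above:

```python
from fractions import Fraction as F

def fc_bound(m, s):
    # Floor-Ceiling upper bound on f(m, s), for m > s with s not dividing 2m
    V = -((-2 * m) // s)          # ceil(2m/s), via integer arithmetic
    W = (2 * m) // s              # floor(2m/s)
    return max(F(1, 3), min(F(m, s * V), 1 - F(m, s * W)))
```

For example `fc_bound(5, 3)` gives $\frac{5}{12}$, matching the tight case worked above, while for $f(7,5)$ the $\frac{1}{3}$ floor takes over.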
The ‘FC’ bounds are tight for the first several pairs of consecutive Fibonacci numbers, so one might expect that to continue for the whole Fibonacci sequence. But it fails further along: this note posted by Bill using methods found by Metz gives a smaller upper bound. Efforts to bound other progressions lead to theorems like this one:
Theorem 3 If and then putting gives
They have been continually charting more individual solutions and also finding more arguments by which to generate upper and lower bound theorem cases. The efforts have been joined by other students. As we go to post, the following bounds—ordered by $s$ and stated with common denominators—have yet to be closed:
The `?’ marks a computer run that timed out. The Muffin Team may soon solve some of these, but there are always more to do—unless and until a full characterization is found. This all shows scope for involvement by amateur mathematicians both for finding more-effective duality arguments and for computational experiments.
The following questions spring to mind—with $m > s$ and the same nontriviality assumptions as above:
A mark of subtlety is that the first two answers are no while the other two remain open problems despite all the work. The first holds whenever either bound in Theorem 2 is tight, or when $f(m,s)$ equals an alternative bound called INT in the long paper. It fails, however, for certain exceptional cases. Although the known exceptions share a common feature, even that hasn’t been proved in general.
My thought with question 2 had been to force some relation between and . But Metz refuted it by showing that and that no solution gives someone shares of equal size. Here but is not a multiple of . I have posted his note here with their permission.
If an FC or INT bound is tight for one pair $(m,s)$ then it is tight for the corresponding larger pairs for all integer multiples. These bounds are defined in terms of $m$ and $s$ alone and are polynomial-time computable. The team have formalized several other bounds with the same or similar properties. But next we discuss a sense in which the original FC bounds are the ultimate answers.
For any ideal $I$ generated by homogeneous polynomials of some degree $d$ in $n$ variables over some field $F$, we can set $h(t)$ to be the dimension of the quotient space of homogeneous polynomials of degree $t$ modulo the ideal $I$. David Hilbert proved that there is always a polynomial $p(t)$ such that $h(t) = p(t)$ for all but finitely many $t$. Well, the minimum integer $t_0$ such that $h(t) = p(t)$ holds for all $t \geq t_0$ may be huge in terms of $d$ and $n$, but Hilbert first proved it exists and later gave bounds which have since been refined. It is called the Hilbert regularity. The Muffin crew have proved a theorem that strikes me as somehow analogous:
Theorem 4 For all $s$ there exists $m_0$ such that for all $m \geq m_0$, $f(m,s)$ equals one of the bounds in Theorem 2.
They also give a rough bound on $m_0$ in terms of $s$. For small $s$ they have computed $m_0$ exactly. One consequence of the regularity is that computing $f(m,s)$, while not known to be in $\mathsf{P}$ or even in $\mathsf{NP}$ in any sense, belongs to the class of fixed-parameter tractable problems.
Their last main topic also bridges between Hilbert’s famous “Program” of automating mathematical deduction—the one supposedly destroyed by Kurt Gödel—and PolyMath projects. They have created a “Muffin Theorem Generator” for exceptional cases, and it is the subject of their second paper. They document its use to solve a sizable initial segment of exceptional cases having , and they have now resolved all for up through .
The high-level problem is to find a criterion that expresses the solution as a simple direct function of $m$ and $s$. Or might there be irreducible complexity “underneath” the regularity bound as $m$ varies?
Short of a full characterization, what divisibility properties of integers are being used, in particular regarding $m$ and $s$? Their “Muffin Theorem Generator” also gives food for thought on computational experiments—and student research initiatives. Kudos to the students—note the newer bounds in the talk slides in particular.
A pretty neat paper about a pretty neat theorem
[ GLL edited ]
Mark Villarino, William Gasarch, and Kenneth Regan are terrific writers. Bill is a co-author of a famous blog located here. Ken is a co-author of another blog that we will not mention.
Today I would like to talk about one of their recent articles that just appeared in the Math Monthly.
The Math Monthly is one of the premier journals for exposition of math results. It is a great honor for them to have published a paper there and I thought it would be nice to write about it. Congrats to Mark, Bill, and Ken.
Their paper is of course about the famous Hilbert irreducibility theorem (HIT). The paper states the theorem and proves it from first principles. But it does much more. They also explain two important ideas that are seen throughout mathematics:
This sounds like a paradox, but it is not. Hilbert had a particular goal in mind in proving HIT. He wanted to show the following:
Theorem 1 For every integer $n \geq 1$ there exist infinitely many polynomials $f$ in $\mathbb{Q}[x]$ of degree $n$ such that $f$ has the symmetric group $S_n$ as its Galois group.
He used HIT to prove exactly this. This theorem is a special case of the Inverse Galois Problem:
Does every finite group appear as the Galois group of some Galois extension of the rational numbers?
This problem, first stated two centuries ago—in the 19th century—is still open. For example, many finite simple groups can be constructed as Galois groups of extensions of the rationals. All solvable groups can also be realized as Galois groups, but the general case is still open.
Quoting them:
In number theory, Hilbert’s irreducibility theorem, conceived by David Hilbert, states that every finite number of irreducible polynomials in a finite number of variables and having rational number coefficients admit a common specialization of a proper subset of the variables to rational numbers such that all the polynomials remain irreducible. This theorem is a prominent theorem in number theory.
The statement, quoting them again, who are quoting Hilbert’s own words is:
Theorem 2 If $f(x,t) = a_0(t)x^n + a_1(t)x^{n-1} + \cdots + a_n(t)$ is an irreducible polynomial in the two variables $x$ and $t$ with integral coefficients,
where the $a_i$ are integral polynomials in $t$, it is always possible, indeed in infinitely many ways, to substitute an integer for $t$ such that the polynomial becomes an irreducible polynomial of the single variable $x$.
This statement has several issues as they point out. Another contribution of their paper is to explain some of the differences in standards of clarity and rigor between Hilbert’s time and today. The above statement is a bit imprecise and they explain how to fix it. Read their paper for the details.
As we explained already Hilbert proved the HIT in order to make progress on the inverse Galois problem. Roughly he constructed a “generic” polynomial that had the required Galois group. Then he invoked his HIT to show that the generic polynomial could be changed into a rational polynomial that had the same Galois group. This is a powerful method that is still used today. However, it does not seem to be enough to solve the entire inverse problem.
Another neat application of HIT is a short proof of the following theorem.
Theorem 3 Suppose that $f(x)$ is a polynomial over the integers. Suppose also that $f(n)$ is a square for all numbers $n$ large enough. Then there is a polynomial $g(x)$ so that $f = g^2$.
Look at the polynomial $y^2 - f(x)$ over the field $\mathbb{Q}(x)$. This theorem shows that the “obvious” way for a polynomial to always be a square is the only way. Actually proving this theorem without using HIT is possible, but not as simple as applying HIT.
There are many known proofs of HIT. What they do in their article is to show how Hilbert originally proved the theorem. See their paper for the details which are quite interesting. Quoting them, a last time:
Hilbert remains one of the greatest mathematicians of all time. His original proof still contains insights and arguments that are well worth study even today. We offer the reader a detailed exposition of this proof in hope of saving it from the oblivion of history.
An interesting point about Hilbert’s proof is that it relied on a type of Ramsey theorem. It needed a key lemma that is now called Hilbert’s cube lemma. Define an $m$-cube as the set
$$Q(a_0, a_1, \dots, a_m) = \Big\{ a_0 + \sum_{i \in S} a_i : S \subseteq \{1,\dots,m\} \Big\}.$$
Lemma 4 For any positive integers $m$ and $r$ there exists a least integer $h(m,r)$ such that if the set $\{1,\dots,h(m,r)\}$ is colored with $r$ colors, then some color class must contain an $m$-cube.
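Small instances of the lemma can be found by brute force. Here is a sketch for $2$-cubes $\{a_0,\; a_0+a_1,\; a_0+a_2,\; a_0+a_1+a_2\}$ (note the $a_i$ need not be distinct):

```python
def mono_2cube(coloring):
    # coloring[i-1] is the color of integer i; return a monochromatic
    # 2-cube [a0, a0+a1, a0+a2, a0+a1+a2] if one exists, else None
    n = len(coloring)
    for a0 in range(1, n + 1):
        for a1 in range(1, n):
            for a2 in range(a1, n):
                pts = [a0, a0 + a1, a0 + a2, a0 + a1 + a2]
                if pts[-1] <= n and len({coloring[p - 1] for p in pts}) == 1:
                    return pts
    return None
```

For any 2-coloring of a long enough initial segment the search succeeds, as the lemma promises.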
This is clearly a type of Ramsey theorem: It shows that any coloring of a large enough object must have a color class that has some given regularity. They point out, as others have too, that Hilbert’s lemma is probably one of the first “Ramsey-like” theorems. It is interesting, I think, to note that even the great Hilbert missed an opportunity here. He could have invented Ramsey theory many years earlier.
Is there hope to solve the inverse Galois problem? Is every finite group a Galois group over the rationals? Hilbert was beyond brilliant, yet he missed creating the whole field of Ramsey Theory. Are we today missing some large new class of math that is really right in front of us?
Workshop happening this week—anyone can view it live
IAS Weyl bio source
Hermann Weyl was one of the first members of the Institute for Advanced Study in Princeton. He made important contributions to many fields and even more contributions to groups. His work in group representation theory and polynomial invariant theory is being employed in a workshop being held this week at the Institute on “Optimization, Complexity, and Invariant Theory.” Weyl wrote a lovely book called Levels Of Infinity. One of the interesting chapters is on “Why is the world four-dimensional?” Indeed. Today we discuss some topics from the workshop. The talks are free and readers are invited to follow the remaining proceedings via the live link provided by IAS.
We have mentioned Weyl a few times, including telling the story of an interview in which the great physicist Paul Dirac was asked,
Do you ever run across a fellow that even you can’t understand?
Dirac named Weyl.
Most of us in the audience are not physicists or even pure mathematicians but rather computer scientists by trade. We are trying to understand not only Weyl but also bits of Michael Atiyah and Frances Kirwan and David Mumford and Issai Schur. And David Hilbert, who added great nonconstructive power to Arthur Cayley’s invariant theory, then sought to recover it constructively. Plus work by many other mathematicians and physicists, especially quantum physicists. Invariants in physics correspond to real-world quantities like energy, momentum, and charge(-parity(-time reversal)). Thus while invariant theory is new to us it has been part of the IAS since its inception—an invariant of Princeton.
Avi Wigderson is the workshop’s main organizer and he laid out the plan for the week in two sweeping introductory talks on Monday morning. His backbone example is how six seemingly disparate problems in various areas of math and physics and combinatorial optimization are structurally the same problem. He has been part of several joint papers which collectively improve the upper bounds on these and related problems. The notions of resources and the tradeoff of speed and closeness in approximations unite and focus the approaches.
A key concept is how a polynomial (or rational) function $p$ changes under certain applications of invertible matrices $A$ to the input vector $x$. Consider simply $p_A(x) = p(Ax)$. The set of such functions over $A$ belonging to some group $G$ of matrices is the orbit of $p$ under the group and is just a set of polynomials. The polynomial $p$ is invariant under the action by $G$ if $p_A$ is always the same polynomial as $p$, i.e., $p(Ax) = p(x)$ for all $A \in G$. For instance, if $G$ is given by matrices of determinant 1 then the determinant polynomial is an invariant. It is likewise invariant under two-sided actions $X \mapsto AXB$ that also multiply by some appropriate $B$ on the right.
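The invariance is easy to see in a tiny numeric sketch. This is our own illustration, not from the workshop: we check that the determinant polynomial is unchanged under the left action $X \mapsto AX$ by a matrix $A$ of determinant 1.

```python
# Minimal sketch (our own example): the determinant polynomial is invariant
# under the left action X -> AX by a matrix A of determinant 1.

def det2(M):
    """Determinant of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = M
    return a * d - b * c

def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 3], [1, 2]]   # det(A) = 2*2 - 3*1 = 1
X = [[5, 7], [2, 9]]   # an arbitrary input matrix

assert det2(A) == 1
assert det2(matmul2(A, X)) == det2(X)   # the invariance det(AX) = det(X)
```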
Note that if $p$ is a polynomial of degree $d$ with zero constant term then so is each $p_A$. We can identify such polynomials with their vectors of coefficients. It is possible for the orbit of $p$ to have members that are arbitrarily close to the zero vector. Then $p$ is said to belong to the null cone. How hard is it to decide this? The results in the above papers, together with this paper by Peter Bürgisser, Matthias Christandl, Ketan Mulmuley, and Michael Walter, place the problem into $\mathsf{NP} \cap \mathsf{coNP}$. All of them are among the speakers at this workshop.
The $\mathsf{NP}$ side is actuated by the Hilbert-Mumford criterion, which says that when $p$ is in the null cone there is always a subgroup of matrices $A(t)$ with the single parameter $t$ such that $p_{A(t)} \to 0$ as $t \to 0$. The $\mathsf{coNP}$ side uses a newly-amplified duality creating a moment function $\mu$ such that whenever $p$ is not in the null cone there is an $x$ in the orbit closure such that $\mu(x) = 0$. The same complexity classification applies to membership in associated moment polytopes. Thus we have a strain of new and interesting intermediate problems to consider.
Mumford amplified Hilbert’s work on constructivizing his own results into what is now called geometric invariant theory. That in turn underpins Ketan Mulmuley and Milind Sohoni’s geometric complexity theory, and we hasten to note that this is on the docket of today’s talks at 3:45pm. Here is the whole list of talks from this afternoon on:
After lunch from 1:00pm on there will be a discussion of new directions and open problems, including some off-the-cuff short talks.
Abstracts of the talks are on this page, and again they are all being recorded and live-streamed here with free access. The recorded talks and other materials will be available later from the workshop page.
What will result from the cross-pollination of several fields? Already there has been much exciting work.
(This post may be extended with some more examples. Usually our posts are invariant but the timeframe makes this an exception. OK, here is one…)
We can give a small example of the flavor of what Avi and friends are up to. Suppose that we look at a matrix—for example, the following:
The question is: Can one change this matrix into a doubly-stochastic matrix with only re-scaling operations? These operations allow us to change any row (column) to a rescaled row (column). Thus the above matrix can be changed to
for any real $c > 0$. Also one can change any column, so one can next get to
for example. Here is a puzzle: Can we use these re-scaling operations to transform
into a doubly-stochastic matrix? Ken posed this to me and my first thought was no—it is obviously impossible. Do you see why I thought this was true?
But I was wrong, as I often am. The matrix can be transformed to one that gets as close as you like to a doubly-stochastic matrix. Transform it as follows:
becomes
which becomes
This is
Pretty neat.
However, if the original matrix is
then it really is impossible. Even if we zero-out the corner, the fact of two 1s below it (and to its left) will keep us bouncing back between 1/2 and 1.
There is a more fundamental way to express this impossibility. We can regard the matrix as the matrix of a bipartite graph where the positive entries correspond to edges between the partitions and the 0s are non-edges. The relevant fact noted by Marshall Hall is that the scaling is possible if and only if the graph has a perfect matching. For this matrix it does not, so the scaling is impossible. Extensions of this idea of a “Hall blocker” have figured into several talks—and the idea of blockers in general plays into the concept of obstructions as used in geometric complexity theory.
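A few lines of code make the contrast concrete. This is our own sketch of the alternating row/column rescaling (the Sinkhorn iteration) on two hypothetical matrices of our choosing, not the ones above: one whose bipartite graph has a perfect matching, so the scaling converges, and one with a Hall blocker, so it cannot.

```python
def sinkhorn(M, iters=1000):
    """Alternately rescale rows and then columns of a square matrix to sum to 1."""
    M = [row[:] for row in M]
    n = len(M)
    for _ in range(iters):
        for i in range(n):                      # rescale each row to sum 1
            s = sum(M[i])
            M[i] = [x / s for x in M[i]]
        for j in range(n):                      # rescale each column to sum 1
            s = sum(M[i][j] for i in range(n))
            for i in range(n):
                M[i][j] /= s
    return M

def row_deviation(M):
    """How far the row sums are from 1 (columns sum to 1 after the last step)."""
    return max(abs(sum(row) - 1) for row in M)

scalable = [[1.0, 1.0], [0.0, 1.0]]             # support has a perfect matching
blocked  = [[1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0],
            [1.0, 1.0, 1.0]]                    # rows 1 and 2 both need column 1

print(row_deviation(sinkhorn(scalable)))        # tends to 0: scaling succeeds
print(row_deviation(sinkhorn(blocked)))         # stays bounded away from 0
```

The first matrix never becomes exactly doubly stochastic, but the deviation shrinks like $1/k$ with the number of iterations, matching the “as close as you like” behavior; the Hall blocker keeps the second stuck.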
[added note about recordings becoming available afterward]
Study.com source |
Pierre de Fermat was fluent in six languages. Yes, I thought we would talk about Fermat today. Something I did not know about him is: he was fluent in French, Latin, Greek, Italian, Spanish, and Occitan. I am impressed since I never could handle another language, though Ken speaks several. Occitan is a relative of Catalan and some of its constituent dialects may have more-familiar names: Langue d’Oc, Provençal, Gascon, and Limousin.
Today Ken and I thought we would discuss something named for Fermat: Fermat primes. A Fermat prime is any prime of the form $2^m + 1$.
A Fermat prime is any prime of the form $2^{2^n} + 1$.
No, we did not contradict or repeat ourselves. Understanding why $2^m + 1$ can possibly be prime only when $m$ is a power of $2$ is the beginning of learning the language of number theory. Any other $m$ includes an odd prime factor $q$. Put $m = qr$. Then $2^m + 1 = a^q + 1$ where $a = 2^r$ and $a^q + 1 = (a+1)(a^{q-1} - a^{q-2} + \cdots - a + 1)$; note that the factorization uses $q$ being odd. The basic fact is that $a^q + 1$ is always divisible by $a + 1$ (which in this case equals $2^r + 1$), so $2^m + 1$ is not prime.
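The divisibility fact is easy to check by machine. Here is a quick sketch of our own that verifies, for every $m$ up to 200 with an odd prime factor $q$, that $2^{m/q} + 1$ divides $2^m + 1$:

```python
def smallest_odd_factor(m):
    """Smallest odd divisor of m exceeding 1 (necessarily prime), or None."""
    q = 3
    while q <= m:
        if m % q == 0:
            return q
        q += 2
    return None

for m in range(2, 201):
    q = smallest_odd_factor(m)
    if q is None:
        continue                       # m is a power of 2: no factor predicted
    r = m // q
    # with a = 2^r, a + 1 divides a^q + 1 = 2^m + 1 since q is odd
    assert (2**m + 1) % (2**r + 1) == 0
print("verified for m up to 200")
```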
Fermat primes are quite useful, but unfortunately there seem to be only five of them: $3,\ 5,\ 17,\ 257,\ 65537$.
There are applications of Fermat primes that could really use having an infinite family of them. A prime case—bad pun—is the beautiful work of Martin Fürer on fast multiplication of integers.
This has led us to seek larger families that might have similar applications. There are of course generalized Fermat primes, which are defined as primes of the form
$$b^{2^n} + 1$$
for any base $b$. Obviously for $b = 2$ they are the Fermat primes.
These numbers seem to be in relative abundance. Indeed, currently the 13th largest known prime is the generalized Fermat number with and .
Whether there are infinitely many generalized Fermat primes is trivial for $n = 0$ but already for $n = 1$ it subsumes the fourth problem listed by Edmund Landau in his address to the International Congress of Mathematicians in 1912, twelve years after David Hilbert gave his famous list. The problem of whether infinitely many primes are one more than a square traces back at least to Leonhard Euler.
We are, however, interested in special primes amid lists of single-exponential rather than double-exponential density. Marin Mersenne studied primes of the form $2^p - 1$ several decades before Fermat. Eleven of the twelve largest known primes have this form, but again no one knows whether there are infinitely many. So we consider other odd numbers near powers of $2$.
Close to Fermat primes are primes among the numbers of the form
$$2^n + 3.$$
A simple implementation of a probabilistic primality test, such as this in Python by Eli Bendersky, suffices to find them up to or so. The simplicity owes to Python’s arbitrary-precision integer arithmetic. Further out, the list of exponents $n$ giving primes has gaps but also mini-clusters.
This gives an impression of greater abundance. The whole known sequence of such is A057732 at the Online Encyclopedia of Integer Sequences.
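Readers can reproduce the start of A057732 with a few lines of Python. This sketch of ours uses a deterministic small-base Miller-Rabin test (valid for all numbers below about $3.3 \times 10^{24}$, far beyond the range searched here) rather than Bendersky’s probabilistic version:

```python
def is_prime(n):
    """Miller-Rabin with the first 12 prime bases; deterministic for n < 3.3e24."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:           # write n - 1 = d * 2^s with d odd
        d, s = d // 2, s + 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False        # a witnesses that n is composite
    return True

exponents = [n for n in range(1, 100) if is_prime(2**n + 3)]
print(exponents)                # the initial terms of OEIS A057732
```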
The sequence A059242 of $n$ such that $2^n + 5$ is prime starts out . Then there is a big jump to .
There are some $k$ such that $2^n + k$ is always composite. An entertaining talk by Carl Pomerance ascribed to Paul Erdős but this needed to be fixed to . But we are interested concretely in small $k$.
Our desire may ultimately be to have a fixed that gives a sequence of primes for that is not only infinite but has linear density in . That is, we want some fixed as well as such that for any prime in our sequence, there is and such that is prime. This ensures
(ignoring possible low-order exceptions to the second inequality), so that the next item in our sequence is polynomially bounded. Thinking of the -th element of our sequence as , we would get a rough bound
so that the density becomes analogous to that of Fermat numbers. We might allow to be slowly growing in , e.g., .
Here we are thinking of and as “magic primes” that may allow certain algorithmic ideas to succeed. The density condition ensures the existence of some of magnitude polynomial in the complexity parameter(s). If we only need that the length of is polynomial then we might tolerate a weaker density condition such as in the above.
Among values of $k$ we prioritize those of the form $2^j + 1$ so that the binary expansion of $2^n + k$ has only three 1’s. Hence also we are less interested in the extension of Mersenne primes to negative $k$. Here is a table with and :
The clusters of 3 seem to stand out the most. As noted above, there are no cases in this range. Of course, this is a small slice of data.
Are there ways to exploit the fact that there appear to be many almost-Fermat numbers? Of course proving there are an infinite number of them could be very hard.
Cropped from source |
Terrence Howard is an actor and singer who has been in a number of films and TV series. He was nominated for an Academy Award for his role in the movie Hustle & Flow. He currently stars in the TV series Empire.
Today Ken and I want to talk about his claim that $1 \times 1 = 2$.
Apparently he thinks $1 \times 1 = 2$, and has said:
This is the last century that our children will have to be taught that one times one is one.
We all know that $1 \times 1 = 1$, of course. Or does it? Of course it does. Howard, because of his visibility as an actor, has a platform to explain his ideas about arithmetic and mathematics. He claims to have a new theory of arithmetic that will change the world. About his discovery, he said in a Rolling Stone interview:
If Pythagoras was here to see it, he would lose his mind. Einstein too! Tesla!
A pretty far-out claim.
The overwhelming response to his claim has been predictable. Quoting Howard’s justification from the same interview—
How can it equal one? If one times one equals one that means that two is of no value because one times itself has no effect. One times one equals two because the square root of four is two, so what’s the square root of two? Should be one, but we’re told its two, and that cannot be.
—one commenter in a Reddit thread retorted:
No. No we are not. It’s like he hasn’t heard of decimals or something.
And it gets even nastier. He was on The View the other day and I must admit that I saw him there. The View, for those not into daytime TV, is a talk show that was created by Barbara Walters and Bill Geddie. It is on each day and has guests such as Howard on fairly often. Howard explained on the show how . The hosts of The View are, of course, not math experts, so they listened politely to Howard and later thanked him for his comments.
I have no idea if they believed him or not. But I was shocked to hear that someone had seriously suggested that $1$ times anything is not the same thing. Indeed.
So I started to write this post about his claim. My plan was to show that his claim led to a contradiction. This would of course show that he was wrong in his claim. But a funny thing happened—I did not get a contradiction. Let me explain.
My plan was to use the usual rules of arithmetic to show that is wrong. The rules I planned to use were the standard ones:
Commutative Law: $a \times b = b \times a$.
Associative Law: $a \times (b \times c) = (a \times b) \times c$.
Distributive Law: $a \times (b + c) = a \times b + a \times c$.
On this page there is a nice explanation why the distributive law is useful:
You probably use this property without knowing that you are using it. When a group (let’s say 5 of you) order food, and order the same thing (let’s say you each order a hamburger for $3 each and a coke for $1 each), you can compute the bill (without tax) in two ways. You can figure out how much each of you needs to pay and multiply the sum times the number of you. So, you each pay (3 + 1) and then multiply times 5. That’s 5(3 + 1) = 5(4) = 20. Or, you can figure out how much the 5 hamburgers will cost and the 5 cokes and then find the total. That’s 5(3) + 5(1) = 15 + 5 = 20. Either way, the answer is the same, $20.
My only comment is: in what world is a burger $3 and a coke $1? Perhaps at some fast food places, but I live in New York City and this is off by a large factor.
This famous passage involving Humpty Dumpty in Lewis Carroll’s Through the Looking-Glass (1872) applies just as much to mathematics as to words:
“I don’t know what you mean by ‘glory’,” Alice said.
Humpty Dumpty smiled contemptuously. “Of course you don’t—till I tell you. I meant ‘there’s a nice knock-down argument for you!’ ”
“But ‘glory’ doesn’t mean ‘a nice knock-down argument’,” Alice objected.
“When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean—neither more nor less.”
“The question is,” said Alice, “whether you can make words mean so many different things.”
“The question is,” said Humpty Dumpty, “which is to be master—that’s all.”
What I realized is that Howard could be right that $1 \times 1$ is $2$. Really. Let’s agree to use $\star$ for what Howard means by multiplication. So call
$$a \star b$$
the Howard product of $a$ and $b$. Now we could define $\star$ any way we want—like Humpty Dumpty in Lewis Carroll’s world.
Let’s see what happens if $1 \star 1 = 2$. Now
$$1 \star 2 = 1 \star (1 + 1) = 1 \star 1 + 1 \star 1 = 4$$
by the distributive law. Thus $1 \star 2 = 4$. But now
$$2 \star 2 = (1 + 1) \star 2 = 1 \star 2 + 1 \star 2 = 8.$$
Also
$$1 \star 3 = 1 \star (2 + 1) = 1 \star 2 + 1 \star 1 = 6.$$
So $2 \star 2 = 8$ and $1 \star 3 = 6$. In general we get that
$$a \star b = 2ab.$$
Now $1 \star 4 = 1 \star 3 + 1 \star 1 = 8$. Continue in this manner and get the rule for all natural numbers, and then for integers and rationals.
In general the operation $\star$ is commutative, associative, and distributive. For commutativity,
$$a \star b = 2ab$$
and
$$b \star a = 2ba.$$
But this is the same as $2ab$. And likewise for the distributive law. Let’s check the associative law:
$$a \star (b \star c) = a \star (2bc) = 4abc = 2(2ab)c = (a \star b) \star c.$$
The point is that Howard can define product in a new way. His system that has is just fine. It has all the usual basic properties of arithmetic but is really nothing new. Rather than a radical new system of arithmetic his system is just a kind of renormalization of the standard one. It is like changing feet to yards.
Ken adds that Howard’s behaves like a parallel complexity measure for multiplication gates. It maps any ring onto the ideal of even elements in the ring—which of course can be the whole ring. Whether it has comparable utility to the idea of the field with one element is anyone’s guess.
Beyond the basic operations being the “same” there are some differences in Howard’s system. Note that in his system the notion of primes is different from the usual. But I thought I would stop here.
Do you think Howard will agree? Is it more useful to point out that Howard’s product is really just our old friend renormalized than to try and argue that he is wrong? What do you think?
Should we expect simplicity in a theory named for complexity?
Amer. Phy. Soc. interview source |
Sabine Hossenfelder is a physicist at the Frankfurt Institute for Advanced Studies who works on quantum gravity. She is also noted for her BackRe(Action) blog. She has a forthcoming book Lost in Math: How Beauty Leads Physics Astray. Its thesis is that the quest for beauty and simplicity in physics has led to untestable theories and diverted attention from concrete engagements with reality.
Today we wonder whether her ideas can be tested, at least by analogy, in computational complexity.
Her book is slated to appear on June 12. We have not seen an advance copy but the book grew from her past commentaries including this from 2016, this in Nature in 2017, and this last week. The criticism of string theory goes back even before the book and blog Not Even Wrong by Peter Woit of Columbia and the book The Trouble With Physics by Lee Smolin emerged in 2006. We are not trying to join that debate but rather to engage with the general thesis she stated here:
Do we actually have evidence that elegance is a good guide to the laws of nature?
She continues: “The brief answer is no, we have no evidence. … Beautiful ideas sometimes work, sometimes they don’t. It’s just that many physicists prefer to recall the beautiful ideas which did work.” For an example, supersymmetry is beautiful but has gone far toward a definite “doesn’t work” verdict.
In theoretical computing and mathematics we both remember and preserve beautiful ideas that work. But as bloggers looking to the future as she does, we address ideas that have not yet emerged from shells, to help judge which ones to try hatching. Algorithms and complexity have feet planted not just in Platonic reality but in the empirical fact of programs giving correct answers within the time and other constraints we say they will. Hence we have a testbed for how often a-priori beautiful ideas have proved effective and vice-versa.
Certainly the burst of particle physics in the early-mid 20th Century came with unanticipated complexity. We mention one well-known anecdote that, to judge from her index, is not among those in her book: Isidor Rabi won the 1944 Nobel Prize for his discovery of nuclear magnetic resonance, which he used not to treat sports injuries but to discern the magnetic moment and nuclear spin of atoms. When the muon was discovered but appeared to play no role in nuclear interactions, he famously reacted by exclaiming,
Who ordered that?
Muons are ingrained in the physics Standard Model which has much beauty but also has “bolted-on” aspects that those seeking greater beauty seek to supersede. The model is incomplete with regard to gravity and neutrino masses and leaves issues about dark energy and the matter/antimatter imbalance unaddressed.
William of Ockham’s “Razor” is most often quoted as “Entities should not be multiplied beyond what is necessary” in Latin words by John Punch from the early 1600s. Estimating where the bar of “necessary” is set is still an issue. Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred Warmuth in 1987 connected Ockham’s Razor to the complexity of learning, and this was further sharpened by Ming Li, Paul Vitanyi, and John Tromp. Further connections via Kolmogorov complexity and algorithmic probability lead to arguments summarized in a nice survey by Li and Vitanyi with Walter Kirchherr. They quote John von Neumann,
The justification (of a model) is solely and precisely that it is expected to work. … Furthermore, it must satisfy certain aesthetic criteria—that is, in relation to how much it describes, it must be simple.
and continue in their own words:
Of course there are problems with this. Why should a scientist be governed by ‘aesthetic’ criteria? What is meant by ‘simple’? Isn’t such a concept hopelessly subjective?
The answer they seek is that simpler theories have higher probability of having been actuated. This may apply well in large-scale environments such as machine learning and “fieldwork” in biological sciences, in testable ways. Whether it applies on the one scale of one theory for one universe is another matter.
At least we can say that complexity theory proposes grounds for judgment in the physics debate. Hossenfelder seems aware of this, to go by a snippet on page 90 that was highlighted by an early reviewer of her book:
Computational complexity is in principle quantifiable for any theory which can be converted into computer code. We are not computers, however, and therefore computational complexity is not a measure we actually use. The human idea of simplicity is very much based on ease of applicability, which is closely tied to our ability to grasp an idea, hold it in mind, and push it around until a paper falls out.
It hence strikes us as all the more important to reflect on what complexity is like as a theory.
We have three main natural families of complexity classes: deterministic time, nondeterministic time, and space. Up to polynomial equivalence these stand apart and form a short ladder with rungs $\mathsf{L}$, then $\mathsf{P}$ and $\mathsf{NP}$, then $\mathsf{PSPACE}$ and $\mathsf{EXP}$, and finally $\mathsf{EXPSPACE}$, which is exemplified among natural computational problems by the computation of Gröbner bases and the equivalence of regular expressions with squaring.
Complexity theory’s first remarkable discovery is that almost all of the many thousands of much-studied computational problems are quantized into the completeness levels of these classes. The reductions involved are often much finer than their defining criterion of being poly-time or log-space computable. Without question the reductions and quantization into three families are beautiful. Requoting Rabi now:
Who ordered them?
The families intertwine: $\mathsf{L} \subseteq \mathsf{P} \subseteq \mathsf{NP} \subseteq \mathsf{PSPACE} \subseteq \mathsf{EXP} \subseteq \mathsf{EXPSPACE}$.
The problems they quantize are similarly ordered by reductions. Thus we can extend Rabi with a pun:
Who totally ordered them?
Yet whether these classes are all distinct has escaped proof. The belief they are distinct is founded not on elegance but on myriad person-years of trying to solve these problems.
Stronger separation conjectures such as Unique Games and (S)ETH, however, seem to be hailed as much for explanatory power as for solid evidence. As a cautionary coda to how we have blogged about both, we note that the former’s bounds were shaved in exactly the range of exponential time bounds that the latter hypotheses rely on for their force.
What is also like the situation in physics is a disconnect between (i) how complexity theory is usually defined via asymptotic time and space measures and (ii) concrete real-world feasibility of algorithms, aspects of which we have flagged. This also infects the reliance on unproven assumptions in crypto, which has been remarked by many and may be unavoidable. In crypto, at least, there is vast experience with attempts to break the conditionally-secure systems, a check we don’t see how to have with published algorithms.
Rather than shrink from the physics analogy, we want to test it by going even further with conjectures and comparing their ramifications for theory-building. Here is the first:
Every “reasonable” complexity class is equal to a member of one of the three main families.
Note that some of the big surprises in complexity theory went in the direction of this conjecture. The result that $\mathsf{IP} = \mathsf{PSPACE}$ is a perfect example. Also the closure of nondeterministic space under complement shows we only need one space family and do not need a separate complementary counterpart. Short of specifying exactly which of the several hundred classes in the complexity “Zoo” are “reasonable,” we note that many of its classes are reasonable and such that the equality of any one of them to one of the basic time or space classes would be a huge result. For classes like linear time or space or like exponential time that are not polynomially closed we still get equality to a basic time or space class.
Our second conjecture might be called “superreducibility”:
For every two “natural” computational problems $A$ and $B$, either $A$ reduces to $B$ or $B$ reduces to $A$.
This is roughly entailed by the first conjecture since the three families are totally ordered. It may be viable for finer reductions that collapse complements such as polynomial-time one-query reducibility. It is however false without the “natural” qualifier: whenever $A$ reduces to $B$ but $B$ does not reduce back to $A$, there are infinitely many pairwise-incomparable languages between $A$ and $B$. We wonder whether one can formulate an opposite of the “gappiness” property used to prove this theorem in order to make the second conjecture more formal.
Combined time-space classes for different pairs may furnish exceptions to both conjectures, but how natural? Eric Allender noted to us that has the first-order theory of the reals with addition as a natural complete problem, as shown by Leonard Berman. It fits between and but equality to either would surprise. It preserves the total order conjecture, however. Closer to home are “intermediate” problems within or in the realm of or or others. We surveyed work by Eric and others that gives some of these problems greater relatability under randomized Turing reductions but less likelihood of hardness. Notwithstanding these issues, we feel it will take a stronger principle to deflate the guidance value of these conjectures.
If we had a choice in building complexity theory, would we build it like this? Should we invest effort to simplify the theory? Is there a model that improves on the Turing machine? Are there theories within computational complexity for which lack of beauty inhibits their development? For one example, Dick and I started a theory of “progressive” algorithms but ran into uglefactions.
The clearest example for our thoughts about theory-building may be Kolmogorov complexity (KC) itself. It is the most direct effort to quantify information. If there is any place where we should expect a simple theory with unique concrete answers and radiant beauty, this is it.
Much as we love and apply the subject, we do not get that tingling feeling. First, the metric by which it is quantified—a universal Turing machine (UTM) $U$—is an arbitrary choice. Any UTM has equal footing in the theory as it stands. The difference made by choice of $U$ is just an additive shift related to the size of $U$, and the theory is invariant under such shifts. But if you want to know about concrete KC there are strenuous efforts to make.
Second, there are multiple basic definitions, starting with whether the set of code strings needs to be prefix-free. No clear winner has emerged from the original competing proposals.
Third, the metric is uncomputable. Proposals for approximating it by feasible KC notions have only multiplied the entities further. One can base them on automata that have computable decision properties but then there are as many notions as automata models. I (Ken) mentioned here a conversation last year among several principals in this line of work that did not radiate satisfaction about concreteness.
Fourth, these issues complicate the notation. Is it $C(x)$ or $K(x)$ or $K_U(x)$—conditioned by default or not on the length of $x$ and the like—and are we dropping or keeping additive constants and (log-)logs?
We note a new paper on making the KC theory more “empirical” that may help clean things up. But in the meantime, we cannot deny its importance and success. Our point is that the above marks of ugliness are a brute fact of reality, and any attempt at a more beautiful theory of strings would fly in the face of them.
In what ways might the quest for beauty and simplicity in complexity theory be necessarily compromised? What do you think of our conjectures: like them? refute them?
Triangulating proofs to seek a shorter path
Cropped from 2016 Newsday source |
Mehtaab Sawhney is an undergraduate student at MIT. His work caught my eye on finding his recent paper with David Stoner about permutations that map all three-term arithmetic progressions mod $n$ to non-progressions. Here a progression is an ordered triple $(a, b, c)$ where $b - a \equiv c - b \pmod{n}$. The paper addresses when such permutations can be found in certain small subgroups of the symmetric group while I am interested in senses by which they are succinct. This made me curious about Sawhney’s other work.
Today Ken and I wish to report on Sawhney’s simple new proof of the famous triangle inequality in $\mathbb{R}^n$.
Sawhney presents his new proof in a short note which has just appeared on p. 218 of last month’s issue of The College Mathematics Journal:
An Unusual Proof of the Triangle Inequality.
Summary: A standard proof of triangle inequality requires using Cauchy-Schwarz inequality. The proof here bypasses such tools by instead relying on expectations.
Recall that the triangle inequality for the Euclidean norm on $n$ dimensions says that for any vectors $x$ and $y$,
$$\|x + y\| \le \|x\| + \|y\|.$$
Here as usual the norm of a vector $x$ is
$$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$
The “standard proof” he refers to is represented by this one taken from Wikipedia’s triangle inequality article:
Here Cauchy-Schwarz is used to obtain line 4. Now Cauchy-Schwarz also requires a few lines to prove—indeed one could write a book about it. It feels like the combined proof is tracing two sides of a triangle, when there ought to be a shorter and direct third side. That is what Sawhney offers.
He can prove the triangle inequality in $n$ dimensions by only using the trivial one-dimensional version. That is the fact that for the absolute value
$$|a + b| \le |a| + |b|,$$
where $a$ and $b$ are real numbers. Well, it needs the notion of mathematical expected value $\mathbb{E}$. This is formally defined via integration on the unit sphere. But he really only needs that expected value obeys some simple properties that one could say are “expected”: additivity, linearity, and ability to manipulate its argument. The one special property is that the norm of a vector $v$ in $\mathbb{R}^n$ scales as the expected value of its inner product with a random unit vector $u$. Formally:
$$\mathbb{E}_u\big[\,|\langle v, u \rangle|\,\big] = c_n \|v\|,$$
where $c_n$ is a fixed nonzero constant. For $n = 2$ one takes the integral of $|\cos \theta|$ over the circle, which is $4$, and divides it by $2\pi$ to make an average, so $c_2 = 2/\pi$. The values for higher $n$ are different but the difference doesn’t matter, only that $c_n$ is fixed and nonzero. The rest of the proof needs only the 1-dimensional triangle inequality to go from line 1 to line 2:
$$\begin{aligned}
\|x + y\| &= c_n^{-1}\,\mathbb{E}_u\big[\,|\langle x, u\rangle + \langle y, u\rangle|\,\big]\\
&\le c_n^{-1}\,\mathbb{E}_u\big[\,|\langle x, u\rangle|\,\big] + c_n^{-1}\,\mathbb{E}_u\big[\,|\langle y, u\rangle|\,\big]\\
&= \|x\| + \|y\|.
\end{aligned}$$
Pretty neat. No?
I must say that I was quite surprised to see a radically different proof that did not use Cauchy-Schwarz or some equivalent inequality.
Sawhney’s proof is also one a computer theorist could have found. The idea that he relies on is quite neat: the norm of a vector can be computed by taking random projections. This is not elementary but it is intuitive. It is something that “we all know”—yet we did not make the connection to the triangle inequality. This is another example of the power of expectation as a concept.
As Sawhney remarks in his note, the Cauchy-Schwarz inequality can be proved from the triangle inequality by reversing the flow of the proof cited above from Wikipedia, hence it can now be derived via his proof. But that again would be taking two sides of a triangle, while Cauchy-Schwarz has a direct proof. What’s nice is that now both inequalities have a proof that doesn’t reference the other and have a nice bridge between them.
The obvious open problem is: can we use a similar randomness trick to prove other inequalities? Perhaps there are new proofs to be discovered; perhaps there are open inequalities that can be attacked by this method.
[fixed absolute value bars]
Great Discoveries in STEM source |
Claude Bachet de Méziriac was a French mathematician of the early 1600s. He is the first person we know to have posed and solved the problem, given relatively prime (also called coprime) integers $a$ and $b$, of finding integers $x$ and $y$ such that $ax + by = 1$.
Today we revisit some questions about generating coprime pairs deterministically.
Étienne Bézout later observed that for all cases of $g = \gcd(a, b)$ there are integers $x, y$ such that $ax + by = g$. The convention of naming this identity for Bézout now extends to the $g = 1$ case, so that $x$ and $y$ are called the Bézout coefficients of $a$ and $b$. If we count $1$ and $-1$ as being coprime with every integer (including zero) then $a, b$ are coprime if and only if $\gcd(a, b) = 1$, making $x$ and $y$ exist.
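Bachet’s method is what we now call the extended Euclidean algorithm, which produces the Bézout coefficients along with the gcd. A minimal sketch of ours in Python:

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), via extended Euclid."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = egcd(b, a % b)
    # from b*x + (a mod b)*y = g, regroup to get coefficients for (a, b)
    return (g, y, x - (a // b) * y)

g, x, y = egcd(240, 46)
print(g, x, y)                 # g = 2 and 240*x + 46*y = 2
assert 240 * x + 46 * y == g
```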
Bachet de Méziriac is known mostly for two books. One in 1612 was a compilation whose title translates to Pleasant and Delectable Problems Fashioned By Numbers and which inspired subsequent books on mathematical recreations. The other was his translation of the Arithmetica of Diophantus from Greek into Latin. It made a marginal contribution to number theory: it contributed the margin in which Pierre Fermat wrote the statement of his famous theorem.
Say we are given a fairly large integer $n$ and wish to find coprime pairs near it. It is of course easier to find them by random guessing than to find primes. The primes up to $n$ have density only about $1/\ln n$, but the chance of finding a coprime pair approaches about 61%. More exactly it approaches $6/\pi^2$, which is the reciprocal of $\zeta(2) = \sum_{m=1}^{\infty} \frac{1}{m^2} = \frac{\pi^2}{6}$.
The connection extends to other values of the zeta function. Let us take the naturally-extended Bachet condition to be the definition of integers $a_1, \dots, a_k$ being relatively prime: there exist integers $c_1, \dots, c_k$ such that $c_1 a_1 + \cdots + c_k a_k = 1$. The probability that drawing each $a_i$ from $\{1, \dots, N\}$ uniformly at random (with replacement) satisfies this condition approaches $1/\zeta(k)$ as $N \to \infty$.
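A quick brute-force check of these limits for $k = 2$ and $k = 3$ (our own illustration; the cutoffs $N$ are arbitrary):

```python
from functools import reduce
from itertools import product
from math import gcd, pi

def bachet_fraction(k, N):
    """Exact fraction of k-tuples from {1,...,N} with gcd 1, equivalently
    satisfying Bachet's condition c1*a1 + ... + ck*ak = 1 for some integers ci."""
    hits = sum(1 for t in product(range(1, N + 1), repeat=k)
               if reduce(gcd, t) == 1)
    return hits / N**k

print(bachet_fraction(2, 200), 6 / pi**2)   # both close to 0.608 = 1/zeta(2)
print(bachet_fraction(3, 60))               # close to 0.832 = 1/zeta(3)
```

The convergence is quite fast, which is why random guessing for coprime tuples is so cheap in practice.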
We can use Bachet's condition to extend the logic to $k = 1$. We might have no idea what it should mean for one number to be "coprime," but $c_1 a_1 = 1$ is satisfiable in integers only when $a_1 = 1$ and $c_1 = 1$ or $a_1 = c_1 = -1$. The probability of drawing $a_1 = 1$ from $\{1, \dots, N\}$ goes to zero, which is consonant with the series for $\zeta(1)$ diverging to $+\infty$.
For $k = 0$ it is less clear whether to apply the convention that an empty sum is zero. This would be consonant with the literal reading of "$\zeta(0)$" as $1 + 1 + 1 + \cdots$ but not with the analytic continuation $\zeta(0) = -\frac{1}{2}$. The latter would give $-2$ as a "probability." We wonder idly whether Bachet's condition helps toward a liberalized interpretation that would further give a sensible relation to $1/\zeta(s)$ for fractional real values of $s$ and then to complex ones.
But we digress. We want to generate coprime pairs efficiently and deterministically without any guessing.
The PolyMath8 project on "Bounded gaps between primes" generated many interesting sub-projects. Some of them address the issue that if $n$ is in the middle of a large gap between primes then simple search up or down will fail. We have blogged about the discovery that searches for primes to use in RSA keys follow the same trajectory to find the same primes so often that simple gcd computations break many keys.
There are various ways to weaken the problem. We can ask to find an integer near $n$ that is a prime power. We can ask to find, say, a set of 100 numbers near $n$ such that at least 67 of them are prime. Note that we do not allow randomness in the solution that finds these numbers.
In general we want to improve randomized assertions of the form, "there is probability at least $p$ of finding $x$ with property $P$," to "we build a relatively small set $S$ of which at least a $p$ fraction of the members have property $P$." We can then decide whether drawing randomly from $S$ or searching through $S$ is a better policy, informed by factors such as the relative cost of testing $P$.
We have before talked about the W-trick of number theory. It maps $n \mapsto Wn + b$, where $W$ is the product of all primes below a small threshold and $b$ is coprime to $W$ (often just $b = 1$). The inverse image of the primes under this map is free of biases modulo the small primes dividing $W$.
This comment by Ben Green neatly expresses the analytical motivation, as does section 4 of this paper. The freedom from bias simplifies reasoning about the distribution of primes, while the map preserves arithmetic progressions. We are interested in using it and similar tricks to create sets $S$ as above, or sets with many coprime pairs, in conjunction with other assumptions.
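A minimal sketch of the map (our own illustration, with threshold $10$ and the common choice $b = 1$):

```python
from math import gcd

# W-trick sketch: W is the product of all primes below a small threshold,
# here 10, and b must be coprime to W; b = 1 is the common choice.
W = 2 * 3 * 5 * 7   # 210
b = 1

def lift(n):
    """The W-trick map n -> W*n + b."""
    return W * n + b

# Every lifted value is automatically coprime to W, so primes in the image
# show no bias modulo 2, 3, 5, or 7.
assert all(gcd(lift(n), W) == 1 for n in range(1, 1000))

# The map sends arithmetic progressions to arithmetic progressions:
# n, n + d, n + 2d, ... go to values with common difference W*d.
assert lift(5) - lift(2) == 3 * W
```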
Here is our first new situation and problem. Suppose that we have two boxes that secretly contain the numbers $x$ and $y$. We want to make them into co-prime numbers. We are allowed to map $x$ to $x + a$ and map $y$ to $y + b$. But we cannot see the values of $x$ and $y$.
This is not easy, it seems. We can weaken the goal a bit: give a series of translations $(a_1, b_1), (a_2, b_2), \dots$ so that for most of them the numbers map to co-prime numbers. Note we do not allow randomness in the solution.
However, suppose that we can compute the sum $s = x + y$ exactly. Then there is a method for solving this problem that is quite simple:
Pick a fixed prime $p > s$. Then find $a$ and $b$ so that $s + a + b = p$. Add $a$ to $x$ and add $b$ to $y$. This makes them co-prime.
The reason is simple. Note the two numbers are now $x + a$ and $y + b$, and they sum to $s + a + b = p$.
Let $d > 1$ divide both of these numbers. Then $d$ must divide their sum, which is the prime $p$, so $d = p$. But both numbers are positive (even if the original $x$ and $y$ were both zero, taking $a, b \geq 1$ keeps the shifted values positive) and they sum to $p$, so each is strictly less than $p$ and cannot be divisible by $p$. Hence no such $d$ exists and the numbers are co-prime. So we can see now that if we move $x$ and $y$ to $x + a$ and $y + b$, then for most choices of $a$ and $b$ the numbers are co-prime, and they are always co-prime if we select $a + b = p - s$.
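Here is the method as a short sketch (the helper names are ours). It looks only at the sum $s$, never at $x$ or $y$ themselves:

```python
from math import gcd

def is_prime(m):
    """Trial division; fine for small demonstrations."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def coprime_shifts(s):
    """Given only s = x + y, return (a, b) with a, b >= 1 so that
    (x + a) + (y + b) is a prime p, forcing gcd(x + a, y + b) = 1."""
    p = s + 3                 # want p - s >= 3 so both shifts can be >= 1
    while not is_prime(p):
        p += 1
    return 1, p - s - 1       # any split of p - s into positive parts works

x, y = 6, 10                  # hidden values; used here only to verify the claim
a, b = coprime_shifts(x + y)
assert gcd(x + a, y + b) == 1
```

The common-divisor argument above guarantees success for every split, so the choice $a = 1$ is arbitrary.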
In our second situation, we are implicitly given a pair of numbers $(x, y)$ whose values we cannot determine. We can, however, get some partial information about them and can change them to $(x', y')$ in a certain controlled fashion. This still may not be enough to guarantee that $x'$ and $y'$ are coprime.
So we apply our weakened goal: We want to generate a set $S$ of pairs such that at least two-thirds of the pairs in $S$ are co-prime. We want the method of generating the pairs to be easy to compute and deterministic. We will do this with $|S| = 3$ but hint at what is needed to expand to larger sets $S$.
Definition 1 The "2-of-3 model" problem is the following. We assume that we have two natural numbers $x$ and $y$, but we have no idea what their values are. We do know the following:
- The value of an even number $W$, which is fixed.
- The value of the sum $s = x + y$.
- There is $u$ so that $x = uW + 1$ and a $v$ so that $y = vW$.
Finally we can replace $(x, y)$ by $(x + aW, y + bW)$ for integers $a, b \geq 0$. The goal is to find pairs
$$(a_i, b_i)$$
for $i = 1, 2, 3$ so that for at least two values of $i$ the values
$$x + a_i W \quad \text{and} \quad y + b_i W$$
are co-prime. Moreover, we want to be able to find the values $(a_i, b_i)$ in polynomial time.
We’ve chosen some specific constants and terms but the pattern is meant to be generalizable. For the above settings we prove:
Theorem 2 There is a polynomial time algorithm that solves the 2-of-3 model problem.
Proof: Find a prime $p$ so that $p > s$ and $p - s$ is divisible by $W$. This is possible since there are primes in such an arithmetic progression: note that $s \equiv 1 \pmod{W}$, so $s$ is coprime to $W$ and Dirichlet's theorem applies to $s, s + W, s + 2W, \dots$ Now we claim that there is a $t$ so that
$$s + tW = p.$$
Moreover we can find this $t$ in polynomial time. Note that $t$ exists since the equation
$$tW = p - s$$
is equivalent to $t = (p - s)/W$. But $p - s$ is divisible by $W$ by the choice of $p$. Thus $t$ exists and is easy to compute. Now let $x'$ and $y'$ be so that
$$x' = x + aW, \quad y' = y + bW, \quad a + b = t.$$
Suppose that $x'$ and $y'$ have a prime $q$ that divides both. Clearly $q$ must be odd, and thus, since it must divide $x' + y' = p$, we get $q = p$. But for most splits of $t$ as $a + b$ we get that this is impossible, since then both $x'$ and $y'$ are positive and strictly less than $p$. In particular at least two of the three splits work.
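Under this reading of the model (only $W$ and $s$ are visible, with $\gcd(s, W) = 1$), the proof's algorithm can be sketched as follows. The demo values of $W$, $x$, and $y$ are our own choices:

```python
from math import gcd

def is_prime(m):
    """Trial division; fine for small demonstrations."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def two_of_three(W, s):
    """Sketch of the proof's algorithm: knowing only W (even) and s = x + y
    with gcd(s, W) = 1, return three shift pairs (a_i, b_i) such that
    (x + a_i*W, y + b_i*W) is coprime for at least two of the three i."""
    p = s + W                      # search the progression s + W, s + 2W, ...
    while not is_prime(p):         # Dirichlet guarantees a prime appears
        p += W
    t = (p - s) // W               # now x + aW + y + bW = p whenever a + b = t
    return [(a, t - a) for a in (0, 1, t)]   # three splits of t

# Demo with secret x, y consistent with our reading (x = 1 mod W, y = 0 mod W).
W = 30
x, y = 1 + 7 * W, 4 * W            # x = 211, y = 120, s = 331
shifts = two_of_three(W, x + y)
good = sum(1 for a, b in shifts if gcd(x + a * W, y + b * W) == 1)
assert good >= 2
```

Since every split makes the pair sum to the prime $p$, the only failures come from degenerate splits, which is why two of three always survive.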
What more can be said about these weaker problems of generating primes and coprime pairs?
The above results are by Dick, and this post marks his return after his heart surgery six weeks ago.
The upcoming "Golden STOC" in Los Angeles will be wrapped in a 5-day TheoryFest along the lines of last year's. It will include an inaugural meeting of the TCS Women initiative; information on travel scholarships to the TheoryFest for female graduate students is available. We have noticed some recent revived interest in our post a year ago on gender bias, and there is also our theme that CS has seen over its history numerous "Absolute Firsts" for women: "first X" rather than "first woman X."