Anna Gilbert and Atri Rudra are top theorists who are well known for their work in unraveling secrets of computation. They are experts on anything to do with coding theory—see this for a book draft by Atri with Venkatesan Guruswami and Madhu Sudan called Essential Coding Theory. They also do great theory research involving not only linear algebra but also much non-linear algebra of continuous functions and approximate numerical methods.
Today we want to focus on a recent piece of research they have done that is different from their usual work: It contains no proofs, no conjectures, nor even any mathematical symbols.
Their new working paper is titled, “Teaching Theory in the time of Data Science/Big Data.” As you might guess it is about the role of theory in the education of computer scientists today. The paper contains much information that they have collected on what is being taught at some of the top departments in computer science, and how the current immense interest in Big Data is affecting classic theory courses.
A short overview of what they find is that these trends are leading to pressure to delete and/or modify theory courses. From Atri’s CS viewpoint and Anna’s as Mathematics faculty active in the theory community, both wish to see CS majors obtain degrees that leave them well versed in CS in general and theory in particular. Undergraduates in programs with a CS component should likewise be well served in formal and mathematical areas. Is this possible given the finite constraints on the curriculums? It is not clear, but their paper shows what is happening right now with theory courses (plus linear algebra and probability/statistics), what is being planned for the near future, and some options that may be useful to consider.
§
For the purpose of this post, we made some edits to their text which follows, with their permission. Some changes were stylistic and some more content-oriented. Their PDF version linked as above may evolve over time—especially upon success of their appeal for reader input at the end. So to obtain a complete and current picture please visit their paper too.
Now Anna and Atri speak:
The genesis of this article is a conversation between the two authors that started six weeks ago. One of us (Anna) was giving a talk at an NSF workshop on Theoretical Foundations of Data Science (TFoDS) and the other (Atri) was thinking about changes to the Computer Science (henceforth CS) curriculum that his department at the University at Buffalo is considering. Anna’s talk at NSF, which included data on theory courses at top ranked schools, generated a great deal of interest in knowing even more about the state of theory courses. This was followed by more data collection on our part.
This post is meant as a starting point of discussion on how we teach theory courses, especially in the light of the increased importance of data science. It is not a position paper—it does not argue that the current trends are inherently good or bad, nor does it prescribe any silver bullet. We do suggest some possible courses of action around which discussion can begin.
CS enrollments as well as the numbers of CS majors have increased exponentially in the last few years. In 2014, Ed Lazowska, Eric Roberts, and Jim Kurose exhibited the trend in the former—overall course enrollments, not only majors. Their graphs in Figure 1 show the trend in introductory CS course enrollments at six institutions in the years 2006–2014.
Figure 1. Enrollment trends in introductory CS sequences at six institutions (Stanford, MIT, University of Pennsylvania, Harvard, University of Michigan, and University of Washington) from 2006–2014.
Lazowska’s presentation has more detailed statistics and a discussion of the potential implications of these increases. These trends remain valid in 2016, for example as shown by the following chart for the University at Buffalo. In addition to total number of CSE majors, it shows the enrollment in CSE 115 (the introduction to CSE course), CSE 191 (Discrete Math), CSE 250 (Data Structures), CSE 331 (Algorithms) and CSE 396 (Theory of Computation), all of which are required of all CS majors:
Figure 2. Enrollment trends, University at Buffalo CSE 8/08–5/16, with total majors.
As enrollments out-pace hiring, class sizes have exploded. Lazowska points out that over 10% of Princeton’s majors are CS majors, while it is highly unlikely that 10% of Princeton’s faculty will ever be CS faculty. At the same time, many institutions are re-evaluating and changing their theoretical computer science (henceforth TCS) course requirements and content.
The twin pressures of staffing and content are shifting priorities in both the material covered and how it is covered—e.g., reducing emphasis on proofs and essay-type problems which are harder to grade. We are not judging these shifts or tying them directly to enrollments, but are for now observing that they are happening and impact a large (and increasing) number of students.
The changes in course content, in emphasis on particular TCS components, and in overall CS requirements (including mathematics and statistics) are occurring exactly when there is a big move towards “computational thinking” in many fields and a national emphasis on STEM education more broadly. Not only are the fundamental backgrounds of incoming CS majors thereby changing, but the CS audience is expanding to students in other fields that are benefiting from solid computational foundations. With the increasing role of data and concomitant needs for machine learning and statistics, it is important to obtain a deep understanding of the mathematical foundations of data science. Traditional TCS has been founded on discrete mathematics, but “continuous” math—especially as related to statistics, probability, and linear algebra—is increasingly important in ways also reflected by cutting-edge TCS research.
We considered the top 20 CS schools according to the US News ranking of graduate programs, numbering 24 including ties. It may be inappropriate to use the graduate program rankings to consider undergraduate program requirements, and it should be noted that the rankings cover the whole graduate program, not just TCS, but this is a reasonable starting point. We sent colleagues a short survey and collected data (available in a spreadsheet) on these 24 schools. Since several schools include Engineering either within one department, as at Buffalo, or as a separate department, as at Michigan, we will use “CSE” as the collective term.
We counted the total number of theory courses that all CS majors have to take within the CSE department and then calculated the fraction over the total number of required courses. We categorized the theory courses under these bins:
The bounds are not sharp—a Data Structures course always covers algorithms associated to the data structures and may overlap with an Algorithms course especially when graphs are covered—and Algorithms often includes some complexity theory, especially NP-completeness. In our spreadsheet these columns are followed by the number of theory electives—besides these required courses—that all CS majors have to take. We would like to clarify four things:
We begin with statistics on the total number of semesters of theory courses that are currently required of all CS majors, standardly equating 3 quarters or trimesters to 2 semesters. The basic statistics are in Table 1.
The median number of semester-long courses was three. All but one school requires a discrete math course, all but two require a Data Structures course, and all but nine require an Algorithms course. Eight schools require a Theory of Computation course separate from Algorithms. All these schools have a significant programming component in their Data Structures course. Only one, Cornell, currently adds programming assignments in the required algorithms course. We would like to remind the reader that we are only considering TCS courses required of all CS majors—for instance, CS 124/125 at Harvard has programming assignments but is not required of all CS majors.
We next considered whether courses in Probability/Statistics and/or Linear Algebra are required of all CS majors, even when taught outside of CSE. We focus on these two courses since they are most relevant to data science.
Probability/Statistics. Of those surveyed, nineteen schools required a Probability/Statistics course, while five did not. Five had developed a specific required course within the CSE department (Stanford, Berkeley, UIUC, Univ. of Washington, and MIT), three had choices among courses both inside and outside the CSE department, and eleven required a course outside CSE. Of the five institutions that did not require a Probability/Statistics course, two (Univ. of Wisconsin and Harvard) listed such a course among electives in Mathematics. Princeton, Yale, and Brown do not list such a course.
Linear Algebra. Sixteen surveyed schools require a Linear Algebra course, out of 24 total. Of the 16, only Brown and Columbia provide a linear algebra course within CSE that satisfies the requirement, though both allow for non-CSE linear algebra courses.
After reflecting on the data in relation to our initial observations about increasing CS enrollments and emphasis on computational thinking across disciplines, we dug deeper and asked people further questions about changes they have seen or are discussing at their institutions. Of eight departments responding (as of 6/10/16):
Four universities changed their Mathematics requirements in the last 10 years. These changes are primarily to require fewer semesters of Calculus II or III (e.g., some no longer require Ordinary Differential Equations) and, instead, require Linear Algebra and/or Probability/Statistics (whether inside the CSE department or not). Two institutions plan to make changes in the future, likely to require Linear Algebra.
We suggest that now is the time to re-think some of the theory curriculum, to work with our colleagues in Mathematics and Statistics, and to develop mathematical foundations classes that are appropriate both for CS majors and STEM majors more broadly. Especially for CS majors, this exposure should come no later than junior year. Here are some starting points for this discussion.
Our goal is to educate the different students at our respective institutions as best we can, by working with our colleagues at our home institutes and by having a dialogue with our theory colleagues across the country.
After sending emails initially to friends in our social networks to gather data and/or supplement the above preliminary analysis, we noted that we had asked only three women total. We then mused on how we could have increased that number by thinking a bit harder about which women were in our social network and whether the institutions we collected figures for had women theorists. We found that, upon reflection, we could have asked eight more women in our social networks, for a total of 11 women theorists, each at a different school, among the top 24 institutions. There are certainly more than 11 institutions with women theorists but either the women faculty are in areas we are not familiar with or they are women in our areas whom we do not yet know personally (e.g., new, junior faculty). In other words, a ten-minute reflection yielded an almost four-fold increase in representatives from an under-represented group.
We recognize that our sample covers only 24 top institutions. This was done mostly to reduce work on our part since the first data was collected by reading the relevant curricula webpages. Needless to say, a better picture of TCS and math requirements for CS degrees in schools in the US can be gained with more data. We are hoping that readers of this blog at many more institutions can make valuable contributions to our data collection and discussion. Those of you interested can contribute your institution’s information to this survey by filling in a Google form. We will periodically update the master spreadsheet with information that we get from this Google form.
We join Anna and Atri in their appeal which ends their paper: the destiny of theory courses can be considered as one large “open problem.” They conclude by thanking those who have already contributed data and others at Michigan and Buffalo and Georgia Tech (besides us) and MIT for inputs to their article.
We have a few remarks of our own: The main ulterior purpose of theory courses is to sharpen analytical modes of thinking and linear deductive argument, among skills often lumped into the general term “mathematical maturity.” The Internet and advances in technology have brought greater and quicker rewards for non-linear, associative, and more-visual modes. These might seem to compete with or even replace “theory,” but the point behind Anna and Atri’s post is that while diffused among more courses in various areas, the need for analytical and linear-deductive experience grows overall.
What emerges is a greater call for mathematical maturity before capstone courses in these areas, as opposed to the view that a required theory course can be taken in the senior year. Shifting TCS material into an early discrete mathematics course may accomplish this. As we have discussed in Buffalo, this could accompany an across-the-board upgrade in rigor of our entry curriculum, but that may discourage some types of students. That in turn might slow increased enrollments—amid several feedback loops whose consequences are an open problem.
[clarified in Buffalo figure that “Total” means majors.]
Ernie Croot, Vsevolod Lev, and Péter Pach (CLP) found a new application of polynomials last month. They proved that every subset of $\mathbb{Z}_4^n$ of size at least $4^{0.926n}$ has three distinct elements $x, y, z$ such that $x + z = 2y$. Jordan Ellenberg and Dion Gijswijt extended this to $\mathbb{Z}_q^n$ for prime powers $q$. Previous bounds had the form $q^n/n^{1+\epsilon}$ at best. Our friend Gil Kalai and others observed impacts on other mathematical problems including conjectures about sizes of sunflowers.
Today we congratulate them—Croot is a colleague of Dick’s in Mathematics at Georgia Tech—and wonder what the breakthroughs involving polynomials might mean for complexity theory.
What’s amazing is that the above papers are so short, including a new advance by Ellenberg that is just 2 pages. In his own post on the results, Tim Gowers muses:
[The CLP argument presents a stiff challenge to my view that] mathematical ideas always result from a fairly systematic process—and that the opposite impression, that some ideas are incredible bolts from the blue that require “genius” or “sudden inspiration” to find, is an illusion that results from the way mathematicians present their proofs after they have discovered them. …[T]he argument has a magic quality that leaves one wondering how on earth anybody thought of it.
We don’t know if we can explain the source of the ‘magic’ but we will try to describe it in a way that might help apply it.
At top level there is no more sleight-of-hand than a simple trick about matrix rank. We discussed ideas of rank some time ago.
If a matrix $M$ is a sum of $m$ matrices each of rank at most $r$, then any condition that would force $M$ to have rank greater than $mr$ must be false.
A simple case is where the condition zeroes every off-diagonal element of $M$. Then the main diagonal can have at most $mr$ nonzero entries. This actually gets applied in the papers. The fact that column rank equals row rank also helps for intuition, as Peter Cameron remarks.
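As a quick sanity check, the rank trick is easy to verify numerically. Here is a small NumPy sketch of ours; the sizes n, m, and r are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build M as a sum of m matrices, each of rank at most r.
n, m, r = 12, 3, 2
M = np.zeros((n, n))
for _ in range(m):
    # A product of an (n x r) and an (r x n) matrix has rank at most r.
    A = rng.integers(-3, 4, size=(n, r)).astype(float)
    B = rng.integers(-3, 4, size=(r, n)).astype(float)
    M += A @ B

# Rank is subadditive: rank(M) <= m*r = 6, far below the full rank n = 12,
# so any condition forcing rank(M) > 6 is impossible for such an M.
print(np.linalg.matrix_rank(M))
```

In the diagonal case used by the papers, the same bound caps the number of nonzero diagonal entries.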
A second trick might be called “degree-halving”: Suppose you have a polynomial $P$ of degree $d$. Even if $P$ is irreducible, $P$ might be approximated or at least “subsumed” term-wise by a degree-$d$ product $p\,q$. When $P$ is multi-linear, or at least of bounded degree in each variable—call this bound $b$—we may get a product $p\,q$ that is close to $P$.
In any case, at least one of $p, q$ must have degree at most $d/2$, say $q$. If we can treat $p$ and its variables as parameters, maybe even substitute them by well-chosen constants, then we are down to $q$ of degree at most $d/2$. Then $q$ is a sum of terms each having a monomial of total degree at most $d/2$ in variables each with power at most $b$.
The number of such monomials is relatively small. This limits the dimension of spaces spanned by such polynomials, which may in turn connect to the rank bound above and/or limit the size of exceptional subsets of the whole space $\mathbb{Z}_q^n$. We discussed Roman Smolensky’s famous use of the degree-halving trick in circuit complexity here.
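To get a feel for why halving the degree pays, one can count monomials directly. A small Python sketch of ours; the parameters are illustrative, not taken from the papers:

```python
from functools import lru_cache

def count_monomials(n, d, b):
    """Monomials x1^e1 * ... * xn^en with e1+...+en <= d and each ei <= b."""
    @lru_cache(maxsize=None)
    def ways(vars_left, degree_left):
        if vars_left == 0:
            return 1  # empty product; any unused degree budget is fine
        return sum(ways(vars_left - 1, degree_left - e)
                   for e in range(min(b, degree_left) + 1))
    return ways(n, d)

n, b = 10, 2
full = count_monomials(n, n * b, b)         # no real degree constraint: (b+1)^n monomials
half = count_monomials(n, (n * b) // 3, b)  # degree budget well below the mean
print(full, half)
```

The gap between the two counts widens exponentially as n grows, which is exactly the room that the rank bound exploits.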
These tricks of linear algebra and degree are all very well, but how can we use them to attack our problem? We want to bound the size of subsets $A \subseteq \mathbb{Z}_q^n$ having no element $y$ such that for some nonzero $e$, the elements $y - e$, $y$, and $y + e$ all belong to $A$. This is equivalent to $A$ having no three distinct elements $x, y, z$ such that $x + z = 2y$. This means that the following two subsets of $\mathbb{Z}_q^n$ are disjoint: the set $2A = \{2y : y \in A\}$ and the set of sums $x + z$ over distinct $x, z \in A$.
How can we use polynomials to gain leverage on this? The insight may look too trivial to matter:
Any polynomial supported only on $2A$ must vanish on all sums $x + z$ of distinct elements of $A$.
Let $E$ be the complement of $2A$ and let $V$ be the space of polynomials vanishing on $E$ that belong to our set $S$ of polynomials of degree at most $d$. We can lower-bound the size of $V$ by observing that the map taking each polynomial in $S$ to the graph of its values on $E$ is a linear transformation. Its image has size at most $q^{|E|}$, and since $V$ is the kernel, we have $|S| \le |V| \cdot q^{|E|}$, so $|V| \ge |S|/q^{|E|}$.
Well, this is useless unless $|S| > q^{|E|}$, but $E$ is the complement of $2A$, which is no bigger than the set $A$ we are trying to upper-bound. So it is useless—unless $|S|$ is pretty big. So we need to choose the degree $d$—and maybe the individual-degree bound $b$—to be not so low. We can do this, but how can this lower bound on $|V|$ help? We need a “clashing” upper bound. This is where the presto observation by CLP came in.
Given the set $A = \{a_1, \dots, a_N\}$, make a matrix whose entry in row $i$, column $j$, is $a_i + a_j$. In APL notation this is the “$+$ outer product” of $A$ with itself. Its diagonal is $2A$ and the rest lies in $E$.
Now apply any $P \in V$ to every entry to get a matrix $M$. By the insight every off-diagonal entry vanishes, so $M$ is a diagonal matrix. Its rank is hence the number of nonzero diagonal entries. If we can upper-bound this rank by some $R$, then we can upper-bound $|V|$ by the hoc-est-corpus rubric of description complexity:
Every $P \in V$ can be described by its up-to-$R$ nonzero values on the diagonal, so there are at most $\binom{N}{R} q^R$ of them.
The papers use bounds on the dimension of $V$ in place of description complexity, but this is enough to see how to get some kind of upper bound. Since $|V| \ge |S|/q^{|E|}$, taking logs base $q$ gives us an inequality pitting $\log_q |S| - |E|$ against $R + \log_q \binom{N}{R}$.
It remains to bound $R$, but it seems to take X-ray vision just to see that a bound can give us anything nontrivial. OK, any fixed bound on $R$ makes the right-hand side only $O(R \log N)$, which yields a contradiction, so there is hope. The rank trick combines with degree-halving to pull a bound involving monomials of degree at most $d/2$ out of the hat. Here is the version by Ellenberg and Gijswijt, where a nonce choice of $d$ suffices and the coefficients in $x + z = 2y$ are replaced by a general triple $(\alpha, \beta, \gamma)$ of nonzero values such that $\alpha + \beta + \gamma = 0$:
Lemma 1 With $q$, $n$, and $d$ as above, put $V$ to be the set of polynomials in $S$ that vanish on $E$. Then for all $P \in V$ there are at most $2m_{d/2}$ values $a \in A$ for which $P((\alpha + \beta)a) \neq 0$, where $m_{d/2}$ denotes the number of monomials of total degree at most $d/2$.
Proof: Let $x$ and $y$ be vectors of $n$ variables each, and write
$P(\alpha x + \beta y) = \sum_{u,v} c_{u,v}\, x^u y^v,$
where each coefficient $c_{u,v}$ is in $\mathbb{F}_q$ and the sum is over pairs of monomials $x^u, y^v$ whose product has degree at most $d$ and at most $q - 1$ in any variable. Collect the terms in which $x^u$ has total degree at most $d/2$ separately from those where $y^v$ does, so that we get
$P(\alpha x + \beta y) = \sum_{u} x^u F_u(y) + \sum_{v} G_v(x)\, y^v,$
where each $F_u$ and $G_v$ is an arbitrary function and now the sums are over monomials of total degree at most $d/2$ (and still no more than $q - 1$ in any variable if we care). Now look at the matrix $M$ whose entry in row $a$, column $b$, for $a, b \in A$, is $P(\alpha a + \beta b)$, including the diagonal where $b = a$. We have
$M_{a,b} = \sum_{u} a^u F_u(b) + \sum_{v} G_v(a)\, b^v.$
This is a sum of at most $2m_{d/2}$ rank-one matrices, so the rank of $M$ is at most that. Since $M$ is a diagonal matrix and $b = a$ makes $M_{a,a} = P((\alpha + \beta)a)$, there are at most $2m_{d/2}$ nonzero values of $P((\alpha + \beta)a)$ over $a \in A$.
Sawing the degree in half stacks $m_{d/2}$ up against $q^n$. We retain freedom to choose $d$ (and possibly $\alpha, \beta, \gamma$) to advantage. There are still considerable numerical details needed to ensure this works and tweaks to tighten bounds—for which we refer to the papers—but we have shown the “Pledge,” the “Turn,” and the “Prestige” of the argument.
Can you find more applications of the polynomial technique besides those enumerated in the papers and posts we have linked? For circuit complexity we’d not only like to go from fields $\mathbb{F}_q$ back to $\mathbb{Z}_4$ as CLP have it, but also get results for $\mathbb{Z}_m$ when $m$ is not a prime power. Can we make assumptions (for sake of contradiction) that create situations with higher “leverage” than merely being disjoint?
[changed subtitle; linked hoc-est-corpus which literally means, “here is the body”; deleted and changed remarks before “Open Problems”; inserted tighter sum into description complexity formula.]
Shiteng Chen and Periklis Papakonstantinou have just written an interesting paper on modular computation. Its title, “Depth Reduction for Composites,” means converting a depth-$d$, size-$s$ circuit into a depth-2 circuit that is not too much larger in terms of $d$ as well as $s$.
Today Ken and I wish to talk about their paper on the power of modular computation.
One of the great mysteries in computation, among many others, is: what is the power of modular computation over composite numbers? Recall that a $\mathrm{MOD}_m$ gate outputs $1$ if the sum of its inputs is divisible by $m$ and $0$ otherwise. It is a simple computation: Add up the inputs modulo $m$ and see if the sum is $0$. If so output $1$, else output $0$. This can be recognized by a finite-state automaton with $m$ states. It is not a complex computation by any means.
But there lurk in this simple operation some dark secrets. When $m$ is a prime the theory is fairly well understood. There remain some secrets, but by Fermat’s Little Theorem a $\mathrm{MOD}_p$ gate has the same effect as a polynomial over $\mathbb{F}_p$. In general, when $m$ is composite, this is not true. This makes understanding $\mathrm{MOD}_m$ gates over composites much harder: simply because polynomials are easy to handle compared to other functions. As I once heard someone say:
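In code, the gate and its small automaton are one-liners (a Python sketch; the function names are ours):

```python
def mod_gate(bits, m):
    """MOD_m gate: outputs 1 iff the sum of the input bits is divisible by m."""
    return 1 if sum(bits) % m == 0 else 0

def mod_gate_automaton(bits, m):
    """The same gate as a finite-state automaton whose m states are the running sum mod m."""
    state = 0
    for bit in bits:
        state = (state + bit) % m
    return 1 if state == 0 else 0

bits = [1, 0, 1, 1, 0, 1]
print(mod_gate(bits, 3), mod_gate_automaton(bits, 3))  # the two always agree
```

The automaton view makes plain that the per-gate computation is regular-language simple; the mystery is what circuits of such gates can do.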
“Polynomials are our friends.”
Chen and Papakonstantinou (CP) increase our understanding of modular gates by proving a general theorem about the power of low-depth circuits with modular gates. This theorem is an exponential improvement over previous results when the depth is regarded as a parameter rather than constant. Their work also connects with the famous work of Ryan Williams on the relation between $\mathsf{NEXP}$ and $\mathsf{ACC}$.
We will just state their main result and then state one of their key lemmas. Call a circuit of $\mathrm{AND}$, $\mathrm{OR}$, $\mathrm{NOT}$, and $\mathrm{MOD}_m$ gates (for some fixed $m$) an $\mathsf{ACC}$-circuit.
Theorem 1 There is an efficient algorithm that given an $\mathsf{ACC}$ circuit of depth $d$, input length $n$, and size $s$, outputs a depth-2 circuit of the form $\mathrm{SYM} \circ \mathrm{AND}$ of size $2^{(\log s)^{O(d)}}$, where $\mathrm{SYM}$ denotes some gate whose output depends only on the number of $1$s in its input.
This type of theorem is a kind of normal-form theorem. It says that any circuit of a certain type can be converted into a circuit of a simpler type, and this can be done without too much increase in size. In complexity theory we often find that it is very useful to replace a complicated type of computational circuit with a much cleaner type of circuit even if the new circuit is bigger. The import of such theorems is not that the conversion can happen, but that it can be done in a manner that does not blow up the size too much.
This happens all through mathematics: finding normal forms. What makes computational complexity so hard is that the conversion to a simpler type often can be done easily—but doing so without a huge increase in size is the rub. For example, every map $f : D \to \mathbb{Z}$
can be easily shown to be equal to an integer-valued polynomial with coefficients in $\mathbb{Q}$ provided $D$ is a finite subset of $\mathbb{Z}^n$. For every point $u \in D$, set
$p_u(x) = \prod_{i=1}^{n} \prod_{v \neq u_i} (x_i - v),$
where the inner product is over the finitely many values $v \neq u_i$ that appear in the $i$-th place of some member of $D$. Then $p_u(u)$ is a nonzero integer and $u$ is the only point of $D$ where $p_u$ is nonzero. We get
$p(x) = \sum_{u \in D} \frac{f(u)}{p_u(u)}\, p_u(x),$
which is a polynomial that agrees with $f$ on $D$.
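The construction is short enough to run. A Python sketch of ours with a toy domain and map, using exact rational arithmetic:

```python
from fractions import Fraction

# A toy finite domain D in Z^2 and an arbitrary integer-valued map f on it.
D = [(0, 0), (1, 2), (3, 1), (2, 2)]
f = {(0, 0): 5, (1, 2): -1, (3, 1): 7, (2, 2): 0}

# Values appearing in the i-th place of some member of D.
coord_vals = [sorted({u[i] for u in D}) for i in range(2)]

def p_u(u, x):
    """Product over coordinates i and values v != u[i] of (x[i] - v).
    Nonzero at u, and zero at every other point of D."""
    val = Fraction(1)
    for i, ui in enumerate(u):
        for v in coord_vals[i]:
            if v != ui:
                val *= x[i] - v
    return val

def p(x):
    """The interpolating polynomial: agrees with f everywhere on D."""
    return sum(Fraction(f[u]) * p_u(u, x) / p_u(u, u) for u in D)

print([p(u) == f[u] for u in D])  # [True, True, True, True]
```

With a full grid as the domain this uses one indicator polynomial per point, which is the brutish exponential size discussed next.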
Well, this is easy but brutish—and exponential in size if $D$ is. The trick is to show that when $D$ is special in some way then the size of the polynomial is not too large.
One of the key insights of CP is a lemma, Lemma 5 in their paper, that allows us to replace a product of many $\mathrm{MOD}$ gates by a summation. We have changed the variables in the statement around a little; see the paper for the full statement and context.
Lemma 5 Let $x_1, \dots, x_n$ be variables over the integers and let $m_1$ and $m_2$ be relatively prime. Then there exist integral linear combinations of the variables and integer coefficients by which a product of $\mathrm{MOD}_{m_1}$ gates in the $x_i$ can be rewritten as a summation.
The value of $m_1$ can be composite. The final modulus can be $m_1 m_2$ in place of $m_1$, and this helps in circuit constructions. Three points to highlight—besides products being replaced by sums—are:
Further all of this can be done in a uniform way, so the lemma can be used in algorithms. This is important for their applications. Note this is a type of normal-form theorem like we discussed before. It allows us to replace a product by a summation. The idea is that going from products to sums is often a great savings. Think about polynomials: the degree of a multivariate polynomial is often a better indicator of its complexity than its number of terms. It enables them to remove layers of large gates that were implementing the products (Lemma 8 in the paper) and so avoids the greatest source of size blowup in earlier constructions.
A final point is that the paper makes a great foray into mixed-modulus arithmetic, coupled with the use of exponential sums. This kind of arithmetic is not so “natural” but is well suited to building circuits. Ken once avoided others’ use of mixed-modulus arithmetic by introducing new variables—see the “additive” section of this post which also involves exponential sums.
The result of CP seems quite strong. I am, however, very intrigued by their Lemma 5. It seems that there should be other applications of this lemma. Perhaps we can discover some soon.
A way to recover and enforce privacy
McNealy bio source
Scott McNealy, when he was the CEO of Sun Microsystems, famously said nearly 15 years ago, “You have zero privacy anyway. Get over it.”
Today I want to talk about how to enforce privacy by changing what we mean by “privacy.”
We seem to see an unending series of breaks into databases. There is of course a huge amount of theory literature and methods for protecting privacy. Yet databases are still broken into and people lose their information. We wish to explore whether this can be fixed. We believe the key to the answer is to change the question:
Can we protect data that has been illegally obtained?
This sounds hopeless—how can we make data that has been broken into secure? The answer is that we need to look deeper into what it means to steal private data.
The expression “the horse has left the barn” means:
Closing/shutting the stable door after the horse has bolted, or trying to stop something bad happening when it has already happened and the situation cannot be changed.
Indeed, our source gives as its main example: “Improving security after a major theft would seem to be a bit like closing the stable door after the horse has bolted.”
Photo by artist John Lund via Blend Images, all rights reserved.
This strikes us as the nub of privacy. Once information is released on the Internet, whether by accident or by a break-in, there seems to be little that one can do. However, we believe that there may be hope to protect the information anyway. Somehow we believe we can shut the barn door after the horse has left, and get the horse back.
Suppose that some company makes a series of decisions. Can we detect if those decisions depend on information that they should not be using? Let’s call this Post-Privacy Detection.
Consider a database that stores values $(x, y)$ where $x$ is an $n$-bit vector of attributes and $y$ is a further attribute. Think of $y$ as small, even a single bit such as the sex of the individual with attributes $x$. Let us also suppose that the database is initially secure for $y$ insofar as given many samples of the values of $x$ only, it is impossible to gain advantage in inferring the values of $y$. Thus the leak of $y$ is meaningful information.
Now say a decider is an entity that uses information from this database to make decisions. It has one or more Boolean decision functions $f(x, y)$ of the attributes. Think of $f$ as a yes/no on some issue: granting a loan, selling a house, giving insurance at a certain rate, and so on. The idea is that while $y$ may not be secret—the database has been broken into—we can check that in aggregate $y$ is effectively secret.
The point here is that we can detect if $y$ is being used in an unauthorized manner to make some decision, given protocols for transparency that enable sampling the values $f(x, y)$. If given a polynomial number of samples we cannot distinguish the decisions from ones made by some $y$-oblivious function, then we have large-scale assurance that $y$ was not material to the decision. Our point is this: a leak of $y$ values about individuals is material only if they are used by someone to make a decision that should not depend on their “private” information. Thus if a bank gets values of $y$, but does not use them to make a decision, then we would argue that that information while public was effectively private.
Definition 1 Let a database contain values of the form $(x, y)$, and let $f(x, y)$ be a Boolean decision function. Say that the $y$ part is effectively private for the decision $f$ provided there is another function $g$ so that
$\Pr[f(x, y) = g(x)] \ge 1 - \epsilon,$
where the probability is taken over the records $(x, y)$ in the database and $\epsilon$ is small. A decider respects $y$ if $y$ is effectively private in all of its decision functions.
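Effective privacy can be estimated from samples. A minimal Monte Carlo sketch in Python; the decision function, the surrogate, the attribute distribution, and the sample count are all invented for illustration:

```python
import random

random.seed(2016)

def f(x, y):
    """A decider's function that happens to ignore the protected bit y."""
    return 1 if sum(x) >= 2 else 0

def g(x):
    """A candidate y-oblivious surrogate for f."""
    return 1 if sum(x) >= 2 else 0

def agreement(samples=10_000, n=4):
    """Estimate Pr[f(x, y) == g(x)] over random records (x, y)."""
    hits = 0
    for _ in range(samples):
        x = [random.randint(0, 1) for _ in range(n)]
        y = random.randint(0, 1)
        hits += (f(x, y) == g(x))
    return hits / samples

print(agreement())  # 1.0: y is (trivially) effectively private for this f
```

A decider whose f genuinely consulted y would show an agreement gap against every y-oblivious g, which is what the sampling protocol is meant to expose.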
We can prove a simple lemma showing that this definition implies that $y$ is not compromised by sampling the decision values.
Lemma 2 If the database is secure for $y$ and $y$ is effectively private, then there is no function $B$ such that $B(x, f(x, y)) = y$ with probability substantially better than guessing $y$ from $x$ alone.
Proof: Suppose for contradiction such a $B$ exists. Also suppose, for avoiding contradiction of effective privacy, that a function $g$ as above exists. Then given $x$, we obtain $g(x) = f(x, y)$ with probability at least $1 - \epsilon$. Then using $B$ we obtain $y = B(x, g(x))$ with overall probability at least that of $B$ minus $\epsilon$. Since $g(x)$ is computable from $x$ alone, this contradicts the initial security of the database for $y$.
To be socially effective, our detection concept should exert influence on deciders to behave in a manner that overtly does not depend on the unauthorized information. This applies to repeatable decisions whose results can be sampled. The sampling would use protocols that effect transparency while likewise protecting the data.
Thus our theoretical notion would require social suasion for its effectiveness. This includes requiring deciders to provide infrastructure by which their decisions can be securely sampled. It might not require them to publish their $y$-oblivious decision functions $g$, only that they could—if challenged—provide one. Most of this is to ponder for the future.
What we can say now, however, is that there do exist ways we can rein in the bad effects of lost privacy. The horses may have bolted, but we can still exert some long-range control over the herd.
Is this idea effective? What things like it have been proposed?
From knight’s tours to complexity
Von Warnsdorf’s Rule source
Christian von Warnsdorf did more and less than solve the Knight’s Tour puzzle. In 1823 he published a short book whose title translates to, The Leaping Knight’s Simplest and Most General Solution. The ‘more’ is that his simple algorithm works for boards of any size. The ‘less’ is that its correctness remains yet unproven even for square boards.
Today we consider ways for chess pieces to tour not $64$ squares but up to $2^{64}$ configurations of a chessboard.
Von Warnsdorf’s rule works only for the ‘path’ form of the puzzle, where the knight is started in a corner of an $n \times n$ board and must visit all the other squares in $n^2 - 1$ hops. It does not yield a final hop back to start to make a Hamilton cycle. The rule is always to move the knight to the available square with the fewest connections to open squares. In case of two or more tied options, von Warnsdorf incorrectly believed the choice could be arbitrary, but simple tiebreak rules have been devised that work in all known cases. More-recent news is found in papers linked from a website maintained by Douglas Squirrel of Frogholt, England. We took the above screenshot from his animated implementation of the rule when the knight, having started in the upper-left corner, is a few hops from finishing at upper right.
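The rule itself is a few lines of code. Here is our own Python sketch with a naive first-in-move-order tiebreak, not one of the careful tiebreaks discussed in the papers on Squirrel's site:

```python
MOVES = [(2, 1), (1, 2), (-1, 2), (-2, 1), (-2, -1), (-1, -2), (1, -2), (2, -1)]

def open_moves(sq, n, visited):
    r, c = sq
    return [(r + dr, c + dc) for dr, dc in MOVES
            if 0 <= r + dr < n and 0 <= c + dc < n and (r + dr, c + dc) not in visited]

def warnsdorf(n, start=(0, 0)):
    """Greedy tour: always hop to the open square with the fewest open onward moves."""
    tour, visited = [start], {start}
    while len(tour) < n * n:
        options = open_moves(tour[-1], n, visited)
        if not options:
            break  # stuck -- the rule carries no general guarantee of completion
        nxt = min(options, key=lambda s: len(open_moves(s, n, visited)))
        tour.append(nxt)
        visited.add(nxt)
    return tour

tour = warnsdorf(8)
print(len(tour))  # 64 on the standard board when the greedy choice succeeds
```

The `break` is the honest part: the rule is a heuristic, and its unconditional success (under the right tiebreak) is exactly what remains unproven.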
The first person known to have published a solution was the Kashmiri poet Rudrata in the 9th century. He found a neat way to express his solution in 4 lines of 8-syllable Sanskritic verse that extend to an 8×8 solution when repeated. In modern terms he solved the following:
Color the squares so that for all k, the k-th square of the tour has the same color as the k-th square in row-major order—in other words, the usual way of reading left-to-right and down by rows—while maximizing the number m of colors used.
Note that we can guarantee by starting in the upper-left corner and using a different color for all other squares. However, the usual parity argument with the knight doesn’t even let us 2-color the remaining squares to guarantee because the last square of the first row and the first square of the second row have the same parity. Rudrata achieved for the upper half with cell 21 also a singleton color; this implies for the whole board and for . Can it be beaten? Most to our point, is there a “Rudrata Rule” for as simple as von Warnsdorf’s?
We now put a coin heads-down on each square. Our chess pieces are going to move virtually through the space by flipping over the coins in squares they attack. Our questions will be of the form, can they reach all configurations, and if not:
How small can Boolean circuits be to recognize the set of reachable strings?
Let’s warm up with a different problem. Suppose the coins are colored, not embossed, so you cannot tell by touch which side is which, and the room is pitch dark. You are told that k of the coins are showing heads but not which ones. You must take some of the coins off the board, optionally flipping some or all while placing them nearby on the table. The lights are then switched on, and you win if your coins have the same number of heads as the ones left on the board. Can you always win?
I may have seen this puzzle as a child but it was fresh when I read it here. Our point connecting to this post is that the solution, which can be looked up here, is simple in terms of k and so can be computed by tiny Boolean circuits.
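For concreteness, here is a sketch of one well-known solution (our spoiler, since the post leaves it to be looked up): remove any k coins and flip them all.

```python
import random

# One well-known solution: take any k of the coins and flip every one of them.
# If the removed pile held h heads, the board keeps k - h heads, and flipping
# the pile turns its h heads into k - h heads: the two counts always match.

def split_and_flip(board, k):
    """board: list of 0/1 (1 = heads) containing exactly k ones."""
    mine, theirs = board[:k], board[k:]   # "any k coins": take the first k
    mine = [1 - c for c in mine]          # flip all removed coins, sight unseen
    return mine, theirs

random.seed(0)
for _ in range(1000):
    n = 20
    k = random.randrange(1, n)
    board = [1] * k + [0] * (n - k)
    random.shuffle(board)
    mine, theirs = split_and_flip(board, k)
    assert sum(mine) == sum(theirs)
```

The answer depends only on k, so it is computable by tiny circuits, as the post says.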
Since the tours will be reversible, we can equally well start with any coin configuration and ask whether the piece can transform it to the all-tails state. This resembles solving Rubik’s Cube. We’ll try each chess piece one-by-one, the knights last.
Our rook can start on any square. It flips each coin in the same row or column (“rank” and “file” in chess parlance) as the square it landed on. Then it moves to one of those squares and repeats the flipping. If it moved within a rank then the coins in that row will be back the way they were except that the two the rook was on will be flipped. We can produce a perfect checkerboard pattern by moving the rook a1-c1-c3-c5-e5-g5-g7 then back g5-c5-c1. Since order doesn’t matter and operations from the same square cancel, this has the same effect as doing a1, c3, e5, and g7 “by helicopter.”
Since the rook always attacks 14 squares, an even number of coins flip at each move, so half the space is ruled out by parity. There is however a stronger limitation. Each rook flip is equivalent to flipping the entire row and then the entire column. We can amplify the rook by allowing row and column flips singly. But then we see that there are only 16 such operations. Again since repeats cancel, this means at most $2^{16}$ configurations are possible. We ask:
Is there a simple formula, yielding small Boolean circuits, for determining which configurations are reachable on an $n \times n$ board?
We can pose this for the Rook, with or without “helicoptering,” and for the row-or-column flips individually. Small circuits would mean that strings in $\{0,1\}^{64}$ denoting reachable configurations enjoy a particular form of succinctness.
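The exact size of the amplified rook’s reachable space is a finite linear-algebra computation over GF(2). A sketch, packing each flip pattern into a 64-bit integer and doing bitwise Gaussian elimination:

```python
# The 8 row flips and 8 column flips generate a linear subspace of F_2^64.
# Its dimension is the GF(2) rank of the 16 generator vectors, which bounds
# the reachable configurations at 2^rank.

def f2_rank(vectors):
    pivots = {}                   # leading-bit position -> basis vector
    for v in vectors:
        while v:
            p = v.bit_length() - 1
            if p not in pivots:
                pivots[p] = v
                break
            v ^= pivots[p]        # eliminate the leading bit and continue
    return len(pivots)

rows = [0xFF << (8 * r) for r in range(8)]
cols = [sum(1 << (8 * r + c) for r in range(8)) for c in range(8)]

rank = f2_rank(rows + cols)
print(rank, 2 ** rank)   # 15 32768
```

The sole dependency is that flipping every row has the same effect as flipping every column, so the rank is 15 rather than 16: only $2^{15}$ configurations are reachable this way.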
Since the rook fails to tour the whole exponential-sized space, let’s try the bishop.
The bishop can flip any odd number of coins from 7 to 13. It is limited to squares of one color but we can allow the opposite-color bishop to tag-team with it. I was just about to pose the same questions as above for the bishops when a familiar imperious voice swelled behind me. It was the Red Queen.
“I have all the power of your towers and prelates—and you need only one of me. I shall surely fill the space.”
I was no one to stand in her way, but the Dormouse awoke and quietly began scratching figures on paper. “Besides the sixteen ranks and files, there are fifteen southeast-to-northwest diagonals, including the corner squares a1 and h8 by themselves. And there are fifteen southwest-to-northeast diagonals. This makes only $16 + 15 + 15 = 46 < 64$ operations. Hence, Your Majesty, even if we could parcel out your powers, you could fill out at most a $2^{46-64} = 2^{-18}$ fraction of the space.”
I expected the Red Queen to yell, “Off with his head!” But instead she stooped over the Dormouse and hissed,
“Sorry—I slept through the rest of Alice,” explained the Dormouse as he slunk away. Despite the Dormouse’s proof I thought it worth asking the same questions as for the rook and bishop about the queen’s subspace. What kind of small formulas or circuits can recognize it, whether requiring her to flip all coins in all directions or allowing her to flip just one rank or file or diagonal at a time?
While I was wondering, His Majesty quietly strode to the center and said,
“I do not wantonly project power without bound; I reserve my influence so that my action on every square is distinctive.”
We can emphasize how far things stay distinctive by posing our basic questions in a more technical manner:
Do the sixty-four vectors over $\mathbb{F}_2$ representing the king’s flipping action on each square span the vector space $\mathbb{F}_2^{64}$? If not, what can we say about the circuit complexity of the linear subspace they generate?
On a $2 \times 2$ board the four king-vectors form a basis, but for $3 \times 3$ and $4 \times 4$ the king fails to span. For $3 \times 3$, kings in the two lower corners produce the same configuration as kings in the two upper corners. For $4 \times 4$, kings in a ring on a2, b4, d3, and c1 flip just the corner coins, as do the kings in the mirror-image ring. What about $5 \times 5$ and larger boards? Is there an easy answer?
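The king’s span question is likewise a finite computation. Here is a sketch, assuming the king flips every adjacent coin but not the coin on its own square, using bit-packed GF(2) elimination:

```python
# Rank of the n*n king-flip vectors over F_2; spanning means rank == n*n.

def f2_rank(vectors):
    pivots = {}                   # leading-bit position -> basis vector
    for v in vectors:
        while v:
            p = v.bit_length() - 1
            if p not in pivots:
                pivots[p] = v
                break
            v ^= pivots[p]
    return len(pivots)

def king_vectors(n):
    vecs = []
    for r in range(n):
        for c in range(n):
            v = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n:
                        v |= 1 << ((r + dr) * n + (c + dc))
            vecs.append(v)
    return vecs

assert f2_rank(king_vectors(2)) == 4   # a basis, as noted above
assert f2_rank(king_vectors(3)) < 9    # corner dependency: no span
assert f2_rank(king_vectors(4)) < 16   # the two rings coincide: no span
for n in range(5, 9):                  # the open cases, settled numerically
    print(n, f2_rank(king_vectors(n)), n * n)
```

The same routine answers the analogous questions for the queen and the knight by swapping in their attack patterns.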
Meeker still are the pawns, who attack only the two squares diagonally in front, or just one if on an edge file. They cannot attack their first rank, nor the second in legal chess games, but opposing pawns can. Then it is easy to see that the pawn actions span the space. The lowly contribution by the edge pawn is crucial, since it flips just one coin not two.
The knight flips all the coins a knight’s move away. One difference from the queen, rook, bishop, and king is that on its next move all the coins it flips will be new. Our revised Knight’s Tour question is:
Can the knight connect the all-tails string $0^{64}$ to any configuration by a sequence of knight’s moves, perhaps allowing multiple visits to some squares? Or if we disallow multiple visits in a tour, can we do it by “helicoptering”? Same questions for $n \times n$ boards. If the answer is no, then are there easy formulas or succinct circuits determining the space of reachable configurations?
An example for needing multiple visits or helicoptering is that the configuration with heads on c2,b3 and g6,f7 is produced by knights acting in the corners a1 and h8, which are not connected by a knight’s move. If there is some other one-action-per-square combination that produces it, then by simple counting the knight cannot span—even with helicoptering.
The knight does fail to span a $4 \times 4$ board because a knight on the corner d4 produces the same result as the knight on a1: heads on c2 and b3. The regular knight’s tour fails too on a $4 \times 4$ board, so this can be excused for the same “lack of legroom” reason. What about $5 \times 5$ and higher?
Thus having coins on the chessboard scales up some classic tour problems exponentially. Our larger motivation is what the solutions might tell us about complexity.
Do you like our exponential “tour” problems? Really they are reachability problems. Can you solve them?
Will von Warnsdorf’s rule ever be proved correct for all higher n?
Note: To update our recent quantum post, Gil Kalai released an expanded version of his AMS Notices article, “The Quantum Computer Puzzle.” We also congratulate him on being elected an Honorary Member of the Hungarian Academy of Sciences.
Akram Boukai is a researcher in material science, and an expert on converting heat to electric energy: thermoelectrics.
Today I wish to talk about a beautiful presentation he just gave at TTI/Vanguard in San Francisco.
Thermoelectricity is an effect that seems to have nothing to do with our usual topics. But Boukai uses a mathematical trick to make a “new” type of material. This material has to have quite special properties, and he is able to make it by using ideas that we are familiar with in theory. This is a great example, I believe, of theory interacting with technology.
Boukai presented his work at TTI/Vanguard, which is a conference I have talked about before—see here. It is oriented toward the future of technology of all kinds, with a special emphasis on electronic and computer technology. The talks often highlight new technologies, many of which are being developed by startups. This is the case with Boukai, who is co-founder of the company Silicium Energy. They are attempting to build components that will radically change how we power small devices. This is especially relevant to IoT—that is, the “Internet of Things.” Think watches, for example, that never need to be recharged.
In order to understand the math problem we need at least a high-level understanding of the Seebeck effect, named after Thomas Seebeck. He discovered in 1821 that a compass needle is deflected near a closed loop made of two different metals, provided there is a temperature difference between the junctions. Wikipedia’s diagram illustrates the underlying phenomenon:
The compass needle moves because the electrons in the metals act differently owing to their temperature difference, and thereby create an electrical current. This current then induces a magnetic field that moves the needle. Seebeck named this phenomenon the thermomagnetic effect, which is really wrong. The primary effect is the creation of an electrical flow—this was renamed to “thermoelectricity” by Hans Ørsted. Wrong or not, it is still called the Seebeck effect—he may have guessed how it worked incorrectly, but he discovered the effect.
Thus, the goal is to try and extract energy from a small heat difference. For example, Silicium Energy plans to use this method to build watches that need no recharging. The watches would exploit that while on your wrist there is a natural source of a heat difference: we are warm and the air around us is usually cooler. So by the Seebeck effect there will be an electrical current. The amount of energy created is tiny, but it will be large enough to power the processor in a modern digital watch.
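A back-of-envelope sketch of that energy budget: the open-circuit voltage is $V = S \cdot \Delta T$, and a matched load draws at most $V^2/4R$. All parameter values below are illustrative assumptions, not Silicium Energy’s figures:

```python
# Seebeck harvesting estimate with assumed, illustrative parameters.
S_eff = 5e-3   # V/K: effective coefficient of a many-junction thermopile (assumed)
dT    = 2.0    # K: wrist-to-air temperature difference (assumed)
R     = 1e3    # ohm: device internal resistance (assumed)

V = S_eff * dT           # open-circuit voltage
P = V ** 2 / (4 * R)     # maximum power delivered to a matched load
print(f"{V * 1e3:.1f} mV open-circuit, {P * 1e9:.0f} nW to a matched load")
```

The arithmetic shows why the material matters: keeping the thermal conductivity low preserves $\Delta T$, and the harvested power scales with $\Delta T$ squared.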
This sounds doable. Yet it is tricky. The problem is getting a material that is a great conductor of electrons, but a poor conductor of heat. The insight that Boukai’s company is based on is that this can be made out of silicon. The advantage of using silicon and not some exotic materials, which have been used before, is cost. Silicon devices can be made using standard technology, for pennies per device, while exotic materials can be very expensive.
Being able to turn silicon into a thermoelectric material and do it at low cost is quite a feat. Silicon has good electrical properties, but also is a pretty good conductor of heat. The trick is to find a way to lower the thermal conductivity of silicon in order to increase its thermoelectrical efficiency. Lowering the thermal conductivity makes it easier to keep the cold side of the device cold to create that temperature difference needed by the Seebeck effect.
Boukai and his co-workers’ clever idea—finally—is to fabricate a piece of silicon that uses its structure to make a material that conducts electrons well and conducts heat poorly. Here is how he does this: Imagine a square of silicon, with the top side hot and the bottom cold. Initially—by the Seebeck effect—electrons will move from the top to the bottom and create a current. This is wonderful. However, the problem is that heat will quickly also flow from the hot top to the cold bottom and will make the Seebeck effect stop.
The trick is to make random defects, essentially holes, in the silicon. The point is this:
We thank him for sending the following picture:
Note: in physics, a phonon is a quantized collective vibration of the atoms or molecules in a solid. Phonons play a key role in the transport of heat. Boukai’s trick depends on the effective size of phonons, whose wavelengths are much larger than those of electrons. This explains why electrons are pictured as scooters and phonons as trucks.
I know this is a rough explanation, but I believe it is a reasonable description of what happens. And the fact that heat flows as a random-walk type process yields in practice a large-factor decrease in the silicon’s thermal conductivity. This keeps the watch running. See his joint paper for more technical details.
The ideas of random behavior and statistical mechanics have been around in physics for a long time. Karl Pearson coined the term “random walk” in 1905, the same year as Albert Einstein’s famous paper on Brownian motion. Ising models and Markov chains were long studied in physics before becoming staples of computer theory. So there is no chicken-egg question about which methods came first where.
What strikes Ken and me as distinctively algorithmic, however, is the way the silicon materials are being programmed to have a physical property directly. This is different and feels more qualitative than programming logic gates on silicon. Of course there are other cases of mathematical structure and algorithmic behaviors being used to create new materials—witness the recent Nobel Prizes for work on graphene and quasicrystals.
I really liked the trick used here. Is there some other application where we could imagine using it to make some other new material, or even to use the trick abstractly in some algorithm?
Some fun rejection comments
Joshua Gans and George Shepherd were doctoral students in economics at Stanford University back in the 1990s. They wrote an interesting paper that I just came across titled, “How Are the Mighty Fallen: Rejected Classic Articles by Leading Economists.” It grew into a 1995 book edited by Shepherd: Rejected: Leading Economists Ponder the Publication Process.
Today I want to discuss the same issue in our area of theory.
Ken and I have not had a chance to do a formal survey of papers that were rejected in our area. We also would not do exactly the same as Gans and Shepherd since it’s not what happens to the “mighty” that matters most but rather to the great band of those doing productive and creative and sporadically uneven work. Our point is rather that all of us who write articles for conferences and journals are subject to sporadically uneven reviews.
So we will today just offer a few things from personal experience to season the grill. We are mostly interested in negative comments from bad reviews. We could also touch on the opposite, heroic reviews that found subtle mistakes—or maybe mistakes missed by everyone including the referees.
I once got the following comment back from a top theory conference in the rejection e-mail:
The authors assume incorrectly that the graph has an even number of vertices in Lemma
The graph in question was a cubic graph. By what is sometimes called the First Theorem of graph theory, all cubic graphs have this property. Just double-count edge contributions and one gets that $3n = 2m$, where $n$ is the number of vertices and $m$ the number of edges, so $n$ must be even. I assume the referee was overwhelmed with work, but…
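The referee’s objection can even be machine-checked on a standard cubic graph, the Petersen graph:

```python
from collections import Counter

# Every cubic graph has an even number of vertices: summing degrees counts
# each edge twice, so 3n = 2m and n must be even. Checked on the Petersen graph.

petersen_edges = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 0),    # outer 5-cycle
    (5, 7), (7, 9), (9, 6), (6, 8), (8, 5),    # inner pentagram
    (0, 5), (1, 6), (2, 7), (3, 8), (4, 9),    # spokes
]

deg = Counter()
for u, v in petersen_edges:
    deg[u] += 1
    deg[v] += 1

n, m = len(deg), len(petersen_edges)
assert all(d == 3 for d in deg.values())   # the graph is cubic
assert sum(deg.values()) == 2 * m          # the double count
assert 3 * n == 2 * m and n % 2 == 0       # hence n is even
```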
I once submitted a short paper, joint with Andrea LaPaugh and Jon Sandberg, to the Hawaii International Conference on System Sciences and it was accepted. Well, sort-of accepted. The head of the conference asked me to make the paper “longer.” I asked back:
“What was missing? Was the problem not motivated? Was the proof unclear?” And so on.
The head simply replied: “we like longer papers.” I pushed and said I thought making a paper longer for no reason seemed wrong. He responded that the paper was now unaccepted.
I could not believe it. We quickly sent it off to an IEEE journal. Don Knuth handled it, and it soon was accepted with minor changes only. By the way the paper solved a simple question: what is the best way to store a triangular array in memory?
At the presentation of the Knuth Prize to Leonid Levin the following story was told about reviews:
Leonid once submitted a paper to a journal and got back a negative review: It said that the paper was too short and also terms were used before they were defined. Leonid responded by taking two identical copies of his paper, stapling them together, and resubmitting the “new” paper. It was now twice as long, which answered the first issue, and clearly all terms were defined before they were used.
It is unclear what happened to the paper.
Then there is the folklore rejection letter:
What is correct in your paper is known, and what is new is wrong.
I hope to never get this one.
We’d love to hear from you with your own examples of strange reviews.
I submitted the above post last week to my blog editor but didn’t hear back—I assumed he was overwhelmed with work. Then he replied and asked me to make the post “longer.” I asked back:
“What was missing? Was the issue not motivated? Was the evidence unclear?” And so on.
The editor simply replied, “it’s a bit thin.” I pushed and said I thought making a post longer for no reason seemed wrong. This editor at least gave some concrete suggestions:
Use something from the featured paper.
Gans and Shepherd give one interesting kind of example where the delay caused by rejection enabled others with similar ideas to get ahead—not on purpose by the rejecter but just-so. They also give some self-revealing quotes including this one by the economist Paul Krugman:
The self-serving answer [to the “why me?” question] is that my stuff is so incredibly innovative that people don’t get the point. More likely, I somehow rub referees and editors the wrong way, maybe by claiming more originality than I really have. Whatever the cause, I still open return letters from journals with fear and trembling, and more often than not get bad news. I am having a terrible time with my current work on economic geography: referees tell me that it’s obvious, it’s wrong, and anyway they said it years ago.
Use others’ personal examples or famous ones.
Ken recently heard a true giant admit he gets rejections, “often because the referees don’t believe this work is really new.” Perhaps Krugman’s last clause means the same?
A famous case in our field was the number of times the first interactive proofs paper by Shafi Goldwasser and Silvio Micali was rejected from FOCS and STOC before finally appearing at STOC 1985 with Charles Rackoff as third author. Ken recalls people excitedly telling him and everyone about the work at the FCT conference in Sweden in August 1983. This could have become an example like in the Gans-Shepherd paper of others pipping ahead, but happily didn’t.
Try to source the quotation at the end.
A version of it was used by Christopher Chabris and Joshua Hart at the end of their negative review in the New York Times last month of the book The Triple Package:
Our conclusion is expressed by the saying, “What is new is not correct, and what is correct is not new.”
In March, Ken took part in an online discussion with Chabris about sourcing it. Ken recalled hearing it in the early 1980s in this snarkier form:
“This paper has content that is novel and correct. However, the parts that are novel are not correct, and the parts that are correct are not novel.”
It was already then a widely-known math cliché. Someone else in the discussion sourced it to the slamming of John Maynard Keynes’s book, The General Theory of Employment, Interest, and Money, by Henry Hazlitt in the introduction to his 1960 book, The Critics of Keynesian Economics:
In spite of the incredible reputation of the book, I could not find in it a single important doctrine that was both true and original. What is original in the book is not true, and what is true is not original. In fact, even most of the major errors in the book are not original, but can be found in a score of previous writers.
We wonder if any of our readers can find an earlier source? When did “not true” become “not new”? In any event it was one tough review.
Was my referee right about lengthening the post?
[photo at top; format fixes]
Richard Lewis, Bill Horton, Earl Beal, Raymond Edwards, and John Wilson—the Silhouettes—were a doo wop/R&B group whose single Get A Job was a number 1 hit on the Billboard R&B singles chart and pop singles chart in 1958. Even back then it sold over one million records, and was later used in ads and movies.
Today I want to talk about hiring faculty, as we are getting near the end of the usual job hiring cycle.
From the view of the many PhD candidates who are looking for jobs, this year must seem pretty bright. Companies, company labs, and universities all seem to be hiring. We are seeing a large number of very qualified people on the market—I’m glad I have a job and do not have to compete with them.
At Tech we do the job search as a potential employer in the old-fashioned way. We look at applications, ask some to visit and give a formal presentation, and have them talk to our faculty and students. Then we vote on making offers, make them, and try to convince the fortunate recipients to accept.
This method has been used forever it seems, and it works reasonably well. However, a question is: can we use our own methods to make the recruiting and hiring process better? In computer science theory we have many results about making decisions under uncertainty, yet when we do hiring of faculty, we use completely ad hoc methods.
This year at our first faculty meeting to discuss hiring I brought donuts from Krispy Kreme for all to enjoy. The initial presentation on a candidate by one of our faculty had a slide that quoted a letter writer:
While you are sitting around eating donuts and evaluating candidates remember that
Somehow the writer of this recommendation letter ‘knew’ we would be eating donuts—I cannot decide if it was funny or scary.
I wonder if we can use theory methods to rethink the hiring process. Perhaps we will always do it the old way, and always eat donuts and chat. But perhaps too there is some way to use mathematical methods. In any event, I thought I would share some simple observations about this with you.
Imagine that Alice and Carol are on the job market. Assume that both are “above the bar” and would be solid additions to our school of Computer Science. Suppose also that we have one offer we can make—and if it is declined then we cannot make another offer. So our choice is simple:
How do we choose?
A model is that each candidate has a secret probability of accepting our offer: $p_A$ is the probability for Alice and $p_C$ is the probability for Carol. Let’s assume that $p_A \geq p_C$.
Here is the key issue. If we make an offer with a deterministic old-style method, we could pick a candidate that is very unlikely to come. This is what we wish to avoid.
A simple strategy is to flip an unbiased coin. If it’s heads make an offer to Alice, if it’s tails make the offer to Carol. Note, this trivial strategy yields an expected number of accepts of $(p_A + p_C)/2$. And it does pretty well. If $p_A$ is much larger than $p_C$, for example, we get Alice with probability $p_A/2$, within a factor of one-half of the best possible. If on the other hand $p_A$ and $p_C$ are near each other we also do pretty well.
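A quick simulation sketch of the coin-flip strategy, with the acceptance probabilities as illustrative assumptions:

```python
import random

# Flip a fair coin to pick the offer recipient; expected accepts is the
# average of the two acceptance probabilities, hence at least half the best.

def coin_flip_offer(p_alice, p_carol, trials=200_000, seed=1):
    random.seed(seed)
    accepts = 0
    for _ in range(trials):
        p = p_alice if random.random() < 0.5 else p_carol
        accepts += random.random() < p
    return accepts / trials

p_alice, p_carol = 0.8, 0.1            # assumed, for illustration only
est = coin_flip_offer(p_alice, p_carol)
exact = (p_alice + p_carol) / 2
assert abs(est - exact) < 0.01         # simulation matches the formula
assert exact >= max(p_alice, p_carol) / 2
```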
What is wrong with this strategy? Is it better than chatting and eating donuts?
Of course in real life the situation is much more complex:
And so on.
Can we make a reasonable model and find a decision strategy that still works well in a real world situation?
So is it donuts forever, or can we use some decision methods in hiring?
Anyway, good luck to all trying to: “Get A Job.”
Yip yip yip yip yip yip yip yip
Sha na na na, sha na na na na
Sha na na na, sha na na na na
Sha na na na, sha na na na na
Sha na na na, sha na na na na
Yip yip yip yip yip yip yip yip
Mum mum mum mum mum mum
Get a job, sha na na na, sha na na na na
An AMS article by Gil Kalai updates his skeptical position on quantum computers
Cropped from Rothschild Prize photo (source)
Gil Kalai is a popularizer of mathematics as well as a great researcher. His blog has some entries on Polymath projects going back to the start of this year. He has just contributed an article to the May AMS Notices titled, “The Quantum Computer Puzzle.”
Today we are happy to call attention to it and give some extra remarks.
The article includes a photograph of Gil with Aram Harrow, who was his partner in a yearlong debate we hosted in 2012. We say partner because this was certainly more constructive than the political debates we have been seeing this year.
We’ve missed chances to review some newer work by both of them. In 2014, Gil wrote a paper with Greg Kuperberg on quantum noise models and a paper with Guy Kindler relating to the sub-universal BosonSampling approach. Among much work these past two years, Aram wrote a position paper on the quantum computing worldview and this year a paper with Edward Farhi. The latter reviews possible ways to realize experiments that leverage complexity-based approaches to demonstrating quantum supremacy, a term they credit to a 2011-12 paper by John Preskill.
Quantum supremacy has a stronger meaning than saying that nature is fundamentally quantum: it means that nature operates in concrete ways that cannot be emulated by non-quantum models. If factoring is not in —let alone randomized classical quadratic time—then nature can do something that our classical complexity models need incomparably longer computations to achieve. We like to say further that it means nature has a “notation” that escapes our current mathematical notation—insofar as we use objects like vectors and matrices that have roughly the same size as classical computations they describe, but swell to exponential size when we try to use them to describe quantum computations.
Aram’s paper with Farhi leverages complexity-class collapse connections shown by Michael Bremner, Richard Jozsa, and Daniel Shepherd and an earlier paper by Sergey Bravyi, David DiVincenzo, Roberto Oliveira, and Barbara Terhal. For instance, via the former they observe that if the outputs of certain low-depth quantum circuits can be sampled classically with high approximation then the polynomial hierarchy collapses to level 3. This remains true even under an oracle that permits efficient sampling of a kind of quantum annealing process. This is arguably more of a hard-and-fast structural complexity collapse than factoring being in would be.
Quantum supremacy also entails that the quantum systems be controllable. Preskill’s paper raises concrete avenues besides ones involving asymptotic complexity classes. Gil’s article picks up on some of them:
As Gil notes, the last has been undertaken by the company D-Wave Systems. This has come amid much controversy but also admirable work and give-and-take by numerous academic and corporate groups.
Our first remark is that Gil’s paper highlights a nice example of how computational complexity theory informs and gives structure to a natural-science debate. Aram and others have done so as well. We especially like the connection between prevalent noise and bounded-depth circuits vis-à-vis low-degree polynomials. We believe the AMS Notices audience will especially appreciate that. We’ve wanted to go even further and highlight how Patrick Hayden and Daniel Harlow have proposed that complexity resolves the recent black hole firewall paradox.
Our second remark is that this is still largely a position paper—the arguments need to be followed into the references. For example, the fourth of Gil’s five predictions reads:
No logical qubits. Logical qubits cannot be substantially more stable than the raw qubits used to construct them.
On the face of it, this is just the negation of what the quantum error-correcting codes in the fault-tolerance theorem purport to do. Gil follows with a more technical section countering quantum fault-tolerance in a stronger fashion with some technical detail but still asserting positions.
Our third remark is that the nub—that “when we double the number of qubits in [a quantum] circuit, the probability for a single qubit to be corrupted in a small time interval doubles”—is presented not as new modeling but “just based on a different assumption about the rate of noise.” We think there needs to be given a fundamental provenance for the increased noise rate.
For instance, a silly way to “double” a circuit is to consider two completely separate systems $A$ and $B$ to be one “circuit.” That alone cannot jump up the rate, so what Gil must mean is that this refers to doubling up when $A$ and $B$ already have much entanglement. But then the increased rate must be an artifact of entangling them, which to our mind entails a cause that proceeds from the entanglement. Preskill on page 2 of his paper tabs the possible cause of supremacy failing as “physical laws yet to be discovered.” We’ll come back to this at the end.
Gil puts the overall issue as being between two hypotheses which he states as follows:
I have a position that Dick and I have discussed that sounds midway but is really a kind of pessimism because it might make nobody happy:
This would put factoring in quantum time. That would still leave public-key cryptography as we know it under a cloud, though it might retain better security than Ralph Merkle’s “Puzzles” scheme. But the quadratic scale would be felt in all general quantum applications. It would leave everyone—“makers” and “breakers” alike—operating under the time and hardware and energy constraints of Merkle’s Puzzles, which we recently discussed.
Thus we have a “pun-intended” opinion on how the “Puzzle” in Gil’s title might resolve. However, I have not yet solved some puzzles of my own for marshaling the algebraic-geometric ideas outlined here to answer “why quadratic?” They entail having a mathematical law that burns-in the kind of lower-bound modeling in this 2010 paper by Byung-Soo Choi and Rodney van Meter, under which they prove a depth lower bound for emulating such CNOT trees with limited interaction distance, where $d$ is a dimensionality parameter. This brings us to our last note.
The main issue—which was prominent in Gil and Aram’s 2012 debate—is that everything we know allows the quantum fault-tolerance process to work. Nothing so far has directly contradicted the optimistic view of how the local error rate behaves as circuits scale up. If engineering can keep the error rate below a fixed constant then the coding constructions kick in to create stable logical qubits. If some fixed rate is a barrier in our world, what might the reason be?
It could be that this is a condition of our world, perhaps having to do with the structure of space and entanglement that emerged from the Big Bang. Gil ends by arguing that quantum supremacy would create more large-scale time-reversibility than we observe in nature. It would also yield emulations of high-dimensional quantum systems on low-dimensional hardware of kinds not achieved in quantum experiments to date—on top of our long-term difficulties of maintaining more than a handful of qubits.
This hints that an explanation could be as hard as explaining the arrow of time, or that the threshold rate could be a fundamental constant like others in nature for which string theory has trended against unique causes. Still, these other quantities have large supporting casts of proposed theory. If the explanation has to do with the state of the world as we find it, then how do initial conditions connect to the error rate? More theory might indicate a mechanism, and should at least help indicate why the rate isn’t simply an engineering issue.
Thus there is still an onus of novelty for justifying the pessimistic position. It may need to propose a new physical law, or a deepened algebraic theory of the impact of spatial geometry that retains current models as a limit, or at least a more direct replacement for what Gil’s article tabs as “the incorrect modeling of locality.”
How effective is the “Puzzle” for guiding scientific theory?
We note that the much–acclaimed and soberly–evaluated answer on quantum computing by Canada’s Prime Minister, Justin Trudeau, had this context: A reporter said, “I was going to ask you to explain quantum computing, but… [haha]…when do you expect Canada’s ISIL mission to begin again…?” Trudeau ended his spiel by saying, “Don’t get me going on this or we’ll be here all day. Trust me.” Would he have been able to detail the challenges?
Update: Gil released an expanded version to the ArXiv.
[added qualifier on Choi-van Meter result, added main sources for Farhi-Harrow]
A preview of the talks for this coming ARC Day
ARC is our Algorithms & Randomness Center at Tech. It was created by Santosh Vempala, and this Monday ARC is holding a special theory day. The organizers are Santosh with Richard Peng and Dana Randall.
Tomorrow, Monday April 11th, is the day for the talks, and I have time to highlight just two of them.
All of the talks look great; see this for details on the other two, by Rocco Servedio on "Circuit Lower Bounds via Random Projections" and Aaron Sidford on "Recent Advances in the Theory of Interior Point Methods." We previewed a related joint paper by Servedio last May. Sidford's talk will include the first nearly linear-time algorithm for finding the geometric median, that is, a point that minimizes the sum of distances to given points in Euclidean space. This is a nice contrast to recent results where having any sub-quadratic algorithm would break conjectures about the hardness of SAT.
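For contrast with the new nearly linear-time result, here is the classical baseline for the geometric median: Weiszfeld's iteration, which repeatedly moves to the inverse-distance-weighted average of the points. This is a minimal 2-D sketch of that standard method, not Sidford's algorithm; each iteration costs linear time, but the iteration count is what the new work improves.

```python
def geometric_median(points, iters=200, eps=1e-12):
    """Approximate the geometric median of 2-D points by
    Weiszfeld's iteration: repeatedly move to the average of
    the points weighted by inverse distance to the current guess."""
    # Start at the centroid.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iters):
        wsum = wx = wy = 0.0
        for (px, py) in points:
            d = ((x - px) ** 2 + (y - py) ** 2) ** 0.5
            if d < eps:          # current guess sits on a data point
                continue
            w = 1.0 / d
            wsum += w
            wx += w * px
            wy += w * py
        if wsum == 0:
            break
        x, y = wx / wsum, wy / wsum
    return x, y

# For a triangle with all angles below 120 degrees, the geometric
# median is its Fermat point; for this isoceles triangle that point
# lies on the axis of symmetry x = 2.
pts = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
mx, my = geometric_median(pts)
```

Weiszfeld's method converges linearly on well-behaved instances, which is exactly why a nearly linear-time algorithm with provable total running time is notable.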
Virginia Vassilevska-Williams will speak on Fine-Grained Algorithms and Complexity. Virginia's key point is simple:
If a problem is computable in polynomial time, then the classic reductions used to study the question are useless.
They are useless since they cannot distinguish the fine structure of polynomial time: linear time and quadratic time look the same.
This simple insight leads one to study the topic now called "fine-grained reductions," which focuses on exact running times. She explains that the key point of fine-grained analysis is to allow one to compare problems that run in polynomial time. Yet the reductions also "float," so that they are not rigid: a reduction transfers an improvement in the running time of one problem into an improvement for the other. There is a subtlety lurking here. She states:
This approach has led to the discovery of many meaningful relationships between problems, and even sometimes equivalence classes.
She plans to discuss current progress and highlight some new exciting developments.
Luca Trevisan of Berkeley will speak on Ramanujan Graphs. These graphs are of course named after the extraordinary mathematician Srinivasa Ramanujan. Luca plans on reviewing what is known about existence and constructions of Ramanujan graphs, which are the best possible expander graphs from the point of view of spectral expansion—see this for precise definitions.
He will talk about Joel Friedman’s result that random graphs are nearly Ramanujan, and recent simplifications of Friedman’s proof, which was around 100 pages long. Luca will also talk about connections between Ramanujan graphs and the Ihara zeta function, and also about recent non-constructive existence results.
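The spectral condition is concrete: a connected d-regular graph is Ramanujan when every nontrivial adjacency eigenvalue has absolute value at most 2*sqrt(d-1). Here is a small sketch of that check (function names are ours), using the n-cycle, whose eigenvalues have a standard closed form; cycles satisfy the bound trivially, while the interesting cases are d-regular graphs with large d.

```python
import math

def cycle_eigenvalues(n):
    """Adjacency eigenvalues of the n-cycle C_n (2-regular):
    2*cos(2*pi*k/n) for k = 0..n-1, a standard closed form."""
    return [2 * math.cos(2 * math.pi * k / n) for k in range(n)]

def is_ramanujan(d, eigenvalues, tol=1e-9):
    """A connected d-regular graph is Ramanujan if every
    eigenvalue other than the trivial ones (+d, and -d when the
    graph is bipartite) has absolute value at most 2*sqrt(d-1)."""
    bound = 2 * math.sqrt(d - 1)
    nontrivial = [ev for ev in eigenvalues if abs(abs(ev) - d) > tol]
    return all(abs(ev) <= bound + tol for ev in nontrivial)
```

For example, the complete graph on d+1 vertices has eigenvalues d and -1, so it is Ramanujan for every d at least 2; Friedman's theorem says random d-regular graphs come within epsilon of the 2*sqrt(d-1) bound.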
The Ihara zeta function, named for Yasutaka Ihara, can be defined by a formula analogous to the Euler product for the usual Riemann zeta function:

$$\zeta_G(u) = \prod_{p} \frac{1}{1 - u^{\ell(p)}}.$$

This product is taken over all prime walks p of the graph, that is, closed cycles without backtracking that are not powers of shorter cycles, where \ell(p) denotes the length of p. See this for the rest of the formal definition, and also this 2001 survey on zeta functions of graphs. Toshikazu Sunada made the key connection between Ramanujan graphs and this function, which was first defined in 1966 in a totally different context.
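The Euler product over prime walks is infinite, but for a finite graph the function is rational, and Bass's determinant formula gives a finite way to evaluate it: the reciprocal of zeta equals (1 - u^2)^(m - n) times det(I - Au + (D - I)u^2), where A is the adjacency matrix, D the degree matrix, n the number of vertices, and m the number of edges. A pure-Python sketch (helper names are ours), tested against the well-known closed-form factorization for the complete graph K4:

```python
def det(M):
    """Determinant by Laplace expansion; fine for tiny matrices."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += ((-1) ** j) * M[0][j] * det(minor)
    return total

def ihara_zeta_inverse(A, u):
    """1/zeta_G(u) via Bass's formula:
    (1 - u^2)^(m - n) * det(I - A*u + (D - I)*u^2)."""
    n = len(A)
    degs = [sum(row) for row in A]
    m = sum(degs) // 2                      # number of edges
    M = [[(1.0 if i == j else 0.0)
          - A[i][j] * u
          + ((degs[i] - 1) * u * u if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return (1 - u * u) ** (m - n) * det(M)

# K4: the complete graph on 4 vertices, a standard worked example;
# its zeta reciprocal factors as (1-u^2)^2 (1-u)(1-2u)(1+u+2u^2)^3.
K4 = [[0, 1, 1, 1],
      [1, 0, 1, 1],
      [1, 1, 0, 1],
      [1, 1, 1, 0]]
```

For K4 the adjacency eigenvalues 3 and -1 make the determinant factor into (1 - 3u + 2u^2)(1 + u + 2u^2)^3, recovering the closed form above.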
It seems amazing that graph problems can be coded into zeta-like functions. One wonders what other problems might yield to similar ideas.
We hope the talks have a nice turnout and are looking forward to a banner day.