Skip to content

Our nomination for best result of the year

 Cropped from source

Faadosly Polir has had an incredible year of research success. A year ago we covered his research on universal information entropy in baseball. This led to uncredited co-authorship on a paper, which Quanta covered last week, about reversal of entropy in a previously-built 51-qubit quantum computer. The paper focuses on quantum many-body scarring in a Bunimovich stadium, but Polir’s work on athletic body scarring in baseball stadiums was integral.

Today we report on his exciting and topical new findings.

Polir’s next project implemented a suggestion by the complexity theorist Vaughan Pratt on how to research the influence of random events on the information content of the universe:

Picture inserting a giant eggbeater into the deep state cloud servers and relocating all its bits at random. How would you recover important information like the net worth of Meghan Markle or President Trump’s tax returns?

Polir began the Cloud Beater Project using spare Amazon Web Services credits from Ken. The results so far are classified, but in addition to the above objectives there is a current push to infer from web traffic the full text of the impending report by Robert Mueller. The string length has been inferred. Polir’s model algebraically represents the entire cloud as a finite automaton analogous to Alan Turing’s model of the Enigma machine during World War II. The inferencing process is similar and cheap.

The newest work, however, will have the greatest social impact for the research community.

## The Problem

The problem is one we all are familiar with: the impact of the Internet and computing technology resources on critical thinking and academic research productivity. After extensive analysis-of-variance runs on Google Scholar data, Polir’s team isolated one key adverse factor:

LaTeX forces one to compile and recompile multiple times.

In retrospect the effect of this should have been suspected—after all, LaTeX is the one factor common to a major part of research across all scientific disciplines. Here are examples of relevant user comments, some paraphrased:

${\bullet}$ You cannot avoid it. LaTeX has to build the .aux file to store the table of contents (ToC) and such, because when typesetting the ToC it cannot know beforehand what sections will occur.

${\bullet}$ Often not only two but three or more runs are necessary. Imagine a document with lots of figures and a list of figures somewhere at the first pages of the document. All pages are numbered using ascending Arabic numbers. First compilation run will typeset the document without any entry at the list of figures but collects the information at the .lof file (via .aux file). Second compilation run will typeset the list of figures with several pages and as a consequence of this the page numbers of all figures will change. So you need a third run to get the correct numbers at the list of figures.

${\bullet}$ To correctly line up the various parts of a longtable, “In the typical case the algorithm will converge after three or four passes” (manual, section 4).

## The Magnitude

The result is the observation that there now are so many uses of LaTeX that any inefficiency in using it has great consequences. The bottom line of the first findings chapter in the report led by Polir is:

We found that over 100 million hours of productive time per year are lost to the multiple-pass issue in LaTeX.

There are now billions of LaTeX pages being worked on each year. See this for some data. Thus at a conservative estimate of \$100 dollars per hour this means that we are losing over ten billion dollars per year.

## Ken’s Solution

Ken has a way of coping. He spends the time waiting for LaTeX documents to open and compile on his office PC in a productive way. He used to check fantasy baseball or football player news during these waits. But last year he acquired a Facebook friend (FBF) from the chess world who became the first—out of his several hundred FBFs—to post cat videos. Several times daily cat videos, all new.

Ken had never seen a Facebook cat video before. So now he goes to the FBF’s page and clicks the next one while waiting for LaTeX to finish. This does, however, have less relevance to his statistical chess research than fantasy sports had.

 The mother lode?

## Polir’s Solution

Polir found a different solution that does not use cat videos—too bad for the cats. Polir first approached the most recently responsible party, who won the 2013 ACM Turing Award for solving the broken deli number dispenser problem. The prize-winning solution, as described in the citation, is to have each customer choose a number higher than all the other customers’ numbers. But how to leverage this brilliant solution to solve the multi-pass problem?

Polir’s insight came while consulting the original responsible party and author of the following computer code for a SAT solver. While hearing the author explain how this code emerged from his Pascal-based WEB and C-based CWEB programming environments, Polir realized this in a flash:

A LaTeX document is a network.

This led to the realization that the quick-compilation problems could be formulated as instances of network coding problems upon dividing the document into ${O(\log n)}$-sized chunks. A helpful reader of this blog supplied the final needed piece last week in a comment: via this new paper, there is an efficient reduction from network coding to integer multiplication. The reduction is not a string mapping but rather “corporate” via direct analysis of solving circuits.

Thus the solution came down to ideas in the brilliant new paper on integer multiplication which we just posted about. It means that we can solve the multi-pass problem essentially by multiplying pieces of the document together—or more exactly, convolving them. We needn’t be concerned about our remarks on the algorithm in that paper being galactic—there is a clear proof of concept in that the paper’s second author is also the chief developer of the TeXmacs composition environment. We thus nominate the paper plus Polir’s breakthrough consequence for result of the year.

## Open Problems

Incidentally, that comment has tag number 100,107—which means we just passed the 100,000 milepost in whatever those tags signify. We actually have 18,694 comments over the history of the blog and have been intending to mark its passing 20K. This again gives us a “no-fooling” opportunity to thank our readers.

3 Comments leave one →
1. Charley Sheaffer permalink
April 1, 2019 3:35 pm

Your analysis should include time spent clicking the Trash Aux Files button.

2. David J. Littleboy permalink
April 2, 2019 1:50 am

Re: Your discussion of Latex recompilation issues.

For next years April Fools you could do an article on how much Computer Science and Mathematics knowledge has been lost by people burning insane amounts of what would have otherwise been productive time futzing with (at both the user and designer/implementer levels) text formatting software. The only problem is that it’s not a joke…

• April 2, 2019 12:12 pm

I have my own story, told in a long comment here.