We revisit a paper from 1994

Richard Lipton is, among so many other things, a newlywed. He and Kathryn Farley were married on June 4th in Atlanta. The wedding was attended by family and friends including many faculty from Georgia Tech, some from around the country, and even one of Dick’s former students coming from Greece. Their engagement was noted here last St. Patrick’s Day, and Kathryn was previously mentioned in a relevantly-titled post on cryptography.

Today we congratulate him and Kathryn, and as part of our tribute, revisit a paper of his on factoring from 1994.

They have just come back from their honeymoon in Paris. Paris is many wonderful things: a flashpoint of history, a center of culture, a city for lovers. It is also the setting for most of Dan Brown’s novel The Da Vinci Code and numerous other conspiracy-minded thrillers. Their honeymoon was postponed by an event that could be a plot device in these novels: the Seine was flooded enough to close the Louvre and Musée D’Orsay and other landmarks until stored treasures could be brought to safe higher ground.

It is fun to read or imagine stories of cabals seeking to collapse world systems and achieve domination. Sometimes these stories turn on scientific technical advances, even purely mathematical points as in Brown’s new novel, Inferno. It takes a pinch to realize that we as theorists often verge on some of these points. Computational complexity theory as we know it is asymptotic and topical, so it is a stretch to think that papers such as the present one impact the daily work of those guarding the security of international commerce or investigating possible threats. But from its bird’s-eye view there is always the potential to catch a new glint of light reflected from the combinatorial depths that could not be perceived until the sun and stars align right. In this quest we take a spade to dig up old ideas anew.

 Pei’s Pyramid of the Louvre Court = Phi delves out your prime factor… (source)

## The Paper and the Code

The paper is written in standard mathematical style: first a theorem statement with hypotheses, next a series of lemmas, and the final algorithm and its analysis coming at the very end. We will reverse the presentation by beginning with the algorithm and treating the final result as a mystery to be decoded.

Here is the code of the algorithm. It all fits on one sheet and is self-contained; no abstruse mathematics text or Rosetta Stone is needed to decipher. The legend says that the input ${x}$ is a product of two prime numbers, ${\phi}$ is a polynomial in just one variable, and ${\gcd}$ refers to the greatest-common-divisor algorithm expounded by Euclid around 300 B.C. Then come the runes, which could not be simpler:

• Repeat until exit:

• ${a := }$ a random number in ${1,\dots,x-1}$;

• ${b := \phi(a) \bmod{x}}$;

• if ${(\gcd(b,x) > 1)}$ then exit.

Exiting enables reading off the two prime factors of ${x}$—but a final message warns of a curse of vast unknowable consequences.
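The runes transcribe directly into code. Here is a minimal sketch (the function name and trial cap are ours); `phi` may be any integer polynomial passed in as a Python callable, and the choice of ${\phi}$ is exactly the mystery explored below:

```python
import math
import random

def trial_factor(x, phi, max_trials=10**6):
    """Repeat: pick random a in 1..x-1, set b = phi(a) mod x,
    and exit when gcd(b, x) is a nontrivial divisor of x.
    Returns a prime factor of x, or None if no trial succeeds."""
    for _ in range(max_trials):
        a = random.randrange(1, x)
        b = phi(a) % x
        g = math.gcd(b, x)
        if 1 < g < x:  # success: g is p or q
            return g
    return None
```

With the identity polynomial ${\phi(a) = a}$ this is just blind sampling for a multiple of ${p}$ or ${q}$; the whole game is choosing ${\phi}$ so that the exit comes much sooner.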

How many iterations must one expect to make through this maze before exit? How and when can the choice of the polynomial ${\phi}$ speed up the exploration? That is the mystery.

Our goal is to expose the innards of how the paper works, so that its edifice resembles another famous modern Paris landmark:

This is the Georges Pompidou Centre, whose anagram “go count degree prime ops” well covers the elements of the paper. Part of the work for this post—in particular the possibility of improving ${|p| = n^{\epsilon}}$ to ${|p| = n(\frac{1}{2} - \epsilon)}$—is by my newest student at Buffalo, Chaowen Guan.

## A First Analysis

Let ${x = pq}$ with ${p}$ and ${q}$ prime. To get the expected running time, it suffices to have good lower and upper bounds

$\displaystyle \pi_p \leq \Pr_a[\phi(a) \equiv 0 \!\!\pmod{p}] \leq \rho_p$

and analogous bounds ${\pi_q,\rho_q}$ for ${q}$. Then the probability ${P}$ of success on any trial ${a}$ is at least

$\displaystyle \pi_p - \rho_q.$

This lower bounds the probability of ${b \equiv 0 \!\!\pmod{p} \wedge b \not\equiv 0 \!\!\pmod{q}}$, whereupon ${\gcd(b,x)}$ gives us the factor ${p}$.

We could add a term ${\pi_q - \rho_p}$ for the other way to have success, which is ${b \not\equiv 0 \!\!\pmod{p} \wedge b \equiv 0 }$ ${\!\!\pmod{q}}$. However, our strategy will be to make ${\rho_q}$ and hence ${\pi_q}$ close to ${0}$ by considering cases where ${p \ll q}$ but ${p}$ is still large enough to matter. Then we can ignore this second possibility and focus on ${\pi_p - \rho_q}$. At the end we will consider relaxing just so that ${\rho_q}$ is bounded away from ${1}$.
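The quantities ${\pi_p}$, ${\rho_q}$, and the success probability ${P}$ can be estimated empirically for any candidate ${\phi}$; here is a small Monte-Carlo sketch (function name and parameters ours). By inclusion-exclusion the third estimate always dominates the difference of the first two, mirroring the bound ${P \geq \pi_p - \rho_q}$:

```python
import random

def estimate_bounds(x, p, q, phi, trials=20000, seed=0):
    """Monte-Carlo estimates of Pr[phi(a) ≡ 0 mod p],
    Pr[phi(a) ≡ 0 mod q], and the success event
    b ≡ 0 (mod p) while b ≢ 0 (mod q)."""
    rng = random.Random(seed)
    hit_p = hit_q = win = 0
    for _ in range(trials):
        a = rng.randrange(1, x)
        b = phi(a) % x
        hit_p += (b % p == 0)
        hit_q += (b % q == 0)
        win += (b % p == 0 and b % q != 0)
    return hit_p / trials, hit_q / trials, win / trials
```

For example, with ${x = 11 \cdot 101}$ and the identity polynomial, the first estimate comes out near ${1/11}$ and the second near ${1/101}$, so success on a trial is already about ${9\%}$ even before any clever choice of ${\phi}$.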

Note that we cannot consider the events ${b \equiv 0 \!\!\pmod{p}}$ and ${b \equiv 0 \!\!\pmod{q}}$ to be independent, even though ${p}$ and ${q}$ are prime, because ${b = \phi(a)}$ and ${\phi}$ may introduce bias. We could incidentally insert an initial test for ${\gcd(a,x) > 1}$ without affecting the time or improving the success probability by much. Then conditioned on its failure, the events ${a \equiv 0 \!\!\pmod{p}}$ and ${a \equiv 0 \!\!\pmod{q}}$ become independent via the Chinese Remainder Theorem. This fact is irrelevant to the algorithm but helps motivate the analysis in part.
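The Chinese Remainder Theorem fact being used is that ${a \mapsto (a \bmod p,\, a \bmod q)}$ is a bijection on ${[pq]}$, so a uniform ${a}$ yields independent uniform residues. A short sketch of the reverse direction (the function name is ours; `pow(b, -1, m)` needs Python 3.8+):

```python
def crt_combine(ap, aq, p, q):
    """Chinese Remainder Theorem: the unique a mod p*q with
    a ≡ ap (mod p) and a ≡ aq (mod q), for coprime p, q."""
    return (ap * q * pow(q, -1, p) + aq * p * pow(p, -1, q)) % (p * q)
```

Running the map forward and backward over all of ${[pq]}$ confirms the bijection, which is what licenses treating the two residues of ${a}$ as independent draws.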

This first analysis thus focuses the question to become:

How does computing ${b = \phi(a)}$ change the sampling?

We mention in passing that Peter Shor’s algorithm basically shows that composing certain (non-polynomial) functions ${\phi_a(y)}$ into the quantum Fourier transform greatly improves the success of the sampling. This requires, however, a special kind of machine that, according to some of its principal conceivers, harnesses the power of multiple universes. There are books and even a movie about such machines, but none have been built yet and this is not a Dan Brown novel so we’ll stay classically rooted.

## Using Polynomial Roots

Two great facts about polynomials ${\phi(z)}$ of degree ${d}$ are:

• There are guaranteed to be at most ${d}$ roots.

• If ${u \equiv v \!\!\pmod{p}}$, then ${\phi(u) \equiv \phi(v) \!\!\pmod{p}}$.

The second requires the coefficients of ${\phi}$ to be integers. Neither requires all the roots to be integers, but we will begin by assuming this is the case. Take ${R}$ to be the set of integer roots of ${\phi}$. Then define

$\displaystyle S_p = \{u + cp \!\!\pmod{x} : u \in R \wedge c \in [q]\},$

where as usual ${[q] = \{0,\dots,q-1\}}$. The key point is that

$\displaystyle P \geq \frac{1}{x}(|S_p \setminus R|) - \rho_q.$

To prove this, suppose the random ${a}$ belongs to ${S_p}$ but not to ${R}$. Then ${a = u + cp}$ for some ${u \in R}$, so ${ \phi(a) \equiv \phi(u + cp) \equiv \phi(u) = 0 \!\!\pmod{p}}$, but ${\phi(a) \neq 0}$ since ${a \notin R}$. There is still the possibility that ${\phi(a)}$ is a nonzero multiple of ${x}$, which would give ${b = 0}$ and deny success, but this entails ${\phi(a) \equiv 0 \!\!\pmod{q}}$ and so this is accounted by subtracting off ${\rho_q}$.
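The mechanics of ${S_p}$ are easy to check concretely on a toy case (names and numbers ours): shifting a root ${u}$ by any multiple of ${p}$ keeps ${\phi \equiv 0 \!\!\pmod{p}}$, and the shift that avoids ${R}$ yields the factor via the gcd:

```python
from math import gcd, prod

def phi_from_roots(roots):
    """A toy integer polynomial with the given integer roots:
    phi(z) = prod over u in roots of (z - u)."""
    return lambda z: prod(z - u for u in roots)

def strand(u, p, q, x):
    """S_{u,p}: the q shifts of root u by multiples of p, reduced mod x."""
    return {(u + c * p) % x for c in range(q)}
```

With ${p = 7}$, ${q = 11}$, ${x = 77}$, and ${\phi(z) = (z-3)(z-5)}$: the strand of the root ${3}$ has the full ${q = 11}$ members, every member ${v}$ has ${\phi(v) \equiv 0 \!\!\pmod 7}$, and the non-root member ${v = 10}$ gives ${\gcd(\phi(10), 77) = \gcd(35, 77) = 7}$.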

Our lower bound ${\pi_p}$ will be based on ${\frac{1}{x}(|S_p \setminus R|)}$. There is one more important element of the analysis. We do not have to bound the running time for all ${x}$, that is, all pairs ${(p,q)}$ of primes. For security, factoring needs to be hard for almost all ${x}$. Hence to challenge this, it suffices to show that ${\pi_p}$ is large in the average case over ${p}$. Thus we are estimating the distributional complexity of a randomized algorithm. These are two separate components of the analysis. We will show:

For many primes ${p}$ belonging to a large set ${\mathcal{P}}$ of primes of length substantially below ${n/2}$, where ${n}$ is the length of ${x}$, ${\pi_p}$ is “large.”

We will quantify “large” at the end, and it will follow that since ${q}$ is substantially greater, ${\rho_q}$ is “tiny” in the needed sense. Now we are ready to estimate the key cardinality ${|S_p \setminus R|}$.

## The Lever

In the best case, ${|S_p|}$ can be larger than ${|R|}$ by a factor of ${q}$. This happens if for every root ${u}$, the values ${u + cp \!\!\pmod{x}}$ do not hit any other members of ${R}$. When this happens, ${|S_p|}$ itself can be as large as ${x}$. Then

$\displaystyle \pi_p = \frac{1}{x} (q - 1)|R| \approx 1.$

By a similar token, for any ${\delta > 0}$, if ${|S_p \setminus R| > \delta q|R|}$, then ${\pi_p > \delta}$—or in general,

$\displaystyle \pi_p > \delta\frac{|R|}{p}.$

The factor of ${(q-1)}$ is the lever by which to gain a higher likelihood of quick success. When will it be at our disposal? It depends on whether ${p}$ is “good” in the sense that ${|S_p \setminus R| \geq \delta q |R|}$ and also on ${R}$ itself being large enough.

For each root ${u \in R}$ and ${p}$ define the “strand” ${S_{u,p} = \{v_c : c \in [q]\}}$ where ${v_c = u + cp \!\!\pmod{x}.}$ There are always ${q}$ distinct values in any strand. If ${|R| \ll q}$ then every strand has most of its values ${v_c}$ as non-roots. There is still the possibility that ${\phi(v_c) \equiv 0 \!\!\pmod{x}}$—that is, such that ${\phi(v_c) \equiv 0 \!\!\pmod{q}}$—which would prevent a successful exit. This is where ${p \ll q}$ really comes in, attending to the upper bound ${\rho_q}$.

 The Paris church of St. Sulpice and its crypt (source)

Hence what can make a prime ${p}$ “bad” is having a low number of strands. When ${v = u + cp}$ and ${v \in R}$ the strands ${S_{u,p}}$ and ${S_{v,p}}$ coincide—and this happens for any other ${p'}$ such that ${p'}$ divides ${v - u}$.

Here is where we hit the last important requirement on ${R}$. Suppose ${v = u + Cp}$ where ${C}$ is the product of every prime ${p' < x}$ other than ${p}$. Then ${S_{u,p'}}$ and ${S_{v,p'}}$ coincide for every prime ${p' < x}$. It doesn’t matter that ${v}$ is astronomically bigger than ${x}$ or ${x' = p' q}$; the strands still coincide within ${[x]}$ and within ${[x']}$.

Hence what we need to do is bound the roots by some value ${L}$ that is greater than any ${x'}$ we are likely to encounter. The ${x'}$ is not too great: if we limit to ${p'}$ of the same given length as ${p}$, then ${p' < 2p}$ so ${x' < 2x}$. We need not impose the requirement ${R \subseteq [L]}$ but must replace ${|R|}$ above by ${r = |R'|}$ where ${R' = R \cap [L]}$. We can’t get in trouble from ${w \in R \setminus R'}$ such that ${p}$ divides ${w - u}$ and ${p}$ divides ${w - v}$ since then ${p}$ divides ${v - u}$ already. This allows the key observation:

For any distinct pair ${u,v \in R'}$, there are at most ${\log(L)}$ primes ${p}$ such that ${p}$ divides ${v - u}$.

Thus given ${R'}$ we have ${\binom{r}{2}\log(L)}$ “slots” for primes ${p}$. Every bad prime must occupy a certain number of these slots. Counting these involves the last main ingredient in Dick’s paper. We again try to view it a different way.

Given ${\delta}$, and replacing the original ${R}$ with ${R' = R \cap [L]}$, ultimately we want to call a prime ${p}$ bad if ${|S_p \setminus R'| < \delta q r}$, where ${r = |R'|}$. We will approach this by calling ${p}$ “bad” if there are ${\leq \delta r}$ strands.

For intuition, let’s suppose ${r = 6}$, ${R' = \{t,u,v,w,y,z\}}$. If we take ${\delta = \frac{1}{2}}$ as the paper does, then we can make ${p}$ bad by inserting it into three slots: say ${(t,u)}$, ${(v,w)}$, and ${(y,z)}$. We could instead insert a copy of ${p}$ into ${(t,u)}$, ${(u,v)}$, and ${(v,w)}$, which lumps ${\{t,u,v,w\}}$ into one strand and leaves ${y,z}$ free to make two others. In the latter case, however, we also know by transitivity that ${p}$ divides ${v - t}$, ${w - u}$, and ${w - t}$ as well. Thus we have effectively used up ${6}$ not ${3}$ slots on ${p}$. Now suppose ${\delta = \frac{1}{3}}$ instead, so “bad” means getting down to ${s = 2}$ strands. Then we are forced to create at least one ${3}$-clique and this means using more than ${r(1 - \delta) = 4}$ slots. Combinatorially the problem we are facing is:

Cover ${r}$ nodes by ${s}$ cliques while minimizing the total number ${m}$ of edges.

This problem has an easy answer: make all cliques as small as possible. Supposing ${\ell = r/s = 1/\delta}$ is an integer, this means making ${s}$-many ${\ell}$-cliques, which (ignoring the difference between ${\binom{\ell}{2}}$ and ${\ell^2 / 2}$) totals ${m \sim \ell^2 s/2 = \ell r/2 = r/(2\delta)}$ edges. When ${\delta}$ is constant this is ${\Theta(r)}$, but this shows possible ways to improve when ${\delta}$ is not constant. We conclude:
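The even-split optimum can be computed exactly rather than via the ${\ell^2/2}$ approximation; a small sketch (function name ours), using convexity of ${\binom{\cdot}{2}}$ to justify splitting as evenly as possible:

```python
from math import comb

def min_cover_edges(r, s):
    """Minimum total edge count when covering r nodes by s disjoint
    cliques: split into parts of size floor(r/s) or ceil(r/s)."""
    size, extra = divmod(r, s)
    return extra * comb(size + 1, 2) + (s - extra) * comb(size, 2)
```

For the intuition example above: ${r=6}$, ${s=3}$ gives ${3}$ slots, while ${r=6}$, ${s=2}$ gives ${6}$, already above the ${r(1-\delta) = 4}$ noted in the text.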

Lemma 1 The number of bad primes ${p}$ is at most ${\frac{r^2 \log(L)/2}{r/(2\delta)} = \delta r \log(L)}$.

We will constrain ${r}$ by bounding the degree of ${\phi}$. By ${p \ll q}$ this will also bound ${r}$ relative to ${q}$ so that the number of possible strands is small with respect to ${q}$, which will lead to the desired bound on ${\rho_q}$. Now we are able to conclude the analysis well enough to state a result.

## The Result

Define ${\mathsf{RAVGTIME}_{\gamma}[t(n)]}$ to mean problems solvable on a ${\gamma}$ fraction of inputs by randomized algorithms with expected time ${t(n)}$. Superscripting ${\phi}$ means having an oracle to compute ${\phi(a)}$ from ${a}$ for free. If ${\phi}$ is such that the time to compute ${\phi(a)}$ is ${t(n)}$ and this is done once per trial, then a given ${\mathsf{RAVGTIME}^{\phi}_{\gamma}[t(n)]}$ algorithm can be re-classified into ${\mathsf{RAVGTIME}_{\gamma}[t(n)^2]}$ without the oracle notation.

Theorem 2 Suppose ${[\phi_n]}$ is a sequence of polynomials in ${\mathbb{Z}[u]}$ of degrees ${d_n}$ having ${r_n}$ integer roots in the interval ${[0,L_n]}$, for all ${n}$. Then for any fixed ${\epsilon > 0}$, the problem of factoring ${n}$-bit integers ${x = pq}$ with ${|p| = \nu < \frac{n}{2} - \epsilon n}$ belongs to

$\displaystyle \mathsf{RAVGTIME}^{\phi_n}_{\Omega(1)}[O(n^2 \log L_n)]$

provided ${d_n = \Theta(\frac{2^{\nu}}{\log L_n})}$ and ${r_n = \Omega(d_n)}$.

Proof: We first note that the probability of a random ${a \in [x]}$ making ${\phi(a) \equiv 0 \!\!\pmod{q}}$ is negligible. By the Chinese Remainder Theorem, as remarked above, ${a}$ gives independent draws ${a_p \in [p]}$ and ${a_q \in [q]}$ and whether ${\phi(a) \equiv 0 \!\!\pmod{q}}$ depends only on ${a_q}$. This induces a polynomial ${\phi'(a_q)}$ over the field ${\mathbb{Z}_q}$ of degree (at most) ${d_n}$. So the probability ${\rho_q}$ of getting a root mod ${q}$ is at most

$\displaystyle \frac{d_n}{q} < O(2^{\nu - (n - \nu)}) = O(2^{2\nu -n}) = O(\frac{1}{2^{2\epsilon n}}),$

which is exponentially vanishing. Thus we may ignore ${\rho_q}$ in the rest of the analysis. The chance of a randomly sampled ${a}$ in a strand of length ${q}$ coinciding with another member ${v}$ of ${R'}$ is likewise bounded by ${\frac{d_n}{q}}$ and hence ignorable.

The reason why the probability of ${a_p}$ giving a root in the field ${\mathbb{Z}_p}$ is not vanishing is that ${r_n}$ is close to ${p}$. By ${r_n \leq d_n}$, we satisfy the constraint

$\displaystyle r_n = O(\frac{2^{\nu}}{\log L_n}) = O(\frac{p}{\log(p)\log_p(L_n)}).$

The condition ${r_n = \Omega(d_n)}$ ensures that this is the actual asymptotic order of ${r_n}$. Since we are limiting attention to primes of the same length as ${p}$, the “${\log(L)}$” above can be to base ${p}$. Hence ${r_n}$ has the right order to give that for some ${\delta > 0}$ and constant fraction of primes of length ${\nu}$, the success probability ${\pi_a}$ of one trial over ${a}$ satisfies

$\displaystyle \pi_a \geq \frac{\delta r_n q}{x} = \frac{\delta r_n}{p} = \Omega(\frac{1}{\log(p)\log_p(L_n)}) = \Omega(\frac{1}{\log(L_n)}).$

Hence the expected number of trials is ${O(\log L_n)}$. The extra ${n^2}$ in the theorem statement is the time for each iteration, i.e., for arithmetic modulo ${x}$ and the Euclidean algorithm. $\Box$

It follows that if ${\phi_n}$ is also computable modulo ${x}$ in ${n^{O(1)}}$ time, and presuming ${L_n < x^{n^{O(1)}}}$ so that ${\log(L_n) = n^{O(1)}\log(x) = n^{O(1)}}$, then factoring products ${x = pq}$ of primes whose lengths differ by just a hair is in randomized average-case polynomial time. Of course this depends on the availability of a suitable polynomial ${\phi}$. But ${\phi}$ could be any polynomial—it needs no relation to factoring other than having plenty of distinct roots relative to its degree as itemized above. Hence there might be a lot of scope for such “dangerous” polynomials ${\phi}$ to exist.
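A toy illustration of how dangerous a root-rich ${\phi}$ would be (all names and parameters ours): if the integer roots of ${\phi}$ cover every residue modulo ${p}$, then every trial has ${\phi(a) \equiv 0 \!\!\pmod{p}}$, and only the rare event ${\phi(a) \equiv 0 \!\!\pmod{q}}$ spoils the exit. Of course this ${\phi}$ is written as an explicit degree-${p}$ product; the real question is whether such behavior can come from a short program:

```python
import math
import random

def phi_mod(a, roots, x):
    """Evaluate prod over u in roots of (a - u), modulo x."""
    b = 1
    for u in roots:
        b = b * (a - u) % x
    return b

def factor_once(x, roots, seed=2):
    """One pass of the loop with phi = prod (z - u)."""
    a = random.Random(seed).randrange(1, x)
    return math.gcd(phi_mod(a, roots, x), x)
```

With ${p = 101}$, ${q = 2^{61} - 1}$ (a Mersenne prime), and roots ${0,\dots,100}$, the very first gcd is already a multiple of ${p}$, and it equals ${p}$ unless ${a}$ hits a root mod ${q}$, which has probability about ${101/q}$. By contrast, taking ${p = 7}$, ${q = 11}$, ${x = 77}$ with roots ${0,\dots,6}$, exactly ${28}$ of the ${76}$ possible values of ${a}$ succeed, showing how a small ${q}$ inflates ${\rho_q}$ and why ${p \ll q}$ matters.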

 Is there a supercomputer under the Palais Royale? (source)

Dick’s paper does give an example where a ${\phi}$ with specified properties cannot exist, but there is still a lot of play in the bounds above. This emboldens us also to ask exactly how big the “hair” needs to be. We do not actually need to send ${\rho_q}$ toward zero. If a constant fraction of the values ${a}$ get bounced by the ${\phi(a) \equiv 0 \!\!\pmod{q}}$ event, then the expected time just goes up by the same constant factor.

## Open Problems

We have tried to present Dick’s paper in an “open” manner that encourages variations of its underlying enigma. We have also optically improved the result by using ${\nu = n(\frac{1}{2} - \epsilon)}$ rather than ${n^{\epsilon}}$ as in the paper. However, this may be implicit anyway since the paper’s proofs might not require “${\epsilon}$” to be constant, so that by taking ${\epsilon = 1 + \frac{\log C}{\log n}}$ one can make ${n^{\epsilon} = Cn}$ for any desired factor ${C < \frac{1}{2}}$. Is all of this correct?

If so, then possibly one can come even tighter to ${\frac{n}{2}}$ for the length of ${p}$. Then the question shifts to the possibilities of finding suitable polynomials ${\phi}$. The paper “Few Product Gates But Many Zeroes,” by Bernd Borchert, Pierre McKenzie, and Klaus Reinhardt, goes into such issues. This paper investigates “gems”—that is, integer polynomials of degree ${N}$ having ${N}$ distinct integer roots and minimum possible circuit complexity for their degree—finding some for ${N}$ as high as 55 but notably leaving ${N = 32}$ open. Moreover, the role of a limitation ${L}$ on the magnitude of a constant fraction of a gem’s roots remains at issue, along with roots exceeding ${L}$ having many relatively small prime factors.

Finally, we address the general case ${\phi(z) = \sum_{i=0}^d c_i z^i}$ with rational coefficients ${c_i}$ (and ${c_d = 1}$). If ${\phi(a) = \frac{b}{c}}$ in lowest terms then ${\gcd(\phi(a),x)}$ means ${\gcd(b,x)}$ (divided by ${c}$) so the algorithm is the same. Suppose ${\frac{r}{s}}$ is a rational root in lowest terms and ${p}$ does not divide ${s}$, nor the denominator of any ${c_i.}$ Then we can take ${w}$ such that ${ws = bp + 1}$ for some ${b}$ and define ${u = w(p + r)}$. This gives

$\displaystyle u - \frac{r}{s} = \frac{us - r}{s} = \frac{w(p+r)s - r}{s} = \frac{(bp+1)(p+r) - r}{s} = \frac{bp^2 + brp + p}{s},$

which we write as ${\frac{\beta p}{s}}$. Then ${\phi(u) = \phi(\frac{r}{s} + \frac{\beta p}{s})}$. Because ${\frac{r}{s}}$ is a root, it follows that ${\phi(u)}$ is a sum of terms in which each numerator is a multiple of ${p}$ and each denominator is not. So ${\phi(u) = \frac{\alpha p}{\sigma}}$ in lowest terms where possibly ${\alpha = 0}$. Thus ${\gcd(\phi(u),x)}$ either yields ${p}$ or falls into one of two cases we already know how to count: ${u}$ is another root of ${\phi}$ or we have found a root mod ${q}$. Since ${u + cp}$ behaves the same as ${u}$ for all ${c \in [q]}$, we can define integer “strands” ${S_{u,p} = S_{r,s,p}}$ as before. There remains the possibility that strands induced by two roots ${y = r/s}$ and ${z = r'/s'}$ coincide. Take the inverse ${w'}$ for ${s'}$ and resulting integer ${u' = w'(p + r')}$, then the strands coincide if ${u \equiv u' \!\!\pmod{p}}$. This happens iff ${wr \equiv w' r' \!\!\pmod{p}}$. Multiplying both sides by ${ss'}$ gives

$\displaystyle wrss' = (bp+1)rs' \equiv rs' \equiv w' r' s s' = r' s(b' p + 1) \equiv r' s,$

so it follows that ${p}$ divides the numerator of ${z - y = \frac{rs' - sr'}{ss'}}$ in lowest terms. Thus we again have “slots” for each distinct pair ${y,z \in R}$ of rational roots and each possible prime divisor of the numerator of their difference. Essentially the same counting argument shows that a “bad” ${p}$ must fill ${\Omega(|R|)}$ of such slots. The other ways ${p}$ can be bad include dividing the denominator ${s}$ of a root or the denominator of a coefficient ${c_i}$—although neither way is mentioned in the paper it seems the choices for ${d_n}$ and ${r_n}$ in the above theorem leave just enough headroom. Then we just need ${L}$ to be a bound on all numerators and denominators involved in the bad cases, arguing as before. Last, it seems as above that only a subset ${R'}$ of the roots with ${|R'|/|R|}$ constant (or at least non-negligible) is needed to obey this bound. Assuming this sketch of Dick’s full argument is airtight and works for our improved result, we leave its possible further ramifications over the integer case as a further open problem.
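The integer lift ${u = w(p + r)}$ can be checked on a concrete case (the polynomial and numbers here are our own hypothetical example, not from the paper): take ${\phi(z) = z^2 - \frac{5}{6}z + \frac{1}{6}}$ with rational roots ${\frac{1}{2}}$ and ${\frac{1}{3}}$, and ${p = 7}$:

```python
from fractions import Fraction

def lift_root(r, s, p):
    """Integer u ≡ r/s (mod p), built as in the text: w = s^{-1} mod p,
    so w*s = b*p + 1, and u = w*(p + r). Requires p not dividing s."""
    w = pow(s, -1, p)
    return w * (p + r)

def phi(z):
    """Hypothetical monic polynomial with rational coefficients,
    roots 1/2 and 1/3: phi(z) = z^2 - (5/6) z + 1/6."""
    z = Fraction(z)
    return z * z - Fraction(5, 6) * z + Fraction(1, 6)
```

For the root ${\frac{1}{2}}$ we get ${w = 2^{-1} \bmod 7 = 4}$ and ${u = 4(7+1) = 32}$; then ${\phi(32) = \frac{1995}{2}}$ in lowest terms, and the numerator ${1995 = 7 \cdot 285}$ is divisible by ${p}$, so ${\gcd}$ picks up the factor just as the argument claims.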

Update 7/10: I’ve made a web folder of photos from Dick’s wedding and honeymoon.

[linked wedding announcement, clarified nature of Shor’s phi, added more about gems, linked photos, fixed n/2 – eps to n/2 – eps*n in theorem statement]

1. June 30, 2016 9:18 am

congratulations! paris/ louvre huh? pretty suave! guess there is hope/ inspiration for hardcore geeks/ hackers/ mathematicians/ scientists everywhere in the luv department now. & dont forget the Musee de Orsay too ofc.

oh re the technical content: it seems like the theoretical grounding of RSA deserves a lot more analysis and that an entire book could be written on the subj. there is some, maybe lots of mystery remaining. its basically a special case of turings halting problem once again… if there were a top 10 algorithms for CS, it seems like RSA would be in top 5 or maybe even top 3…

2. June 30, 2016 9:29 am

Congratulations Dick and Kathryn! Best wishes to both of you!

July 1, 2016 7:01 pm

are you saying factoring is in P?

• July 1, 2016 8:00 pm

No. The original big obstacle is the complexity of the sequence of polynomials φ_n in terms of n. This is related to the so-called “tau” conjecture of Lenore Blum, Michael Shub, and Steve Smale. The post clarifies L as a second obstacle even in the integer-roots case where all arithmetic can be done mod x. However, the original paper’s limitation to p being much smaller than q may be largely removable, and the timing also improved a bit at least in the terms stated—Dick’s paper defines everything to be quasi-polynomial from the get-go. But both he and I take very seriously the position that factoring may be in P.

July 2, 2016 1:27 pm

Two queries:
1) If straight line complexity of factorial (n!) is small does that mean there exists at least one polynomial of form (x-a1)…(x-am)g(x) at every m such that straight line complexity of this polynomial is O((log m)^c)?

2) If integer factorization is easy then does that mean there exists at least one polynomial of form (x-a1)…(x-am)g(x) at every m such that straight line complexity of this polynomial is O((log m)^c)?

July 2, 2016 4:26 pm

What is lipton blog’s stance on P=BPP?

July 2, 2016 9:51 am

Congratulations! I was wondering, what level of mathematics is recommended for understanding these posts? I am going into AP Calc next year, and I have trouble understanding some of the jumps you make and notation you use. Are there any good write ups on the mathematical concepts used most often in TCS?

Thank You

• July 2, 2016 10:40 am

Julian, thanks for the interesting comment. When we get technical, we center on promoting undergraduates and early-year graduate students to take up computing theory. This present post is also at research level—it’s in the gray zone where it’s not really an original paper (though it may become the “journal version” of one) but is pointing out some things that may be new. The part after “Open Problems” is a technical appendix that I wanted to include for peers but keep out of the main narrative. I for one may have special interest in adjusting my writing to high-school level.

As for a good source on concepts, this is part of the argument connected with the previous post on CS theory education. The answer is a background in discrete mathematics, automata/formal-languages, and elementary linear algebra, which typically range 1xx–3xx in college. In particular there is a move to fuse the first and second at maybe 2xx level as opposed to having separate textbooks like Rosen’s and Sipser’s which I could point you to now. But if I recall right, Hungary in the late 1950s or early 1960s started a program to introduce this material in high school. Dick and I started discussing this before this post last December.

5. July 4, 2016 4:15 am

Congratulations to Dick and Kathryn!

July 10, 2016 7:49 am

Dear Gil, I understood the follow the perturbed leader principle, which had been bothering me for a while what it really means, in realizing that Hedge is a log sum exp version of it. The true meaning of why being greedy doesn’t make sense in computational complexity theory implies that I still don’t get why I did not immediately get why non stochastic means probabilistic. I think it is weird that learning theory should do some soul searching on their definitions and how they apply to gaming adversaries. Rob Schapire’s famous one line proof that Hedge cannot compete against an arbitrary opponent is wrong I think. As Richard points out in his book it makes little sense that one bit impossibility proofs (cf. P is not equal to NP) are seriously problematic only to thank the people who commented that his question makes sense. Best, Yannis (to contact me should you wish please refer to my contact details in this blog as a one time pad)

July 11, 2016 2:33 pm

What do the birds do when they die singing? They put out this fine article: http://www.mixanitouxronou.gr/ta-poulia-pethenoun-tragoudontas-o-pater-ralf-itan-gkei-ke-i-megki-eroteftike-ton-antra-pou-misise-sto-sirial/ which cancels out the nonsense a lovely German woman took on YouTube.

I had asked Jen that I cannot find a good relationship between optimization and security in walking on Nassau Street toward the Fitz-water gate. Very good water though. The gate is actually FitzRandolph in Princetoniana font.

Look up and look up Straffin’s book because liberal art colleges are not like Princeton. Your’s truly from the love seat in my western movie.

The microeconomics bible explains the story differently, first you do optimization for one person against the economy and then the economy asks his wife to pray to money. Did she know about it and what if he did not have a wife. Will he ever?

Apologize to the FitzRandolph Gate.

• July 12, 2016 8:45 am

Τιποτα αλλο εχεις να πεις, ή με αυτο τα ειπες ολα και εξαντληθηκες?

(translation)

Is there anything else you wish to say, or have you actually exhausted all there was to be said?

Do you think we are stupid?

July 4, 2016 6:13 pm

Hi: I was reading a newspaper article on Georgia being up to the effort of germinating wellsprings only to be disappointed by the jest of the Athens decorum having had declined in with the idea that geopolitics should be based on the principle that academic institutions can only flourish under principles of treason unless trails of history are laid bare. In other words, Georgia Tech sounds weird however Georgia** Tech sounds more honest. Maybe the wildcards should correspond to the order in which Georgia was instituted as a state of the United States of America to pay tribute to the fact that γεωργία means something else in Greek which only makes sense as an echo from a slightly different culture if and only if this difference is, in my humble opinion, properly clarified. Maybe the difference however is not so slight but I would like to know about it.

July 4, 2016 6:14 pm

Yannis whatever

July 7, 2016 6:15 pm

But Paris under the water was a great sight too…
http://www.cbc.ca/news/multimedia/this-is-what-it-looks-like-in-flooded-paris-1.3614718

July 9, 2016 3:51 pm

“The extra ${n^2}$ in the theorem statement is the time for each iteration, i.e., for arithmetic modulo {x} and the Euclidean algorithm.”

Can the ${n^2}$ be replaced by something quasilinear, i.e., ${n (\log n)^2 \log \log n}$, by using a fast Euclidean algorithm?

(Also, has the degree of the polynomial been included in the iteration analysis? I would think there should be a $\log d$ term here when evaluating the polynomial, but perhaps I missed something.)

• July 9, 2016 4:38 pm

Thanks, yes, a faster Euclidean algorithm would cut the time. I left as a hedge the possibility of using modular exponentiation in O(n^2) time. The polynomial is treated as an oracle with evaluations “for free”, thus skirting the degree being exponential in n in the theorem statement. The straight-line programs used in Dick’s paper can reach high degree quickly by iterated squaring (unlike formulas), whence your log(d) term would enter, but the polynomial may also have exponentially-many terms. The tau-conjecture is about the complexity of polynomials with many distinct roots; this and the magnitude L of the roots are two defenses for factoring from the attack in the paper—do you think perhaps the “p << q" condition in the paper no longer is?

July 9, 2016 5:23 pm

Thanks, I had forgotten the oracle would sidestep the polynomial evaluation.

It seems like this tightens p’s size significantly, so yes, my intuition says it’s not the issue. But the tau-conjecture is pretty significant.

The “gem” paper is fascinating. I’d seen a limited case of it in Crandall-Pomerance (the normalized gems), but this gives much more details.

Thanks as always for the great blog!