Skip to content

The Halting Problem For Linear Machines

July 4, 2015

A small idea before the fireworks show


Thoralf Skolem was a mathematician who worked in mathematical logic, set theory, and number theory. He was the only known PhD student of Axel Thue, whose Thue systems were an early word-based model of computation. Skolem had only one PhD student, Öystein Ore, who did not work in logic or computation. Ore did, however, have many students including Grace Hopper and Marshall Hall, Jr., and Hall had many more including Don Knuth.

Today Ken and I try to stimulate progress on a special case of Skolem’s problem on linear sequences.

Although Ore worked mainly on ring theory and graph theory the seeds still collected around Skolem’s tree: Hall’s dissertation was titled “An Isomorphism Between Linear Recurring Sequences and Algebraic Rings.” Sequences defined by a finite linear operator are about the simplest computational process we can imagine:

\displaystyle  x_{n} = a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_m x_{n-m}.

The coefficients {a_1,\dots,a_m} and initial values {x_{0},\cdots,x_{m-1}} can be integers or relaxed to be algebraic numbers. Skolem posed the problem of deciding whether there is ever an {n} such that {x_n = 0}.

This is a kind of halting problem. It seems like it should be simple to analyze—it is just linear algebra—but it has remained open for over 80 years. We have discussed it several times before. This 2012 survey by Joel Ouaknine and James Worrell, plus this new one, give background on this and some related problems.

The Sum-of-Powers Case

Let {f(n)} be

\displaystyle  \lambda_{1}^{n} + \cdots + \lambda_{m}^{n},

where each {\lambda_{i}} is an algebraic integer. Our problem is:

Does there exist a natural number {n} so that {f(n)=0}?

This is a special case of the Skolem problem. It arises when the coefficients {a_1,\dots,a_m} are the evaluations of the elementary symmetric polynomials {s_1,\dots,s_m} at {\lambda_1,\dots,\lambda_m} with alternating signs. For example, with {m = 2} we get

\displaystyle  a_1 = \lambda_1 + \lambda_2, \quad a_2 = -\lambda_1 \lambda_2,

which for {x_0 = 2} and {x_1 = \lambda_1 + \lambda_2} gives

\displaystyle  x_2 = a_1 x_1 + a_2 x_0 = (\lambda_1 + \lambda_2)^2 - 2\lambda_1 \lambda_2 = \lambda_1^2 + \lambda_2^2

and so on. For {m = 3} we have

\displaystyle  a_1 = \lambda_1 + \lambda_2 + \lambda_3, \quad a_2 = -(\lambda_1 \lambda_2 + \lambda_1 \lambda_3 + \lambda_2 \lambda_3),\quad a_3 = \lambda_1\lambda_2\lambda_3.

Then {f(n) = 0} means {\lambda_1^n + \lambda_2^n + \lambda_3^n = 0.} If the {\lambda_i} are nonzero integers then for odd {n \geq 3} this is asking whether {\lambda_1,\lambda_2,-\lambda_3} is a solution to Pierre Fermat’s equation, and we can simply answer “no.” Of course whether {X} is a solution can be easier than asking whether the equation has a solution, but this shows our case contains some of the flavor of Fermat’s Last Theorem.

We can point up some minor progress on this problem. Our methods can handle somewhat more general cases where the sum of {n}-th powers is multiplied by {cn^{d}} for some fixed constants {c} and {d}, but we will stay with the simpler case. Our larger hope is that this case embodies the core of the difficulty in Skolem’s problem, so that solving it might throw open the road to the full solution.


Let’s begin the proof for the case when {n} is a prime {p}. Suppose that {f(p)=0}. Recall

\displaystyle  f(n)=\lambda_{1}^{n} + \cdots + \lambda_{m}^{n},

Clearly we can assume that {f(1) \neq 0}. Note that this is decidable. Put {X = \mathsf{Norm}(f(1))}. The key is to look at the quantity

\displaystyle  Y = \mathsf{Norm}( (\lambda_{1} + \cdots + \lambda_{m})^{p} ),

where {p} is a prime. We employ the following generalization of the binomial theorem:

\displaystyle  (x_1 + x_2 + \cdots + x_m)^n 		 = \sum_{k_1+k_2+\cdots+k_m=n} {n \choose k_1, k_2, \ldots, k_m} 		\prod_{1\le t\le m}x_{t}^{k_{t}},


\displaystyle  {n \choose k_1, k_2, \ldots, k_m} = \frac{n!}{k_1!\, k_2! \cdots k_m!}.

The upshot is that all terms are divisible by a proper factor of {n} except those from the cases {k_j = n}, all other {k_i = 0}. Each gives a factor of {{n \choose n} = 1} and leaves the term {\lambda_i^n}. When {n} is a prime {p} this factor must include {p} itself. Thus we get that {Y = \mathsf{Norm(y)}} for some {y} of the form

\displaystyle  y = \lambda_{1}^{p} + \cdots + \lambda_{m}^{p} + pR,

where {R} is an algebraic integer. But by the supposition {f(p) = 0} this simplifies to {y = pR}, and so {Y} is divisible by {p}. Thus

\displaystyle  Y \equiv 0 \bmod p.

Since {Y = \mathsf{Norm}(f(1)^p) = \mathsf{Norm}(f(1))^p = X^p}, {X} too is divisible by {p}. But {X} is independent of {p.} Hence, {X} acts as a bound on any possible prime {p} such that {f(p) = 0}. Testing the finitely many values of {p} up to {X} thus yields a decision procedure for this restricted case of Skolem’s problem.

Fine-Tuning and Fireworks

Ken chimes in an observation that might be distantly related: The Vandermonde determinant

\displaystyle  \prod_{1 \leq i < j \leq m} (x_j - x_i)

is the “smallest” alternating polynomial in {m} variables. Together with the symmetric polynomials it generates all alternating polynomials. When the {x_i} are the {m}-th roots of unity it gives the determinant of the Fourier matrix {F_m} up to sign. This determinant has absolute value

\displaystyle  D = m^{m/2} = 2^{\frac{1}{2}m\log_2 m}.

It is also the product of the lengths of the {{m \choose 2}} chords formed by {m} equally-spaced points on the unit circle. The observation is that this 2-to-the-nearly-linear quantity is extraordinarily finely tuned.

To see how, let’s estimate the product of the chords in what is caricatured as the style of physicists: The length of an average chord is {\sqrt{2}}. So we can estimate the size of the product as

\displaystyle  (\sqrt{2})^{m \choose 2} \approx 2^{\frac{1}{4}m^2} = 2^{\Theta(m^2)}.

This is off by an order of magnitude in the exponent—not even close. We can be a little smarter and use the average length of a chord instead, integrating {\frac{1}{\pi}\sqrt{2 - 2\cos(x)}} from {0} to {\pi} to get {4/\pi}. This is still a number greater than {1} and plugs in to yield {2^{\Theta(m^2)}} anyway.

Such a calculation looks silly but isn’t. If we enlarge the circle by a factor of {(1 + \delta)} then every term in the product is multiplied by that factor and it dominates:

\displaystyle  D_{1+\delta} \approx (1 + \delta)^{m \choose 2} = 2^{\Theta(m^2)}.

If we shrink the circle by {(1 - \delta)} the opposite happens: we divide by {2^{\Theta(m^2)}} which crushes everything to make the analogous quantity {D_{1-\delta}} virtually zero. Furthermore this “big crush” happens under more-plausible slight perturbations such as forbidding any of the {m} points from occupying the arc between {0} and {\epsilon} radians, which prevents the equal-spacing maximization when {m > 2\pi/\epsilon}. We covered this at length in 2011.

The underlying reality is that when you take the logarithm of the product of chords, the terms of all growth orders between {m^2} and {m^{1 + \delta}} all magically cancel. There are many more chords of length {> 1} than chords of length {< 1}, but the latter can be unboundedly short in a way that perfectly balances the multitudes of longer chords. The actual value of {D} seems tiny amidst these perturbative possibilities.

This gigantic cancellation reminds Dick and me of the present argument over the tiny observed magnitude of the cosmological constant {\Lambda}. Estimation via quantum field theory prescribes a value 120 orders of magnitude higher—one that would instantly cause our universe to explode in fireworks—unless vast numbers of terms exactly cancel. Quoting Wikipedia:

This discrepancy has been called “the worst theoretical prediction in the history of physics” … the cosmological constant problem [is] the worst problem of fine-tuning in physics: there is no known natural way to derive the tiny [value] from particle physics.

“Fine-tuning” of constants without explanation is anathema to science, and many scientists have signed onto theories that there is a multiverse with 500 or more orders of magnitude of universes, enough to generate some with the tiny {\Lambda} needed to allow life as we know it. However, any fine-tuning discovered in mathematics cannot be anathema. Perhaps the universe picks up the Fourier fine balancing act in ways we do not yet understand. More prosaically, the fine balance in quantities similar to {f(n) = \lambda_1^n + \cdots \lambda_m^n} above could be just what makes Skolem’s problem hard.

Open Problems

I believe that the general case of the Skolem can be handled, not just the simple case. But the problem of handling more than just primes seems hard. I believe that this method can be used to handle more cases than just primes. Ken and I are working on this. Meanwhile, we wish everyone Stateside a happy Fourth of July, whether or not that includes fireworks.

[added link to new survey in intro]

4 Comments leave one →
  1. July 4, 2015 4:38 pm

    Thanks for these fascinating new fireworks around the Skolem Problem! By the way, a much more recent survey on the topic and closely related problems can be found at .


  1. The Long Reach of Reachability | Gödel's Lost Letter and P=NP
  2. Open Problems That Might Be Easy | Gödel's Lost Letter and P=NP

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s