
Solving Diophantine Equations the Easy Way

June 29, 2009

Solving Diophantine equations with mixed rationals and reals


Derrick Lehmer was an early computational number theorist and, during his long career, both proved wonderful theorems and made amazing special purpose computers. He used his machines, for one, to factor the largest numbers possible at the time.

Today I planned to talk about an approach to integer factorization, a problem that Lehmer worked on his whole life. The method is a refinement of an idea I discussed in an earlier post. However, an issue came up while writing that post that I want to talk about first; I will post on the full factoring idea shortly. The question concerns solving mixed Diophantine equations. More on these later.

I was honored to meet Lehmer in 1979, when I was on the faculty at Berkeley. One day Manny Blum and I went up to Lehmer’s office, where we had a wonderful chat on a wide range of topics.

One topic we discussed at length was the production of random numbers. Lehmer had thought quite deeply about the issues involved in the creation of “truly” random numbers. The bottom line of the conversation was the claim–well argued by Lehmer–that it is extremely hard to generate truly random numbers, even with physical devices.

It was a long discussion, but here is one of his insights, which I still explain in my lectures to students, whenever we discuss randomized algorithms. Suppose we try to use the randomness that appears to be inherent in radioactive decay. We use a Geiger counter to count the number of beta and gamma particles emitted by a low-grade radioactive source. Every now and then we stop the counter and read off the low order bit. The claim, Manny and I believed, was that this bit would be random.

Derrick smiled and explained that our idea was flawed in many ways. He pointed out that it is easy to make a counter, by design or by error, that tends to stop with an even count rather than an odd one. This is especially true if the counter is “spinning” at a high rate. Obviously, such a counter would give very biased bits. Our answer was to have the random process flip the counter at a much slower rate; then, we argued, the counter would likely be unchanging when we stopped it. However, Derrick said that the rate of getting random bits would then be very slow. And so on.

Before we talk about mixed Diophantine equations, I thought you might like to know about Lehmer’s machines. One of his most interesting special purpose machines was made of bicycle parts, an electric motor, a light, and a photodetector; all to solve Diophantine equations.

The machine had several wheels that rotated at different speeds, and holes had been cut into the wheels, allowing light to shine through at special positions. As the wheels rotated, if all the holes were aligned, then the light got through to the photodetector, and this stopped the machine. The wheels would then be turned back slowly, by hand, to find the exact point where they were all aligned.

Lehmer used his machine to solve certain sieve problems, which yielded solutions to Diophantine equations. The advantage of the machine was that the electric motor could turn the wheels at a high speed, which allowed a large number of combinations to be tried per second.
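The sieve idea behind the machine is easy to sketch in software; the wheel sizes and hole positions below are invented for illustration, not taken from any problem Lehmer actually ran:

```python
# A software sketch of Lehmer's bicycle-wheel sieve. Each wheel has m
# positions, with "holes" at the residues in its set; light passes through
# at count t exactly when t mod m lands on a hole for every wheel.
# The moduli and hole positions below are made up for illustration.
wheels = {
    3: {2},        # t = 2 (mod 3)
    5: {1, 4},     # t = 1 or 4 (mod 5)
    7: {3},        # t = 3 (mod 7)
}

def first_alignment(wheels, limit=10**6):
    for t in range(1, limit):
        if all(t % m in holes for m, holes in wheels.items()):
            return t   # the light gets through: the machine stops here
    return None

print(first_alignment(wheels))   # -> 59
```

The electric motor's advantage was exactly that this inner loop ran at mechanical speed, trying many counts per second.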

The device was relatively cheap, and Lehmer could run it day and night: after all, it was his own machine. With this device he could vastly outperform the computers of the day, both in speed and definitely in price-performance. There was a time when it was the state of the art in factoring.

Mixed Diophantine Equations

I have an approach to factoring based on finding certain special matrices. The properties that these matrices need to satisfy can be stated as Diophantine equations. Today I will discuss a relaxation that may make the required equations easier to solve. Also, the questions that arise seem to have independent interest.

In general a Diophantine equation is

\displaystyle  f(x_{1},\dots,x_{n}) = 0

where {x_{i}} are restricted to be integers (or sometimes rationals) and {f} is a polynomial over the integers. Finding a solution over the integers is well known to be undecidable, and even special cases can be extremely difficult to resolve. Finding a solution over the rationals is a long-standing open problem, by the way.

We are interested in the following simpler question: Suppose that

\displaystyle  f(x_{1},\dots,x_{n}) = 0

has a solution over the complex numbers. When does that imply that there is another solution {x'_{1},\dots,x'_{n}} so that {x'_{1},\dots,x'_{k}} are rational? Of course, the interesting case is when {k<n}.

The question is: does such a solution always exist? If it does, then that has potential to greatly simplify the approach I have to factoring.

There is a simple counterexample: Consider the polynomial {f(x,y)=x^{2}-2}. Then, it clearly has solutions, but no solution is possible with {x} rational.
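The counterexample can be checked in exact rational arithmetic; the sketch below searches small rationals (the bound 200 is an arbitrary choice, and of course the classical parity argument rules out all rationals, not just small ones):

```python
from fractions import Fraction

# Exact-arithmetic check of the counterexample f(x, y) = x^2 - 2: no rational
# x = p/q with small numerator and denominator satisfies x^2 = 2. The search
# bound is arbitrary; the parity argument settles the general case.
best = min(abs(Fraction(p, q) ** 2 - 2)
           for p in range(1, 200) for q in range(1, 200))
print(best)   # the smallest miss is still strictly positive
```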

One Equation Case

The problem with the simple counterexample is that the polynomial {f} does not depend on {y}. A natural conjecture is the following: suppose {f(x,y)} depends on {y}; then, if it has a solution, it has one with {x} rational.

Let’s examine this, by writing the equation in the following form:

\displaystyle  f(x,y) = a(x)y^{2} + b(x)y + c(x) = 0

for the case when {y} has degree {2}. Assume that {a(x)} is not identically {0}, and select an integer {x} so that {a(x)} is not zero. Then, there is a {y \in \mathbb{C}} so that {f(x,y)=0}.
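To make the argument concrete, here is a sketch with the illustrative choices {a(x)=x}, {b(x)=x+1}, {c(x)=3} (my choices, not from any particular instance): fix an integer {x} with {a(x) \neq 0} and solve the resulting quadratic in {y} over {\mathbb{C}}.

```python
import cmath

# Fix an integer x with a(x) != 0, then solve the quadratic in y over the
# complex numbers. The polynomials a, b, c are illustrative choices.
def a(x): return x
def b(x): return x + 1
def c(x): return 3

x = 1                                  # an integer with a(x) = 1 != 0
disc = cmath.sqrt(b(x) ** 2 - 4 * a(x) * c(x))
roots = [(-b(x) + disc) / (2 * a(x)), (-b(x) - disc) / (2 * a(x))]
for y in roots:
    # each root satisfies f(x, y) = 0, up to floating-point error
    assert abs(a(x) * y * y + b(x) * y + c(x)) < 1e-9
```

Here the roots are complex, which is exactly why the conclusion only promises {y \in \mathbb{C}}.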

Theorem: Suppose that {f(x_{1},\dots,x_{n},y)} depends on the variable {y}. Then, there are integers {x_{1},\dots,x_{n}} and a {y \in \mathbb{C}} so that

\displaystyle  f(x_{1},\dots,x_{n},y) = 0.

The proof is the same as the example we just went over. Note, this result is essentially just the Fundamental Theorem of Algebra.

One Equation and One Inequation Case

A much more interesting problem arises with multiple equations.

As usual {|a|} is the absolute value of {a}. Also for a vector {x}, let {||x||} denote the value

\displaystyle  |x_{1}| + |x_{2}| + \dots + |x_{n}|.

Note, {||(a,b)-(c,d)|| = ||a-c|| + ||b-d||}.

The goal of this section is to prove the following theorem: (This is a joint result with Subrahmanyam Kalyanasundaram and Farbod Shokrieh.)

Theorem: Suppose that {f(x_{1},\dots,x_{n},y)} and {g(x_{1},\dots,x_{n},y)} are polynomials, and that {f(x_{1},\dots,x_{n},y)} depends on the variable {y}. Also assume that there are reals {w_{1},\dots,w_{n},z} so that {f(w,z)=0} and {g(w,z) \neq 0}. Then, there are rationals {x_{1},\dots,x_{n}} and a {y \in \mathbb{C}} so that

\displaystyle  f(x,y) = 0


\displaystyle  g(x,y) \neq 0.

We first need a classic lemma:

Lemma: Let the polynomial {f(y)} equal

\displaystyle  y^{k} + a_{k-1}y^{k-1} + \dots + a_{0}

where {k \ge 1}, and the polynomial {g(y)} equal

\displaystyle  y^{k} + b_{k-1}y^{k-1} + \dots + b_{0}.

Suppose that {f(z)=0}. Then, for any {\delta>0}, there is an {\epsilon>0} so that if { ||a-b|| < \epsilon}, then there exists a {y \in \mathbb{C}} so that {|z-y| < \delta} and {g(y)=0}.
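A quick numerical illustration of the lemma for {k=2}, using {f(y)=y^{2}-2} with root {z=\sqrt{2}} and an arbitrary small perturbation of the coefficients:

```python
import cmath, math

# Perturbing the coefficients of f(y) = y^2 + 0*y - 2 slightly moves its
# root z = sqrt(2) only slightly. The perturbation size is arbitrary.
z = math.sqrt(2)                     # f(z) = 0

def nearest_root(b1, b0, target):
    # roots of g(y) = y^2 + b1*y + b0, via the quadratic formula
    d = cmath.sqrt(b1 * b1 - 4 * b0)
    return min(((-b1 + d) / 2, (-b1 - d) / 2), key=lambda y: abs(y - target))

eps = 1e-6                           # ||a - b|| = eps for this perturbation
y = nearest_root(eps / 2, -2 + eps / 2, z)
assert abs(z - y) < 1e-5             # a root of g lies within delta of z
```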

Lemma: Suppose that the polynomial {f(x,y)} is equal to

\displaystyle  a_{k}(x)y^{k} + a_{k-1}(x)y^{k-1} + \dots + a_{0}(x)

where {k \ge 1} and each {a_{i}(x)} is a polynomial, and {x=(x_{1},\dots,x_{n})}. Then, for any {\delta>0}, there is an {\epsilon>0} so that if {a_{k}(w) \neq 0} and {f(w,z)=0} and {||w-x||<\epsilon}, then there exists a {y \in \mathbb{C}} so that {|z-y| < \delta} and {f(x,y)=0}.

Proof: Since {a_{k}(w) \neq 0}, and hence {a_{k}(x) \neq 0} for all {x} near {w}, this follows from the previous lemma by dividing through by the leading coefficient. \Box

Now we will prove the main theorem:

Proof: First, the function {g(x,y)} is such that for any {\delta>0} there is an {\epsilon>0} so that

\displaystyle  	 ||(w,z)-(x,y)|| < \epsilon \rightarrow |g(w,z)-g(x,y)| < \delta. \ \ \ \ \ (1)

This is, of course, just continuity of the polynomial function {g} at the point {(w,z)}.

Second, since {f} is a polynomial that depends on {y}, by the lemma, for any {\delta>0}, there is an {\epsilon>0} so that

\displaystyle  	 ||w-x|| < \epsilon \rightarrow \exists y, |z-y| < \delta, f(x,y)=0. \ \ \ \ \ (2)

Let { |g(w,z)| = b >0}. Select {\epsilon_{1}>0} small enough so that (1) holds for {\delta=b/2}. Now select {\epsilon_{2}>0} so that it is smaller than {\epsilon_{1}}, and (2) holds for {\delta=\epsilon_{1}/2}.

Since the rationals are dense, there is an {x} with all coordinates rational so that {||w-x|| < \epsilon_{2}/2}. By (2) there is a {y} so that {|z-y| < \epsilon_{1}/2} and also {f(x,y)=0}. Note that {||(w,z)-(x,y)|| < \epsilon_{2}/2 + \epsilon_{1}/2 < \epsilon_{1}}. Thus, by (1)

\displaystyle  |g(w,z)-g(x,y)| < b/2.

This implies that {g(x,y) \neq 0}. \Box
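A concrete instance of the theorem, with the illustrative choices {f(x,y)=y^{2}-x}, {g(x,y)=y-1}, and the real solution {(w,z)=(\pi,\sqrt{\pi})}: snapping {w} to a nearby rational and re-solving {f(x,y)=0} keeps {g} nonzero.

```python
import math
from fractions import Fraction

# f(x, y) = y^2 - x and g(x, y) = y - 1; the real solution (w, z) has
# irrational w = pi. These are illustrative choices, not from the post.
w, z = math.pi, math.sqrt(math.pi)
assert abs(z * z - w) < 1e-12 and abs(z - 1) > 0.5   # f(w,z)=0, g(w,z)!=0

x = Fraction(math.pi).limit_denominator(1000)  # rational x with |w - x| tiny
y = math.sqrt(x)                               # a root of f(x, .), near z
assert abs(y * y - float(x)) < 1e-12           # f(x, y) = 0 (numerically)
assert abs(y - 1) > 0.5                        # g(x, y) stays nonzero
```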

General Case

What we really would like is a generalization that can handle several equations and one inequation. Note that one inequation is as general as many: the condition

\displaystyle  g_{1}(z) \neq 0 \text{ and } \dots \text{ and } g_{k}(z) \neq 0

is the same as

\displaystyle  g_{1}(z)g_{2}(z) \cdots g_{k}(z) \neq 0.

The theorem we need to help solve the factoring problem would look like our theorem, but would allow several equations and one inequation.

Suppose that {x =(x_{1},\dots,x_{n})} and {y=(y_{1},\dots,y_{m})}. Let {f_{i}(x,y)} for {i=1,\dots,k} and {g(x,y)} be polynomials. Also assume that for some real {w,z},

\displaystyle  f_{1}(w,z)=0,\dots,f_{k}(w,z)=0 \text{ and } g(w,z) \neq 0.

Then, there are rationals {x} and a {y \in \mathbb{C}^{m}} so that

\displaystyle  f_{1}(x,y) = 0,\dots,f_{k}(x,y)=0


\displaystyle  g(x,y) \neq 0

provided {\Gamma} is true.

The key question is: what is the property {\Gamma}? In the case of {k=1}, {\Gamma} was that the polynomial depends on the single variable {y}. I imagine that now the right property is some stronger notion of dependence. But what is it?

Open Problems

The main open problem is to find the right statement of the general theorem and prove it. That is, what should {\Gamma} be?

Are there some interesting applications of this ability to “almost” solve Diophantine equations in the rationals? What about a similar result that holds over the complex numbers?

One potential application that comes to mind is the area of solutions to economic or game-theoretic problems. To apply to these problems, however, the {y} variables must also be restricted to be real. This does not seem possible with the technology we have so far. Thus, an important additional direction is to try to prove a theorem, even with one equation and one inequation, that allows {y} to stay real. I believe that such a theorem could be quite interesting.

Comments
  1. Ryan Williams permalink
    June 29, 2009 1:56 pm

    The story about Lehmer’s machine is fascinating. Did he ever publish a paper? Sounds like he could have published something about the machine’s output…

  2. June 29, 2009 6:10 pm

    In the context of Lehmer’s photo-mechanical device, Shamir’s TWINKLE and Shamir and Tromer’s TWIRL are worth mentioning.

  3. July 2, 2009 8:35 pm

    Regarding your open problems, I suggest you talk to an algebraic geometer. You basically have an algebraic set defined by f_1,\dots,f_k and an open subset Y of it defined by g\neq 0. You want to know under which conditions the projection of Y to the first n-1 coordinates has a rational point. In such a generality this is probably hopeless, but probably when k is much smaller than n and some nondegeneracy conditions holding, you might have a chance…

    In the case of reals, the situation is even harder, as you can encode any number of equations by just one equation (take the sum of squares), so it must be easy to construct counterexamples.
    (take f=0 having finitely many real solutions, all of them irrational)

  4. Jeffrey Shallit permalink
    July 3, 2009 2:32 am

    I’d like to point out that Lehmer’s mechanical methods of factoring were anticipated by the Frenchman Carissan, who not only built a similar machine before Lehmer, but also used it successfully to factor numbers. Strangely enough his work is hardly known.

    See, for example,

  5. February 28, 2014 9:59 am

    I have encountered two expressions in an equation which I am assured can only be solved via linear Diophantine equation methods. (Simultaneous equation methods do not work.) The expressions are:
    (3x + 43) – (5y + 37) = 0
    Can anyone show me how this is done step by step please?

    (I happen to know that each expression = 67 and x = 8 and y = 6)

    Harry at
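The equation in the comment above, (3x + 43) − (5y + 37) = 0, reduces to the linear Diophantine equation 3x − 5y = −6; a short sketch of the standard solution (the modular-inverse step does the work of the extended Euclidean algorithm):

```python
from math import gcd

# Step-by-step solution of (3x + 43) - (5y + 37) = 0, i.e. 3x - 5y = -6.
a, b, c = 3, 5, -6                   # solve a*x - b*y = c over the integers
assert c % gcd(a, b) == 0            # solvable: gcd(3, 5) = 1 divides -6

inv = pow(a, -1, b)                  # modular inverse: 3^{-1} mod 5 = 2
x0 = (inv * c) % b                   # from a*x = c (mod b): x0 = 3
y0 = (a * x0 - c) // b               # y0 = (3*3 + 6) / 5 = 3
assert a * x0 - b * y0 == c          # (3, 3) is a particular solution

# all solutions: x = x0 + 5k, y = y0 + 3k; k = 1 gives the commenter's (8, 6)
x, y = x0 + b, y0 + a
assert (3 * x + 43) - (5 * y + 37) == 0 and (x, y) == (8, 6)
```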


Trackback: Rough Problems | Gödel's Lost Letter and P=NP


