Sometimes definitions are more important than theorems.

Atle Selberg was one of the first two postwar Fields Medalists, awarded in 1950. This partly recognized his work on an “elementary” proof of the Prime Number Theorem. There was some bitter contention with Paul Erdős over priority, but both used a new asymptotic formula that Selberg had found. The formula threw in an extra ${\log x}$ factor on one term, and things “magically” came out nice. How did that factor come to be part of the definition?

Today I want to talk about the role that definitions play in mathematics, and how proper definitions can unravel formulas, if not controversies.

Growing up in Norway, Selberg learned of the legend of Srinivasa Ramanujan in 1934, as a teenager. As he related in his speech for the 1988 Ramanujan Centenary, the Indian genius was described by a Norwegian word meaning “strange” as much as “remarkable,” and looking into Ramanujan’s formulas “made a deep impression.” Ramanujan sometimes said that the Hindu goddess Namagiri gave him the formulas, which implies to Ken and me that they were not merely computed as answers but defined as trials and starting points.

The controversy about who-proved-what-when is not what I wish to talk about today; we have touched on it before here. For two papers that go into some detail about it, see Dorian Goldfeld’s paper titled “The Elementary Proof Of The Prime Number Theorem: An Historical Perspective,” or Joel Spencer and Ronald Graham’s paper with a simpler title, “The Elementary Proof of the Prime Number Theorem.”

## The Players

Besides Erdős and Selberg, the key players are certain arithmetical functions. A function is arithmetical if it maps

$\displaystyle 1,2,3, \dots$

to the complex numbers. Note that we do not insist that the function be defined at zero, and we do allow the values of the function to be complex numbers. Many of the fundamental questions of number theory can be stated most directly as questions about the behavior of such functions. So it is not surprising that we need to define several of them.

${\bullet }$ Möbius Function The Möbius function is denoted by ${\mu}$ and is defined as follows: ${\mu(1)=1}$, and if ${n=p_{1}^{a_{1}} \cdots p_{k}^{a_{k}}}$, then ${\mu(n)=(-1)^{k}}$ provided

$\displaystyle a_{1} = a_{2} = \cdots = a_{k} = 1.$

Otherwise, ${\mu(n)=0}$.

${\bullet }$ Von Mangoldt Function The von Mangoldt function ${\Lambda}$ is defined as follows: if ${n = p^{m}}$ for some prime ${p}$ and some ${m \ge 1}$, then ${\Lambda(n) = \log p}$; otherwise, it is ${0}$.

${\bullet }$ Identity Function The Identity function is simple: ${I(1) = 1}$, and otherwise, it is ${0}$.

${\bullet }$ Unit Function The Unit function ${u}$ is simply the constant ${1}$: ${u(n) = 1}$ for all ${n}$.
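These definitions are easy to make concrete. Here is a minimal Python sketch that follows the definitions above verbatim; the helper `factorize` and all function names are our own choices, not standard library calls:

```python
import math

def factorize(n):
    """Return the prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    """Moebius mu: (-1)^k if n is a product of k distinct primes, else 0."""
    factors = factorize(n)
    if any(e > 1 for e in factors.values()):
        return 0
    return (-1) ** len(factors)  # n = 1 has zero prime factors, giving (-1)^0 = 1

def mangoldt(n):
    """Von Mangoldt Lambda: log p if n = p^m for a prime p and m >= 1, else 0."""
    factors = factorize(n)
    if len(factors) == 1:
        (p,) = factors
        return math.log(p)
    return 0.0

def I(n):
    """The Identity function: 1 at n = 1, and 0 otherwise."""
    return 1 if n == 1 else 0

def u(n):
    """The Unit function: constantly 1."""
    return 1
```

For example, `mobius(30)` is ${-1}$ since ${30 = 2\cdot 3\cdot 5}$ has three distinct prime factors, while `mobius(12)` is ${0}$ since ${4 \mid 12}$.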

## The Operations

The key to understanding these and other functions is that they form a very rich algebraic structure, one that is not obvious at all. At the heart of it is a product defined on them, called the Dirichlet Product. If ${f}$ and ${g}$ are arithmetic functions, then ${h = f*g }$ is defined by

$\displaystyle h(n) = \sum_{d | n} f(d)g(\frac{n}{d}).$

Recall ${d | n}$ means that ${d}$ is a divisor of ${n}$.

This product is easily seen to be both commutative and associative. The latter follows from the observation that ${f*(g*h)}$ is equal to

$\displaystyle \sum_{abc=n} f(a)g(b)h(c).$
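Both laws can be spot-checked in a few lines of Python. A sketch under the definitions above; the sample functions are arbitrary choices of ours:

```python
def dirichlet(f, g):
    """Dirichlet product h = f * g, with h(n) = sum over d | n of f(d) g(n/d)."""
    def h(n):
        return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)
    return h

# Three arbitrary sample arithmetic functions.
f = lambda n: n          # f(n) = n
g = lambda n: n * n      # g(n) = n^2
k = lambda n: 1          # the unit function u

for n in range(1, 50):
    # Commutativity: f*g = g*f.
    assert dirichlet(f, g)(n) == dirichlet(g, f)(n)
    # Associativity: f*(g*k) = (f*g)*k; both equal the sum of f(a)g(b)k(c) over abc = n.
    assert dirichlet(f, dirichlet(g, k))(n) == dirichlet(dirichlet(f, g), k)(n)
```

Note that `dirichlet(u, u)(n)` counts the divisors of ${n}$, a first hint of how much number theory the product encodes.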

Thus the arithmetic functions with ${f(1) \neq 0}$ form a commutative (that is, abelian) group under the Dirichlet product, with ${I}$ as the identity element. But there is more: they also have a natural unary operation that acts as a derivative. This comes perhaps as a surprise, since there are no limits of any kind lurking around. Nor are the functions polynomials, as we pointed out here.

Recall that one can define a derivative operation on polynomials. Arithmetic functions are not restricted in any way, except for their domain and range—so how can we define a derivative? Here is how: if ${f}$ is an arithmetic function, then define ${f'}$ as:

$\displaystyle f'(n) = f(n)\log n, \text { for } n \ge 1.$

Note that ${I' = 0}$ and ${u'(n) = \log n}$.

This derivative satisfies many of the usual properties that we would want it to, which is why we are allowed under math naming rules—just kidding—to call it a derivative.

Lemma: If ${f}$ and ${g}$ are arithmetical functions then:

1. ${(f+g)' = f' + g'}$.
2. ${(f*g)' = f'*g + f*g'}$.
3. ${(f^{-1})' = -f' * (f * f)^{-1}}$, provided ${f(1) \neq 0}$.
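The product rule in part 2 is easy to confirm numerically. A self-contained Python sketch; the sample functions are arbitrary choices of ours:

```python
import math

def dirichlet(f, g):
    """Dirichlet product h(n) = sum over d | n of f(d) g(n/d)."""
    def h(n):
        return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)
    return h

def deriv(f):
    """The 'derivative' f'(n) = f(n) log n."""
    return lambda n: f(n) * math.log(n)

# Two arbitrary sample functions.
f = lambda n: n
g = lambda n: 1.0 / n

for n in range(1, 100):
    lhs = deriv(dirichlet(f, g))(n)                              # (f*g)'
    rhs = dirichlet(deriv(f), g)(n) + dirichlet(f, deriv(g))(n)  # f'*g + f*g'
    assert math.isclose(lhs, rhs, abs_tol=1e-9)
```

The rule even holds exactly, not just numerically: inside the divisor sum, ${\log d + \log\frac{n}{d} = \log n}$ for every divisor ${d}$.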

## The Basic Formulas

The above definitions make it possible to write in a very elegant manner many famous results. For example,

$\displaystyle \sum_{d | n} \mu(d)$

is ${1}$ for ${n=1}$ and zero otherwise. We can write this as ${\mu * u = I}$. Thus,

$\displaystyle u = \mu^{-1} \text{ and } \mu = u^{-1}.$

There is another basic identity which is:

$\displaystyle \sum_{d | n} \Lambda(d) = \log n.$

This now becomes

$\displaystyle \Lambda * u = u'.$

It is pretty neat how this succinct notation allows one to express relationships.

## The Identity

The Selberg identity. For ${n \ge 1}$,

$\displaystyle \Lambda(n)\log(n) + \sum_{d | n}\Lambda(d)\Lambda(\frac{n}{d}) = \sum_{d | n}\mu(d)\log^{2}\frac{n}{d}.$

The above is not exactly the famous identity, but it can be used to get the real one, which is:

$\displaystyle \theta(x)\log(x) + \sum_{p \le x} \log(p)\theta(\frac{x}{p}) = 2x\log(x) +O(x),$

where

$\displaystyle \theta(x) = \sum_{p \le x} \log(p)$

for primes ${p}$. I am just following Tom Apostol’s famous book, Introduction to Analytic Number Theory: check it out for the details.
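One can watch the asymptotic identity take hold numerically. A sketch with a simple sieve (all names are ours); since the error term is ${O(x)}$, the ratio of the left side to the main term ${2x\log x}$ should differ from ${1}$ by something on the order of ${1/\log x}$:

```python
import bisect
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p in range(2, limit + 1) if sieve[p]]

x = 100_000
primes = primes_up_to(x)

# Prefix sums of log p, so theta(y) is a binary search plus one lookup.
prefix = [0.0]
for p in primes:
    prefix.append(prefix[-1] + math.log(p))

def theta(y):
    """Chebyshev theta: sum of log p over primes p <= y."""
    return prefix[bisect.bisect_right(primes, y)]

lhs = theta(x) * math.log(x) + sum(math.log(p) * theta(x / p) for p in primes)
ratio = lhs / (2 * x * math.log(x))
```

At ${x = 10^5}$ the ratio comes out close to ${1}$, though the convergence is slow, as the ${1/\log x}$ relative error suggests.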

## The Easy Proof

Start from the identity ${\Lambda * u = u'}$ and differentiate both sides, using the product rule from the lemma:

$\displaystyle \Lambda' * u + \Lambda * u' = u'',$

or since ${u' = \Lambda * u}$,

$\displaystyle \Lambda' * u + \Lambda * (\Lambda * u) = u''.$

Now take the Dirichlet product of both sides with ${ \mu = u^{-1}}$ to obtain,

$\displaystyle \Lambda' + \Lambda * \Lambda = u'' * \mu.$

But this is the famous identity: since ${u''(n) = \log^{2} n}$, evaluating both sides at ${n}$ gives ${(u'' * \mu)(n) = \sum_{d|n}\mu(d)\log^{2}\frac{n}{d}}$, which is exactly the right-hand side above.
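As a sanity check on the algebra, the identity can be confirmed numerically for small ${n}$; a self-contained Python sketch (the helper names are ours):

```python
import math

def factorize(n):
    """Prime factorization of n as {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    factors = factorize(n)
    return 0 if any(e > 1 for e in factors.values()) else (-1) ** len(factors)

def mangoldt(n):
    factors = factorize(n)
    return math.log(next(iter(factors))) if len(factors) == 1 else 0.0

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def check_selberg(n):
    """Lambda(n) log n + sum Lambda(d) Lambda(n/d) = sum mu(d) log^2(n/d)."""
    lhs = mangoldt(n) * math.log(n) + sum(
        mangoldt(d) * mangoldt(n // d) for d in divisors(n))
    rhs = sum(mobius(d) * math.log(n / d) ** 2 for d in divisors(n))
    return math.isclose(lhs, rhs, abs_tol=1e-9)

assert all(check_selberg(n) for n in range(1, 300))
```

For instance at ${n = 4}$ both sides equal ${3\log^{2} 2}$.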

## Open Problems

Are we missing some basic definitions in complexity theory that would shed new light on some of our open problems? Note a general mathematical idea that we do not seem to exploit very often: given a collection of objects, can one define a natural algebraic structure on the objects? Often there are hidden such structures, and finding them may unlock the key to great insights.

August 17, 2012 10:16 am

I think it would be useful to have a more precise definition of the complexity of proofs. I’m always struck that algorithm complexity is a carefully defined concept while proof complexity is a vaguer notion. I’m sure theorems relating them could be proved – about, for instance, how hard it is to prove that an algorithm has some given complexity.

2. August 17, 2012 11:40 am

Where do you think is the best hope for looking for definition gaps?

August 17, 2012 12:30 pm

Serge, I am not sure what you mean. There is a precise notion of propositional proof complexity formalized by Cook and Reckhow, and related work on hierarchies of bounded arithmetic that gives uniform notions of the complexity of proofs.

However, I suspect that you mean something else, namely a sort of “inverse mathematics”: what concepts are necessary to prove a theorem (or to prove it constructively)? In this respect the best examples I have seen are Papadimitriou’s comparisons of the second-order principles that are used to define complexity classes like PPA, PPAD, PPP, PLS, etc.

August 17, 2012 4:10 pm

OK Paul, thanks for the information: I’m going to look this up. 🙂

Indeed, I suspect that some kind of “reverse math” is needed in complexity theory. As I’ve commented elsewhere on this blog, proofs are the only tools we have at our disposal to study algorithms. In view of the Curry-Howard proof-program correspondence, wouldn’t it be fruitful to study the complexity of proofs in relation to the complexity of algorithms?

Look at the P vs NP problem: of course it is well-defined and possesses an answer in terms of natural numbers. But we can’t know this answer because we’re using algorithms to study algorithms! We’re using brain processes to study computer processes! This is a vicious circle… so let’s stop thinking absolute and start thinking relative.

August 17, 2012 5:41 pm

In fact, what I’m looking for is something like this:

1) If an algorithm solves an NP-complete problem, then the complexity of the proof that it does solve the problem is inversely proportional to the complexity of the proof of its running-time.

2) If its running-time is polynomial, then whenever one of these two proofs has finite complexity the other one has infinite complexity.

3) If both proofs have finite complexity, then that algorithm must have some exponential term in its running-time.

I’ll leave it to more competent mathematicians than me to check if all this makes sense. 🙂

August 17, 2012 5:57 pm

And in view of the above-mentioned vicious circle, such a relation could be viewed either as an axiom of math… or as a principle of the physics of processes. I’m still undecided about this point. 🙂

August 17, 2012 3:29 pm

It is arguable that the theorems and toolkits of proof theory determined ex post facto the standard definitions of proof theory. That is, our modern definitions of “proof” are largely tuned so as to accommodate the “algebrized” proof methods of Church, Turing, Gödel, etc.

Similarly, a recurring theme here on Gödel’s Lost Letter is the prospect of advancing complexity theory by tuning definitions so as to facilitate proofs.

Can “quantum information theory” be satisfactorily defined on a class of Kählerian Segre varieties? Can “P versus NP” be decided more easily on oracle-independent subsets of P and NP?

In both cases a good answer is “Show us some theorems, and then we’ll decide whether adjustments in definitions are warranted.”

Generally I prefer *not* to attempt to foresee the future, but here my heart and intuition both speak plainly: adjustments in the standard definitions of “quantum information theory” and “complexity theory” are equipping the coming generation of researchers with good prospects of major advances in proof-and-simulation technologies.

Prediction: Simulating quantum trajectories on flat state-spaces and proving complexity theorems relating to oracle-dependent classes will be viewed by future generations of researchers as the “quaint” way that research *formerly* was conducted.

August 19, 2012 5:03 pm

There was occasion to amplify the above ideas more fully on Scott Aaronson’s Shtetl Optimized (here and here), and this provides an occasion to mention that Shtetl Optimized is enjoying a return to form that is much appreciated by every quantum researcher.

5. August 18, 2012 11:15 pm

Are you asking whether a complexity *class* would have a hidden algebraic structure? Or something more concrete, like a program or a circuit?

August 19, 2012 2:21 pm

@Serge: Like Paul, I also cannot see why proof complexity is a vaguer concept than computational complexity. Proofs are algorithms as well (described by rules of inference). But you are right: questions like “how hard is it to prove that an algorithm has some given complexity” are apparently not studied in proof complexity, nor in computational complexity. The focus there is on the complexity of *problems* or *claims* themselves – not on the complexity of *particular* algorithms, or on the complexity of proofs of their complexity.

August 19, 2012 3:20 pm

Thank you Stasys for explaining my point this clearly. What I had in mind is the transfer of complexity between algorithms and their proofs – after all, it’s a well-known fact that more efficient algorithms have more complex proofs. I think it’s interesting to try to quantify this phenomenon.

August 19, 2012 5:34 pm

… and I’m sorry to have called “vague” by ignorance something that was in fact very clear. I thank both Paul and you for making me learn something new. 🙂

7. August 19, 2012 5:57 pm

“Are we missing some basic definitions in complexity theory that would shed new light on some of our open problems?”

There is a proposed answer in the papers at:

Here are some excerpts from them:

(i) The profoundest questions in Complexity Theory (P vs NP vs P/poly vs RP) were solved by those plain ingenious new definitions (generalizations) stated in these papers.

(ii) Some experts, as in [15], are asserting: “– The XG-SAT is not in NP (in the author’s terms): The polynomial nk CANNOT depend on the input.” However, this assertion is false, being true only for the old traditional definition of polynomial-time DTM, since in the new definition (Def. 3.7), the polynomial CAN definitely depend on the input – as long as it does not depend on the input’s length. Think: This is just a matter of Math object definition, not of mathematical error or correctness, at all. We are not obligated to follow obsolete definitions only because they are established, unless the Science is finished (or dead). “– The essence of Mathematics is Freedom.” (Georg Cantor)

(iii) Very important: Verify that these new definitions of the classes P and NP are simply good generalizations of the old traditional ones: Any traditional P or NP problem IS too, respectively, in the new class P or NP defined above (even though the converse is in general false, since these new generalized classes are strictly larger than the traditional ones), and any superpolynomial deterministic or nondeterministic problem is NOT in the new class P or NP, respectively, which proves that these generalizations are consistent and smooth.

8. August 20, 2012 8:03 am

hi,
is this a typo in the text? Or a *very* different usage?
“Identity Function The Identity function is simple: ${I(1) = 1}$, and otherwise, it is ${0}$.”
Surely the identity function (the usual one) is simply $f(n)=n$ for all n.

August 21, 2012 7:26 am

As I’ve suggested above: the more efficient the algorithm, the more complex its proof of correctness. But is there a way to prove this? I may be wrong, but I don’t think this phenomenon is reflected in any known theorem. Take integer factoring: why is it necessary to bring to bear more and more concepts – such as elliptic curves – in order to write a more efficient algorithm than Euclid’s? I think the basic reason for this phenomenon lies in the thermodynamics of processes and – in my opinion – those principles of physics should be rewritten as axioms of computer science.

The act of counting was taken into account by the axioms of Dedekind and Peano. The act of reasoning about collections was taken into account by Cantor and the later axioms of set theory. But the act of programming still needs its axioms. At least, a new axiom seems to be required for relating the complexity of an algorithm to that of its proof.

10. August 30, 2012 9:55 pm

Deligne, in “La catégorie des représentations du groupe symétrique $S_t$, lorsque $t$ n’est pas un entier naturel,” Algebraic Groups and Homogeneous Spaces, Tata Inst. Fund. Res. Stud. Math., Tata Inst. Fund. Res., Mumbai, 2007, pp. 209–273, constructs a symmetric group $S_{t}$ category for complex $t$.

Is there a way to change the definitions of NP-complete problems so that the cardinality of input size of the problem is a complex number such that when the input size is integral, the NP-complete problem reduces to its traditional version?

August 31, 2012 5:05 am

The asymptotic behavior of algorithms gives rise to a clear-cut notion of problem complexity, but unfortunately this notion is generally uncomputable. Manuel Blum has given axioms for the complexity measures of computable functions (cf. Wikipedia), but why not try to enlarge the problem by defining procedure complexity and problem complexity in a non-recursive framework? To me the very notion of problem complexity is fuzzy by nature, so an adequate set of axioms should not attempt to define it too strictly.