Are inequalities necessary?

Every once in a while, “at an uncertain hour”, algebraic fever returns, and I look at inequalities with mistrust. (This is a dramatized introduction, to be sung to the tune of The Rime of the Ancient Mariner; as Wodehouse would say, the poet Coleridge puts these things well.)

Compared with an identity such as

\sum_{n\geq 1}{\frac{1}{n^2}}=\frac{\pi^2}{6},

which one can easily imagine occupying pride of place in a platonic heaven, what is one to make of an inequality like

|K(p)|\leq 2\sqrt{p},

where

K(p)=\sum_{x=1}^{p-1}{\exp\left(2i\pi \frac{x+x^{-1}}{p}\right)},

for every prime number p? (This is the famous Weil bound for Kloosterman sums; here x^{-1} denotes the inverse of x modulo p.) Or what should one think of a statement like: for every ε>0, there exists a constant C_ε such that

d(n)\leq C_{\epsilon}n^{\epsilon}

for all positive integers n, d(n) being the number of positive divisors of n? (See this post by T. Tao for an enlightening discussion of this well-known inequality, which was apparently first proved by Runge in 1885, in a paper in Acta Mathematica on solvable equations of the type x^5+ux+v=0; this slightly surprising reference is given in Montgomery and Vaughan’s Multiplicative Number Theory.)
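The divisor bound is easy to probe numerically. The following is a minimal Python sketch, not taken from any reference: the function names `divisor_counts` and `best_constant` are made up for this illustration, and the maximum is only taken over n up to a finite limit, so what it returns is a lower bound for the best constant C_ε rather than the constant itself.

```python
def divisor_counts(limit):
    """Return a list d with d[n] = number of positive divisors of n, for 1 <= n <= limit."""
    d = [0] * (limit + 1)
    for k in range(1, limit + 1):
        # k divides exactly the multiples k, 2k, 3k, ...
        for multiple in range(k, limit + 1, k):
            d[multiple] += 1
    return d


def best_constant(eps, limit):
    """Return (max of d(n)/n**eps over 1 <= n <= limit, an n attaining it)."""
    d = divisor_counts(limit)
    return max((d[n] / n ** eps, n) for n in range(1, limit + 1))


if __name__ == "__main__":
    for eps in (1.0, 0.5, 0.25):
        ratio, n = best_constant(eps, 100_000)
        print(f"eps={eps}: max of d(n)/n^eps for n <= 100000 is {ratio:.4f}, at n={n}")
```

For ε=1/2 the maximum is attained at n=12, where d(12)/√12=√3; for smaller ε the maximizing n grows quickly, which gives some feeling for how fast C_ε must grow as ε tends to 0.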

The suspicion that inequalities are not quite “right” may have led to a number of devices to transform them into equalities (or identities) as much as possible. For instance, one may say that any inequality of the type

A\geq 0,

where A is an arbitrarily complicated real-valued expression (notice that any inequality could be written in this way…), is a bad version of an identity

A=B^2,

where B is some more intrinsic (and possibly even more complicated) expression. Note that this is in fact of some importance in logic (and in algebra, with the theory of real fields): extending slightly, it shows (by Lagrange's four-square theorem) that the non-negative integers are definable existentially in the integers by the first-order formula φ(n) given by

\exists a\ \exists b\ \exists c\ \exists d,\ n=a^2+b^2+c^2+d^2

in the language of rings (and let me recall the much subtler formula of J. Robinson that extracts the integers from the rationals, though not purely existentially — whether the latter is possible is still an open problem).
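The four-square formula can be tested by brute force. Here is a small Python sketch under the obvious assumptions (the name `four_squares` is hypothetical, and the search is a naive one, not an efficient algorithm): it looks for a witness (a,b,c,d) with a ≥ b ≥ c ≥ d ≥ 0, which exists exactly when n is non-negative.

```python
from math import isqrt  # Python 3.8+


def four_squares(n):
    """Return (a, b, c, d) with a*a + b*b + c*c + d*d == n, or None if n < 0."""
    if n < 0:
        return None  # a sum of squares of integers is never negative
    # Search witnesses ordered as a >= b >= c >= d >= 0.
    for a in range(isqrt(n), -1, -1):
        rest_a = n - a * a
        for b in range(min(a, isqrt(rest_a)), -1, -1):
            rest_b = rest_a - b * b
            for c in range(min(b, isqrt(rest_b)), -1, -1):
                rest_c = rest_b - c * c
                d = isqrt(rest_c)
                if d <= c and d * d == rest_c:
                    return (a, b, c, d)
    return None


if __name__ == "__main__":
    for n in (-3, 0, 7, 2024):
        print(n, four_squares(n))
```

The inequality n ≥ 0 is thus replaced by the existence of an explicit witness, which is the sense in which the formula φ(n) is existential.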

I have also heard it said that Selberg sometimes claimed that his whole career was built on the fact that the square of a real number is non-negative, but I have no idea if he actually did — at the very least, it seems to completely ignore the Trace Formula…

Turning to Kloosterman sums, algebraists certainly sleep better at night knowing that the “correct” statement is that

K(p)=\sqrt{p}\left(\alpha_p+\overline{\alpha_p}\right)

for some complex number α_p of modulus 1, which can be defined in some beautifully elegant manner: the Weil bound then becomes a simple matter of neglecting part of this interesting information, and only remembering the triangle inequality.
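To see what information the inequality throws away, one can compute K(p) directly. Since K(p) is real and at most 2√p in absolute value, it can be written as 2√p·cos(θ_p) for a unique angle θ_p in [0, π]. The Python sketch below computes both; the names `kloosterman` and `angle`, and the notation θ_p, are introduced here for illustration only, and the sum is evaluated naively in floating point.

```python
import cmath
import math


def kloosterman(p):
    """K(p) = sum over x = 1..p-1 of exp(2*pi*i*(x + x^{-1})/p), for a prime p."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, -1, p)  # inverse of x modulo p (Python 3.8+)
        total += cmath.exp(2j * math.pi * ((x + x_inv) % p) / p)
    # The terms for x and -x are complex conjugates, so the sum is real.
    return total.real


def angle(p):
    """Angle theta in [0, pi] with K(p) = 2*sqrt(p)*cos(theta)."""
    ratio = kloosterman(p) / (2 * math.sqrt(p))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, ratio)))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 101, 1009):
        k = kloosterman(p)
        print(f"p={p}: K(p)={k:+.6f}, 2*sqrt(p)={2 * math.sqrt(p):.6f}, theta={angle(p):.6f}")
```

The bound only records that the ratio K(p)/(2√p) lies in [-1, 1]; the angle is the extra information that the algebraic statement keeps.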

Another way of understanding certain inequalities is to rephrase them as instances of bounds arising from the norm of a linear operator between normed vector spaces: if T is such an operator, its norm is equal to the infimum of numbers C such that

\|T(v)\|\leq C\|v\|

for all vectors v in the source space. So, for instance, one of the large sieve inequalities, as a purely analytic statement, is that for any N and any complex numbers a_n we have

\sum_{q\leq Q}\ \sum_{1\leq a\leq q,\ (a,q)=1}{\left|\sum_{n=1}^N{a_n\exp(2i\pi an/q)}\right|^2}\leq (N-1+Q^2)\sum_{n}{|a_n|^2}

and one could say that N-1+Q^2 is here a placeholder for the “right” quantity which is simply


the norm of some (fairly obvious) linear map between two finite-dimensional Hilbert spaces. The implied criticism would be that it is only because we are not clever enough to find a formula for this norm that we have to make do with disappointing inequalities. (Though Ramaré’s investigations of eigenvalues of the large sieve operator show that this operator is in fact quite mysterious: in the critical case where N and Q^2 are of comparable size, there seems to be a limiting distribution for the eigenvalues, but it has a very strange look.)

This method of introducing linear operators applies for many well-known inequalities, for instance the Cauchy-Schwarz and Hölder inequalities: they can be interpreted as giving the formula for the norms of linear forms

f\mapsto \int_X{f(x)g(x)d\mu(x)}

on suitable L^p spaces.

But now, this ghastly tale being told, I come back to my analytic senses: I am sure that there is much more to inequalities than being des identités manquées (“failed identities”)! But it’s not clear if, or how, this might be formalized. Maybe what is needed is a very elementary inequality where the best possible constant is known, but involves much more sophisticated notions than the statement of the inequality? Possibly, the Hilbert inequality

\left|\sum_{1\leq n,m\leq N}{\frac{a_nb_m}{n+m}}\right|\leq \pi \|a\| \|b\|

(again with arbitrary complex coefficients a_n and b_m, and with arbitrary N) might be interpreted in this way. The constant π is here best possible, but since π occurs everywhere from the most elementary mathematics, it may not be “sophisticated” enough to carry conviction. Are there other known operators with similarly simple descriptions and norms known to be very complicated numbers (in some sense)?
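For a fixed N, the best constant in the Hilbert inequality is the operator norm of the N×N matrix with entries 1/(n+m), and this can be estimated numerically. The sketch below uses plain power iteration in pure Python; the function name `hilbert_norm` and the iteration count are choices made here for illustration. Since the matrix is symmetric and positive definite, the value returned approaches the norm from below.

```python
import math


def hilbert_norm(N, iterations=200):
    """Estimate the operator norm of the N x N matrix (1/(n+m)), 1 <= n, m <= N."""
    H = [[1.0 / (n + m) for m in range(1, N + 1)] for n in range(1, N + 1)]
    v = [1.0 / math.sqrt(N)] * N  # unit starting vector with positive entries
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(h * x for h, x in zip(row, v)) for row in H]  # w = H v
        estimate = math.sqrt(sum(x * x for x in w))            # ||H v|| with ||v|| = 1
        v = [x / estimate for x in w]                          # renormalise
    return estimate


if __name__ == "__main__":
    for N in (5, 20, 80):
        print(f"N={N}: norm is about {hilbert_norm(N):.6f}, pi = {math.pi:.6f}")
```

The values increase with N and stay below π, but they approach it very slowly, so an experiment of this kind would not by itself suggest that the best constant valid for all N is π.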

Another possibility towards proving that inequalities are unavoidable would be to look at something like the function

\epsilon\mapsto \max_{n\geq 1}\ \frac{d(n)}{n^{\epsilon}}

and hope to show that, in some sense, it is “much more complicated” than the divisor function, and thus unlikely to be replaceable by something nicer. In that specific case, though, it is not clear that this can be true, because the divisor function itself is really quite a complicated object already (for instance, with respect to its algorithmic computability).

One situation in which, I think, there is a fairly general consensus that the inequality cannot be replaced in a non-tautological manner even by an asymptotic formula, is that of class numbers of imaginary quadratic fields, a famous problem going back to Gauss. This can in fact be described very elementarily (though the algebraic interpretation is probably the only way to justify why one would look at this particular question): for a positive squarefree number d, let h(d) be the number of integral solutions (a,b,c), with no common factor e>1, to the (in)equations

-a<b\leq a\leq c,\ \text{with }\ b\geq 0\ \text{ if }\ a=c,\ \ \ \ \ b^2-4ac=-d.

The question is then to know the size of h(d), and this is a notoriously difficult problem. It is relatively simple to show an inequality like

h(d)<2\sqrt{d}\log d

and it is further known that, for every ε>0, there is a constant C_ε>0 such that

h(d)\geq C_{\epsilon}d^{1/2-\epsilon}

(a theorem due to Siegel, which is much harder, and for which no one knows how to compute C_ε if ε is small enough; and here I can’t help quoting what may be the century’s greatest understatement, taken from MathWorld: “There are at least two Siegel’s theorems”).
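The elementary description of h(d) translates directly into a counting procedure. The Python sketch below enumerates the triples (a,b,c), taking the discriminant b^2-4ac to be -d (the sign for which the system has solutions); the name `class_number` is chosen here for illustration, and the method is plain enumeration, not an efficient class number algorithm.

```python
from math import gcd, isqrt, log, sqrt  # isqrt needs Python 3.8+


def class_number(d):
    """Count coprime triples (a, b, c) with -a < b <= a <= c, b >= 0 if a == c,
    and b*b - 4*a*c == -d."""
    count = 0
    # From |b| <= a <= c we get d = 4ac - b^2 >= 3a^2, hence a <= sqrt(d/3).
    for a in range(1, isqrt(d // 3) + 1):
        for b in range(-a + 1, a + 1):
            numerator = b * b + d          # must equal 4*a*c
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue
            if gcd(gcd(a, b), c) == 1:
                count += 1
    return count


if __name__ == "__main__":
    for d in (3, 15, 23, 47, 163, 1019):
        upper = 2 * sqrt(d) * log(d)
        print(f"d={d}: h(d)={class_number(d)}, 2*sqrt(d)*log(d)={upper:.2f}")
```

For example, d=23 gives the three triples (1,1,6), (2,1,3) and (2,-1,3), while d=163 gives only (1,1,41); the erratic behaviour of such counts is what makes the size of h(d) hard to pin down.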

Those two inequalities show that the class number is of size about d^{1/2}, in some sense, but after extensive work, I don’t think anyone who has looked at the problem in some depth would expect to get even an asymptotic formula

h(d)=g(d)+\text{(smaller remainder)}

where the function g(d) is “elementary” in a reasonable sense. Of course, this is not a formal statement, and it’s not clear if a precise version is possible (this may be another interesting somewhat meta-mathematical problem to consider…)

In that particular case, algebraists might exclaim that the Class Number Formula of Dirichlet provides the required “identity” version of h(d): up to minor (explicit) factors, we have

h(d) \simeq \sqrt{d}L(1,\chi_{-d})

relating the mysterious class number to a special value of a Dirichlet L-function associated with the underlying imaginary quadratic field. But this gives essentially no information on the size of h(d), so this transcription is not as convincing as what we saw in the case of Kloosterman sums. (Although this formula has been the basis of the deepest estimates for h(d), which have been deduced from bounds for L-functions.)

Published by


I am a professor of mathematics at ETH Zürich since 2008.

5 thoughts on “Are inequalities necessary?”

  1. I think one might want to make a distinction between _inequalities_ (where the sharp constant is of interest) and _estimates_ (in which one is generally only interested in “asymptotic” regimes when some parameter, e.g. n, is very large or very small). For inequalities I certainly agree that one would often like to view an inequality as a projection of a more algebraic and mathematically richer fact. But estimates, I think, are of their own intrinsic interest, being basically a question as to which quantities depending on one or more parameters dominate which other quantities.

    If you want an unusual example of a best constant, one could take the best constant C in the Hardy-Littlewood maximal inequality

    |\{ \sup_{r > 0} \frac{1}{2r} \int_{x-r}^{x+r} |f| > \lambda \}| \leq \frac{C}{\lambda} \int_{-\infty}^\infty |f|.

    It turns out that this constant C is equal to \frac{1}{12}(11 + \sqrt{61}), a recent result of Melas.

  2. The distinction between inequalities and estimates seems very useful indeed to catch some of the features I was thinking about. And thanks for the example of best constant: it’s quite interesting!

    There’s another point I had wanted to make with respect to the “algebraisation” of the Weil bound for Kloosterman sums: as is well-known, the principle is of considerable generality, but it breaks down (as far as deducing any estimate for the archimedean size of the exponential sum) in some contexts, e.g., (as far as I know) there is no algebraico-geometric approach to bound non-trivially a sum like

    \sum_{1\leq x\leq p}{\exp\left(2i\pi \frac{x^d+x+1}{p}\right)}

    if d is itself of size about p^{1/2}, because the algebraic version of this sum has about d terms, each of size p^{1/2}.

  3. There is in fact an elegant asymptotic formula for h(-d) in a paper of Granville and Stark on the Siegel zero, however it is conditional on GRH.

    You presumably meant b^2-4ac=-d?

  4. Thanks for correcting the sign of the discriminant!

    As for the asymptotic formula, I guess this is partly a matter of definition; to my mind (as I wrote the post), the main term would have to be explicit and “elementary”, and this is not the case of the Granville-Stark formula (which, I assume, is Theorem 3 in their Inventiones 2000 paper) — in particular the main term hides the oscillations of the class number above and below the elementary factors sqrt(d)/log d (in the sum over reduced representatives; which is another way in which it is not quite what I had in mind: the main term contains references to the objects to be counted…)

  5. Terry’s distinction between estimates and inequalities is a nice idea, and reminds me of a remark that Selberg made in an interview published in the Bulletin of the AMS. Selberg referred to his famous result
    $\sum_{p<x} \log^2 p + \sum_{pq<x} \log p \log q = 2x \log x + O(x)$ as an “equality,” and was rather disdainful of Erdős’s habit of calling it an “inequality.” But in Terry’s language it’s really an estimate, and there’s no real reason to quibble over whether it’s an equality or an inequality.
