Q.E.D, H.E.Q.D.I, H.E.Q.D.P, Q.D.I, Q.F.I, and all that

Today’s “Word of the day” from the Oxford English Dictionary is quod erat demonstrandum (which prompts the question of the largest number of words in an OED “word”, but let us leave that aside for now).

The full entry shows latin translators of Euclid toiling mightily to catch just the right sizzle of the Greek “ὅπερ ἔδει δεῖξαι” (which, we are told, means literally “what needed to be [shown] has been shown”), just like Joyce or Wodehouse trying to find the perfect mot juste for their latest novel.

So were tried, in turn, hoc est quad demonstrare intendimus (“This is what we intended to demonstrate”, too pedestrian); hoc est quod demonstrare proposimum (“This is what we proposed to demonstrate”, too egocentric); quod demonstrare intendimus (“What we intended to demonstrate”; dashing, but reeks of false modesty); quod fuit propositum (“Which was proposed”, yes, it was proposed, but is it proved?) — and who knows how many other versions which go unmentioned.

I personally think that the Greek sounds and reads much better; this suggests introducing the abbreviation “ὅἔδ”, or the transliteration “OED” if needed, instead of “QED” in the more refined mathematical literature…

Bounded gaps between primes!

And so it came to pass, that an almost millenial quest found a safe resting place…

Like all analytic number theorists, I’ve been amazed to learn that Yitang Zhang has proved that there exist infinitely many pairs of prime numbers \ell<p with p-\ell bounded by an absolute constant C.

So, how did he do it?

Well, since the paper just became available, I don’t have anything intelligent to say yet on the new ideas that he introduced (but I certainly hope to come back to this!). However, one can easily list those previously-known tools that he uses, which involve some of the deepest and most clever results in analytic number theory of the last 30 to 35 years.

(1) At the core, the proof is based on the method discovered about ten years ago by Goldston, Pintz and Yıldırım to show that

\liminf \frac{p_{n+1}-p_n}{\log n}=0.

As I discussed a while back, this remarkable result — besides its intrinsic interest — was notable for being the first to bring the problem of bounded gaps between primes within a circle of well-studied and widely believed conjectures on primes in arithmetic progressions to large moduli. Precisely, Goldston, Pintz and Yıldırım had derived the statement above, after many ingenious steps, by applying the Bombieri-Vinogradov Theorem, and they showed that any progress beyond it towards the so-called Elliott-Halberstam Conjecture would imply the bounded gap property. However, in my former blog post, I discussed why it seemed extremely difficult to go in that direction…

(2) … despite the existence of some results going beyond the Bombieri-Vinogradov theorem, due first to Fouvry-Iwaniec and later improved by Bombieri-Friedlander-Iwaniec; but Zhang uses indeed some of the ideas behind these results…

(3) … results which themselves depend crucially on two big ideas: the well-factorable weights of the linear sieve, due to Iwaniec, and the development and applications of the Kuznetsov formula and other results concerning the spectral theory of automorphic forms and estimates for sums of Kloosterman sums, the outcome of the work of Deshouillers and Iwaniec (actually, at first glance, it seems that Zhang does not explicitly use those results arising from the Kuznetsov formula; he does reach sums with incomplete Kloosterman sums which the spectral methods are designed to handle, but he can deal with them with the Weil bound only; this might be a place where the result can be improved…)

(4) … but furthermore, Zhang uses also an estimate for a certain character sum over finite fields which had appeared in the work of Friedlander and Iwaniec on the exponent of distribution for the ternary divisor function; this sum is a three-variable additive character sum, and its estimation (with square-root cancellation), proved by Bombieri and Birch in an Appendix to the paper of Friedlander and Iwaniec, depends crucially on the Riemann Hypothesis over finite fields of Deligne.

Here are some references to surveys or explanations of some of these tools. Amusingly, I have written something on most of them…

  • There have been many surveys of the work of Goldston, Pintz and Yıldırım, and in particular I wrote a Bourbaki report on it, which may be interesting to those who read French;
  • Concerning the automorphic Kloostermania that comes into the Fouvry-Iwaniec and Bombieri-Friedlander-Iwaniec circle of ideas (although it is apparently not needed for Zhang’s proof…), I happened to write a few years ago, for a book on Poincaré’s mathematical workan account of the applications of Poincaré series to analytic number theory, which are used to prove the Kuznetsov formula;
  • Fouvry has written a survey Cinquante ans de théorie analytique des nombres from the point of view of sieve methods, which discusses the philosophy of extending the ranges of exponents of distribution for important sequences, as well as the well-factorable weights of Iwaniec;
  • Fans of trace functions may remember that I noticed in a previous post (see the very end) that the exponential sum of Friedlander-Iwaniec, estimated by Birch and Bombieri, is (for prime moduli) just a special case of the general “correlation sums” that appeared in my recent work with Fouvry and Ph. Michel — in particular, our arguments (based on the sheaf-theoretic Fourier transform of Deligne, Laumon, Katz and others) give a conceptually simple proof of that estimate (I just wrote it down in a short separate note);

And although it doesn’t seem that Zhang uses it directly, I’d like to mention that the result of Friedlander and Iwaniec concerning the exponent of distribution of d_3 was improved by Heath-Brown a few years later, and that Fouvry, Michel and I very recently improved it quite a bit further (for prime moduli; the second part of that paper involves another application of the Bombieri-Friedlander-Iwaniec techniques to improve the exponent of distribution on average…)

And a philosophical preliminary conclusion, before diving into the work of Zhang: it is thrilling to see this result, and I particularly like that it comes completely unexpectedly, and yet uses all these beautiful ideas and methods from this analytic number theory that I love!

 

Another exercise with characters

While thinking about something else, I noticed recently the following result, which is certainly not new:

Let G be a compact topological group [ADDITIONAL ASSUMPTION pointed out by Y. Choi: connected, Lie group], and let \rho be a finite-dimensional irreducible unitary continuous representation of G on a vector space V. Then the natural representation \pi of G on \mathrm{End}(V) decomposes as a direct sum of one-dimensional characters if and only if \rho is of dimension 1.

One direction is clear: if \rho has dimension one, then \pi is simply the trivial one-dimensional representation. For the converse, here is an argument with character theory.

As a first step, note that if \rho (of dimension d\geq 1, say) has this property, then in fact \pi decomposes as a direct sum of distinct one-dimensional characters: indeed, the multiplicity of a character \chi in \pi is the same as
n_{\chi}=\int_{G}\chi(x)\mathrm{Tr}(\pi(g))dg,
where dg is the probability Haar measure on G, and since
\mathrm{Tr}(\pi(g))=|\mathrm{Tr}(\rho(g))|^2,
we get
n_{\chi}\leq \int_{G}\mathrm{Tr}(\pi(g))dg=1
by the orthogonality relations of characters. (Algebraically, this is just an application of Schur’s lemma).

Thus if we decompose \pi into irreducible representations, we get
\pi=\bigoplus_{1\leq i\leq d^2} \chi_i,
where the \chi_i are distinct one-dimensional characters. We then know by orthogonality that
d^2=\int_{G} |\mathrm{Tr}(\pi(g))|^2 dg=\int_{G} |\mathrm{Tr}(\rho(g))|^4 dg.

Now the last-integral is bounded by
\int_{G} |\mathrm{Tr}(\rho(g))|^4 dg\leq \mathrm{Max}_{g}|\mathrm{Tr}(\rho(g))|^2 \times \int_G|\mathrm{Tr}(\rho(g))|^2dg\leq d^2,
(since |\mathrm{Tr}(\rho(g))|\leq d). Comparing, this means that there must be equality throughout in this estimate, which in turn implies that |\mathrm{Tr}(\rho(g))|=d for all g\in G. Since \rho(g) is unitary of size d, this implies that \rho(g) is scalar for all g, and since it is assumed to be irreducible, it is in fact one-dimensional.

I see two interesting points in this argument: (1) is there a purely algebraic proof of the last part? I haven’t thought very hard about this yet, but it would be nice to have one; (2) the appearance of the fourth moment of \rho is nicely reminiscent of the Larsen alternative (see Section 6.3 of my notes on representation theory, for instance…)

Zeros of Hermite polynomials

In my paper with É. Fouvry and Ph. Michel where we find upper bounds for the number of certain sheaves on the affine line over a finite field with bounded ramification, the combinatorial part of the argument involves spherical codes and the method of Kabatjanski and Levenshtein, and turns out to depend on the rather recondite question of knowing a lower bound on the size of the largest zero x_n of the n-th Hermite polynomial H_n, which is defined for integers n\geq 1 by
H_n(x)=(-1)^n e^{x^2} \frac{d^n}{dx^n}e^{x^2}.

This is a classical orthogonal polynomial (which implies in particular that all zeros of H_n are real and simple). The standard reference for such questions seems to still be Szegö’s book, in which one can read the following rather remarkable asymptotic formula:
x_n=\sqrt{2n}-\frac{i_1}{\sqrt[3]{6}}\frac{1}{(2n)^{1/6}}+o(n^{-1/6})
where i_1=3.3721\ldots>0 is the first (real) zero of the function
\mathrm{A}(x)=\frac{\pi}{3}\sqrt{\frac{x}{3}}\Bigl\{J_{1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)+J_{-1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)\Bigr\}
which is a close cousin of the Airy function (see formula (6.32.8) in Szegö’s book, noting that he observes the Peano paragraphing rule, according to which section 6.32 comes before 6.4).

(Incidentally, if — like me — you tend to trust any random PDF you download to check a formula like that, you might end up with a version containing a typo: the cube root of 6 is, in some printings, replaced by a square root…)

Szegö references work of a number of people (Zernike, Hahn. Korous, Bottema, Van Veen and Spencer), and sketches a proof based on ideas of Sturm on comparison of solutions of two differential equations.

As it happens, it is better for our purposes to have explicit inequalities, and there is an elementary proof of the estimate
x_n\geq\sqrt{\frac{n-1}{2}},
which is only asymptotically weaker by a factor 2 from the previous formula. This is also explained by Szegö, and since the argument is rather cute and short, I will give a sketch of it.

Besides the fact that the zeros of H_n are real and simple, we will use the easy facts that \deg(H_n)=n, and that H_n is an even function for n even, and an odd function for n odd, and most importantly (since all other properties are rather generic!) that they satisfy the differential equation
y''-2xy'+2ny=0.

The crucial lemma is the following result of Laguerre:

Let P\in \mathbf{C}[X] be a polynomial of degree n\geq 1. Let z_0 be a simple zero of P, and let
w_0=z_0-2(n-1)\frac{P'(z_0)}{P''(z_0)}.
Then if T\subset \mathbf{C} is any line or circle passing through z_0 and w_0, either all zeros of P are in T, or both components of \mathbf{C}-T contain at least one zero of P.

Before explaining the proof of this, let’s see how it gives the desired lower bound on the largest zero x_n of H_n. We apply Laguerre’s result with P=H_n and z_0=x_n. Using the differential equation, we obtain
w_0=x_n-\frac{n-1}{x_n}.
Now consider the circle T such that the segment [w_0,z_0] is a diameter of T.

Now note that -x_n is the smallest zero of H_n (as we observed above, H_n is either odd or even). We can not have w_0<-x_n: if that were the case, the unbounded component of the complement of the circle T would not contain any zero, and neither would T contain all zeros (since -x_n\notin T), contradicting the conclusion of Laguerre's Lemma. Hence we get -x_n\leq w_0=x_n-\frac{n-1}{x_n},
and this implies
x_n\geq \sqrt{\frac{n-1}{2}},
as claimed. (Note that if n\geq 3, one deduces easily that the inequality is strict, but there is equality for n=2.)

Now for the proof of the Lemma. One defines a polynomial Q by
P=(X-z_0)Q,
so that Q has degree n-1 and has zero set Z formed of the zeros of P different from z_0 (since the latter is assumed to be simple). Using the definition, we have
Q'(z_0)=P'(z_0),\quad\quad Q''(z_0)=\frac{1}{2}P''(z_0).
We now compute the value at z_0 of the logarithmic derivative of Q, which is well-defined: we have
\frac{Q'}{Q}=\sum_{\alpha\in Z}\frac{1}{X-\alpha},
hence
\frac{Q'}{Q}(z_0)=\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},
which becomes, by the above formulas and the definition of w_0, the identity
\frac{1}{z_0-w_0}=\frac{1}{n-1}\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},
or equivalently
\gamma(w_0)=\frac{1}{n-1}\sum_{\alpha\in Z}{\gamma(\alpha)},
where \gamma(z)=1/(z_0-z) is a Möbius transformation.

Recalling that |Z|=n-1, this means that \gamma(w_0) is the average of the \gamma(\alpha). It is then elementary that for line L, either \gamma(Z) is contained in L, or \gamma(Z) intersects both components of the complement of L. Now apply \gamma^{-1} to this assertion: one gets that either Z is contained in \gamma^{-1}(L), or Z intersects both components of the complement of \gamma^{-1}(L). We are now done, after observing that the lines passing through \gamma(w_0) are precisely the images under \gamma of the circles and lines passing through w_0 and through z_0 (because \gamma(z_0)=\infty, and each line passes through \infty in the projective line.)