# E. Kowalski's blog

## Bounded gaps between primes: some grittier details

Because I was teaching a course on prime numbers this semester and had just finished a chapter on Vinogradov’s method when his paper appeared, I promptly switched my plans for the last classes in order to present some aspects of Yitang Zhang’s theorem on bounded gaps between primes. In addition, one of the speakers of this week’s conference celebrating 25 years of the “Zahlentheorie Seminar” of ETH had to cancel at short notice, and I replaced her and gave yesterday another survey-style talk. The notes for the latter (such as they are…) can be downloaded in scanned form.

My insights to Zhang’s work remain clearly superficial, but here are some remarks going a bit beyond what I mentioned in the previous post, coming after these lectures, and some discussions with Ph. Michel and P. Nelson.

(1) The most delicate estimates seem to be those for the “Type III” sums. These concern the “good” distribution in invertible residue classes of an arithmetic function $f(n)$ for integers $N, modulo a "large modulus" $q$, where $f(n)$ is of the very special type
$f(n)=\sum_{m_1n_1n_2n_3=n}\alpha(m_1),$
and the variables $m_1$, $n_1$, $n_2$ and $n_3$ are, roughly, of size $M_1$, $N_i$ with $M_1N_1N_2N_3=N$, and (crucially) $q$ is a bit larger than $N^{1/2}$: one needs to handle these for $q$ up to $N^{1/2+\delta}$ for some $\delta>0$ in order to obtain bounded gaps between primes.

The lengths $M_1$ and $N_i$ are constrained in various ways, and the most critical case seems to be when $M_1\approx q^{1/8}$, $N_i\approx q^{5/8}$ (the $\approx$ means that one must be able to go a bit beyond such a case, since $N$ is a bit beyond $q^2$).

(2) Another point is that Zhang manages to bound these sums for each individual residue class $a\pmod q$ (coprime to $q$). In other words, denoting
$\Delta_f(N,q,a)=\sum_{N
he proves individual bounds for $\Delta_f(N,q,a)$, instead of average bounds over $q$ (as in the other main part of this argument).

Also, he does not need to use the variable $m_1$ at all (but since the $\alpha(m_1)$ are mostly unknown coefficients, and the sum is rather short, exploiting it does not seem easy). Hence the result looks enormously like controlling the distribution in residue classes of the ternary divisor function. This is exactly the question that Friedlander and Iwaniec had studied in the famous paper where they proved that the exponent of distribution is at least $1/2+1/230$, but their argument is not quite sufficient for Zhang's purpose.

(3) One of the last tricks is Zhang's second use of the structure of the moduli $q$ that are involved in his argument: these were chosen to have only prime factors $\leq z=N^{\delta}$ for some small positive $\delta$), and Zhang exploits the "granular" (or friable) structure of such moduli in order to obtain flexibility in the possibility of factoring them as $q=rs$ with $r$ of size determined up to a factor at most $z$. This is particularly important for the “other” sums (it gives bilinear structure and makes it possible to use the dispersion method of Linnik, as already done by Fouvry-Iwaniec and Bombieri-Friedlander-Iwaniec in their works on primes in arithmetic progressions). For the "type III" case, it does not seem to be so much of the essence, but Zhang needs to gain a very small amount compared with Friedlander-Iwaniec, and does so by factoring $q=rs$ with $r$ rather small. He then gains from $r$ a factor $r^{1/2}$ which is essential, by exploiting the fact that a Ramanujan sum modulo $r$ is bounded (so he gets more than square-root cancellation from $r$…)! This is an extremely special situation, and right now, it is what seems the most “miraculous” about this proof (at least to me).

It is for the contribution of the complementary divisor $s$ that Zhang manages to position himself into applying the estimate of Birch and Bombieri for the exponential sums which Friedlander-Iwaniec had also encountered.

(4) This use of Deligne’s work is also very delicate: one can not relax the requirement of square-root cancellation, except by very tiny amounts. For instance, obtaining a bound of size $p^2$ for the three-variable sum modulo $p$ is useless; in fact, the bound $p^2$ can be considered here as the trivial estimate, since the sum can be written as an average of one-variable Kloosterman sums. With Zhang’s parameters, one needs an estimate of for the sum which is no larger than $p^{3/2+1/2000}$ (or so) in order to get the desired gain. However, as I explained in the last five minutes of my talk today (and as is explained in this note with Fouvry and Michel) the Birch-Bombieri bound is very well understood from a conceptual point of view.

(5) I was very curious, when first looking at the paper, to see how Zhang would handle the residue classes in the Goldston, Pintz, Yıldırım method, since the most uniform results on primes in arithmetic progressions (those of Fouvry-Iwaniec and Bombieri-Friedlander-Iwaniec) are constrained to use (essentially) a single residue class. What happens is that Zhang detects these classes by inserting a factor corresponding to their characteristic function, and by avoiding the Kloostermania approaches that rely on spectral theory of automorphic forms. The important properties of these residue classes are their multiplicative structure (coming from the Chinese Remainder Theorem) and the fact that, on average over moduli, there are not too many of them (the average is bounded by a power of $\log N$). In particular, his use of the dispersion method is in fact closer in spirit to some of its earliest uses by Fouvry and Iwaniec (for integers without small prime factors instead of primes), which also involved, in the final steps, the classical Weil bound for Kloosterman sums instead of (seemingly stronger) results on sums of Kloosterman sums.

June 4th, 2013 at 8:39 am

Posted in ETH,Mathematics

## Q.E.D, H.E.Q.D.I, H.E.Q.D.P, Q.D.I, Q.F.I, and all that

Today’s “Word of the day” from the Oxford English Dictionary is quod erat demonstrandum (which prompts the question of the largest number of words in an OED “word”, but let us leave that aside for now).

The full entry shows latin translators of Euclid toiling mightily to catch just the right sizzle of the Greek “ὅπερ ἔδει δεῖξαι” (which, we are told, means literally “what needed to be [shown] has been shown”), just like Joyce or Wodehouse trying to find the perfect mot juste for their latest novel.

So were tried, in turn, hoc est quad demonstrare intendimus (“This is what we intended to demonstrate”, too pedestrian); hoc est quod demonstrare proposimum (“This is what we proposed to demonstrate”, too egocentric); quod demonstrare intendimus (“What we intended to demonstrate”; dashing, but reeks of false modesty); quod fuit propositum (“Which was proposed”, yes, it was proposed, but is it proved?) — and who knows how many other versions which go unmentioned.

I personally think that the Greek sounds and reads much better; this suggests introducing the abbreviation “ὅἔδ”, or the transliteration “OED” if needed, instead of “QED” in the more refined mathematical literature…

May 29th, 2013 at 12:15 pm

Posted in Language,Mathematics

## Bounded gaps between primes!

And so it came to pass, that an almost millenial quest found a safe resting place…

Like all analytic number theorists, I’ve been amazed to learn that Yitang Zhang has proved that there exist infinitely many pairs of prime numbers $\ell with $p-\ell$ bounded by an absolute constant $C$.

So, how did he do it?

Well, since the paper just became available, I don’t have anything intelligent to say yet on the new ideas that he introduced (but I certainly hope to come back to this!). However, one can easily list those previously-known tools that he uses, which involve some of the deepest and most clever results in analytic number theory of the last 30 to 35 years.

(1) At the core, the proof is based on the method discovered about ten years ago by Goldston, Pintz and Yıldırım to show that

$\liminf \frac{p_{n+1}-p_n}{\log n}=0.$

As I discussed a while back, this remarkable result — besides its intrinsic interest — was notable for being the first to bring the problem of bounded gaps between primes within a circle of well-studied and widely believed conjectures on primes in arithmetic progressions to large moduli. Precisely, Goldston, Pintz and Yıldırım had derived the statement above, after many ingenious steps, by applying the Bombieri-Vinogradov Theorem, and they showed that any progress beyond it towards the so-called Elliott-Halberstam Conjecture would imply the bounded gap property. However, in my former blog post, I discussed why it seemed extremely difficult to go in that direction…

(2) … despite the existence of some results going beyond the Bombieri-Vinogradov theorem, due first to Fouvry-Iwaniec and later improved by Bombieri-Friedlander-Iwaniec; but Zhang uses indeed some of the ideas behind these results…

(3) … results which themselves depend crucially on two big ideas: the well-factorable weights of the linear sieve, due to Iwaniec, and the development and applications of the Kuznetsov formula and other results concerning the spectral theory of automorphic forms and estimates for sums of Kloosterman sums, the outcome of the work of Deshouillers and Iwaniec (actually, at first glance, it seems that Zhang does not explicitly use those results arising from the Kuznetsov formula; he does reach sums with incomplete Kloosterman sums which the spectral methods are designed to handle, but he can deal with them with the Weil bound only; this might be a place where the result can be improved…)

(4) … but furthermore, Zhang uses also an estimate for a certain character sum over finite fields which had appeared in the work of Friedlander and Iwaniec on the exponent of distribution for the ternary divisor function; this sum is a three-variable additive character sum, and its estimation (with square-root cancellation), proved by Bombieri and Birch in an Appendix to the paper of Friedlander and Iwaniec, depends crucially on the Riemann Hypothesis over finite fields of Deligne.

Here are some references to surveys or explanations of some of these tools. Amusingly, I have written something on most of them…

• There have been many surveys of the work of Goldston, Pintz and Yıldırım, and in particular I wrote a Bourbaki report on it, which may be interesting to those who read French;
• Concerning the automorphic Kloostermania that comes into the Fouvry-Iwaniec and Bombieri-Friedlander-Iwaniec circle of ideas (although it is apparently not needed for Zhang’s proof…), I happened to write a few years ago, for a book on Poincaré’s mathematical workan account of the applications of Poincaré series to analytic number theory, which are used to prove the Kuznetsov formula;
• Fouvry has written a survey Cinquante ans de théorie analytique des nombres from the point of view of sieve methods, which discusses the philosophy of extending the ranges of exponents of distribution for important sequences, as well as the well-factorable weights of Iwaniec;
• Fans of trace functions may remember that I noticed in a previous post (see the very end) that the exponential sum of Friedlander-Iwaniec, estimated by Birch and Bombieri, is (for prime moduli) just a special case of the general “correlation sums” that appeared in my recent work with Fouvry and Ph. Michel — in particular, our arguments (based on the sheaf-theoretic Fourier transform of Deligne, Laumon, Katz and others) give a conceptually simple proof of that estimate (I just wrote it down in a short separate note);

And although it doesn’t seem that Zhang uses it directly, I’d like to mention that the result of Friedlander and Iwaniec concerning the exponent of distribution of $d_3$ was improved by Heath-Brown a few years later, and that Fouvry, Michel and I very recently improved it quite a bit further (for prime moduli; the second part of that paper involves another application of the Bombieri-Friedlander-Iwaniec techniques to improve the exponent of distribution on average…)

And a philosophical preliminary conclusion, before diving into the work of Zhang: it is thrilling to see this result, and I particularly like that it comes completely unexpectedly, and yet uses all these beautiful ideas and methods from this analytic number theory that I love!

May 21st, 2013 at 9:47 pm

Posted in Mathematics

## Another exercise with characters

While thinking about something else, I noticed recently the following result, which is certainly not new:

Let $G$ be a compact topological group [ADDITIONAL ASSUMPTION pointed out by Y. Choi: connected, Lie group], and let $\rho$ be a finite-dimensional irreducible unitary continuous representation of $G$ on a vector space $V$. Then the natural representation $\pi$ of $G$ on $\mathrm{End}(V)$ decomposes as a direct sum of one-dimensional characters if and only if $\rho$ is of dimension $1$.

One direction is clear: if $\rho$ has dimension one, then $\pi$ is simply the trivial one-dimensional representation. For the converse, here is an argument with character theory.

As a first step, note that if $\rho$ (of dimension $d\geq 1$, say) has this property, then in fact $\pi$ decomposes as a direct sum of distinct one-dimensional characters: indeed, the multiplicity of a character $\chi$ in $\pi$ is the same as
$n_{\chi}=\int_{G}\chi(x)\mathrm{Tr}(\pi(g))dg,$
where $dg$ is the probability Haar measure on $G$, and since
$\mathrm{Tr}(\pi(g))=|\mathrm{Tr}(\rho(g))|^2,$
we get
$n_{\chi}\leq \int_{G}\mathrm{Tr}(\pi(g))dg=1$
by the orthogonality relations of characters. (Algebraically, this is just an application of Schur’s lemma).

Thus if we decompose $\pi$ into irreducible representations, we get
$\pi=\bigoplus_{1\leq i\leq d^2} \chi_i,$
where the $\chi_i$ are distinct one-dimensional characters. We then know by orthogonality that
$d^2=\int_{G} |\mathrm{Tr}(\pi(g))|^2 dg=\int_{G} |\mathrm{Tr}(\rho(g))|^4 dg.$

Now the last-integral is bounded by
$\int_{G} |\mathrm{Tr}(\rho(g))|^4 dg\leq \mathrm{Max}_{g}|\mathrm{Tr}(\rho(g))|^2 \times \int_G|\mathrm{Tr}(\rho(g))|^2dg\leq d^2,$
(since $|\mathrm{Tr}(\rho(g))|\leq d$). Comparing, this means that there must be equality throughout in this estimate, which in turn implies that $|\mathrm{Tr}(\rho(g))|=d$ for all $g\in G$. Since $\rho(g)$ is unitary of size $d$, this implies that $\rho(g)$ is scalar for all $g$, and since it is assumed to be irreducible, it is in fact one-dimensional.

I see two interesting points in this argument: (1) is there a purely algebraic proof of the last part? I haven’t thought very hard about this yet, but it would be nice to have one; (2) the appearance of the fourth moment of $\rho$ is nicely reminiscent of the Larsen alternative (see Section 6.3 of my notes on representation theory, for instance…)

March 22nd, 2013 at 10:19 am

Posted in Exercise,Mathematics

## Zeros of Hermite polynomials

In my paper with É. Fouvry and Ph. Michel where we find upper bounds for the number of certain sheaves on the affine line over a finite field with bounded ramification, the combinatorial part of the argument involves spherical codes and the method of Kabatjanski and Levenshtein, and turns out to depend on the rather recondite question of knowing a lower bound on the size of the largest zero $x_n$ of the $n$-th Hermite polynomial $H_n$, which is defined for integers $n\geq 1$ by
$H_n(x)=(-1)^n e^{x^2} \frac{d^n}{dx^n}e^{x^2}.$

This is a classical orthogonal polynomial (which implies in particular that all zeros of $H_n$ are real and simple). The standard reference for such questions seems to still be Szegö’s book, in which one can read the following rather remarkable asymptotic formula:
$x_n=\sqrt{2n}-\frac{i_1}{\sqrt[3]{6}}\frac{1}{(2n)^{1/6}}+o(n^{-1/6})$
where $i_1=3.3721\ldots>0$ is the first (real) zero of the function
$\mathrm{A}(x)=\frac{\pi}{3}\sqrt{\frac{x}{3}}\Bigl\{J_{1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)+J_{-1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)\Bigr\}$
which is a close cousin of the Airy function (see formula (6.32.8) in Szegö’s book, noting that he observes the Peano paragraphing rule, according to which section 6.32 comes before 6.4).

(Incidentally, if — like me — you tend to trust any random PDF you download to check a formula like that, you might end up with a version containing a typo: the cube root of $6$ is, in some printings, replaced by a square root…)

Szegö references work of a number of people (Zernike, Hahn. Korous, Bottema, Van Veen and Spencer), and sketches a proof based on ideas of Sturm on comparison of solutions of two differential equations.

As it happens, it is better for our purposes to have explicit inequalities, and there is an elementary proof of the estimate
$x_n\geq\sqrt{\frac{n-1}{2}},$
which is only asymptotically weaker by a factor $2$ from the previous formula. This is also explained by Szegö, and since the argument is rather cute and short, I will give a sketch of it.

Besides the fact that the zeros of $H_n$ are real and simple, we will use the easy facts that $\deg(H_n)=n$, and that $H_n$ is an even function for $n$ even, and an odd function for $n$ odd, and most importantly (since all other properties are rather generic!) that they satisfy the differential equation
$y''-2xy'+2ny=0.$

The crucial lemma is the following result of Laguerre:

Let $P\in \mathbf{C}[X]$ be a polynomial of degree $n\geq 1$. Let $z_0$ be a simple zero of $P$, and let
$w_0=z_0-2(n-1)\frac{P'(z_0)}{P''(z_0)}.$
Then if $T\subset \mathbf{C}$ is any line or circle passing through $z_0$ and $w_0$, either all zeros of $P$ are in $T$, or both components of $\mathbf{C}-T$ contain at least one zero of $P$.

Before explaining the proof of this, let’s see how it gives the desired lower bound on the largest zero $x_n$ of $H_n$. We apply Laguerre’s result with $P=H_n$ and $z_0=x_n$. Using the differential equation, we obtain
$w_0=x_n-\frac{n-1}{x_n}.$
Now consider the circle $T$ such that the segment $[w_0,z_0]$ is a diameter of $T$.

Now note that $-x_n$ is the smallest zero of $H_n$ (as we observed above, $H_n$ is either odd or even). We can not have $w_0<-x_n$: if that were the case, the unbounded component of the complement of the circle $T$ would not contain any zero, and neither would $T$ contain all zeros (since $-x_n\notin T$), contradicting the conclusion of Laguerre's Lemma. Hence we get $-x_n\leq w_0=x_n-\frac{n-1}{x_n},$
and this implies
$x_n\geq \sqrt{\frac{n-1}{2}},$
as claimed. (Note that if $n\geq 3$, one deduces easily that the inequality is strict, but there is equality for $n=2$.)

Now for the proof of the Lemma. One defines a polynomial $Q$ by
$P=(X-z_0)Q,$
so that $Q$ has degree $n-1$ and has zero set $Z$ formed of the zeros of $P$ different from $z_0$ (since the latter is assumed to be simple). Using the definition, we have
$Q'(z_0)=P'(z_0),\quad\quad Q''(z_0)=\frac{1}{2}P''(z_0).$
We now compute the value at $z_0$ of the logarithmic derivative of $Q$, which is well-defined: we have
$\frac{Q'}{Q}=\sum_{\alpha\in Z}\frac{1}{X-\alpha},$
hence
$\frac{Q'}{Q}(z_0)=\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},$
which becomes, by the above formulas and the definition of $w_0$, the identity
$\frac{1}{z_0-w_0}=\frac{1}{n-1}\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},$
or equivalently
$\gamma(w_0)=\frac{1}{n-1}\sum_{\alpha\in Z}{\gamma(\alpha)},$
where $\gamma(z)=1/(z_0-z)$ is a Möbius transformation.

Recalling that $|Z|=n-1$, this means that $\gamma(w_0)$ is the average of the $\gamma(\alpha)$. It is then elementary that for line $L$, either $\gamma(Z)$ is contained in $L$, or $\gamma(Z)$ intersects both components of the complement of $L$. Now apply $\gamma^{-1}$ to this assertion: one gets that either $Z$ is contained in $\gamma^{-1}(L)$, or $Z$ intersects both components of the complement of $\gamma^{-1}(L)$. We are now done, after observing that the lines passing through $\gamma(w_0)$ are precisely the images under $\gamma$ of the circles and lines passing through $w_0$ and through $z_0$ (because $\gamma(z_0)=\infty$, and each line passes through $\infty$ in the projective line.)

February 20th, 2013 at 10:43 pm

Posted in Exercise,Mathematics