# E. Kowalski's blog

## Sliding over the Polya-Vinogradov gap

In my series of papers with É. Fouvry and Ph. Michel, we seem to alternate between longer papers and shorter ones. The last one, which we just put up on arXiv, is in some sense the shortest one: even if it goes up to 19 pages in length, the basic idea can be explained extremely quickly, and much of the paper is taken with variations on its basic theme and illustrations.

The context is the all-important problem of estimating short exponential sums, in the specific case of sums over intervals in a cyclic group $A=\mathbf{Z}/m\mathbf{Z}$, where (for the sake of this post) we define an interval $I$ in $A$ to be the injective image of a set of successive integers under reduction modulo $m$. An interval is “short” if its length is significantly smaller than $m$, in the sense that $|I|=m^{\theta}$ for some real parameter with $0\leq \theta<1$. We are then looking for non-trivial estimates for
$\sum_{x\in I}\varphi(x)$
where $\varphi\,:\, A\rightarrow \mathbf{C}$
is some complex-valued function, which is supposed to oscillate in such a way that we expect that
$\sum_{x\in I}\varphi(x)=o(|I|\|\varphi\|_{\infty}).$

There is a fundamental technique to attack this problem, which is of constant use in analytic number theory, and which is known as the completion method. Abstractly, one can see it as a case of the Plancherel formula in (discrete) Fourier analysis, namely, one writes
$\sum_{x\in I}\varphi(x)=\sum_{t\in A}\hat{\varphi}(t)\hat{I}(t),$
where
$\hat{\varphi}(t)=\frac{1}{\sqrt{m}}\sum_{x\in A}\varphi(x)e\Bigl(\frac{tx}{m}\Bigr),$
and $\hat{I}$ is the same transform applied to the characteristic function of the interval $I$. One then deduces the bound
$\sum_{x\in I}\varphi(x)\ll \|\hat{\varphi}\|_{\infty}\sqrt{m}(\log m),$
where the implied constant is absolute, simply by estimating the $L^1$-norm of $\hat{I}$ — in our normalization, this is $\ll \sqrt{m}(\log m)$.

In many cases, the Fourier transform of $\varphi$ can be estimated quite well, and in particular, in the context of finite fields $A=\mathbf{F}_p$, if $\varphi$ is a trace function which is not proportional to an additive character $e(x/p)$, Deligne’s Riemann Hypothesis gives
$\|\hat{\varphi}\|_{\infty}\ll 1,$
where the implied constant depends only on the conductor of (the sheaf underlying) $\varphi$.

The resulting estimate
$\sum_{x\in I}\varphi(x)\ll \sqrt{m}(\log m)$
is known as the general Polya-Vinogradov bound (although the name is sometimes restricted to the case where $\varphi=\chi$ is a Dirichlet character modulo $m$.)

This is quite an efficient result: it gives non-trivial estimates for all intervals $I$ such that $p/(\log p)=o(|I|)$ (or even $|I|=c\sqrt{p}(\log p)$, for a suitable $c>0$, if one only desires some cancellation by a multiplicative constant, and not that the sum be of smaller order of magnitude.) With this generality, it is also almost best possible: if we take $\varphi(x)=e(x^2/p)$, then the sum over $1\leq x\leq \sqrt{p}$ has essentially no cancellation (the phase does not have enough time to turn around sufficiently.)

It seems natural, then, to ask: is the Polya-Vinogradov range (where $I$ is a bit larger than $\sqrt{p}(\log p)$) best possible? Does the gap between $\sqrt{p}$ and $\sqrt{p}(\log p)$ really represent a different behavior for such generalized exponential sums?

Our paper answers this question quite satisfactorily: there is, in fact, no gap, in the sense that for trace functions (not proportional to an additive character), one can get cancellation in sums over intervals of length $\sqrt{p}\beta(p)$ for any function $\beta(p)$ tending to infinity with $p$ (assuming we do tend to infinity while keeping the conductor bounded.)

As a direct corollary, we get for instance the equidistribution in $[0,1]$ (with respect to Lebesgue measure) of fractional parts $\{\frac{f(n)}{p}\}$ for $1\leq n\leq \sqrt{p}\beta(p)$, for any fixed polynomial $f\in \mathbf{Z}[X]$ of degree $\geq 2$, and the equidistribution with respect to the Sato-Tate measure of the angles of Kloosterman sums $\theta_p(x)$ such that
$\frac{1}{\sqrt{p}}\sum_{y\in\mathbf{F}_p^{\times}}e\Bigl(\frac{xy+y^{-1}}{p}\Bigr)=2\cos\theta_p(x),$
again for $1\leq x\leq \sqrt{p}\beta(p)$, whenever $\beta(p)\rightarrow +\infty$. (This last result had been proved in the Polya-Vinogradov regime by Philippe a while ago, and in the full range $1\leq x\leq p-1$, it is a celebrated result of Katz.)

As I mentioned, the basic method is very easy, and we call it the “sliding sum method”. It is reminiscent of the van der Corput inequality, and we wouldn’t be surprised to learn that it had already been established by other people. (The applicability to trace functions, on the other hand, requires some rather deep results, as I will discuss below.)

I will give the simplest variant. Consider an interval $I$ in $A=\mathbf{Z}/m\mathbf{Z}$. We assume that $|I|=\sqrt{m}\beta$ with $\beta=\beta(I)\geq 1$ (so we are certainly above the square-root length). We then compare upper and lower bounds for the average
$\Sigma=\sum_{a\in A}{\Bigl|\sum_{x\in I}{\varphi(x+a)}\Bigr|^2}.$

For the upper-bound, we expand the square and exchange the order of summation, obtaining
$\Sigma=\sum_{x,y\in I}C(\varphi;x-y)$
where
$C(\varphi,t)=\sum_{a\in A}\varphi(a)\overline{\varphi(a+t)}$
is an additive correlation sum of $\varphi$ (it is a special case of those correlation sums we have considered extensively in our first works.)

We assume that $\varphi$ has the property that, if $t$ is not in a set containing at most $c$ values, we have
$|C(\varphi,t)|\leq c \sqrt{m}$
(i.e., the correlation exhibits uniform square-root cancellation, except for a few exceptional cases, among which of course we expect to have $t=0$). Then, using the trivial bound
$|C(\varphi,t)|\leq \|\varphi\|_{\infty}^2m$
for the exceptional values of $t=x-y$, we get
$\Sigma\ll m|I|+m^{1/2}|I|^2\ll m^{1/2}|I|^{1/2},$
where the implied constant depends on the parameter $c$ that we just introduced, and on the $L^{\infty}$-norm of $\varphi$ (and we use the fact that $|I|\geq \sqrt{m}$.)

As for the lower bound, we just use the fact that if we shift $I$ by a small amount, the sum over $a+I$ does not change too much (because the interval and its shift overlap significantly; intuitively, we slide the sum over a certain range, hence our name for the method):
$\Bigl|\sum_{x\in I}\varphi(x)-\sum_{x\in I}\varphi(x+a)\Bigr|\leq 2|a|\|\varphi\|_{\infty}.$

This means that if we consider the sums shifted by $a$ of size up to (roughly) the sum
$S=\sum_{x\in I}\varphi(x)$
itself, these will have a contribution to $\Sigma$ of size roughly $\gg |S|^3.$ So by comparing the upper and lower bounds (using positivity), we get
$|S|^3\ll m^{1/2}|I|^2,$
or, in terms of the parameter $\beta\geq 1$, we get
$S\ll |I|^{2/3}m^{1/6}=|I|\beta^{-1/3}$
(where the implied constant depends on $c$ and on $\|\varphi\|_{\infty}$).

Hence, provided we work with functions with bounded $L^{\infty}$-norm and with uniformly small correlations in the sense we described, then we can get cancellation as long as $\beta(I)\rightarrow +\infty$, but arbitrarily slowly.

This is elementary, but now the really significant part is that if $\varphi$ is a trace function modulo $m=p$, and if it is associated to an irreducible sheaf, and is not proportional to an additive character, then it does have the desired properties, the parameter $c$ being bounded in terms of the conductor $c(\varphi)$ of $\varphi$ (which also satisfies $\|\varphi\|_{\infty}\leq c(\varphi)$, so that it serves as a universal complexity parameter, as it did in our previous works.)

We do not need to fight very hard to prove this in our new paper, but this is because we can quote and use various lemmas from the previous ones, and — as usual — rely on Deligne’s fundamental version of the Riemann Hypothesis over finite fields; the result itself is undoubtedly quite deep when applied to trace functions which are not of rank $1$.

As a last remark: the method is robust, and we adapt it for instance to sums over proper generalized arithmetic progressions, which are natural generalizations of intervals. Here we get bounds of quality depending also on the dimension of the progression (the saving $\beta^{-1/3}$ is replaced by $\beta^{-1/(k+2)}$ for a $k$-dimensional progression), but the result is appealing even for large $k$ because the Polya-Vinogradov gap is
$\sqrt{m}\leq |B|\leq \sqrt{m}(\log m)^k$
for a $k$-dimensional progression $B\subset \mathbf{Z}/m\mathbf{Z}$, hence its size also increases with $k$ (this follows from a recent result of X. Shao bounding the $L^1$-norm of the Fourier transform of the characteristic function of $B$.)

June 30th, 2013 at 9:37 pm

Posted in Mathematics

## Fouvry 60: tableaux d’une conférence

I more or less designated myself as the official photographer of the Fouvry60 conference, and took many pictures, quite a few of which turned out rather well. Certainly, when I view them, I think they give an accurate impression of the atmosphere of the conference. In this post I will just preview a very small selection — I and the other organizers are attempting to find the best way to make the full set available for the participants (at least.)

Jean-Marc Deshouillers, Henryk Iwaniec and John Friedlander

Although Jean-Marc Deshouillers could only attend the first day of the meeting, I was very happy to be able to take a good picture to remind us all of Kloostermania

Harald Helfgott

Voted best-dressed speaker, and speaking about the ternary Goldbach problem, here is Harald Helfgott.

Mise en abîme

The picture for the poster was taken by C.J. Mozzochi around 1999 in Princeton.

Karim Belabas and Henri Cohen

Any time we use Pari/GP for number-theoretic computations, we can thank Karim and Henri (and a few others) for their work; Henri is currently working on a new modular forms script for Pari…

A friendly visitor

Philippe Michel

Étienne Fouvry

Henryk Iwaniec

Henryk Iwaniec presented his work with Brian Conrey in extending the Levinson-Conrey method to allow long mollifiers.

Cécile Dartyge

Cécile Dartyge, before she presented her very impressive work on the largest prime divisors of values of a polynomial of degree $4$.

“Reach for the skies, punk!”

Tim Browning remembers his first meeting with Fouvry…

Bouillabaisse

Sitting, from left to right, Philippe Michel, Régis (du Moulin) de la Bretèche, Joël Rivat (co-organizers of the meeting) and Étienne Fouvry; on the table, the Bouillabaisse: Congre, Rascasse, Saint Pierre, j’en passe, et des meilleurs.

Jürgen Klüners

J. Klüners discussed his work with Fouvry on Cohen-Lenstra heuristics and the negative Pell equation; note the last line of the slide: “I can do this”; when collaborating with Fouvry, you have a good chance of receiving such a reassuring message at a tricky point of the work…

June 27th, 2013 at 4:44 pm

Posted in Mathematics

## A ternary divisor variation

Here is a sketch of the argument mentioned in the previous post (which arose from the discussions with Étienne Fouvry, Philippe Michel, Paul Nelson, etc, but presentation mistakes are fully mine…).

Theorem. We have
$\frac{1}{Q}\sum_{r\sim R}\sum_{s\sim S}\sum_{a\bmod q,\ P(a)= 0\bmod q}\Bigl|S(a,q)-MT_q\Bigr|=O\Bigl(\frac{X}{Q\log^A X}\Bigr)$
provided
$Q^{3/2}S^{1/2}
where
(1) we put
$S(a,q)=\sum_{m\sim M}\alpha_m\sum_{mn_1n_2n_3\equiv a\bmod{q},\ n_i\sim N_i}1,$
and denote by $MT_q$ the expected main term;
(2) the parameters are $X=MN=MN_1N_2N_3$, $Q=RS$, the modulus is $q=rs$, the moduli $r$ and $s$ are coprime and squarefree, and $P\in \mathbf{Z}[X]$ is the usual polynomial associated to an admissible tuple.

If we take $R=Q^{1/2-\epsilon}$ and $S=Q^{1/2+\epsilon}$, we get a non-trivial result for $Q$ as large as $N^{4/7}X^{-\epsilon}$.

(In fact, the special shape of $P$ will play no role in this argument, and any non-constant polynomial will work just as well.)

More precisely, I will give a proof which is — except for its terseness — essentially complete for $r$ and $s$ of special type, and we anticipate only technical adjustments to cover the general case — we will write this down carefully of course.

Before starting, a natural question may come to mind: given that qui peut le plus, peut le moins, can one give an analogue result for the usual divisor function? Recall that, for the latter, the (individual) exponent of distribution has been known to be at least $2/3$ for a long time (by work of Linnik and Selberg independently, both using the Weil bound for Kloosterman sums.) This exponent has not been improved, even on average over $q$ (although Fouvry succeeded, on average over $q$, in covering the range $X^{2/3-\epsilon}\leq q\leq X^{1-\epsilon}$) despite much effort. However, Fouvry and Iwaniec (with an Appendix by Katz to treat yet another complete exponential sum over finite fields) proved already twenty years ago that one could improve it to $2/3+1/48$ if one averages for a fixed $r \leq X^{3/8}$ over special moduli $q=rs$ with $rs^2\leq X^{1-\epsilon}$ — this gives in particular a nice earlier illustration of the usefulness of factorable moduli for this type of questions.

So, to work. For fixed $r$ and $s$, we begin by applying the Poisson summation formula to the three “smooth” variables $n_i$ (the smoothing is hidden in the notation $n_i\sim N_i$); the simultaneous zero frequencies $(h_1,h_2,h_3)=(0,0,0)$ give the main term, as they should, and the other degenerate cases are easier to handle than the contribution of the non-zero $h_i,$ so the main secondary term for a given $q$ and $a$ is given by
$S_2=\frac{N}{q^2}\sum_m \alpha_m\sum_{1\leq |h_i|\\ H_i} K_3(a\bar{m}h_1h_2h_3;q),$
where the dual lengths are $H_i=Q/N_i$, so that the total number of frequencies is $H_1H_2H_3=Q^3/N$, and where $K_3$ is a normalized hyper-Kloosterman sum modulo $rs$:
$K_3(u;q)=\frac{1}{q}\sum_{xyz=u}\psi_q(x+y+z),\quad\quad \psi_q(x)=e(x/q).$

(Below I will usually not repeat the range of summations when they are unchanged from one line to the next.)

Now we sum over $r$ and $s$, move the sum over $m$ outside, and apply the Cauchy-Schwarz inequality to the $r$ sum, for a fixed $(s,a_s,m)$, where $a_s$ is $a$ modulo $s$. To prepare for this step, we use the Chinese Remainder Theorem to split the condition $P(a)=0\bmod rs$, and to factor the hyper-Kloosterman sum as a sum modulo $r$ times one modulo $s$.

The contribution of a fixed $(s,a_s,m)$ is
$T=\frac{1}{R}\sum_{r}\sum_{P(a_r)=0\bmod{r}}\Bigl|\sum_{h_i}K_3(a\bar{m}h_1h_2h_3;rs)\Bigr|$
and we can bound
$T\ll H^{1/2+\epsilon} U^{1/2},$
where
$U=\frac{1}{R^2}\sum_{r_1,r_2} \lambda(r_1,a_1)\lambda(r_2,a_2) \sum_{1\leq |h|\ll H} \overline{K_3(a_1\bar{s}^3\bar{m}h;r_1)}K_3(a_2\bar{s}^3\bar{m}h;r_2)\overline{K_3(a_s\bar{r}_1^3\bar{m}h;s)}K_3(a_s\bar{r}_2^3\bar{m}h;s)$
for some coefficients $\lambda(\cdot,\cdot)$ which are bounded.

The point of this is that we have smoothed the variable $h=h_1h_2h_3$ by eliminating its multiplicity, and that the range of this variable can be quite long; as long as $H>(R^2S)^{1/2}$, completing the sum in the Polya-Vinogradov (or Poisson) style will be useful.

Now we continue with $U$. It is here that it simplifies matters to have $R and to do as if $r$ and $s$ were primes (this is a technicality which experience shows should give no loss in the final, complete, analysis.)

So we consider the sum over $r_1, r_2$. The diagonal contribution where $r_1=r_2$ is $\ll H^{1+\epsilon}R^{-1+\epsilon}$.

In the non-diagonal terms, we distinguish whether $r_1\equiv r_2\bmod s$ or not. If not, we complete the two hyper-Kloosterman sums. This gives two complete exponential sums modulo $r_1, r_2$ and modulo $s$. The latter is the Friedlander-Iwaniec sum, in its incarnation as "Borel" correlations of hyper-Kloosterman sums (see the remark at the end of our note on these sums; this identification is already in Heath-Brown's paper, and Philippe realized recently that he had also encountered them in a paper on lower bounds for exponential sums.)

Both sums give square root cancellation (using Deligne’s work, of course) except for the $s$ sum if $r_1^3\equiv \pm r_2^3\bmod s$. But we may just push these to the second case. Thus, the contribution $U_0$ of these non-exceptional terms gives
$U_0\ll (R^2S)^{1/2+\epsilon}\Bigl(\frac{H}{R^2S}+1\Bigr)$
(counting the number of complete sums in the $h$-interval).

On the other hand, the exceptional $(r_1,r_2)$ are still controlled by the diagonal terms (because of the condition $R; there is a minor trick involved here if the cubic roots of unity exist modulo $s$, but I'll gloss over that.)

Now we can gather everything, and one checks that we end up with a bound
$\sum_{r,s}\sum_a S_2 \ll X^{\epsilon} QM\Bigl(\frac{1}{R^{1/2}}+\frac{R^{1/4}N^{1/2}}{Q^{5/4}}\Bigr).$

We need this to be $\ll MNQ^{-1} (\log MN)^{-A}$ for any $A$, and we see that we succeed as long as
$Q^2R^{-1/2}
as stated in the theorem (the second condition is implied by the first if $R). And as mentioned just afterwards, if we have $R=Q^{1/2-\epsilon}$, this gives a good distribution up to $X^{-\epsilon}N^{4/7}$ (note that epsilons may change from one inequality to the next.)

Remark. As the reader can see, we do not use either Weyl shifts, or cancellation in Ramanujan sums. The latter might appear in a more precise analysis, however, and give some extra gain.

June 25th, 2013 at 9:06 pm

Posted in Mathematics

## Bounded gaps between primes: the dawn of (some) enlightenment

Thanks to the recent conference celebrating the 25th anniversary of the Zahlentheorie Seminar, and even more as this week’s conference for Fouvry’s 60th Birthday in Marseille, I have been able to talk with a number of people about Zhang’s result on bounded gaps between primes, especially Philippe Michel, Paul Nelson and Étienne Fouvry. All generic “our”s and “we”s below refer to these discussions.

We have concentrated exclusively on the critical “ternary-divisor” part of the argument, and attempted many variants and reconfigurations. Our goal is not really to do better than Zhang (in terms of exponents), but to see by direct experience what works and what doesn’t.

Quite a few of these attacks failed, which of course helps in building some understanding. But two or three directions are promising, and one definitely does seem to lead to a different version of the argument. We have not yet written full details, but the approach succeeds (i.e., it definitely breaks the barrier $1/2$ of the Riemann Hypothesis, which is the whole game) in getting an exponent of distribution $4/7$ for the ternary divisor function with nicely factorable moduli, at a level of informality which inspires great confidence. (When applied in the context of my previous papers with Fouvry and Michel, this type of sketches give fully consistent exponents and uniformity in comparison with the final papers.)

Here are just some remarks. We view the basic problem as understanding

$\sum_r\sum_s \sum_{mn_1n_2n_3\equiv a \bmod{rs}} 1$

where $q=rs$ is the squarefree modulus, which should therefore (to beat RH) be of size a bit larger than $X^{1/2}$ (by a small positive power of $X$), and $m$ is fixed (though one can try to exploit average over it, and it will be important in the end that it be relatively small). In fact, quite a few attempts purposely express this as a special case of

$\sum_r\sum_s \frac{1}{\sqrt{rs}}\sum_{n} d_3(n) K(n)$

for some general “trace function” (these are more general than usually discussed, both because the modulus is not prime, and because the $L^2$-normalized characteristic function of a residue class modulo a prime is only the trace function of a perverse sheaf, and not a standard lisse sheaf, but that last point is not particularly problematic). The idea is that we want to exploit the general context and insights that we have developped about these objects.

In contrast to what I mentioned in my last post, it does seem that the factorization of moduli in the ternary case is crucial, but the cancellation from Ramanujan sums might not be (although of course extra cancellation can not hurt…)

On the other hand, in trying to work with a generalish $K$, we understand now this extra cancellation at a higher level: it is not a super-specific fact about Ramanujan sums, but the latter is a concrete illustration of a fundamental phenomenon concerning trace functions of the so-called (pointwise pure) “middle-extension” sheaves, due to Deligne: at a singularity of such a sheaf, the Frobenius eigenvalues have lower weights than at “generic” points (see Lemma 1.8.1 in Weil II), and often strictly lower. (On the automorphic side, it corresponds to the well-known fact that the Satake parameters at a ramified prime are smaller in absolute value than the Ramanujan-Petersson bound at the unramified primes.)
In this picture, the Ramanujan sums correspond to the singularity at $0$ of the basic Kloosterman sheaf. (Of course one doesn’t need Deligne to estimate Ramanujan sums, but we can now confidently play complicated games with the trace functions without being afraid of having lost a very special property of these particular sums…)

The working technique also shows that one can argue without the Weyl-type shifts in the original paper — this was not entirely surprising since Heath-Brown’s improvement of the Friedlander-Iwaniec exponent for the bare $d_3(n)$ (and also ours) do not involve such shifts.

Finally, all our current attempts to play games to avoid Deligne-level estimates for exponential sums have hit barriers…

There is of course still much to be done.

June 22nd, 2013 at 8:47 am

Posted in Mathematics