
Alas, poor Yorick…

I happen to be passing through Paris this evening, almost by chance, and I have just read that Patrice Chéreau has died. There are few occasions I remember as vividly as seeing his production of Hamlet, long ago, in Grenoble: “Good night, sweet prince // And flights of angels sing thee to thy rest!”

Our research institute has a nicer logo than yours

Here is the logo of the new Institute for Theoretical Studies of ETH:

ITS Logo


Conductors of one-variable transforms of trace functions

In one of the recent posts by T. Tao on the progress of the Polymath8 project, the question arose of whether such functions as
\varphi(x)=\sum_{y\in\mathbf{F}_p} e\Bigl(\frac{f(x,y)}{p}\Bigr)
defined for x\in\mathbf{F}_p and a rational function f\in\mathbf{F}_p(X,Y), are trace functions, and more importantly, what is their conductor (see this, and the following, comments). In particular, if f is obtained by reduction modulo primes of a fixed rational function in \mathbf{Q}(X,Y), one could expect that the answer is “Yes”, with a bound for the conductor independent of p.

This is in fact a question that É. Fouvry, Ph. Michel and I had considered in special cases (or variants) in some of our papers. In fact, it is natural to consider more generally the linear map sending a function K\,:\, \mathbf{F}_p\rightarrow \mathbf{C} to the function
\varphi(x)= T_{f}(K)(x)=\sum_{y\in\mathbf{F}_p} K(y) e\Bigl(\frac{f(x,y)}{p}\Bigr),
(which is an analogue of an integral operator) and to ask whether this linear map sends trace functions to trace functions, and if so, whether the conductor of \varphi is bounded in terms of the conductor of K and the degrees of the numerator and denominator of f.
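To make the definition concrete, here is a minimal brute-force sketch in Python (it is not taken from the paper). It assumes that f is a polynomial with integer coefficients, so that no poles need to be handled, and the names `e_p` and `transform` are purely illustrative.

```python
import cmath
import math

def e_p(a, p):
    """Additive character e(a/p) = exp(2*pi*i*a/p)."""
    return cmath.exp(2j * cmath.pi * (a % p) / p)

def transform(K, f, p):
    """Return the list [T_f(K)(x) for x in F_p], where
    T_f(K)(x) = sum over y in F_p of K(y) * e(f(x, y)/p).

    K: function on residues 0..p-1 with complex values (illustrative input).
    f: two-variable integer polynomial, given as a Python function (assumption:
       no denominators, so f is defined at every point).
    """
    return [sum(K(y) * e_p(f(x, y), p) for y in range(p)) for x in range(p)]

p = 101

# Fourier transform kernel f(X, Y) = XY applied to the constant function 1:
# the result is p at x = 0 and 0 elsewhere.
fourier_of_one = transform(lambda y: 1, lambda x, y: x * y, p)

# Kernel f(X, Y) = X*Y^2 + Y applied to 1: for x != 0 this is a quadratic
# Gauss sum, of modulus exactly sqrt(p).
gauss_type = transform(lambda y: 1, lambda x, y: x * y * y + y, p)
```

The second example is the simplest instance of the square-root cancellation that one hopes for in general: the values have modulus \sqrt{p}, uniformly in x\not=0.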

The most important case, which is crucial in all our series of papers, is the Fourier transform, which corresponds simply to f(X,Y)=XY. We proved the desired property (which we view, rather naturally I think, as a form of “continuity” of the Fourier transform, in an algebraic sense) in that case using Deligne’s definition and Laumon’s study of the sheaf-theoretic Fourier transform. Most importantly, in order to estimate the conductor of the Fourier transform of a trace function, we used Laumon’s theory of the local Fourier transform, which is rather deep.

It is by no means clear (and this is a rather interesting question of algebraic geometry!) that such a local theory exists, and leads to the continuity property, for arbitrary “kernels” e(f(x,y)/p). However, we figured out a way to prove the conductor bound that bypasses these local results. It applies to the Fourier transform, and although it is then much less precise than what is obtained from Laumon’s theory, it gives a (possibly) more accessible proof of the continuity property.

We have just put here our preprint with this result. It is not yet submitted to arXiv because we are considering various possibilities for either extensions or applications, but the proof of the main result is complete.

The paper was rather interesting to write. On the one hand, it turns out that it is not needed for the original question from Polymath8: Philippe found an elementary argument which reduces the specific example of that problem, essentially, to a special case of the Fourier transform (this is written down in the Deligne section of the Polymath8 paper). On the other hand, although we had thought a bit about this question beyond the Fourier transform, we had not made progress. The reason, in retrospect, is that in order to treat the general transform
T_{f}(K)(x)
(as defined above), we begin by treating the special case
T_f(1)(x)=\sum_{y\in\mathbf{F}_p} e\Bigl(\frac{f(x,y)}{p}\Bigr),
and without the motivation from the Polymath8 project, we had not thought of this first step.

The reason that things work this way is, like all the other main ideas in this paper, very easy to explain by writing down and manipulating sums, and assuming that those always behave as the best form of the Riemann Hypothesis over finite fields suggests. But the actual arguments are purely algebraico-geometric, and we end up using quite a bit of the general formalism of étale cohomology, but not the Riemann Hypothesis (which is morally as things should be.)

I will give here the informal sketch: first of all, the function
\varphi(x)= T_{f}(K)(x)=\sum_{y\in\mathbf{F}_p} K(y) e\Bigl(\frac{f(x,y)}{p}\Bigr),
is indeed a trace function if K is one, in almost all cases: precisely, the existence of higher-direct image sheaves with compact support, the proper base change theorem, and the Grothendieck trace formula, show that
\varphi(x)=t_0(x)-t_1(x)+t_2(x),
where t_i is the trace function of the sheaf
R^ip_{1,!} (p_2^*\mathcal{F}\otimes \mathcal{L}),
where p_1,\ p_2 are the two projections (x,y)\mapsto x and (x,y)\mapsto y, the sheaf \mathcal{F} is the one with trace function K, and the sheaf \mathcal{L} is the Artin-Schreier sheaf on the plane with trace function
t_{\mathcal{L}}(x,y)=e(f(x,y)/p)
for x,\ y\in\mathbf{F}_p. In many cases, both t_0 and t_2 vanish, and then \varphi is (minus) the trace function of the sheaf
\mathcal{G}=R^1p_{1,!} (p_2^*\mathcal{F}\otimes \mathcal{L}).

This essentially answers the first question in full generality: the “transform” K\mapsto T_f(K) maps trace functions to trace functions.

Now consider the conductor of the sheaf \mathcal{G} above. We defined it as the sum of three terms. Two are relatively accessible: they are the (generic) rank of the sheaf, and the number of singularities. We can basically determine these if we know the dimension of all fibers: the maximum dimension gives an upper bound for the rank, and a result of Deligne says (roughly) that the singularities are the points where the fiber has smaller than maximal dimension. So let us assume that we can bound these quantities for the transform sheaf \mathcal{G}. Then there only remains to estimate the third term, which is the sum of the Swan conductors at the singularities. This is rather delicate, at least for people for whom (and this applies to us…) the Swan conductor remains a rather mysterious and subtle invariant.

The first idea is that, assuming the other two pieces of the conductor are under control, the sum of the Swan conductors is bounded by a global invariant that may be accessible in applications. Namely, if a sheaf \mathcal{M} is lisse on a dense open subset U of the affine line then the Euler-Poincaré characteristic formula (of Néron, Ogg, Shafarevitch) easily proves that
\sum_x \mathrm{swan}_x(\mathcal{M})\ll \text{(rank of }\mathcal{M}\text{)}+ \text{(nb. of sing.)} + \dim H^1_c(\bar{U},\mathcal{M}).
So we have to deal with the dimension of the cohomology group H^1_c(\bar{U},\mathcal{M}). The point is that, in good circumstances, and especially if \mathcal{M} is of weight 0, we can expect that
\dim H^1_c(\bar{U},\mathcal{M})=\limsup p^{-n/2} |S_n|,
where S_n is the sum of the trace function of \mathcal{M} over the points of U(\mathbf{F}_{p^n}). Estimating this becomes a problem of analytic number theory, and we may hope to succeed.

For instance, if we apply this principle to
\mathcal{M}=R^1p_{1,!}\mathcal{L},
with \mathcal{L} the Artin-Schreier sheaf associated to a rational function f as before, the normalized sum is simply
p^{-n/2}S_n=\frac{1}{p^{n/2}}\sum_{x\in U(\mathbf{F}_{p^n})}\sum_{y\in\mathbf{F}_{p^n}}e\Bigl(\frac{\mathrm{Tr}_nf(x,y)}{p}\Bigr),
where \mathrm{Tr}_n is the trace from \mathbf{F}_{p^n} to \mathbf{F}_p.

In good circumstances, we know square-root cancellation for this two-variable character sum, and we obtain a bound for the limsup of p^{-n/2}S_n, which depends only on the degree of the numerator and denominator of f, using Bombieri’s bounds for sums of Betti numbers for such exponential sums (or the generalizations of Adolphson-Sperber, or those of Katz.)

This deals (optimistically) with the transform with kernel \mathcal{L} when the input sheaf is the trivial sheaf, which we note is the case in the Polymath8 case. Now for the second idea: assume we consider
\mathcal{M}=R^1p_{1,!}(p_2^*\mathcal{F}\otimes\mathcal{L})
now, and try to use the same principle. With K denoting the trace function of \mathcal{F}, the sum S_n now satisfies
p^{-n/2}S_n=\frac{1}{p^{n/2}}\sum_{x\in U(\mathbf{F}_{p^n})}\sum_{y\in \mathbf{F}_{p^n}} K(y)e\Bigl(\frac{\mathrm{Tr}_nf(x,y)}{p}\Bigr).

In impeccable style, we exchange the two sums of course. We get
 p^{-n/2}S_n=\frac{1}{p^{n/2}}\sum_{y\in \mathbf{F}_{p^n}} K(y) \sum_{x\in U(\mathbf{F}_{p^n})}e\Bigl(\frac{\mathrm{Tr}_nf(x,y)}{p}\Bigr)=\frac{1}{p^{n/2}}\sum_{y\in \mathbf{F}_{p^n}} K(y)L(y)
where
L(y)=\sum_{x\in U(\mathbf{F}_{p^n})}e\Bigl(\frac{\mathrm{Tr}_nf(x,y)}{p}\Bigr).
But, by the first step, applied to f^*(X,Y)=f(Y,X) instead of f(X,Y), the function L is a trace function with conductor bounded in terms of the degrees of f^*, or equivalently of f. Thus p^{-n/2}S_n is the inner-product, over \mathbf{F}_{p^n}, of the trace functions of two sheaves with bounded conductor, and we can expect both to have weight 0. We can then expect quasi-orthogonality from the Riemann Hypothesis, and a resulting bound for the limsup that depends only on the conductors of these two sheaves, i.e., on the conductor of \mathcal{F} (for K) and on the degrees of the numerator and denominator of f (for L). This is the desired conclusion.

This sketch explains why we can prove the results in our paper. In many cases, it is certainly a valid reasoning, but it is not easy to make it rigorous in great generality. The basic problems are that it depends on the sums S_n having square-root cancellation (which for the transform of the trivial sheaf is a non-trivial assumption), and also on S_n detecting all of the cohomology space, and not just the part of weight 1: by Deligne’s Riemann Hypothesis, the eigenvalues of Frobenius on H^1_c(\bar{U},\mathcal{G}) are of weight \leq 1 if \mathcal{G} has weight 0, but the limsup only gives the number of eigenvalues of weight 1, and having too many smaller eigenvalues would create problems.

We work around these possible difficulties by dropping the diophantine motivation, and going straight at the dimension of H^1_c(\bar{U},\mathcal{G}). To do this, we need algebraic analogues of the two fundamental analytic steps we used:

(1) expressing the sum S_n for the trivial sheaf as a two-variable character sum;

(2) exchanging the order of the two sums when inserting a general input sheaf \mathcal{F}.

Both of these are replaced by (very elementary) arguments with Leray spectral sequences. This is a relatively well-known idea (it is part of the “dictionary” in Deligne’s survey on Sommes trigonométriques in SGA 4 ½ and quite a few concrete examples are found in papers and books of Katz), but it is the first time we use it ourselves. I will survey and explain this in a later post, since it seems that a good concrete example of the use of spectral sequences in analytic number theory might be a useful thing to have somewhere…

The reader who opens the PDF file of our preprint might be surprised to see that the paper is more than thirty pages long, in comparison with the rather simple-looking discussion above. The length is justified partly by the two motivating discussions we have included (the diophantine argument with S_n, and a self-contained algebraic treatment of the important case of the Fourier transform). But it also turns out that taking care of the “easy” parts of the conductor requires somewhat lengthy elementary arguments with rational functions. Most importantly maybe, we must take into account the fact that, in contrast with our previous works, we now have to handle general constructible \ell-adic sheaves, and not only middle-extension sheaves: there is no reason for our transformed sheaves to be so well-behaved in general. This requires adding a further component to the conductor (roughly, the support and dimension of the fibers of the “punctual part” of the sheaf, e.g., the conductor of a sheaf supported at 0 with fiber of dimension n\geq 1 must increase with n), and we also need to control it before applying the previous ideas. We also prove, both as a useful tool and as a by-product, the analogue of the Bombieri bounds for a general input sheaf \mathcal{F}: the Betti numbers
\dim H^i_c(\mathbf{A}^2\times\bar{\mathbf{F}}_p,p_2^*\mathcal{F}\otimes \mathcal{L})
are bounded in terms of the conductor of \mathcal{F} and the degree of the numerator and denominator of f.

Sliding over the Polya-Vinogradov gap

In my series of papers with É. Fouvry and Ph. Michel, we seem to alternate between longer papers and shorter ones. The last one, which we just put up on arXiv, is in some sense the shortest one: even if it goes up to 19 pages in length, the basic idea can be explained extremely quickly, and much of the paper is taken up with variations on its basic theme and illustrations.

The context is the all-important problem of estimating short exponential sums, in the specific case of sums over intervals in a cyclic group A=\mathbf{Z}/m\mathbf{Z}, where (for the sake of this post) we define an interval I in A to be the injective image of a set of successive integers under reduction modulo m. An interval is “short” if its length is significantly smaller than m, in the sense that |I|=m^{\theta} for some real parameter with 0\leq \theta<1. We are then looking for non-trivial estimates for
\sum_{x\in I}\varphi(x)
where \varphi\,:\, A\rightarrow \mathbf{C}
is some complex-valued function, which is supposed to oscillate in such a way that we expect that
\sum_{x\in I}\varphi(x)=o(|I|\|\varphi\|_{\infty}).

There is a fundamental technique to attack this problem, which is of constant use in analytic number theory, and which is known as the completion method. Abstractly, one can see it as a case of the Plancherel formula in (discrete) Fourier analysis, namely, one writes
 \sum_{x\in I}\varphi(x)=\sum_{t\in A}\hat{\varphi}(t)\hat{I}(t),
where
\hat{\varphi}(t)=\frac{1}{\sqrt{m}}\sum_{x\in A}\varphi(x)e\Bigl(\frac{tx}{m}\Bigr),
and \hat{I} is the same transform applied to the characteristic function of the interval I. One then deduces the bound
\sum_{x\in I}\varphi(x)\ll \|\hat{\varphi}\|_{\infty}\sqrt{m}(\log m),
where the implied constant is absolute, simply by estimating the L^1-norm of \hat{I} — in our normalization, this is \ll \sqrt{m}(\log m).

In many cases, the Fourier transform of \varphi can be estimated quite well, and in particular, in the context of finite fields A=\mathbf{F}_p, if \varphi is a trace function which is not proportional to an additive character e(x/p), Deligne’s Riemann Hypothesis gives
\|\hat{\varphi}\|_{\infty}\ll 1,
where the implied constant depends only on the conductor of (the sheaf underlying) \varphi.

The resulting estimate
\sum_{x\in I}\varphi(x)\ll \sqrt{m}(\log m)
is known as the general Polya-Vinogradov bound (although the name is sometimes restricted to the case where \varphi=\chi is a Dirichlet character modulo m.)
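As a quick numerical illustration of the classical case \varphi=\chi, the following Python sketch (my own, not from the paper) computes the largest initial-segment sum of the Legendre symbol modulo a prime and compares it with \sqrt{p}(\log p). The choice p=10007 and the function names are arbitrary; only intervals starting at 1 are scanned, and a general interval sum is a difference of two such sums.

```python
import math

def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, via Euler's criterion."""
    r = pow(x % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def max_initial_sum(p):
    """Largest value of |sum_{1 <= x <= N} (x/p)| over all N < p."""
    best = 0
    partial = 0
    for x in range(1, p):
        partial += legendre(x, p)
        best = max(best, abs(partial))
    return best

p = 10007                                # a prime, chosen arbitrarily
largest = max_initial_sum(p)
polya_vinogradov = math.sqrt(p) * math.log(p)
print(largest, polya_vinogradov)
```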

This is quite an efficient result: it gives non-trivial estimates for all intervals I such that \sqrt{p}(\log p)=o(|I|) (or even |I|=c\sqrt{p}(\log p), for a suitable c>0, if one only desires some cancellation by a multiplicative constant, and not that the sum be of smaller order of magnitude.) With this generality, it is also almost best possible: if we take \varphi(x)=e(x^2/p), then the sum over 1\leq x\leq \sqrt{p} has essentially no cancellation (the phase does not have enough time to turn around sufficiently.)
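The lack of cancellation for \varphi(x)=e(x^2/p) is easy to observe numerically. The sketch below (an illustration of mine, with an arbitrary prime) computes the ratio between the modulus of the sum over 1\leq x\leq \sqrt{p} and the number of terms; the ratio stays bounded away from 0, since the sum is essentially a Riemann sum for the integral of e(t^2) over [0,1].

```python
import cmath
import math

def short_quadratic_sum(p):
    """Return (n, s) with n = floor(sqrt(p)) and s = sum_{1 <= x <= n} e(x^2/p)."""
    n = math.isqrt(p)
    s = sum(cmath.exp(2j * cmath.pi * (x * x % p) / p) for x in range(1, n + 1))
    return n, s

n, s = short_quadratic_sum(10007)
ratio = abs(s) / n   # fraction of the trivial bound n that survives
print(n, ratio)
```

For comparison, a sum of n random phases would typically give a ratio of size about 1/\sqrt{n}.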

It seems natural, then, to ask: is the Polya-Vinogradov range (where I is a bit larger than \sqrt{p}(\log p)) best possible? Does the gap between \sqrt{p} and \sqrt{p}(\log p) really represent a different behavior for such generalized exponential sums?

Our paper answers this question quite satisfactorily: there is, in fact, no gap, in the sense that for trace functions (not proportional to an additive character), one can get cancellation in sums over intervals of length \sqrt{p}\beta(p) for any function \beta(p) tending to infinity with p (assuming p tends to infinity while the conductor remains bounded.)

As a direct corollary, we get for instance the equidistribution in [0,1] (with respect to Lebesgue measure) of fractional parts \{\frac{f(n)}{p}\} for 1\leq n\leq \sqrt{p}\beta(p), for any fixed polynomial f\in \mathbf{Z}[X] of degree \geq 2, and the equidistribution with respect to the Sato-Tate measure of the angles of Kloosterman sums \theta_p(x) such that
\frac{1}{\sqrt{p}}\sum_{y\in\mathbf{F}_p^{\times}}e\Bigl(\frac{xy+y^{-1}}{p}\Bigr)=2\cos\theta_p(x),
again for 1\leq x\leq \sqrt{p}\beta(p), whenever \beta(p)\rightarrow +\infty. (This last result had been proved in the Polya-Vinogradov regime by Philippe a while ago, and in the full range 1\leq x\leq p-1, it is a celebrated result of Katz.)
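For readers who want to experiment, here is a small Python sketch (not from the paper) that computes the normalized Kloosterman sums and their angles by brute force. The function names are illustrative, and the modular inverse via `pow(y, -1, p)` assumes Python 3.8 or later.

```python
import cmath
import math

def normalized_kloosterman(x, p):
    """Return (1/sqrt(p)) * sum over y in F_p^* of e((x*y + y^{-1})/p)."""
    total = sum(
        cmath.exp(2j * cmath.pi * ((x * y + pow(y, -1, p)) % p) / p)
        for y in range(1, p)
    )
    return total / math.sqrt(p)

def kloosterman_angle(x, p):
    """Return theta in [0, pi] with normalized sum = 2*cos(theta).

    The sum is real (replacing y by -y conjugates each term), and the Weil
    bound gives |normalized sum| <= 2 for x nonzero mod p; the clamp only
    guards against floating-point rounding.
    """
    value = normalized_kloosterman(x, p).real
    return math.acos(max(-1.0, min(1.0, value / 2)))

p = 101
angles = [kloosterman_angle(x, p) for x in range(1, p)]
```

Plotting a histogram of `angles` for x in a short initial segment, with p large, is one way to visualize the Sato-Tate density (2/\pi)\sin^2\theta.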

As I mentioned, the basic method is very easy, and we call it the “sliding sum method”. It is reminiscent of the van der Corput inequality, and we wouldn’t be surprised to learn that it had already been established by other people. (The applicability to trace functions, on the other hand, requires some rather deep results, as I will discuss below.)

I will give the simplest variant. Consider an interval I in A=\mathbf{Z}/m\mathbf{Z}. We assume that |I|=\sqrt{m}\beta with \beta=\beta(I)\geq 1 (so we are certainly above the square-root length). We then compare upper and lower bounds for the average
\Sigma=\sum_{a\in A}{\Bigl|\sum_{x\in I}{\varphi(x+a)}\Bigr|^2}.

For the upper-bound, we expand the square and exchange the order of summation, obtaining
\Sigma=\sum_{x,y\in I}C(\varphi,x-y)
where
C(\varphi,t)=\sum_{a\in A}\varphi(a)\overline{\varphi(a+t)}
is an additive correlation sum of \varphi (it is a special case of those correlation sums we have considered extensively in our first works.)
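The identity between \Sigma and the double sum of correlations is exact, and can be checked numerically. The following Python sketch (my own illustration) does this for \varphi(x)=e(x^3/p), a test function that I chose because its correlation sums C(\varphi,t), for t\not=0, are quadratic Gauss sums of modulus exactly \sqrt{p}; the modulus and the interval are arbitrary.

```python
import cmath
import math

def correlation(phi, t, m):
    """C(phi, t) = sum over a in Z/mZ of phi(a) * conj(phi(a + t))."""
    return sum(phi[a] * phi[(a + t) % m].conjugate() for a in range(m))

def sliding_average(phi, interval, m):
    """Sigma = sum over a in Z/mZ of |sum_{x in I} phi(x + a)|^2."""
    return sum(
        abs(sum(phi[(x + a) % m] for x in interval)) ** 2 for a in range(m)
    )

m = 101                                   # a prime modulus (arbitrary choice)
phi = [cmath.exp(2j * cmath.pi * pow(x, 3, m) / m) for x in range(m)]
interval = range(10, 40)                  # an interval of length 30

corr = [correlation(phi, t, m) for t in range(m)]
sigma = sliding_average(phi, interval, m)
expanded = sum(corr[(x - y) % m] for x in interval for y in interval)
print(sigma, expanded)
```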

We assume that \varphi has the property that, if t is not in a set containing at most c values, we have
|C(\varphi,t)|\leq c \sqrt{m}
(i.e., the correlation exhibits uniform square-root cancellation, except for a few exceptional cases, among which of course we expect to have t=0). Then, using the trivial bound
|C(\varphi,t)|\leq \|\varphi\|_{\infty}^2m
for the exceptional values of t=x-y, we get
\Sigma\ll m|I|+m^{1/2}|I|^2\ll m^{1/2}|I|^2,
where the implied constant depends on the parameter c that we just introduced, and on the L^{\infty}-norm of \varphi (and we use the fact that |I|\geq \sqrt{m}.)

As for the lower bound, we just use the fact that if we shift I by a small amount, the sum over a+I does not change too much (because the interval and its shift overlap significantly; intuitively, we slide the sum over a certain range, hence our name for the method):
\Bigl|\sum_{x\in I}\varphi(x)-\sum_{x\in I}\varphi(x+a)\Bigr|\leq 2|a|\|\varphi\|_{\infty}.

This means that if we consider the sums shifted by a of size up to (roughly) the sum
S=\sum_{x\in I}\varphi(x)
itself, these will have a contribution to \Sigma of size roughly \gg |S|^3. So by comparing the upper and lower bounds (using positivity), we get
|S|^3\ll m^{1/2}|I|^2,
or, in terms of the parameter \beta\geq 1, we get
S\ll |I|^{2/3}m^{1/6}=|I|\beta^{-1/3}
(where the implied constant depends on c and on \|\varphi\|_{\infty}).
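For completeness, here is one way to spell out the lower bound for \Sigma. The constants 4 and 16 are just a convenient choice for this sketch, and we assume that \varphi is not identically zero.

```latex
\begin{align*}
|a|\leq T:=\frac{|S|}{4\|\varphi\|_{\infty}}
 &\implies
 \Bigl|\sum_{x\in I}\varphi(x+a)\Bigr|
 \geq |S|-2|a|\|\varphi\|_{\infty}\geq \frac{|S|}{2},\\
\Sigma
 &\geq \sum_{|a|\leq T}\Bigl|\sum_{x\in I}\varphi(x+a)\Bigr|^2
 \geq T\cdot \frac{|S|^2}{4}
 =\frac{|S|^3}{16\|\varphi\|_{\infty}}.
\end{align*}
```

Here we used positivity to drop all other shifts, and the fact that there are at least T integers a with |a|\leq T; these are distinct modulo m because T\leq |I|/4<m/2. Comparing with the upper bound \Sigma\ll m^{1/2}|I|^2 gives |S|^3\ll m^{1/2}|I|^2.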

Hence, provided we work with functions with bounded L^{\infty}-norm and with uniformly small correlations in the sense we described, we get cancellation as soon as \beta(I)\rightarrow +\infty, even arbitrarily slowly.

This is elementary, but now the really significant part is that if \varphi is a trace function modulo m=p, and if it is associated to an irreducible sheaf, and is not proportional to an additive character, then it does have the desired properties, the parameter c being bounded in terms of the conductor c(\varphi) of \varphi (which also satisfies \|\varphi\|_{\infty}\leq c(\varphi), so that it serves as a universal complexity parameter, as it did in our previous works.)

We do not need to fight very hard to prove this in our new paper, but this is because we can quote and use various lemmas from the previous ones, and — as usual — rely on Deligne’s fundamental version of the Riemann Hypothesis over finite fields; the result itself is undoubtedly quite deep when applied to trace functions which are not of rank 1.

As a last remark: the method is robust, and we adapt it for instance to sums over proper generalized arithmetic progressions, which are natural generalizations of intervals. Here we get bounds of quality depending also on the dimension of the progression (the saving \beta^{-1/3} is replaced by \beta^{-1/(k+2)} for a k-dimensional progression), but the result is appealing even for large k because the Polya-Vinogradov gap is
\sqrt{m}\leq |B|\leq \sqrt{m}(\log m)^k
for a k-dimensional progression B\subset \mathbf{Z}/m\mathbf{Z}, hence its size also increases with k (this follows from a recent result of X. Shao bounding the L^1-norm of the Fourier transform of the characteristic function of B.)

Fouvry 60: pictures at a conference

I more or less designated myself as the official photographer of the Fouvry60 conference, and took many pictures, quite a few of which turned out rather well. Certainly, when I view them, I think they give an accurate impression of the atmosphere of the conference. In this post I will just preview a very small selection — I and the other organizers are attempting to find the best way to make the full set available for the participants (at least.)


Jean-Marc Deshouillers, Henryk Iwaniec and John Friedlander

Although Jean-Marc Deshouillers could only attend the first day of the meeting, I was very happy to be able to take a good picture to remind us all of Kloostermania.

Harald Helfgott


Voted best-dressed speaker, and speaking about the ternary Goldbach problem, here is Harald Helfgott.

Mise en abîme


The picture for the poster was taken by C.J. Mozzochi around 1999 in Princeton.

Karim Belabas and Henri Cohen


Any time we use Pari/GP for number-theoretic computations, we can thank Karim and Henri (and a few others) for their work; Henri is currently working on a new modular forms script for Pari…

A friendly visitor


Philippe Michel


Étienne Fouvry


Henryk Iwaniec


Henryk Iwaniec presented his work with Brian Conrey in extending the Levinson-Conrey method to allow long mollifiers.

Cécile Dartyge


Cécile Dartyge, before she presented her very impressive work on the largest prime divisors of values of a polynomial of degree 4.

Tim Browning

“Reach for the skies, punk!”

Tim Browning remembers his first meeting with Fouvry…

Bouillabaisse


Sitting, from left to right, Philippe Michel, Régis (du Moulin) de la Bretèche, Joël Rivat (co-organizers of the meeting) and Étienne Fouvry; on the table, the Bouillabaisse: conger eel, scorpionfish, John Dory, and more besides.

Jürgen Klüners


J. Klüners discussed his work with Fouvry on Cohen-Lenstra heuristics and the negative Pell equation; note the last line of the slide: “I can do this”; when collaborating with Fouvry, you have a good chance of receiving such a reassuring message at a tricky point of the work…