An exponential sum of Conrey and Iwaniec

In their remarkable paper on the third moment of special values of twisted L-functions of modular forms, Conrey and Iwaniec require (among many ingredients!) a best-possible estimate (square-root cancellation) for the exponential sums
S(\chi,\eta)=\sum_{u,v}{\chi(uv(u+1)(v+1))\eta(uv-1)}
where \chi and \eta are non-trivial multiplicative characters modulo a prime p. Using a combination of the formalism of such exponential sums (including the consequences of the rationality of the corresponding L-functions over finite fields and of the Riemann Hypothesis of Deligne) together with some elementary averaging arguments inspired by ideas of Bombieri, and special treatment of the case \chi=\eta, they succeed in proving that
S(\chi,\eta)\ll p,
which is the expected estimate.

I looked at this sum recently because I wondered if (like the other sum of Friedlander and Iwaniec mentioned at the end of my previous post) it would turn out to be a special case of the correlation sums that feature in my recent paper with Fouvry and Michel. As far as I can see, it isn’t of this type, but this led me to attempt to find a proof of the estimate of Conrey and Iwaniec based on the philosophy of reducing a character sum in more than one variable to a one-parameter sum which involves more complicated summands than character values.

This does indeed work. I’ve written a short note explaining this (it is not intended for publication, so rather terse at some points). The outline is rather straightforward: we write
S(\chi,\eta)=\sum_{u}R(u)T(u)
where
R(u)=\chi(u(u+1))
is a character and
T(u)=\sum_{v}{\chi(v(v+1))\eta(uv-1)},
which is a more “complicated” summand, but still a one-variable character sum.
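For readers who like to experiment, here is a minimal numerical sketch of this decomposition in Python. It is only a brute-force check for small p and plays no role in the argument. The helper names (`character`, `S_direct`, `T`, `S_decomposed`) and the indexing of characters by an exponent `a` relative to a primitive root are assumptions made for the illustration.

```python
import cmath

def character(p, a):
    """Table of the character g^k -> exp(2*pi*i*a*k/(p-1)) modulo the odd prime p,
    extended by 0 at 0. It is non-trivial when a is not divisible by p - 1."""
    for g in range(2, p):
        powers = [pow(g, k, p) for k in range(p - 1)]
        if len(set(powers)) == p - 1:  # g generates the multiplicative group
            break
    else:
        raise ValueError("no primitive root found; p should be an odd prime")
    table = [0j] * p
    for k, x in enumerate(powers):
        table[x] = cmath.exp(2j * cmath.pi * a * k / (p - 1))
    return table

def S_direct(p, chi, eta):
    """The two-variable sum S(chi, eta), computed by brute force."""
    return sum(chi[u * v * (u + 1) * (v + 1) % p] * eta[(u * v - 1) % p]
               for u in range(p) for v in range(p))

def T(p, chi, eta, u):
    """The inner one-variable sum T(u)."""
    return sum(chi[v * (v + 1) % p] * eta[(u * v - 1) % p] for v in range(p))

def S_decomposed(p, chi, eta):
    """The same sum written as sum_u R(u) T(u) with R(u) = chi(u(u+1))."""
    return sum(chi[u * (u + 1) % p] * T(p, chi, eta, u) for u in range(p))

if __name__ == "__main__":
    for p in (7, 13, 29):
        chi, eta = character(p, 1), character(p, 2)
        S = S_decomposed(p, chi, eta)
        print(p, abs(S - S_direct(p, chi, eta)), abs(S) / p)
```

The last column printed is |S(\chi,\eta)|/p, which one can watch for a few small primes; this is of course no substitute for the estimate itself.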

It turns out that R(u) and -T(u) are both among these nice algebraic trace functions that occur in the paper with Fouvry and Michel. In particular they have the crucial irreducibility property that makes such functions quasi-orthogonal, which means here that we get
\sum_u R(u)T(u)\ll p
unless \overline{R(u)} and T(u) are proportional for all u. (The size p on the right-hand side comes from the fact that T(u) is of weight 1, i.e., a sum of a bounded number of algebraic numbers of modulus \sqrt{p}; the one-variable sum over u gives square-root cancellation, by the Riemann Hypothesis, for the bounded summands p^{-1/2}R(u)T(u).)

One should feel that it is unlikely that \overline{R(u)} and T(u) are proportional, and indeed it is easy to show during the construction that they are not (the algebraic reason is that one is the trace of a representation of degree 1, whereas the other is a trace of a representation of degree 2…). To finish the proof, one must still be careful to control the implied constant in the estimate above, since it is a priori dependent on the characteristic of the finite field involved. A true dependency like this would be catastrophic, but one can show (in different ways) that this constant is in fact bounded uniformly (the basic reason is that the invariants measuring the complexity of the functions R(u) and T(u) are bounded independently of p).

This argument is quite clean in outline (and requires no separate treatment of special cases like \chi=\eta) but it uses rather deep tools. In addition to three applications of the Riemann Hypothesis (one to understand the summand T(u), another to perform the sum over u, and the last as explained in a moment), there is a fair amount of deep formalism involved in checking that T(u) has the desired properties for a nice algebraic function. There is one last nice ingredient: in order to ensure quasi-orthogonality, one needs to know that T(u) is irreducible in some precise sense. This requirement turns out to translate (using the Riemann Hypothesis one last time) to a nice diophantine property of these sums, namely that
\lim_{|k|\rightarrow +\infty}\frac{1}{|k|^2}\sum_{u}{|T_k(u)|^2}=1,
where k runs over finite extensions of \mathbf{F}_p and, denoting by N the norm from k to \mathbf{F}_p, the sum
T_k(u)=\sum_{v\in k}{\chi(N(v(v+1)))\eta(N(uv-1))},
is the natural companion of T to such extensions. The proof of this limit formula is a nice exercise in manipulating finite sums in the tradition of analytic number theory…
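The limit above is taken over extensions k of a fixed base field, which is awkward to simulate directly. As a rough sanity check only, the following Python sketch computes the same normalized second moment over the base field \mathbf{F}_p itself, for small p. This is an assumption-laden stand-in for the limit and not a proof of anything; the function names and the parametrization of characters by exponents `a` and `b` are hypothetical.

```python
import cmath

def character(p, a):
    """Table of the character g^k -> exp(2*pi*i*a*k/(p-1)) modulo the odd prime p,
    extended by 0 at 0."""
    for g in range(2, p):
        powers = [pow(g, k, p) for k in range(p - 1)]
        if len(set(powers)) == p - 1:  # g generates the multiplicative group
            break
    else:
        raise ValueError("no primitive root found; p should be an odd prime")
    table = [0j] * p
    for k, x in enumerate(powers):
        table[x] = cmath.exp(2j * cmath.pi * a * k / (p - 1))
    return table

def T(p, chi, eta, u):
    """The one-variable sum T(u) = sum_v chi(v(v+1)) eta(uv-1)."""
    return sum(chi[v * (v + 1) % p] * eta[(u * v - 1) % p] for v in range(p))

def second_moment_ratio(p, a, b):
    """(1/p^2) * sum_u |T(u)|^2 over the base field, for chi and eta of exponents a and b."""
    chi, eta = character(p, a), character(p, b)
    return sum(abs(T(p, chi, eta, u)) ** 2 for u in range(p)) / p ** 2

if __name__ == "__main__":
    for p in (11, 23, 47):
        print(p, second_moment_ratio(p, 1, 3))
```

In such experiments the ratio stays slightly below 1, with a deficit of order 1/p, which is at least compatible with the limit formula.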

Algebraic twists of modular forms, III

After a small break while traveling, I resume the discussion of my paper with Fouvry and Michel, picking up from here. But first of all, I’m happy to mention two proofs of the counting identities mentioned in my introductory post: one that D. Zywina has sent me, a nice proof based on explicit computations of modular forms and functions, and another that is in fact just an application of a more general result of Deligne and Flicker on local systems on the projective line minus four points (see Section 7 of this preprint).


Now, if memory serves, at the end of the last post, we saw that analytic techniques reduce the study of our sums

S(f;K)=\sum_{n}\rho_f(n)K(n)V(n/p)

to the study of correlation sums

C(K;\gamma)=\sum_{z}\overline{\hat{K}(\gamma\cdot z)}\hat{K}(z),

where \hat{K} is the normalized Fourier transform modulo p.

For these, we need to show that “most” are small in order to conclude. Fixing a parameter M\geq 1, let us therefore define G_{K,M} to be the set of matrices \gamma in \mathrm{PGL}_2(\mathbf{F}_p) such that

|C(K;\gamma)|>Mp^{1/2}.
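To make these definitions concrete, here is a small brute-force sketch in Python. It makes several assumptions that are not fixed by the text above: the sign convention in the Fourier transform, the choice to skip the point z where \gamma\cdot z is infinite, the encoding of \gamma as a tuple (a, b, c, d), and all function names. The sample weight K(n)=e(\bar{n}/p) is used only as a test case.

```python
import cmath

def e(x, p):
    """The additive character x -> exp(2*pi*i*x/p)."""
    return cmath.exp(2j * cmath.pi * (x % p) / p)

def normalized_fourier(K, p):
    """K_hat(z) = p^(-1/2) * sum_x K(x) e(xz/p); the sign convention is an assumption."""
    return [sum(K[x] * e(x * z, p) for x in range(p)) / p ** 0.5 for z in range(p)]

def act(gamma, z, p):
    """gamma . z = (az+b)/(cz+d) for z in F_p, or None if the image is infinity."""
    a, b, c, d = gamma
    den = (c * z + d) % p
    if den == 0:
        return None
    return (a * z + b) * pow(den, -1, p) % p

def correlation_sum(K_hat, gamma, p):
    """C(K; gamma), summing over the z in F_p for which gamma . z is finite."""
    total = 0j
    for z in range(p):
        w = act(gamma, z, p)
        if w is not None:
            total += K_hat[w].conjugate() * K_hat[z]
    return total

def is_bad(K_hat, gamma, p, M):
    """Membership test for G_{K,M}: |C(K; gamma)| > M * sqrt(p)."""
    return abs(correlation_sum(K_hat, gamma, p)) > M * p ** 0.5

def inverse_weight(p):
    """The sample weight K(n) = e(n^{-1}/p), with K(0) = 0 by convention."""
    return [0j] + [e(pow(n, -1, p), p) for n in range(1, p)]

if __name__ == "__main__":
    p = 11
    K_hat = normalized_fourier(inverse_weight(p), p)
    for gamma in [(1, 0, 0, 1), (1, 0, 1, 1), (2, 1, 1, 3)]:
        print(gamma, abs(correlation_sum(K_hat, gamma, p)) / p ** 0.5)
```

For the identity matrix the correlation sum is just the squared norm of \hat{K}, which by the Plancherel formula equals p-1 for this weight, so the identity always belongs to G_{K,M} once p is large compared with M.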

The discussion in the previous post shows that if, for some M\geq 1, all matrices in G_{K,M} are upper-triangular, we get

S(f;K)\ll p^{1-1/8+\epsilon}

for any \epsilon>0. But we can deal with some more exceptions. More precisely, we show that this estimate is still valid, with an implied constant depending only on M, if the matrices in G_{K,M} are all either

  1. parabolic (they have a single fixed point in \mathbf{P}^1), or
  2. upper-triangular, or
  3. elements that either fix or permute two distinct points x and y, where the pair \{x,y\} is taken from a finite list containing at most M pairs.
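The three types of matrices in this list can be told apart by their action on \mathbf{P}^1(\mathbf{F}_p). The following Python sketch does this by brute force; the encoding of a matrix as a tuple (a, b, c, d), the label used for the point at infinity and the function names are all assumptions made for the illustration.

```python
INF = "inf"  # label for the point at infinity of P^1(F_p)

def act(gamma, z, p):
    """Image of z in P^1(F_p) under z -> (az+b)/(cz+d), for gamma = (a, b, c, d)."""
    a, b, c, d = (t % p for t in gamma)
    if z == INF:
        return INF if c == 0 else a * pow(c, -1, p) % p
    den = (c * z + d) % p
    if den == 0:
        return INF
    return (a * z + b) * pow(den, -1, p) % p

def fixed_points(gamma, p):
    """All points of P^1(F_p) fixed by gamma."""
    return [z for z in list(range(p)) + [INF] if act(gamma, z, p) == z]

def is_parabolic(gamma, p):
    """Type (1): exactly one fixed point on P^1(F_p)."""
    return len(fixed_points(gamma, p)) == 1

def is_upper_triangular(gamma, p):
    """Type (2): the lower-left entry vanishes, i.e. gamma fixes infinity."""
    return gamma[2] % p == 0

def preserves_pair(gamma, x, y, p):
    """Type (3): gamma fixes or permutes the two points x and y."""
    return {act(gamma, x, p), act(gamma, y, p)} == {x, y}

if __name__ == "__main__":
    p = 7
    for gamma in [(1, 1, 0, 1), (2, 0, 0, 1), (0, -1, 1, 0)]:
        print(gamma, fixed_points(gamma, p), is_parabolic(gamma, p),
              is_upper_triangular(gamma, p), preserves_pair(gamma, 0, INF, p))
```

For p=7, the translation z\mapsto z+1 is parabolic (and upper-triangular), the dilation z\mapsto 2z fixes both 0 and \infty, and z\mapsto -1/z has no fixed point in \mathbf{P}^1(\mathbf{F}_7) but exchanges 0 and \infty.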

Compared with the previous discussion, the subtlety here is that it can indeed happen that such matrices appear in the transformations we apply to S(f;K), and proving that their contributions remain under control involves some rather fun analysis.

This summarizes, in a rather hurried fashion, the first part of our paper. Logically, we obtain statements which are self-contained, but which are only applicable directly in a few cases (the case K(n)=\chi(n) is an excellent example where this can be done). To go further, we need to use algebro-geometric tools.

And here comes a dilemma well-known to anyone who has had to present research involving two relatively distant areas of mathematics, so that specialists of one may not know the other. I can give a concise definition of the class of weights K(n) that we consider, and it will be immediately familiar and natural to algebraic geometers — but not to most analytic number theorists.

So, instead, I will just say (very fast) that these weights, which we call “irreducible trace weights”, are trace functions of geometrically irreducible \ell-adic middle-extension Fourier sheaves pointwise pure of weight 0 on \mathbf{A}^1/\mathbf{F}_p. Then I will defer to a later post a more leisurely run-through of this definition, together with more examples of weights and their formalisms, and with some further analytic properties which are of independent interest.

The reason this class of weights “works” can however be quickly summarized in a rather miraculous property, which is essentially a consequence of the Riemann Hypothesis: assuming K has small complexity (in a precise sense, saying basically that it comes from a sheaf with small rank and small ramification), the correlation sum C(K;\gamma) is either \ll p^{1/2}, where the implied constant depends only on the numerical invariant measuring the complexity of the weight, or we have an equality

\hat{K}(z)=\varepsilon(\gamma) \hat{K}(\gamma\cdot z)

for all z and some fixed complex number \varepsilon(\gamma) of modulus 1. From this second part, we see that, for a suitable M, the set G_{K,M} discussed above is contained in the group \mathbf{G}_K of all \gamma (viewed in \mathrm{PGL}_2(\mathbf{F}_p)) for which there exists \varepsilon(\gamma) (of modulus 1) such that the identity above holds. This provides us with enough structure on the set of “bad” matrices (with large correlation sums), from which the bound

S(f;K)\ll p^{1-1/8+\epsilon}

can fairly simply be deduced for irreducible trace weights using the automorphic part of our paper.

Indeed, we distinguish two cases, depending on the structure of the subgroup \mathbf{G}_K\subset \mathrm{PGL}_2(\mathbf{F}_p):

  • If \mathbf{G}_K has order coprime to p, we use the classification of such subgroups, and see that \mathbf{G}_K is either contained in the normalizer of some maximal torus, i.e., in the stabilizer of a two-point set \{x,y\} (and hence these weights fall under case (3) of allowed “bad” matrices), or otherwise \mathbf{G}_K is of order at most 60, and consists only of semisimple elements (which allows us again to apply (3), with possibly more than one pair \{x,y\} involved, but fewer than 60);
  • If p\mid |\mathbf{G}_K|, we find some element \gamma_0 in \mathbf{G}_{K} which is of order p, hence unipotent. Thus \gamma_0 acts transitively on \mathbf{F}_p (minus at most a single point); the formula defining the subgroup \mathbf{G}_K above implies easily that \hat{K}(z) is then basically of a very restricted type, namely
    \hat{K}(z)=\varepsilon_0 e(a\gamma\cdot z/p)
    for some fixed matrix \gamma and fixed complex number \varepsilon_0 and integer a. But this weight comes from a specific (Fourier transform of a) Artin-Schreier sheaf (it might not be the one defining K originally, but a fortiori, we can assume it is!). For this sheaf, a rather simple analysis shows that \mathbf{G}_K is a unipotent group isomorphic to \mathbf{F}_p (unless a=0 or \gamma=1, which are exceptional cases). So the bad matrices are either trivial or parabolic, and we can appeal to case (1) to handle these weights…
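The second case can be checked by brute force for a small prime. The Python sketch below takes \gamma to be the inversion z\mapsto 1/z, so that \hat{K}(z)=e(a\bar{z}/p), and searches all of \mathrm{PGL}_2(\mathbf{F}_p) for the matrices satisfying the defining identity of \mathbf{G}_K. Several things here are assumptions of the sketch and not statements from the paper: the identity is tested on all of \mathbf{P}^1(\mathbf{F}_p), the values of \hat{K} at 0 and \infty are fixed by convention, and the function names are hypothetical.

```python
import cmath

INF = "inf"  # label for the point at infinity of P^1(F_p)

def act(gamma, z, p):
    """Image of z in P^1(F_p) under z -> (az+b)/(cz+d), for gamma = (a, b, c, d)."""
    a, b, c, d = (t % p for t in gamma)
    if z == INF:
        return INF if c == 0 else a * pow(c, -1, p) % p
    den = (c * z + d) % p
    if den == 0:
        return INF
    return (a * z + b) * pow(den, -1, p) % p

def projective_matrices(p):
    """One representative (a, b, c, d) for each element of PGL_2(F_p)."""
    for a in range(p):
        for b in range(p):
            for d in range(p):
                if (a * d - b) % p != 0:
                    yield (a, b, 1, d)
    for a in range(1, p):
        for b in range(p):
            yield (a, b, 0, 1)

def artin_schreier_weight(p, a):
    """K_hat(z) = e(a * z^{-1} / p) on nonzero z; the two remaining values are conventions."""
    K_hat = {z: cmath.exp(2j * cmath.pi * a * pow(z, -1, p) / p) for z in range(1, p)}
    K_hat[0] = 0j          # inversion sends 0 to infinity, where the weight is set to 0
    K_hat[INF] = 1 + 0j    # inversion sends infinity to 0, where e(0) = 1
    return K_hat

def in_stabilizer(K_hat, gamma, p, tol=1e-9):
    """Is there eps of modulus 1 with K_hat(z) = eps * K_hat(gamma . z) on all of P^1(F_p)?"""
    eps = None
    for z in list(range(p)) + [INF]:
        lhs, rhs = K_hat[z], K_hat[act(gamma, z, p)]
        if abs(rhs) < tol:
            if abs(lhs) > tol:
                return False
            continue
        if eps is None:
            eps = lhs / rhs
        if abs(lhs - eps * rhs) > tol:
            return False
    return eps is not None and abs(abs(eps) - 1) < tol

def stabilizer(K_hat, p):
    """All elements of PGL_2(F_p) satisfying the identity, found by exhaustive search."""
    return [g for g in projective_matrices(p) if in_stabilizer(K_hat, g, p)]

if __name__ == "__main__":
    print(stabilizer(artin_schreier_weight(7, 3), 7))
```

For p=7 and a=3 the search returns exactly 7 matrices, all of the form z\mapsto z/(tz+1): this is the conjugate of the group of translations by the inversion, hence a unipotent group isomorphic to \mathbf{F}_p whose non-trivial elements are parabolic.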

I will conclude for today with another fact we noticed only very recently: in the special case K(n)=e(\bar{n}/p), it turns out that the correlation sums had already appeared in a paper of Friedlander and Iwaniec on incomplete Kloosterman sums, in the special case of C(K;\gamma) when \gamma is lower-triangular (and non-diagonal). In the Appendix to this paper, Birch and Bombieri give two proofs of the estimate C(K;\gamma)\ll p^{1/2} (for lower-triangular matrices): one is geometric (based on counting points on surfaces over finite fields), but the second one has amusing links to our arguments, with a camouflaged sighting of the group structure of the group B_- of lower-triangular matrices and of the fact that G_{K,M}\cap B_- is contained in a subgroup of B_-… (Interestingly, there is no trace of modular forms in this paper of Friedlander and Iwaniec, so the coincidence is rather unexpected.)

OED clusters

Today’s “word of the day” from the OED was “femme incomprise”. The list of nearby words contains:

  • femme (first quote 1814, from a letter of Byron)
  • femme de chambre (first quote 1741)
  • femme de ménage (first quote 1826)
  • femme du monde (first quote 1849)
  • femme fatale (first quote 1879; one wouldn’t guess that this is taken from an article in that well-known journal of cosmopolitan sophisticates, the St Louis Globe Democrat)
  • femme incomprise (first quote 1841)

I wonder if there is a bigger cluster of foreign words with a common root?

The other one I know and like, though it is not in strictly alphabetic order, is also quite impressive:

  • simpatico, simpatica (first quote 1864, “The Frau Professorin was less ‘simpatica’”, from the memoir of a certain H. Sidgwick)
  • sympathique (first quote 1859, in a letter of Queen Victoria, “The sight of a professor or learned man alarms me, and is not sympathique to me”)
  • sympathisch (first quote 1911)

 

Selberg archive

Looking around the IAS website yesterday, I noticed that a selection of papers from Selberg’s archive is now available online, to be supplemented by a more comprehensive website. Mostly, these seem to be his transparencies and other notes for lectures he gave (e.g., this one), rather than drafts of unpublished work. One interesting item is a transcript of an unpublished interview from 1989. Here is a quote I like (beginning after a discussion of the Chowla–Selberg formula and collaboration in general):

Yes, this was something, of course, quite different from what started it. It’s rather typical in many ways that in mathematics very often what you end up with has very little to do with what you start out with. You may start out trying to do something, and as you get into it and learn something either your attention may switch completely, — because you understand something more of the problem, perhaps what you had initially as a goal is quite impossible — or you may come across something as you are going along, quite by accident, that completely throws your attention in a completely different direction.

One can never, I think, predict where one is going when one starts out.

(p. 29 in the first part of the interview)

Some things I also learnt or realized: Mordell was American (and not English); Selberg got the Fields medal well before he proved the trace formula (1950 against the first results in 1951–1953); Veblen used to organize wood-cutting expeditions in the IAS woods, and apparently Pauli was considered a dangerous participant with axe and saw (page 6 of the second part).

Various updates

Travel and vacation have delayed the writing of the next installment of my series of posts on “Algebraic twists of modular forms”, but I am working on it and hope to have it ready soon… In the meantime, here are two updates on some of my favorite topics from yesterposts.

First, for the bad news: if your proof of the Riemann Hypothesis depended on my earlier claim that the spectral gap for the Lubotzky group modulo primes is at least 2^{-38}, then you’re in trouble. This bound depended on an induction which turns out to be so mistaken that I have promised myself never more to be upset when I see a completely wrong induction proof in a student’s paper. This was found by the referee of my paper on explicit growth and expansion, who also did the most amazing job in checking all the myriad details inherent in a paper of this type. I have put up a corrected version on the web, where the spectral gap becomes again 2^{-2^{45}} or so. (I have also started re-reading and updating my notes on expander graphs, correcting the same mistakes, but also replacing the fully explicit version with a more readable “just show the spectral gap exists”-writeup.)

Now, for the better news: in the last few weeks, I also prepared a written version of the mini-course on “Sieve in discrete groups” that I gave at MSRI in February, on the occasion of the “Hot Topics” Workshop held there concerning “Thin groups and super-strong-approximation” (a Proceedings volume is in preparation; another excellent survey already available is the one by Rapinchuk on Strong Approximation). This was just intended to be a straightforward survey, but while finishing it, I wondered again about a question I had vaguely thought about earlier without conclusion: “Is there an Erdös-Kac Theorem in the context of the ‘affine’ sieve?” More precisely, as in the Bourgain-Gamburd-Sarnak context, let \Gamma\subset \mathrm{SL}_r(\mathbf{Z}) be a finitely generated subgroup, and assume (for simplicity) that its Zariski closure is \mathrm{SL}_r (which means that, for all primes p large enough, the reduction modulo p from \Gamma surjects to \mathrm{SL}_r(\mathbf{F}_p)). Let f be a non-constant, integral-valued, polynomial function on \mathrm{SL}_r(\mathbf{Z}). Can one prove that the number of prime factors of f(g), for a “random” element g\in\Gamma, is approximately gaussian when suitably normalized?

The answer, it turns out, is “Yes”. In fact, as soon as I thought about it a bit seriously for a few minutes, I remembered a very nice paper of Granville and Soundararajan which gives a short and easy proof of the classical Erdös-Kac Theorem, and goes on to explain how their method can be generalized to study the number of prime factors in much more general sequences than the integers. It was then almost immediately clear that one can use this method to get a form of the Erdös-Kac Theorem for discrete groups (and I wouldn’t be surprised if this had been noticed earlier by others). One interesting point is that this seems to be a case where defining “random” elements of \Gamma is most natural in terms of random walks, instead of looking at balls with respect to some metric or other. In the situation above, if S=S^{-1} is a generating set of \Gamma, we denote by (\gamma_n) a random walk on \Gamma, starting at 1 with steps taken uniformly and independently at random from S. Then one shows that there exists some \kappa>0 such that the random variables
\frac{\omega(f(\gamma_n))- \kappa\log n}{\sqrt{\kappa \log n}}
converge in law to the standard gaussian as n\rightarrow +\infty, where \omega(m) is the number of primes dividing m, with the convention that it is 0 for m=0.
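Here is a minimal Python sketch of the objects in this statement, for readers who want to play with them. Everything specific in it is an assumption chosen for illustration: the group is \mathrm{SL}_2(\mathbf{Z}) with the two standard unipotent generators and their inverses, and f is the top-left entry. Since \kappa is not computed and the convergence is on the scale of \log n, the sketch only produces samples of \omega(f(\gamma_n)) for short walks and makes no attempt to exhibit the gaussian limit.

```python
import random

# A symmetric generating set S = S^{-1} of SL_2(Z) (an illustrative choice).
GENERATORS = [
    ((1, 1), (0, 1)), ((1, -1), (0, 1)),
    ((1, 0), (1, 1)), ((1, 0), (-1, 1)),
]

def multiply(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )

def random_walk(steps, rng):
    """gamma_n: start at the identity and multiply by n uniform independent steps from S."""
    g = ((1, 0), (0, 1))
    for _ in range(steps):
        g = multiply(g, rng.choice(GENERATORS))
    return g

def omega(m):
    """Number of distinct primes dividing m, with omega(0) = 0 by convention."""
    m = abs(m)
    if m < 2:
        return 0
    count, d = 0, 2
    while d * d <= m:
        if m % d == 0:
            count += 1
            while m % d == 0:
                m //= d
        d += 1
    return count + (1 if m > 1 else 0)

def f(g):
    """A hypothetical choice of polynomial function: the top-left entry."""
    return g[0][0]

def omega_samples(steps, trials, seed=0):
    """Independent samples of omega(f(gamma_n)) for walks of the given length."""
    rng = random.Random(seed)
    return [omega(f(random_walk(steps, rng))) for _ in range(trials)]

if __name__ == "__main__":
    samples = omega_samples(steps=20, trials=200)
    print(sum(samples) / len(samples))
```

Trial division is enough here because the entries of \gamma_n stay small for short walks; longer walks would need a real factoring routine.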

This applies in other contexts involving sieve in discrete groups. For instance, if (M_n) is a sequence of random Dunfield-Thurston 3-manifolds, and if we denote by \omega(M_n) the number of primes p such that H_1(M_n,\mathbf{F}_p)\not=0, with the convention that \omega(M_n)=0 if M_n has non-zero first rational Betti number, then the sequence
\frac{\omega(M_n)-\log n}{\sqrt{\log n}}
also converges in law to the standard gaussian.

Of course, I can’t help wondering if there could exist, as in the classical case, a finer statement of mod-Poisson convergence concerning the limit behavior of
\mathbf{E}(e^{it \omega(f(\gamma_n))})
for t\in\mathbf{R}. This seems very hard (taking t=\pi gives the average of the Liouville function) and rather mysterious… The renormalized Erdös-Kac Theorem is really about the distribution of “small” prime factors (on logarithmic scale) of integers, as one can see easily by noting that an integer n has a bounded number of prime factors p>n^{\delta} for any fixed \delta>0; since the theorem implies in particular that most integers have about \log\log n prime factors, we see that the limiting distribution arises from integers without prime factors of such size. The mod-Poisson convergence, on the other hand, does take these factors into account, but we have currently no idea whatsoever about the distribution of “large” prime divisors of f(g) in the affine sieve context…
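The remark about large prime factors is elementary: if m distinct primes, each larger than n^{\delta}, divide n, then their product exceeds n^{m\delta} and is at most n, so m<1/\delta. A short Python check of this for \delta=1/k follows; the function names are hypothetical, and the comparison p>n^{1/k} is done exactly as p^k>n to avoid floating-point issues.

```python
def prime_factors(n):
    """Distinct prime factors of an integer n >= 1, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def count_large_prime_factors(n, k):
    """Number of primes q dividing n with q > n^(1/k), tested exactly as q^k > n."""
    return sum(1 for q in prime_factors(n) if q ** k > n)

if __name__ == "__main__":
    k = 3
    worst = max(count_large_prime_factors(n, k) for n in range(2, 10000))
    print(worst)  # never exceeds k - 1
```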