In this post, I will just describe briefly three papers (one long, two short) that É. Fouvry, Ph. Michel and myself have finished in recent weeks and days concerning the properties of trace functions. The last one should be on arXiv tomorrow, the others are there already. I will probably say more about some (or all) of these papers later, but here are quick summaries of what we do…
This is a fairly short note, where we use the quasi-orthogonality of trace functions (in the geometrically irreducible case), which encapsulates Deligne’s general form of the Riemann Hypothesis over finite fields, in order to derive upper-bounds for the number of such functions with bounded conductor over a given finite field. As it turns out, the same quasi-orthogonality implies that we do something more geometrically interesting: for “small enough” conductor, the trace function essentially determines the sheaf, and so we are counting sheaves.
In spirit, this is therefore close to many counting problems of number theory: we have a countable set, a measure of complexity which allows us to write it as an increasing union of finite sets, and we want to know how many elements there are in these finite subsets.
A difference with many more classical problems, however, is that it seems rather difficult to get asymptotics when counting trace functions. If we use a Langlands correspondence, we are trying to count automorphic representations on with some bounds on ramification. We realized only rather late the existence of very striking conjectures and results of Drinfeld and Deligne (among others; see this excellent account of Deligne’s work by Esnault and Kerz) for the precise counting in the vertical direction (fixing a base field, and extending it), which should, under suitable conditions, take a form very similar to a Lefschetz Trace Formula. Our bounds do not really contribute to this question, since they are (probably far off) upper-bounds only, but they are completely explicit and they work in the “horizontal” direction (bounding the conductor and letting $p$ go to infinity.)
As to the spherical codes of the title, they arise because the quasi-orthogonality shows that, as vectors in the -dimensional real vector space of complex-valued functions on , the (normalized) trace functions with conductor have a strong angular-separation property, and subsets of unit spheres with this property are precisely called “spherical codes”. The question of giving upper-bounds for the cardinality of spherical codes with given angular separation is quite important, but interestingly we did not find the range given by the Riemann Hypothesis in the literature (this has the effect of making the cardinality grow polynomially as a function of for a fixed bound on the conductor; a polynomial growth is the right answer for our problem, though finding the right exponent is a rather delicate question). We tweaked the ideas of Kabatjanski and Levenshtein (who have the best-known results in general) for our purpose, which involves some fun estimates depending on the location of the first zero of the Airy function…
This is again a fairly short paper, which does pretty much what the title suggests: for a trace function over , and an integer , we find an estimate for the -th Gowers norm of (see Section 3 of this part of T. Tao’s notes on higher-order Fourier analysis for an introduction to these norms). This takes the form
where the implied constant depends only (completely explicitly) on the conductor of and on , except (this is the inverse part in the title) when the sheaf which gives rise to contains at least one Jordan-Hölder factor with trace function of the type
for some polynomial of degree . These functions are the natural obstructions to having small Gowers norms (as already emphasized by Gowers), but one doesn’t usually get to have such strong structural statements as those we get: if is geometrically irreducible, then the only possibility is that it is exactly proportional to a function of the type above (for all ).
None of the three of us can be said to be a great expert on the Gowers norms (which have been studied much more deeply by others, most spectacularly by Gowers and recently by Green, Tao and Ziegler), and this note is basically our attempt at seeing if the (fairly algebraic) definition could be studied using the sheaf formalism and the Riemann Hypothesis. But the final estimate is interesting in that, as far as the dependency on is concerned, it is the same as one would get for a “random” function (in the model where we consider a function modulo such that the are independent uniformly bounded random variables with mean zero; we found this statement in the book of Tao and Vu), and it seems that no “deterministic” examples of such functions had been written down before. From our result, one can see for instance that
for any fixed non-constant polynomial , being the Legendre character modulo , with the implied constant depending only on and (again, completely explicitly).
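To get a feeling for these norms in the simplest possible case, here is a minimal Python sketch that computes the $U^2$ norm of the Legendre character itself by brute force. The normalization (an average over the three variables $x$, $h_1$, $h_2$), the choice of the prime, and the function names are assumptions made for illustration; nothing here is taken from the paper.

```python
def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, equal to 0 when p divides x."""
    r = pow(x % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gowers_u2_fourth_power(f, p):
    """Return ||f||_{U^2}^4 for a real-valued function f on Z/pZ.

    Assumed normalization: the average over (x, h1, h2) of
    f(x) f(x+h1) f(x+h2) f(x+h1+h2). The cost is O(p^3), so keep p small.
    """
    total = 0
    for x in range(p):
        for h1 in range(p):
            for h2 in range(p):
                total += (f[x] * f[(x + h1) % p]
                          * f[(x + h2) % p] * f[(x + h1 + h2) % p])
    return total / p**3


p = 13  # small prime chosen only for illustration
chi = [legendre(x, p) for x in range(p)]
u2_4 = gowers_u2_fourth_power(chi, p)

# Compare with the closed form (p - 1) / p^2 coming from Gauss sums.
print(u2_4, (p - 1) / p**2)
```

The brute-force value agrees with $(p-1)/p^2$, which follows from the fact that the Gauss sums have modulus $\sqrt{p}$; so in this toy case the $U^2$ norm is of size about $p^{-1/4}$.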
The longest and deepest of our three papers continues the study of orthogonality of trace functions against other natural arithmetic sequences. After dealing with Fourier coefficients of modular forms in the first paper, we consider sums over primes, and sums against the Möbius function. Precisely, let be a trace function modulo . Say that is -exceptional if it is proportional to
for some Dirichlet character modulo and some (allowing trivial and/or .) Then, if is not -exceptional we have
for any , where the implied constant depends only on the conductor of and on . The “critical” case is when or is a bit smaller, in which case we therefore get cancellation with a power saving. It is well-known that one expects such a bound for a -exceptional also, but that this is essentially equivalent to proving the existence of a zero-free strip for some Dirichlet -function, so that the restriction is natural in the current state of knowledge.
Similarly, we have
with the same conditions.
These estimates are rather sweeping: we can take any of the examples of trace functions explained in the previous post (making sure they are not exceptional, but for instance any irreducible sheaf of rank at least is not exceptional, as is any rank sheaf with a singularity not at or …). Although some specializations to specific trace functions had been already studied (sometimes with stronger exponents), we find the generality to be a really remarkable example of the power of the structural features coming from Deligne’s work, and from the formalism of algebraic geometry, which we use again extensively. Indeed, we need not only all the work of the previous paper on twists of Fourier coefficients of modular forms (applied to Eisenstein series), but we also had to establish some additional sheaf-theoretic properties.
To give an example, we get immediately that if is a Dirichlet character of order , and is a polynomial which is not proportional to an -th power times a monomial (e.g., if is squarefree) we have
where the implied constant depends only on and on . As far as we know, the only case previously treated (going back to Karatsuba) is when , with , is linear…
Among a number of applications, which can be found in the paper (and before we find others…), the following is also fairly nice: given squarefree and non-constant, we have
where denotes in general the error term in the prime number theorem in arithmetic progressions
and the implied constant depends only on and on . In fact, with a whiff of extra formalism, we can replace the sum over residue classes of the form taken with the multiplicity of representation by the corresponding sum over the residues of this form, without multiplicity.
Given a unitary matrix of finite size, it is a tautology that the column vectors of are orthonormal, and in particular that
for any $j\not=k$. This has an immediate analogue for a unitary operator , if is a separable Hilbert space: given any orthonormal basis of , we can define the “matrix” representing by
and the “column vectors” , for distinct indices , are orthogonal in the -sense: we have
Now assume that is some space, say , and is an integral operator on given by a kernel , so that
Intuitively, the values of the kernel form a kind of “continuous matrix” representing . The question is: are its columns orthogonal? In other words, given in , do we have
If one remembers the fact that “nice” kernels define trace class integral operators in such a way that the trace can be recovered as the integral
over the diagonal (the basis of the trace formula for automorphic forms…), this sounds rather reasonable. There is however a difficulty: it is not so easy to write kernels which both define a unitary operator, and are such that the integrals
are well-defined in the usual sense! For instance, the most important unitary integral operator is certainly the Fourier transform, defined on , and its kernel is
for which the integrals above are all undefined in the Lebesgue sense. This is natural: if the kernel were square integrable on , for instance, the corresponding integral operator on would be compact, and its spectrum could not be contained in the unit circle (excluding the degenerate case of a finite-dimensional -space.)
This probably explains why this question of orthogonality of column vectors is not to be found in standard textbooks. There are some examples however where things do work.
We consider the space , and as in the previous post, we look at the unitary operator
where is the principal series representation with eigenvalue of . The result of Cogdell and Piatetski-Shapiro already mentioned there shows that is, indeed, a unitary operator given by a smooth kernel for some function on . This function is explicit, and (as expected) not very integrable: we have
Since it is classical that for , this function is neither integrable nor square-integrable. But, the function on decays exponentially at infinity! This means that the integrals , which are given by
make perfect sense when and have opposite sign (this requires also knowing that there is no problem at , but that is indeed the case, because the Bessel functions here have just a logarithmic singularity there, and the factors eliminate the in the integral.)
It should not be a surprise then that we have
for . This boils down to an identity for integrals of Bessel functions that can be found in (combinations of) standard tables, or it can be proved more conceptually by viewing
as limit of
which is for the function which is the normalized characteristic function of the interval of radius around , and similarly for . Since
when is small enough, the unitarity gives
and one must take the limit , which is made relatively easy by the exponential decay of at infinity…
This is nice, but here comes a challenge: if one spells out this identity in terms of Bessel functions, what needs to be done is equivalent to showing that the function
defined for , is antisymmetric: we have
Now, this fact is an “elementary” property of classical functions. Can one prove it directly? (By which I mean, without using the operator interpretation, but also without using an explicit formula for the integral…) For the moment, I have not succeeded…
I’ll conclude by correcting a mistake in my previous post (it should not be a surprise to anyone that if I attempt to be as clever as Euler, I may stumble rather badly, and the correction is in some sense rather small compared with what one might expect)… There I claimed that the integral transform appearing in the Voronoi formula for the divisor function is given by
But this is not the case: the proper formula is
where if , but if . This affects the final formula: we have
instead of the claimed
(the "proof" using the Fourier transform has the same mistake of using instead of , so there is no contradiction between the informal argument and the rigorous one.)
Continuing after my last post, this one will be a list of examples of trace functions modulo some prime number . For each of the examples, I will give a bound for its conductor, which I recall is the main numerical invariant that allows us to measure the complexity of the trace function (formally, the conductor is attached to the object that gives rise to , but we can define the conductor of a trace function to be the minimal conductor of such a .) These objects will be called sheaves, since this is the language used in the paper(s) of Fouvry, Michel and myself, but one doesn’t need to know anything about sheaves to understand the examples.
I will start with a list of concrete functions which are trace functions, and then explain some of the basic operations one can perform on known trace functions to obtain new ones. All these examples will be (I hope) very natural, but it is usually a deep theorem that the functions come from sheaves.
Throughout, is a fixed prime number. Generically, denotes a non-trivial additive character modulo , for instance
(which may also be viewed casually as an -adic character), and denotes a multiplicative character modulo (non-trivial, unless specified otherwise.)
(1) Characters and mixed characters
Let and be non-zero rational functions in . Let
for which is not a pole of , or a zero or pole of , and in that case. Then is a trace weight. The (or an) associated sheaf is of rank , and its conductor is bounded by the sum of degrees of numerators and denominators of and . However, the size of the conductor arises for different reasons for and : for the “additive” component , singularities are poles of , and the contribution of each pole comes from the Swan conductor, which is bounded by the order of the pole at ; for the “multiplicative” component , the singularities are zeros and poles of , and each only contributes to the conductor: the Swan conductors for are all zero.
For analytic applications, the main point is that, by fixing and over , one obtains for each large enough (so that the reduction modulo makes sense), and each choice of characters and , a trace weight associated to and which has conductor uniformly bounded (depending on and only). Thus any estimates valid for all primes with implied constants depending only on the conductor of the trace functions involved will become an interesting estimate concerning and . This applies to the main theorem of my paper with Fouvry and Michel concerning orthogonality of Fourier coefficients of modular forms and trace functions…
These examples are the most classical, and are very useful. Even the simple case and is full of surprises.
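Here is a minimal Python sketch of such a mixed character, under several simplifying assumptions that are mine and not part of the general definition: the two rational functions are polynomials (so there are no poles), the multiplicative character is the Legendre symbol, and the additive character is $x\mapsto e^{2i\pi x/p}$. With both polynomials equal to $x$, the sum of the values is the classical quadratic Gauss sum, whose modulus is exactly $\sqrt{p}$.

```python
import cmath
import math


def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, equal to 0 when p divides x."""
    r = pow(x % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def mixed_character(f, g, p):
    """Values K(x) = chi(g(x)) * psi(f(x)) for x = 0, ..., p-1.

    Assumptions: f and g are polynomial functions with integer coefficients,
    chi is the Legendre symbol (so K(x) = 0 when g(x) = 0 mod p), and
    psi(x) = exp(2*pi*i*x/p).
    """
    values = []
    for x in range(p):
        chi = legendre(g(x), p)
        psi = cmath.exp(2j * math.pi * (f(x) % p) / p)
        values.append(chi * psi)
    return values


p = 101  # prime chosen only for illustration
K = mixed_character(lambda x: x, lambda x: x, p)
gauss_sum = sum(K)

# The quadratic Gauss sum has modulus sqrt(p).
print(abs(gauss_sum), math.sqrt(p))
```

Replacing the two lambdas by other polynomials gives other functions of the same type; the Gauss sum is used here only because its size is known exactly.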
(2) Fiber-counting functions
Another very useful example comes from a fixed non-constant rational function , which is viewed as defining a morphism
This is a trace weight, associated to the direct image sheaf
which in representation theoretic terms is an induced representation from a finite-index subgroup, so that it remains relatively simple.
Here the rank of the sheaf is the degree of as a morphism (i.e., the generic number of pre-images of a point ); the singularities are the finitely many in such that the equation
has fewer than solutions (in ) and, at least if , the Swan conductors vanish everywhere, so that the conductor is bounded in terms of the degrees of the numerator and denominator of only. In particular, if is defined over , varying (large enough) will provide a family of trace functions modulo primes with uniformly bounded conductor, similar to the characters of the previous example with fixed rational functions as arguments.
The main reason this function is useful is that, for any other (arbitrary) function on , we have tautologically
(in other words, it is maybe better to interpret as the image measure of the uniform measure on the finite set under , and this formula is the classical “integration” formula for an image measure…)
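The tautological identity is easy to check numerically. The sketch below assumes, for simplicity, that the rational function is a polynomial viewed as a map from $\mathbf{F}_p$ to itself (so the point at infinity is ignored); the specific polynomial, the prime and the test function are hypothetical choices.

```python
from collections import Counter


def fiber_counts(f, p):
    """Return the list N with N[x] = number of y in F_p such that f(y) = x."""
    counts = Counter(f(y) % p for y in range(p))
    return [counts.get(x, 0) for x in range(p)]


p = 31                                # illustrative prime
f = lambda y: y**3 + 2 * y            # hypothetical polynomial map
phi = lambda x: (x * x + 5) % 7       # arbitrary test function on F_p

N = fiber_counts(f, p)

# Sum of phi against the fiber-counting function ...
lhs = sum(N[x] * phi(x) for x in range(p))
# ... equals the sum of phi composed with f.
rhs = sum(phi(f(y) % p) for y in range(p))

print(lhs, rhs)
```

Both sides are the same sum grouped in two different ways, which is exactly the image-measure interpretation.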
One also often takes the function
where is the average of over . This is also a trace function (the sheaf corresponding to contains a trivial quotient, and this is the trace function of the kernel of the map to this trivial quotient). We now have
(3) Number of points on families of algebraic varieties
More generally, we can count points on one-parameter families of algebraic varieties of dimension . For instance, families of elliptic curves or of more general curves are quite common. To be concrete, one may have a polynomial , where is seen as the parameter, and consider the curves
Usually, it is not so much the number of points as the correction term that is most interesting. For instance, if the curves are generically geometrically irreducible, and have a single point at infinity, the size of is (for all but finitely many ) of the form
where satisfies the Weil bound
in terms of the genus of . In fact, once one ensures that the family of curves is such that the genus of the curves is the same (for all but finitely many ), the function
is a trace function on the corresponding dense open set of , for some sheaf which has rank . For the other values of , the trace function of the corresponding middle-extension sheaf might differ from the value defined as above using the number of points, but since the number of those singularities is bounded by the conductor, one can usually (analytically at least) not worry too much about this. Similarly, in many cases the sheaf is tamely ramified everywhere (i.e., all Swan conductors vanish), and so the conductor is well-controlled.
In contrast with the first two examples, the construction of a sheaf with this trace function is not elementary: it is an example of the so-called “higher direct image sheaves” (with compact support). Since, for every “good” , the Riemann Hypothesis for curves shows that
where the are complex numbers of modulus , we can interpret the existence of this sheaf as saying that the algebraic variation of the “eigenvalues” is itself controlled by an algebraic object. This is one of the main insights that algebraic geometry (and étale cohomology in particular) brings to analytic number theory.
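As a concrete sanity check of the Weil bound in genus $1$, here is a minimal Python sketch for a hypothetical family of elliptic curves, $y^2=x^3+tx+1$ (my choice for illustration, not a family taken from the text). Each curve has a single point at infinity, so the number of affine points is written as $p-a_t$, and the bound to check is $|a_t|\leq 2\sqrt{p}$ for the parameters $t$ where the curve is non-singular.

```python
import math


def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, equal to 0 when p divides x."""
    r = pow(x % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def correction_term(t, p):
    """Return a_t = p - #{(x, y) in F_p^2 : y^2 = x^3 + t*x + 1}.

    For each x there are 1 + legendre(x^3 + t*x + 1) values of y.
    """
    affine_points = sum(1 + legendre(x**3 + t * x + 1, p) for x in range(p))
    return p - affine_points


p = 101  # illustrative prime, larger than 3
# Keep only parameters where the cubic has non-zero discriminant.
good_t = [t for t in range(p) if (4 * t**3 + 27) % p != 0]
a = {t: correction_term(t, p) for t in good_t}

worst = max(abs(v) for v in a.values())
print(worst, 2 * math.sqrt(p))
```

The finitely many excluded values of $t$ are precisely the kind of singular parameters where the trace function of the sheaf may differ from the naive point count.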
The family of elliptic curves
in my bijective challenge is of this type.
(4) Families of Kloosterman sums
One of the great examples, for analytic number theory, is given by families of Kloosterman sums: for an integer , and a non-zero , we let
The Weil bound for , and the even deeper work of Deligne for larger , prove that
for all invertible modulo . Further work, relying once more on the powerful formalism of étale sheaves and higher direct images in particular, shows that the function
is (the restriction to invertible of) a trace function for an irreducible sheaf, with conductor bounded in terms of only.
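For the classical case of Kloosterman sums in two variables, the bound can be checked by direct computation. In the sketch below, the normalization (division by $\sqrt{p}$ and the sign in front) and the choice of additive character $x\mapsto e^{2i\pi x/p}$ are assumptions; with them, the Weil bound reads $|\mathrm{Kl}_2(a;p)|\leq 2$.

```python
import cmath
import math


def kloosterman2(a, p):
    """Normalized sum -(1/sqrt(p)) * sum over x != 0 of e((x + a/x)/p).

    The sign and the normalization are assumptions made for illustration.
    """
    total = 0
    for x in range(1, p):
        x_inv = pow(x, p - 2, p)  # inverse of x modulo the prime p
        total += cmath.exp(2j * math.pi * ((x + a * x_inv) % p) / p)
    return -total / math.sqrt(p)


p = 101  # illustrative prime
values = [kloosterman2(a, p) for a in range(1, p)]

largest = max(abs(v) for v in values)
largest_imag = max(abs(v.imag) for v in values)
print(largest, largest_imag)
```

The sums are real (replace $x$ by $-x$ to see that each one equals its conjugate), so the imaginary parts are zero up to rounding.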
(5) The Fourier transform
If we have a function modulo , we define its Fourier transform by
for (the normalization here is convenient, as I will explain). It is now a very deep fact that, if $K$ comes from a sheaf, then so does (the minus sign is natural, but this has to do with rather deep algebraic geometry…) More precisely, one has to be careful because the Fourier transform of an additive character (as a function) is a multiple of a delta function. The latter does fit nicely in the framework of étale sheaves, but not as a middle-extension sheaf or Galois representation (because it is zero on a dense open set, so it would have to be zero to be a middle-extension sheaf or to come from a Galois representation). There is a geometric solution to this issue, but it involves speaking of perverse sheaves and related machinery, which we have barely started to understand: the Fourier transform works perfectly well at the level of perverse sheaves, and one can use their trace functions just as well as those of Galois representations. Since, in our current applications, we can always deal separately with additive characters (or delta functions), we have avoided having to deal with perverse sheaves (up to now…)
The existence of the -adic Fourier transform of sheaves was first proved by Deligne, but the theory of the sheaf-theoretic Fourier transform was largely built by Laumon (with further contributions, in particular, from Brylinski and Katz). To illustrate how powerful it is, consider
a relatively simple case of Example (1). We then have
so that the existence of the Fourier transform at the level of sheaves implies the existence of the Kloosterman sheaf parameterizing classical Kloosterman sums as in the previous example.
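At the level of functions, this identity is a one-line change of variable, and it can be confirmed numerically. The sketch below assumes the normalization $-\frac{1}{\sqrt{p}}\sum_x K(x)\psi(tx)$ for the Fourier transform, the character $\psi(x)=e^{2i\pi x/p}$, and the convention that $x\mapsto\psi(x^{-1})$ is extended by $0$ at $x=0$; these are illustrative choices.

```python
import cmath
import math


def psi(x, p):
    """Additive character x -> exp(2*pi*i*x/p) (assumed choice)."""
    return cmath.exp(2j * math.pi * (x % p) / p)


def fourier_transform(K, p):
    """Return [FT(K)(t)] with FT(K)(t) = -(1/sqrt(p)) * sum_x K(x) psi(t*x).

    The sign and normalization are assumptions made for illustration.
    """
    return [-sum(K[x] * psi(t * x, p) for x in range(p)) / math.sqrt(p)
            for t in range(p)]


def kloosterman2(a, p):
    """Normalized sum -(1/sqrt(p)) * sum over x != 0 of psi(x + a/x)."""
    total = sum(psi(x + a * pow(x, p - 2, p), p) for x in range(1, p))
    return -total / math.sqrt(p)


p = 61  # illustrative prime
# K(x) = psi(1/x) for x != 0, extended by 0 at x = 0.
K = [0] + [psi(pow(x, p - 2, p), p) for x in range(1, p)]
FT = fourier_transform(K, p)

max_gap = max(abs(FT[t] - kloosterman2(t, p)) for t in range(1, p))
print(max_gap)
```

The deep statement is of course not this numerical coincidence, but the fact that the operation exists at the level of sheaves.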
Other examples that arise from our previous examples are many families of exponential sums, for instance
(arising from Example (1); one must assume either that is not a polynomial of degree or that is non-trivial to have a well-defined sheaf), or
for with equal to the number of poles of (the sum over is over values where the rational function is defined), that arises from Example (2) (applied with the function ).
This operation of Fourier transform has one last crucial feature for applications to the analysis of trace functions: the conductor of is bounded in terms of that of only. This is something we prove in our paper using Laumon’s analysis of the singularities of the Fourier transform, and in fact we show that if the conductor of is at most , then the conductor of is at most . Hence the examples above, if the rational functions (and/or ) are fixed in and then reduced modulo various primes, always have conductor bounded uniformly for all .
(6) Change of variable
Given a non-constant rational function seen as a morphism
and a trace function , one can form the function
This is again, essentially, a trace function: as in Example (3), one may have to tweak the values of at some singularities (because pull-backs of middle-extension sheaves do not always remain so), but this is fairly easily controlled. Moreover, one can also control the conductor of in terms of that of , taking into account the degree of $f$. A particularly simple case of great importance is when is a homography
(an automorphism of ) in which case no tweaking is necessary to define , and the conductor is the same as that of (which certainly seems natural!)
We can now compose these various operations. One construction is the following (a finite-field Bessel transform): start with , apply the Fourier transform, change the variable to , apply again the Fourier transform. If we call the resulting function, the examples above show that if is a trace function with conductor , then will also be one, and its conductor will be bounded solely in terms of (in fact, it will be by the bound discussed in Example (5)).
Trailer! In the next post in this series, I will discuss the Riemann Hypothesis for trace functions and its applications. But probably before I will discuss the more recent works of Fouvry, Michel and myself, since we now have three further papers in our series — two small, and one big.
Am I the last person to notice that for , the even moment
of a standard gaussian random variable (with expectation zero and variance one) is the same as the index of the Weyl group of inside the Weyl group of (in other words, the index of the groups of permutations of elements commuting with a fixed-point free involution among all permutations)?
If “Yes”, what else have I been missing in the same spirit?
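For the skeptical reader, the coincidence is easy to confirm for small values. The sketch below uses the classical formula $(2k)!/(2^k k!)$ for the even Gaussian moments, and counts fixed-point-free involutions of $2k$ elements by brute force; by the orbit-stabilizer theorem, this count is the index of the centralizer of one such involution in the full symmetric group. The function names are mine.

```python
from itertools import permutations
from math import factorial


def gaussian_even_moment(k):
    """E[X^(2k)] for a standard Gaussian X, namely (2k)! / (2^k * k!)."""
    return factorial(2 * k) // (2**k * factorial(k))


def count_fixed_point_free_involutions(n):
    """Count permutations s of {0, ..., n-1} with s(s(i)) = i and s(i) != i."""
    count = 0
    for perm in permutations(range(n)):
        if all(perm[i] != i and perm[perm[i]] == i for i in range(n)):
            count += 1
    return count


# Brute force over all (2k)! permutations, so only small k are feasible.
moments = [gaussian_even_moment(k) for k in range(1, 5)]
involutions = [count_fixed_point_free_involutions(2 * k) for k in range(1, 5)]

print(moments)
print(involutions)
```

Both lists come out as 1, 3, 15, 105, the first few values of $(2k-1)(2k-3)\cdots 3\cdot 1$.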
Courtesy of the divisor function, here is another fun example of reasoning in the great style of Euler (the last installment is rather old…) A classical tool to study the distribution of values of (the number of positive divisors of ) is the Voronoi summation formula, which expresses a sum
for a nice test function , some positive integer , and some integer coprime to , in terms of a “dual sum”
where is the inverse of modulo , and
is some integral transform of , with kernel involving the classical Bessel functions and . Precisely, we have
and one should add that there is also a main term in the Voronoi formula, but it is irrelevant for today's story. A classical application of this formula is to improve the error term in Dirichlet's asymptotic evaluation of
which was done indeed by Voronoi.
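To see the size of the error term in practice, here is a minimal Python sketch comparing the summatory function of the divisor function with Dirichlet's main term $X\log X+(2\gamma-1)X$. The exact sum is computed with the hyperbola method; the cutoff $X=10^4$ and the names are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def divisor_summatory(X):
    """Return the sum of d(n) for 1 <= n <= X, using the hyperbola method.

    Counting pairs (a, b) with a*b <= X gives
    2 * sum_{n <= sqrt(X)} floor(X/n) - floor(sqrt(X))^2.
    """
    r = math.isqrt(X)
    return 2 * sum(X // n for n in range(1, r + 1)) - r * r


def dirichlet_main_term(X):
    """Main term X*log(X) + (2*gamma - 1)*X in Dirichlet's asymptotic."""
    return X * math.log(X) + (2 * EULER_GAMMA - 1) * X


X = 10**4  # illustrative cutoff
D = divisor_summatory(X)
delta = D - dirichlet_main_term(X)

print(D, delta, math.sqrt(X))
```

At this cutoff the error is well below $\sqrt{X}$, the order of magnitude given by Dirichlet's elementary argument.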
In an ongoing work with É. Fouvry, S. Ganguly and Ph. Michel, we needed to know some unitarity property of the transformation
This is an entirely classical question, but we didn't find a ready-made statement in Watson’s book on Bessel functions. There is however a formal argument that suggests the answer: if we consider the function of two real variables defined by
then it turns out that we have
where is the standard Fourier transform of (this is contained in Section 4.5 of the book of H. Iwaniec and myself.) Hence we have, by the unitarity of the Fourier transform, the identity
Offhandedly, by changing variables, this means that
which would give
(the factor comes from the fact that is extended to an even function on from its original source as a function defined for non-negative real numbers), if not for the fact that the “constant” is the integral
Alas, it diverges, although probably Euler would write it as (two infinities from the divergence at , the other two from the divergence at ), and be happy with the outcome.
One can then prove rigorously the formula by truncation arguments, but here is a more conceptual argument (which offers the advantage of being something we can just quote), which follows from the interpretation of the Voronoi formula in terms of the representation theory of . What happens is that there exists a unitary representation of (the principal series with Casimir eigenvalue ) which can be represented as acting on the Hilbert space (the Kirillov model) in such a way that the unitary operator
is given by an integral operator
for some function , which Cogdell and Piatetski-Shapiro called the Bessel function of (see this note of Cogdell for a short explanation of this, with the analogues for finite fields and -adic fields). Now, by direct inspection of the formula for that Cogdell and Piatetski-Shapiro computed, and comparison with the kernel in the Voronoi formula, one finds that
(in this other short note, Cogdell explains why it is no coincidence that this abstract Bessel function appears in the Voronoi summation formula). Now, from
which holds for all because is unitary on , we deduce exactly …
Remark. There is a completely similar story where the circles replace the hyperbolas , or in other words, if one defines
Then the Fourier transform of is still a radial function , and the map is a Hankel transform (it involves the Bessel function ). Its unitarity follows then immediately from that of the Fourier transform, since the analogue of the divergent integral is now, indeed, a finite constant.
In terms of representation-theory, the story is the same as above, except that the representation is replaced with a discrete series representation. One can also deal similarly with radial functions in higher-dimensional euclidean spaces, which involves other discrete series representations.