Trace functions, I

This is again the first of a series of a few posts in which I will explain (as promised a very long while ago, and as far as I can…) the trace weights that are used in my paper with É. Fouvry and Ph. Michel (henceforth, this paper will be referred to as FKM). Given a prime number p, these are certain specific functions

K\,:\, \mathbf{F}_p\rightarrow \mathbf{C}

that “come from algebraic geometry”, and that can be studied using both a very rich formalism, and such extraordinarily deep results as Deligne’s “Weil 2” form of the Riemann Hypothesis over finite fields.

In fact, each function of this type is really a kind of “shadow” of a more intrinsic (more algebraic, more geometric, more arithmetic, as you wish) object, and it is rather these objects which algebraic geometry studies. In general, K does not determine this other object: if I call \mathcal{F} the latter, it may well be the case that two distinct objects \mathcal{F}_1 and \mathcal{F}_2 give rise to the same trace function K. However, there is also a basic complexity invariant c(\mathcal{F})\geq 1 defined for a given \mathcal{F} (which is called its “conductor”), and one can show (this uses the Riemann Hypothesis…) that, given p, there is a bound T(p) (which grows with p) such that a given function K can come from at most one object \mathcal{F} with complexity at most T(p). I will come back to this in a later post, since I consider the question of determining precisely T(p) to be quite fundamental and fascinating, but for the basic purpose of FKM, this issue does not really arise.

As a terminological aside, we tend to call these functions K either “trace weights” or “trace functions”. Maybe a better word might be well-deserved for this notion, but we’re not quite sure what might work, though possibly we might use “tracic function”, a good translation of the French fonction tracique that we’ve found ourselves using; this has, at least, some classic ring.

In this first post, I will outline the three possible definitions (or interpretations) of the class of trace functions, going from what is possibly the most closely related to notions known to analytic number theorists, and ending with the most flexible, but maybe least familiar one.

Special Hecke eigenvalues of automorphic forms. In the first picture, one looks at automorphic forms related to the field F=\mathbf{F}_p(T) of rational functions over the finite field \mathbf{F}_p. As is the case for classical modular forms, there are Hecke operators associated to each place of F, in particular to the irreducible polynomials P_x=X-x for x\in\mathbf{F}_p. Given an automorphic form \phi, one can then define a function
x\mapsto K_{\phi}(x),
the corresponding Hecke eigenvalue for these particular Hecke operators. The complexity of \phi can then be defined as the sum of the “traditional” automorphic conductor and the rank r. Indeed, it is essential here to consider automorphic forms on all groups \mathrm{GL}_r(F), and not just on \mathrm{GL}_1 or \mathrm{GL}_2.

As examples, imitating the correspondence from Dirichlet characters to Hecke characters for \mathrm{GL}_1 over the field \mathbf{Q}, it is not too difficult to construct explicitly some automorphic forms (of rank 1) for which the associated functions are given by
K(x)=e(P(x)/p),\quad\quad\text{ or }\quad\quad K(x)=\chi(P(x)),
for some polynomial P\in\mathbf{Z}[X] and some multiplicative Dirichlet character \chi. These are certainly the most natural-looking “functions of algebraic origin” on a finite field, and indeed this construction of (analogues of) Dirichlet characters is the original, and easiest, way to prove the rationality and functional equation for the associated L-functions over F (since, in order to prove this, one does not even need to mention automorphic forms, the whole argument happening within the realm of Dirichlet characters.)
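To make these two basic examples concrete, here is a small numerical sketch in Python. The prime p=11, the polynomial P=X^2+1, and the choice of \chi as the Legendre symbol are illustrative assumptions of mine, not taken from FKM; the helper names are hypothetical.

```python
import cmath

def e(z):
    """Additive character e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * cmath.pi * z)

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion.
    Used here as one example of a multiplicative character chi (an assumption)."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def additive_trace(P, p):
    """Values K(x) = e(P(x)/p) for x = 0, ..., p-1."""
    return [e((P(x) % p) / p) for x in range(p)]

def multiplicative_trace(P, p):
    """Values K(x) = chi(P(x)) for x = 0, ..., p-1, with chi the Legendre symbol."""
    return [legendre(P(x), p) for x in range(p)]

# Illustrative choices (assumptions): p = 11 and P = X^2 + 1.
p = 11
P = lambda x: x * x + 1

K_add = additive_trace(P, p)
K_mul = multiplicative_trace(P, p)

for x in range(p):
    print(x, K_add[x], K_mul[x])
```

Both lists are indexed by x\in\mathbf{F}_p, so the sums over x can be compared directly with classical character sums.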

Despite their many fine qualities, automorphic forms are however a bit inflexible from the point of view of defining generalizations of these basic functions K(x). For instance, it is rather difficult to write down concretely the function attached to an automorphic form of rank at least 2. In fact, I don’t really know how to do it (except for automorphic forms built from the case r=1, like analogues of Eisenstein series) without first applying one of the two other definitions, constructing some object \mathcal{F} and associated trace function K, and then invoking some version of the Langlands correspondence to claim the existence of some automorphic form \phi with Hecke eigenvalues K_{\phi} coinciding with the original K.

Similarly, given two functions K_1(x), K_2(x) arising as Hecke eigenvalues of some automorphic forms \phi_1 and \phi_2, it is a rather big theorem to show that there exists another automorphic form with eigenvalues

K_1(x)K_2(x)

(for x unramified for both \phi_1 and \phi_2): this is the general theory of the Rankin-Selberg convolution.

Another serious drawback (which I will amplify later) is that this is, as far as I know and at the present time, strictly a one-variable story. There is no simple definition (that I know) that can be used to easily package a family of automorphic forms \phi_t and, for instance, create a new automorphic form \Phi with Hecke eigenvalues related to some average of the eigenvalues of \phi_t.

Galois representations of function fields. The first alternative to automorphic forms is given by Galois representations, and it is again a customary picture on the side of number fields. The base field is still F=\mathbf{F}_p(T), but we now consider the Galois group
G=\mathrm{Gal}(F^{sep}/F)
of some separable closure of F, and finite-dimensional representations
\rho\,:\, G\rightarrow \mathrm{GL}(V).
Then, as is customary in algebraic number theory, for any x\in \mathbf{F}_p, we have the associated decomposition and inertia groups at the place corresponding to x, and the Frobenius automorphism Fr_x which acts on V if x is unramified for \rho (i.e., if the inertia group I_x at x acts trivially on V) and which acts on the invariants V^{I_x} otherwise. In all cases we can define a function
K(x)=\mathrm{Tr}(\rho(Fr_x)\mid V^{I_x}).
It is immediately clear that such a definition gives a very flexible formalism, because we are now dealing largely with linear algebra. So formally, we can add these functions (taking direct sums of representations), or multiply them (taking tensor products; because this operation does not always commute with invariants, the corresponding trace function coincides with the product of the two factors at the unramified x, but may differ at the others).

There is a non-trivial difficulty having to do with topology: to obtain a good theory, since G is an infinite profinite group, we want to consider continuous representations. But then, if V is a \mathbf{C}-vector space with its usual topology, we have the difficulty that there are too few representations: any continuous representation then has finite image. One works around this issue by the well-known device of picking some auxiliary prime number \ell\not=p, and considering continuous representations into \bar{\mathbf{Q}}_{\ell}-vector spaces. There are many representations in that case (in particular, many with large infinite image), but of course the trace function now takes values in an \ell-adic field. Qu’à cela ne tienne (or, as Katz says, ell-adic, schmell-adic), one can pick (with some effort or help from a friendly axiom) an isomorphism
\iota\,:\, \bar{\mathbf{Q}}_{\ell}\rightarrow \mathbf{C},
and consider the function
x\mapsto \iota(\mathrm{Tr}(\rho(Fr_x)\mid V^{I_x})),
which is complex-valued.

The complexity is, here also, easy to define: there is a notion of Artin conductor for such a representation, and we add the dimension of V to take the latter into account.

For applications to constructing interesting functions, this business involving \ell shouldn’t be considered as too problematic. In fact, to a large extent, it turns out that the theory is rather independent of \ell. Without wanting to develop this too much, one can already see it by noticing that for any \ell\not=p, one can rather easily construct Galois representations with trace functions equal to
K(x)=e(P(x)/p),\quad\quad K(x)=\chi(P(x)),
the basic examples already considered. In fact, this is rather simpler than the corresponding construction of Dirichlet characters of F, and in particular, it is very easy to go from the construction of representations \rho_a and \rho_m with respective trace functions
K_a(x)=e(x/p),\quad\quad K_m(x)=\chi(x),
to the case involving a general polynomial: we have a map F\rightarrow F by T\mapsto P(T), hence a map of Galois groups P^*\,:\, G\rightarrow G, and we can “just” consider the composites
\rho_a\circ P^*,\quad\quad \rho_m\circ P^*,
to get the desired representations. (This is really a restriction of representations.)
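Spelled out on trace functions, this composition amounts to the following. This is a sketch for those x at which everything is unramified, under the assumption that P^* sends the Frobenius at x to the Frobenius conjugacy class at P(x):

```latex
\mathrm{Tr}\bigl((\rho_a\circ P^*)(Fr_x)\bigr)=K_a(P(x))=e(P(x)/p),
\qquad
\mathrm{Tr}\bigl((\rho_m\circ P^*)(Fr_x)\bigr)=K_m(P(x))=\chi(P(x)).
```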

This theory also has fairly natural extensions to higher-dimensional varieties (though one must assume some smoothness for the theory to work decently). To a large extent, FKM might have been written in this language, as far as the definitions of trace weights are concerned. But we use instead the third approach…

Middle-extension sheaves on the affine line. This last theory is closer in terms of formalism to the previous one, but more geometric in spirit, and it is the most flexible. Indeed, it is the one we use in FKM. But the counterpart to this geometric flexibility is that the basic flavor of the definition is least familiar to analytic number theorists. (Here, I am reminded of Cyrano de Bergerac who, having described six different ways of going to the moon, and being asked “Which one did you choose”, replied “A seventh”; or, in proper subjunctive French, –Mais voilà six moyens excellents !. . .Quel système Choisîtes-vous des six, Monsieur ? — Un septième !)

Here the basic object is an \ell-adic étale sheaf on the affine line over \mathbf{F}_p, with an added “regularity” property. It is a consequence of basic properties of such objects that, for any x\in\mathbf{F}_p, we can look at the “stalk” at x, which is a finite-dimensional \bar{\mathbf{Q}}_{\ell}-vector space \mathcal{F}_x, and that the Frobenius automorphism (in some incarnation) acts on this vector space, allowing us to define a trace function
K(x)=\mathrm{Tr}(Fr\mid \mathcal{F}_x),
and this is how we get our trace weights from this point of view.

To get a feeling for the actual meaning of this, I would like first to refer to my old expository text on Deligne’s first proof of the Riemann Hypothesis over finite fields, where the first part is an introduction to étale cohomology, which might be useful for readers with some basic background in elliptic curves over finite fields, but who haven’t studied the étale topology yet. But here is a more down-to-earth way of seeing things, which mixes fish and fowl to some extent.

A middle-extension sheaf \mathcal{F} on the affine line over \mathbf{F}_p, whatever the actual definition is, comes concretely with some data. One of them is a finite set S\subset \bar{\mathbf{F}}_p of singularities, which is defined over \mathbf{F}_p (in other words, it is the zero set of some non-zero polynomial in \mathbf{F}_p[T]). On the complement U of this set, the sheaf is what is called lisse, which is equivalent to saying that there is a representation of the étale fundamental group \pi_1(U) of U in some finite-dimensional \bar{\mathbf{Q}}_{\ell}-vector space which is “equivalent” to the restriction of the sheaf to U. But this étale fundamental group is, in fact, none other than a quotient of the Galois group G=\mathrm{Gal}(F^{sep}/F) of the previous description (it is the Galois group of the maximal extension of F unramified at the points of U), so such a representation is the same thing as a representation of G unramified at these points. And in fact, if we view the representation corresponding to \mathcal{F} as a representation of G, the trace functions are the same.

This allows us at least to describe how one can define the complexity of a middle-extension sheaf: one just takes the complexity of the associated Galois representation (the dimension of the vector space, plus the Artin conductor.)

What is the point then of thinking in terms of sheaves? To my mind, here are some important advantages:

  • The geometric picture that arises is often the easiest way to “see” how to manipulate trace functions to construct new ones;
  • There are different ways of extending a lisse sheaf on U to a sheaf on the affine line, and the “middle-extension” is just one of them. It is, in some sense, the best one, but there are others. In the general theory, these may come out because some construction goes outside of the realm of middle-extension sheaves: for instance, the tensor product of two middle-extension sheaves is not one in general; this accounts in a precise way for the way the product of two trace functions may not be one exactly;
  • The theory of sheaves extends handily to higher-dimensional varieties, where more types of singularities and other behaviors arise because there is “more room” for the dimension of various sets where different behaviors arise (so sheaves on a surface might be supported on a curve, etc). Here it is important to see middle-extension sheaves as just some of the étale sheaves, and to allow more general ones.
  • The formalism is by far the most powerful. Especially crucial to the proofs of the deepest results (including the Riemann Hypothesis) is the existence of the étale cohomology groups of a sheaf, and of so-called “higher-direct images” (with compact support or not), which make sense for étale sheaves, but in general do not preserve such regularity properties as being lisse or middle-extension.
  • As a consequence of the above, this is the language in which the sources concerning the properties of étale sheaves are written; for FKM, this means especially the books of N. Katz, which we have consulted and referenced extensively…

To conclude this first post, here is a concrete illustration of what the sheaf formalism gives that is important to analytic number theorists, and which is completely mysterious (as far as I know, at least) on the level of Galois representations or automorphic forms: the existence of the Fourier transform. In fact, given a trace weight K(x) associated to some sheaf \mathcal{F}, a construction of Deligne delivers another sheaf \mathcal{G}, which is still a middle-extension sheaf, and is such that the associated trace function is, up to sign and normalization, the Fourier transform
x\mapsto \sum_{y\in\mathbf{F}_p}K(y)e(xy/p).
This construction is not obvious; in fact, it involves (1) the fact that sheaves make sense on higher-dimensional varieties, with a wide variety of “functorial” properties; (2) the fact that higher-direct images exist: this is what is needed to obtain results of the type “a sum over y of some trace functions parametrized by x is itself a trace function”…

If we assume the existence of this construction (and most analytic number theorists would argue that, whatever a theory of functions of algebraic origin might do, it should be compatible with Fourier transform…) we immediately expand our range of examples with some highly-interesting ones, starting with the basic cases
K(x)=e(P(x)/p),\quad\quad K(x)=\chi(P(x)),
whose Fourier transforms are extremely interesting: they are values of families of exponential sums in one variable.

For instance, take
K(x)=e(\bar{x}/p),\text{ for } x\not=0\pmod{p},
where we denote by \bar{x} the inverse of x modulo p. Then we find that, up to sign and normalization,
x\mapsto \sum_{y\in\mathbf{F}_p^{\times}}e((xy+\bar{y})/p)=S(x,1;p)
is a trace weight! In other words, the family of Kloosterman sums S(x,1;p), as a function of x, is a function of algebraic origin modulo p.
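The identification of this Fourier transform with Kloosterman sums can be checked numerically. The following Python sketch uses an unnormalized Fourier transform, extends K by K(0)=0, and takes p=13; all three are assumptions made for illustration, and the function names are hypothetical. It also checks the Weil bound |S(x,1;p)|\leq 2\sqrt{p}.

```python
import cmath
import math

def e(z):
    """Additive character e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * cmath.pi * z)

def inverse_mod(y, p):
    """Inverse of y modulo the prime p (y not divisible by p), by Fermat's little theorem."""
    return pow(y, p - 2, p)

def kloosterman(a, b, p):
    """S(a, b; p) = sum over y in F_p^* of e((a*y + b*ybar)/p), where y*ybar = 1 mod p."""
    return sum(e(((a * y + b * inverse_mod(y, p)) % p) / p) for y in range(1, p))

def fourier_transform(K, p):
    """Unnormalized discrete Fourier transform x -> sum_y K(y) e(xy/p).
    The normalization (no sign, no 1/sqrt(p)) is an assumption of this sketch."""
    return [sum(K[y] * e(((x * y) % p) / p) for y in range(p)) for x in range(p)]

# Illustrative prime (an assumption).
p = 13

# K(x) = e(xbar/p) for x != 0; the value K(0) = 0 is a convention chosen here.
K = [0] + [e(inverse_mod(x, p) / p) for x in range(1, p)]

K_hat = fourier_transform(K, p)
sums = [kloosterman(x, 1, p) for x in range(p)]

for x in range(p):
    print(x, round(sums[x].real, 6), abs(sums[x]) <= 2 * math.sqrt(p))
```

With these conventions the two lists K_hat and sums agree entry by entry, and each Kloosterman sum is real.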

Trailer! In the next posts! I will probably next describe many examples of trace functions, and discuss the formalism that allows us to manipulate them conveniently. After this, I will come to their analytic properties, where the key point is the Riemann Hypothesis over finite fields…
