E. Kowalski’s blog

Comments on mathematics, mostly.

Three new papers

In this post, I will just describe briefly three papers (one long, two short) that É. Fouvry, Ph. Michel and myself have finished in recent weeks and days concerning the properties of trace functions. The last one should be on arXiv tomorrow, the others are there already. I will probably say more about some (or all) of these papers later, but here are quick summaries of what we do…

(1) “Counting sheaves with spherical codes”

This is a fairly short note, where we use the quasi-orthogonality of trace functions (in the geometrically irreducible case), which encapsulates Deligne’s general form of the Riemann Hypothesis over finite fields, in order to derive upper-bounds for the number of such functions with bounded conductor over a given finite field. As it turns out, the same quasi-orthogonality implies that we do something more geometrically interesting: for “small enough” conductor, the trace function essentially determines the sheaf, and so we are counting sheaves.

In spirit, this is therefore close to many counting problems of number theory: we have a countable set, a measure of complexity which allows us to write it as an increasing union of finite sets, and we want to know how many elements there are in these finite subsets.

A difference with many more classical problems, however, is that it seems rather difficult to get asymptotics when counting trace functions. If we use a Langlands correspondence, we are trying to count automorphic representations on \mathrm{GL}_n(\mathbf{F}_p[T]) with some bounds on ramification. We realized only rather late the existence of very striking conjectures and results of Drinfeld and Deligne (among others; see this excellent account of Deligne’s work by Esnault and Kerz) for the precise counting in the vertical direction (fixing a base field, and extending it), which should, under suitable conditions, take a form very similar to a Lefschetz Trace Formula. Our bounds do not really contribute to this question, since they are (probably far off) upper-bounds only, but they are completely explicit and they work in the “horizontal” direction (bounding the conductor and letting p go to infinity.)

As to the spherical codes of the title, they arise because the quasi-orthogonality shows that, as vectors in the 2p-dimensional real vector space of complex-valued functions on \mathbf{F}_p, the (normalized) trace functions with conductor \leq M have a strong angular-separation property, and subsets of unit spheres with this property are precisely called “spherical codes”. The question of giving upper-bounds for the cardinality of spherical codes with given angular separation is quite important, but interestingly we did not find the range given by the Riemann Hypothesis in the literature (this has the effect of making the cardinality grow polynomially as a function of p for a fixed bound on the conductor; a polynomial growth is the right answer for our problem, though finding the right exponent is a rather delicate question). We tweaked the ideas of Kabatjanski and Levenshtein (who have the best-known results in general) for our purpose, which involves some fun estimates depending on the location of the first zero of the Airy function…

(2) “An inverse theorem for Gowers norm of trace functions over prime fields”

This is again a fairly short paper, which does pretty much what the title suggests: for a trace function K over \mathbf{F}_p, and an integer d\geq 1, we find an estimate for the d-th Gowers norm \|K\|_{d} of K (see Section 3 of this part of T. Tao’s notes on higher-order Fourier analysis for an introduction to these norms). This takes the form
\|K\|_{d}^{2^d}\ll p^{-1},
where the implied constant depends only (completely explicitly) on the conductor of K and on d, except (this is the inverse part in the title) when the sheaf \mathcal{F} which gives rise to K contains at least one Jordan-Hölder factor with trace function of the type
e\Bigl(\frac{P(x)}{p}\Bigr),
for some polynomial P of degree \leq d-1. These functions are the natural obstructions to having small Gowers norms (as already emphasized by Gowers), but one doesn’t usually get to have such strong structural statements as those we get: if K is geometrically irreducible, then the only possibility is that it is exactly proportional to a function of the type above (for all x).

None of the three of us can be said to be a great expert on the Gowers norms (which have been studied much more deeply by others, most spectacularly by Gowers and recently by Green, Tao and Ziegler), and this note is basically our attempt at seeing if the (fairly algebraic) definition could be studied using the sheaf formalism and the Riemann Hypothesis. But the final estimate is interesting in that, as far as the dependency on p is concerned, it is the same as one would get for a “random” function (in the model where we consider a function \varphi modulo p such that the \varphi(x) are independent uniformly bounded random variables with mean zero; we found this statement in the book of Tao and Vu), and it seems that no “deterministic” examples of such functions had been written down before. From our result, one can see for instance that
\|x\mapsto \chi(f(x))\|_d^{2^d}\ll p^{-1},
for any fixed non-constant polynomial f\in \mathbf{Z}[T], \chi being the Legendre character modulo p, with the implied constant depending only on \deg(f) and d (again, completely explicitly).
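As a small numerical sanity check of the last estimate (a sketch under my own choices, not taken from the paper), one can compute the Gowers norm directly from its definition in the simplest case d=2, f(x)=x. The prime p=31 and the function names are illustrative assumptions; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, with value 0 when p divides x."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def gowers_u2_fourth_power(values):
    """Return ||K||_2^4 for a real-valued function K on Z/pZ.

    Direct definition: the average over (x, h1, h2) of
    K(x) K(x+h1) K(x+h2) K(x+h1+h2); no conjugates are needed for real K.
    """
    p = len(values)
    total = 0
    for x in range(p):
        for h1 in range(p):
            for h2 in range(p):
                total += (values[x] * values[(x + h1) % p]
                          * values[(x + h2) % p] * values[(x + h1 + h2) % p])
    return Fraction(total, p ** 3)


p = 31  # illustrative choice of a small odd prime
chi = [legendre(x, p) for x in range(p)]
norm4 = gowers_u2_fourth_power(chi)
# For the Legendre symbol itself the value is exactly (p - 1) / p^2,
# so p * norm4 stays bounded, as the estimate predicts.
print(norm4, float(norm4 * p))
```

For this particular function the value is exactly (p-1)/p^2, which is of size p^{-1}; larger d or a more general f only changes the loops, at a cost of p^{d+1} operations.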

(3) “Algebraic trace functions over the primes”

The longest and deepest of our three papers continues the study of orthogonality of trace functions against other natural arithmetic sequences. After dealing with Fourier coefficients of modular forms in the first paper, we consider sums over primes, and sums against the Möbius function. Precisely, let K be a trace function modulo p. Say that K is p-exceptional if it is proportional to
K_{\chi,a}(x)=\chi(x)e(ax/p),
for some Dirichlet character \chi modulo p and some a\in\mathbf{F}_p (allowing trivial \chi and/or a=0.) Then, if K is not p-exceptional we have
\sum_{n\leq X}\Lambda(n)K(n)\ll X\Bigl(1+\frac{p}{X}\Bigr)^{1/12}p^{-1/48+\varepsilon}
for any \varepsilon>0, where the implied constant depends only on the conductor of K and on \varepsilon. The “critical” case is when X=p or is a bit smaller, in which case we therefore get cancellation with a power saving. It is well-known that one expects such a bound for a p-exceptional K also, but that this is essentially equivalent to proving the existence of a zero-free strip for some Dirichlet L-function, so that the restriction is natural in the current state of knowledge.

Similarly, we have
\sum_{n\leq X}\mu(n)K(n)\ll X\Bigl(1+\frac{p}{X}\Bigr)^{1/12}p^{-1/48+\varepsilon}
with the same conditions.

These estimates are rather sweeping: we can take any of the examples of trace functions explained in the previous post (making sure they are not exceptional, but for instance any irreducible sheaf of rank at least 2 is not exceptional, as is any rank 1 sheaf with a singularity not at 0 or \infty…). Although some specializations to specific trace functions had been already studied (sometimes with stronger exponents), we find the generality to be a really remarkable example of the power of the structural features coming from Deligne’s work, and from the formalism of algebraic geometry, which we use again extensively. Indeed, we need not only all the work of the previous paper on twists of Fourier coefficients of modular forms (applied to Eisenstein series), but we also had to establish some additional sheaf-theoretic properties.

To give an example, we get immediately that if \chi is a Dirichlet character of order h\geq 2, and f\in \mathbf{Z}[T] is a polynomial which is not proportional to an h-th power times a monomial (e.g., if f is squarefree) we have
\sum_{n\leq X}\Lambda(n)\chi(f(n))\ll X\Bigl(1+\frac{p}{X}\Bigr)^{1/12}p^{-1/48+\varepsilon}
where the implied constant depends only on \deg(f) and on \varepsilon. As far as we know, the only case previously treated (going back to Karatsuba) is the linear case f(X)=aX+b, with b\not=0…
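To make the shape of this sum concrete, here is a brute-force evaluation for one small case. This is only an illustrative sketch: the choices p=1009, X=p, f(n)=n^2+1 and the Legendre character are my assumptions, and a single numerical value of course says nothing about the exponent in the bound.

```python
from math import log


def von_mangoldt_table(limit):
    """Return [Lambda(0), ..., Lambda(limit)] using a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for i in range(2, int(limit ** 0.5) + 1):
        if spf[i] == i:  # i is prime
            for j in range(i * i, limit + 1, i):
                if spf[j] == j:
                    spf[j] = i
    lam = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        q = spf[n]
        m = n
        while m % q == 0:
            m //= q
        if m == 1:  # n is a power of the prime q
            lam[n] = log(q)
    return lam


def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, with value 0 when p divides x."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


p = 1009                    # illustrative prime modulus
X = p                       # the "critical" range X = p
f = lambda n: n * n + 1     # squarefree, hence not a square times a monomial

lam = von_mangoldt_table(X)
S = sum(lam[n] * legendre(f(n), p) for n in range(1, X + 1))
psi = sum(lam)              # trivial bound: sum of Lambda(n) for n <= X
print(S, psi, abs(S) / psi)
```

The ratio printed at the end compares the twisted sum with the trivial bound \sum_{n\leq X}\Lambda(n), which is about X.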

Among a number of applications, which can be found in the paper (and before we find others…), the following is also fairly nice: given f\in\mathbf{Z}[T] squarefree and non-constant, we have
\sum_{0\leq a<p-1} E(X,p,f(a))\ll X\Bigl(1+\frac{p}{X}\Bigr)^{1/12}p^{-1/48+\varepsilon}
where E(X,q,a) denotes in general the error term in the prime number theorem in arithmetic progressions
\sum_{p\leq X,\ p\equiv a\bmod q}1=\frac{\pi(X)}{\varphi(q)}+E(X,q,a)
and the implied constant depends only on \deg(f) and on \varepsilon. In fact, with a whiff of extra formalism, we can replace the sum over residue classes of the form f(a) taken with the multiplicity of representation by the corresponding sum over the residues of this form, without multiplicity.

Written by Kowalski

November 26th, 2012 at 11:37 pm

Posted in Mathematics

Orthogonality of columns of integral unitary operators: a challenge

Given a unitary matrix A=(a_{i,j}) of finite size, it is a tautology that the column vectors of A are orthonormal, and in particular that
\sum_{i} a_{i,j} \overline{a_{i,k}} =0
for any j\not=k. This has an immediate analogue for a unitary operator U\,:\, H\rightarrow H, if H is a separable Hilbert space: given any orthonormal basis (e_n)_{n\geq 1} of H, we can define the “matrix” (a_{i,j})_{i,j\geq 1} representing U by
U(e_j)=\sum_{i\geq 1}a_{i,j}e_i,
and the “column vectors” (a_{i,j})_{i\geq 1}, for distinct indices j, are orthogonal in the \ell_2-sense: we have
0=\langle e_j,e_k\rangle = \langle U(e_j),U(e_k)\rangle=\sum_{i}a_{i,j}\overline{a_{i,k}}
if j\not=k.
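The finite-dimensional statement is easy to check numerically. The following sketch uses the discrete Fourier transform matrix as the unitary matrix; that choice, the size n=7 and the function names are illustrative assumptions.

```python
import cmath


def dft_matrix(n):
    """Unitary DFT matrix with entries a[i][j] = exp(2*pi*i*(i*j)/n) / sqrt(n)."""
    return [[cmath.exp(2j * cmath.pi * i * j / n) / n ** 0.5 for j in range(n)]
            for i in range(n)]


def column_inner_product(a, j, k):
    """Compute sum_i a[i][j] * conj(a[i][k])."""
    return sum(row[j] * row[k].conjugate() for row in a)


n = 7
A = dft_matrix(n)
gram = [[column_inner_product(A, j, k) for k in range(n)] for j in range(n)]

# Distinct columns should be orthogonal, and each column should have norm 1.
max_off_diagonal = max(abs(gram[j][k])
                       for j in range(n) for k in range(n) if j != k)
max_diag_error = max(abs(gram[j][j] - 1) for j in range(n))
print(max_off_diagonal, max_diag_error)
```

Both printed numbers should be at the level of floating-point rounding error.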

Now assume that H is some L^2 space, say H=L^2(X,\mu), and U is an integral operator on H given by a kernel k\,:\, X\times X\rightarrow \mathbf{C}, so that
U(\varphi)(x)=\int_{X}\varphi(y)k(x,y)d\mu(y)
for \varphi \in L^2(X,\mu).
Intuitively, the values k(x,y) of the kernel form a kind of “continuous matrix” representing U. The question is: are its columns orthogonal? In other words, given y\not=z in X, do we have
\int_{X}k(x,y)\overline{k(x,z)}d\mu(x)=0?

If one remembers the fact that “nice” kernels define trace class integral operators in such a way that the trace can be recovered as the integral
\int_{X}k(x,x)d\mu(x)
over the diagonal (the basis of the trace formula for automorphic forms…), this sounds rather reasonable. There is however a difficulty: it is not so easy to write kernels k(x,y) which both define a unitary operator, and are such that the integrals
(\star)\quad\quad\quad\quad \int_{X}k(x,y)\overline{k(x,z)}d\mu(x)
are well-defined in the usual sense! For instance, the most important unitary integral operator is certainly the Fourier transform, defined on L^2(\mathbf{R},dx), and its kernel is
k(x,y)=e^{2i\pi xy},
for which the integrals above are all undefined in the Lebesgue sense. This is natural: if the kernel k(x,y) were square integrable on X\times X, for instance, the corresponding integral operator on L^2(X,\mu) would be compact, and its spectrum could not be contained in the unit circle (excluding the degenerate case of a finite-dimensional L^2-space.)

This probably explains why this question of orthogonality of column vectors is not to be found in standard textbooks. There are some examples however where things do work.

We consider the space H=L^2(\mathbf{R}^*,|x|^{-1}dx), and as in the previous post, we look at the unitary operator
T=\rho\Bigl(\begin{pmatrix}0&-1\\1&0\end{pmatrix}\Bigr),
where \rho is the principal series representation with eigenvalue 1/4 of \mathrm{PGL}_2(\mathbf{R}). The result of Cogdell and Piatetski-Shapiro already mentioned there shows that T is, indeed, a unitary operator given by a smooth kernel k(x,y)=j(xy) for some function j on \mathbf{R}^*. This function is explicit, and (as expected) not very integrable: we have
j(x)=\begin{cases}-2\pi \sqrt{x}Y_0(4\pi\sqrt{x})&\text{ for } x>0,\\4\sqrt{|x|}K_0(4\pi\sqrt{|x|})&\text{ for } x<0.\end{cases}

Since it is classical that Y_0(x)\approx x^{-1/2} for x\rightarrow +\infty, this function is neither integrable nor square-integrable. But, the function K_0 on [0,+\infty[ decays exponentially at infinity! This means that the integrals (\star), which are given by
\int_{\mathbf{R}^*}j(xy)\overline{j(xz)}\frac{dx}{|x|},
make perfect sense when y and z have opposite sign (this requires also knowing that there is no problem at 0, but that is indeed the case, because the Bessel functions here have just a logarithmic singularity there, and the factors \sqrt{|x|} eliminate the |x|^{-1} in the integral.)

It should not be a surprise then that we have
\int_{\mathbf{R}^*}j(xy)\overline{j(xz)}\frac{dx}{|x|}=0
for yz<0. This boils down to an identity for integrals of Bessel functions that can be found in (combinations of) standard tables, or it can be proved more conceptually by viewing
j(xy)=k(x,y)
as limit of
\frac{1}{2\epsilon}\int_{|u-y|<\epsilon} k(x,u)du,
which is T(f_{y,\epsilon}) for the function f_{y,\epsilon} which is the normalized characteristic function of the interval of radius \epsilon around y, and similarly for z. Since
\langle f_{y,\epsilon},f_{z,\epsilon}\rangle =0
when \epsilon is small enough, the unitarity gives
\int_{\mathbf{R}^*} Tf_{y,\epsilon}(x)\overline{Tf_{z,\epsilon}(x)}\frac{dx}{|x|}=0,
and one must take the limit \epsilon\rightarrow 0, which is made relatively easy by the exponential decay of K_0 at infinity…

This is nice, but here comes a challenge: if one spells out this identity in terms of Bessel functions, what needs to be done is equivalent to showing that the function
K(a, b)=\int_{0}^{+\infty}{Y_0(ax)K_0(bx)xdx}
defined for a,b>0, is antisymmetric: we have
K(a,b)=-K(b,a).
Now, this fact is an “elementary” property of classical functions. Can one prove it directly? (By which I mean, without using the operator interpretation, but also without using an explicit formula for the integral…) For the moment, I have not succeeded…

I’ll conclude by correcting a mistake in my previous post (it should not be a surprise to anyone that if I attempt to be as clever as Euler, I may stumble rather badly, and the correction is in some sense rather small compared with what one might expect)… There I claimed that the integral transform w\mapsto W appearing in the Voronoi formula for the divisor function is given by
|y|^{1/2}W(y)=T(|x|^{1/2}w(|x|)).
But this is not the case: the proper formula is
|y|^{1/2}W(y)=T(|x|^{1/2}\tilde{w}(x)),
where \tilde{w}(x)=w(x) if x>0, but \tilde{w}(x)=0 if x<0. This affects the final formula: we have
\|W\|^2=\|w\|^2,
instead of the claimed
\|W\|^2=2\|w\|^2
(the "proof" using the Fourier transform has the same mistake of using w(|xy|) instead of \tilde{w}(xy), so there is no contradiction between the informal argument and the rigorous one.)

Written by Kowalski

November 18th, 2012 at 9:43 pm

Posted in Exercise,Mathematics

Trace functions, II: Examples

Continuing after my last post, this one will be a list of examples of trace functions modulo some prime number p. For each of the examples, I will give a bound for its conductor, which I recall is the main numerical invariant that allows us to measure the complexity of the trace function K(n) (formally, the conductor is attached to the object \mathcal{F} that gives rise to K, but we can define the conductor of a trace function to be the minimal conductor of such a \mathcal{F}.) These objects \mathcal{F} will be called sheaves, since this is the language used in the paper(s) of Fouvry, Michel and myself, but one doesn’t need to know anything about sheaves to understand the examples.

I will start with a list of concrete functions which are trace functions, and then explain some of the basic operations one can perform on known trace functions to obtain new ones. All these examples will be (I hope) very natural, but it is usually a deep theorem that the functions come from sheaves.

Throughout, p is a fixed prime number. Generically, \psi denotes a non-trivial additive character modulo p, for instance
\psi(x)=e^{2i\pi x/p},
(which may also be viewed casually as an \ell-adic character), and \chi denotes a multiplicative character modulo p (non-trivial, unless specified otherwise.)

(1) Characters and mixed characters

Let f and g be non-zero rational functions in \mathbf{F}_p(T). Let
K(x)=\psi(f(x))\chi(g(x)),
for x which is not a pole of f, or a zero or pole of g, and K(x)=0 if x is such a pole or zero. Then K is a trace weight. The (or an) associated sheaf is of rank 1, and its conductor is bounded by the sum of degrees of numerators and denominators of f and g. However, the size of the conductor arises for different reasons for f and g: for the “additive” component f, singularities are poles of f, and the contribution of each pole x_0 comes from the Swan conductor, which is bounded by the order of the pole at x_0; for the “multiplicative” component g, the singularities are zeros and poles of g, and each only contributes 1 to the conductor: the Swan conductors for K_g=\chi(g(x)) are all zero.

For analytic applications, the main point is that, by fixing f and g over \mathbf{Q}, one obtains for each p large enough (so that the reduction modulo p makes sense), and each choice of characters \psi and \chi, a trace weight associated to f and g which has conductor uniformly bounded (depending on f and g only). Thus any estimates valid for all primes with implied constants depending only on the conductor of the trace functions involved will become an interesting estimate concerning f and g. This applies to the main theorem of my paper with Fouvry and Michel concerning orthogonality of Fourier coefficients of modular forms and trace functions…

These examples are the most classical, and are very useful. Even the simple case g=1 and f(X)=X^{-1} is full of surprises.

(2) Fiber-counting functions

Another very useful example comes from a fixed non-constant rational function f\in \mathbf{F}_p(T), which is viewed as defining a morphism
f\,:\, \mathbf{P}^1\rightarrow \mathbf{P}^1.
Consider then
K(x)=|\{y\in \mathbf{P}^1\,\mid\, f(y)=x\}|.
This is a trace weight, associated to the direct image sheaf
\mathcal{F}=f_*\bar{\mathbf{Q}}_{\ell},
which in representation theoretic terms is an induced representation from a finite-index subgroup, so that it remains relatively simple.
Here the rank r of the sheaf is the degree \deg(f) of f as a morphism (i.e., the generic number of pre-images of a point x); the singularities are the finitely many x in \mathbf{P}^1 such that the equation
f(y)=x
has fewer than r solutions (in \mathbf{P}^1(\bar{\mathbf{F}}_p)) and, at least if p>\deg(f), the Swan conductors vanish everywhere, so that the conductor is bounded in terms of the degrees of the numerator and denominator of f only. In particular, if f is defined over \mathbf{Q}, varying p (large enough) will provide a family of trace functions modulo primes with uniformly bounded conductor, similar to the characters of the previous example with fixed rational functions as arguments.

The main reason this function is useful is that, for any other (arbitrary) function \varphi on \mathbf{P}^1(\mathbf{F}_p), we have tautologically
\sum_{y}{\varphi(f(y))}=\sum_{x}{K(x)\varphi(x)}
(in other words, it is maybe better to interpret K as the image measure of the uniform measure on the finite set \mathbf{P}^1(\mathbf{F}_p) under f, and this formula is the classical “integration” formula for an image measure…)

One also often takes the function
\tilde{K}(x)=K(x)-1,
where 1 is the average of K over \mathbf{F}_p. This is also a trace function (the sheaf corresponding to K contains a trivial quotient, and this is the trace function of the kernel of the map to this trivial quotient). We now have
\sum_{x}{\tilde{K}(x)\varphi(x)}=\sum_{y}{\varphi(f(y))}-\sum_{x}{\varphi(x)}.
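Both identities are tautological, and a few lines of code make this visible. The sketch below assumes, for simplicity, that f is a polynomial and works on the affine line \mathbf{F}_p only (a polynomial sends \infty to \infty, so nothing is lost); the particular f, \varphi and p are arbitrary illustrative choices.

```python
def fiber_counts(f, p):
    """Return the list K with K[x] = number of y in F_p such that f(y) = x."""
    counts = [0] * p
    for y in range(p):
        counts[f(y) % p] += 1
    return counts


p = 13
f = lambda y: y ** 3 + 2 * y        # hypothetical example polynomial
phi = lambda x: (x * x + 5) % 7     # arbitrary test function on F_p

K = fiber_counts(f, p)

# Image-measure formula: sum_y phi(f(y)) = sum_x K(x) phi(x).
lhs = sum(phi(f(y) % p) for y in range(p))
rhs = sum(K[x] * phi(x) for x in range(p))

# Centered version with K~ = K - 1.
K_tilde = [k - 1 for k in K]
lhs_tilde = sum(K_tilde[x] * phi(x) for x in range(p))
rhs_tilde = lhs - sum(phi(x) for x in range(p))

print(K, lhs, rhs, lhs_tilde, rhs_tilde)
```

Note that sum(K) equals p, which is the statement that the average of K over \mathbf{F}_p is 1.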

(3) Number of points on families of algebraic varieties

More generally, we can count points on one-parameter families of algebraic varieties of dimension d\geq 1. For instance, families of elliptic curves or of more general curves are quite common. To be concrete, one may have a polynomial f\in \mathbf{F}_p[T,X,Y], where T is seen as the parameter, and consider the curves
C_t\,:\, f(t,X,Y)=0.
Usually, it is not so much the number of points as the correction term that is most interesting. For instance, if the curves are generically geometrically irreducible, and have a single point at infinity, the size of C_t(\mathbf{F}_p) is (for all but finitely many t) of the form
|C_t(\mathbf{F}_p)|=p-a(C_t),
where a(C_t) satisfies the Weil bound
|a(C_t)|\leq 2g(C_t)\sqrt{p},
in terms of the genus of C_t. In fact, once one ensures that the family of curves is such that the genus of the curves is the same g\geq 0 (for all but finitely many t), the function
K(t)=a(C_t)
is a trace function on the corresponding dense open set of \mathbf{A}^1, for some sheaf which has rank 2g. For the other values of t, the trace function of the corresponding middle-extension sheaf might differ from the value a(C_t) defined as above using the number of points, but since the number of those singularities is bounded by the conductor, one can usually (analytically at least) not worry too much about this. Similarly, in many cases the sheaf is tamely ramified everywhere (i.e., all Swan conductors vanish), and so the conductor is well-controlled.
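For a concrete feeling of such a function K(t)=a(C_t), here is a brute-force sketch for the Legendre family y^2=x(x-1)(x-t) of elliptic curves (genus 1, a single point at infinity). The family, the prime p=101 and the function names are my illustrative assumptions, not examples taken from the papers.

```python
def affine_point_count(t, p):
    """Number of (x, y) in F_p^2 with y^2 = x (x - 1) (x - t)."""
    square_roots = [0] * p          # square_roots[v] = number of y with y^2 = v
    for y in range(p):
        square_roots[y * y % p] += 1
    return sum(square_roots[x * (x - 1) * (x - t) % p] for x in range(p))


p = 101
# a[t] is defined by |C_t(F_p)| = p - a(C_t), counting affine points.
a = {t: p - affine_point_count(t, p) for t in range(p)}

# The curve is smooth of genus 1 for t != 0, 1, so |a(C_t)| <= 2 sqrt(p) there.
worst = max(abs(a[t]) for t in range(2, p))
print(worst, 2 * p ** 0.5)
```

The values t=0 and t=1, where the cubic has a double root, are exactly the kind of finitely many exceptional parameters mentioned above.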

In contrast with the first two examples, the construction of a sheaf with this trace function is not elementary: it is an example of the so-called “higher direct image sheaves” (with compact support). Since, for every “good” t, the Riemann Hypothesis for curves shows that
a(C_t)=\sqrt{p}(\theta_{1,t}+\cdots+\theta_{2g,t}),
where the \theta_{i,t} are complex numbers of modulus 1, we can interpret the existence of this sheaf as saying that the algebraic variation of the “eigenvalues” \theta_{i,t} is itself controlled by an algebraic object. This is one of the main insights that algebraic geometry (and étale cohomology in particular) brings to analytic number theory.

The family of elliptic curves
x+x^{-1}+y+y^{-1}+t=0
in my bijective challenge is of this type.

(4) Families of Kloosterman sums

One of the great examples, for analytic number theory, is given by families of Kloosterman sums: for an integer m\geq 1, and a non-zero a\in\mathbf{F}_p, we let
Kl_m(a)=\frac{(-1)^{m-1}}{p^{(m-1)/2}}\sum_{x_1\cdots x_m=a}e\Bigl(\frac{x_1+\cdots +x_m}{p}\Bigr).
The Weil bound for m=2, and the even deeper work of Deligne for larger m, prove that
|Kl_m(a)|\leq m
for all a invertible modulo p. Further work, relying once more on the powerful formalism of étale sheaves and higher direct images in particular, shows that the function
K(a)=Kl_m(a),
is (the restriction to invertible a of) a trace function for an irreducible sheaf, with conductor bounded in terms of m only.
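The normalization is easy to get wrong, so here is a brute-force sketch of Kl_m(a) following the definition above literally, together with a check of the bound |Kl_m(a)|\leq m for m=2,3. The prime p=13 and the function name are illustrative assumptions; the cost is (p-1)^{m-1} per value, so this is only for tiny cases.

```python
import cmath
from itertools import product


def kloosterman(m, a, p):
    """Normalized Kloosterman sum Kl_m(a) modulo a prime p, for a != 0 mod p."""
    total = 0
    for xs in product(range(1, p), repeat=m - 1):
        partial = 1
        for x in xs:
            partial = partial * x % p
        # The last variable is forced by the condition x_1 * ... * x_m = a.
        last = a * pow(partial, p - 2, p) % p
        total += cmath.exp(2j * cmath.pi * ((sum(xs) + last) % p) / p)
    return (-1) ** (m - 1) * total / p ** ((m - 1) / 2)


p = 13
values = {m: [kloosterman(m, a, p) for a in range(1, p)] for m in (2, 3)}
for m in (2, 3):
    print(m, max(abs(v) for v in values[m]))
```

For m=2 the values are real (replace x by -x to see that the sum equals its conjugate), which the code confirms up to rounding.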

(5) The Fourier transform

If we have a function K(x) modulo p, we define its Fourier transform by
\hat{K}(t)=\frac{1}{\sqrt{p}}\sum_{x\in \mathbf{F}_p}{K(x)e\Bigl(\frac{xt}{p}\Bigr)}
for t\in\mathbf{F}_p (the normalization here is convenient, as I will explain). It is now a very deep fact that, if K comes from a sheaf, then so does -\hat{K} (the minus sign is natural, but this has to do with rather deep algebraic geometry…) More precisely, one has to be careful because of the fact that the Fourier transform of an additive character (as a function) is a multiple of a delta function. The latter does fit nicely in the framework of étale sheaves, but not as a middle-extension sheaf or Galois representation (because it is zero on a dense open set, so it would have to be zero to be a middle-extension sheaf or to come from a Galois representation). There is a geometric solution to this issue, but it involves speaking of perverse sheaves and related machinery, which we have barely started to understand: the Fourier transform works perfectly well at the level of perverse sheaves, and one can use their trace functions just as well as those of Galois representations. Since, in our current applications, we can always deal separately with additive characters (or delta functions), we have avoided having to deal with perverse sheaves (up to now…)

The existence of the \ell-adic Fourier transform of sheaves was first proved by Deligne, but the theory of the sheaf-theoretic Fourier transform was largely built by Laumon (with further contributions, in particular, from Brylinski and Katz). To illustrate how powerful it is, consider
K(x)=e\Bigl(\frac{x^{-1}}{p}\Bigr),
a relatively simple case of Example (1). With the normalizations above, we then have, for x\not=0,
-\hat{K}(x)=Kl_2(x),
so that the existence of the Fourier transform at the level of sheaves implies the existence of the Kloosterman sheaf parameterizing classical Kloosterman sums as in the previous example.
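This relation can be checked by brute force, and doing so is a good way to keep track of signs. With the Fourier transform normalized as above (no sign in front) and Kl_2 carrying the factor (-1)^{m-1}=-1, a direct computation gives \hat{K}(t)=-Kl_2(t) for t\not=0, in line with the fact that it is -\hat{K} which is a trace function. The sketch below assumes p=17 and the convention K(0)=0 at the pole; the function names are mine.

```python
import cmath


def e_p(x, p):
    """The additive character e(x/p) = exp(2*pi*i*x/p)."""
    return cmath.exp(2j * cmath.pi * (x % p) / p)


def fourier_transform(K, p):
    """Return the list of hat K(t) = p^{-1/2} sum_x K(x) e(x t / p)."""
    return [sum(K[x] * e_p(x * t, p) for x in range(p)) / p ** 0.5
            for t in range(p)]


def kl2(a, p):
    """Kl_2(a) = -p^{-1/2} sum over x != 0 of e((x + a / x) / p)."""
    return -sum(e_p(x + a * pow(x, p - 2, p), p) for x in range(1, p)) / p ** 0.5


p = 17
# K(x) = e(x^{-1}/p) for x != 0, and K(0) = 0 at the pole of 1/x.
K = [0] + [e_p(pow(x, p - 2, p), p) for x in range(1, p)]
K_hat = fourier_transform(K, p)

max_gap = max(abs(-K_hat[t] - kl2(t, p)) for t in range(1, p))
print(max_gap)
```

The printed gap should be at the level of rounding error; at t=0 the transform is just a normalized Ramanujan-type sum and is not compared with anything here.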

Other examples that arise from our previous examples are many families of exponential sums, for instance
K(t)=\frac{1}{\sqrt{p}}\sum_{x\in\mathbf{F}_p}{\psi(f(x)+tx)\chi(g(x))},
(arising from Example (1); one must assume either that f(x) is not a polynomial of degree \leq 1 or that \chi is non-trivial to have a well-defined sheaf), or
K(t)=\frac{1}{\sqrt{p}}\sum_{x}{e\Bigl(\frac{tf(x)}{p}\Bigr)},
for t\not=0 with K(0) equal to the number of poles of f (the sum over x is over values where the rational function f is defined), that arises from Example (2) (applied with the function \tilde{K}).

This operation of Fourier transform has one last crucial feature for applications to the analysis of trace functions: the conductor of \hat{K} is bounded in terms of that of K only. This is something we prove in our paper using Laumon’s analysis of the singularities of the Fourier transform, and in fact we show that if the conductor of K is at most M\geq 1, then the conductor of \hat{K} is at most 10M^2. Hence the examples above, if the rational functions f (and/or g) are fixed in \mathbf{Q}(T) and then reduced modulo various primes, always have conductor bounded uniformly for all p.

(6) Change of variable

Given a non-constant rational function f\in\mathbf{F}_p(T) seen as a morphism
\mathbf{P}^1\rightarrow \mathbf{P}^1,
and a trace function K(x), one can form the function
f^*K(x)=K(f(x)).
This is again, essentially, a trace function: as in Example (3), one may have to tweak the values of f^*K at some singularities (because pull-backs of middle-extension sheaves do not always remain so), but this is fairly easily controlled. Moreover, one can also control the conductor of f^*K in terms of that of K, taking into account the degree of f. A particularly simple case of great importance is when f is a homography
f(x)=\frac{ax+b}{cx+d},\quad\quad\quad ad-bc\not=0,
(an automorphism of \mathbf{P}^1) in which case no tweaking is necessary to define f^*K, and the conductor is the same as that of K (which certainly seems natural!)
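That a homography really permutes the p+1 points of \mathbf{P}^1(\mathbf{F}_p) can be seen in a few lines. In this sketch the marker INF for the point at infinity, the coefficients (2,3,1,4) and the prime p=11 are illustrative assumptions.

```python
INF = None  # hypothetical marker for the point at infinity of P^1(F_p)


def homography(a, b, c, d, p):
    """Return x -> (a x + b) / (c x + d) as a map on P^1(F_p), p prime."""
    if (a * d - b * c) % p == 0:
        raise ValueError("ad - bc must be non-zero modulo p")

    def f(x):
        if x is INF:
            # f(infinity) = a / c, or infinity when c = 0.
            return INF if c % p == 0 else a * pow(c, p - 2, p) % p
        den = (c * x + d) % p
        if den == 0:
            return INF
        return (a * x + b) * pow(den, p - 2, p) % p

    return f


p = 11
points = list(range(p)) + [INF]
f = homography(2, 3, 1, 4, p)
image = [f(x) for x in points]
print(image)

# Pull-back of a function K on P^1(F_p), stored as a dict: (f^*K)(x) = K(f(x)).
K = {x: (0 if x is INF else x * x % p) for x in points}
pullback = {x: K[f(x)] for x in points}
```

Since f is a bijection, f^*K takes the same values as K, only in a different order.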

We can now compose these various operations. One construction is the following (a finite-field Bessel transform): start with K, apply the Fourier transform, change the variable t to t^{-1}, apply again the Fourier transform. If we call \check{K} the resulting function, the examples above show that if K is a trace function with conductor \leq M, then \check{K} will also be one, and its conductor will be bounded solely in terms of M (in fact, applying the bound discussed in Example (5) twice, and using that the change of variable does not change the conductor, it will be \leq 10(10M^2)^2=1000M^4).



Trailer! In the next post in this series, I will discuss the Riemann Hypothesis for trace functions and its applications. But probably before I will discuss the more recent works of Fouvry, Michel and myself, since we now have three further papers in our series — two small, and one big.

Written by Kowalski

November 14th, 2012 at 6:22 pm

Posted in Mathematics

On Weyl groups and gaussians

Am I the last person to notice that for k\geq 0, the even moment
m_{2k}=\frac{(2k)!}{2^kk!}
of a standard gaussian random variable (with expectation zero and variance one) is the same as the index of the Weyl group of \mathrm{Sp}_{2k} inside the Weyl group of \mathrm{GL}_{2k} (in other words, the index of the group of permutations of 2k elements commuting with a fixed-point free involution among all permutations)?

If “Yes”, what else have I been missing in the same spirit?
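The coincidence can be confirmed by brute force for small k. This sketch counts the permutations commuting with one specific fixed-point free involution (the one exchanging 2i and 2i+1, an arbitrary choice since all such involutions are conjugate) and compares the index with the moment formula; the function names are illustrative.

```python
from itertools import permutations
from math import factorial


def gaussian_even_moment(k):
    """m_{2k} = (2k)! / (2^k k!) for a standard gaussian random variable."""
    return factorial(2 * k) // (2 ** k * factorial(k))


def centralizer_size(k):
    """Number of permutations of {0, ..., 2k-1} commuting with i -> i XOR 1."""
    n = 2 * k
    tau = [i ^ 1 for i in range(n)]  # swaps 0<->1, 2<->3, ...
    return sum(1 for s in permutations(range(n))
               if all(s[tau[i]] == tau[s[i]] for i in range(n)))


for k in range(4):
    index = factorial(2 * k) // centralizer_size(k)
    print(k, gaussian_even_moment(k), index)
```

The centralizer has order 2^k k!, the order of the Weyl group of \mathrm{Sp}_{2k}, and the index gives 1, 1, 3, 15 for k=0,1,2,3.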

Written by Kowalski

November 7th, 2012 at 4:50 pm

Posted in Exercise,Mathematics

Euler style

Courtesy of the divisor function, here is another fun example of reasoning in the great style of Euler (the last installment is rather old…) A classical tool to study the distribution of values of d(n) (the number of positive divisors of n) is the Voronoi summation formula, which expresses a sum

S(w,c,a)=\sum_{n\geq 1}d(n)w(n)e\Bigl(\frac{an}{c}\Bigr),

for a nice test function w, some positive integer c\geq 1, and some integer a coprime to c, in terms of a “dual sum”

S(W,c,\bar{a})=\sum_{m\in \mathbf{Z}-\{0\}}{d(|m|)W(m/c^2)e\Bigl(\frac{\bar{a}m}{c}\Bigr)},

where \bar{a} is the inverse of a modulo c, and

W(y)=\int w(|x|) k(xy)dx

is some integral transform of w, with kernel k(y) involving the classical Bessel functions Y_0 and K_0. Precisely, we have

k(y)=\begin{cases} -2\pi  Y_0(4\pi \sqrt{y})&\text{ if } y>0\\ 4 K_0(4\pi\sqrt{|y|})&\text{ if } y<0\end{cases},

and one should add that there is also a main term in the Voronoi formula, but it is irrelevant for today's story. A classical application of this formula is to improve the error term in Dirichlet's asymptotic evaluation of

\sum_{n\leq X}d(n),

which was done indeed by Voronoi.

In an ongoing work with É. Fouvry, S. Ganguly and Ph. Michel, we needed to know some unitarity property of the transformation

w \mapsto W.

This is an entirely classical question, but we didn't find a ready-made statement in Watson’s book on Bessel functions. There is however a formal argument that suggests the answer: if we consider the function g(x,y) of two real variables defined by

g(x,y)=w(|xy|),

then it turns out that we have

\hat{g}(u,v)=W(uv),

where \hat{g} is the standard Fourier transform of g (this is contained in Section 4.5 of the book of H. Iwaniec and myself.) Hence we have, by the unitarity of the Fourier transform, the identity

\int \int |w(|xy|)|^2dxdy = \int\int |W(uv)|^2dudv.

Offhandedly, by changing variables, this means that

\int |w(|t|)|^2 dt \times I = \int |W(s)|^2 ds \times I,

which would give

2\|w\|^2= \|W\|^2\quad\quad\quad\quad\quad\quad (\star)

(the factor 2 comes from the fact that w is extended to an even function on \mathbf{R} from its original source as a function defined for non-negative real numbers), if not for the fact that the “constant” I is the integral

I=\int \frac{dx}{|x|}.

Alas, it diverges, although probably Euler would write it as I=4\log (\infty) (two infinities from the divergence at 0^{\pm}, the other two from the divergence at \pm \infty), and be happy with the outcome.

One can then prove rigorously the formula (\star) by truncation arguments, but here is a more conceptual argument (which offers the advantage of being something we can just quote), which follows from the interpretation of the Voronoi formula in terms of the representation theory of G=\mathrm{SL}_2(\mathbf{R}). What happens is that there exists a unitary representation \rho of G (the principal series with Casimir eigenvalue 1/4) which can be represented as acting on the Hilbert space H=L^2(\mathbf{R},|x|^{-1}dx) (the Kirillov model) in such a way that the unitary operator

T=\rho\Bigl(\begin{pmatrix}0&-1\\1&0\end{pmatrix}\Bigr)

is given by an integral operator

(T\varphi)(x)=\int \varphi(y) j(xy)\frac{dy}{|y|}

for some function j, which Cogdell and Piatetski-Shapiro called the Bessel function of \rho (see this note of Cogdell for a short explanation of this, with the analogues for finite fields and p-adic fields). Now, by direct inspection of the formula for j(y) that Cogdell and Piatetski-Shapiro computed, and comparison with the kernel k(y) in the Voronoi formula, one finds that

W(y)=|y|^{-1/2} T( x\mapsto \sqrt{|x|} w(|x|) )

(in this other short note, Cogdell explains why it is no coincidence that this abstract Bessel function appears in the Voronoi summation formula). Now, from

\int |\varphi(x)|^2 \frac{dx}{|x|}=\int |T(\varphi)(x)|^2\frac{dx}{|x|},

which holds for all \varphi\in H because T is unitary on H, we deduce exactly (\star).

Remark. There is a completely similar story where the circles x^2+y^2=a replace the hyperbolas xy=a, or in other words, if one defines
g(x,y)=w(x^2+y^2).

Then the Fourier transform of g is still a radial function W(u^2+v^2), and the map w\mapsto W is a Hankel transform (it involves the Bessel function J_0). Its unitarity follows then immediately from that of the Fourier transform, since the analogue of the divergent integral I is now, indeed, a finite constant.

In terms of representation-theory, the story is the same as above, except that the representation \rho is replaced with a discrete series representation. One can also deal similarly with radial functions in higher-dimensional euclidean spaces, which involves other discrete series representations.

Written by Kowalski

October 28th, 2012 at 2:53 pm

Posted in Mathematics