Stepanov’s method, Bombieri’s version, and the Riemann-Roch theorem

After a very nice week in Göttingen, I continued my course on exponential sums over finite fields by starting the proof of the Riemann Hypothesis for character sums of the type

\sum_{x\in\mathbf{F}_q}{\chi(g(x))},

where χ is a non-trivial multiplicative character and g a polynomial with coefficients in the finite field. My first intention was to present Stepanov’s original method. I even typed (as one can still see in the script) the necessary technical section concerning the Hasse derivatives that he used to detect zeros of high order of polynomials over finite fields, before deciding to switch instead to a specialization of Bombieri’s interpretation of the method. I did this mainly for two reasons:

    (1) The standard Stepanov method is already presented in my book with Iwaniec (at least for the important special case where χ is of order 2), and I prefer to avoid, as much as possible, repeating either what I write or the way I teach some subject;
    (2) Although Bombieri’s version depends, in its full generality, on the basic theory of algebraic curves, especially the Riemann-Roch theorem, the special case involved here can be done essentially by hand, and doing so provides a way to start motivating how and why deeper algebraic geometry can become useful for exponential sums over finite fields.

More precisely, Stepanov’s idea for counting points on an algebraic curve given in the plane (with coordinates x and y) was to construct an auxiliary polynomial in the variable X which vanishes, to high order, at all x which are the first coordinate of a point on the curve. This works particularly well for special equations like

y^d=g(x),

which are precisely those coming into play in the study of the multiplicative exponential sums above, with d equal to the order of the character. In Bombieri’s method, this auxiliary polynomial is replaced with a function which lives directly on the curve, and vanishes to high order at the points with coordinates in the finite field. This adds flexibility, and in particular the order of the zeros is shown to be large by constructing this function as a high-power of another one, whereas Stepanov has to deal with the Hasse derivatives to check that the vanishing is of suitable order.

To ensure that the auxiliary function is “not too big” (and thus bound the number of zeros, and this way the number of rational points), Bombieri has to use an analogue of the degree of a polynomial as measure of complexity. This is provided with the order of the pole of the function at an auxiliary (rational) point on the curve; since he works with projective curves, using the fact that a non-zero rational function has the same number of zeros and poles when multiplicity is taken into account means that controlling the poles gives an upper bound on the number of zeros.

Simplifications arise for the equations above because the auxiliary point can be taken to be at infinity. When the order d and the degree of the polynomial g are coprime, the order of the pole at infinity of a function

f(X,Y)=\sum_{i=0}^{d-1}{f_i(X)Y^i},\quad\quad f_i(X)\in \mathbf{F}_q[X],

on the curve (which is clearly the only possible pole) is given by

\deg(f)=\max\{ d\deg(f_i)+i\deg(g)\,:\, 0\leq i\leq d-1,\ f_i\not=0\},

simply because the “local” degrees in the maximum are distinct when i varies over the indices where the coefficient of Yi is non-zero. To perform the construction, one needs to understand which non-negative integers do appear as degrees of functions on the curve of this type. This is exactly what the Riemann-Roch theorem gives: for k large enough (in terms of the genus of the curve), one has

\dim \{f\,:\, \deg(f)\leq k\} =k+1-\gamma,

where the non-negative constant γ is an invariant of the curve, in fact precisely its genus. This should be compared with the case of polynomials of one variable: the dimension of the space of those polynomials with degree at most k is exactly k+1, corresponding to the projective line being a curve of genus 0. The Riemann-Roch theorem shows, in that special case, that the functions on the curve with a pole only at infinity behave, in this respect, almost like the standard polynomials.

With the definition of the degree above, this result is in fact a quite simple exercise with a combinatorial flavour, and this is what I used in the course.

The functional equation for exponential sums

I am continuing my class on exponential sums (and the current notes are still available on the web page). Before spring break, and a dive into Stepanov’s method to prove the Riemann Hypothesis for one-variable sums, the last important thing to do was to set up the L-functions associated to those sums, so that the fact that it is a polynomial (in q-s, if seen as a complex-analytic function) can be used to bootstrap Stepanov’s upper bound of the right-order of magnitude to the “right” upper bound. (Roughly, one can show using this method results like

|S|=|\sum_{x\in\mathbf{F}_{p^n}}{\chi(N_{\mathbf{F}_{p^n}/\mathbf{F}_p}(x^3+ax+b))|\leq Cp^{n/2},

for some large constant C, assuming the cubic polynomial X3+aX+n is squarefree, and knowing a priori that there exists a complex number α such that

S=\alpha+q\alpha^{-1}

is enough to show that C can be replaced by 2, which is then optimal).

There are a number of proofs of the functional equation that I know, but I ended up using an ad-hoc argument which has its charm. This allowed me to avoid mentioning either adèles (and the Poisson summation formula thereupon) or the Riemann-Roch theorem, in keeping with my focus this semester on elementary methods. Once more, I seriously doubt that the technique is new, but I noticed that Schmidt’s book (which I use as a guiding text for this part of my lectures) does not prove the functional equation for exponential sums in any great generality. He just computes the leading term of the polynomial in some cases. In fact, part of the reason I wanted a different argument is that, by analogy with classical Dirichlet characters, one knows (or expects) this leading term to be more or less a Gauss sum, and although Schmidt’s argument immediately reminded me of the computation of the modulus (square) of a Gauss sum, he does not interpret it explicitly in this way.

More or less, the argument I used is as follows: one has a non-trivial character

\eta\,:\, \mathbf{F}_q[X]\rightarrow \mathbf{C}

of a polynomial ring over a finite field, analogous to a Dirichlet character, defined modulo some principal ideal

I=g\mathbf{F}_q[X],\quad\quad\quad \deg(g)=d\geq 1,

(meaning that η(f) only depends on the class of f modulo g), which is primitive. Then the L-function is defined by the formal power series

L(\eta,T)=\sum_{f\text{ monic}}{\eta(f)T^{\deg(f)}},

and a very simple orthogonality argument (based on the non-triviality of the character) shows that in fact

L(\eta,T)=1+c_1(\eta)T+\cdots + c_{d-1}(\eta)T^{d-1}

is a polynomial of degree at most d-1. Then one “computes” the coefficients by

c_j(\eta)=\quad\sum_{f\text{ monic, } \deg(f)=j}{\ \ \eta(f)},

and the question is to simplify this coefficient — in fact, the functional equation means that it is related to

c_{d-1-j}(\bar{\eta}),

the coefficient of complementary degree for the dual character. The idea is then to express the conditions

f\text{ monic},\quad  \deg(f)=j\leq d-1,

which are conditions on the coefficients of the polynomial f, using additive characters of the finite field to create a sum over all f in the finite ring

R=\mathbf{F}_q[X]/(g),

twisted by the various additive characters. The latter involve

d-1-j

variables (corresponding to the number of conditions to detect among polynomials of degree at most d-1), and those can be parameterized by polynomials of degree d-1-j over the finite field. Such a mixture of additive and multiplicative characters is of course rather conducive to the blooming of Gauss sums, and indeed, it turns out to lead, without much difficulty, to the result… (This is Proposition 4.15, page 45, of my notes).

Pocket Kloostermania

It has been well said that Kloosterman sums are everywhere. Back in August, I showed how to visualize them on an ordinary laptop or desktop computer — software available also on its own page.

However, modern enlightened thought holds that availability on even the smallest of netbooks is not a good measure of ubiquity. Hence, without further ado, I am happy to introduce Pocket Kloostermania, the Android version. An installable package file can be downloaded from this link; it, and the source code, are also available now on the Kloostermania page; the license remains GPLv2.

Here’s what it looks like on the emulator.

Features/user manual are:

* The program has been installed and tested only on a Nexus One; it probably requires at least Android 2.0, but should be adaptable to earlier ones.

* On startup, the program displays the graph for S(1,1;173).

* Changes of orientation of the screen are recognized.

* Pressing the Menu soft key brings three choices (as seen on the picture):

    Modulus and sum displays on the screen the modulus of the current sum (the two other parameters are always 1 and 1) as well as its value;
    New sum shows a text input dialog where a new modulus can be entered; to finish input, press either Done on the soft keyboard, or the Back soft key, and the new sum will be drawn (with a modulus rounded to the nearest prime number);
    About gives the version number and license information.

More Cauchy distribution

I’ve mentioned earlier the first cases I had found of (almost) mod-Cauchy convergence (see this post for the definition).

Yesterday, following from a citation in Sarnak’s note to one in a paper of Guivarc’h and Le Jan, I ended up looking at this paper of F. Spitzer, which implicitly proves a true mod-Cauchy convergence result. Morever, this result is very interesting from the probabilistic point of view, since it concerns that most beautiful of objects, the (complex) Brownian motion.

Here is the result: consider such a complex Brownian motion

W(t)=W_1(t)+iW_2(t),

for non-negative t, where the real and imaginary parts are themselves independent standard Brownian motions on the real line, except that the real part is started at

W_1(0)=R,

for some fixed R> 0 (for instance R=1).

Because almost surely the Brownian motion is not zero, one can define a continuous process

\theta_R(t)=\mathrm{Im} \log W(t),\quad\quad \theta_R(0)=0,

which measures the argument of the Brownian motion and its winding around the origin. Spitzer’s result is then the fact that

\frac{\theta_R(t)}{\log \sqrt{t}}

converges in law, as t goes to infinity, to a Cauchy distribution with parameter 1 (it had already been noticed by P. Lévy that the argument of the Brownian motion is not square-integrable).

However, his argument is more precise: using relations with suitable PDE’s, he manages to compute exactly the Fourier transform of the law of the argument in terms of Bessel functions: for u≥0 (this characteristic function is even), we have

\mathbf{E}(e^{iu\theta_R(t)})=\sqrt{\frac{\pi R^2}{8t}}e^{-R^2/(4t)}\Bigl\{I_{(u-1)/2}\Bigl(\frac{R^2}{4t}\Bigr)+I_{(u+1)/2}\Bigl(\frac{R^2}{4t}\Bigr)\Bigr\}

where Iν(x) is the I-Bessel function of index ν.

Since the first term in the power series expansion (at 0) of the Bessel function is given by

I_{\nu}(x)=\frac{(x/2)^{\nu}}{\Gamma(\nu+1)}+O(x^{\nu+1}),

it follows easily that we have a mod-Cauchy convergence:

\exp(A(t)|u|) \mathbf{E}(e^{iu \theta_R(t)})\longrightarrow  \Phi(u),

as t goes to infinity, with parameters and limiting function given by

A(t)=\log\frac{\sqrt{8t}}{R},\quad\quad \Phi(u)=\frac{\sqrt{\pi}}{\Gamma((|u|+1)/2)}=\frac{\Gamma(1/2)}{\Gamma((|u|+1)/2)}.