Meta-statistics

One of the most unfortunate developments of modern football, and a clear symptom of the decline of civilization, is the regrettable irruption into the commentary of a deluge of factoids that manage to simultaneously give a bad name to team sports and to statistics (“This is the first time in twenty-one competitive games played in the Southern Hemisphere that a French Number 10 player’s backward pass from the left foot has been intercepted by a Dutch player born in Amsterdam”). Roger Couderc certainly did not need this to make a game come to life (of course, technically, he was advantaged by the fact that he was commenting on the much greater game of Rugby).

I am waiting impatiently for a more refined approach that will include meta-statistics:

Did you know? France has never lost a game where a single US-based newspaper presented more than three human-interest stories concerning the opposing team players in the three days before the game.

Did you know? It is the first time since the invention of the personal computer that the BBC has listed more than 243 statistical facts about a game.

Did you know? Three out of four statistical facts about the Italy-Switzerland game have involved numbers larger than thirteen.

These will be the days…

More things of the day(s)

(1) Today’s Word of The Day in the OED: afanc, which we learn is

In Welsh mythology: an aquatic monster. Also: an otter or beaver identified as such a monster.

Maybe the Welsh otters, like their rugbymen, are particularly fierce?

(2) Yesterday’s Google doodle, in Switzerland at least, celebrated the 57th birthday of Gaston Lagaffe.

I’ve heard that Gaston is mostly unknown to Americans and the English, leaving many people with no reaction to the mention of the contrats de Mesmaeker or to the interjection Rogntudju!

This lack of enlightenment is a clear illustration of the superiority of the continental mind.

James Maynard, auteur du théorème de l’année

How many times in a year is an analytic number theorist supposed to faint from admiration? We’ve learnt of the full three-primes Vinogradov theorem by Helfgott, then of Zhang’s proof of the bounded gap property for primes. Now, from Oberwolfach, comes the equally (or even more) amazing news that James Maynard has announced a proof of the bounded gap property that not only requires nothing more than the Bombieri-Vinogradov theorem as input concerning the distribution of primes in arithmetic progressions, but also obtains a gap smaller than 700 (in fact, even better when using optimal narrow k-tuples), whereas the efforts of the Polymath8 project had only led to 4680, using quite a bit of machinery.

(The preprint should be available soon, from what I understand, and with it the possibility of a full independent verification of these results.)

Two remarks, one serious, one not (the reader can guess which is which):

(1) Again, from friends in Oberwolfach (teaching kept me, alas, from being able to attend the conference), I heard that Maynard’s method leads to the bounded gap property (with increasing bounds on the gaps) using as input any positive exponent of distribution for primes in arithmetic progressions (where Bombieri-Vinogradov means exponent 1/2; incidentally, this also means that the Generalized Riemann Hypothesis is strong enough to get bounded gaps, which did not follow from Zhang’s work). From the point of view of modern proofs, there is essentially no difference between positive exponent of distribution and exponent 1/2, since either property would be proved using the large sieve inequality and the Siegel-Walfisz theorem, and it makes little sense to prove a weaker large sieve inequality than the one that gives exponent 1/2. Question: could one conceivably even dispense with the large sieve inequality, i.e., prove the bounded gap property only using the Siegel-Walfisz theorem? This is a bit of a rhetorical question, since the large sieve is nowadays rather easy, but maybe the following formulation is of some interest: do we know an example of an increasing sequence of integers n_k, not sparse, not weird, that satisfies the Siegel-Walfisz property, but has unbounded gaps, i.e., \liminf (n_{k+1}-n_k)=+\infty?

(2) There are still a bit more than two months to go before the end of the year; will a bright PhD student rise to the challenge, and prove the twin prime conjecture?

[P.S. Borgesian readers will understand the title of this post, although a Spanish version might have been more appropriate…]

Algebraic twists of modular forms, III

After a small break while traveling, I resume the discussion of my paper with Fouvry and Michel, continuing from here. But first of all, I’m happy to mention two proofs of the counting identities mentioned in my introductory post: one that D. Zywina has sent me, a nice proof based on explicit computations of modular forms and functions, and another that is in fact just an application of a more general result of Deligne and Flicker on local systems on the projective line minus four points (see Section 7 of this preprint).


Now, if memory serves, at the end of the last post, we saw that analytic techniques reduce the study of our sums

S(f;K)=\sum_{n}\rho_f(n)K(n)V(n/p)

to the study of correlation sums

C(K;\gamma)=\sum_{z}\overline{\hat{K}(\gamma\cdot z)}\hat{K}(z),

where \hat{K} is the normalized Fourier transform modulo p.

For these, we need to show that “most” are small in order to conclude. Fixing a parameter M\geq 1, let us therefore define G_{K,M} to be the set of matrices \gamma in \mathrm{PGL}_2(\mathbf{F}_p) such that

|C(K;\gamma)|>Mp^{1/2}.
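For readers who want to experiment, here is a minimal numerical sketch of these correlation sums. It rests on assumptions that this post does not fix: the Fourier transform is normalized as \hat{K}(z)=p^{-1/2}\sum_x K(x)e(xz/p), the sum defining C(K;\gamma) runs over those z in \mathbf{F}_p with \gamma\cdot z\neq\infty, and the helper names (e_p, fourier_transform, mobius, correlation_sum) are purely illustrative. The example weight is K(n)=e(\bar{n}/p) with K(0)=0, and p=101 and M=2 are arbitrary choices.

```python
import cmath
import math


def e_p(x, p):
    """Additive character e(x/p) = exp(2*pi*i*x/p)."""
    return cmath.exp(2j * math.pi * (x % p) / p)


def fourier_transform(K, p):
    """Normalized Fourier transform mod p (assumed convention):
    K_hat(z) = p^(-1/2) * sum_x K(x) e(x*z/p)."""
    return [
        sum(K[x] * e_p(x * z, p) for x in range(p)) / math.sqrt(p)
        for z in range(p)
    ]


def mobius(gamma, z, p):
    """Action of gamma = (a, b, c, d) on z in F_p.
    Returns None when gamma.z is the point at infinity."""
    a, b, c, d = gamma
    den = (c * z + d) % p
    if den == 0:
        return None
    return (a * z + b) * pow(den, -1, p) % p


def correlation_sum(K_hat, gamma, p):
    """C(K; gamma) = sum_z conj(K_hat(gamma.z)) * K_hat(z),
    over z in F_p with gamma.z finite (assumed convention)."""
    total = 0
    for z in range(p):
        w = mobius(gamma, z, p)
        if w is not None:
            total += K_hat[w].conjugate() * K_hat[z]
    return total


p = 101  # arbitrary small prime
M = 2    # arbitrary threshold parameter

# Example weight K(n) = e(n^{-1}/p) for n != 0, and K(0) = 0.
K = [0] + [e_p(pow(n, -1, p), p) for n in range(1, p)]
K_hat = fourier_transform(K, p)

c_identity = correlation_sum(K_hat, (1, 0, 0, 1), p)  # gamma = identity
c_lower = correlation_sum(K_hat, (1, 0, 1, 1), p)     # z -> z/(z+1)

# Membership test for the set G_{K,M} defined above.
in_G_identity = abs(c_identity) > M * math.sqrt(p)
in_G_lower = abs(c_lower) > M * math.sqrt(p)
```

For the identity matrix, the Plancherel formula gives C(K;1)=\sum_n |K(n)|^2=p-1, so the identity always lies in G_{K,M} once p is large compared with M; for any other \gamma, the Cauchy-Schwarz inequality only guarantees |C(K;\gamma)|\leq p-1, and the whole point is to do better.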

The discussion in the previous post shows that if, for some M\geq 1, all matrices in G_{K,M} are upper-triangular, we get

S(f;K)\ll p^{1-1/8+\epsilon}

for any \epsilon>0. But we can deal with some more exceptions. More precisely, we show that this estimate is still valid, with an implied constant depending only on M, if the matrices in G_{K,M} are all either

  1. parabolic (they have a single fixed point in \mathbf{P}^1), or
  2. upper-triangular, or
  3. elements that either fix or permute two distinct points \{x,y\}, the pair being taken from a finite list containing at most M pairs.
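The three types of allowed exceptions can be tested by brute force for a small prime. The following is only a sketch under my own conventions: a matrix is encoded as a tuple (a, b, c, d) of integers with invertible determinant modulo p, the point at infinity of \mathbf{P}^1 is represented by the string "inf", and the function names are hypothetical.

```python
INF = "inf"  # stand-in for the point at infinity of P^1(F_p)


def act(gamma, z, p):
    """Action of gamma = (a, b, c, d) on a point z of P^1(F_p)."""
    a, b, c, d = gamma
    if z == INF:
        # gamma.inf = a/c, or inf again when c = 0
        return INF if c % p == 0 else a * pow(c, -1, p) % p
    den = (c * z + d) % p
    if den == 0:
        return INF
    return (a * z + b) * pow(den, -1, p) % p


def fixed_points(gamma, p):
    """All points of P^1(F_p) fixed by gamma, by exhaustive search."""
    points = list(range(p)) + [INF]
    return [z for z in points if act(gamma, z, p) == z]


def is_parabolic(gamma, p):
    """Case (1): a single fixed point in P^1."""
    return len(fixed_points(gamma, p)) == 1


def is_upper_triangular(gamma, p):
    """Case (2): lower-left entry vanishes modulo p."""
    return gamma[2] % p == 0


def preserves_pair(gamma, pair, p):
    """Case (3): gamma fixes or permutes the two points of the pair."""
    x, y = pair
    return {act(gamma, x, p), act(gamma, y, p)} == {x, y}


p = 7
translation = (1, 1, 0, 1)  # z -> z + 1
dilation = (2, 0, 0, 1)     # z -> 2z
inversion = (0, 1, 1, 0)    # z -> 1/z

print(fixed_points(translation, p), is_parabolic(translation, p))
print(fixed_points(dilation, p), preserves_pair(dilation, (0, INF), p))
print(preserves_pair(inversion, (0, INF), p))
```

Here the translation is parabolic (and also upper-triangular), the dilation fixes both 0 and infinity, and the inversion swaps them, so the last two fall under case (3) for the pair \{0,\infty\}.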

Compared with the previous discussion, the subtlety here is that it can indeed happen that such matrices appear in the transformations we apply to S(f;K), and proving that their contributions remain under control involves some rather fun analysis.

This summarizes, in a rather hurried fashion, the first part of our paper. Logically, we obtain statements which are self-contained, but which are only applicable directly in a few cases (the case K(n)=\chi(n) is an excellent example where this can be done). To go further, we need to use algebro-geometric tools.

And here comes a dilemma well-known to anyone who has had to present research involving two relatively distant areas of mathematics, so that specialists of one may not know the other. I can give a concise definition of the class of weights K(n) that we consider, and it will be immediately familiar and natural to algebraic geometers — but not to most analytic number theorists.

So, instead, I will just say (very fast) that these weights, which we call “irreducible trace weights”, are trace functions of geometrically irreducible \ell-adic middle-extension Fourier sheaves pointwise pure of weight 0 on \mathbf{A}^1/\mathbf{F}_p. Then I will defer to a later post a more leisurely run-through of this definition, together with more examples of weights and their formalisms, and with some further analytic properties which are of independent interest.

The reason this class of weights “works” can however be quickly summarized in a rather miraculous property, which is essentially a consequence of the Riemann Hypothesis: assuming K has small complexity (in a precise sense, saying basically that it comes from a sheaf with small rank and small ramification), the correlation sum C(K;\gamma) is either \ll p^{1/2}, where the implied constant depends only on the numerical invariant measuring the complexity of the weight, or we have an equality

\hat{K}(z)=\varepsilon(\gamma) \hat{K}(\gamma\cdot z)

for all z and some fixed complex number \varepsilon(\gamma) of modulus 1. From this second part, we see that, for a suitable M, the set G_{K,M} discussed above is contained in the group \mathbf{G}_K of all \gamma (viewed in \mathrm{PGL}_2(\mathbf{F}_p)) for which there exists \varepsilon(\gamma) (of modulus 1) such that the identity above holds. This provides us with enough structure on the set of “bad” matrices (with large correlation sums), from which the bound

S(f;K)\ll p^{1-1/8+\epsilon}

can fairly simply be deduced for irreducible trace weights using the automorphic part of our paper.

Indeed, we distinguish two cases, depending on the structure of the subgroup \mathbf{G}_K\subset \mathrm{PGL}_2(\mathbf{F}_p):

  • If \mathbf{G}_K has order coprime to p, we use the classification of such subgroups, and see that \mathbf{G}_K is either contained in the normalizer of some maximal torus, i.e., in the stabilizer of a two-point set \{x,y\} (and hence these weights fall under case (3) of allowed “bad” matrices) or otherwise \mathbf{G}_K is of order at most 60, and consists only of semisimple elements (which allows us again to apply (3), with possibly more than one pair \{x,y\} involved, but fewer than 60);
  • If p\mid |\mathbf{G}_K|, we find some element \gamma_0 in \mathbf{G}_{K} which is of order p, hence unipotent. Thus \gamma_0 acts transitively on \mathbf{F}_p (minus at most a single point); the formula defining the subgroup \mathbf{G}_K above implies easily that \hat{K}(z) is then basically of a very restricted type, namely
    \hat{K}(z)=\varepsilon_0 e(a\gamma\cdot z/p)
    for some fixed matrix \gamma and fixed complex number \varepsilon_0 and integer a. But this weight comes from a specific (Fourier transform of an) Artin-Schreier sheaf (it might not be the one defining K originally, but a fortiori, we can assume it is!). For this sheaf, a rather simple analysis shows that \mathbf{G}_K is a unipotent group isomorphic to \mathbf{F}_p (unless a=0 or \gamma=1, which are exceptional cases). So the bad matrices are either trivial or parabolic, and we can appeal to case (1) to handle these weights…
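The second case can be checked by hand on a toy model, and the following sketch does so numerically. Everything specific in it is an assumption made for illustration, not something taken from the paper: I take \gamma to be the inversion z\mapsto 1/z and \varepsilon_0=1, so that the model weight is \hat{K}(z)=e(a\bar{z}/p) for z\neq 0, and I use the arbitrary values p=13, a=3, b=1. The unipotent matrices are then the lower-triangular ones z\mapsto z/(1+bz), and the expected relation is \hat{K}(z)=\varepsilon\hat{K}(\gamma_b\cdot z) with \varepsilon=e(-ab/p).

```python
import cmath
import math

p, a, b = 13, 3, 1  # arbitrary small prime and parameters


def e_p(x):
    """Additive character e(x/p)."""
    return cmath.exp(2j * math.pi * (x % p) / p)


def k_hat(z):
    """Hypothetical model weight e(a * z^{-1} / p), for z != 0 mod p."""
    return e_p(a * pow(z, -1, p))


def gamma_b(z):
    """Unipotent map z -> z / (1 + b*z), for z with 1 + b*z != 0 mod p."""
    return z * pow((1 + b * z) % p, -1, p) % p


def mat_mul(m, n):
    """Product of 2x2 matrices (a, b, c, d) with entries reduced mod p."""
    return (
        (m[0] * n[0] + m[1] * n[2]) % p,
        (m[0] * n[1] + m[1] * n[3]) % p,
        (m[2] * n[0] + m[3] * n[2]) % p,
        (m[2] * n[1] + m[3] * n[3]) % p,
    )


# Order of the matrix [[1, 0], [b, 1]]; for a unipotent matrix the only
# scalar power is the identity, so this is also its order in PGL_2(F_p).
g = (1, 0, b, 1)
power, order = g, 1
while power != (1, 0, 0, 1):
    power = mat_mul(power, g)
    order += 1

# Check k_hat(z) = eps * k_hat(gamma_b(z)) wherever both sides are defined.
eps = e_p(-a * b)
good_z = [z for z in range(1, p) if (1 + b * z) % p != 0]
relation_holds = all(
    abs(k_hat(z) - eps * k_hat(gamma_b(z))) < 1e-9 for z in good_z
)

print(order, relation_holds)
```

The relation follows from the one-line computation 1/(\gamma_b\cdot z)=1/z+b, which is why the sum over z is invariant up to the factor \varepsilon and the correlation sum is as large as possible for these \gamma_b.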

I will conclude for today with another fact we noticed only very recently: in the special case K(n)=e(\bar{n}/p), it turns out that the correlation sums had already appeared in a paper of Friedlander and Iwaniec on incomplete Kloosterman sums, in the special case of C(K;\gamma) when \gamma is lower-triangular (and non-diagonal). In the Appendix to this paper, Birch and Bombieri give two proofs of the estimate C(K;\gamma)\ll p^{1/2} (for lower-triangular matrices): one is geometric (based on counting points on surfaces over finite fields), but the second one has amusing links to our arguments, with a camouflaged sighting of the group structure of the group B_- of lower-triangular matrices and of the fact that G_{K,M}\cap B_- is contained in a subgroup of B_-… (Interestingly, there is no trace of modular forms in this paper of Friedlander and Iwaniec, so the coincidence is rather unexpected.)