Thomas Pynchon, mathematical epigraphist

There might be some readers who are currently desperately looking for a suitable epigraph for their mathematical masterwork. The best advice I can give is to spend some time in the company of Thomas Pynchon’s works, which abound in scientific and mathematical wit. Many, though aware that P.G. Wodehouse’s wonderfully more readable oeuvre is sadly lacking for this purpose, will still object by pointing out the reputation for incomprehensibility of, say, “Gravity’s Rainbow”, a heavy volume supposedly barely more understandable than “Finnegans Wake”. However, it should be kept in mind that this reputation is the work of literary critics, who (and they are more to be pitied than castigated) are unlikely to find that the veil lifts when, around page 670, the dashing Yashmeen Halfcourt of “Against the Day” starts conversing cogently in Göttingen with David Hilbert to propose what is commonly referred to as the Polya-Hilbert idea to solve the Riemann Hypothesis. But this, of course, is exactly where a mathematician will think that, after all, it’s not so bad.

Here are some of my favorite quotable excerpts from Pynchon:

  • From “Gravity’s Rainbow”, which is also full of Poisson processes, if I remember right:

    “The Romans,” Roger and the Reverend Dr. Paul de la Nuit were drunk together one night, or the vicar was, “the ancient Roman priests laid a sieve in the road, and then waited to see which stalks of grass would come up through the holes.”

    (Actually, I have to confess, with respect to this citation, to having committed two of the cardinal sins of epigraphists: I’ve used it twice (my excuse being that one time was for my PhD thesis, which was not published as-is), and I haven’t read the book much further than the place where it appears; and for those who wonder, there is at least one more dreadful faux pas in epigraphing: doctoring a quote to make it just perfect, and I’ve done it at least once.)

  • In “Mason & Dixon”, we find

    In the partial light, the immense log Structure seems to tower toward the clouds until no more can be seen.

    This novel was published in 1997; one cannot feel anything but impressed to see Pynchon following so closely the latest developments of post-Grothendieck algebraic geometry…

  • Still in “Mason & Dixon” (which I am currently re-reading, hoping to vault triumphantly above the 50 percent mark of understanding), we have

    He sets his Lips as for a conventional, or Toroidal, Smoke-Ring, but out instead comes a Ring like a Length of Ribbon clos’d in a Circle, with a single Twist in it, possessing thereby but one Side and one Edge….

    which prompts the obvious question: is it really possible to blow a smoke ring in the shape of a Möbius band? Hopefully some experts will comment on this…

Addendum

Here is a quick update to the last post about (restricted) mod-Cauchy convergence; I’ve investigated numerically the behavior of the renormalized averages

\mathbf{E}_N(e^{its(d,c)}) \times \exp(\gamma_N |t|)

(see the post for the notation) for some values of t, to see if the limitation to

|t|<2\pi

in Vardi’s result could be a mere artifact of the method. Here are some graphs representing these empirical averages for

N\leq 5000

at the following values of t:

t=\frac{\pi}{2}

t=\pi

t=2\pi

t=4\pi

In particular, note that in the last picture, the vertical scale runs from -300 to 300, more or less, compared with oscillations between 0.5 and 1.5 in the third. So it seems pretty convincing evidence that the limit as N goes to infinity does not exist when t is large.

(Note: the empirical average for N=5000 involves about 8,000,000 Dedekind sums).
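For readers who want to repeat the experiment on a smaller scale, here is a minimal sketch of how such renormalized averages might be computed. It is not the code used for the graphs: the function names `dedekind_sum`, `sums_on_F_N` and `renormalized_average` are made up for this illustration, the sums are evaluated through the reciprocity relation, and the value N=100 in the usage lines is only chosen so that the script runs quickly.

```python
import cmath
import math
from fractions import Fraction
from functools import lru_cache


def dedekind_sum(d: int, c: int) -> Fraction:
    """s(d, c) for coprime integers c > d >= 1, via the reciprocity relation."""
    total, sign = Fraction(0), 1
    d %= c
    while d != 0:
        # s(d, c) = -s(c, d) - 1/4 + (d/c + c/d + 1/(cd)) / 12
        total += sign * (Fraction(d * d + c * c + 1, 12 * c * d) - Fraction(1, 4))
        sign = -sign
        d, c = c % d, d  # s(c, d) only depends on c modulo d
    return total


@lru_cache(maxsize=None)
def sums_on_F_N(N: int) -> tuple:
    """Floating-point values of s(d, c) for 1 <= d < c < N with gcd(c, d) = 1."""
    return tuple(
        float(dedekind_sum(d, c))
        for c in range(2, N)
        for d in range(1, c)
        if math.gcd(c, d) == 1
    )


def renormalized_average(N: int, t: float) -> complex:
    """E_N(exp(i t s(d, c))) * exp(gamma_N |t|), with gamma_N = log(4N) / (2 pi)."""
    values = sums_on_F_N(N)
    if not values:
        raise ValueError("F_N is empty; take N >= 3")
    gamma_N = math.log(4 * N) / (2 * math.pi)
    average = sum(cmath.exp(1j * t * s) for s in values) / len(values)
    return average * math.exp(gamma_N * abs(t))


if __name__ == "__main__":
    for t in (math.pi / 2, math.pi, 2 * math.pi, 4 * math.pi):
        print(f"t = {t:.4f}: {renormalized_average(100, t):.4f}")
```

Since s(c-d,c)=-s(d,c), the average is real up to rounding errors, so only the real part is of interest when plotting.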

Cauchy joins the club

I’ve already mentioned what my co-authors (J. Jacod, A. Nikeghbali) and I call “mod-Gaussian convergence”, and “mod-Poisson convergence” (and I will hopefully soon write down some summary of the more recent developments of these notions; for the moment I’ll just mention that the first two papers will appear soon, one in Forum Mathematicum and one in International Math. Research Notices). It was obvious how to extend these definitions to other natural families of standard probability distributions, the only condition required being that the characteristic functions must not have zeros (since we need to divide by them). Precisely, consider any set Λ of probability measures on the real line, with

\phi_{\lambda}(t)=\int_{\mathbf{R}}{e^{itx}d\lambda(x)}\not=0

for any

\lambda\in\Lambda,\quad\quad t\in\mathbf{R}.

Then we can say that a sequence (X_n) of random variables converges mod-Λ for some parameters

\lambda_n\in\Lambda,

and some limiting function Φ, if we have limits

\lim_{n\rightarrow +\infty} \quad\phi_{\lambda_n}(t)^{-1}\mathbf{E}(e^{itX_n})=\Phi(t),

for all real t, which we assume to be locally uniform to avoid degeneracies.

There are “easy” examples like

X_n=Y_n+Y,

where Y_n has law λ_n and is independent of the fixed random variable Y: in that case the limiting function is simply the characteristic function of Y. But one may wonder about more interesting examples, and part of the works mentioned above was dedicated to finding such examples in number theory when Λ is either the set of Gaussians or the set of Poisson variables.
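To spell out the “easy” example: independence lets the characteristic function of the sum factor, so the ratio in the definition does not depend on n at all. This is only the standard one-line computation, written in the notation used above.

```latex
\mathbf{E}(e^{itX_n})
  = \mathbf{E}(e^{itY_n})\,\mathbf{E}(e^{itY})
  = \phi_{\lambda_n}(t)\,\mathbf{E}(e^{itY}),
\qquad\text{hence}\qquad
\phi_{\lambda_n}(t)^{-1}\,\mathbf{E}(e^{itX_n}) = \mathbf{E}(e^{itY})
\quad\text{for all } n \text{ and all } t\in\mathbf{R}.
```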

Now I’d like to mention two cute examples involving Cauchy distributions; they are interesting for a couple of reasons: (i) again, they involve arithmetic, but of quite a different sort (and one could in fact be interpreted as a geometric or topological phenomenon); (ii) they are not quite of the type above: indeed, for these two examples, the convergence only holds for t in an open interval around t=0, not for the whole real line (or rather, it’s not entirely clear whether they extend; the proofs don’t, and seem to do so for good reasons, but I haven’t yet looked very deeply at this).

I’ve written a short note with the details (short because, like some of the early cases of mod-Gaussian and mod-Poisson convergence, the results are mostly re-interpretations of earlier works; those, however, are by no means trivial), which can be found here. I’ll describe briefly one of them here, which is related to a result of Vardi on the distribution of Dedekind sums (the second is maybe even more fun: it is related to linking numbers of modular knots, a topic much popularized by É. Ghys, and based on a recent preprint of Sarnak; but I’m less familiar with the underlying objects).

First, here is the definition of the Dedekind sums, for coprime positive integers c and d with d<c:

s(d,c)=\sum_{h=1}^{c-1}{\Bigl(\Bigl(\frac{hd}{c}\Bigr)\Bigr)\Bigl(\Bigl(\frac{h}{c}\Bigr)\Bigr)},

where ((x)) is the saw-tooth function, periodic of period 1 with

((x))=x-1/2,\quad\quad 0<x<1,\quad\quad ((0))=0.

These look strange or arbitrary when presented so bluntly, but they are quite important and fairly ubiquitous in certain areas of mathematics (see for instance here for a report on a recent workshop dedicated to them…).
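To make the definition concrete, here is a small sketch that evaluates a Dedekind sum directly, summing over h from 1 to c-1 as in the standard convention and using exact rational arithmetic. The names `sawtooth` and `dedekind_sum` are just labels chosen for this illustration.

```python
from fractions import Fraction
from math import floor, gcd


def sawtooth(x: Fraction) -> Fraction:
    """The function ((x)): x - floor(x) - 1/2 if x is not an integer, and 0 otherwise."""
    if x.denominator == 1:
        return Fraction(0)
    return x - floor(x) - Fraction(1, 2)


def dedekind_sum(d: int, c: int) -> Fraction:
    """s(d, c) computed straight from the definition, for c >= 1 coprime to d."""
    if c < 1 or gcd(d, c) != 1:
        raise ValueError("need c >= 1 and gcd(d, c) = 1")
    return sum(
        (sawtooth(Fraction(h * d, c)) * sawtooth(Fraction(h, c)) for h in range(1, c)),
        Fraction(0),
    )


if __name__ == "__main__":
    print(dedekind_sum(1, 3))  # 1/18
    print(dedekind_sum(2, 5))  # 0
```

This takes about c operations per sum, which is fine for experimenting with small values but not for large-scale averaging.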

The question solved by Vardi concerned the distribution of these sums. Precisely, he proved that, suitably normalized, the sums s(d,c), averaged over all c<N and all allowed values of d, have a limiting Cauchy distribution. To state this formally, let first

F_N=\{(c,d)\,\mid\, 1\leq d<c<N,\quad (c,d)=1\},

and write

\mathbf{P}_N(\text{some property})=\frac{1}{|F_N|}|\{(c,d)\in F_N\,\mid\,\text{the property holds}\}|,

and

\mathbf{E}_N(\alpha(d,c))=\frac{1}{|F_N|}\sum_{(c,d)\in F_N}{\alpha(d,c)},

to get some (finite) probability spaces. Then Vardi proves

\lim_{N\rightarrow +\infty}\quad \mathbf{P}_N(a<\frac{s(d,c)}{(\log N)/(2\pi)}<b)=\mu([a,b]),

for any a<b, where μ is a standard Cauchy distribution, namely

d\mu(x)=\frac{1}{\pi}\frac{1}{1+x^2}dx.

The characteristic function of the more general Cauchy distribution with parameter γ>0, namely

d\mu_{\gamma}(x)=\frac{\gamma}{\pi}\ \frac{1}{\gamma^2+x^2}dx

is given by

\int_{\mathbf{R}}{e^{itx}d\mu_{\gamma}(x)}=\exp(-\gamma|t|),

and since those functions do not vanish, one may wonder about the possibility of getting here some example of mod-Cauchy convergence. And indeed, if one looks at Vardi’s proof with such an ulterior motive, one sees that this is derived elementarily from an asymptotic formula for the characteristic function of s(d,c) on F_N. Namely, after cleaning up the notation, Vardi proves that

\mathbf{E}_N(e^{its(d,c)})=\exp(-\gamma_N|t|)\Phi(t)+O(N^{-2/3})

uniformly for

|t|<2\pi,

where

\gamma_N=\frac{1}{2\pi}(\log 4N),

and the limiting function is the rather remarkable expression given by

\Phi(t)^{-1}=\Bigl(1-\frac{|t|}{4\pi}\Bigr)\quad\frac{3}{\pi}\quad\int_{SL(2,\mathbf{Z})\backslash \mathbf{H}}\quad\quad (\sqrt{y}|\eta(z)|)^{t/(2\pi)}y^{-2}dxdy,

in which the function η(z) is the Dedekind eta-function!

This is a restricted mod-Cauchy convergence: because of the size of the parameters, the “main term” is in fact not always dominant, and the limit of

\mathbf{E}_N(e^{its(d,c)})\exp(\gamma_N|t|)

is only guaranteed to be Φ(t) in the range

|t|<\frac{4\pi}{3}.
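The range comes from comparing the size of the two terms in the asymptotic formula, using only the value of γ_N and the error term O(N^{-2/3}) as stated above:

```latex
\exp(\gamma_N|t|) = (4N)^{|t|/(2\pi)},
\qquad\text{so}\qquad
\mathbf{E}_N(e^{its(d,c)})\exp(\gamma_N|t|)
  = \Phi(t) + O\bigl(N^{|t|/(2\pi)-2/3}\bigr),
```

and the error term tends to zero as N goes to infinity exactly when |t|/(2π)<2/3, i.e., when |t|<4π/3.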

Since the error term in Vardi’s formula depends crucially on the spectrum of some Laplace-like operator acting on modular forms with a multiplier system depending on t, it is however not clear at all that this can be improved to obtain a larger range. (But this may be tested experimentally, since Dedekind sums can be computed very quickly using the reciprocity relation they satisfy).
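For the record, here is a sketch of how the reciprocity relation gives a fast algorithm. It uses the classical formula s(d,c)+s(c,d)=-1/4+(d/c+c/d+1/(cd))/12 for coprime positive integers, together with the fact that s(d,c) only depends on d modulo c, so the computation follows the steps of the Euclidean algorithm. The function name `dedekind_sum_fast` is made up for this illustration.

```python
from fractions import Fraction
from math import gcd


def dedekind_sum_fast(d: int, c: int) -> Fraction:
    """s(d, c) for c >= 1 coprime to d, in O(log c) steps via reciprocity."""
    if c < 1 or gcd(d, c) != 1:
        raise ValueError("need c >= 1 and gcd(d, c) = 1")
    total, sign = Fraction(0), 1
    d %= c  # s(d, c) depends only on d modulo c
    while d != 0:
        # Reciprocity: s(d, c) = -s(c, d) - 1/4 + (d^2 + c^2 + 1) / (12 c d)
        total += sign * (Fraction(d * d + c * c + 1, 12 * c * d) - Fraction(1, 4))
        sign = -sign
        d, c = c % d, d  # continue with s(c mod d, d)
    # The loop ends at s(0, 1), which is an empty sum.
    return total


if __name__ == "__main__":
    print(dedekind_sum_fast(1, 3))  # 1/18
    print(dedekind_sum_fast(2, 5))  # 0
```

The number of loop iterations is the number of division steps in the Euclidean algorithm for (c,d), so millions of sums are well within reach.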

A puzzling feature of Vardi’s argument is that, although one is not altogether surprised to see the eta function coming up in studies of Dedekind sums (indeed, Dedekind sums were defined from their occurrence in the transformation properties of the eta function!), the precise connection leading to this asymptotic formula is quite indirect.

Compared with our examples of mod-Poisson and mod-Gaussian convergence, one very different feature is the shape of the limiting function. At the moment, I do not know anything really about the behavior of Φ(t), in particular, the behavior of the second expression involving the integral of powers of the eta function. I haven’t found anything about them yet in the literature, but it wouldn’t be surprising if some special identities, for instance, were known…

More conjugacy classes

I’m still thinking aloud (or the bloggerly equivalent thereof) about the topic of my last post, and I’m at this delightful stage of guessing there may well be interesting questions there, and yet not knowing too precisely which ones are easy, which are impossible, or even which are already hidden in the maze of MathSciNet under cleverly disguised search terms.

So consider the case of G=SL(2,Z) again, and assume given a subgroup H. In broadest terms, we’re trying to identify which conjugacy classes in G have representatives in H. We can’t exclude that all of them do; if that happens, we know that (1) H is of infinite index (see the first comment by D. Speyer to the earlier post); (2) but H surjects, by reduction modulo p, to SL(2,Fp) for every p. The latter condition implies in particular that H be Zariski-dense in SL(2) (otherwise, its reduction would be in G(Fp) for some proper algebraic subgroup, and this would be strictly contained in SL(2,Fp) if p is large enough). Nicely enough, such subgroups (especially when finitely generated) are currently the topic of much work in terms of spectral theory, expansion and the like (see for instance these recent preprints by Bourgain, Gamburd and Sarnak, and by Bourgain and Kantorovich).

The conjugacy classes of G have been classified for a long time (for instance, this is needed for the Selberg Trace Formula). The most interesting, or at least those I’m going to look at first, are the so-called hyperbolic ones, which are characterized by the fact that, for some (unique) a>1, they contain a representative which is conjugate in SL(2,R) to

\begin{pmatrix} a & 0\\ 0 & a^{-1}\end{pmatrix},

which acts as a dilation

z\mapsto a^2 z

on the Poincaré upper half-plane. A more direct characterization, in terms of an arbitrary representative g of the conjugacy class, is that

|\mathrm{Tr}(g)|>2.

So, for instance, we can take the conjugacy class of

g=\begin{pmatrix} 2 & 3\\ 1 & 2\end{pmatrix}.

In the case of a conjugacy class in G, the dilation a is a real quadratic integer (it is the largest eigenvalue of the matrix, and the determinant, which gives the constant term of the minimal polynomial, is 1). In the example above, we get

a=2+\sqrt{3}=3.7320\ldots

In SL(2,R), the dilation is the unique invariant of a hyperbolic conjugacy class (and visibly any a>1 occurs as a dilation). In G, things get a bit more arithmetic (which means more complicated, though the two words are maybe not quite synonyms). Essentially (I am here forgetting or glossing over some important semi-technical issues), for a given discriminant

\Delta=\mathrm{Tr}(g)^2-4=(a+a^{-1})^2-4>0,

there are only finitely many G-conjugacy classes, and the number of them is the class number of the associated real quadratic field. (Precise details are given in this old paper of Sarnak).
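As a quick sanity check of these invariants, here is a small sketch that computes the trace, the discriminant and the dilation of a hyperbolic matrix, the dilation being the root larger than 1 of the equation a+1/a=|Tr(g)|. The function name `hyperbolic_invariants` and the nested-tuple representation of matrices are choices made for this illustration only.

```python
import math


def hyperbolic_invariants(g):
    """Return (trace, discriminant, dilation) for a hyperbolic g in SL(2, Z).

    g is given as a pair of rows, e.g. ((2, 3), (1, 2)).
    """
    (a11, a12), (a21, a22) = g
    if a11 * a22 - a12 * a21 != 1:
        raise ValueError("determinant is not 1")
    trace = a11 + a22
    if abs(trace) <= 2:
        raise ValueError("not hyperbolic: |Tr(g)| <= 2")
    discriminant = trace * trace - 4
    # The dilation a > 1 satisfies a + 1/a = |Tr(g)|.
    dilation = (abs(trace) + math.sqrt(discriminant)) / 2
    return trace, discriminant, dilation


if __name__ == "__main__":
    print(hyperbolic_invariants(((2, 3), (1, 2))))
```

For the matrix of the example, this gives trace 4, discriminant 12 and a dilation of about 3.7320508, in agreement with a=2+√3.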

From my point of view of conjugacy classes, the following seem to be the obvious salient features:

(1) to have a chance to find a given hyperbolic conjugacy class in a subgroup H, a necessary condition is that H contains a matrix with a certain trace (up to sign; if we assume that minus the identity is in H, the sign ambiguity disappears); this condition, in turn, is obviously susceptible to local congruence obstructions — but we know that for a Zariski-dense (finitely generated) subgroup of G, all but finitely many of these congruence obstructions modulo primes will vanish by Strong Approximation.

(2) if we have a subgroup where all local obstructions disappear (for instance, all reductions modulo primes are surjective; note that I don’t actually have an example of a proper subgroup of infinite index where this holds…), we are led to wonder whether all ideal classes associated with hyperbolic elements of G have representatives in H; this question is reminiscent of the representation problem for integers by ternary definite quadratic forms (where there are fairly simple necessary conditions for this to happen, and those are fairly classically also sufficient for an integer to be represented by some form in the same genus as the given one, which means by some form everywhere locally equivalent to it, while the representability by the given form holds for sufficiently large integers by much deeper work involving Fourier coefficients of modular forms of half-integral weight; a very beautiful story, where crucial work is due to Iwaniec and Duke and Schulze-Pillot).

As before, hopefully more to come…

The conjugacy classes which appear in a subgroup

A fairly well-known fact about finite groups says that if H is a subgroup of G, and H intersects every conjugacy class in G, then in fact H=G. This is quite useful, for instance, for some problems of Galois theory, because one might have to understand a finite group using information only about which conjugacy classes it represents in a bigger group (e.g., a Galois group represented as permutation groups of the roots of an integral polynomial, where the factorization of the polynomial modulo various primes indicates which conjugacy classes of the corresponding symmetric group intersect the Galois group; I’ve already mentioned this type of thing here and here).
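The finite statement can be checked by brute force in a small case. The sketch below does this for the symmetric group on 4 letters: it lists all subgroups generated by at most two elements (which, for this particular group, gives every subgroup) and keeps those meeting every conjugacy class. All function names are made up for this illustration, and permutations are represented as tuples.

```python
from itertools import permutations


def compose(p, q):
    """Composition p after q of two permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(gens, n):
    """Subgroup of S_n generated by gens (closure under right multiplication)."""
    identity = tuple(range(n))
    elements, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)


def conjugacy_classes(group):
    remaining, classes = set(group), []
    while remaining:
        x = next(iter(remaining))
        cls = frozenset(compose(compose(g, x), inverse(g)) for g in group)
        classes.append(cls)
        remaining -= cls
    return classes


def subgroups_meeting_every_class(n):
    """Subgroups of S_n generated by at most two elements that meet every class."""
    G = list(permutations(range(n)))
    classes = conjugacy_classes(G)
    subgroups = {generated_subgroup((a, b), n) for a in G for b in G}
    return [H for H in subgroups if all(H & cls for cls in classes)]


if __name__ == "__main__":
    print([len(H) for H in subgroups_meeting_every_class(4)])  # [24]
```

Only the full group, of order 24, survives, as the theorem predicts.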

It is natural to ask what happens with other kinds of groups. The example of compact Lie groups shows that if G is infinite, there may well exist a proper subgroup H intersecting every conjugacy class; for instance, if G=U(n), every element can be diagonalized, i.e., every element is conjugate to one in the subgroup H of diagonal matrices (which, if n is not 1, is not the same as G…) However, these are quite special groups, and one might suspect that some interesting infinite groups retain this property (which I’ll call the Jordan property here, as suggested by Serre’s nice paper about this theorem of Jordan).

Although I’ve started looking around, I haven’t found much information yet on this. The first groups I’m trying to understand are arithmetic groups like G=SL(n,Z). Here’s one simple example in such a case: if n is at least 3, then G has the Jordan property “with respect to finite index subgroups” (i.e., any finite index subgroup intersecting all conjugacy classes of G is equal to G). This requires a fairly big hammer, but is otherwise very easy: by the Congruence Subgroup Property, any H of finite index satisfies

G(d)\subset H\subset G,

for some integer d, where

G(d)=\{g\in G\,\mid\, g\equiv 1\text{ mod }d\}

is a principal congruence subgroup. This means that, for some subgroup Γ of the finite quotient G/G(d), we have

H=\{g\in G\,\mid\,  g\text{ mod } d \in \Gamma\},

but then it is immediate (by lifting to H) that if H intersects all conjugacy classes of G, then also Γ intersects every conjugacy class in G/G(d), and we get

\Gamma=G/G(d),

from the finite group case, and therefore H=G.

More generally, one sees at least (without using the Congruence Subgroup Property) that if H is a subgroup of G=SL(n,Z) intersecting every conjugacy class, then we have

H\text{ mod } d=SL(n,\mathbf{Z}/d\mathbf{Z}),

for all d (because the reduction modulo d maps G surjectively to SL(n,Z/dZ) for all d). However, this condition is not as stringent as it may look: it is known (the “Strong Approximation Theorem” of Matthews, Vaserstein and Weisfeiler) that this holds, at least for all integers d coprime with some “conductor” f, for any subgroup of SL(n,Z) which is Zariski-dense in SL(n), and such groups can be quite “small”. However, one might intuitively hope that, being “smaller” than finite index subgroups, they would intersect fewer conjugacy classes (?). On the other hand, I also don’t know offhand of a non-trivial subgroup with conductor f=1.
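The surjectivity of reduction modulo d is easy to test numerically in the smallest case n=2. The sketch below assumes the standard fact that SL(2,Z) is generated by the two matrices S and T written in the code; it compares the subgroup generated by their reductions with the full list of matrices of determinant 1 modulo d. The function names are hypothetical, and only small values of d are practical with this brute-force approach.

```python
from itertools import product


def mat_mul_mod(A, B, d):
    """Product of two 2x2 matrices (tuples of rows) with entries reduced modulo d."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % d for j in range(2))
        for i in range(2)
    )


def generated_mod(d):
    """Subgroup generated modulo d (d >= 2) by the reductions of S and T."""
    S = ((0, (-1) % d), (1, 0))
    T = ((1, 1), (0, 1))
    identity = ((1, 0), (0, 1))
    elements, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in (S, T):
            y = mat_mul_mod(x, g, d)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def sl2_mod(d):
    """All 2x2 matrices over Z/dZ (d >= 2) with determinant 1, by exhaustive search."""
    return {
        ((a, b), (c, e))
        for a, b, c, e in product(range(d), repeat=4)
        if (a * e - b * c) % d == 1
    }


if __name__ == "__main__":
    for d in range(2, 7):
        print(d, len(sl2_mod(d)), generated_mod(d) == sl2_mod(d))
```

For each d tested, the two sets coincide, which is what surjectivity of the reduction map means for these d.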

For the special case n=2, the Congruence Subgroup Property fails (one way to see it, as explained in this survey of Raghunathan, is to contrast the fact that SL(2,Z) has finite quotients like the alternating group A5, whereas any non-abelian simple quotient of a congruence subgroup is of the type SL(2,Z/pZ) for a prime p, and none of these is isomorphic to A5, simply because none is of order 60). Then it’s not clear to me if some finite index subgroup (not of congruence type) could intersect every conjugacy class of SL(2,Z).

Hopefully, I’ll have the occasion to write more about this as I explore the literature…