On Coscinomancy

Thanks to a query of A. Venkatesh, I have just been looking at the tangled and instructive etymology of the word sieve…

To the Greeks, a sieve is a koskinon (κóσκινον), whether it be of use as a cooking utensil, or as devised by that clever fellow, Eratosthenes, to find prime numbers. But the English word has a completely different etymology, of Germanic origin if the OED is to be believed.

From the same source, I’ve learnt that a coarse-meshed sieve can also properly be called a ridder or a riddle (the latter being an alteration of the former). In this form, it is said to be related to the Indo-European stem kreit-, which is also, as it turns out, (carrying a meaning of to separate, to judge) at the root of many fine words, such as crisis, critic, criterion; through the Latin cribrum, it leads to the French word for sieve, namely crible. (Though, as far as cooking is concerned, the instrument is rather called a tamis in French cuisine; this last word, although considered obsolete, does exist in English, as does the variant temse…)

As for the Greek word, it has left no trace in French, and (apparently) remains in English only under the guise of a delightful word, coscinomancy, which I regret not having known about at the time of deciding on the title of my inaugural lecture:

coscinomancy, n.
Pronunciation: /ˈkɒsɪnəʊˌmænsɪ/

Etymology:
< medieval Latin coscinomantia, < Greek κοσκινόμαντις, < κόσκινον sieve.

Divination by the turning of a sieve (held on a pair of shears, etc.).

1603 C. HEYDON Def. Iudiciall Astrol. xvii. 356 Comparing Astrologie with Aruspicie, Hydromancie, Chiromancie, Choschinomancie, and such like.
1653 H. MORE Antidote Atheism (1712) III. ii. 89 Coskinomancy, or finding who stole or spoiled this or that thing by the Sieve and Shears.
1777 J. BRAND Observ. Pop. Antiq. (1870) III. 301–2.
1871 E. B. TYLOR Primitive Culture I. 116 The so-called coscinomancy, or, as it is described in Hudibras, ‘th' oracle of sieve and shears’.

(To quote the OED again; note the amusing oscillations of the popularity of this word…)

The literary potential of author names

In an age where literary originality is hard to come by, shouldn’t more attention be paid to the great potential of author’s names as literary device? In mathematics, we can enjoy the delightful papers of Nicolas & Sárközy (for instance, this one). Theoretical physicists probably still chuckle when reading the paper of Alpher, Bethe and Gamow (though the story goes that Alpher was pretty upset when Gamow decided to add Bethe as a co-author, purely for euphonical reasons…)

Do you know any other examples?

(As for myself, I am sorry that the traditions of mathematical publication make it highly unlikely that a Stanley — Kowalski paper will ever appear.)

Terminology query: between thick and thin

In the current rapid development of “sieve in orbits”/”affine linear sieve”/”sieve in expansion”, as surveyed in my (upcoming) Bourbaki lecture, one is concerned with working with a finitely generated subgroup

\Gamma\subset \mathrm{SL}_m(\mathbf{Z}),

and one is led to consider its Zariski closure G, in the ambiant special linear group (the relevance of G is due to the fact that it turns out to control the image of Γ modulo almost all primes — a rather surprising fact, at first sight, which is due largely to work of Nori, Matthews-Vaserstein-Weisfeiler, Weisfeiler, Hrushovski-Pillai, in various degree of generality).

Together with G, which is seen as an algebraic group over the rationals, comes the corresponding arithmetic group

\Gamma_{a}=G(\mathbf{Q})\cap \mathrm{SL}_m(\mathbf{Z}),

which contains the given discrete group.

A basic dichotomy, already emphasized in the paper of Bourgain, Gamburd and Sarnak, is between the case where (1) Γ has finite index in Γa; or (2) this index is infinite.

The first case applies, obviously, to Γa itself, and simple examples of the second are the “Apollonian group” that controls Apollonian circle packings (as explained, for instance, in this lecture by Sarnak) or the subgroup of SL2(Z) generated by the matrices

\begin{pmatrix}1\quad 3\\ 0 \quad 1\end{pmatrix},\quad\quad\begin{pmatrix}1\quad 0\\ 3 \quad 1\end{pmatrix},

an example popularized by Lubotzky (the point being that if the two 3 are replaced by two 1 or two 2, the corresponding group is in the finite index situation).

The distinction according to the index is natural and crucial: for instance, if the Zariski closure is SLm itself, and m is at least 3, it follows that Γ has Property (T) of Kazhdan, and that immediately gives the crucial expansion properties of the congruence quotients, which are fundamental in the applications to sieve.

Now, here is the terminological rub: Bourgain, Gamburd and Sarnak suggested to call the second case (infinite index) the “thin” case, and the corresponding discrete groups, “thin” subgroups of Γa. The terminology is appealing, and it is unfortunate that it clashes with an earlier definition of “thin” sets of (rational points of) algebraic varieties, which appears, for instance, in Serre’s book “Topics in Galois theory” (see Chapter 3). (A standard example of an infinite but thin set of integers is the set of squares; the primes, on the other hand, are not thin.)

This notation clash is problematic in a few ways. For instance, a “thin” subgroup (one with infinite index in the corresponding arithmetic group) is not a thin subset of

\mathrm{SL}_m(\mathbf{Q}).

More annoyingly, a basic sieve fact that comes out of the expansion setting (when applicable) is that the subset of Γ where a function f takes integral values with few prime factors can be proved to not be thin (in the sense of Serre), even in the second “thin” case (infinite index). This is a nice qualitative measure of the fact that such such sets with “almost prime” values remain quite large, but as stated, there are too many thins in this sentence…

What to do? In French, it has been suggested by someone (during the conference on exponential sums that just ended here in Zurich, and hopefully I will remember who it was soon) to speak of “un sous-groupe fin” (in contrast with “un ensemble mince”, which is the established translation for the “thin” sets of Serre’s book). This sounds fine to me, but it’s not clear how to adapt that to English. A “fine subgroup” is not the right translation; in fact, I would typically translate (in a pastry context) “une tarte fine” with “a thin pie”. So I am rather scratching my head… Does anyone has a suggestion?

Reading Burnside (and thanking Noether)

In 1905, the famous rower W. Burnside (then aged 52) proved one of the results known as Burnside’s Theorem (the other one being, usually, the striking result that finite groups of order divisible by at most two primes are solvable):

Let k be an algebraically closed field, and let
G\subset GL_n(k)
be a subgroup of the invertible matrices of size n over k. Let k[G] be the span of G in the matrix algebra M(n,k) of size n. Then G acts irreducibly on kn if and only if k[G]=M(n,k).

Here, recall that irreducibility (a notion apparently first introduced by Burnside himself) means that there is no proper non-zero subspace

W\subset k^n,\quad 0\not=W\not=k^n,

such that G leaves W invariant (globally).

This result turns out to play a role in a current research project (with O. Dinai), and since I had never looked properly at the proof(s) before, I’ve been a bit curious about it, and tried recently to understand it. There are very simple proofs known, but the shortest ones seem to be typically not very enlightening when it comes to understand why the result is true. They’re the kind of arguments you might feel you could find once you knew the result, but why would you think of proving it first? So — it was vacation time! — I had a look at Burnside’s original paper. This can be found here; if you do not have access to the Proceedings of the L.M.S, here is a fairly representative extract of the style:

[Extract from Burnside's paper]

As far as I’m concerned, this is barely recognizable as meaningful mathematics, and almost unreadable. I say almost, because (vacation effect) I took it as an intellectual challenge to try to reformulate Burnside’s argument in more modern terms, and I believe that I succeeded. It was a big help that the paper is only four pages long; it turns out that the one page from which the extract is taken, although I can’t explain it in any reasonable way, contains the last step of Burnside’s argument. From the fact that he needed seventeen lines to prove the “obvious” half of his theorem, there was therefore every chance that whatever is done here in one page should not be too difficult to figure out with some thought.

So here is a sketch of my reading of his proof of the non-trivial direction (that, for an irreducible action, we have k[G]=M(n,k); for full details, see this short note). We denote

V=k^n,\quad E=M(n,k),\quad E^\prime=M(n,k)^\prime=Hom(E,k),\quad\text{the dual of } E,

and then we define

R=\{\phi\in E^\prime\,\mid\, \phi(g)=0\text{ for all } g\in G\}\subset E^\prime,

which is the linear space of all linear relations satisfied by the matrices in the group. By linearity and duality, we see that the goal is to show that, if G is irreducible, then the space R is zero. The steps for this are:

  • R is a subrepresentation of E’ for a natural action of G on E’ given by
    \langle g\cdot \phi,A\rangle=\langle \phi,g^{-1}A\rangle,\quad\quad g\in G,\ \phi\in E^{\prime},\ A\in E
    in duality-bracket notation (this is not the same as the usual tensor product of the tautological representation of G on V and its contragredient);
  • We now attempt to analyze this representation on E’, and the next three steps do this (they are completely independent of the problem at hand); first, the representation on E’ is isomorphic to a sum of n copies of the (irreducible) contragredient of the tautological representation;
  • Hence any irreducible subrepresentation in R is obtained as the image of a G-equivariant embedding
    V^\prime\rightarrow E^\prime;
  • But all such maps
    \alpha\,:\, V^\prime\rightarrow E^\prime
    are of the form
    \langle \alpha(\lambda),A\rangle=\lambda(Av)
    for some fixed vector v in V;
  • And we now come back to the problem of understanding the relations R; if R were non-zero, it would contain the image of a map α of this form, for some non-zero vector v;
  • But if we then specialize the definition of α with A being the identity matrix — which is in G –, we find, from the fact that the image of α is in R, that
    0=\alpha(\lambda)(1)=\lambda(v),
    for all λ, and this is a contradiction with the condition that v be non-zero… Hence we must have R=0.
  • It is not obvious here where it is necessary to use the fact that k is algebraically closed, but this is hidden in an application of Schur’s Lemma. Interestingly, it seems that Schur published this result also in 1905. Since Burnside also uses it without any comment (or hint of proof), it must have been known (at least to him) before. It is also amusing to note that, in fact, there is no mention whatsoever of a base field in Burnside’s paper.

    I like this proof, in part because it would make sense to try to proceed in this way, even if the result turned out to be different (say, a characterization of the relation module R instead of a proof that it is zero). Also, I may be influenced by the similarity with the study of relations between roots of polynomials that can also be done using elementary representation theory of the Galois group, as discussed in this old blog post.

    But as I said, there is a part of Burnside’s paper I really don’t understand, even if I suspect it is equivalent or very similar to what I did. And I am forever thankful that Emmy Noether came along some years later to put algebra on a more reasonable track than endless talk of “successive sets of symbols with the same second suffix” (which sounds almost like one of those alliterative exercises used to detect drunkenness…)

    Esperantism expands, but not so quickly

    Today, as it has been for a long time, it can only be a fantastic dream to know and understand all of mathematics, and virtuous mathematicians must perforce look for alternatives. One of the best is to find some analogy between different areas — a brilliant instance, being Vojta’s rapprochement between questions of diophantine approximation (e.g, Roth’s Theorem) and questions of Nevanlinna Theory. Another great satisfaction is to see surprising direct connections between two such areas: I still remember my surprise and delight on learning in a probability class how complex Brownian motion can be used to solve Partial Differential Equations (such as the Dirichlet problem, as shown by Kakutani).

    A very new joint work with J. Ellenberg and C. Hall brought (at least to me!) some of these emotions. The barest summary would be as follows: we describe very strong connections between the combinatorial notion of expander graphs (or, more properly, expander families) and certain types of finiteness statements in arithmetic geometry. There is already a bit of magic here, but the result is even nicer in that the proof depends crucially on another unexpected connection with a result of differential geometry of Li and Yau.

    I won’t describe the arithmetic geometry now (partly because Jordan has already written a very good summary…). Rather I want to explain what are the esperantist graphs that we introduce in the paper, and discuss some vague but enticing “philosophical” questions that this paper suggests.

    Let us start with expanders (a very good place to start); it might be that, with the possible exception of root systems, this is the most amazingly ubiquitous notion of 20th Century Mathematics — amazing in the sense that the definition can look hyper-specialized, until its influence extends (pun intended), and one day you realize that what looked like a practical network-communication question is needed, for instance, to give counterexamples to some form of the Baum-Connes conjecture (about which I don’t know anything, except for what a superb talk by N. Higson taught me about ten years ago). I highly recommend downloading the long survey paper of Hoory, Linial and Wigderson to get an idea of the breadth and importance of this notion.

    Now, there are different equivalent definitions of expanders, and I’ll use the least intuitive, for the simple reason that this is the one that comes in most naturally for our applications: given a sequence of graphs

    (\Gamma_n)_{n\geq 1},

    which we assume — for simplicity — to be connected and k-regular for a fixed k, one says that this sequence is an expander if (1) the number of vertices goes to infinity

    \lim_{n\rightarrow +\infty} |\Gamma_n|=+\infty,

    and (2) there is a uniform spectral gap δ>0 for all n:

    \lambda_1(\Gamma_n)\geq \delta>0,

    for all n, where λ1 is the first non-zero eigenvalue of the square matrix of size n| given by

    \Delta_n=k-A(\Gamma_n),

    in terms of the adjacency matrices of the graphs (this is known as the combinatorial Laplace operator; it is a non-negative symmetric matrix, with first eigenvalue equal to 0 with the constant eigenvector, which is unique up to scalar since the graph is assumed to be connected.)

    As it turns out, for our application, one does not need such a strong condition as uniform spectral gap. Precisely, we say that the family above is an esperantist family if it satisfies

     \lambda_1(\Gamma_n)\geq c(\log 2|\Gamma_n|)^{-A}

    for all n and some constants

    (c,A),\quad\quad c>0,\quad\quad A\geq 0.

    (Note: The factor 2 simply avoids ever talking of 1/0.) Thus expanders correspond to esperantist families where one can take A=0.

    Now, the obvious question: why did we select this name? Mostly, it is a question of alliteration; and we feel it just sounds nice (like étale topology sounds nice, or barreled spaces, or adèles, etc…)

    Then, scientifically, what does it mean, and why is it interesting? For us, the meaning is given in practice by a theorem of Diaconis and Saloff-Coste: if the family is a family of Cayley graphs for some finite groups Gn, with respect to systems of generators with constant size k, then they will form an esperantist family as soon as the diameter of the Cayley graphs grows polylogarithmically in the size of the groups, i.e., if there exists

    (c,B),\quad\quad c>0,\quad B\geq 0,

    such that

    \mathrm{diam}(\Gamma_n)\leq c(\log 2|G_n|)^B.

    And the fact is that showing this type of diameter growth is, at the moment, an indispensable staging point in all the recent developpments concerning growth and expansion in linear groups over finite fields, starting with the breakthrough by Helfgott (for SL(2,Fp). (For our purpose, the results of Pyber-Szabó are the most directly useful, because sometimes we do not control the groups well enough to claim, for instance, that they are given by G(Fp), where p varies, for a fixed algebraic group G; however, other papers with important results in this direction are one of Gill and Helfgott and one of Breuillard, Green and Tao.)

    As a matter of fact (this has been confirmed to us by many people), it is almost certain that all the families we consider in our paper are, really, expanders, and very likely that this will be formally proved and published in the near future. But as long as the proofs of this require showing the esperanto property first, and require additional non-trivial steps (which is the case for the moment), our own applications seem to be more transparent when phrased in terms of esperantism. And there might of course be applications where the graphs do not form expanders (though we have not found one yet).

    Now for the philosophy: the very barest summary of our main result is that, provided a family of finite (unramified) coverings

    U_n\rightarrow U,

    of a fixed algebraic curve over a number field has the property that the associated Cayley-Schreier graphs associated to the sets of cosets

    \pi_1(U)/\pi_1(U_n),

    form an esperantist family, then there are very strong diophantine restrictions on the points of the coverings defined over extensions of the base field of a fixed degree. (There is more information in Jordan’s post.) There is an unsuspected deus ex machina hidden, which makes the proof quite surprising: we use an inequality from global analysis of Li and Yau (already used by Abramovich in the setting of classical modular curves), which seems to come completely out of the blue.

    This seems to suggest that the following general analogue problem might deserve attention: suppose you have some “objects” for which it makes sense to speak of finite coverings, and of Galois groups, etc, and you have a sequence of these (say ?n) with finite coset spaces

    \mathrm{Gal}(?)/\mathrm{Gal}(?_n),

    and some finite generating set of the base Galois group (or something similar). Can one say anything interesting on the “geometry” if one assumes that this family of graphs is esperantist (or expanding)?

    Even the most natural analogues of our setting seem very interesting and quite tricky to think about:

    (1) what about coverings of a base curve defined over a function field of positive characteristic (say, with tame ramification to avoid unpleasantness)? Here one would think that the direct analogue of our statements might hold, but the Li-Yau inequality evaporates, and we are left scratching our heads (though one might hope, maybe, that some p-adic analogue could be true?)

    (2) what about coverings of higher-dimensional base varieties over a number field? Here, we do not even know yet what a reasonable consequence could look like…


    [Note: our results also depend on the comparison between hyperbolic and discrete laplace eigenvalues, specifically on the results of Burger in this direction, which are somewhat sharper than those of Brooks; since Burger’s method is described only sketchily in his papers, and his thesis — which contains the full details — is hard to find, we included the proof of the comparison we require as an Appendix to our paper, in the case of surfaces with finite hyperbolic area. This might be of some interest to some readers.]