Jeeves and the PhD

Since the topic of graduate schools and the choice thereof seems suddenly popular, discussions of the outcome of graduate school, the dreaded PhD thesis, should also start soon. So it seems a good time to link to an old text of mine, entitled Jeeves and the PhD, which describes fairly and realistically some universal aspects of thesis-writing.

Compared with the Adventures of Schlomo Cohen, this text has the advantage of being much shorter (and in English). It also contains no mathematics whatsoever (this blog is about mathematics, but only mostly).

From the literary point of view, just as in the Schlomo Cohen stories, one cannot claim that I try to hide my influences. I was, and still am, a great fan of P.G. Wodehouse (indeed, a positive proportion of my English vocabulary derives directly from reading his books, from “flabbergasted” to “flummoxed”, with “tidly om pom pom” in between).

This story was written, over one night in 1996 or 1997, to cheer up a friend of mine (the author of a very nice book of Mathematics for physicists) when he was, like many a graduate student, pulling all-nighters in order to finish typing his thesis (which he did defend brilliantly not long afterwards). It may still be considered funny by similarly shackled graduate students (in any field), and can be read as a cautionary tale by those hesitating to pursue graduate studies…

Modular signs: story of a workaround

Analytic Number Theory, as a discipline, is one of the natural habitats of a wonderful mathematical beast, the workaround. This can be seen, for instance, from the fact that many of the most natural consequences of the Riemann Hypothesis (and even of the Generalized Riemann Hypothesis) are known to be true unconditionally, and some have been for quite a while.

For instance, Hoheisel managed to prove (in 1930) the existence of a prime in any interval of the type N < n < N + N^θ, for N large enough and some fixed θ<1, an impressive result if one knows that the “obvious” way to prove this would be to use the (unproved!) fact that the Riemann zeta function ζ(s) has no zero with real part larger than θ. His trick was to combine known (but weak) zero-free regions for the zeta function with density theorems, which state that zeros off the critical line, if they exist, are rarer and rarer close to the line Re(s)=1. In other words, a good enough bound (tending to infinity with respect to some variable) for the cardinality of the empty set is essential.

There are many other examples but I want to discuss one today which is of much less import, though (to my mind) quite cute. It allows me to start by mentioning the Sato-Tate Conjecture. This has been proved recently in many cases for elliptic curves by R. Taylor, building on works of himself, Clozel, Harris and Shepherd-Barron; clearly I cannot do better here than refer to the brilliant discussion of the statement, its meaning, and the context of the proof that can be found in Barry Mazur’s recent survey paper in the Bulletin of the AMS. I will only recall, for my purposes, that the essence of the conjecture is that, for certain very special sequences of Fourier coefficients of cusp forms, λ(p), which are indexed by prime numbers p, and all lie between -2 and 2 (in the usual analytic normalization which is criminal by algebraists’ standards…), we expect to be able to guess accurately what proportion of them (among all primes) lie in an interval α<x<β with fixed α and β. In particular, this proportion should be positive.

Now the question I want to consider is a variant of the following fairly classical one: suppose we have two sequences λ1(p) and λ2(p), coming from two different cusp forms (two distinct elliptic curves for instance, non-isogenous to be technically accurate). The sequences are known to differ; how large (in terms of the parameters, which are typically two positive integers, the weight and the conductor, though the weight is always 2 for elliptic curves) is the first prime which shows that this is the case?

There have been quite a few works on this problem, which is seen as an analogue of an even older problem of analytic number theory, that of the least quadratic non-residue modulo a prime number, where one of the two sequences is replaced by the values of the Legendre symbol of p modulo another fixed prime q, and the other by the constant sequence 1. This earlier problem is of considerable importance in algorithmic number theory, and both have been excellent testing and breeding grounds for various important techniques, notably (and this is close to my heart…) leading to the invention and development of the first “large sieve” method by Linnik.

But I said I was interested in a variant; this is motivated by recent work of Lau and Wu, exploring the structure of the set of sequences sharing the same first few terms. For the quadratic non-residue problem, they have essentially found an optimal “threshold” y for which they know quite precisely how many sequences coincide for primes up to y (the number is in terms of the conductor, this being the only parameter left). They have a similar upper bound for the case of cusp forms, but it is unlikely to be sharp, for the simple reason that the coefficients λ(p) may take many more than the two values -1 and 1 (and sometimes 0) taken by Legendre symbols, so that repeated coincidences should occur much more rarely.

And this leads, at last, to the question of interest here: suppose, instead of looking at the values of the Fourier coefficients, we only retain their signs? Because they are real (at least in many cases of interest), this eliminates the difference between the number of values. (We may either take the sign of 0 to be 0, or we may, to make the problem harder, consider that 0 is compatible with both signs).

Before we can try to see if the Lau-Wu threshold is likely to be correct, there is an even simpler question that must be answered first, and that has at least a naive appeal: given two sequences λ1(p) and λ2(p) as above, assume now that their signs coincide (or are compatible if we want to have 0 be of both signs) for all primes p. Are the sequences identical? What about if we allow for the signs to coincide except for a small proportion of the primes?

What is the link with the Sato-Tate Conjecture? Well, one of the standard ways to detect that two modular forms are in fact the same is to use one of the famous corollaries of the Rankin-Selberg method: summing the product λ1(p)λ2(p) over primes p<X leads to a quantity S(X) which is either bounded as X grows, or behaves like π(X) (the number of primes up to X), depending on whether the sequences are distinct or the same. This dichotomy implies that, if we can show that the sum S(X) grows (however slowly!) as X gets large, the first alternative being wrong, we must indeed have started with identical sequences.

The point is that if the signs of the two sequences are compatible, the product λ1(p)λ2(p) is always non-negative. This does not by itself imply that S(X) grows unboundedly: it could be that the absolute values of the two sequences always balance each other so that the product is small enough to define an absolutely convergent series.

But the Sato-Tate law at least immediately implies that for each sequence independently, there is a positive proportion of primes where |λi(p)|>α, for any fixed α with 0<α<2. If we take α small enough, the proportion will be >1/2 for each sequence, so there will be a (smaller) positive proportion of primes where both sequences are “large” in absolute value. Since S(X) is (by positivity) at least as large as any partial sum, we win.
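To spell out the counting step, here is a short sketch. The sets A_i and the densities d_i are notation introduced here for illustration and do not appear in the original argument.

```latex
% A_i and d_i are illustrative notation, not taken from the original text.
\[
A_i = \{\, p \ \text{prime} : |\lambda_i(p)| > \alpha \,\}, \qquad
\#\{p < X : p \in A_i\} = (d_i + o(1))\,\pi(X), \qquad d_i > \tfrac12 \quad (i = 1, 2).
\]
% Inclusion-exclusion: two sets of primes of density > 1/2 must overlap.
\[
\#\{p < X : p \in A_1 \cap A_2\}
\;\ge\; \#\{p < X : p \in A_1\} + \#\{p < X : p \in A_2\} - \pi(X)
\;=\; (d_1 + d_2 - 1 + o(1))\,\pi(X).
\]
% Every term of S(X) is non-negative when the signs are compatible,
% so S(X) is at least the partial sum over A_1 \cap A_2.
\[
S(X) = \sum_{p < X} \lambda_1(p)\lambda_2(p)
\;\ge\; \alpha^2\, \#\{p < X : p \in A_1 \cap A_2\}
\;\ge\; \alpha^2\,(d_1 + d_2 - 1 + o(1))\,\pi(X) \longrightarrow \infty .
\]
```

Since d_1+d_2-1>0, the sum S(X) is unbounded, which is all that the dichotomy requires.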

Now, for the workaround… The Sato-Tate law is only a theorem for a restricted class of modular forms. For non-holomorphic cusp forms in particular, it seems very hard to prove. Can we still show that the sequence of signs of their Fourier coefficients determines uniquely such modular forms? Yes, by adapting slightly an idea of Serre that he used to show that various other consequences of the Sato-Tate Conjecture could be derived from the accumulated known results concerning the existence of symmetric power L-functions (which, since the irruption of the Langlands program, seem to be the most natural way to attack this type of conjecture).

Here the idea is to find some even real polynomial P of small degree (4 if using symmetric fourth powers, 6 if using symmetric sixth powers, and so on) with graph looking like this (for non-negative values):

[Plot: graph of the polynomial P for non-negative values]

The idea is that the sum of P(λi(p)) over p<X is clearly at most the same sum restricted to the primes where |λi(p)|>α, where α is the real zero of P that we see on the picture (the other terms being non-positive), and hence, when P≤1 on [0,2], at most the number of such primes. On the other hand, by decomposing P as a sum of Chebychev polynomials X_{2j}, j even (which, evaluated at the Fourier coefficients, represent the coefficients of the 2j-th symmetric power), the sum is asymptotically the same as a_0 π(X), where a_0 is the coefficient of X_0=1, if we know that the polynomial only involves symmetric powers for which we know there is no pole at s=1. If moreover this leading coefficient is >1/2, it follows that the set of primes where the Fourier coefficients in both sequences are simultaneously larger than α in absolute value has positive density, and one can conclude as we did under the Sato-Tate Conjecture.
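The two estimates can be written side by side as follows. This is only a sketch: it assumes that P is even, that P≤0 between 0 and α (as read off from the graph) and that P≤1 on [0,2] (the normalization used in the next paragraph), and it writes a_{2j} for the coefficient of the 2j-th Chebychev polynomial.

```latex
% Assumptions for this sketch: P even, P <= 0 on [0, alpha], P <= 1 on [0, 2].
% Upper bound: drop the non-positive terms, then bound each remaining term by 1.
\[
\sum_{p < X} P(\lambda_i(p))
\;\le\; \sum_{\substack{p < X \\ |\lambda_i(p)| > \alpha}} P(\lambda_i(p))
\;\le\; \#\{\, p < X : |\lambda_i(p)| > \alpha \,\}.
\]
% Asymptotic: only the constant term survives when the symmetric power
% L-functions involved have no pole at s = 1.
\[
\sum_{p < X} P(\lambda_i(p))
\;=\; \sum_{j} a_{2j} \sum_{p < X} X_{2j}(\lambda_i(p))
\;=\; (a_0 + o(1))\,\pi(X).
\]
% Combining the two:
\[
\#\{\, p < X : |\lambda_i(p)| > \alpha \,\} \;\ge\; (a_0 + o(1))\,\pi(X),
\qquad a_0 > \tfrac12 .
\]
```

With a_0>1/2 for both sequences, the two sets of primes must overlap in a set of positive density.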

So can we implement this? With 6th powers, we can, but not with 4th powers only! Indeed, one can easily show that there is no polynomial P = a_0 X_0 + a_2 X_2 + a_4 X_4 such that P(0)≤0, a_0>1/2, and P≤1 on [0,2]. But the polynomial P = X^2 - X^4/4 (written here in powers of the variable X) is “borderline”: it only fails because a_0=1/2 in this case. Then we can simply add small multiples of X_0 and X_6 to obtain the graph above, the 6th-power term being adjusted to compensate for the increase of a_0 above 1/2.
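A quick numerical sanity check of this construction can be sketched in Python. Several things here are assumptions rather than part of the original argument: the borderline polynomial is read as x^2 - x^4/4 in powers of the variable, the coefficient a_0 is computed as the average of P against the Sato-Tate measure (using the standard fact that its even moments are the Catalan numbers), and the perturbation sizes eps and delta are hypothetical values picked by hand. The helper names are made up for this sketch.

```python
from math import comb

import numpy as np
from numpy.polynomial import Polynomial


def chebyshev_X(n):
    """Return X_n, defined by X_0 = 1, X_1 = x, X_{k+1} = x*X_k - X_{k-1}."""
    x = Polynomial([0.0, 1.0])
    prev, cur = Polynomial([1.0]), x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur


def sato_tate_mean(poly):
    """Average of poly against the Sato-Tate measure on [-2, 2].

    Odd moments vanish and the 2k-th moment is the Catalan number
    comb(2k, k) / (k + 1), so only even-degree coefficients contribute.
    For a combination of the X_n this average is the coefficient a_0.
    """
    return sum(
        c * (comb(m, m // 2) // (m // 2 + 1))
        for m, c in enumerate(poly.coef)
        if m % 2 == 0
    )


# Sample points on [0, 2]; P is even, so non-negative values suffice.
grid = np.linspace(0.0, 2.0, 20001)

# Borderline case x^2 - x^4/4, coefficients listed from degree 0 upwards.
P_border = Polynomial([0.0, 0.0, 1.0, 0.0, -0.25])

# Hypothetical perturbation sizes, chosen by hand for this sketch.
eps, delta = 0.01, 0.02
P = P_border + eps * chebyshev_X(0) + delta * chebyshev_X(6)
a0 = sato_tate_mean(P)

print("borderline a_0:", sato_tate_mean(P_border))
print("perturbed a_0:", a0)
print("P(0):", P(0.0), " max of P on [0, 2]:", P(grid).max())
```

With these particular values the multiple of X_6 is taken slightly larger than the multiple of X_0, so that P stays non-positive at 0 and below 1 near its maximum; other small choices may work equally well.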

The simplicity of the argument shouldn’t obscure the depth of the underlying tools: the analytic continuation and absence of a pole at s=1 for the 4th and 6th symmetric powers is a very recent fact, proved by Kim and Shahidi in 2002.

(To conclude, I should say that it’s very possible that this question has already been considered, although looking in Math. Reviews didn’t turn up any directly related paper; I’d be happy to mention any earlier work, of course; also I’ve disregarded some issues, e.g., having to do with CM forms. For details of the arguments and a few other questions, see this short note).

Evolution of the paper (followup)

As a followup to the earlier post on the evolution of the paper, here’s another piece of information: the word “conjecture”, in the sense currently used (and sometimes abused), doesn’t appear in the Annales de l’ENS before the 1950’s (searching for “conjectur*” in the whole text, and disregarding two early papers of Jacobi and Beltrami which are really discussions of how they were led to some result or other, and some chemistry/mineralogy papers).

The evolution of the paper

One of the nice consequences of the current development of online archives is that reading the classics has really become much easier than before. Of course, “classics” refers here to those mathematical texts which were published in journals since the middle to late 19th century, and are in easily accessible languages (which, for me, means French or English, and I guess I can look a little bit at Italian papers without feeling too lost too quickly).

As a sometime “flâneur” along those roads, my favorite site is NUMDAM, which contains archives of many journals and seminars, mostly French (there are a few Italian journals and seminars included, as well as Compositio Mathematica). Compared with other sites like JSTOR, Project Euclid, or the Goettingen archive, NUMDAM seems much quicker and easier to browse. It is also freely available (except for the last few years of some journals, but since we speak of classics here, this is not an issue). Moreover, besides the standard PDF format, it has copies of the papers in the djvu format, which is much more compact.

During the last snowy Easter week-end, I’ve looked at NUMDAM to check some points of what might be called the natural evolution of the mathematical paper. I used the Annales Scientifiques de l’École Normale Supérieure as a source, because it has been published continuously since 1864.

So one finds, in no particular order (and with no guarantee of correctness of the dates mentioned…):

  • That proper separate bibliographies did not occur until around 1948, with footnotes then indicating that “Numbers between brackets refer to the bibliography located at the end of the paper”.
  • Except that this was mostly in French; in fact, the first papers not in French in this journal appear in 1968 only (series 4, volume 1; it may be, and this would be easy to check, that this new series was the first where languages other than French were permitted). The second such paper is quite famous: it is John Tate’s Residues of differentials on curves.
  • What about the first joint paper? The honor goes to Castelnuovo and Enriques, in 1906. But this is an outlier: the next two only occur in 1934 (one is also famous, due to Leray and Schauder), and papers with more than one author can be counted on two hands until 1954. (I am disregarding, here, two earlier papers by Pasteur and Raulin, in 1872, which have to do with the fight against silk-worm diseases, and another one on the construction of the official “mètre étalon”, or yardstick; such non-mathematical papers disappeared around that time, although in 1896, the journal included the discourse given by the renowned Désiré Gernez on the occasion of the inauguration of a statue of Pasteur.)
  • Then, what about the first joint paper in English? There’s Tate again, with F. Oort in 1970.
  • OK, and what about the first paper written by a French person in English? Here, there can be some ambiguity, since someone may well be French without having a name that claims it to the world (…), but Alain Connes may be the first one.
  • Another piece of information that could be interesting would be the first article written by a woman, but since first names are typically missing from much of the early tables of contents (and can be ambiguous), this is even harder to decide. The first unambiguous example I saw is due to Jacqueline Ferrand, in 1942.

In an idle moment today, I also looked more quickly at the Bulletin de la Société mathématique de France, which goes back to 1872. The various dates are somewhat similar; there is a single paper not in French before 1952, by Wiener (in 1922). I can envision a vicious fight among the editors to decide if it could be published; at least the first few paragraphs mention that the work on which the paper is based was done in France, and profusely thank Fréchet for his insights… (There are also a few earlier papers translated into French from their original language, for instance Heegaard’s thesis appears in 1916, translated, presumably, from the Norwegian). The first joint paper appears in the same year, 1916, and the next one only in 1930.

Reading the titles of articles before 1950 or so can be quite amusing; mixed with terminology that still seems very modern and recognizable, there are certain gems like the anallagmatic metric (“métrique anallagmatique”) of R. Lagrange, in 1942. Strangely enough, this is a word recognized by the OED: “Not changed in form by inversion: applied to the surfaces of certain solids, as the sphere”, with quotations from Clifford, in 1869, and Salmon, in 1874. This is too bad, since I was hoping that the date indicated that this paper was an elaborate coded transmission to the Free French… There is also in most titles an element of high and formal seriousness that can be a bit tiring; all those papers starting with “Sur une équation…”, or “Sur une propriété…”, or “Sur quelque chose…” (“On something…”) do not give a great impression of the fun of doing mathematics.

The authors of the early papers are also divided pretty sharply between names we all know (or have heard about, not only French, but from most countries we think of as having a strong mathematical culture in that period), and completely obscure characters (at least to me). The most magnificent I found is the (probably) redoubtable Gaston Gohierre de Longchamps who published three papers in the Annales de l’ENS between 1866 and 1880 (the second of which concerns Bernoulli numbers, and quotes the equally remarkable M. Haton de la Goupillère).

Finally, one observes with interest that the poor reputation for rigor of these older mathematicians is an unwarranted slander; no erratum is needed for the entire corpus of the Annales de l’ENS until 1953, with the sole exceptions of one in 1907 (a paper by Émile Merlin), and of a remonstrance by Brouwer pointing out a few mistakes in a paper of Zoretti in 1910, which the latter rather grudgingly accepts (by claiming that another mathematician had priority in finding those)…

Gossip

Here is a minor post about one of the minor pleasures of life: etymology. I realized recently that “a gossip” (as a person) and “une commère”, which are more-or-less translations of each other in French and English, have the same higher-level etymology. In other words, they evolved in the same way from words having the same meaning, that of “godmother”, or “marraine” respectively, although they do not share a common older Latin, or Greek, or other, root (the English word comes from “God” and “sib”, of Teutonic origins related to “kin”; the French comes from Latin “co” and “mater”). In English, the forms “Godsibbas”, “godzybbe”, go as far back as the 11th century, but the modern meaning is attested in the OED only from the late 16th century on. In French, the original meaning is seen from the late 12th century, and the modern one from the 14th century.

(I will leave aside any discussion about the historical or psychological insights provided by this parallel evolution of godmother).

There’s not any mathematics in there, of course, except for having decided to look for the relevant information (in the OED and the Grand Robert de la Langue Française, or the Trésor de la langue française, respectively) after discussions with colleagues at ETH where the question arose of the right translation of “gossips” in French. Note that although “commérages” is in a sense obvious, the undercurrents seem to not be quite the same…