Quoting the great unknown

Most scientists, and mathematicians in particular, must have been confronted at some point with the problem of properly quoting a theorem (or a theory, or a conjecture, …) that they do not truly understand. It may be that they have not read the proof in enough detail to have checked its correctness independently, or that the result involves concepts, objects and theories with which they are not familiar. Now, how should this be phrased? Surely, this is part of the information to convey to the reader of a scientific paper or book?

Well, as far as I understand (this being one of the many things that one is supposed to learn on the job), this type of doubt or indeterminacy is supposed to be hidden. The only accepted style of citing, at least judging from the typical mathematical text, is both steadfastly Olympian and touchingly Gimpelian. It implies, at the same time, a complete mastery of whatever is involved in the result mentioned, however incorrectly it may be phrased (“Wiles has proved that every modular elliptic curve is associated with a cusp form with integral coefficients”), and a childlike belief that whatever is claimed in a written source must be true (“Grothendieck has proved [SGA V, Exp. 12, Section 4, Théorème 3.1] that 27 is the only prime number which is not squarefree”). I’ve very rarely seen a quote come with a comment that the author does not claim to have understood it (which, it is true, would be somewhat embarrassing for a result which is crucial to one’s proof…)

Personally, I often feel a definite awkwardness in citing a result I don’t “know” in the deepest sense of the word, and of course with the current profusion of preprint sources, things tend to become worse, as more results are available for use and quotation than ever. One can try being careful (“Grothendieck has announced a proof that 27 is prime”), but what may seem reasonable when speaking of a result barely out on the arXiv quickly becomes insulting when referring to some result which is published and is really true, but which, to make no bones about it, one has simply not checked personally. Indeed, referees may take this badly, especially if they happened to prove the theorem in question.


To give a concrete example, I have no doubt that the Riemann Hypothesis over finite fields is true, but although I have really done a lot of reading about it, and can claim to have gone through the first proof of Deligne in great detail, I cannot yet claim to have mastered the second, which I’ve used much more often in my work (and even with the first, I certainly do not have a full mastery of all the background material, such as the complete proof of the Grothendieck-Lefschetz trace formula). This example is not academic at all: many analytic number theory results depend on estimates for exponential sums which are not accessible at all without Deligne’s work and its extensions, but very few analytic number theorists understand the full proof. (My own PhD advisor said that this was essentially the only result he had ever used that he could not, if needed, reprove from scratch.)

I think most uses of the Riemann Hypothesis over finite fields are fair, however: many people who use it for analytic number theory have a highly sensitive sense of what estimates should be correct, and indeed they quite often suggest precise conjectures about what they expect (which are then hopefully confirmed by people like N. Katz, using the deep algebro-geometric framework underlying the Riemann Hypothesis). The point is that there is really an interplay between the analytic number theorists (who suggest the application of the Riemann Hypothesis, quite frequently in highly non-trivial ways) and the algebraic geometers who prove that the required estimates hold. (One of the nicest examples of this interaction is the beautiful paper of Fouvry and Katz on applications of the Katz-Laumon theory of “stratification of estimates” for families of exponential sums.)

Another fairly common situation (at least recently), for analytic number theory, is to use the modularity of elliptic curves. Here, the gap between my understanding and what would be required to claim that I can vouch for the proof, is larger: although I’ve been exposed a lot to the general strategy, I’ve never understood in any depth the crucial final steps. But again, what I (and many others) usually want to use this theorem for is to state a corollary of a result we have proved for more general modular forms. Even if the case of elliptic curves might be the best motivation for the work, the general case is usually interesting enough that it doesn’t feel like cheating, especially if this point is explained in the introduction.

These are cases where it’s possible to phrase things professionally enough. But sometimes, it’s much more complicated. Indeed, what should be true may be much less obvious, and the result we prove might depend completely on a truth about which we can, in all honesty, only say that it’s nice that it’s true, because otherwise (and it could have been otherwise, for all we know), we would be stuck.

Thus I’ve had the occasion to use results that ultimately depend on the classification of finite simple groups (through the type of “strong approximation theorems” that say that the reduction modulo most primes p of a Zariski dense subgroup of SL(n,Z), for instance, is the full group SL(n,Z/pZ)), and here it’s quite difficult for me to know how to quote this honestly. It is true that the corollaries of the classification which are involved do not really depend on its finer aspects (e.g., they do not depend on how many exceptional groups there are, provided there are only finitely many), but on the other hand, I don’t really have enough personal experience and intuition about finite groups to feel that the classification should be as it is, with a few known infinite families and finitely many exceptions.


Note that I really have no suggestion about this. Would it be good advice to suggest, say, that any analytic number theorist working on modular forms should spend a few weeks getting acquainted enough with the proof of the Ramanujan-Petersson conjecture, in order to feel better about using it? This might be what I would want to say, but how far can it get? If one starts from scratch (knowing no algebraic geometry, for instance), much more than a few weeks will be needed to feel any confidence about the correctness of the proof.

In the prevailing scientific mindset, this is an investment of time that may well be out of the question for most people (with, possibly, the exception of beginning graduate students — they, I think, should at least spend enough time acquiring the basics of many “languages”, so that they can go further later on if needed in their work; it is much harder to learn the principles of probability theory, or combinatorics, or algebraic number theory, as a post-doc or young professor, than as a student). I must confess that, at the current time, I certainly do not think I will be able to invest significant efforts in trying to understand the classification of finite simple groups…

Published by

Kowalski

I have been a professor of mathematics at ETH Zürich since 2008.

7 thoughts on “Quoting the great unknown”

  1. I think it depends to a large extent on how widely known the result is. If it is prominent enough, and one trusts the practitioners of that field, then presumably it has been checked and understood by the experts, and it would be safe to cite – if it turns out later to be wrong, you might be embarrassed a little bit, but the experts in the field would be much more so.

    If it’s an obscure result that has not been followed up much by the experts in the field, though, then I suppose one is obliged to do some checking of the result by oneself (or at least to subject the result to various “sanity checks”, e.g. computing key special cases).

  2. This is a tangent to your topic, but I’m interested in your comment about the two proofs of the Weil conjectures. Can you give a simple example in analytic number theory where Weil II says something quantitatively stronger than Weil I?

  3. Dear Emmanuel,
    1) Thank you for the link to Gimpel The Fool: it is a very, very moving and profound tale.

    2) Terence Tao writes: “If it is prominent enough, and one trusts the practitioners of that field, then presumably it has been checked and understood by the experts, and it would be safe to cite …”.
    This confirms an old feeling of mine that Mathematics is uncomfortably closer to Sociology than I would wish.

    3) My good computer science friends assure me that all these problems will soon be a thing of the past thanks to completely automated proof verification.

    4) I think you are very brave to mention this theme (quoting results not understood by the author), which is rarely evoked in polite society.

    5) On a lighter note, you write:
    “This example is not academic at all: many analytic number theory results depend on estimates…”
    Of course I understand what you mean and you are completely right.
    But come to think of it, this is also utterly false: what could be more academic than analytic number theory? Polysemy, what a hold you have on us…

    6) Polysemy, Schmolysemy, I am looking forward to more of your always thought-provoking posts.
    Yours is a truly wonderful blog: thank you again.

  4. Concerning dt’s query: Weil II is particularly useful because the formalism is much more extended, and for instance I don’t think the Katz-Laumon theory (as used in Fouvry-Katz) could be phrased entirely within Weil I. Not only does the theory work over non-smooth, non-projective bases, but also the coefficients are not constant, and cases where the coefficients are “mixed” are considered on an equal footing. Even if one can, in a given instance, “dévisser” everything to a Weil I situation (compactifying, extending to the boundary, using coverings to trivialize the coefficient sheaf as much as needed…), these are not easy to do compared with a direct application of Weil II.

    To Terry and Tvordy: I think the practice of mathematics is indeed full of human/sociological aspects, and in particular issues of trust are often very important. The way these vary in “space” (the typical practice may be very different in a subfield compared with another), and in time could probably be the subject of many studies (e.g., I would guess that at the beginning of Grothendieck’s work on algebraic geometry, basically anyone who understood knew everyone else and had checked everything or could directly ask someone who had; in a different direction, in a field like analytic number theory, where one often has to adapt ideas and arguments of others instead of quoting a ready-made theorem, the state of “equidistribution of understanding” might be very different from algebraic geometry, where you can’t expect a starting student to read all EGA/SGA before working on a problem…)

    It would/will definitely be nice to have a lot of automated computer checks, but I think it will be a while (again, fields may differ; in analytic number theory, proving the correctness of the Prime Number Theorem, which has been done, I think, will not help that much because most often, one needs variants, variations, extensions, uniform versions, and there are many parameters…).

  5. Dear Emmanuel,

    You have raised an interesting issue, with (I believe) no simple answer. I think that Terry’s suggestion on how to deal with the situation is a sensible one. I might add another piece of advice. (Note that, as with Terry’s advice, this is not advice on how to address this issue in one’s writing, but rather, how one should proceed when confronted with this situation in one’s research, so as to avoid blunders.)

    Most pieces of mathematics (including Weil II, for example) fit into a framework (and I don’t here mean a logical framework, but rather a narrative framework), with illustrative analogies to other parts of mathematics (in the case of the Weil conjectures, there are important analogies with algebraic topology and Hodge theory), interconnections between various results in the area, key motivations and heuristics, and so on, and one can often learn these even if learning the actual details of the arguments is out of the question.

    If there is such a narrative that one can learn, I would say it is normally a good idea to learn it, since it will give one a better feeling for the results being cited, and a better feeling for how to apply them correctly. On the other hand, if such a narrative structure isn’t available, it will probably be harder to test the correctness of one’s understanding of the results, since (short of actually reading the proof), there is nothing to check against.
    In such a situation, it is probably a good idea, if possible, to verify with an expert that one is really applying the result in a correct manner. Good expository literature can also help a lot (either to learn the narrative, if one is available, or at least to learn one’s way around the results that one wants to apply).

    On the question of how one should phrase the citation in such a situation (of citing a result whose proof one doesn’t know): I think that having a good understanding of how to apply a result is itself a valid and important skill, whether or not one knows how to prove the result. (Similarly, we value good drivers/pilots of vehicles, as well as the engineers who build the vehicles themselves.) I don’t think that there is any intellectual dishonesty in citing a result with confidence, if one is genuinely confident that it is true (and trust in a group of established experts is a genuine and legitimate source of confidence) and one is genuinely confident that one understands the statement and the ways in which it can be applied.

    On the other hand, if one doesn’t have this genuine confidence with regard to a result that one is applying in some argument, then one could be heading for a blunder, and I would say that caution is required, not just in the citation, but in the construction of the argument itself.

  6. Actually, the strong approximation theorems mentioned by Emmanuel DO seem to have a proof which does NOT rely on the Classification of finite simple groups; see the paper of Richard Pink: Strong approximation for Zariski dense subgroups over arbitrary local fields, Comm. Math. Helv. 75 (2000), 608-643.
    The proofs in that paper rely instead on a 1998 preprint of Larsen and Pink (Finite subgroups of algebraic groups), which can be found e.g. on the homepage of Pink. It has NOT appeared, even though at one time it was quoted as “to appear in the JAMS” (!!!!) (and my understanding is that experts found the proof essentially correct).
    Larsen and Pink prove a characteristic p version of Jordan’s theorem. To quote a consequence: if L is a finite simple subgroup of GL(n,F), F a field of characteristic p, of order |L| > f(n), then L is a (known) simple group of Lie type.
    The general result has been used to prove (or reprove) a large part of finite asymptotic group theory without CFSG.

  7. To M. Emerton: I like the word “narrative”, since it goes well with the need to understand the “language” in which various works are written — and this is, I think, something that can be done and encouraged at the graduate school level.
    However, one observes that languages tend to evolve, in mathematics, and one may have been fairly fluent in understanding a theory outside of one’s field at some point, and become completely lost ten years later. (E.g., Hilbert asking in a conference: “Can you explain to me, what is a Hilbert space?”)

    To L. Pyber: Thanks for the reference to strong approximation! I am vaguely reminded of having heard this or something similar earlier. I’m actually very well placed (i.e., I just need to step to the office next door to ask Richard…) to get more recent information about that preprint, so I’ll ask him next week.

    Interestingly, a colleague told me today of a remark by R. Thom, which can be found in his contribution to

    http://www.ams.org/bull/1994-30-02/S0273-0979-1994-00503-8/S0273-0979-1994-00503-8.pdf

    (p. 25), and states

    “My feeling is that it is unethical for a mathematical researcher to use a result the proof of which he does not “understand” (except for the specific case where he wants to disprove the result). In principle, of course, understanding here means a thorough knowledge of all the arguments involved in the written proof.”

    That’s pretty strong…
