An introductory course in integration and probability

A while ago (in 2002 to be precise), I taught an introductory course on integration, Fourier analysis, and probability in Bordeaux (for third-year university students). While giving the lectures during the first year, I typed them more or less in parallel (in fact, sufficiently close to the course that I could probably also have done it on a weblog, as T. Tao has done this year with his course on Ergodic Theory, and his course on Perelman’s proof of the Poincaré Conjecture…).

Except for using those notes for two more years in the same course (or a very similar one the third year), I did not work on them anymore. But since I had spent a fair amount of time bringing the text to a reasonable state of polish, and since I’ve found myself using it a few times as a convenient reference for basic facts that I wanted to show to other students, it seems more than reasonable to put this course on the web somewhere. And so, here it is: Un cours d’intégration.

The text is in French, which diminishes its current interest, but as it is likely that I will teach this topic again in English at ETH, I’ll probably use it as a basis for a translation and/or adaptation.

As the introduction indicates, the only noticeable feature of this integration course (though I don’t think it is at all unique) is that I have treated probability theory in parallel with measure theory, instead of treating probabilistic language and its basic results after the main development, as is often done. I find this is better for a first course, because even students who never go on to another probability course can get, if they are attentive, a good immersion in the special language and frame of mind of probabilists. (And this was a practical concern in Bordeaux because, typically, there wasn’t another probability course following this one, at that time at least.)

The course itself is roughly 140 pages long, and is fairly standard. In measure theory, most of the basic results are treated, though not the Radon-Nikodym theorem. In probability, things go as far as the Central Limit Theorem, but there are no martingales.

The last 40 pages contain the exams given the first two years I gave the course, with corrections. Some of the problems are dedicated to proofs of quite nice results, including the Radon-Nikodym theorem on [0,1] and Lebesgue’s differentiation theorem, and some are more standard. (Those who have never seen a French-style three- or four-hour exam can have a look to get an idea of how this type of thing is done in this strange country; with hindsight, it’s clear the exams were quite a bit too difficult for the students I had…)

Tidbits of terminology and other folklore

Two small and independent interrogative remarks:

(1) Nowadays, an extension of a group G by a group K is a group H fitting in a short exact sequence

1\rightarrow K\rightarrow H\rightarrow G\rightarrow 1

in other words, and rather counterintuitively,  the group G is a quotient of the group which is the extension. When did this terminology originate? A paper of Alan Turing (entitled, rather directly, “The extensions of a group”) defines “extension”, in the very first paragraph, exactly in the opposite (naïve) way, quoting Schreier and Baer who, presumably, had the same convention.

(2) There’s a whole lot of discussion here and there about the mystical “field with one element”; usually, papers of Tits from around 1954 are mentioned as being the source of the whole “idea”; however, the following earlier quote from a 1951 paper  of R. Steinberg (“A geometric approach to the representations of the full linear group over a Galois field”, 1951, p. 279, TAMS 71, 274–282) seems to also contain a germ of the often mentioned analogy between the formulas for the order of the Weyl groups and those of groups of Lie type over a field with q elements, the former being obtained by specializing the latter for q=1:

“In closing this section, a remark on the analogy between G and H seems to be in order. Instead of considering G as a group of linear transformations of a vector space, we could consider G as a collineation group of a finite (n-1)-dimensional geometry. If q=1, the vector space fails to exist but the finite geometry does exist and, in fact, reduces to the n vertices of a simplex with a collineation group isomorphic to H.”

In this citation, G is GL(n,F_q), and H is the symmetric group on n letters.
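The specialization at q=1 can be checked numerically. The sketch below is not taken from Steinberg's paper: it uses the standard order formula for GL(n,F_q), and the function names `gl_order` and `flag_count` are hypothetical. Dividing the order of GL(n,F_q) by q^{n(n-1)/2}(q-1)^n (the order of the subgroup of upper-triangular matrices) leaves a polynomial in q whose value at q=1 is n!, the order of H.

```python
from math import factorial, prod

def gl_order(n, q):
    """Order of GL(n, F_q): the product of (q^n - q^i) for i = 0, ..., n-1."""
    return prod(q**n - q**i for i in range(n))

def flag_count(n, q):
    """Product of (1 + q + ... + q^(i-1)) for i = 1, ..., n.

    For a prime power q this is |GL(n, F_q)| divided by the order
    q^(n(n-1)/2) * (q-1)^n of the upper-triangular subgroup.
    """
    return prod(sum(q**k for k in range(i)) for i in range(1, n + 1))

for n in range(1, 7):
    # The factorization holds for genuine finite fields...
    for q in (2, 3, 4, 5, 7):
        assert gl_order(n, q) == q**(n * (n - 1) // 2) * (q - 1)**n * flag_count(n, q)
    # ...and the remaining polynomial factor gives n! when q = 1.
    assert flag_count(n, 1) == factorial(n)

print(gl_order(2, 2), flag_count(2, 2), flag_count(4, 1))
```

Note that `gl_order(n, 1)` itself is 0, so the analogy only appears after the factor (q-1)^n has been removed.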

(3) Here’s a third question: when did the terminology “Galois field” become more or less obsolete within the pure mathematics community?

Yet another property of quadratic fields with extra units

Here is an amusing exercise that is suitable for a course on basic algebraic number theory: let p be a prime number. Consider integral solutions (a,f) to

4p=a^2+3f^2

with a and f positive. The claim is that, if p is congruent to one mod 3, there are three distinct solutions (a,f), (b,g), (c,h), and if they are ordered

1\leq a<b<c

then we have

c=a+b

For example, in the case p=541,  we find a=17, b=29, c=46:

4p=2164=17^2+3\times 5^4=29^2+3^3\times 7^2=46^2+3\times 2^4

The pedagogic value of the exercise is that while it looks like something that one could prove by a simple brute force computation, this is not so easy to do, while it becomes elementary knowing the basic facts about factorizations of primes in quadratic fields (and the units of imaginary quadratic fields, in this case Q(√-3)).

Indeed, the equation means that p is the norm of the integral ideal generated by

\frac{a}{2}+ \frac{f\sqrt{-3}}{2}

It is known that only primes congruent to 1 modulo 3 are norms in this field, hence the first condition on p. Then, it is known that the ideal above is unique up to conjugation. So the only possible extra solutions, given one of them, are obtained by multiplying by a unit of the field, and isolating the “coefficient of 1”, or in other words taking the trace to Q. Since the units are

\pm 1,\, \pm j=\frac{\mp 1\pm \sqrt{-3}}{2},\, \pm j^2

simply multiplying shows there are three positive solutions:

a,\,\frac{a+3f}{2},\,\frac{|a-3f|}{2}

Depending on the sign of a-3f, the conclusion follows by considering two cases.
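The claim is easy to test by machine for small primes. The following is a minimal brute-force sketch (the function name `solutions` is hypothetical); it enumerates the positive solutions of 4p=a^2+3f^2 and checks the relation c=a+b, but of course proves nothing.

```python
from math import isqrt

def solutions(p):
    """Return all pairs (a, f) of positive integers with 4p = a^2 + 3f^2, sorted by a."""
    found = []
    f = 1
    while 3 * f * f < 4 * p:
        rest = 4 * p - 3 * f * f      # candidate value of a^2
        a = isqrt(rest)
        if a * a == rest:
            found.append((a, f))
        f += 1
    return sorted(found)

# The example from the text.
print(solutions(541))   # [(17, 25), (29, 21), (46, 4)]

# A few primes congruent to 1 modulo 3.
for p in (7, 13, 19, 31, 37, 43, 541):
    (a, _), (b, _), (c, _) = solutions(p)
    assert c == a + b
```

For a prime congruent to 2 modulo 3, such as p=5, the function returns an empty list.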

[This minor property of quadratic fields was motivated by the question of finding interesting examples of relations between the zeros of zeta functions of algebraic curves over finite fields; for quite a bit more about this – both results of independence and examples of relations -, see my preprint on the subject, in particular Section 6 for the examples.]

Équidistribution, or équirépartition ?

For years, I have been convinced that the proper French translation of “equidistribution” was not the faux ami (false friend) “équidistribution”, but rather the word “équirépartition”. The latter is for instance used by Serre (and Bourbaki).

But then  I realized recently that Deligne uses “équidistribution” in his great paper containing his second proof of the Riemann Hypothesis over finite fields, which contains in particular his famous equidistribution theorem (see Section 3.5, entitled “Application: théorèmes d’équidistribution”).

Since, in fact, neither word appears in the French dictionaries I have available (unsurprisingly: “equidistribution” is not in the OED), and since moreover “distribution” and “répartition” do appear and are identified as synonyms, it seems now that in fact both words should be acceptable…

There is no hyperbolic Minkowski theorem

Minkowski’s classic theorem of “geometry of numbers” states that any convex subset of R^n which is symmetric (with respect to the origin) and of volume (with respect to Lebesgue measure) larger than 2^n contains a non-zero integral point.

This theorem is used, in particular, in the classical treatment of Dirichlet’s Unit Theorem in algebraic number theory. While teaching this topic last year, I wondered whether there was a hyperbolic analogue, in the following sense, where H is the hyperbolic plane in the Poincaré model:

does there exist a constant C such that any geodesically convex subset B of the hyperbolic plane H with hyperbolic area at least C which is geodesically symmetric with respect to the point i contains at least one point z of the form g.i, where g is an element of SL(2,Z) and g.i refers to the usual action by fractional linear transformations, with z not equal to i?

Here, the subset B is geodesically convex if it contains the geodesic segment between any two points, and symmetric if, whenever x is in B, the point on the geodesic from i to x which is at distance d(i,x) from i, but in the opposite direction, is also in B.

It turns out that the answer is “No”. Indeed, C. Bavard gave the following example:

[Figure: Cone]

Let B be a Euclidean half-cone with base vertex at 0 and axis the vertical axis. If the angle at the origin is small enough, then B does not contain any “integral” point except i, but has infinite hyperbolic area. Moreover, it is easily seen that B is convex and symmetric in the hyperbolic sense, since hyperbolic geodesics are vertical half-lines and half-circles meeting the real line at right angles.

To check the claim, it is enough to show that for any integral point z=g.i distinct from i, the ratio |x|/y has a positive lower bound, where z=x+iy (this will show that the angle from the vertical axis is bounded from below, so the point is not in a cone like the one above with sufficiently small angle). But z is given by (ai+b)/(ci+d) with a, b, c, d integers and ad-bc=1. Multiplying numerator and denominator by d-ci gives x=(ac+bd)/(c^2+d^2) and y=1/(c^2+d^2), so this ratio is simply |ac+bd|. Being an integer, either it is 0, or it is at least 1. Using the identity (a^2+b^2)(c^2+d^2)=(ac+bd)^2+(ad-bc)^2, one checks that the first case only occurs for matrices in SL(2,Z) which are orthogonal matrices, and those fix i, so the point is then z=i. Hence, except for this case, the ratio is at least 1 and this concludes the argument.
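As a sanity check of this computation, here is a small numerical sketch (the function name `min_ratio` and the entry bound are arbitrary choices made for illustration). It runs over all matrices in SL(2,Z) with entries of absolute value at most a given bound, computes z=g.i in floating point, compares |x|/y with |ac+bd|, and records the smallest ratio over the points different from i.

```python
from itertools import product

def min_ratio(bound):
    """Smallest |x|/y over points z = g.i != i, for g in SL(2,Z) with entries in [-bound, bound]."""
    smallest = None
    for a, b, c, d in product(range(-bound, bound + 1), repeat=4):
        if a * d - b * c != 1:
            continue
        # z = (a*i + b) / (c*i + d); the denominator is non-zero since ad - bc = 1.
        z = complex(b, a) / complex(d, c)
        ratio = abs(z.real) / z.imag
        # Compare with the closed formula |ac + bd| (up to rounding errors).
        assert abs(ratio - abs(a * c + b * d)) < 1e-9
        if abs(z - 1j) < 1e-9:
            continue  # orthogonal matrices fix i
        smallest = ratio if smallest is None else min(smallest, ratio)
    return smallest

print(min_ratio(4))  # close to 1, attained for instance at z = i + 1
```

This is only a finite floating-point check, consistent with the lower bound 1 obtained above; it is not a substitute for the argument.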

It is interesting to see what breaks down in the (very simple) proofs of Minkowski’s theorem in the plane. In the first proof found on page 33 of the 5th edition of Hardy and Wright’s “An introduction to the theory of numbers” (visible here), the problem is that there is no way to dilate the convex region B in a homogeneous way compatible with the SL(2,Z) action. In other words, SL(2,Z) is essentially a maximal discrete subgroup of SL(2,R) (maybe it is maximal? I can’t find a reference).