Cartan’s “Sur certains cycles arithmétiques”

Searching on Numdam for the papers of É. Cartan, I noticed one from 1927 entitled “Sur certains cycles arithmétiques”. Although this was not the one I was looking for, natural curiosity immediately got the better of me, and I downloaded the article, wondering what marvels could be there: an anticipation of Heegner cycles? special subvarieties of Shimura varieties?

As usual, the truth was even more surprising than such exalted expectations. Indeed, Cartan, considering “le problème de Mathématiques élémentaires proposé au dernier concours d’agrégation”, raises and solves the following question:

Classify, for any base a\geq 2, the integers n\geq 1 such that, when n is written in base a, the integers obtained by all cyclic permutations of the digits are in arithmetic progression.

This is rather surprising, since I had no idea that É. Cartan had any interest in elementary number theory; considering that he was 58 years old in 1927, I find this quite whimsical and refreshing…

Here is an example: for base 10, take n=148; the cyclic permutations yield the additional integers n_1=481 and n_2=814; and — lo and behold — we have indeed
814-481=333=481-148,
exhibiting the desired arithmetic progression. (One also allows a leading digit equal to 0, which can be permuted with the others, so that, for instance, n=037, with companions 370 and 703, is also a solution.)
More impressively, consider n=142857, whose cyclic permutations give, in increasing order, the progression
142857, 285714, 428571, 571428, 714285, 857142
with common difference equal to 142857 (which is also the smallest of the 6 integers).
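
For readers who enjoy checking such things by machine, here is a quick Python verification of these two examples (the helper function cycle below is mine, and purely illustrative):

def cycle(n, base=10, length=None):
    # digits of n in the given base, most significant first, padded with leading zeros
    length = length or len(str(n))
    digits = []
    for _ in range(length):
        digits.append(n % base)
        n //= base
    digits.reverse()
    # integers obtained by all cyclic permutations of these digits
    return [sum(d * base ** (length - 1 - k) for k, d in enumerate(digits[i:] + digits[:i]))
            for i in range(length)]

for n, length in [(148, 3), (37, 3), (142857, 6)]:
    values = sorted(cycle(n, length=length))
    diffs = {b - a for a, b in zip(values, values[1:])}
    print(n, values, "arithmetic progression:", len(diffs) == 1)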

Cartan finds two distinct sources of such cycles, which he calls “de première [resp. de seconde] catégorie”, and classifies them, for any base a. The original problème d’agrégation asked for cycles of lengths 3 and 6 in base 10 and Cartan finds 3 cycles of length 6 and 6 of length 3. I wonder how many students managed to solve this question…

I won’t write down the solutions here — for the moment at least — so that those readers who are interested can try their own skill…

Who was Peter?

The fundamental fact about the representation theory of a compact topological group G is the Peter-Weyl Theorem, which can be described as follows: the regular representation of G on L^2(G) (defined using the probability Haar measure on G) decomposes as an orthogonal Hilbert direct sum of the spaces of matrix coefficients of the finite-dimensional irreducible representations of G. (Books, for instance the one of Knapp on semisimple Lie groups, often include more in what is called Peter-Weyl Theory, but this is the statement that is proved in the original paper.)
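
In symbols (with notation which is not used above: \widehat{G} denotes the set of isomorphism classes of irreducible unitary representations of G, and \mathcal{M}(\pi)\subset L^2(G) the span of the matrix coefficients g\mapsto \langle \pi(g)v,w\rangle for vectors v, w in the space of \pi), the statement is that
L^2(G)=\widehat{\bigoplus}_{\pi\in\widehat{G}}\mathcal{M}(\pi),
an orthogonal Hilbert direct sum in which each summand \mathcal{M}(\pi) is finite-dimensional of dimension (\dim\pi)^2.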

As I am currently preparing to write down a proof of the Peter-Weyl theorem for my notes on representation theory, I had a look at this paper. Although I probably won’t follow it quite completely, I found it very interesting — it is quite subtly different from all modern treatments I have seen, in an interesting way, and without being much more complicated than what is found, e.g., in Knapp’s book. (For a short masterful online account, this post of Terry Tao is very good; from a search in the same Göttingen archive web site, it seems that — maybe — the modern treatment dates back to Pontryagin, in 1936.)

In any case, the question for today is: Who was Peter? Only the initial “F.” and the affiliation “in Karlsruhe” identify this coauthor on the original paper (even the “F.” is misread as “P.” on the PDF cover page…). It seems he was a student of Weyl, but note that there is no Peter on the math genealogy page for Weyl. (A joke here at ETH is that when the lecture room known as the Hermann Weyl Zimmer is renovated this summer, some unfortunate skeletons will be found in the closets and under the floor, or behind the blackboards…)

The Legendre polynomials are ubiquitous

One can define the j-th Legendre polynomial P_j in many ways, one of the easiest being to use the generating function
\sum_{j\geq 0}{ P_j(x)T^j}=(1-2xT+T^2)^{-1/2}.
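
As a quick sanity check (not needed for anything, but reassuring), one can ask sympy to expand this generating function and compare the coefficients with its built-in Legendre polynomials; the cutoff N below is arbitrary:

import sympy as sp

x, T = sp.symbols('x T')
N = 7  # how many coefficients to compare
series = sp.expand(sp.series((1 - 2*x*T + T**2) ** sp.Rational(-1, 2), T, 0, N).removeO())
for j in range(N):
    # the coefficient of T^j should be the j-th Legendre polynomial
    assert sp.expand(series.coeff(T, j) - sp.legendre(j, x)) == 0
print("coefficients of T^j agree with P_j(x) for j <", N)
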
Like many “classical” special functions (which one might call “the functions in Whittaker & Watson” — I find it charming, incidentally, that the PDF of this edition has exactly 628 pages), these can also be defined using representation theory. This is done by considering the group G=\mathrm{SU}_2(\mathbf{C}) and its (2j+1)-dimensional irreducible representation on the space V_{2j} of homogeneous polynomials in two variables of degree 2j, where the action of G is by linear change of variable:
(g\cdot P)(X,Y)=P((X,Y)g)=P(aX+cY,bX+dY)
for
g=\begin{pmatrix}a&b\\c&d\end{pmatrix}.
Then, up to normalizing factors, P_j “is” the matrix coefficient of V_{2j} for the vector e=X^jY^j\in V_{2j}. Or, to be precise (since a matrix coefficient is a function on G, which is 3-dimensional, while P_j is a function of a single variable), we have
P_j(\cos\theta)=c_j \langle g_{\theta}\cdot e,e\rangle
for the elements
g_{\theta}=\begin{pmatrix}\cos(\theta/2)&i\sin(\theta/2)\\i\sin(\theta/2)&\cos(\theta/2)\end{pmatrix}
(the inner product \langle \cdot,\cdot\rangle used to compute the matrix coefficient is the G-invariant one on V_{2j}; since this is an irreducible representation, it is unique up to a non-zero scalar; the normalizing constant c_j involves this as well as the normalization of Legendre polynomials.) For full details, a good reference is the book of Vilenkin on special functions and representation theory, specifically Chapter 3.
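
Since the constant c_j is left unspecified above, a small sympy sketch (the little function below is mine) is a convenient way to pin it down experimentally: it implements the action and the invariant inner product exactly as just described, and prints the ratio between P_j(\cos\theta) and the matrix coefficient for small j. The observation that this ratio appears to be \binom{2j}{j} is purely empirical and is not claimed in the text.

import sympy as sp

X, Y, th = sp.symbols('X Y theta')

def matrix_coefficient(j):
    # for g_theta, both off-diagonal entries equal i*sin(theta/2), so with e = X^j Y^j
    # and (g.P)(X,Y) = P(aX+cY, bX+dY), the acted vector g_theta . e is the product below
    a, b = sp.cos(th/2), sp.I*sp.sin(th/2)
    acted = sp.expand((a*X + b*Y)**j * (b*X + a*Y)**j)
    # pair with e, using <X^i Y^(2j-i), X^k Y^(2j-k)> = delta(i,k) / binom(2j, i)
    return acted.coeff(X, j).coeff(Y, j) / sp.binomial(2*j, j)

theta0 = 0.7  # an arbitrary test angle
for j in range(1, 6):
    mc = complex(matrix_coefficient(j).subs(th, theta0))
    pj = float(sp.legendre(j, sp.cos(theta0)))
    print(j, round(pj / mc.real, 6), sp.binomial(2*j, j))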

Note also that, since e is, up to a scalar, the only vector in V_{2j} invariant under the action of the subgroup T of diagonal matrices, one can also say that P_j is “the” spherical function for V_{2j} (with respect to the subgroup T).

This seems to be the most natural way of recovering the Legendre polynomials from representation theory. Just a few days ago, while continuing work on the lecture notes for my class on the topic (the class itself is finished, but I got behind in the notes, and I am now trying to catch up…), I stumbled on a different formula which doesn’t seem to be mentioned by Vilenkin. It is still related to V_{2j}, but now seen as a representation of the larger group \mathbf{G}=\mathrm{SL}_2(\mathbf{C}) (the action being given by the same linear change of variable): we have

P_j(1+2t)=d_j \langle u_t\cdot e,u_1\cdot e\rangle

where d_j is some other normalizing constant, and now u_t are unipotent elements given by
u_t=\begin{pmatrix}1&t\\ 0 & 1\end{pmatrix}.

It’s not quite clear to me where this really comes from, though I suspect there is a good explanation. Searching around the web and Mathscinet did not lead, in any obvious way, to earlier sightings of this formula, but it is easy enough to give a thoroughly unenlightening proof: just use the fact that
u_t\cdot e=X^j(tX+Y)^j,
expand into binomial coefficients, use the formula
\langle X^iY^{2j-i}, X^kY^{2j-k}\rangle=\binom{2j}{i}^{-1}\delta(i,k),
for the invariant inner-product, and obtain a somewhat unwieldy polynomial which can be recognized as a multiple of the hypergeometric polynomial
{}_2F_1(-j,1+j;1;-t),
which is known to be equal to P_j(1+2t). (Obviously, the chances of a computational mistake are non-zero; I certainly made some while trying to figure this out, and stopped computing only when I got this nice interesting result…)
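
Here is the computation just described, done by sympy (the helper pairing is mine): expand u_t\cdot e = X^j(tX+Y)^j in the monomials X^{j+k}Y^{j-k}, pair with u_1\cdot e using the inner product above, and compare with P_j(1+2t). The constant d_j is again only read off empirically from the output.

import sympy as sp

t = sp.symbols('t')

def pairing(j):
    # <u_t.e, u_1.e> with u_t.e = sum_k binom(j,k) t^k X^(j+k) Y^(j-k)
    # and <X^(j+k) Y^(j-k), X^(j+k) Y^(j-k)> = 1 / binom(2j, j+k)
    return sum(sp.binomial(j, k)**2 * t**k / sp.binomial(2*j, j + k) for k in range(j + 1))

for j in range(1, 6):
    ratio = sp.simplify(sp.expand(sp.legendre(j, 1 + 2*t)) / pairing(j))
    print(j, ratio, sp.binomial(2*j, j))  # the ratio comes out as a constant d_j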

Subtile finesse de la notation…

For a long time (since I first heard of it around 1990), I thought that the terminology “Property (T)” was a completely arbitrary name, and no better than the thousand notions of “admissible thingummy” or “good khraboute” which sprinkle too many mathematics papers. Then I learnt a few years ago that the “T” was supposed to refer to the trivial representation, since — in a suitable language — the property is about the trivial representation of a group G.

This was already better. But much more recently, I learnt from S. Mozes that the typography “(T)” itself was not some random choice, but was meant to express the fact that the trivial representation T is supposed to be alone in some open set, incarnated as an open interval (so one should read () as in (a,b))…

A direct corollary is that the right translation in French is, of course, Property ]T[.

By the way, I personally much prefer the ]a,b[, [a,b], ]a,b], etc, notation for intervals, but I’m told that many find this ugly beyond belief and much prefer the (a,b), (a,b], etc, style…

Multiplicativity, where art thou?

The inequality I was pondering during my recent vacations is another result of Burnside: for a finite cyclic group G=\mathbf{Z}/m\mathbf{Z}, denoting by S the set of generators of G, we have
\sum_{x\in S}|\chi_{\rho}(x)|^2\geq |S|=\varphi(m)
for any finite-dimensional representation \rho of G with character \chi_{\rho}, provided the latter does not vanish identically on S (this may happen, e.g., for the regular representation of G). This was used by Burnside (and still appears in many books) in order to prove that if an irreducible representation \rho of a finite group has degree at least 2, its character must be zero on some elements of the group. (The cyclic groups G used are those generated by non-trivial elements of the group, and the representations are the restrictions of \rho to those cyclic groups; thus it is an interesting application of reducible representations of abelian groups to irreducible representations of non-abelian ones… the lion and the mouse come to mind.)

This inequality cannot be considered hard: it is dispatched in two lines, invoking Galois theory to argue that
\prod_{x\in S}{|\chi_{\rho}(x)|^2}
is a (non-negative) integer, and the arithmetic-geometric inequality to deduce that
\frac{1}{|S|}\sum_{x\in S}{|\chi_{\rho}(x)|^2}\geq \Bigl(\prod_{x\in S}{|\chi_{\rho}(x)|^2}\Bigr)^{1/|S|}\geq 1
unless this product is zero.
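
As a concrete illustration (this is just a numerical experiment, not part of the argument, with arbitrary choices of m and of the multiplicities), one can pick a random multiplicity vector on \mathbf{Z}/m\mathbf{Z} and watch both steps happen: the product of the |\chi_\rho(x)|^2 over x\in S comes out as a non-negative integer up to floating-point error, and the arithmetic-geometric mean inequality then gives the lower bound 1 for the average when the product is non-zero.

import cmath, math, random

m = 12
S = [x for x in range(m) if math.gcd(x, m) == 1]
n = [random.randint(0, 3) for _ in range(m)]   # multiplicities n(a)

def chi(x):
    # character value at x of the representation with multiplicities n(a)
    return sum(n[a] * cmath.exp(2j * cmath.pi * a * x / m) for a in range(m))

squares = [abs(chi(x)) ** 2 for x in S]
product = math.prod(squares)
print("product over S:", product, "(close to a non-negative integer)")
if product > 1e-9:
    print("AM >= GM >= 1:", sum(squares) / len(S), ">=", product ** (1 / len(S)))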

However, if we view this as a property of representations of a finite cyclic group, one (or, at least, I) can’t help wondering whether there is a “direct” proof, which only involves the formal properties of these representations, and does not refer to Galois theory. This is what I was trying to understand while basking in the Italian spring and gardens. As it turns out, there is in fact quite a beautiful argument, which is undoubtedly too complicated to seriously replace the Galois-theoretic one, but is directly related to very interesting facts which I didn’t know before — I therefore consider the time spent thinking about this inequalitet to have been well spent. (Philological query: what is the diminutive of “inequality”? or of “formula”?)

The idea is that a representation \rho of G is a direct sum
\rho=\bigoplus_{a}{n(a)\chi_a},
where a runs over \mathbf{Z}/m\mathbf{Z}, and \chi_a is the character
x\mapsto e(ax/m),
for some integral multiplicities n(a). Hence the left-hand side of Burnside’s inequality, namely
\sum_{x\in S}{|\chi_{\rho}(x)|^2}
can be seen as a quadratic form in m integral variables, and the question we ask is: what is the minimal non-zero value it can take? (Note that the example of the regular representation shows that this quadratic form, though obviously non-negative, is not positive definite.)
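
Before going on, one can experiment by brute force: the following snippet (with an arbitrary small search box for the coefficients, entries in \{-1,0,1\}) computes all values of this quadratic form for small m, and the minimal non-zero value that comes out is indeed \varphi(m), which is the answer given below.

import cmath, itertools, math

def Q(m, n):
    # Q_m(n) = sum over x coprime to m of |sum_a n(a) e(ax/m)|^2
    S = [x for x in range(m) if math.gcd(x, m) == 1]
    total = sum(abs(sum(n[a] * cmath.exp(2j * cmath.pi * a * x / m) for a in range(m))) ** 2
                for x in S)
    return round(total)  # the values are integers

def phi(m):
    return sum(1 for x in range(m) if math.gcd(x, m) == 1)

for m in range(2, 8):
    values = {Q(m, n) for n in itertools.product(range(-1, 2), repeat=m)}
    print(m, phi(m), min(v for v in values if v > 0))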

Now the funny part of the story is that this quadratic form, say Q_m, is naturally a tensor product
Q_m=\bigotimes_{p^{k_p}\mid\mid m}{Q_{p^{k_p}}}
of the corresponding quadratic forms arising from the primary factors of m. Hence we have a problem about a multiplicatively defined quadratic form, and the answer (given by Burnside) is the Euler function \varphi(m), which is also multiplicative; should it not then feel natural to treat the case of m=p^k for some prime p, and claim that the general case follows by multiplicativity?
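
This decomposition can be checked concretely on the smallest interesting case m=6. Unpacking it slightly (this unpacking is mine, not in the text): the Gram matrix of Q_m has entries given by the Ramanujan sums c_m(a-b), and the CRT bijection between \mathbf{Z}/6\mathbf{Z} and \mathbf{Z}/2\mathbf{Z}\times\mathbf{Z}/3\mathbf{Z} should identify the Gram matrix of Q_6 with the Kronecker product of those of Q_2 and Q_3:

import cmath, math

def gram(m):
    # Gram matrix of Q_m: entry (a, b) is the Ramanujan sum c_m(a - b)
    S = [x for x in range(m) if math.gcd(x, m) == 1]
    def c(k):
        return round(sum(cmath.exp(2j * cmath.pi * k * x / m) for x in S).real)
    return [[c(a - b) for b in range(m)] for a in range(m)]

A2, A3, A6 = gram(2), gram(3), gram(6)
crt = {(a % 2, a % 3): a for a in range(6)}   # CRT bijection Z/2 x Z/3 -> Z/6
ok = all(A6[crt[a1, a2]][crt[b1, b2]] == A2[a1][b1] * A3[a2][b2]
         for a1 in range(2) for a2 in range(3) for b1 in range(2) for b2 in range(3))
print("Gram matrix of Q_6 is the Kronecker product of those of Q_2 and Q_3:", ok)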

Alas, a second’s thought shows that it is by no means clear that the minimum of a tensor product of (non-negative, integral) quadratic forms should be the product of the minima of the factors! Denoting by s(Q) the minimum of Q (among non-zero values), we certainly have
s(Q_1\otimes Q_2)\leq s(Q_1)s(Q_2),
but the converse inequality should immediately feel doubtful when the forms are not diagonalized. And indeed, in general, this is not true! I found examples in the book of Milnor and Husemoller on bilinear forms (which I had not looked at for a long time, and I am very happy to have been motivated to look at it again!); they are attributed to Steinberg, through a result of Conway and Thompson, itself based on the famous Siegel formula for representation numbers of quadratic forms…

However, despite this fact about general forms, it turns out that the quadratic forms Q_{p^k} have the property that
s(Q_{p^k}\otimes Q')=s(Q_{p^k})s(Q')
for all other quadratic forms Q' (always non-negative integral). This I found explained by Kitaoka. Here the nice thing is that it depends on writing Q_p as (essentially) a variance
Q_p(n)=p\sum_{a}{n(a)^2}-\Bigl(\sum_{b}{n(b)}\Bigr)^2
of the coefficients n(a), and using the alternate formula
Q_p(n)=\frac{1}{2}\sum_{a,b}{(n(a)-n(b))^2}\quad\quad\quad\quad\quad\quad (\star)
which corresponds, in probability, to the formula
\mathbf{V}(X)=\frac{1}{2}\mathbf{E}((X-Y)^2)
for the variance of a random variable X, using an independent “copy” Y of X (i.e., Y has the same probability distribution as X, but is independent of it). This probabilistic formula, I have to admit, I didn’t know — or had forgotten — though it is very useful here! Indeed, the reader may try to prove directly that Q_p(n)\geq p-1 unless it is 0, using any of these formulas for Q_p(n); I succeeded with the other formula
 Q_p(n)=p\sum_{a}{\Bigl(n(a)-\frac{1}{p}\sum_{b}{n(b)}\Bigr)^2}
but the argument was much uglier than the one with (\star) above! (It is also clearer from the latter that Q_p takes integral values for integral arguments, despite the factor 1/p which crops up here and there…)
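
To close the circle, here is a short sympy check (for a few small primes, and with an arbitrary small search box for the integer vectors) that the three expressions written above for Q_p agree as polynomials in the n(a), together with a brute-force confirmation that the minimal non-zero value is p-1:

import itertools
import sympy as sp

for p in [2, 3, 5]:
    n = sp.symbols(f'n0:{p}')
    total = sum(n)
    Qp = p * sum(v**2 for v in n) - total**2                        # the "variance" formula
    star = sp.Rational(1, 2) * sum((n[a] - n[b])**2 for a in range(p) for b in range(p))
    centered = p * sum((v - total / p)**2 for v in n)
    assert sp.expand(Qp - star) == 0 and sp.expand(Qp - centered) == 0
    values = {Qp.subs(dict(zip(n, vec)))
              for vec in itertools.product(range(-1, 2), repeat=p)}
    print(p, min(v for v in values if v > 0))                       # prints p - 1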

For more details, I’ve written this up in a short note that I just added to the relevant page.