What’s special with commutators in the Weyl group of C5?

I have just added to my notes on representation theory the very cute formula of Frobenius that gives, in terms of irreducible characters, the number N(g) of representations of a given element g as a commutator g=[x,y]=xyx^{-1}y^{-1} in a finite group G:
N(g)=|G|\sum_{\chi}\frac{\chi(g)}{\chi(1)},
where \chi runs over the irreducible (complex) characters of G (this is Proposition 4.4.3 on page 118 of the latest version of the notes).
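As a small sanity check of the formula, here is a sketch in Python (not the Magma code mentioned below) that compares a brute-force count with the character sum for the symmetric group S_3. The choice of S_3, the hard-coded description of its three irreducible characters (trivial, sign, and the 2-dimensional one with value "number of fixed points minus 1"), and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

def compose(a, b):
    """Return the permutation 'a after b' (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(a):
    out = [0] * len(a)
    for i, v in enumerate(a):
        out[v] = i
    return tuple(out)

def sign(a):
    inversions = sum(1 for i in range(len(a))
                     for j in range(i + 1, len(a)) if a[i] > a[j])
    return -1 if inversions % 2 else 1

def fixed_points(a):
    return sum(1 for i, v in enumerate(a) if i == v)

# The six elements of S_3, as permutations of (0, 1, 2).
G = list(permutations(range(3)))

def brute_force_count(g):
    """Number of pairs (x, y) with x y x^{-1} y^{-1} = g."""
    return sum(1 for x in G for y in G
               if compose(compose(x, y), compose(inverse(x), inverse(y))) == g)

def frobenius_count(g):
    """|G| * sum of chi(g)/chi(1) over the irreducible characters of S_3."""
    # (value at g, degree) for the trivial, sign and standard characters.
    chars = [(1, 1), (sign(g), 1), (fixed_points(g) - 1, 2)]
    return len(G) * sum(Fraction(value, degree) for value, degree in chars)

for g in G:
    print(g, brute_force_count(g), frobenius_count(g))
```

In this toy case the two counts agree on every element: 18 for the identity, 9 for each 3-cycle, and 0 for each transposition (which is consistent with transpositions lying outside the derived subgroup A_3).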

I wanted to mention some applications, and had a vague memory that this was used to show that most or all elements in various simple groups are actual commutators. By searching around a bit, I found out easily that, indeed, there was a conjecture of Ore from 1951 to the effect that the set of commutators is equal to G for any non-abelian finite simple group G, and that (after various earlier works) this has recently been proved by Liebeck, O’Brien, Shalev and Tiep.

I mentioned this of course, but then I also wanted to give some example of non-commutator, and decided to look for this using Magma (the fact that I am recovering from a dental operation played a role in inciting me to find something distracting to do). Here’s what I found out.

First, a natural place to look for interesting examples is the class of perfect groups which are, of course, not simple. This is also easy enough to implement since Magma has a database of perfect groups of “small” order. Either by brute force enumeration of all commutators or by implementing the Frobenius formula, I got the first case of a perfect group G, of order 960, which contains only 840 distinct commutators.

Then I wanted to know “what” this group really was. Magma gave it to me as a permutation group acting on 16 letters, with an explicit set of 6 generators, and with a list of 21 relations, which was not very enlightening. However, looking at a composition series, it emerged that G fits in an exact sequence
1\rightarrow (\mathbf{Z}/2\mathbf{Z})^4\rightarrow G\rightarrow A_5\rightarrow 1.
This was much better, since after a while it reminded me of one of my favorite types of groups: the Weyl groups W_{g} of the symplectic groups \mathrm{Sp}_{2g} (equivalently, the “generic” Galois group for the splitting field of a palindromic rational polynomial of degree 2g), which fit in a relatively similar exact sequence
1\rightarrow (\mathbf{Z}/2\mathbf{Z})^g\rightarrow W_g\rightarrow S_g\rightarrow 1.
From there, one gets a strong suspicion that G must be the commutator subgroup of W_5, and this was easy to check (again with Magma, though this is certainly well-known; the drop of the rank of the kernel comes from looking at the determinant in the signed-permutation 5-dimensional representation, and the drop from S_5 to A_5 is of course from the signature.)

This identification is quite nice, obviously. In particular, it’s now possible to identify concretely which elements of G are not commutators. It turns out that a single conjugacy class, of size 120, is the full set of missing elements. As a signed permutation matrix, it is the conjugacy class of
g=\begin{pmatrix} 0& -1 & 0 & 0 & 0\\ 1& 0  & 0 & 0 & 0\\ 0& 0  & 0 & 1 & 0\\ 0& 0 & 1 & 0 & 0\\ 0& 0  & 0 & 0 & -1\end{pmatrix},
and the reason it is not a commutator is that Magma tells us that all commutators in G have trace in \{-3,-2,0,1,2,5\} (always in the signed-permutation representation). Thus the trace -1 doesn’t fit…
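For readers without Magma, the computation can be redone along the following lines in Python. This is only a sketch under stated assumptions: G is modelled as the signed permutations of 5 letters whose underlying permutation is even and which have an even number of minus signs (the description of the derived subgroup of W_5 given above), an element is stored as the tuple of signed images of the basis vectors, and all function names are made up for the occasion. To save time it uses the fact that a conjugate of [x,y] is again a commutator, so that it is enough to let x run over representatives of the conjugacy classes.

```python
from itertools import permutations, product

N = 5

def mul(a, b):
    """Product 'a after b'; a[i] = +-k means that e_{i+1} is sent to +-e_k."""
    return tuple((1 if v > 0 else -1) * a[abs(v) - 1] for v in b)

def inv(a):
    out = [0] * N
    for i, v in enumerate(a):
        out[abs(v) - 1] = (i + 1) if v > 0 else -(i + 1)
    return tuple(out)

def is_even_perm(p):
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if p[i] > p[j])
    return inversions % 2 == 0

def trace(a):
    """Trace in the signed-permutation representation."""
    return sum((1 if v > 0 else -1) for i, v in enumerate(a) if abs(v) == i + 1)

def derived_subgroup():
    """Assumed model of G: even permutations with an even number of minus signs."""
    elems = []
    for p in permutations(range(1, N + 1)):
        if not is_even_perm(p):
            continue
        for signs in product((1, -1), repeat=N):
            if signs.count(-1) % 2 == 0:
                elems.append(tuple(s * v for s, v in zip(signs, p)))
    return elems

def commutator_set(G):
    inverse = {g: inv(g) for g in G}
    class_of, classes = {}, []
    for g in G:                      # split G into conjugacy classes
        if g in class_of:
            continue
        cls = {mul(mul(h, g), inverse[h]) for h in G}
        for c in cls:
            class_of[c] = len(classes)
        classes.append(cls)
    hit = set()                      # indices of classes containing a commutator
    for cls in classes:
        r = next(iter(cls))          # any representative of the class will do
        for y in G:
            hit.add(class_of[mul(mul(r, y), mul(inverse[r], inverse[y]))])
    result = set()
    for idx in hit:
        result |= classes[idx]
    return result, classes

G = derived_subgroup()
commutators, classes = commutator_set(G)
g = (2, -1, 4, 3, -5)                # the matrix g displayed above
print(len(G), len(commutators), g in commutators)
print(sorted({trace(c) for c in commutators}))
```

If the identification of G is right, this should reproduce the numbers quoted in the text: 960 elements, 840 commutators, and no commutator of trace -1.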

At least, this is the numerical reason. I feel I should be able to give a theoretical explanation of this, but I haven’t succeeded for the moment. Part of the puzzlement is that this behavior seems to be special to W_5, the Weyl group of the root system C_5. Indeed, for g\in\{2,3,4\}, the corresponding derived subgroup is not perfect, so the question does not arise (at least in the same way). And when g\geq 6, the derived subgroup G_g of W_g is indeed perfect, but — experimentally! — it seems that all elements of G_g are then commutators.

I haven’t found references to a study of this Ore-type question for those groups, so I don’t know if these “experimental” facts are in fact known to be true. Another question seems natural: does this special fact have any observable consequence, for instance in Galois theory? I don’t see how, but readers might have better insights…

(P.S. I presume that GAP or Sage would be equally capable of making the computations described here; I used Magma mostly because I know its language better.

P.P.S And the computer also tells us that even for the group G above, all elements are the product of at most two commutators, which a commenter points out is also a simple consequence of the fact that there are more than 480 commutators….

P.P.P.S To expand one of my own comments: the element g above is a commutator in the group W_5 itself. For instance g=[x,y] with
x=\begin{pmatrix} 0& 0 & 0 & 0 & -1\\ 0& 1  & 0 & 0 & 0\\ 1& 0  & 0 & 0 & 0\\ 0& 0 & 1 & 0 & 0\\ 0& 0  & 0 & 1 & 0\end{pmatrix},
and
y=\begin{pmatrix} 1& 0 & 0 & 0 & 0\\ 0& 0  & 0 & 0 & -1\\ 0& 1  & 0 & 0 & 0\\ 0& 0 & 1 & 0 & 0\\ 0& 0  & 0 & -1 & 0\end{pmatrix},
where y\notin G.)
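The identity g=[x,y] is easy to re-check by machine. Here is a minimal Python sketch doing so with plain lists; it uses the fact that the inverse of a signed permutation matrix is its transpose, and it tests membership in G through the two parity conditions (even underlying permutation, even number of entries equal to -1), which is an assumed description of the derived subgroup. The helper names are hypothetical.

```python
X = [[0, 0, 0, 0, -1],
     [0, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 0]]

Y = [[1, 0, 0, 0, 0],
     [0, 0, 0, 0, -1],
     [0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, -1, 0]]

G_ELEM = [[0, -1, 0, 0, 0],
          [1, 0, 0, 0, 0],
          [0, 0, 0, 1, 0],
          [0, 0, 1, 0, 0],
          [0, 0, 0, 0, -1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def commutator(x, y):
    """x y x^{-1} y^{-1}, with inverses computed as transposes."""
    return matmul(matmul(x, y), matmul(transpose(x), transpose(y)))

def in_derived_subgroup(m):
    """Even underlying permutation and an even number of -1 entries."""
    n = len(m)
    # perm[j] is the row containing the non-zero entry of column j.
    perm = [next(i for i in range(n) if m[i][j] != 0) for j in range(n)]
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    negatives = sum(row.count(-1) for row in m)
    return inversions % 2 == 0 and negatives % 2 == 0

print(commutator(X, Y) == G_ELEM)
print(in_derived_subgroup(G_ELEM), in_derived_subgroup(X), in_derived_subgroup(Y))
```

With these matrices the commutator does come out equal to g, and g passes the parity test while y fails it (its underlying permutation is a 4-cycle); in this sketch x fails it as well, for the same reason.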

Cartan’s “Sur certains cycles arithmétiques”

Searching on Numdam for the papers of É. Cartan, I noticed one from 1927 entitled “Sur certains cycles arithmétiques”. Although this was not the one I was looking for, natural curiosity immediately got the better of me, and I downloaded the article, wondering what marvels could be there: an anticipation of Heegner cycles? special subvarieties of Shimura varieties?

As usual, the truth was even more surprising than such exalted expectations. Indeed, Cartan, considering “le problème de Mathématiques élémentaires proposé au dernier concours d’agrégation”, raises and solves the following question:

Classify, for any base a\geq 2, the integers n\geq 1 such that, when n is written in base a, the integers obtained by all cyclic permutations of the digits are in arithmetic progression.

This is rather surprising, since I had no idea that É. Cartan had any interest in elementary number theory; considering that he was 58 years old in 1927, I find this quite whimsical and refreshing…

Here is an example: for base 10, take n=148; the cyclic permutations yield the additional integers n_1=481 and n_2=814; and — lo and behold — we have indeed
814-481=333=481-148,
exhibiting the desired arithmetic progression. (One also allows a leading digit equal to 0, which can be permuted with the others, so that, for instance, n=037, with companions 370 and 703, is also a solution.)
More impressively, consider n=142857, which gives, in order, the progression
142857, 285714, 428571, 571428, 714285, 857142
with common difference equal to 142857 (which is also the smallest of the 6 integers.)
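These examples are easy to test by machine. The following Python sketch (function names are mine, not Cartan’s) takes the digits of n in base a, forms all cyclic permutations, sorts the resulting integers and checks that consecutive differences are all equal. One assumption to flag: as written, the check also accepts degenerate cases such as a single digit or a constant string of digits, and the problem may well exclude those.

```python
def rotations(digits):
    """All cyclic permutations of a list of digits."""
    return [digits[i:] + digits[:i] for i in range(len(digits))]

def value(digits, base):
    """Integer represented by the digits in the given base (leading zeros allowed)."""
    n = 0
    for d in digits:
        n = n * base + d
    return n

def is_arithmetic_cycle(digits, base=10):
    """True if the rotations, once sorted, form an arithmetic progression."""
    values = sorted(value(r, base) for r in rotations(digits))
    differences = {b - a for a, b in zip(values, values[1:])}
    return len(differences) <= 1

for example in ([1, 4, 8], [0, 3, 7], [1, 4, 2, 8, 5, 7], [1, 2, 3]):
    print(example, is_arithmetic_cycle(example))
```

The first three inputs are the examples of the text; the last one, 123, is a non-example, since 123, 231, 312 have differences 108 and 81.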

Cartan finds two distinct sources of such cycles, which he calls “de première [resp. de seconde] catégorie”, and classifies them, for any base a. The original problème d’agrégation asked for cycles of lengths 3 and 6 in base 10 and Cartan finds 3 cycles of length 6 and 6 of length 3. I wonder how many students managed to solve this question…

I won’t write down the solutions here (for the moment at least), so that those readers who are interested can try their own skill…

The Legendre polynomials are ubiquitous

One can define the j-th Legendre polynomial P_j in many ways, one of the easiest being to use the generating function
\sum_{j\geq 0}{ P_j(x)T^j}=(1-2xT+T^2)^{-1/2}.
Like many “classical” special functions (which one might call “the functions in Whittaker & Watson” — I find it charming, incidentally, that the PDF of this edition has exactly 628 pages), these can also be defined using representation theory. This is done by considering the group G=\mathrm{SU}_2(\mathbf{C}) and its (2j+1)-dimensional irreducible representation on the space V_{2j} of homogeneous polynomials in two variables of degree 2j, where the action of G is by linear change of variable:
(g\cdot P)(X,Y)=P((X,Y)g)=P(aX+cY,bX+dY)
for
g=\begin{pmatrix}a&b\\c&d\end{pmatrix}.
Then, up to normalizing factors, P_j “is” the matrix coefficient of V_{2j} for the vector e=X^jY^j\in V_{2j}. Or, to be precise (since a matrix coefficient is a function on G, which is 3-dimensional, while P_j is a function of a single variable), we have
P_j(\cos\theta)=c_j \langle g_{\theta}\cdot e,e\rangle
for the elements
g_{\theta}=\begin{pmatrix}\cos(\theta/2)&i\sin(\theta/2)\\i\sin(\theta/2)&\cos(\theta/2)\end{pmatrix}
(the inner product \langle \cdot,\cdot\rangle used to compute the matrix coefficient is the G-invariant one on V_{2j}; since this is an irreducible representation, it is unique up to a non-zero scalar; the normalizing constant c_j involves this as well as the normalization of Legendre polynomials). For full details, a good reference is the book of Vilenkin on special functions and representation theory, specifically Chapter 3.

Note also that, since e is, up to a scalar, the only vector in V_{2j} invariant under the action of the subgroup T of diagonal matrices, one can also say that P_j is “the” spherical function for V_{2j} (with respect to the subgroup T).

This seems to be the most natural way of recovering the Legendre polynomials from representation theory. Just a few days ago, while continuing work on the lecture notes for my class on the topic (the class itself is finished, but I got behind in the notes, and I am now trying to catch up…), I stumbled on a different formula which doesn’t seem to be mentioned by Vilenkin. It is still related to V_{2j}, but now seen as a representation of the larger group \mathbf{G}=\mathrm{SL}_2(\mathbf{C}) (the action being given with the same linear change of variable): we have

P_j(1+2t)=d_j \langle u_t\cdot e,u_1\cdot e\rangle

where d_j is some other normalizing constant, and now u_t are unipotent elements given by
u_t=\begin{pmatrix}1&t\\ 0 & 1\end{pmatrix}.

It’s not quite clear to me where this really comes from, though I suspect there is a good explanation. Searching around the web and Mathscinet did not lead, in any obvious way, to earlier sightings of this formula, but it is easy enough to get a thoroughly unenlightening proof: just use the fact that
u_t\cdot e=X^j(tX+Y)^j,
expand into binomial coefficients, use the formula
\langle X^iY^{2j-i}, X^kY^{2j-k}\rangle=\binom{2j}{i}^{-1}\delta(i,k),
for the invariant inner-product, and obtain a somewhat unwieldy polynomial which can be recognized as a multiple of the hypergeometric polynomial
{}_2F_1(-j,1+j;1;-t),
which is known to be equal to P_j(1+2t). (Obviously, chances of a computational mistake are non-zero; I certainly made some while trying to figure this out, and stopped computing only when I got this nice interesting result…)
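To guard against such mistakes, the identity can be tested with exact rational arithmetic. In the Python sketch below, the matrix coefficient is computed from the binomial expansion of u_t\cdot e and the inner-product formula above, and P_j is computed independently from the standard three-term (Bonnet) recurrence. The value d_j=\binom{2j}{j} used here is what my own expansion suggests for this normalization of the inner product; it is an assumption that the code checks, not something taken from a reference, and the function names are invented.

```python
from fractions import Fraction
from math import comb

def legendre(j, x):
    """P_j(x) via the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = Fraction(1), Fraction(x)
    if j == 0:
        return p_prev
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def unipotent_coefficient(j, t):
    """<u_t.e, u_1.e>, using u_t.e = sum_k binom(j,k) t^k X^(j+k) Y^(j-k)
    and <X^i Y^(2j-i), X^k Y^(2j-k)> = delta(i,k) / binom(2j, i)."""
    return sum(Fraction(comb(j, k) ** 2, comb(2 * j, j + k)) * t ** k
               for k in range(j + 1))

def check(j, t):
    """Compare d_j <u_t.e, u_1.e> with P_j(1+2t), assuming d_j = binom(2j, j)."""
    return comb(2 * j, j) * unipotent_coefficient(j, t) == legendre(j, 1 + 2 * t)

sample_points = [Fraction(0), Fraction(1), Fraction(-1, 2), Fraction(3, 7), Fraction(-5, 3)]
print(all(check(j, t) for j in range(9) for t in sample_points))
```

For j=1, for instance, the matrix coefficient is 1/2+t, and multiplying by \binom{2}{1}=2 gives 1+2t=P_1(1+2t).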

Reading Burnside (and thanking Noether)

In 1905, the famous rower W. Burnside (then aged 52) proved one of the results known as Burnside’s Theorem (the other one being, usually, the striking result that finite groups of order divisible by at most two primes are solvable):

Let k be an algebraically closed field, and let
G\subset GL_n(k)
be a subgroup of the invertible matrices of size n over k. Let k[G] be the span of G in the matrix algebra M(n,k) of size n. Then G acts irreducibly on k^n if and only if k[G]=M(n,k).

Here, recall that irreducibility (a notion apparently first introduced by Burnside himself) means that there is no proper non-zero subspace

W\subset k^n,\quad 0\not=W\not=k^n,

such that G leaves W invariant (globally).

This result turns out to play a role in a current research project (with O. Dinai), and since I had never looked properly at the proof(s) before, I’ve been a bit curious about it, and tried recently to understand it. There are very simple proofs known, but the shortest ones seem to be typically not very enlightening when it comes to understanding why the result is true. They’re the kind of arguments you might feel you could find once you knew the result, but why would you think of proving it first? So (it was vacation time!) I had a look at Burnside’s original paper. This can be found here; if you do not have access to the Proceedings of the L.M.S, here is a fairly representative extract of the style:

[Extract from Burnside's paper]

As far as I’m concerned, this is barely recognizable as meaningful mathematics, and almost unreadable. I say almost, because (vacation effect) I took it as an intellectual challenge to try to reformulate Burnside’s argument in more modern terms, and I believe that I succeeded. It was a big help that the paper is only four pages long; it turns out that the one page from which the extract is taken, although I can’t explain it in any reasonable way, contains the last step of Burnside’s argument. From the fact that he needed seventeen lines to prove the “obvious” half of his theorem, there was therefore every chance that whatever is done here in one page should not be too difficult to figure out with some thought.

So here is a sketch of my reading of his proof of the non-trivial direction (that, for an irreducible action, we have k[G]=M(n,k); for full details, see this short note). We denote

V=k^n,\quad E=M(n,k),\quad E^\prime=M(n,k)^\prime=Hom(E,k),\quad\text{the dual of } E,

and then we define

R=\{\phi\in E^\prime\,\mid\, \phi(g)=0\text{ for all } g\in G\}\subset E^\prime,

which is the linear space of all linear relations satisfied by the matrices in the group. By linearity and duality, we see that the goal is to show that, if G is irreducible, then the space R is zero. The steps for this are:

  • R is a subrepresentation of E’ for a natural action of G on E’ given by
    \langle g\cdot \phi,A\rangle=\langle \phi,g^{-1}A\rangle,\quad\quad g\in G,\ \phi\in E^{\prime},\ A\in E
    in duality-bracket notation (this is not the same as the usual tensor product of the tautological representation of G on V and its contragredient);
  • We now attempt to analyze this representation on E’, and the next three steps do this (they are completely independent of the problem at hand); first, the representation on E’ is isomorphic to a sum of n copies of the (irreducible) contragredient of the tautological representation;
  • Hence any irreducible subrepresentation in R is obtained as the image of a G-equivariant embedding
    V^\prime\rightarrow E^\prime;
  • But all such maps
    \alpha\,:\, V^\prime\rightarrow E^\prime
    are of the form
    \langle \alpha(\lambda),A\rangle=\lambda(Av)
    for some fixed vector v in V;
  • And we now come back to the problem of understanding the relations R; if R were non-zero, it would contain the image of a map α of this form, for some non-zero vector v;
  • But if we then specialize the definition of α with A being the identity matrix — which is in G –, we find, from the fact that the image of α is in R, that
    0=\alpha(\lambda)(1)=\lambda(v),
    for all λ, and this is a contradiction with the condition that v be non-zero… Hence we must have R=0.

It is not obvious here where it is necessary to use the fact that k is algebraically closed, but this is hidden in an application of Schur’s Lemma. Interestingly, it seems that Schur published this result also in 1905. Since Burnside also uses it without any comment (or hint of proof), it must have been known (at least to him) before. It is also amusing to note that, in fact, there is no mention whatsoever of a base field in Burnside’s paper.
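To see the statement, and the role of the base field, on a very small example, here is a Python sketch (with exact rational arithmetic; the matrices and function names are my own choices, not taken from Burnside). It generates a finite matrix group from generators and computes the dimension of its linear span inside M(2,\mathbf{Q}). The group generated by R and S is a copy of S_3 in its 2-dimensional representation, which is irreducible even over an algebraic closure. The cyclic group generated by R alone is irreducible over \mathbf{Q} (the characteristic polynomial X^2+X+1 of R has no rational root) but becomes reducible over an algebraically closed field.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def generate_group(generators):
    """Closure of the generators under multiplication (assumes the group is finite)."""
    n = len(generators[0])
    identity = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def span_dimension(matrices):
    """Dimension of the linear span, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for row in m for x in row] for m in matrices]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

R = ((0, -1), (1, -1))   # element of order 3
S = ((0, 1), (1, 0))     # element of order 2

s3 = generate_group([R, S])
c3 = generate_group([R])
print(len(s3), span_dimension(s3))   # span is all of M(2, Q)
print(len(c3), span_dimension(c3))   # span is only 2-dimensional
```

The first group has 6 elements spanning the full 4-dimensional matrix algebra, as the theorem predicts; the second has 3 elements spanning a 2-dimensional subalgebra, which shows that irreducibility over a field that is not algebraically closed is not enough.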

I like this proof, in part because it would make sense to try to proceed in this way, even if the result turned out to be different (say, a characterization of the relation module R instead of a proof that it is zero). Also, I may be influenced by the similarity with the study of relations between roots of polynomials that can also be done using elementary representation theory of the Galois group, as discussed in this old blog post.

But as I said, there is a part of Burnside’s paper I really don’t understand, even if I suspect it is equivalent or very similar to what I did. And I am forever thankful that Emmy Noether came along some years later to put algebra on a more reasonable track than endless talk of “successive sets of symbols with the same second suffix” (which sounds almost like one of those alliterative exercises used to detect drunkenness…)

The Kochen-Specker argument, and the spectral theory script

Somewhat later than I had hoped, I have updated the script of my spectral theory course. The version currently found online is complete as far as the material I intended to put in is concerned, but there are a few places where I haven’t written down all details (in particular for the proof of the Weyl law for the Dirichlet Laplace operator in an open subset of Euclidean space). I am also aware of quite a few small problems in the last chapter on Quantum Mechanics, due partly to notation problems (for the Fourier transforms, and for “physical” versus mathematical normalizations). I will need to re-read the whole text carefully to correct this; on the other hand, thanks to lists of corrections that I have already received from a few students, the number of typos is much smaller than before… I will however continue updating the PDF file as I continue checking parts of the text.

What delayed this version for a long time was the write-up of the last section on “The interpretation of Quantum Mechanics”; of course it’s in some sense an extraneous part of the script, since spectral theory barely enters in it, but I found it important to at least try to connect the mathematical framework with the actual physics. (This partly explains all the reading I’ve done recently about these issues). It is equally obvious that I am not the most knowledgeable person for such a discussion, but after all, there are good authorities that claim that no one really understands this question anyway…

What I end up discussing contains however one little mathematical result, which is cute and interesting independently of its use in Quantum Mechanics; it is a theorem of S. Kochen and E.P. Specker which states the following:

There does not exist any map
f\,:\, \mathbf{S}^2\rightarrow \{0,1\}
where \mathbf{S}^2 is the unit sphere in \mathbf{R}^3, with the property that, whenever
x,y,z
are pairwise orthogonal unit vectors, we have
f(x)+f(y)+f(z)=2
or in other words, two of the three values are equal to 1, and the other is equal to 0.

How this result enters into discussions of the interpretation of Quantum Mechanics is described by M. Jammer in his book on the subject (not the same as his book on the development of Quantum Mechanics, but another one, equally evanescent as far as the internet is concerned); more recently, J. Conway and S. Kochen have combined it with the Einstein-Podolsky-Rosen argument (or paradox) to derive what they call the “Free Will Theorem”, which is an even stronger version of the unpredictability of properties of Spin 1 particles (those to which the Kochen-Specker argument applies). Conway has given lectures in Princeton on this result and its history and consequences, which are available as videos online.

Coming back to the result above, considered purely from the mathematical point of view, it is interesting to notice that both the original proof and the version used by Conway-Kochen (which is due to A. Peres) show that the hypothetical map does not exist even for some finite sets of points on the sphere. It is of some interest to get a smallest possible set of such points. The proof I gave in the script, however, which is taken from Jammer’s book (who attributes it to R. Friedberg) is maybe theoretically slightly more complicated, but it is also somewhat more conceptual in that one doesn’t have to be puzzled so much at the reason why one finite set of vectors or another is really fundamental.