Peeling onions with Peter and Weyl

I had promised a while ago to say more about the original proof of Peter-Weyl of the “completeness theorem” for compact groups: matrix coefficients of finite-dimensional irreducible representations span a dense subspace of $L^2(G)$, for a compact group $G$. With some delay, here we go…

Before giving just a brief overview of the strategy in the paper of Peter and Weyl, here are some mostly historical or psychological remarks that came to mind while looking at the paper and related sources:

• If one had an infinite amount of time to teach everything about representation theory, I guess most mathematicians would treat the Peter-Weyl theory fairly soon after finite groups, and before the representation theory of semisimple Lie groups; interestingly, the history was reversed: Weyl first developed the basic representation theory of the charmingly named “kontinuierlicher halb-einfacher Gruppen” in three papers in Mathematische Zeitschrift in 1925, before turning to the proof of the completeness theorem;
• Similarly, since expressing special functions using representations can be done with very concrete examples, and this “works” with the most elementary compact Lie groups, one might expect that this would have been done before the general theory; but history also proceeded in the opposite order: É. Cartan, who according to Vilenkin was the first to make the connection explicitly, writes very clearly in the very first lines of his paper “Sur la détermination d’un système orthogonal complet dans un espace de Riemann symétrique clos” (1928) that “Le présent mémoire a été inspiré par la lecture du beau mémoire où H. WEYL montre que les différentes représentations linéaires irréductibles d’un groupe continu clos fournissent un système orthogonal complet dans l’espace du groupe” [“the present memoir was inspired by reading the beautiful memoir where H. WEYL shows that the various irreducible linear representations of a continuous compact group give a complete orthogonal system in the space of the group”]; the paper referenced is the one of Peter and Weyl (note that poor Peter is here also rather casually ignored, though the footnote gives the full reference);
• Peter and Weyl amusingly define a representation of a group $G$ as a map $E$ from $G$ to a space of matrices such that $E(st)=E(s)E(t)$ for all $s$ and $t$ in $G$; they do not ask that it be a homomorphism into invertible matrices! However, as they observe, the matrix $E(1)$ is a projector, and the $E(s)$ preserve its image, so by restriction to the latter, one obtains a “standard” representation.
• The paper is written in an interesting “intermediate style” of analysis. There are inequalities, estimates, limiting arguments, but no functional analysis. Although functions are integrated freely, there is no mention of measurability or integrability conditions (all functions are freely evaluated at the origin, also…). No function space is identified, no linear operator mentioned explicitly. Although some inequalities would today be immediately interpreted as the standard inequality $\|T(x)\|\leq \|T\|\|x\|$ for a continuous linear map $T$ acting on some normed vector space, there is no trace of such things.
• The paper feels however quite modern in its formalism: a function on the group $G$ is thought of as being a “Gruppenzahl”, and denoted with a lower-case letter like $x$ or $z$; these are multiplied by convolution without a specific notation, etc. (I wonder if there could be here already an influence of quantum mechanics and the “q”-numbers that were infinite matrices?)

So what is the Peter-Weyl argument? I’ll first say how it differs from the modern treatments (at least, those I have seen): either for philosophical reasons (which is conceivable, in view of Weyl’s fairly constructivist ideas) or because abstract Hilbert space theory was not within their frame of thought, they do not use the type of argument that comes the most naturally to mind today: to show that the finite-dimensional matrix coefficients span a dense subspace, one shows that the orthogonal complement of this subspace is zero. This reduces, for a given non-zero function $\varphi$ on $G$, to finding a single finite-dimensional unitary representation $\rho$ for which $\varphi$ is not orthogonal to the corresponding space of matrix coefficients. Instead, Peter and Weyl more or less present an algorithmic way to, in principle, decompose $\varphi$ into a combination of matrix coefficients of finite-dimensional representations.

In both cases, however, the basic mechanism to produce the finite-dimensional representations is the same: one uses integral operators on $G$, constructed as convolution operators using the “left” regular representation, and which therefore commute with the “right” regular one. Any eigenspace of such an operator for a non-zero eigenvalue is a finite-dimensional subrepresentation of the right-regular representation, and basically because any function gives a suitable integral operator, it is not too surprising that this gives enough finite-dimensional unitary representations.
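To make this mechanism concrete, here is a minimal numerical sketch, with assumptions that are mine and not in the paper: the compact group is replaced by the finite cyclic group $\mathbf{Z}/N\mathbf{Z}$, so that integrals become finite sums, and the helper names `convolution_matrix`, `shift_matrix` and `character` are purely illustrative. Since this group is abelian, the left and right regular representations coincide, and the eigenvectors are simply the characters, with eigenvalues given by the (unnormalized) discrete Fourier transform of $\varphi$.

```python
import numpy as np

def convolution_matrix(phi):
    """Matrix of f -> phi * f on Z/N, with entries T[s, t] = phi(s - t)."""
    n = len(phi)
    idx = np.arange(n)
    return phi[(idx[:, None] - idx[None, :]) % n]

def shift_matrix(n):
    """Matrix of the translation (R f)(s) = f(s + 1) on Z/N."""
    return np.roll(np.eye(n), 1, axis=1)

def character(h, n):
    """The character t -> exp(2 i pi h t / n) of Z/N."""
    return np.exp(2j * np.pi * h * np.arange(n) / n)

n = 12  # illustrative size; any N works
rng = np.random.default_rng(0)
phi = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # an arbitrary kernel

T = convolution_matrix(phi)
R = shift_matrix(n)

# The convolution operator commutes with translations.
commutes = np.allclose(T @ R, R @ T)

# Each character is an eigenvector of T; the eigenvalue is the
# (unnormalized) discrete Fourier coefficient of phi at frequency h.
eigenvalues = np.fft.fft(phi)
is_eigenvector = [
    np.allclose(T @ character(h, n), eigenvalues[h] * character(h, n))
    for h in range(n)
]
```

In a non-abelian group the eigenspaces would typically have dimension larger than one, and it is their stability under the right regular representation that produces the finite-dimensional representations; the abelian toy case only shows the commutation and the link between eigenvalues and Fourier coefficients.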

This is the principle. The Peter-Weyl constructive method is maybe best illustrated in the case of the circle, where the theory becomes that of Fourier series (of course, the completeness in that case is older and has many different proofs, but in fact Weyl first implemented the strategy in the related case of Bohr’s almost periodic functions, to get the analogue of Parseval’s identity in that setting; this was in the same volume of Math. Annalen.) Here, given a 1-periodic function
$\varphi(t)=\sum_{h\in \mathbf{Z}}{c(h)e^{2i\pi h t}},$
what they do is basically to construct (by iterative procedures) the successive approximations
$\varphi_0,\quad \varphi_1,\quad \ldots,\quad \varphi_k,\quad\ldots$
of the Fourier series such that
$\varphi_j=\sum_{h\in S_j}{c(h)e^{2i\pi h t}},$
where the sets of frequencies $S_j$ are increasing (for inclusion), and involve the successive Fourier coefficients with decreasing magnitude. In other words, $S_0$ is the set of frequencies where $|c(h)|$ is maximal, $S_1$ adds all frequencies for which $|c(h)|$ is the second-largest, etc. This is related to the previous discussion because the $c(h)$ are the eigenvalues of the convolution operator with kernel $\varphi$, and $|c(h)|^2$ are those for $\varphi\star\check\varphi$, the corresponding non-negative kernel.
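The ordering of the sets $S_j$ can be illustrated numerically. The sketch below is only an illustration under my own assumptions: it samples a trigonometric polynomial on a grid and reads the coefficients $c(h)$ directly off an FFT, which is not the iterative procedure of Peter and Weyl; the function name `peel` and the tolerance `tol` used to group coefficients of equal magnitude are hypothetical choices.

```python
import numpy as np

def peel(phi_samples, tol=1e-9):
    """Successive approximations phi_0, phi_1, ... of a sampled 1-periodic
    function, adding Fourier coefficients by decreasing magnitude."""
    n = len(phi_samples)
    c = np.fft.fft(phi_samples) / n  # c[h] approximates c(h), Haar-normalized
    mags = np.abs(c)

    # Distinct magnitude levels, largest first, ignoring numerically zero ones.
    levels = []
    for m in sorted(mags[mags > tol], reverse=True):
        if not levels or levels[-1] - m > tol:
            levels.append(m)

    approximations = []
    for level in levels:
        keep = mags >= level - tol  # the set S_j of retained frequencies
        approximations.append(np.fft.ifft(np.where(keep, c, 0) * n))
    return approximations

# Example: c(1) = c(-1) = 2, c(3) = 1, c(-5) = 0.5, all other c(h) = 0.
n = 64
t = np.arange(n) / n
phi = (4 * np.cos(2 * np.pi * t)
       + np.exp(2j * np.pi * 3 * t)
       + 0.5 * np.exp(-2j * np.pi * 5 * t))

approximations = peel(phi)

# Squared L^2 distance (for the normalized measure) from phi to each phi_j.
errors = [np.mean(np.abs(phi - a) ** 2) for a in approximations]
```

For this example there are three layers, $S_0=\{\pm 1\}$, $S_1=S_0\cup\{3\}$ and $S_2=S_1\cup\{-5\}$, and the entries of `errors` should be close to $1.25$, $0.25$ and $0$: each is the sum of the $|c(h)|^2$ over the frequencies not yet peeled off.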

Using a rather cute argument, Peter and Weyl are able to show that (the analogue for compact groups of) this process leads to approximations which converge in $L^2$ to $\varphi$, i.e., to a proof of the Parseval identity, which is the completeness theorem.

Altogether, for the circle, this proof might be a bit involved, but it is very nice and conceptual, and would certainly make interesting exercises in a functional analysis class. I had not seen it anywhere before, but it could be just personal ignorance of the literature… I also have no idea if the constructive aspect is actually numerically interesting.

To justify the title of the post: the way I think about this is that they “peel off” the largest contributions to the function $\varphi$, one by one, as one may peel an onion…