Three little things I learnt recently

In no particular order, and with no relevance whatsoever to the beginning of the year, here are three mathematical facts I learnt in recent months which might belong to the “I should have known this” category:

(1) Which finite fields k have the property that there is a “square root” homomorphism
s\ :\ (k^{\times})^2\rightarrow k^{\times},
i.e., a group homomorphism such that s(x)^2=x for all non-zero squares x in k?

The answer is that such an s exists if and only if either p=2 or -1 is not a square in k, where p denotes the characteristic of k (so, for k=\mathbf{Z}/p\mathbf{Z} with p odd, this means that p\equiv 3\pmod 4).

The proof of this is an elementary exercise. In particular, the necessity of the condition, for p odd, is just the same argument that works for the complex numbers: if s exists and -1 is a square, then we have
1=s(1)=s((-1)\times (-1))=s(-1)^2=-1,
which is a contradiction (note that s(-1) only exists because of the assumption that -1 is a square).
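As a sanity check of the criterion for prime fields, here is a minimal Python sketch (the function name has_sqrt_hom is mine, purely for illustration). It relies on the fact that the group of non-zero squares modulo p is cyclic: a homomorphism s is then determined by the image r of a generator h of order m, and the only constraints are r^2=h and r^m=1.

```python
def has_sqrt_hom(p):
    """Decide whether a square-root homomorphism exists on the squares of (Z/pZ)^*.

    Assumes p is prime. Illustrative brute-force check, only meant for small p.
    """
    if p == 2:
        return True
    squares = sorted({x * x % p for x in range(1, p)})
    m = len(squares)  # order of the group of squares, equal to (p - 1) / 2
    # Find a generator h of the (cyclic) group of non-zero squares.
    for h in squares:
        if len({pow(h, e, p) for e in range(m)}) == m:
            break
    # s is determined by r = s(h): we need r^2 = h, and r^m = 1 for s to be well defined.
    roots = [r for r in range(1, p) if r * r % p == h]
    return any(pow(r, m, p) == 1 for r in roots)


if __name__ == "__main__":
    for p in [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]:
        print(p, has_sqrt_hom(p), p == 2 or p % 4 == 3)
```

For p\equiv 3\pmod 4 one can also write s down explicitly as s(x)=x^{(p+1)/4}, since s(x)^2=x\cdot x^{(p-1)/2}=x when x is a non-zero square.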

This, together with the similarity with the real and complex cases, immediately raises the question of determining (if possible) which other fields admit a square-root homomorphism. And, lo and behold, the first Google search reveals a nice 2012 paper by Waterhouse in the American Math. Monthly that shows that the answer is the same: if K is a field of characteristic different from 2, then K admits a homomorphism
s\ :\ (K^{\times})^2\rightarrow K^{\times},
with s(x)^2=x, if and only if -1 is not a square in K.

(The argument for sufficiency is not very hard: one first checks that it is enough to find a subgroup R of K^{\times} such that the homomorphism
t\, :\, R\times \{\pm 1\}\rightarrow K^{\times}
given by t(x,\varepsilon)=\varepsilon x is an isomorphism; viewing K^{\times}/(K^{\times})^2 as a vector space over \mathbf{Z}/2\mathbf{Z}, such a subgroup R is obtained as the pre-image in K^{\times} of a complementary subspace to the line generated by (-1)(K^\times)^2, which is a one-dimensional space because -1 is assumed to not be a square.)

It seems unlikely that such a basic fact would not have been stated before 2012, but Waterhouse gives no previous reference (and I don’t know any myself!)

(2) While reviewing the Polymath8 paper, I learnt the following identity of Lommel for Bessel functions (see page 135 of Watson’s treatise):
\int_0^u tJ_{\nu}(t)^2dt=\frac{1}{2}u^2\Bigl(J_{\nu}(u)^2-J_{\nu-1}(u)J_{\nu+1}(u)\Bigr)
where J_{\nu} is the Bessel function of the first kind. This is used to find the optimal weight in the original Goldston-Pintz-Yıldırım argument (a computation first done by B. Conrey, though it was apparently unpublished until a recent paper of Farkas, Pintz and Révész.)
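Lommel's identity is easy to test numerically. The following is only a rough sketch under assumptions of my own: the index \nu is an integer \geq 1, the Bessel function is computed from its power series (fine for moderate arguments), and the integral is approximated by Simpson's rule. The helper names are hypothetical.

```python
from math import factorial


def bessel_j(n, x, terms=40):
    """J_n(x) for an integer n >= 0, from the power series (moderate x only)."""
    return sum(
        (-1) ** m / (factorial(m) * factorial(m + n)) * (x / 2) ** (2 * m + n)
        for m in range(terms)
    )


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def lommel_gap(nu, u):
    """Absolute difference between the two sides of Lommel's identity (integer nu >= 1)."""
    left = simpson(lambda t: t * bessel_j(nu, t) ** 2, 0.0, u)
    right = 0.5 * u * u * (
        bessel_j(nu, u) ** 2 - bessel_j(nu - 1, u) * bessel_j(nu + 1, u)
    )
    return abs(left - right)


if __name__ == "__main__":
    print(lommel_gap(1, 3.0), lommel_gap(2, 2.5))
```

Both printed gaps should be at the level of the quadrature error, which is of course no substitute for the proof in Watson.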

There are rather few “exact” indefinite integrals of functions obtained from Bessel functions or related functions which are known, and again I should probably have heard of this result before. What could be an analogue for Kloosterman sums?

(3) In my recent paper with G. Ricotta (extending to automorphic forms on all GL(n) the type of central limit theorem found previously in a joint paper with É. Fouvry, S. Ganguly and Ph. Michel for Hecke eigenvalues of classical modular forms in arithmetic progressions), we use the identity
 \sum_{k\geq 0}\binom{N-1+k}{k}^2 T^k=\frac{P_N(T)}{(1-T)^{2N-1}}
where N\geq 1 is a fixed integer and
P_N(T)=\sum_{k=0}^{N-1}\binom{N-1}{k}^2T^k.

This is probably well-known, but we didn’t know it before. Our process in finding and checking this formula is certainly rather typical: small values of N were computed by hand (or using a computer algebra system), leading quickly to a general conjecture, namely the identity above. At least Mathematica can in fact check that this is correct (in the sense of evaluating the left-hand side to a form obviously equivalent to the right-hand side), but as usual it gives no clue as to why this is true (and in particular, how difficult or deep the result is!) However, a bit of looking around and guessing that this had to do with hypergeometric functions (because P_N is close to a Legendre polynomial, which is a special case of a hypergeometric function) reveals that, in fact, we have to deal with about the simplest identity for hypergeometric functions, going back to Euler: precisely, the formula is identical with the transformation
{}_2F_1(-(N-1),-(N-1);1;T)=(1-T)^{2N-1}{}_2F_1(N,N;1;T),
where
{}_2F_1(\alpha,\beta;1;z)=\sum_{k\geq 0}\frac{\alpha (\alpha+1)\cdots (\alpha+k-1)\beta(\beta+1)\cdots (\beta+k-1)} {(k!)^2}z^k
is (a special case of) the Gauss hypergeometric function.
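The kind of experiment described above can be reproduced with exact integer arithmetic. Here is a small sketch (function names are mine) that compares the coefficient of T^k on both sides, using the standard expansion (1-T)^{-m}=\sum_{j\geq 0}\binom{m-1+j}{j}T^j with m=2N-1.

```python
from math import comb


def lhs_coeff(N, k):
    """Coefficient of T^k in sum_k binom(N-1+k, k)^2 T^k."""
    return comb(N - 1 + k, k) ** 2


def rhs_coeff(N, k):
    """Coefficient of T^k in P_N(T) / (1 - T)^(2N - 1), by convolution."""
    m = 2 * N - 1
    return sum(
        comb(N - 1, i) ** 2 * comb(m - 1 + (k - i), k - i)
        for i in range(min(k, N - 1) + 1)
    )


if __name__ == "__main__":
    ok = all(lhs_coeff(N, k) == rhs_coeff(N, k)
             for N in range(1, 9) for k in range(31))
    print("identity holds for N <= 8, k <= 30:", ok)
```

This only checks finitely many coefficients, so it plays the same role as the computer algebra verification: it supports the conjecture without explaining it.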

Another exercise with characters

While thinking about something else, I noticed recently the following result, which is certainly not new:

Let G be a compact topological group [ADDITIONAL ASSUMPTION pointed out by Y. Choi: connected, Lie group], and let \rho be a finite-dimensional irreducible unitary continuous representation of G on a vector space V. Then the natural representation \pi of G on \mathrm{End}(V) decomposes as a direct sum of one-dimensional characters if and only if \rho is of dimension 1.

One direction is clear: if \rho has dimension one, then \pi is simply the trivial one-dimensional representation. For the converse, here is an argument with character theory.

As a first step, note that if \rho (of dimension d\geq 1, say) has this property, then in fact \pi decomposes as a direct sum of distinct one-dimensional characters: indeed, the multiplicity of a character \chi in \pi is the same as
n_{\chi}=\int_{G}\overline{\chi(g)}\mathrm{Tr}(\pi(g))dg,
where dg is the probability Haar measure on G, and since
\mathrm{Tr}(\pi(g))=|\mathrm{Tr}(\rho(g))|^2,
we get
n_{\chi}\leq \int_{G}\mathrm{Tr}(\pi(g))dg=1
by the orthogonality relations of characters. (Algebraically, this is just an application of Schur’s lemma).

Thus if we decompose \pi into irreducible representations, we get
\pi=\bigoplus_{1\leq i\leq d^2} \chi_i,
where the \chi_i are distinct one-dimensional characters. We then know by orthogonality that
d^2=\int_{G} |\mathrm{Tr}(\pi(g))|^2 dg=\int_{G} |\mathrm{Tr}(\rho(g))|^4 dg.

Now the last integral is bounded by
\int_{G} |\mathrm{Tr}(\rho(g))|^4 dg\leq \mathrm{Max}_{g}|\mathrm{Tr}(\rho(g))|^2 \times \int_G|\mathrm{Tr}(\rho(g))|^2dg\leq d^2,
(since |\mathrm{Tr}(\rho(g))|\leq d). Comparing, this means that there must be equality throughout in this estimate, which forces |\mathrm{Tr}(\rho(g))| to take only the values 0 and d. Since the trace is continuous, G is connected, and |\mathrm{Tr}(\rho(1))|=d, this in turn implies that |\mathrm{Tr}(\rho(g))|=d for all g\in G. Since \rho(g) is unitary of size d, this implies that \rho(g) is scalar for all g, and since \rho is assumed to be irreducible, it is in fact one-dimensional.
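To see concretely where connectedness matters, one can compute the second and fourth moments of a character for two small finite groups, which are of course not connected. The examples below are my own choice and not part of the argument above: the two-dimensional irreducible characters of the symmetric group S_3 and of the quaternion group of order 8, given by their standard class sizes and character values.

```python
from fractions import Fraction


def moment(class_sizes, char_values, power):
    """Average of |chi(g)|^power over a finite group, from class data."""
    order = sum(class_sizes)
    total = sum(s * abs(v) ** power for s, v in zip(class_sizes, char_values))
    return Fraction(total, order)


# (class sizes, values of the 2-dimensional irreducible character)
S3 = ([1, 3, 2], [2, 0, -1])   # identity, transpositions, 3-cycles
Q8 = ([1, 1, 6], [2, -2, 0])   # 1, -1, the six elements of order 4

if __name__ == "__main__":
    for name, data in [("S3", S3), ("Q8", Q8)]:
        print(name, moment(*data, 2), moment(*data, 4))
```

For S_3 the fourth moment is 3<4=d^2, so \mathrm{End}(V) is not a sum of distinct characters. For the quaternion group it is exactly 4=d^2 although d=2: the character takes only the values of modulus 0 and 2, which is compatible with equality in the estimate but is ruled out on a connected group.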

I see two interesting points in this argument: (1) is there a purely algebraic proof of the last part? I haven’t thought very hard about this yet, but it would be nice to have one; (2) the appearance of the fourth moment of \rho is nicely reminiscent of the Larsen alternative (see Section 6.3 of my notes on representation theory, for instance…)

Zeros of Hermite polynomials

In my paper with É. Fouvry and Ph. Michel where we find upper bounds for the number of certain sheaves on the affine line over a finite field with bounded ramification, the combinatorial part of the argument involves spherical codes and the method of Kabatjanski and Levenshtein, and turns out to depend on the rather recondite question of knowing a lower bound on the size of the largest zero x_n of the n-th Hermite polynomial H_n, which is defined for integers n\geq 1 by
H_n(x)=(-1)^n e^{x^2} \frac{d^n}{dx^n}e^{-x^2}.

This is a classical orthogonal polynomial (which implies in particular that all zeros of H_n are real and simple). The standard reference for such questions seems to still be Szegö’s book, in which one can read the following rather remarkable asymptotic formula:
x_n=\sqrt{2n}-\frac{i_1}{\sqrt[3]{6}}\frac{1}{(2n)^{1/6}}+o(n^{-1/6})
where i_1=3.3721\ldots>0 is the first (real) zero of the function
\mathrm{A}(x)=\frac{\pi}{3}\sqrt{\frac{x}{3}}\Bigl\{J_{1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)+J_{-1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)\Bigr\}
which is a close cousin of the Airy function (see formula (6.32.8) in Szegö’s book, noting that he observes the Peano paragraphing rule, according to which section 6.32 comes before 6.4).

(Incidentally, if — like me — you tend to trust any random PDF you download to check a formula like that, you might end up with a version containing a typo: the cube root of 6 is, in some printings, replaced by a square root…)

Szegö references work of a number of people (Zernike, Hahn, Korous, Bottema, Van Veen and Spencer), and sketches a proof based on ideas of Sturm on comparison of solutions of two differential equations.

As it happens, it is better for our purposes to have explicit inequalities, and there is an elementary proof of the estimate
x_n\geq\sqrt{\frac{n-1}{2}},
which is asymptotically weaker than the previous formula only by a factor 2. This is also explained by Szegö, and since the argument is rather cute and short, I will give a sketch of it.

Besides the fact that the zeros of H_n are real and simple, we will use the easy facts that \deg(H_n)=n, and that H_n is an even function for n even, and an odd function for n odd, and most importantly (since all other properties are rather generic!) that they satisfy the differential equation
y''-2xy'+2ny=0.

The crucial lemma is the following result of Laguerre:

Let P\in \mathbf{C}[X] be a polynomial of degree n\geq 1. Let z_0 be a simple zero of P, and let
w_0=z_0-2(n-1)\frac{P'(z_0)}{P''(z_0)}.
Then if T\subset \mathbf{C} is any line or circle passing through z_0 and w_0, either all zeros of P are in T, or both components of \mathbf{C}-T contain at least one zero of P.

Before explaining the proof of this, let’s see how it gives the desired lower bound on the largest zero x_n of H_n. We apply Laguerre’s result with P=H_n and z_0=x_n. Using the differential equation, we obtain
w_0=x_n-\frac{n-1}{x_n}.
Now consider the circle T such that the segment [w_0,z_0] is a diameter of T.
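For completeness, here is the short computation behind the formula for w_0, written out as a worked equation. It only uses the differential equation evaluated at the zero x_n, together with the facts that x_n\not=0 for n\geq 2 and that H_n'(x_n)\not=0 because the zero is simple.

```latex
H_n''(x_n) - 2x_n H_n'(x_n) + 2n H_n(x_n) = 0,\quad H_n(x_n)=0
\quad\Longrightarrow\quad H_n''(x_n) = 2x_n H_n'(x_n),
\qquad\text{hence}\qquad
w_0 = x_n - 2(n-1)\frac{H_n'(x_n)}{H_n''(x_n)}
    = x_n - \frac{2(n-1)}{2x_n}
    = x_n - \frac{n-1}{x_n}.
```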

Now note that -x_n is the smallest zero of H_n (as we observed above, H_n is either odd or even). We cannot have w_0<-x_n: if that were the case, the unbounded component of the complement of the circle T would not contain any zero, and neither would T contain all zeros (since -x_n\notin T), contradicting the conclusion of Laguerre's Lemma. Hence we get
-x_n\leq w_0=x_n-\frac{n-1}{x_n},
and this implies
x_n\geq \sqrt{\frac{n-1}{2}},
as claimed. (Note that if n\geq 3, one deduces easily that the inequality is strict, but there is equality for n=2.)
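The inequality can be checked numerically for small n. The sketch below is an illustration under assumptions of my own: H_n is evaluated with the standard recurrence H_{k+1}(x)=2xH_k(x)-2kH_{k-1}(x), the classical bound x_n<\sqrt{2n+1} is used as a starting point, and the scan step is assumed to be smaller than the gap between the two largest zeros (which is the case for the small n tested here).

```python
from math import sqrt


def hermite(n, x):
    """H_n(x) via the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def largest_zero(n, step=1e-3):
    """Largest zero of H_n (n >= 1): scan down from sqrt(2n+1), then bisect."""
    hi = sqrt(2 * n + 1)          # H_n > 0 here, since all zeros are smaller
    lo = hi - step
    while hermite(n, lo) > 0:     # walk left until the sign changes
        hi, lo = lo, lo - step
    for _ in range(60):           # bisection on [lo, hi]
        mid = (lo + hi) / 2
        if hermite(n, mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, largest_zero(n), sqrt((n - 1) / 2))
```

For n=2 the two printed values coincide up to rounding, in agreement with the equality case noted above.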

Now for the proof of the Lemma. One defines a polynomial Q by
P=(X-z_0)Q,
so that Q has degree n-1 and has zero set Z formed of the zeros of P different from z_0 (since the latter is assumed to be simple). Using the definition, we have
Q(z_0)=P'(z_0),\quad\quad Q'(z_0)=\frac{1}{2}P''(z_0).
We now compute the value at z_0 of the logarithmic derivative of Q, which is well-defined: we have
\frac{Q'}{Q}=\sum_{\alpha\in Z}\frac{1}{X-\alpha},
hence
\frac{Q'}{Q}(z_0)=\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},
which becomes, by the above formulas and the definition of w_0, the identity
\frac{1}{z_0-w_0}=\frac{1}{n-1}\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},
or equivalently
\gamma(w_0)=\frac{1}{n-1}\sum_{\alpha\in Z}{\gamma(\alpha)},
where \gamma(z)=1/(z_0-z) is a Möbius transformation.
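The passage from the logarithmic derivative to the identity for 1/(z_0-w_0) can be spelled out as follows. This is just the computation implicit in the text, obtained by differentiating P=(X-z_0)Q twice and evaluating at z_0.

```latex
P' = Q + (X-z_0)Q', \qquad P'' = 2Q' + (X-z_0)Q''
\quad\Longrightarrow\quad
Q(z_0) = P'(z_0), \qquad Q'(z_0) = \tfrac{1}{2}P''(z_0),
\\[4pt]
\frac{Q'}{Q}(z_0) = \frac{P''(z_0)}{2P'(z_0)},
\qquad
z_0 - w_0 = 2(n-1)\frac{P'(z_0)}{P''(z_0)}
\quad\Longrightarrow\quad
\frac{Q'}{Q}(z_0) = \frac{n-1}{z_0-w_0}.
```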

Recalling that |Z|=n-1, this means that \gamma(w_0) is the average of the \gamma(\alpha). It is then elementary that for any line L passing through \gamma(w_0), either \gamma(Z) is contained in L, or \gamma(Z) intersects both components of the complement of L. Now apply \gamma^{-1} to this assertion: one gets that either Z is contained in \gamma^{-1}(L), or Z intersects both components of the complement of \gamma^{-1}(L). We are now done, after observing that the lines passing through \gamma(w_0) are precisely the images under \gamma of the circles and lines passing through w_0 and through z_0 (because \gamma(z_0)=\infty, and each line passes through \infty in the projective line.)

Orthogonality of columns of integral unitary operators: a challenge

Given a unitary matrix A=(a_{i,j}) of finite size, it is a tautology that the column vectors of A are orthonormal, and in particular that
\sum_{i} a_{i,j} \overline{a_{i,k}} =0
for any j\not=k. This has an immediate analogue for a unitary operator U\,:\, H\rightarrow H, if H is a separable Hilbert space: given any orthonormal basis (e_n)_{n\geq 1} of H, we can define the “matrix” (a_{i,j})_{i,j\geq 1} representing U by
U(e_j)=\sum_{i\geq 1}a_{i,j}e_i,
and the “column vectors” (a_{i,j})_{i\geq 1}, for distinct indices j, are orthogonal in the \ell_2-sense: we have
0=\langle e_j,e_k\rangle = \langle U(e_j),U(e_k)\rangle=\sum_{i}a_{i,j}\overline{a_{i,k}}
if j\not=k.
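In finite dimension this is easy to see on a concrete example. The sketch below uses the normalized discrete Fourier transform matrix, which is my own choice of illustration (it is the finite analogue of the Fourier transform, but any unitary matrix would do), and computes the inner products of its columns.

```python
import cmath


def dft_matrix(n):
    """Normalized n x n discrete Fourier transform matrix (a unitary matrix)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (i * j) / n ** 0.5 for j in range(n)] for i in range(n)]


def column_inner_product(a, j, k):
    """Sum over i of a[i][j] * conj(a[i][k])."""
    return sum(row[j] * row[k].conjugate() for row in a)


if __name__ == "__main__":
    a = dft_matrix(8)
    print(abs(column_inner_product(a, 2, 5)))   # numerically zero
    print(abs(column_inner_product(a, 3, 3)))   # numerically one
```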

Now assume that H is some L^2 space, say H=L^2(X,\mu), and U is an integral operator on H given by a kernel k\,:\, X\times X\rightarrow \mathbf{C}, so that
U(\varphi)(x)=\int_{X}\varphi(y)k(x,y)d\mu(y)
for \varphi \in L^2(X,\mu).
Intuitively, the values k(x,y) of the kernel form a kind of “continuous matrix” representing U. The question is: are its columns orthogonal? In other words, given y\not=z in X, do we have
\int_{X}k(x,y)\overline{k(x,z)}d\mu(x)=0?

If one remembers the fact that “nice” kernels define trace class integral operators in such a way that the trace can be recovered as the integral
\int_{X}k(x,x)d\mu(x)
over the diagonal (the basis of the trace formula for automorphic forms…), this sounds rather reasonable. There is however a difficulty: it is not so easy to write kernels k(x,y) which both define a unitary operator, and are such that the integrals
(\star)\quad\quad\quad\quad \int_{X}k(x,y)\overline{k(x,z)}d\mu(x)
are well-defined in the usual sense! For instance, the most important unitary integral operator is certainly the Fourier transform, defined on L^2(\mathbf{R},dx), and its kernel is
k(x,y)=e^{2i\pi xy},
for which the integrals above are all undefined in the Lebesgue sense. This is natural: if the kernel k(x,y) were square integrable on X\times X, for instance, the corresponding integral operator on L^2(X,\mu) would be compact, and its spectrum could not be contained in the unit circle (excluding the degenerate case of a finite-dimensional L^2-space.)

This probably explains why this question of orthogonality of column vectors is not to be found in standard textbooks. There are some examples however where things do work.

We consider the space H=L^2(\mathbf{R}^*,|x|^{-1}dx), and as in the previous post, we look at the unitary operator
T=\rho\Bigl(\begin{pmatrix}0&-1\\1&0\end{pmatrix}\Bigr),
where \rho is the principal series representation with eigenvalue 1/4 of \mathrm{PGL}_2(\mathbf{R}). The result of Cogdell and Piatetski-Shapiro already mentioned there shows that T is, indeed, a unitary operator given by a smooth kernel k(x,y)=j(xy) for some function j on \mathbf{R}^*. This function is explicit, and (as expected) not very integrable: we have
j(x)=\begin{cases}-2\pi \sqrt{x}Y_0(4\pi\sqrt{x})\text{ for } x>0,\\4\sqrt{|x|}K_0(4\pi\sqrt{|x|})\text{ for } x<0.\end{cases}

Since it is classical that Y_0(x)\approx x^{-1/2} for x\rightarrow +\infty, this function is neither integrable nor square-integrable. But, the function K_0 on [0,+\infty[ decays exponentially at infinity! This means that the integrals (\star), which are given by
\int_{\mathbf{R}^*}j(xy)\overline{j(xz)}\frac{dx}{|x|},
make perfect sense when y and z have opposite sign (this requires also knowing that there is no problem at 0, but that is indeed the case, because the Bessel functions here have just a logarithmic singularity there, and the factors \sqrt{|x|} eliminate the |x|^{-1} in the integral.)

It should not be a surprise then that we have
\int_{\mathbf{R}^*}j(xy)\overline{j(xz)}\frac{dx}{|x|}=0
for yz<0. This boils down to an identity for integrals of Bessel functions that can be found in (combinations of) standard tables, or it can be proved more conceptually by viewing
j(xy)=k(x,y)
as limit of
\frac{1}{2\epsilon}\int_{|u-y|<\epsilon} k(x,u)du,
which is T(f_{y,\epsilon}) for the function f_{y,\epsilon} which is the normalized characteristic function of the interval of radius \epsilon around y, and similarly for z. Since
\langle f_{y,\epsilon},f_{z,\epsilon}\rangle =0
when \epsilon is small enough, the unitarity gives
\int_{\mathbf{R}^*} Tf_{y,\epsilon}(x)\overline{Tf_{z,\epsilon}(x)}\frac{dx}{|x|}=0,
and one must take the limit \epsilon\rightarrow 0, which is made relatively easy by the exponential decay of K_0 at infinity…

This is nice, but here comes a challenge: if one spells out this identity in terms of Bessel functions, what needs to be done is equivalent to showing that the function
K(a, b)=\int_{0}^{+\infty}{Y_0(ax)K_0(bx)xdx}
defined for a,b>0, is antisymmetric: we have
K(a,b)=-K(b,a).
Now, this fact is an “elementary” property of classical functions. Can one prove it directly? (By which I mean, without using the operator interpretation, but also without using an explicit formula for the integral…) For the moment, I have not succeeded…

I’ll conclude by correcting a mistake in my previous post (it should not be a surprise to anyone that if I attempt to be as clever as Euler, I may stumble rather badly, and the correction is in some sense rather small compared with what one might expect)… There I claimed that the integral transform w\mapsto W appearing in the Voronoi formula for the divisor function is given by
|y|^{1/2}W(y)=T(|x|^{1/2}w(|x|)).
But this is not the case: the proper formula is
|y|^{1/2}W(y)=T(|x|^{1/2}\tilde{w}(x)),
where \tilde{w}(x)=w(x) if x>0, but \tilde{w}(x)=0 if x<0. This affects the final formula: we have
\|W\|^2=\|w\|^2,
instead of the claimed
\|W\|^2=2\|w\|^2
(the "proof" using the Fourier transform has the same mistake of using w(|xy|) instead of \tilde{w}(xy), so there is no contradiction between the informal argument and the rigorous one.)

On Weyl groups and gaussians

Am I the last person to notice that for k\geq 0, the even moment
m_{2k}=\frac{(2k)!}{2^kk!}
of a standard gaussian random variable (with expectation zero and variance one) is the same as the index of the Weyl group of \mathrm{Sp}_{2k} inside the Weyl group of \mathrm{GL}_{2k} (in other words, the index of the group of permutations of 2k elements commuting with a fixed-point free involution among all permutations)?

If “Yes”, what else have I been missing in the same spirit?
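For small k the coincidence can be confirmed by brute force. In the sketch below (names are mine), the centralizer of the fixed-point free involution 0\leftrightarrow 1, 2\leftrightarrow 3, … is counted by enumerating all permutations, and the gaussian moment is computed from the recursion m_{2k}=(2k-1)m_{2k-2}, which follows from integration by parts.

```python
from itertools import permutations
from math import factorial


def centralizer_size(k):
    """Number of permutations of 2k points commuting with i -> i XOR 1."""
    n = 2 * k
    tau = [i ^ 1 for i in range(n)]   # swaps 0<->1, 2<->3, ...
    return sum(
        1
        for s in permutations(range(n))
        if all(s[tau[i]] == tau[s[i]] for i in range(n))
    )


def gaussian_even_moment(k):
    """E[X^(2k)] for a standard gaussian X, i.e. 1 * 3 * ... * (2k - 1)."""
    m = 1
    for j in range(1, 2 * k, 2):
        m *= j
    return m


if __name__ == "__main__":
    for k in range(5):
        index = factorial(2 * k) // centralizer_size(k)
        print(k, index, gaussian_even_moment(k))
```

The enumeration is factorial in size, so this is only practical for k up to 4 or so.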