The Bohr-Pál Theorem and its friends

In the course of writing our paper on the support of the Kloosterman paths, Will Sawin and I encountered some very beautiful classical questions of Fourier analysis. As discussed in the previous post, we were interested in the following question: given a continuous function f \colon [0,1]\to \mathbf{C}, such that (1) we have f(1)\in [-2,2]\subset \mathbf{R} and (2) the function g(t)=f(t)-tf(1) has purely imaginary Fourier coefficients \hat{g}(h) for h\not=0, does there exist an increasing homeomorphism \sigma\colon [0,1]\to [0,1], such that \sigma(1-t)=1-\sigma(t), and such that the new function f_1(t)=f(\sigma(t)), in addition to the analogue of (1) and (2), also satisfies |\hat{g}_1(h)|\leq 1/(\pi|h|) for h\not=0?

After some searching we found the right keywords, or mathematical attractor, for this type of problem. It starts with a short paper of Gyula, alias Julius, alias Jules, Pál, alias Pal, alias Perl, in 1914, who attributes to Fejér the following question : can one reparameterize a continuous periodic function in such a way that the new function has uniformly convergent Fourier series? Pál shows that he can almost do it: his reparameterized function has Fourier series that converges uniformly on [\delta,1-\delta], where \delta>0 can be arbitrarily small (but the change of variable depends on \delta).

This Theorem of Pál becomes the Bohr–Pál Theorem after Harald Bohr’s full answer to Fejér’s question in Acta Universitatis Szegediensis in 1935. (Note: I had never looked before at the archive of this particular journal; like most mathematical journals of this period, it is amazing how many names, and even theorems, one can recognize in almost any issue !)

Then comes a beautiful very short proof by Salem in 1944 (it is also the proof found in Zygmund’s book on trigonometric series). All three proofs rely on two results to achieve uniform convergence: one is Riemann’s mapping theorem, and the second is a result of Fejér, according to which the conformal map from the unit disc to the “interior” of a simple closed curve has the property that the boundary Fourier series converges uniformly. In fact, one can certainly write beautiful exercises or exams based on Salem’s proof, for a course in complex analysis that goes as far as Riemann’s Theorem…

The story does not stop here. It is amusing to note that none of Pál, Bohr or Salem feel that it is necessary to point out that they work with real-valued periodic functions. But this is essential to the basic structure of their argument (which starts by viewing f(t) as the imaginary part of a complex number that describes a simple closed curve in \mathbf{C}, the trick being to define the real part in a suitable way). Whether the Bohr-Pál Theorem holds for complex-valued functions f was an open problem until the late 1970’s (see for instance p. 128 of this 1974 survey of Zalcman about real and complex problems in analysis), when it was proved by Kahane and Katznelson with a completely different approach. They in fact succeeded in proving that one can find a single change of variable \sigma that transforms simultaneously all functions in a compact set of the space of (real-valued periodic) continuous functions into functions with uniformly convergent Fourier series; applied to the real and imaginary parts of a complex-valued function, this gives the required extension.

Coming back to our original question, the complex-variable proofs give some information on the Fourier coefficients of the reparameterized function g, precisely they imply that
\sum_{h\in\mathbf{Z}}|h|\,|\hat{g}(h)|^2<+\infty,
which is encouraging, even if no pointwise estimate follows. But some variants of the question have been studied that get much closer to what we want. They are discussed (among other things) in a nice survey by Olevskii. I find especially fascinating one theorem of Olevskii himself, published in 1981: it is not always possible to find a change of variable (of a real-valued periodic continuous function) so that the reparameterized function has an absolutely convergent Fourier series. This answers a question of Luzin, and is explained in Section 3 of Olevskii’s survey. Interestingly, this problem is now, of course, easier for complex-valued functions, compared to real-valued functions, and it was also solved by Kahane and Katznelson in the \mathbf{C}-valued case.

The result that turned out to be most useful for our purposes is
due to Sahakyan
(whose name is also spelled Saakjan or Saakyan). He proves the existence of reparameterizations g of a continuous periodic function f satisfying various asymptotic bounds for their Fourier coefficients. The result is here also restricted to real-valued functions, and again for a clear reason: the construction relies crucially on the intermediate value theorem for continuous functions. More precisely, Sahakyan’s idea is to use the Faber-Schauder expansions of continuous functions, and to find a change of variable that leads to a function g with a “sparse” Faber-Schauder expansion, from which he estimates directly the Fourier coefficients using bounds for those of the Faber-Schauder functions. (I actually didn’t know about Faber-Schauder expansions before; I will also certainly make use of them for analysis or functional analysis exercises later…) This works because the Faber-Schauder coefficients are quite simple: they are of the form
f(x_1)-\frac{1}{2}(f(x_0)+f(x_2))
for suitable real numbers x_i\in [0,1]. One can now easily imagine how the intermediate value theorem will make it possible to find a change of variable to make such a coefficient vanish.

Using Sahakyan’s main lemma, with a few modifications to take into account the additional symmetry we require, we were able to solve our reparameterization problem for the support of Kloosterman paths, in the case of real-valued functions. The complex-value case, as far as we know, remains open…

Symmetric powers and the Chinese restaurant process

Here is a simple but cute computation that makes a good exercise in elementary representation theory with a probabilistic flavor — as usual, it’s probably well-known, but I wasn’t aware of it until doing the exercise myself.

We consider random permutations in the symmetric group S_n, and write \omega_n(\sigma) for the number of cycles in the disjoint cycle representation of \sigma\in S_n. It is a well-known fact (which may be due to Feller) that, viewed as a random variable on S_n (with uniform probability measure), there is a decomposition in law
\omega_n=B_1+\cdots+B_n,
where B_i is a Bernoulli random variable taking value 0 with probability 1-1/i and 1 with probability 1/i, and moreover the (B_i)‘s are independent.

This decomposition is (in the sources I know) usually proved using the “Chinese Restaurant Process” to describe random permutations (n guests numbered from 1 to n enter a restaurant with circular tables; they successively sit either at the next available space of one of the tables already occupied, or pick a new one, with the same probability; after all n guests are seated, reading the cycles on the occupied tables gives a uniformly distributed element of S_n.) If one is only interested in \sigma_n, then the decomposition above is equivalent to the formula
\mathbf{E}(z^{\omega_n})=\prod_{j=1}^n\Bigl(1-\frac{1}{j}+\frac{z}{j}\Bigr)
for the probability generating function of \omega_n. (This formula comes up in mod-poisson convergence of \sigma_n and in analogies with the Erdös-Kac Theorem, see this blog post).

Here is an algebraic proof of this generating function identity. First, z\mapsto \mathbf{E}(z^{\omega_n}) is a polynomial, so it is enough to prove the formula when z=m is a non-negative integer. We can do this as follows: let V be a vector space of dimension m over \mathbf{C}. Then define W=V^{\otimes n} and view it as a representation space of S_n where the symmetric group permutes the factors in the tensor product. Using the “obvious” basis of W, it is elementary that the character of this representation is the function g\mapsto m^{\sigma_n(g)} on S_n (the matrix representing the action of g is a permutation matrix). Hence, by character theory, \mathbf{E}(m^{\sigma_n}) is the dimension of the S_n-invariant subspace of W. Since the representation is semisimple, this is the dimension of the coinvariant space, which is the n-th symmetric power of V. This in turn is well-known (number of monomials of degree n in m variables), and we get
\mathbf{E}(m^{\omega_n})=\binom{m+n-1}{n}=\prod_{j=1}^n\Bigl(1-\frac{1}{j}+\frac{m}{j}\Bigr),
as claimed!

Update on a bijective challenge

A long time ago, I presented in a post a fun property of the family of curves given by the equations
x+1/x+y+1/y=t,
where t is the parameter: if we consider the number (say N_p(t)) of solutions over a finite field with p\geq 3 elements, then we have
N_p(t)=N_p(16/t),
provided t is not in the set \{0, 4, -4\}. This was a fact that Fouvry, Michel and myself found out when working on our paper on algebraic twists of modular forms.

At the time, I knew one proof, based on computations with Magma: the curves above “are” elliptic curves, and Magma found an isogeny between the curves with parameters t and 16/t, which implies that they have the same number of points by elementary properties of elliptic curves over finite fields.

By now, however, I am aware of three other proofs:

  • The most elementary, which motivates this post, was just recently put on arXiv by T. Budzinski and G. Lucchini Arteche; it is based on the methods of Chevalley’s proof of Warning’s theorem: it computes N_p(t) modulo p, proving the desired identity modulo p, and then concludes using upper bounds for the number of solutions, and for its parity, to show that this is sufficient to have the equality as integers.  Interestingly, this proof was found with the help of high-school students participating in the Tournoi Français des Jeunes Mathématiciennes et Mathématiciens. This is a French mathematical contest for high-school students, created by D. Zmiaikou in 2009, which is devised to look much more like a “real” research experience than the Olympiads: groups of students work together with mentors on quite “open-ended” questions, where sometimes the answer is not clear (see for instance the 2016 problems here).
  • A proof using modular functions was found by D. Zywina, who sent it to me shortly after the first post (I have to look for it in my archives…)
  • Maybe the most elegant argument comes by applying a more general result of Deligne and Flicker on local systems of rank 2 on the projective line minus four points with unipotent tame ramification at the four missing points (the first cohomology of the curves provide such a local system, the missing points being 0, 4, -4, \infty). Deligne and Flicker prove (Section 7 of their paper, esp. Prop. 7.6), using a very cute game with products of matrices and invariant theory, that if \mathcal{V} is such a local system and \sigma is any automorphism of the projective line that permutes the four points, then \sigma^*\mathcal{V} is isomorphic to \mathcal{V}.

Not too bad a track record for such a simple-looking question… Whether there is a bijective proof remains open, however!

Kummer extensions, Hilbert’s Theorem 90 and judicious expansion

This semester, I am teaching “Algebra II” for the first time. After “Algebra I” which covers standard “Groups, rings and fields”, this follow-up is largely Galois theory. In particular, I have to classify cyclic extensions.

In the simplest case where L/K is a cyclic extension of degree n\geq 1 and K contains all n-th roots of unity (and n is coprime to the characteristic of K), this essentially means proving that if L/K has cyclic Galois group of order n, then there is some b\in L with L=K(b) and b^n=a belongs to K^{\times}.

Indeed, the converse is relatively simple (in the technical sense that I can do it on paper or on the blackboard without having to think about it in advance, by just following the general principles that I remember).

I had however the memory that the second step is trickier, and didn’t remember exactly how it was done. The texts I use (the notes of M. Reid, Lang’s “Algebra” and Chambert-Loir’s delightful “Algèbre corporelle”, or rather its English translation) all give “the formula” for the element b but they do not really motivate it. This is certainly rather quick, but since I can’t remember it, and yet I would like to motivate as much as possible all steps in this construction, I looked at the question a bit more carefully.

As it turns out, a judicious expansion and lengthening of the argument makes it (to me) more memorable and understandable.

The first step (which is standard and motivated by the converse) is to recognize that it is enough to find some element x in L^{\times} such that \sigma(x)=\xi x, where \sigma is a generator of the Galois group G=\mathrm{Gal}(L/K) and \xi is a primitive n-th root of unity in L. This is a statement about the K-linear action of G on L, or in other words about the representation of G on the K-vector space L. So, as usual, the first question is to see what we know about this representation.

And we know quite a bit! Indeed, the normal basis theorem states that L is isomorphic to the left-regular representation of G on the vector space V of K-valued functions \varphi\,:\, G\longrightarrow K, which is given by
(\sigma\cdot \varphi)(\tau)=\varphi(\sigma^{-1}\tau).
(It is more usual to use the group algebra K[G], but both are isomorphic).

The desired equation implies (because G is generated by \sigma) that Kx is a sub-representation of L. In V, we have an explicit decomposition in direct sum
V=\bigoplus_{\chi} K\chi,
where \chi runs over all characters \chi\,:\, G\longrightarrow K (these really run over all characters of G over an algebraic closure of K, because K contains all n-th roots of unity and G has exponent n). So x (if it is to exist) must correspond to some character. The only thing to check now is whether we can find one with the right \sigma eigenvalue.

So we just see what happens (or we remember that it works). For a character \chi\in V such that \chi(\sigma) = \omega, and x\in L^{\times} the element corresponding to \chi under the G-isomorphism L\simeq V, we obtain \sigma(x)=\omega^{-1}x. But by easy character theory (recall that G is cyclic of order n) we can find \chi with \chi(\sigma)=\xi^{-1}, and we are done.

I noticed that Lang hides the formula in Hilbert’s Theorem 90: an element of norm 1 in a cyclic extension, with \sigma a generator of the Galois group, is of the form \sigma(x)/x for some non-zero x; this is applied to the n-th root of unity in L. The proof of Hilbert’s Theorem 90 uses something with the same flavor as the representation theory argument: Artin’s Lemma to the effect that the elements of G are linearly independent as linear maps on L. I haven’t completely elucidated the parallel however.

(P.S. Chambert-Loir’s blog has some recent very interesting posts on elementary Galois theory, which are highly recommended.)

A parity lemma of A. Irving

In his recent work on the divisor function in arithmetic progressions to smooth moduli, A. Irving proves the following rather amusing lemma (see Lemma 4.5 in his paper):

Lemma Let p be an odd prime number, let k\geq 1 be an integer and let h=(h_1,\ldots,h_k) be a k-tuple of elements of \mathbf{F}_p. For any subset I of \{1,\ldots, k\}, denote
h_I=\sum_{i\in I}{h_i},
and for any x\in\mathbf{F}_p, let
\nu_h(x)=|\{I\subset \{1,\ldots, k\}\,\mid\, h_I=x\}|
denote the multiplicity of x among the (h_I).
Then if none of the h_i is zero, there exists some x for which \nu_h(x) is odd.

I will explain two proofs of this result, first Irving’s, and then one that I came up with. I’m tempted to guess that there is also a proof using some graph theory, but I didn’t succeed in crafting one yet.

Irving’s proof. This is very elegant. Let \xi be a primitive p-th root of unity. We proceed by contraposition, hence assume that all multiplicities \nu_h(x) are even. Now consider the element
N=\prod_{i=1}^k(1+\xi^{h_i})
of the cyclotomic field K_p=\mathbf{Q}(\xi). By expanding and using the assumption we see that
N=\sum_{x\in\mathbf{F}_p} \nu_h(x)\xi^{x}\in 2\mathbf{Z}[\xi].
In particular, the norm (from K_p to \mathbf{Q}) of N is an even integer, but because p is odd, the norm of 1+\xi^{h_i} is known to be odd for all h_i\not=0. Hence some factor must have h_i=0, as desired.

A second proof. When I heard of Irving’s Lemma, I didn’t have his paper at hand (or internet), so I tried to come up with a proof. Here’s the one I found, which is a bit longer but maybe easier to find by trial and error.

First we note that
\sum_{x\in \mathbf{F}_p} \nu_h(x)=2^k
is even. In particular, since p is odd, there is at least some x with \nu_h(x) even.

Now we argue by induction on k\geq 1. For k=1, the result is immediate: there are two potential sums 0 and h_1, and so if h_1\not=0, there is some odd multiplicity.

Now assume that k\geq 2 and that the result holds for all (k-1)-tuples. Let h be a k-tuple, with no h_i equal to zero, and which has all multiplicities \nu_h(x) even. We wish to derive a contradiction. For this, let j=(h_1,\ldots,h_{k-1}). For any x\in\mathbf{F}_p, we have
\nu_h(x)=\nu_j(x)+\nu_j(x-h_k),
by counting separately those I with sum x which contain k or not.

Now take x such that \nu_j(x) is odd, which exists by induction. Our assumptions imply that \nu_j(x-h_k) is also odd. Then, iterating, we deduce that \nu_j(x-nh_k) is odd for all integers n\geq 0. But the map n\mapsto x-nh_k is surjective onto \mathbf{F}_p, since h_k is non-zero. Hence our assumption would imply that all multiplicities \nu_j(y) are odd, which we have seen is not the case… Hence we have a contradiction.