Mathematics – Page 7 – E. Kowalski's blog

Jordan blocks

Here is yet another definition in mathematics where it seems that conventions vary (almost like the orientation of titles on the spines of books): is a Jordan block an upper-triangular or lower-triangular matrix? In other words, which of the matrices

$A_1=\begin{pmatrix}\alpha & 1\\0&\alpha\end{pmatrix},\quad\quad A_2=\begin{pmatrix} \alpha & 0\\1&\alpha\end{pmatrix}$

is a Jordan block of size 2 with respect to the eigenvalue $\alpha$ ?

I have the vague impression that most elementary textbooks in Germany (I taught linear algebra last year…) use $A_1$ , but for instance Bourbaki (Algèbre, chapitre VII, page 34, définition 3, in the French edition) uses $A_2$ , and so does Lang’s “Algebra”. Is it then a cultural dichotomy (again, like spines of books)?

I have to admit that I tend towards $A_2$ myself, because I find it much easier to remember a good model for a Jordan block: over the field $K$ , take the vector space $V=K[X]/X^nK[X]$ , and consider the linear map $u\colon V\to V$ defined by $u(P)=\alpha P+XP$ . Then the matrix of $u$ with respect to the basis $(1,X,\ldots,X^{n-1})$ is the Jordan block in its lower-triangular incarnation. The point here (for me) is that passing from $n$ to $n+1$ is nicely “inductive”: the formula for the linear map $u$ is “independent” of $n$ , and the bases for different $n$ are also nicely meshed. (In other words, if one finds the Jordan normal form using the classification of modules over principal ideal domains, one is likely to prefer the lower-triangular version that “comes out” more naturally…)
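As a small illustration of this model (a sketch only: the function name `jordan_block_from_model` and the sample value $\alpha=5$ are choices made here, not part of any standard library), one can write down the matrix of $u$ column by column, using $u(X^j)=\alpha X^j+X^{j+1}$ and $X^n=0$ in the quotient:

```python
def jordan_block_from_model(alpha, n):
    """Matrix of u(P) = alpha*P + X*P on K[X]/(X^n), in the basis (1, X, ..., X^(n-1)).

    Column j holds the coordinates of u(X^j) = alpha*X^j + X^(j+1),
    where X^n is zero in the quotient.
    """
    matrix = [[0] * n for _ in range(n)]
    for j in range(n):
        matrix[j][j] = alpha          # coefficient of X^j in u(X^j)
        if j + 1 < n:
            matrix[j + 1][j] = 1      # coefficient of X^(j+1) in u(X^j)
    return matrix


block2 = jordan_block_from_model(5, 2)   # [[5, 0], [1, 5]], i.e. the shape of A_2
block3 = jordan_block_from_model(5, 3)

# The size-2 block sits in the upper-left corner of the size-3 block:
# this is the "inductive" meshing of the bases mentioned above.
corner = [row[:2] for row in block3[:2]]
```

The ones land below the diagonal because multiplication by $X$ sends each basis vector to the next one.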

A geometric interpretation of Schinzel’s Hypothesis

Schinzel’s “Hypothesis” for primes is (update: actually, not really, see the remark at the end…) the statement that if $F$ is an irreducible polynomial (say monic) in $\mathbf{Z}[Y]$, and if there is no “congruence obstruction”, then the sequence of values $F(n)$ for integers $n\geq 1$ contains infinitely many primes. More precisely, one expects that the number $\pi_F(x)$ of integers $1\leq n\leq x$ such that $F(n)$ is prime satisfies
$\pi_F(x)\sim c_F\frac{x}{\log x},$
for some (explicitly predicted) constant $c_F>0$, sometimes called the “singular series”.

Except if $F$ has degree one, this problem is very much open. But it makes sense to translate it to a more geometric setting of polynomials over finite fields, and this leads (as is often the case) to problems that are more tractable. The translation is straightforward: instead of $\mathbf{Z}$ , one considers the ring $A=\mathbf{F}_q[X]$ of polynomials over a finite field $\mathbf{F}_q$ with $q$ elements, instead of $F$ , one considers a polynomial $F\in A[Y]=\mathbf{F}_q[X,Y]$ , and then the question is to determine asymptotically how many polynomials $f\in A$ of given degree $n\geq 1$ are such that $F(f)\in A$ is an irreducible polynomial.

The reason the problem becomes more accessible is that there is an algebraic criterion for a polynomial $f$ with coefficients in a finite field $\mathbf{F}_q$ to be irreducible: if we look at the natural action of the Frobenius automorphism $x\mapsto x^q$ on the set of roots of the polynomial, then $f$ is irreducible if and only if this action “is” a cycle of length $\deg(f)$ . This is especially useful for the variant of the Schinzel problem where the size of the finite field is varying, whereas the degree $n$ of the polynomials $f$ remains fixed, since in that case the variation of the action of the Frobenius on the roots of the polynomial is encoded in a group homomorphism from the Galois group of the function field of the parameter space to the symmetric group on $n$ letters. (This principle goes back at least to work of S.D. Cohen on Hilbert’s Irreducibility Theorem).

If we apply this principle in the Schinzel setting, this means that we consider specialized polynomials $F(f)$ for some fixed polynomial $F\in A[Y]=\mathbf{F}_q[X][Y]$ , where $f$ runs over polynomials of a fixed degree $d\geq 1$ , but $q$ ranges over powers of a fixed prime. “Generically”, the polynomial $F(f)$ has some fixed degree $n$ , and is squarefree. If we interpret the parameter space $X_d$ geometrically, the content of the previous paragraph is that we have a group homomorphism
$\rho\colon \pi_1(X_d)\to \mathfrak{S}_n,$
from the fundamental group of $X_d$ to the symmetric group. Then the Chebotarev Density Theorem solves, in principle, the problem of counting the number of irreducible specializations in the large $q$ limit: essentially (omitting the distinction between geometric and arithmetic fundamental groups), the asymptotic proportion of $f\in X_d(\mathbf{F}_q)$ such that $F(f)$ is irreducible converges as $q\to+\infty$ to the proportion, in the image of $\rho$ , of the elements that are $n$ -cycles in $\mathfrak{S}_n$ . If the homomorphism $\rho$ is surjective, then this means that the probability that $F(f)$ is irreducible is about $1/n$ . This is the expected answer in many cases, because this is also the probability that a random polynomial of degree $n$ is irreducible.
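The two probabilities compared in the last sentences can be checked by a short computation. This is a sketch under stated assumptions: it uses Gauss's classical formula $\frac{1}{n}\sum_{d\mid n}\mu(d)q^{n/d}$ for the number of monic irreducible polynomials of degree $n$ over $\mathbf{F}_q$ (not stated in the text above), and all function names are chosen here for illustration.

```python
from fractions import Fraction
from math import factorial


def mobius(m):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0          # m is divisible by p^2
            result = -result
        p += 1
    if m > 1:
        result = -result          # one remaining prime factor
    return result


def count_monic_irreducibles(q, n):
    """Number of monic irreducible polynomials of degree n over F_q (Gauss's formula)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


def proportion_irreducible(q, n):
    """Proportion of irreducibles among the q^n monic polynomials of degree n."""
    return Fraction(count_monic_irreducibles(q, n), q ** n)


def proportion_n_cycles(n):
    """Proportion of n-cycles in the symmetric group on n letters: (n-1)!/n!."""
    return Fraction(factorial(n - 1), factorial(n))


# For n = 4 and q = 101 the first proportion is 1/4 - 1/(4 q^2),
# already very close to the proportion 1/4 of 4-cycles.
gap = proportion_n_cycles(4) - proportion_irreducible(101, 4)
```

For fixed $n$ the gap is of size $O(q^{-1})$ at worst, which is consistent with the large $q$ limit described above.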

All this has been used by a number of people (including Hall, Pollack, Bary-Soroker, and most successfully Entin). However, there is a nice geometric interpretation that I haven’t seen elsewhere. To see it, we go back to $F(f)$ and the action of Frobenius on its roots that will determine if $F(f)$ is irreducible. A root of $F(f)$ is an element $x$ such that
$F(f)(x)=F(x,f(x))=0$
where we view $F$ as a two-variable polynomial. In other words, $x$ is the first coordinate of a point $(x,f(x))$ that belongs to the intersection of the graph of $f$ in the plane, and the plane affine curve $S$ with equation $F(x,y)=0$ . Since the Frobenius will permute these intersection points in the same way that it permutes the roots of $F(f)$ , we can interpret the Schinzel Problem, in that context, as asking about the “variation” of this Galois action as $f$ varies and the curve $S$ is fixed.

This point of view immediately suggests some generalizations: there is no reason to work over a finite field (any field will do), the base curve (which is implicitly the affine line where polynomials live) can be changed to another (open) curve $U$ ; the point at infinity, where polynomials have their single pole, might also be changed to any effective divisor with support the complement of $U$ in its smooth projective model (e.g., allowing poles at $0$ and $\infty$ on the projective line); and $S$ may be any (non-vertical) curve in $U\times\mathbf{A}^1$ . For instance (to see that this generalization is not pointless), take any curve $U$ , and define $S=U\times\{0\}$ . Then the intersection of the graph of a function $f$ on $U$ and $S$ is the set of zeros of $f$ . The problem becomes something like figuring out the “generic” Galois group of the splitting field of this set of zeros. (E.g., the Galois group of a complicated elliptic function defined over $\mathbf{Q}$ …)

In fact this special case was (with different motivation and terminology) considered by Katz in his book “Twisted L-functions and monodromy” (see Chapter 9). Katz shows that if the (fixed) effective divisor used to define the poles of the functions considered has degree $\geq 2g+1$ , where $g$ is the genus of the smooth projective model of $U$ , then the image of Galois is the full symmetric group (his proof is rather nice, using character sums on the Jacobian…)

The general case, on the other hand, does not seem to have been considered before. In the recent note that I’ve written on the subject, I use quite elementary arguments with Lefschetz pencils / Morse-like functions (again inspired by results of Katz and Katz-Rains) to show that in very general conditions, the image of the fundamental group is again the full symmetric group. This gives the asymptotic for this geometric Schinzel problem in this generality over finite fields. (In the classical case, this was essentially done by Entin, though the conditions of applicability are not exactly the same).

I recently gave a talk about this in Berlin, and the slides might be a good introduction to the ideas of the proof for interested readers…

As I mention at the end of those slides, the next step is of course to think about the fixed finite field case, where the degree of the polynomials tends to infinity. This seems, even geometrically, to be quite an interesting problem…

[Update: after I wrote this post, I remembered that in fact the (qualitative) problem of representing primes with one polynomial that I consider here is actually Bunyakowski’s Problem, and that the Schinzel Hypothesis is the qualitative statement for a finite set of polynomials… The quantitative versions of both are usually called the Bateman-Horn conjecture. So my terminology is multiply inaccurate…]

L-functions database!

People studying automorphic forms, automorphic representations, number fields, diophantine equations, function fields, algebraic curves, equidistribution and many other arithmetic objects (j’en passe, et des meilleurs), often end up with some “L-function” to deal with — indeed, probably equally often, with a whole family of them, sometimes not so well-behaved… These objects are fascinating, mystifying, exhilarating, random and possibly spooky. Where they really come from is still a mystery, even with buzzwords aplenty ringing around our ears. But one remarkable thing was already known to Euler and to Riemann: one can compute with L-functions. One impressive research project has been building, for quite a few years, a very sophisticated website presenting enormous amounts of data about L-functions of many kinds. The L-functions Database is now out of its beta status: go see it, and have a look at the list of editors to see who should be thanked for this amazing work!

Bagchi’s Theorem

Bagchi’s Theorem is a functional version of earlier results of Bohr and Jessen related to the statistical properties of the Riemann zeta function on a vertical line between the critical line and the region of absolute convergence. It seems that it is not as well-known as it could be, partly because Bagchi proved it in his thesis, and did not publish a paper with this result (his only related paper explicitly states that he removed the probabilistic language that a referee did not like). It seems therefore useful to describe the result. I will then sketch the proof I gave last semester…

Consider an open disc $D$ contained in the region $1/2<\mathrm{Re}(s)< 1$ (other bounded regions may be considered, for instance an open rectangle). For any real number $t$, we can look at the function $\zeta_t\colon s\mapsto \zeta(s+it)$ on $D$. This is a holomorphic function on $D$, continuous on the closed disc $\bar{D}$. What kind of functions arise this way? Bagchi proved the following (this is essentially Theorem 3.4.11 in his thesis):

Theorem. Let $H$ denote the Banach space of holomorphic functions on $D$ which are continuous on the closed disc. For $T>0$ , define a probability measure $\mu_T$ on $H$ to be the law of the random variable $t\mapsto \zeta_t$ , where $t$ is uniformly distributed on $[-T,T]$ . Then $\mu_T$ converges in law, as $T\to +\infty$ , to the random holomorphic function
$Z(s)=\prod_{p}(1-X_pp^{-s})^{-1}$ ,
where $(X_p)$ is a sequence of independent random variables indexed by primes, all uniformly distributed on the unit circle.

This is relatively easy to motivate: if we could use the Euler product
$\zeta(s+it)=\prod_p (1-p^{-s-it})^{-1}$
in $D$, then we would be led to an attempt to understand the probabilistic behavior of the sequence $(p^{-it})_p$, viewed as a random variable on $[-T,T]$ with values in the infinite product $\widehat{U}$ of copies of the unit circle indexed by primes. This is a compact topological group, and the easy answer (using the Weyl criterion) is simply that this sequence converges in law to the Haar measure on $\widehat{U}$. In other words, the random sequence $(p^{-it})$ converges in law to a sequence $(X_p)$ of independent, uniform, random variables on the unit circle. Then it is natural to expect that $\zeta_t$ should converge to the random function $Z(s)$, which is obtained formally by replacing $(p^{-it})$ by its limit $(X_p)$.
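The use of the Weyl criterion can be made concrete as follows. This is a sketch, with the function name `character_average` and the dictionary encoding of characters chosen here for illustration. A non-trivial character of $\widehat{U}$ sends $(x_p)$ to $\prod_p x_p^{a_p}$ for integers $a_p$, almost all zero. Evaluated at $(p^{-it})$ it equals $e^{-it\theta}$ with $\theta=\sum_p a_p\log p$, which is non-zero by unique factorization, and its average over $[-T,T]$ is $\sin(T\theta)/(T\theta)$, which tends to $0$.

```python
from math import log, sin


def character_average(exponents, T):
    """Average over t uniform in [-T, T] of prod_p (p^(-it))^(a_p).

    `exponents` maps primes p to integers a_p. The product equals
    exp(-i*t*theta) with theta = sum_p a_p*log(p), and its average over
    [-T, T] is sin(T*theta)/(T*theta) (the imaginary part cancels).
    """
    theta = sum(a * log(p) for p, a in exponents.items())
    if theta == 0:
        return 1.0                # trivial character: the average is 1
    return sin(T * theta) / (T * theta)


# Character (x_2, x_3, ...) -> x_2 * x_3^(-1): theta = log 2 - log 3.
theta = log(2) - log(3)
averages = [character_average({2: 1, 3: -1}, T) for T in (1e2, 1e4, 1e6)]
```

Each average is bounded in absolute value by $1/(T|\theta|)$, so the decay is visible already for moderate $T$.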

Bagchi’s proof is somewhat intricate, in comparison with this heuristic justification, especially if one notices that if $D$ is replaced by a compact region in the domain of absolute convergence, then the same idea applies, and is a completely rigorous proof (one need only observe that the assignment of an Euler product
$\prod_p (1-x_pp^{-s})^{-1}$
to a sequence $(x_p)$ of complex numbers of modulus one is a continuous operation in the region of absolute convergence.)

The proof I give in my script tries to remain closer to the basic intuition, and is indeed less involved (it avoids both a use of the pointwise ergodic theorem that Bagchi required and any use of tightness or weak-compactness). It makes it easy to see exactly what arithmetic ingredients are needed, beyond the convergence in law of $(p^{-it})_p$ to the Haar measure on $\widehat{U}$ . Roughly speaking, it goes as follows:

1. One checks that the random Euler product $Z(s)$ does exist (as an $H$-valued random variable), and that it has the Dirichlet series expansion
$Z(s)=\sum_{n\geq 1} X_nn^{-s}$
converging for $\mathrm{Re}(s)> 1/2$ almost surely, where $(X_n)_{n\geq 1}$ is defined as the totally multiplicative extension of $(X_p)$. This is done as Bagchi did using fairly standard probability theory and elementary facts about Dirichlet series.
2. One shows that $Z(s)$ has polynomial growth on vertical lines for $\mathrm{Re}(s)> 1/2$. This is again mostly elementary probability with a bit of Dirichlet series theory.
3. Consider next smoothed partial sums of $Z(s)$, of the type
$Z^{(N)}(s)=\sum_{n\geq 1}X_n\varphi(n/N)n^{-s},$
where $\varphi$ is a compactly supported test function with $\varphi(0)=1$. Using again standard techniques (including Cauchy’s formula for holomorphic functions), one proves that
$\mathbf{E}(\sup_{s\in D}|Z(s)-Z^{(N)}(s)|)\ll N^{-\delta}$
for some $\delta>0$.
4. One next shows that the smoothed partial sums of the zeta function
$\zeta^{(N)}(s)=\sum_{n\geq 1}\varphi(n/N)n^{-s}$
satisfy
$\mathbf{E}_T(\sup_{s\in D}|\zeta(s+it)-\zeta^{(N)}(s+it)|)\ll N^{-\delta}+NT^{-1}$
(the second term arises because of the pole), where $\mathbf{E}_T(\cdot)$ denotes the expectation with respect to the uniform measure on $[-T,T]$. This step is also in Bagchi’s proof, and is essentially the only place where a specific property of the Riemann zeta function is needed: one requires the boundedness on average of $\zeta(s)$ in vertical strips to the right of the critical line. The standard proof of this uses the Cauchy inequality and the mean-value property
$\frac{1}{2T}\int_{-T}^T|\zeta(\sigma+it)|^2dt\to \zeta(2\sigma)$
for any fixed $\sigma$ with $\sigma> 1/2$. It is here that the bottleneck lies if one wishes to generalize Bagchi’s Theorem to any “reasonable” family of $L$-functions.
5. Finally, we just use the definition of convergence in law: for any continuous bounded function $f\colon H\to\mathbf{C}$, we should prove that
$\mathbf{E}_T(f(\zeta_T))\to \mathbf{E}(f(Z)),$
where $\zeta_T$ is the $H$-valued random variable giving the translates of $\zeta(s)$, and $Z$ is the random Dirichlet series. The minor tweak that is useful to notice (and that I wasn’t consciously aware of before) is that one may assume that $f$ is Lipschitz: there exists a constant $C$ such that
$|f(g_1)-f(g_2)|\leq C\sup_{s\in D}|g_1(s)-g_2(s)|$
(this is hidden in standard references, e.g., Billingsley’s, in the proof that one may assume that $f$ is uniformly continuous; the functions used to prove this are in fact Lipschitz…).

Now pick some parameter $N>0$ , and write
$|\mathbf{E}_T(f(\zeta_T))-\mathbf{E}(f(Z))|\leq A_1+A_2+A_3$ ,
where
$A_1=|\mathbf{E}_T(f(\zeta_T))- \mathbf{E}_T(f(\zeta_T^{(N)}))|\leq C\ \mathbf{E}_T(\sup_{s\in D}|\zeta(s+it)-\zeta^{(N)}(s+it)|),$
$A_2=|\mathbf{E}_T(f(\zeta_T^{(N)}))- \mathbf{E}(f(Z^{(N)}))|,$
$A_3=|\mathbf{E}(f(Z^{(N)}))- \mathbf{E}(f(Z))|\leq C\ \mathbf{E}(\sup_{s\in D}|Z(s)-Z^{(N)}(s)|).$
Fix $\varepsilon>0$. For some fixed $N=N_0$ big enough, $A_3$ is less than $\varepsilon$ by Step 3, and $A_1$ is at most $\varepsilon+N_0T^{-1}$ by Step 4. For this fixed $N_0$, $A_2$ tends to $0$ as $T$ tends to infinity because of the convergence in law of $(p^{-it})$ to $(X_p)$: the sums defining the truncations are finite, so there is no convergence issue. So for all $T$ large enough, we will get
$|\mathbf{E}_T(f(\zeta_T))- \mathbf{E}(f(Z))|\leq 4\varepsilon.$
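
To spell out the bookkeeping behind the constant $4\varepsilon$ (a sketch only: the Lipschitz constant $C$ and the implied constants of Steps 3 and 4 are assumed to have been absorbed into the choice of $N_0$, and $T_0$ is a name introduced here for the threshold on $T$): for all $T\geq T_0$, one has both $N_0T^{-1}\leq \varepsilon$ and $A_2\leq \varepsilon$, so that
$A_1+A_2+A_3\leq (\varepsilon+N_0T^{-1})+\varepsilon+\varepsilon\leq 4\varepsilon.$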

Jacques Ménard, author of Nicolas Bourbaki

My punning title about James Maynard must have given me somewhere the undeserved reputation of a Borges specialist, since I’ve just received a curious reworking of the story of Pierre Ménard.

The email address from which it came (jlb@limbo.ow) is probably not genuine, so I wonder who the author could be (the final note “Translated, from the Spanish, by H.A.H” is of course suggestive, but one would then like to see the original Spanish…)