Here is yet another definition in mathematics where it seems that conventions vary (almost like the orientation of titles on the spines of books): is a Jordan block an upper-triangular or lower-triangular matrix? In other words, which of the matrices
$$\begin{pmatrix}\lambda&1\\0&\lambda\end{pmatrix},\quad\quad \begin{pmatrix}\lambda&0\\1&\lambda\end{pmatrix}$$
is a Jordan block of size 2 with respect to the eigenvalue $\lambda$?
I have the vague impression that most elementary textbooks in Germany (I taught linear algebra last year…) use one of the two conventions, while for instance Bourbaki (Algèbre, chapitre VII, page 34, définition 3, in the French edition) uses the other, as does Lang's "Algebra". Is it then a cultural dichotomy (again, like spines of books)?
I have to admit that I tend towards the lower-triangular version myself, because I find it much easier to remember a good model for a Jordan block: over a field $k$, take the vector space $V=k^n$ with basis $(e_1,\ldots,e_n)$, and consider the linear map $T$ defined by $T(e_i)=\lambda e_i+e_{i+1}$ (with the convention $e_{n+1}=0$). Then the matrix of $T$ with respect to the basis $(e_1,\ldots,e_n)$ is the Jordan block of size $n$ in its lower-triangular incarnation. The point here (for me) is that passing from $n$ to $n+1$ is nicely "inductive": the formula for the linear map is "independent" of $n$, and the bases for different $n$ are also nicely meshed. (In other words, if one finds the Jordan normal form using the classification of modules over principal ideal domains, one is likely to prefer the lower-triangular version that "comes out" more naturally…)
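For concreteness, here is a small numerical sketch of this model (the function name and the choice $\lambda=2$, $n=4$ are of course just illustrative): the matrix of $e_i\mapsto\lambda e_i+e_{i+1}$ in the standard basis is lower-triangular with $\lambda$ on the diagonal and $1$'s just below it, and subtracting $\lambda$ leaves a nilpotent matrix of exact order $n$, as a single Jordan block should.

```python
import numpy as np

def lower_jordan_block(lam, n):
    """Matrix of T(e_i) = lam * e_i + e_{i+1} (with e_{n+1} = 0),
    columns indexed by the standard basis e_1, ..., e_n."""
    J = np.zeros((n, n))
    for i in range(n):
        J[i, i] = lam        # the lam * e_i part
        if i + 1 < n:
            J[i + 1, i] = 1  # the e_{i+1} part: below the diagonal
    return J

J = lower_jordan_block(2.0, 4)
N = J - 2.0 * np.eye(4)  # nilpotent part
print(np.allclose(np.triu(J, 1), 0))                 # lower-triangular
print(np.allclose(np.linalg.matrix_power(N, 4), 0))  # N^4 = 0
print(np.allclose(np.linalg.matrix_power(N, 3), 0))  # but N^3 != 0
```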
Schinzel's "Hypothesis" for primes is (update: actually, not really, see the remark at the end…) the statement that if $f$ is an irreducible polynomial (say monic) in $\mathbf{Z}[X]$, and if there is no "congruence obstruction" (no prime dividing all the values of $f$), then the sequence of values $f(n)$ for integers $n\geq 1$ contains infinitely many primes. More precisely, one expects that the number of integers $n\leq x$ such that $f(n)$ is prime satisfies
$$|\{n\leq x\,:\, f(n)\text{ prime}\}|\sim c_f\,\frac{x}{\log x}$$
for some constant $c_f>0$ depending on $f$.
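As a quick illustration (a sketch; the two specific polynomials are my own choice, not taken from the discussion above), one can contrast $n^2+1$, which has no congruence obstruction and does produce primes steadily, with $n^2+n+2$, all of whose values are even, and which therefore never produces odd primes:

```python
from sympy import isprime

# n^2 + 1: no congruence obstruction, and primes do appear steadily
hits = [n for n in range(1, 1001) if isprime(n * n + 1)]
print(len(hits), hits[:6])  # starts with n = 1, 2, 4, 6, 10, 14

# n^2 + n + 2: since n^2 + n = n(n+1) is always even, every value is
# even (and > 2 for n >= 1): a congruence obstruction modulo 2
assert all((n * n + n + 2) % 2 == 0 for n in range(1, 1001))
print([n for n in range(1, 1001) if isprime(n * n + n + 2)])  # []
```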
Except if $f$ has degree one, this problem is very much open. But it makes sense to translate it to a more geometric setting of polynomials over finite fields, and this leads (as is often the case) to problems that are more tractable. The translation is straightforward: instead of $\mathbf{Z}$, one considers the ring $\mathbf{F}_q[T]$ of polynomials over a finite field with $q$ elements, instead of $f\in\mathbf{Z}[X]$, one considers a polynomial $f\in\mathbf{F}_q[T][X]$, and then the question is to determine asymptotically how many polynomials $g\in\mathbf{F}_q[T]$ of given degree are such that $f(g)$ is an irreducible polynomial in $\mathbf{F}_q[T]$.
The reason the problem becomes more accessible is that there is an algebraic criterion for a polynomial of degree $n$ with coefficients in a finite field to be irreducible: if we look at the natural action of the Frobenius automorphism on the set of roots of the polynomial, then the polynomial is irreducible if and only if this action "is" a cycle of length $n$. This is especially useful for the variant of the Schinzel problem where the size $q$ of the finite field is varying, whereas the degree of the polynomials remains fixed, since in that case the variation of the action of the Frobenius on the roots of the polynomial is encoded in a group homomorphism from the Galois group of the function field of the parameter space to the symmetric group on $n$ letters. (This principle goes back at least to work of S.D. Cohen on Hilbert's Irreducibility Theorem.)
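This criterion can be made quite concrete (a sketch over a prime field $\mathbf{F}_p$; the function names are mine): a monic $h$ of degree $n$ is irreducible exactly when the Frobenius $x\mapsto x^p$ moves every root along an orbit of full length $n$, which translates into the conditions $X^{p^n}\equiv X\pmod{h}$ (every orbit length divides $n$) and $\gcd\bigl(X^{p^{n/\ell}}-X,\,h\bigr)=1$ for every prime $\ell\mid n$ (no orbit is shorter). This is Rabin's irreducibility test:

```python
def trim(a, p):
    """Reduce coefficients mod p (constant term first), strip leading zeros."""
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def polrem(a, b, p):
    """Remainder of a modulo b in F_p[X] (b nonzero)."""
    a, b = trim(a, p), trim(b, p)
    inv = pow(b[-1], p - 2, p)  # inverse of the leading coefficient of b
    while len(a) >= len(b):
        c, s = a[-1] * inv % p, len(a) - len(b)
        a = [(a[i] - c * b[i - s]) % p if i >= s else a[i] for i in range(len(a))]
        a = trim(a, p)
    return a

def polsub(a, b, p):
    m = max(len(a), len(b))
    a, b = a + [0] * (m - len(a)), b + [0] * (m - len(b))
    return trim([x - y for x, y in zip(a, b)], p)

def polmulmod(a, b, m, p):
    r = [0] * max(len(a) + len(b) - 1, 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] = (r[i + j] + x * y) % p
    return polrem(r, m, p)

def polpowmod(a, e, m, p):
    """a^e mod m by repeated squaring."""
    r = polrem([1], m, p)
    while e:
        if e & 1:
            r = polmulmod(r, a, m, p)
        a = polmulmod(a, a, m, p)
        e >>= 1
    return r

def polgcd(a, b, p):
    while b:
        a, b = b, polrem(a, b, p)
    return a

def prime_divisors(n):
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def is_irreducible(h, p):
    """Rabin's test: h is irreducible over F_p iff the Frobenius acts on
    its roots as a single cycle of length n = deg h."""
    h = trim(h, p)
    n = len(h) - 1
    if n <= 1:
        return n == 1
    x = [0, 1]
    # every root must lie in F_{p^n}: X^{p^n} = X mod h ...
    if polpowmod(x, p ** n, h, p) != polrem(x, h, p):
        return False
    # ... and no root may lie in a proper subfield F_{p^{n/l}}
    for l in prime_divisors(n):
        g = polgcd(polsub(polpowmod(x, p ** (n // l), h, p), x, p), h, p)
        if len(g) != 1:
            return False
    return True

# coefficients listed from the constant term up
print(is_irreducible([1, 0, 1], 3))     # x^2 + 1 over F_3: True
print(is_irreducible([2, 0, 1], 3))     # x^2 + 2 = (x-1)(x+1) over F_3: False
print(is_irreducible([1, 1, 0, 1], 2))  # x^3 + x + 1 over F_2: True
```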
If we apply this principle in the Schinzel setting, this means that we consider specialized polynomials $f(g)$ for some fixed polynomial $f$, where $g$ runs over polynomials of a fixed degree, but $q$ ranges over powers of a fixed prime. "Generically", the polynomial $f(g)$ has some fixed degree $n$, and is squarefree. If we interpret the parameter space $U$ of such $g$ geometrically, the content of the previous paragraph is that we have a group homomorphism
$$\pi_1(U)\longrightarrow \mathfrak{S}_n$$
from the fundamental group of $U$ to the symmetric group. Then the Chebotarev Density Theorem solves, in principle, the problem of counting the number of irreducible specializations in the large $q$ limit: essentially (omitting the distinction between geometric and arithmetic fundamental groups), the asymptotic proportion of $g$ such that $f(g)$ is irreducible converges as $q\rightarrow+\infty$ to the proportion, in the image of the homomorphism, of the elements that are $n$-cycles in $\mathfrak{S}_n$. If the homomorphism is surjective, then this means that the probability that $f(g)$ is irreducible is about $1/n$. This is the expected answer in many cases, because this is also the probability that a random polynomial of degree $n$ is irreducible.
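The last claim can be checked by brute force (a sketch; the choice $q=3$, $n=4$ is arbitrary, and I use sympy's irreducibility test over $\mathbf{F}_p$): the proportion of monic polynomials of degree $n$ over $\mathbf{F}_q$ that are irreducible is $1/n$, up to an error of size $O(q^{-n/2})$.

```python
from itertools import product
from sympy import Poly
from sympy.abc import x

p, n = 3, 4
count = 0
for coeffs in product(range(p), repeat=n):
    # monic polynomial x^n + c_{n-1} x^{n-1} + ... + c_0 over F_p
    f = x**n + sum(c * x**i for i, c in enumerate(coeffs))
    if Poly(f, x, modulus=p).is_irreducible:
        count += 1

print(count, p ** n, count / p ** n, 1 / n)
# 18 of the 81 monic quartics over F_3 are irreducible:
# 0.2222..., already close to 1/4
```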
All this has been used by a number of people (including Hall, Pollack, Bary-Soroker, and most successfully Entin). However, there is a nice geometric interpretation that I haven't seen elsewhere. To see it, we go back to $f(g)$ and the action of Frobenius on its roots that will determine if $f(g)$ is irreducible. A root of $f(g)$ is an element $x$ such that
$$f(x,g(x))=0,$$
where we view $f$ as a two-variable polynomial. In other words, $x$ is the first coordinate of a point $(x,g(x))$ that belongs to the intersection of the graph of $g$ in the plane, and the plane affine curve $C$ with equation $f(x,y)=0$. Since the Frobenius will permute these intersection points in the same way that it permutes the roots of $f(g)$, we can interpret the Schinzel Problem, in that context, as asking about the "variation" of this Galois action as $g$ varies and the curve $C$ is fixed.
This point of view immediately suggests some generalizations: there is no reason to work over a finite field (any field will do); the base curve (which is implicitly the affine line where polynomials live) can be changed to another (open) curve $U$; the point at infinity, where polynomials have their single pole, might also be changed to any effective divisor with support the complement of $U$ in its smooth projective model (e.g., allowing poles at $0$ and $\infty$ on the projective line); and $C$ may be any (non-vertical) curve in the plane over $U$. For instance (to see that this generalization is not pointless), take any curve $C$, and let $h$ denote the second coordinate, viewed as a function on $C$. Then the intersection of the graph of a function $g$ on $U$ and $C$ is the set of zeros of $g-h$ on $C$. The problem becomes something like figuring out the "generic" Galois group of the splitting field of this set of zeros. (E.g., the Galois group of the set of zeros of a complicated function on an elliptic curve…)
In fact this special case was (with different motivation and terminology) considered by Katz in his book "Twisted L-functions and monodromy" (see Chapter 9). Katz shows that if the (fixed) effective divisor used to define the poles of the functions considered has large enough degree in terms of the genus $g$ of the smooth projective model of $C$, then the image of Galois is the full symmetric group (his proof is rather nice, using character sums on the Jacobian…)
The general case, on the other hand, does not seem to have been considered before. In the recent note that I've written on the subject, I use quite elementary arguments with Lefschetz pencils / Morse-like functions (again inspired by results of Katz and Katz–Rains) to show that, under very general conditions, the image of the fundamental group is again the full symmetric group. This gives the expected asymptotic for this geometric Schinzel problem, in this generality, over finite fields. (In the classical case, this was essentially done by Entin, though the conditions of applicability are not exactly the same.)
As I mention at the end of the note, the next step is of course to think about the fixed finite field case, where the degree of the polynomials tends to infinity. This seems, even geometrically, to be quite an interesting problem…
[Update: after I wrote this post, I remembered that in fact the (qualitative) problem of representing primes with one polynomial that I consider here is actually Bunyakovsky's Problem, and that the Schinzel Hypothesis is the qualitative statement for a finite set of polynomials… The quantitative versions of both are usually called the Bateman–Horn conjecture. So my terminology is multiply inaccurate…]
Bagchi's Theorem is a functional version of earlier results of Bohr and Jessen related to the statistical properties of the Riemann zeta function on a vertical line between the critical line and the region of absolute convergence. It seems that it is not as well-known as it could be, partly because Bagchi proved it in his thesis, and did not publish a paper with this result (his only related paper explicitly states that he removed the probabilistic language that a referee did not like). It seems therefore useful to describe the result. I will then sketch the proof I gave last semester…
Consider an open disc $D$ contained in the region $1/2<\mathrm{Re}(s)<1$ (other compact regions may be considered, for instance an open rectangle). For any real number $t$, we can look at the function $s\mapsto \zeta(s+it)$ on $D$. This is a holomorphic function on $D$, continuous on the closed disc $\bar{D}$. What kind of functions arise this way? Bagchi proved the following (this is essentially Theorem 3.4.11 in his thesis):
Theorem. Let $H$ denote the Banach space of holomorphic functions on $D$ which are continuous on the closed disc. For $T>0$, define a probability measure $\mu_T$ on $H$ to be the law of the random variable $t\mapsto (s\mapsto \zeta(s+it))$, where $t$ is uniformly distributed on $[0,T]$. Then $\mu_T$ converges in law, as $T\rightarrow+\infty$, to the random holomorphic function
$$Z(s)=\prod_p \bigl(1-X_p\, p^{-s}\bigr)^{-1},$$
where $(X_p)_p$ is a sequence of independent random variables indexed by primes, all uniformly distributed on the unit circle.
This is relatively easy to motivate: if we could use the Euler product
$$\zeta(s+it)=\prod_p \bigl(1-p^{-s-it}\bigr)^{-1}$$
in $D$, then we would be led to an attempt to understand the probabilistic behavior of the sequence $(p^{-it})_p$, viewed as a random variable on $[0,T]$ with values in the infinite product $\mathbf{T}$ of copies of the unit circle indexed by primes. This is a compact topological group, and the easy answer (using the Weyl criterion) is simply that this sequence converges to the Haar measure on $\mathbf{T}$. In other words, the random sequence $(p^{-it})_p$ converges in law to a sequence $(X_p)_p$ of independent, uniform, random variables on the unit circle. Then it is natural to expect that $\zeta(s+it)$ should converge to the random function $Z(s)$, which is obtained formally by replacing $p^{-it}$ by its limit $X_p$.
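The Weyl criterion step can be checked numerically (a sketch; the function name is mine): a nontrivial character of $\mathbf{T}$, evaluated along the sequence $(p^{-it})_p$, averages $e^{-iat}$ over $t\in[0,T]$ with $a=\sum_p m_p\log p$, and $a\neq 0$ by unique factorization (the $\log p$ are linearly independent over $\mathbf{Q}$), so the average decays like $1/T$:

```python
import cmath
import math

def char_average(ms, primes, T, steps=200000):
    """Average over t in [0, T] of the character
    t -> prod_p p^{-i m_p t} = exp(-i t sum_p m_p log p),
    computed by the midpoint rule."""
    a = sum(m * math.log(p) for m, p in zip(ms, primes))
    h = T / steps
    s = sum(cmath.exp(-1j * a * (k + 0.5) * h) for k in range(steps))
    return abs(s / steps)

# nontrivial character (a = 2 log 2 - log 3 = log(4/3) != 0): average -> 0
print(char_average([2, -1], [2, 3], T=1000.0))
# trivial character: the average stays equal to 1
print(char_average([0, 0], [2, 3], T=1000.0))
```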
Bagchi's proof is somewhat intricate, in comparison with this heuristic justification, especially if one notices that if $D$ is replaced by a compact region in the domain of absolute convergence, then the same idea applies, and is a completely rigorous proof (one need only observe that the assignment of an Euler product
$$(x_p)_p\mapsto \prod_p \bigl(1-x_p\, p^{-s}\bigr)^{-1}$$
to a sequence of complex numbers of modulus one is a continuous operation in the region of absolute convergence).
The proof I give in my script tries to remain closer to the basic intuition, and is indeed less involved (it avoids both a use of the pointwise ergodic theorem that Bagchi required and any use of tightness or weak-compactness). It makes it easy to see exactly what arithmetic ingredients are needed, beyond the convergence in law of $(p^{-it})_p$ to the Haar measure on $\mathbf{T}$. Roughly speaking, it goes as follows:
1. One checks that the random Euler product $Z(s)$ does exist (as an $H$-valued random variable), and that it has the Dirichlet series expansion
$$Z(s)=\sum_{n\geq 1} X(n)\, n^{-s},$$
converging for $\mathrm{Re}(s)>1/2$ almost surely, where $(X(n))_n$ is defined as the totally multiplicative extension of $(X_p)_p$. This is done as Bagchi did, using fairly standard probability theory and elementary facts about Dirichlet series.
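In the region of absolute convergence this identity between the random Euler product and its Dirichlet series can be verified directly (a numerical sketch, with one fixed random choice of the $X_p$ and truncations of my own: the product runs over primes up to $47$, and the series over $47$-smooth integers):

```python
import cmath
import random

random.seed(1)
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
# one independent uniform point on the unit circle per prime
X = {p: cmath.exp(2j * cmath.pi * random.random()) for p in primes}

s = 2.0  # inside the region of absolute convergence Re(s) > 1

# truncated random Euler product over the chosen primes
prod = 1 + 0j
for p in primes:
    prod /= 1 - X[p] * p ** (-s)

# Dirichlet series over 47-smooth integers, with X(n) the totally
# multiplicative extension of (X_p): X(n) = prod_p X_p^{v_p(n)}
total = 0j
for n in range(1, 100000):
    m, Xn = n, 1 + 0j
    for p in primes:
        while m % p == 0:
            m //= p
            Xn *= X[p]
    if m == 1:  # all prime factors of n are <= 47
        total += Xn * n ** (-s)

print(abs(prod - total))  # tiny: only the tail of 47-smooth n >= 10^5 remains
```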
2. One shows that $Z(s)$ has polynomial growth on vertical lines for $\mathrm{Re}(s)>1/2$. This is again mostly elementary probability with a bit of Dirichlet series theory.
3. Consider next smoothed partial sums of $Z$, of the type
$$Z_N(s)=\sum_{n\geq 1} X(n)\,\varphi\Bigl(\frac{n}{N}\Bigr)\, n^{-s},$$
where $\varphi$ is a compactly supported test function with $\varphi(0)=1$. Using again standard techniques (including Cauchy's formula for holomorphic functions), one proves that
$$\mathbf{E}\Bigl(\sup_{s\in D}|Z(s)-Z_N(s)|\Bigr)\ll N^{-\delta}$$
for some $\delta>0$.
4. One next shows that the smoothed partial sums $\zeta_N$ of the zeta function approximate it on average: the quantity
$$\mathbf{E}_T\Bigl(\sup_{s\in D}\bigl|\zeta(s+it)-\zeta_N(s+it)\bigr|\Bigr)$$
becomes arbitrarily small when $N$ is large and then $T$ is large enough in terms of $N$ (a secondary term arises because of the pole of $\zeta$ at $s=1$), where $\mathbf{E}_T(\cdot)$ denotes the expectation with respect to the uniform measure on $[0,T]$. This step is also in Bagchi's proof, and is essentially the only place where a specific property of the Riemann zeta function is needed: one requires the boundedness on average of $\zeta$ in vertical strips to the right of the critical line. The standard proof of this uses the Cauchy inequality and the mean-value property
$$\frac{1}{T}\int_0^T |\zeta(\sigma+it)|^2\, dt\ll 1$$
for any fixed $\sigma$ with $1/2<\sigma<1$. It is here that the bottleneck lies if one wishes to generalize Bagchi's Theorem to any "reasonable" family of $L$-functions.
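One can see the mean-value property numerically (a sketch using mpmath; the choice $\sigma=1.25$ is mine, inside the region of absolute convergence, where the limit of the mean square is provably $\sum_n n^{-2\sigma}=\zeta(2\sigma)$ and the numerics are easy to certify; the arithmetic input above is precisely that the same boundedness persists for $1/2<\sigma<1$):

```python
from mpmath import zeta

sigma, T, step = 1.25, 500.0, 0.5
n_pts = int(T / step)
# midpoint rule for (1/T) * int_0^T |zeta(sigma + it)|^2 dt
mean_sq = sum(abs(zeta(sigma + 1j * (k + 0.5) * step)) ** 2
              for k in range(n_pts)) / n_pts
predicted = zeta(2 * sigma)  # the diagonal contribution: sum_n n^{-2 sigma}
print(float(mean_sq), float(predicted))
```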
5. Finally, we just use the definition of convergence in law: for any continuous bounded function $f\colon H\rightarrow\mathbf{C}$, we should prove that
$$\mathbf{E}(f(\zeta_T))\longrightarrow \mathbf{E}(f(Z)),$$
where $\zeta_T$ is the $H$-valued random variable giving the translates of $\zeta$ (as in the theorem), and $Z$ is the random Dirichlet series. The minor tweak that is useful to notice (and that I wasn't consciously aware of before) is that one may assume that $f$ is Lipschitz: there exists a constant $C\geq 0$ such that
$$|f(g_1)-f(g_2)|\leq C\,\|g_1-g_2\|_\infty$$
for all $g_1$, $g_2$ in $H$
(this is hidden in standard references, e.g., Billingsley's book, in the proof that one may assume that $f$ is uniformly continuous; the functions used to prove this are in fact Lipschitz…).
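For completeness, here is the standard approximation argument (my summary of the usual portmanteau setup) explaining why Lipschitz test functions are enough: indicators of closed sets can be approximated monotonically by Lipschitz functions.

```latex
\text{For a closed set } F\subset H, \text{ put } f_k(x)=\max\bigl(0,\,1-k\,d(x,F)\bigr):
\text{each } f_k \text{ is bounded by } 1,\ k\text{-Lipschitz, and } f_k\downarrow \mathbf{1}_F.
\text{If } \int f\,d\mu_T\to\int f\,d\mu \text{ for every bounded Lipschitz } f, \text{ then for each } k
\limsup_{T\to\infty}\mu_T(F)\ \leq\ \lim_{T\to\infty}\int f_k\,d\mu_T\ =\ \int f_k\,d\mu,
\text{and letting } k\to\infty \text{ gives } \limsup_{T\to\infty}\mu_T(F)\leq\mu(F),
\text{which is the portmanteau criterion for convergence in law.}
```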
Now pick some parameter $N$, and write
$$\mathbf{E}(f(\zeta_T))-\mathbf{E}(f(Z))=\mathbf{E}\bigl(f(\zeta_T)-f(\zeta_{N,T})\bigr)+\Bigl(\mathbf{E}(f(\zeta_{N,T}))-\mathbf{E}(f(Z_N))\Bigr)+\mathbf{E}\bigl(f(Z_N)-f(Z)\bigr),$$
where $\zeta_{N,T}$ and $Z_N$ denote the smoothed partial sums as above. Fix $\varepsilon>0$. For some fixed $N$ big enough, the last term is less than $\varepsilon$ by Step 3 and the Lipschitz property of $f$, and the first is at most $\varepsilon$ for all $T$ large enough, by Step 4. For this fixed $N$, the middle term tends to $0$ as $T$ tends to infinity because of the convergence in law of $(p^{-it})_p$ to $(X_p)_p$: the sums defining the truncations are finite, so there is no convergence issue. So for all $T$ large enough, we will get
$$\bigl|\mathbf{E}(f(\zeta_T))-\mathbf{E}(f(Z))\bigr|\leq 3\varepsilon,$$
which finishes the proof.