Somewhat later than I had hoped, I have updated the script of my spectral theory course. The version currently found online is complete as far as the material I intended to put in is concerned, but there are a few places where I haven’t written down all details (in particular for the proof of the Weyl law for the Dirichlet Laplace operator in an open subset of Euclidean space). I am also aware of quite a few small problems in the last chapter on Quantum Mechanics, due partly to notation problems (for the Fourier transforms, and for “physical” versus mathematical normalizations). I will need to re-read the whole text carefully to correct this; on the other hand, thanks to lists of corrections that I have already received from a few students, the number of typos is much less than before… I will however continue updating the PDF file as I continue checking parts of the text.
What delayed this version for a long time was the write-up of the last section on “The interpretation of Quantum Mechanics”; of course it’s in some sense an extraneous part of the script, since spectral theory barely enters in it, but I found it important to at least try to connect the mathematical framework with the actual physics. (This partly explains all the reading I’ve done recently about these issues). It is equally obvious that I am not the most knowledgeable person for such a discussion, but after all, there are good authorities that claim that no one really understands this question anyway…
What I end up discussing contains however one little mathematical result, which is cute and interesting independently of its use in Quantum Mechanics; it is a theorem of S. Kochen and E.P. Specker which states the following:
There does not exist any map
where S2 is the sphere in R3 with the property that, whenever
are pairwise orthogonal unit vectors, we have
or in other words, two of the three values are equal to 1, and the other is equal to 0.
How this result enters into discussions of the interpretation of Quantum Mechanics is described by M. Jammer in his book on the subject (not the same as his book on the development of Quantum Mechanis, but another one, equally evanescent as far as the internet is concerned); more recently, J. Conway and S. Kochen have combined it with the Einstein-Podolsky-Rosen argument (or paradox) to derive what they call the “Free Will Theorem”, which is an even stronger version of the unpredictability of properties of Spin 1 particles (those to which the Kochen-Specker argument applies). Conway has given lectures in Princeton on this result and its history and consequences, which are available as videos online.
Coming back to the result above, considered purely from the mathematical point of view, it is interesting to notice that both the original proof and the version used by Conway-Kochen (which is due to A. Peres) show that the hypothetical map does not exist even for some finite sets of points on the sphere. It is of some interest to get a smallest possible set of such points. The proof I gave in the script, however, which is taken from Jammer’s book (who attributes it to R. Friedberg) is maybe theoretically slightly more complicated, but it is also somewhat more conceptual in that one doesn’t have to be puzzled so much at the reason why one finite set of vectors or another is really fundamental.
Who remembers the Mills number?
02.04.2009
One of the undetermined numbers in Les nombres remarquables is the Mills number (or numbers; this is not uniquely defined, as will be clear from the description below). I had somehow forgotten all about it, although I have now the memory that it was quite popular in the olden days (at least, I seem to remember that it cropped up in every other conversation back when I was reading that book 20 or more years ago), and I had not heard anything about it for about that long.
So, the Mills number is that (or any of the) amazing real number A>1 with the property that
is a prime number for every positive integer n.
As one can expect, the doubly-exponential growth means that it would be pointless to try to use this to produce prime numbers. And one may guess that the proof of the existence of such a number has little to do with primes, and should apply to many other sequences of positive integers.
This is indeed so, but not in a completely trivial manner. More precisely, what the proof shows is that, given an infinite subset S of integers, and a real number c>1, one can find B, depending of course on the set and on c, such that
provided the set S has the property that, for some real number
and all large enough x, the intersection
is not empty. In other words, since θ<1, there must be some element of the set in all “short” intervals (from some point on), where “short” has the usual meaning in analytic number theory: the length is a power less than 1 of the left-hand extremity.
(Note that the relation between c and θ shows that, if we know a suitable value of θ for S, then we can always find a value of c that works, always assuming θ<1.)
What about primes, then? Do primes exist in short intervals? The answer is, indeed, yes, and it has been known to be so since the work of Hoheisel in 1930, but this is by no means a triviality! Indeed, if one looks at the problem from too far away, analyzing the number of primes in such an interval with the “explicit formulas” in terms of zeros of the Riemann zeta function, then one gets the impression that one will prove
(which is the expected answer, because of the Prime Number Theorem) only for θ>1/2, and only by knowing that
which we know only for θ=1. This means that, from the point of view of immediate consequences of the location of zeros of the zeta function, having primes in short intervals is comparable with having a zero-free strip.
From this point of view, we see that the existence of the Mills number is quite an interesting fact. Moreover, the smaller the value of c one can take, the shorter the intervals we manage to find primes in. The value c=3 which I quoted at the beginning is possible because the current best result about primes in short intervals states that, for x large enough,
contains the “right number” of primes. (In fact, this allows any c>12/5). This result is due to Huxley, and hasn’t been improved since 1972; however, if one wants only the existence of a positive proportion among the right number of primes, Baker and Harman have the record value 0.534 (this was in 1996, and allows c>2.14…).
All the proofs since Hoheisel’s time depend crucially on a way to get around the Riemann Hypothesis known as “density theorems” for the zeros. This is a fairly inconvenient name, since “density” might suggest “lots and lots of zeros everywhere”, whereas the intent and purpose of density theorems is to show that, although there might be zeros off the critical line, or even close to 1 (which is were they would fight against the pole of the Riemann zeta function, which is the White Knight that tries to produce primes, glorious primes), there can not be too many. The precise argument is presented in Chapter 10 of my book with H. Iwaniec. Note that density theorems have many other applications: certain particularly subtle ones for Dirichlet characters (”log-free density theorems”), the first of which was proved by Linnik, are crucial to the known proofs of his marvelous theorem according to which, for some absolute constant C>0, the smallest prime P(q,a) congruent to a modulo q, for a coprime with q, satisfies
(The best result here allows you to take any C>5.5 for q large enough, due to Heath-Brown; the Generalized Riemann Hypothesis gives this for any C>2). If this remains too mundane — some people do not like primes in arithmetic progressions –, note that you need similar theorems for cusp forms to give an upper bound of the right order of magnitude for the rank of the Jacobian J0(q) of the modular curve X0(q) for q prime, a result of P. Michel and myself.
Now for the proof of the existence of the Mills number, in the generality of a set S containing elements in short intervals. I won’t give all details, but here’s a sketch:
(1) define b(1) to be the smallest element of S above the point after which all short intervals contain at least one element of S;
(2) define inductively b(n+1) to be such that
(3) show, using the condition
that if we define
we then have
and deduce that the limit B of xn exists, and gives the desired general Mills number…
Problems from the archive
20.08.2008
(The title of this post is based on WBGO’s nice late-Sunday show “Jazz from the archive”, which I used to follow when I was at Rutgers; the physical archive in question was that of the Institute of Jazz Studies of Rutgers University; the post itself is partly motivated by seeing Gil Kalai’s examples of his own “early” problems…)
The following problem of classical analysis of functions of one-complex variable has puzzled me for a long time (between 15 and 20 years), though I never really spent a lot of time on it — it has never had anything to do with my “real” research:
Let f,g be entire functions, and assume that for all r>0 we have
What can we say about f and g? Specifically, do there exist two real numbers a and b such that
for all z?
(Actually, the “natural” conclusion to ask would be: does there exist a linear operator T, from the space of entire functions to itself, such that Tf=g and such that
for all entire functions φ; indeed, the function g=Tf satisfies the condition any such “joint isometry” for the vector space of entire functions equipped with the sup-norms on all circles centered at the origin; but I have the impression that it is known that any such operator is of the form stated above).
My guess has been that the answer is Yes, but I must say the evidence is not outstandingly strong; mostly, the analogue is known when one uses L2 norms on the circles, since
if an are the Taylor coefficients of f. If those coincide with the same values for another entire function g (with coefficients bn) we get by unicity of Taylor expansions that
for all n, and the linear operator
is of course a joint isometry for the L2 norms on all circles, mapping f to g.
The only result I’ve seen that seems potentially helpful (though I never looked particularly hard; it was mentioned in an obituary notice of the Bulletin of the LMS, I think, that I read rather by chance around 1999) is a result of O. Blumenthal — who is today much better known for his work on Hilbert modular forms, his name surviving in Hilbert-Blumenthal abelian varieties. In this paper from 1907, Blumenthal studies the structure of the sup norms in general, and computes them explicitly for (some) polynomials of degree 2. Although the uniqueness question above is not present in his paper, one can deduce by inspection that it holds for these polynomials at least.
(I wouldn’t be surprised at all if this question had in fact been solved around that period of time, where complex functions were extensively studied in France and Germany in particular; that the few persons to whom I have mentioned it had not heard of it is not particularly surprising since this is one mathematical topic where what was the height of fashion is now become very obscure).
This is a follow-up to a comment I made on Tim Gowers’s post concerning the use of Zorn’s lemma (which I encourage students, in particular, to read if they have not done so yet). The issue was whether it is possible to write down concretely an unbounded (or in other words, not continuous) linear operator on a Banach space. I mentioned that I had been explained a few years ago that this is not in fact possible, in the same sense that it is not possible to “write down” a non-measurable function: indeed, any measurable linear map
where U is a Banach space and V is a normed vector space, is automatically continuous.
Incidentally, I had asked this to my colleague É. Matheron in Bordeaux because the question had arisen while translating W. Appel’s book of mathematics for physicists: in the chapters on unbounded linear operators (which are important in Quantum Mechanics), he had observed that those operators could often be written down, but only in the sense of a partially defined operator, defined on a dense subspace, and we wondered if the dichotomy “either unbounded and not everywhere defined, or everywhere defined, and continuous”, was a real theorem or not. In the sense that measurability is much weaker than the (not well defined) notion of “concretely given”, it is indeed a theorem.
Not only did Matheron tell me of this automatic continuity result, he gave me a copy of a short note of his (”A useful lemma concerning subseries convergence”, Bull. Austral. Math. Soc. 63 (2001), no. 2, 273–277), where this result is proved very quickly, as a consequence of a simple lemma which also implies a number of other well-known facts of functional analysis (the Banach-Steinhaus theorem, Schur’s theorem on the coincidence of weak and norm convergence for series in l1, and a few others). On the other hand, I don’t know who first proved the continuity result (Matheron says it is well-known but gives no reference).
The proof is short enough that I will present it; it is a nice source of exercises for a first course in functional analysis, provided some integration theory has been seen before (which I guess is always the case).
Here is the main lemma, due to Matheron, in a probabilistic rephrasing, and a slightly weaker version:
Main Lemma: Let G be a topological abelian group, and let An be an arbitrary sequence of measurable (Borel) subsets of G, and (gn) a sequence of elements of G. Assume that for every n and every g in G, either g is in An, or g-gn is in An.
Let moreover be given a sequence of independent Bernoulli random variables (Xn), defined on some auxiliary probability space.
Then, under the condition that the series
converges almost surely in G, there exists a subsequence (hn) of (gn) such that
converges and belongs to infinitely many An.
This is probably not easy to assimilate immediately, so let’s give the application to automatic continuity before sketching the proof. First, we recall that Bernoulli random variables are such that
Now, let T be as above, measurable. We argue by contradiction, assuming that T is not continuous. This implies in particular that for all n, T is not bounded on the ball with radius 2-n, so there exists a sequence (xn) in U such that
We apply the lemma with
The sets An are measurable, because T is assumed to be so. The triangle inequality shows that if x is not in An, then
so that x-xn is in An. (This shows where sequences of sets satisfying the condition of the Lemma arise naturally).
Finally, the series formed with the xn is absolutely convergent by construction, so the series “twisted” with Bernoulli coefficients are also absolutely convergent. Hence, all the conditions of the Main Lemma are satisfied, and we can conclude that there is a subsequence (yn) of the (xn) such that
exists, and is in An infinitely often; this means that
for infinitely many n. But this is impossible since T is defined everywhere!
Now here is the proof of the lemma. Consider the series
as a random variable, which is defined almost surely by assumption. Note that any value of Y is nothing but a sum of a subseries of the original series with terms gn. Let
so that the previous observation means that the desired conclusion is certainly implied by the condition
The event to study is
The sets CN are decreasing, so their probability is the limit of the probability of CN, and each contains (hence has probability at least equal to that of) BN. So if we can show that
(or any other positive constant) for all n, we will get
which gives the desired result. (In other words, we argue from a particularly simple case of the “difficult” direction in the Borel-Cantelli lemma).
Now let’s prove (*). We start with the identity
(for any n), which is a tautology. From the yet-unused assumption
we then conclude that
say. Therefore
.
But we claim that
Indeed, consider the random variables defined by
Then we obtain
but clearly the sequence (Zm) is also a sequence of independent Bernoulli random variables, so that
as desired. We are now done, since we have found that
which is (*).
(In probabilistic terms, I think the trick of using Zm has something to do with “exchangeable pairs”, but I’m not entirely sure; in analytic terms, it translates to an instance of the invariance of Haar measure by translation on the compact group (Z/2Z)N, as can be seen in the original write-up of Matheron).
Continuing the series of previous posts on what I called the Chebotarev invariant, here are some (easy) results on cyclic groups, which are certainly the simplest groups to consider!
Starting from the formula for c(G)
we obtain by a straightforward translation between maximal subgroups of Z/nZ and prime divisors of n (the subgroup corresponding to some p|n being given by pZ/nZ), the formula
which is not without some interest — it is both familiar-looking (a sum over divisors with a Möbius function involved –, yet subtly different — the factor 1/(1-1/d) which is not multiplicative in d.
To go further, the natural step is to expand this last term in geometric series, since each of the terms dk which then appear is multiplicative (this is the analogue of obtaining directly the formula explained in T. Tao’s comment on one of the earlier posts on combinatorial cancellation); exchanging the resulting sums and transforming the inner sums using multiplicativity, we get
which is rewritten most conveniently by isolating the first two terms:
and the point is that a trivial estimate follows
where the last series is convergent.
In fact, this last series, plus 1 (which can be considered to be the natural term k=1 since the Riemann zeta function has a pole at s=1), is a “known” constant, called the Niven constant N: its defining property is that it is the average value of the maximal power of a prime dividing n:
where
It is more or less obvious that the upper estimate above is best possible: if we consider n given by the product of the first k primes, and let k tend to infinity, we get a sequence of integers where c(Z/nZ) converges to 1+N.
The appearance of N here is, a posteriori at least, easy to explain: it is because
is both the probability that a “random” vector in Zk not be primitive (i.e., its coordinates are not coprime in their totality; this is what lies behind our computation), and the probability that a “random” integer is not k-power free (i.e, it is divisible by a k-th power; this is why N appears in the average of α(n)).
The numerical value of N is very easy to compute (using Pari for instance):
(and to show how mathematical knowledge has degenerated, I did not recognize the value N from a well-rounded knowledge of analytic number theory, but by searching for 0.70521, 1.70521 and 2.70521 with Google…)
So, the conclusion is that, for a cyclic group with many subgroups, we may have to get close to 3 elements before we can conclude that we have generators of the whole group. Whether that is surprising or not will of course depend on the reader.
Now, it is natural to go somewhat further if we consider the definition of the Chebotarev invariant as the expectation of a random variable. As is emphasized in any probability theory course, the expectation is very far from sufficient to get a good idea, in general, of a random variable: the basic question is whether its values are rather concentrated around the average, or if instead they tend to spread out a lot. A much better feeling of the variable (though still allowing a lot of variation among all possible distributions) arises if the variance V (or the second moment M2) is known; recall that
One can compute, in principle, the second moment of the waiting time whose expectation is the Chebotarev invariant by a formula similar to the one described above. In the special case of a cyclic group, and assuming I did not make a mistake in the computation (which I have yet to check by consulting probabilists), we get
(where I write c2(G) for the second moment, and where [d,e] is the lcm of d and e).
This sum is not much more difficult to approach than the previous one, and after some computations it reduces to a single divisor sum
from which the estimates
follow; as before, they are best possible, but this time the new constant
does not seem to have appeared before — at least, according to Google.
All this gives a variance, in the case of n with many small prime factors, which is close to
(hence a standard deviation around 1.18, and is therefore not negligible at all compared with the expectation).
Now what about the field with 6 elements? Well, the point is that both for the Chebotarev invariant and for the secondary one, the expression as a divisor sum is an inclusion-exclusion (or sieve) formula over the non-trivial subgroups of Z/nZ, where the terms corresponding to each d are the same as if we computed the corresponding invariant for a hypothetical group with d elements having no subgroup except the trivial one and the whole group, such as the additive groups of finite prime fields do. So for n=6, things behave in this respect as if a field with 6 elements existed. (Yes, this is far-fetched…)