Zeros of Hermite polynomials

In my paper with É. Fouvry and Ph. Michel, where we find upper bounds for the number of certain sheaves on the affine line over a finite field with bounded ramification, the combinatorial part of the argument involves spherical codes and the method of Kabatjanski and Levenshtein. It turns out to depend on the rather recondite question of finding a lower bound for the largest zero x_n of the n-th Hermite polynomial H_n, which is defined for integers n\geq 1 by
H_n(x)=(-1)^n e^{x^2} \frac{d^n}{dx^n}e^{-x^2}.

This is a classical orthogonal polynomial (which implies in particular that all zeros of H_n are real and simple). The standard reference for such questions still seems to be Szegö’s book, in which one can read the following rather remarkable asymptotic formula:
x_n=\sqrt{2n+1}-6^{-1/3}(2n+1)^{-1/6}(i_1+o(1)),
where i_1=3.3721\ldots>0 is the first (real) zero of the function
A(x)=\frac{\pi}{3}\sqrt{\frac{x}{3}}\Bigl\{J_{1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)+J_{-1/3}\Bigl(2\Bigl(\frac{x}{3}\Bigr)^{3/2}\Bigr)\Bigr\},
which is a close cousin of the Airy function (see formula (6.32.8) in Szegö’s book, noting that he observes the Peano paragraphing rule, according to which section 6.32 comes before 6.4).

(Incidentally, if — like me — you tend to trust any random PDF you download to check a formula like that, you might end up with a version containing a typo: the cube root of 6 is, in some printings, replaced by a square root…)

Szegö references work of a number of people (Zernike, Hahn, Korous, Bottema, Van Veen and Spencer), and sketches a proof based on ideas of Sturm on the comparison of solutions of two differential equations.

As it happens, it is better for our purposes to have explicit inequalities, and there is an elementary proof of the estimate
x_n\geq \sqrt{\frac{n-1}{2}},
which is asymptotically weaker than the previous formula only by a factor 2. This is also explained by Szegö, and since the argument is rather cute and short, I will give a sketch of it.
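As a quick numerical sanity check (a sketch, not part of the argument), the following Python snippet computes the largest zero of H_n by Newton's method and compares it with the elementary lower bound \sqrt{(n-1)/2} and with the leading terms \sqrt{2n+1}-6^{-1/3}(2n+1)^{-1/6}i_1 of Szegö's formula. The helper names are made up for this illustration. The recurrence H_{k+1}=2xH_k-2kH_{k-1}, the identity H_n'=2nH_{n-1} and the bound x_n<\sqrt{2n+1} used to pick the starting point are standard facts assumed here, not taken from the post, and i_1 is truncated to the digits quoted above.

```python
import math

I1 = 3.3721  # first zero of Szegö's Airy-type function, truncated as in the text


def hermite_pair(n, x):
    """Return (H_{n-1}(x), H_n(x)) using H_{k+1} = 2x H_k - 2k H_{k-1}."""
    prev, cur = 0.0, 1.0  # convention: H_{-1} = 0, H_0 = 1
    for k in range(n):
        prev, cur = cur, 2.0 * x * cur - 2.0 * k * prev
    return prev, cur


def largest_zero(n, tol=1e-12, max_iter=200):
    """Largest zero of H_n by Newton's method, started to the right of it."""
    x = math.sqrt(2 * n + 1)  # assumed upper bound for every zero of H_n
    for _ in range(max_iter):
        h_prev, h = hermite_pair(n, x)
        step = h / (2.0 * n * h_prev)  # uses H_n'(x) = 2n H_{n-1}(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def lower_bound(n):
    """Elementary lower bound sqrt((n-1)/2) for the largest zero."""
    return math.sqrt((n - 1) / 2)


def szego_leading_terms(n):
    """Leading terms of the asymptotic formula, with the o(1) term dropped."""
    m = 2 * n + 1
    return math.sqrt(m) - I1 * m ** (-1 / 6) / 6 ** (1 / 3)


if __name__ == "__main__":
    for n in (2, 5, 10, 20, 50):
        print(n, lower_bound(n), largest_zero(n), szego_leading_terms(n))
```

For n=2 the first two columns agree (the equality case), and for larger n the computed zero should sit between the lower bound and \sqrt{2n+1}, close to the asymptotic value.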

Besides the fact that the zeros of H_n are real and simple, we will use the easy facts that \deg(H_n)=n and that H_n is an even function for n even and an odd function for n odd, and most importantly (since all other properties are rather generic!) that the H_n satisfy the differential equation
H_n''(x)-2xH_n'(x)+2nH_n(x)=0.
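These basic properties can be checked exactly for small n. The sketch below builds the integer coefficients of H_n from the three-term recurrence H_{k+1}=2xH_k-2kH_{k-1} (a standard fact assumed here, not stated in the post) and verifies the degree, the parity, and the classical Hermite equation y''-2xy'+2ny=0. The function names are hypothetical.

```python
def hermite_coeffs(n):
    """Integer coefficients of H_n, lowest degree first."""
    prev, cur = [0], [1]  # convention: H_{-1} = 0, H_0 = 1
    for k in range(n):
        # H_{k+1} = 2x H_k - 2k H_{k-1}
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c
        prev, cur = cur, nxt
    return cur


def derivative(coeffs):
    """Coefficients of the derivative (empty list for a constant)."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def ode_residual(n):
    """Coefficients of H_n'' - 2x H_n' + 2n H_n; all zero if the ODE holds."""
    h = hermite_coeffs(n)
    d1 = derivative(h)
    d2 = derivative(d1)
    res = [2 * n * c for c in h]
    for i, c in enumerate(d1):
        res[i + 1] -= 2 * c  # contribution of -2x H_n'
    for i, c in enumerate(d2):
        res[i] += c  # contribution of H_n''
    return res


def has_parity_of_degree(n):
    """True if H_n only contains monomials x^i with i of the same parity as n."""
    return all(c == 0 for i, c in enumerate(hermite_coeffs(n)) if (n - i) % 2)


if __name__ == "__main__":
    print(hermite_coeffs(3))  # coefficients of 8x^3 - 12x
    print(all(c == 0 for c in ode_residual(7)))
```

Since everything is done in integer arithmetic, a zero residual is an exact verification for the given n, not a floating-point approximation.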

The crucial lemma is the following result of Laguerre:

Let P\in \mathbf{C}[X] be a polynomial of degree n\geq 1. Let z_0 be a simple zero of P with P''(z_0)\neq 0, and let
w_0=z_0-2(n-1)\frac{P'(z_0)}{P''(z_0)}.
Then if T\subset \mathbf{C} is any line or circle passing through z_0 and w_0, either all zeros of P are in T, or both components of \mathbf{C}-T contain at least one zero of P.

Before explaining the proof of this, let’s see how it gives the desired lower bound on the largest zero x_n of H_n. We apply Laguerre’s result with P=H_n and z_0=x_n. Using the differential equation, we obtain H_n''(x_n)=2x_nH_n'(x_n), and hence
w_0=x_n-\frac{n-1}{x_n}.
Now consider the circle T such that the segment [w_0,z_0] is a diameter of T.

Now note that -x_n is the smallest zero of H_n (as we observed above, H_n is either odd or even). We cannot have w_0<-x_n: if that were the case, all zeros other than x_n would lie strictly inside T, so the unbounded component of the complement of the circle T would not contain any zero, and neither would T contain all zeros (since -x_n\notin T), contradicting the conclusion of Laguerre's Lemma. Hence we get
-x_n\leq w_0=x_n-\frac{n-1}{x_n},
and this implies
x_n\geq \sqrt{\frac{n-1}{2}},
as claimed. (Note that if n\geq 3, one deduces easily that the inequality is strict, but there is equality for n=2.)

Now for the proof of the Lemma. One defines a polynomial Q by
Q=\frac{P}{X-z_0},
so that Q has degree n-1 and has zero set Z formed of the zeros of P different from z_0, counted with multiplicity (since the latter is assumed to be simple). Differentiating P=(X-z_0)Q once and twice and evaluating at z_0, we have
Q(z_0)=P'(z_0),\quad\quad Q'(z_0)=\frac{1}{2}P''(z_0).
We now compute the value at z_0 of the logarithmic derivative of Q, which is well-defined since Q(z_0)\neq 0: we have
\frac{Q'}{Q}=\sum_{\alpha\in Z}\frac{1}{X-\alpha},
and hence
\frac{Q'}{Q}(z_0)=\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},
which becomes, by the above formulas and the definition of w_0, the identity
\frac{1}{z_0-w_0}=\frac{1}{n-1}\sum_{\alpha\in Z}\frac{1}{z_0-\alpha},
or equivalently
\gamma(w_0)=\frac{1}{n-1}\sum_{\alpha\in Z}{\gamma(\alpha)},
where \gamma(z)=1/(z_0-z) is a Möbius transformation.

Recalling that |Z|=n-1, this means that \gamma(w_0) is the average of the \gamma(\alpha). It is then elementary that for any line L passing through \gamma(w_0), either \gamma(Z) is contained in L, or \gamma(Z) intersects both components of the complement of L. Now apply \gamma^{-1} to this assertion: one gets that either Z is contained in \gamma^{-1}(L), or Z intersects both components of the complement of \gamma^{-1}(L). We are now done, after observing that the lines passing through \gamma(w_0) are precisely the images under \gamma of the circles and lines passing through w_0 and through z_0 (because \gamma(z_0)=\infty, and each line passes through \infty in the projective line).
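The averaging identity at the heart of the proof is easy to test numerically. The sketch below, with hypothetical helper names and an arbitrarily chosen set of roots, computes w_0=z_0-2(n-1)P'(z_0)/P''(z_0) from the coefficients of P, compares 1/(z_0-w_0) with the average of the 1/(z_0-\alpha), and counts the remaining zeros on each side of the circle with diameter [w_0,z_0]. It assumes that z_0 is a simple root and that P''(z_0)\neq 0.

```python
def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with these roots."""
    coeffs = [1 + 0j]
    for r in roots:
        new = coeffs + [0j]  # multiply the current polynomial by X
        for i, c in enumerate(coeffs):
            new[i + 1] -= r * c  # then subtract r times the current polynomial
        coeffs = new
    return coeffs


def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def evaluate(coeffs, z):
    acc = 0j
    for c in coeffs:  # Horner's rule
        acc = acc * z + c
    return acc


def laguerre_point(roots, z0):
    """w_0 = z_0 - 2(n-1) P'(z_0)/P''(z_0); assumes P''(z_0) != 0."""
    coeffs = poly_from_roots(roots)
    n = len(coeffs) - 1
    d1 = derivative(coeffs)
    d2 = derivative(d1)
    return z0 - 2 * (n - 1) * evaluate(d1, z0) / evaluate(d2, z0)


def mean_gamma(roots, z0):
    """Average of gamma(alpha) = 1/(z_0 - alpha) over the roots other than z_0."""
    others = [a for a in roots if a != z0]
    return sum(1 / (z0 - a) for a in others) / len(others)


def circle_sides(roots, z0, eps=1e-9):
    """Count roots other than z_0 (inside, on, outside) the circle of diameter [w_0, z_0]."""
    w0 = laguerre_point(roots, z0)
    center, radius = (w0 + z0) / 2, abs(z0 - w0) / 2
    inside = on = outside = 0
    for a in roots:
        if a == z0:
            continue
        d = abs(a - center) - radius
        if abs(d) <= eps:
            on += 1
        elif d < 0:
            inside += 1
        else:
            outside += 1
    return inside, on, outside


if __name__ == "__main__":
    roots = [2, -1, 0.5 + 1j, -0.5 - 2j]  # arbitrary example, z_0 = 2
    w0 = laguerre_point(roots, 2)
    print(1 / (2 - w0), mean_gamma(roots, 2))
    print(circle_sides(roots, 2))
```

Applied to the zeros \pm\sqrt{3/2} and 0 of H_3 with z_0=\sqrt{3/2}, the same function should return w_0=x_3-2/x_3, in agreement with the formula obtained from the differential equation.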

Published by


I am a professor of mathematics at ETH Zürich since 2008.

5 thoughts on “Zeros of Hermite polynomials”

  1. Ah, but what about the *second* largest zero? Eyeballing the numbers suggests that the asymptote might be of exactly the same form, but with a different constant i_2 rather than i_1. My eyeballs are not big enough to hazard a guess concerning whether i_2 is the *second* smallest root of A(x). [Looking up Szego’s book via your link, it seems that this is indeed correct, but I may as well leave this comment here anyway.]

  2. Yes, that’s what Szegö writes, and similarly for the third, etc, (positive) zeros, with i_3, …. I vaguely wonder what kind of uniformity there is — asymptotically, all positive zeros are \sim \sqrt{2n}, how close do they really cluster??

  3. Does one really expect that they cluster? Suppose one normalizes the zeroes to lie in [-1,1], and then asks for a limit distribution of the zeros for large N. Then the k-th largest zero is approximately, for some constant j_k,

    1 – j_k/N^(2/3).

    (Supposing that A(x) looks something like sin(x^(3/2)) as far as zeros goes, one would even have j_k ~ k^(2/3) as well.)
    But this suggests that the zeros are being *repelled* from 1 in comparison with a flat distribution, where one would expect the k-th zero to be somewhere near 1 – k/N. In particular, if the zeros become uniformly distributed with respect to some measure F(x)dx, then one might even guess that F(1-t) = O(t^(1/2)) as t->0.

  4. I was thinking of “clustering” in a less precise sense, meaning only that the first k normalized zeros are all converging to 1. But that limiting distribution property is quite nice! I’m always happy to know new examples where the arcsine law arises…
