In the development of the large sieve, leading to its arithmetic applications, there is one step which (I must admit, shamefully) I never completely understood as a natural argument, rather than as one relying on a somewhat mysterious clever trick. Or at least, that was true until now: just this week-end, one vaguely related thing leading to another, I stumbled on a very transparent proof of this result.

I’ve written a short note with the details, but here is the outline. For background, it may be useful to read my earlier post on the subject on T. Tao’s blog, though I’ll use the classical setting of the sieve for integers below (as I did in the note, although it extends *mutatis mutandis* to the general setting described in my recent book).

The point is to provide the link between the *analytic* and *arithmetic* large sieve inequalities. The first of these is the inequality

$$\sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}\Bigl|\sum_{n\le N} a_n\, e\Bigl(\frac{an}{q}\Bigr)\Bigr|^2\le \Delta\sum_{n\le N}|a_n|^2,\qquad (*)$$

which is valid for all complex numbers $(a_n)$; here the sum over $a$ is over all residue classes modulo $q$ coprime with $q$, and the resulting exponential is independent of the choice of representatives of those classes. This inequality is discussed in great detail in Montgomery’s survey article; I’ll just say here that the value of the constant $\Delta=N-1+Q^2$, which is not particularly easy to obtain, is of no importance for our purpose, and I will just write $\Delta$ below to emphasize this.
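To make the statement concrete, here is a quick numerical sanity check, entirely my own and not part of the argument: it evaluates both sides of (*) with $\Delta=N-1+Q^2$ for random coefficients. The function name and the parameter values are just illustrative.

```python
import cmath
import math
import random

def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def large_sieve_sides(a, Q):
    """Evaluate both sides of the analytic large sieve inequality (*)
    for coefficients a = (a_1, ..., a_N): returns (LHS, Delta * sum |a_n|^2)."""
    N = len(a)
    lhs = 0.0
    for q in range(1, Q + 1):
        for r in range(1, q + 1):          # residues r mod q, coprime with q
            if math.gcd(r, q) != 1:
                continue
            inner = sum(a[n - 1] * e(r * n / q) for n in range(1, N + 1))
            lhs += abs(inner) ** 2
    delta = N - 1 + Q * Q
    return lhs, delta * sum(abs(c) ** 2 for c in a)

random.seed(0)
N, Q = 50, 7
coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]
lhs, rhs = large_sieve_sides(coeffs, Q)
assert lhs <= rhs + 1e-9   # the inequality holds (generically with room to spare)
```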

The second inequality, the *arithmetic large sieve inequality*, states that, for any choice of subsets $\Omega_p\subset \mathbf{Z}/p\mathbf{Z}$ for primes $p$, we have

$$|\{n\le N\,:\, n\bmod p\notin\Omega_p\ \text{for all primes } p\le Q\}|\le \frac{\Delta}{H},\qquad (**)$$

where

$$H=\sum_{q\le Q}\ \prod_{p\mid q}\frac{|\Omega_p|}{p-|\Omega_p|},$$

the sum ranging over squarefree $q\le Q$.
This is thus, recognizably, a sieve estimate: we bound from above the number of integers remaining in a segment after removing all those which fail to satisfy some constraints on their reductions modulo primes, constraints which are imposed for all primes *p* up to *Q*. This *Q* is a parameter which is adjusted for applications, depending on *N*.
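As a hypothetical numerical illustration of (**) (my own, with all names invented for the occasion), one can sieve by $\Omega_p=\{0\}$ for every $p\le Q$, so the surviving integers are those with no prime factor up to $Q$, and compare $|S|$ directly with $\Delta/H$:

```python
import math

def primes_upto(n):
    """Sieve of Eratosthenes."""
    is_p = [True] * (n + 1)
    is_p[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_p[i]:
            is_p[i * i::i] = [False] * len(is_p[i * i::i])
    return [i for i, b in enumerate(is_p) if b]

def squarefree_products(ps, Q):
    """All squarefree q <= Q with prime factors in ps, as (q, factor list)."""
    out = [(1, [])]
    for p in ps:
        out += [(q * p, fs + [p]) for (q, fs) in out if q * p <= Q]
    return out

def sieve_bound(N, Q, omega):
    """Compare |S| with Delta/H, where omega(p) is the removed set mod p."""
    ps = primes_upto(Q)
    S = [n for n in range(1, N + 1)
         if all(n % p not in omega(p) for p in ps)]
    H = sum(math.prod(len(omega(p)) / (p - len(omega(p))) for p in fs)
            for (_, fs) in squarefree_products(ps, Q))
    return len(S), (N - 1 + Q * Q) / H

# Eratosthenes-type example: remove the single class 0 mod p for each p <= Q,
# so S consists of the n <= N free of prime factors up to Q.
count, bound = sieve_bound(1000, 10, lambda p: {0})
assert count <= bound
```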

—————————————————————–

**Example.** If sieving is completely unfamiliar, here is one of the most traditional examples: let $\Omega_p$ be the set of the residue classes $0$ and $-2$ modulo $p$. Among the integers surviving the sieve process are then all prime numbers $r$ larger than $Q$ such that $r+2$ is also prime. Hence the arithmetic large sieve inequality implies an *upper bound* for the number $\pi_2(N)$ of twin primes up to $N$, namely

$$\pi_2(N)\le Q+\frac{\Delta}{H},\qquad H=\sum_{q\le Q}\ \prod_{p\mid q}\frac{2}{p-2}$$

(the term $Q$ trivially accounting for the twin primes up to $Q$, and where in fact the coefficients $2$ must be replaced by $1$ when $p=2$).
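Here is a small numerical check of this bound, again my own illustration with invented names: it computes $H$ for the twin-prime sieve (with the factor $1$ at $p=2$) and compares the resulting upper bound with the actual count of twin primes.

```python
def primes_upto(n):
    """Sieve of Eratosthenes."""
    is_p = [True] * (n + 1)
    is_p[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_p[i]:
            is_p[i * i::i] = [False] * len(is_p[i * i::i])
    return [i for i, b in enumerate(is_p) if b]

def twin_H(Q):
    """H = sum over squarefree q <= Q of prod_{p | q} 2/(p - 2),
    with the factor 2/(p - 2) replaced by 1 when p = 2."""
    acc = [(1, 1.0)]
    for p in primes_upto(Q):
        factor = 1.0 if p == 2 else 2.0 / (p - 2)
        acc += [(q * p, v * factor) for (q, v) in acc if q * p <= Q]
    return sum(v for (_, v) in acc)

def twin_count(N):
    """Number of primes p <= N with p + 2 also prime."""
    ps = set(primes_upto(N + 2))
    return sum(1 for p in ps if p <= N and p + 2 in ps)

N, Q = 10000, 40
bound = (N - 1 + Q * Q) / twin_H(Q) + Q   # + Q crudely covers primes <= Q
assert twin_count(N) <= bound
```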

Using elementary techniques (see this treatment by Ben Green for example), or general results on sums of multiplicative functions, it follows that

$$H\ge c(\log Q)^2$$

for $Q>2$ and some constant $c>0$. Taking $Q=\sqrt{N}$, this leads to an estimate for the number of twin primes up to $N$ which is (conjecturally) of the right order of magnitude,

$$\pi_2(N)\ll \frac{N}{(\log N)^2},$$

and by partial summation, to the (weaker, but still spectacular) conclusion that the series of inverses of twin primes converges:

$$\sum_{p,\ p+2\ \text{both prime}}\frac{1}{p}<+\infty,$$

which was one of V. Brun’s first achievements in building the beginnings of the modern theory of sieve methods in the early 20th Century.

————————————————————————–

So I will now describe how to derive (**) from (*). Or rather, I will derive (**) from the *dual inequality*

$$\sum_{n\le N}\ \Bigl|\sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}\beta(q,a)\, e\Bigl(\frac{an}{q}\Bigr)\Bigr|^2\le \Delta \sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}|\beta(q,a)|^2,\qquad (***)$$

where the complex coefficients $\beta(q,a)$ are still arbitrary, and $\Delta$ has the same value as before. The equivalence of (*) and (***) is a consequence of the elementary duality theory of finite-dimensional Hilbert spaces. But more importantly, (*) is often proved by means of (***) and this duality principle; in more general contexts, this is because (***) is usually easier to approach. So it is a natural starting point to argue towards the arithmetic inequality (**).
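The dual inequality can also be checked numerically; the following sketch (my own, with invented names and parameters) evaluates both sides of (***) for random coefficients $\beta(q,a)$:

```python
import cmath
import math
import random

def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def dual_sides(beta, N):
    """beta maps pairs (q, a) with gcd(a, q) = 1 to complex coefficients.
    Returns (LHS, Delta * sum |beta(q,a)|^2) of the dual inequality (***)."""
    Q = max(q for (q, _) in beta)
    lhs = sum(abs(sum(b * e(a * n / q) for (q, a), b in beta.items())) ** 2
              for n in range(1, N + 1))
    delta = N - 1 + Q * Q
    return lhs, delta * sum(abs(b) ** 2 for b in beta.values())

random.seed(1)
N, Q = 60, 6
beta = {(q, a): complex(random.gauss(0, 1), random.gauss(0, 1))
        for q in range(1, Q + 1) for a in range(1, q + 1)
        if math.gcd(a, q) == 1}
lhs, rhs = dual_sides(beta, N)
assert lhs <= rhs + 1e-9
```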

But before we begin, a notational convention: below, $q$ (and sums over $q$) will always refer to squarefree (positive) integers.

The idea is quite simple, in particular if you’ve ever seen the Chebychev inequality in probability: let $S$ denote the sifted set

$$S=\{n\le N\,:\, n\bmod p\notin\Omega_p\ \text{for all primes } p\le Q\}.$$

Comparing with the structure of (***), we are going to describe an “amplifier” $A(n)$ which is of the form

$$A(n)=\sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}\beta(q,a)\, e\Bigl(\frac{an}{q}\Bigr)$$

and has the property that $|A(n)|$ is “large” whenever $n$ is in $S$. If we have

$$|A(n)|\ge B>0$$

for $n$ in $S$, then by positivity we deduce

$$|S|\,B^2\le \sum_{n\le N}|A(n)|^2\le \Delta\sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}|\beta(q,a)|^2,$$

and if the last sum can be expressed conveniently and is not too large, we get an upper bound for $|S|$ in this manner.

To build the amplifier, consider first a single prime $p\le Q$: if $n$ is in $S$, we know that the reduction of $n$ modulo $p$ is constrained to not be in $\Omega_p$. We can express this analytically by saying that the characteristic function of this set, evaluated at $n$, is zero. But now expand this characteristic function (say $f$) in terms of the additive characters (the “discrete Fourier transform”): we have

$$f(x)=\sum_{a\bmod p}\hat{f}(a)\, e\Bigl(\frac{ax}{p}\Bigr)\quad\text{for all } x\in\mathbf{Z}/p\mathbf{Z},\quad\text{with}\quad \hat{f}(a)=\frac{1}{p}\sum_{x\in\Omega_p}e\Bigl(-\frac{ax}{p}\Bigr).$$

Thus for $n$ in $S$, we have

$$0=f(n)=\sum_{a\bmod p}\hat{f}(a)\, e\Bigl(\frac{an}{p}\Bigr),$$

and we rewrite this by isolating the contribution of the $0$-th harmonic (which is the constant function $1$, whose coefficient $\hat{f}(0)=|\Omega_p|/p$ encodes the probability that an integer modulo $p$ is in $\Omega_p$):

$$\sum_{a\not\equiv 0\ (\mathrm{mod}\ p)}\hat{f}(a)\, e\Bigl(\frac{an}{p}\Bigr)=-\frac{|\Omega_p|}{p}.$$
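This single-prime identity is easy to verify numerically; here is a short check of my own, for an arbitrary choice of $p$ and $\Omega_p$:

```python
import cmath
import math

def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def fourier_hat(omega, p):
    """Fourier coefficients of the characteristic function of omega mod p."""
    return {a: sum(e(-a * x / p) for x in omega) / p for a in range(p)}

p, omega = 11, {0, 9}            # e.g. the classes 0 and -2 modulo 11
fh = fourier_hat(omega, p)
for n in range(p):
    if n not in omega:           # n satisfies the sieve condition at p
        s = sum(fh[a] * e(a * n / p) for a in range(1, p))
        # the nonzero harmonics sum to -|Omega_p| / p, as claimed
        assert abs(s - (-len(omega) / p)) < 1e-9
```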
This is the basis of our detector! First, this defines the coefficients

$$\beta(p,a)=\hat{f}(a)\quad\text{for } a\not\equiv 0\ (\mathrm{mod}\ p),$$

and then, to extend this to all squarefree integers $q$ up to $Q$, we use the Chinese Remainder Theorem and multiply the detectors for the primes dividing $q$ (inserting also a sign $\mu(q)$, which changes nothing in (***) but makes all contributions positive), getting coefficients $\beta(q,a)$ for $a$ coprime with $q$, such that

$$\sum_{\substack{a\bmod q\\ (a,q)=1}}\beta(q,a)\, e\Bigl(\frac{an}{q}\Bigr)=\prod_{p\mid q}\frac{|\Omega_p|}{p}\quad\text{for all } n\in S.$$
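The Chinese-Remainder assembly of these coefficients can again be tested in a few lines. The sketch below is my own (the function names and the sample sets are invented): it builds $\beta(q,a)$ from the local Fourier coefficients, with the sign $\mu(q)$ included, and checks the displayed identity for every unsifted $n$ in a range.

```python
import cmath
import math

def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def hat(omega, p, a):
    """Fourier coefficient of the characteristic function of omega mod p."""
    return sum(e(-a * x / p) for x in omega) / p

def beta(q_primes, omega, a):
    """beta(q, a) for squarefree q = prod(q_primes), sign mu(q) included."""
    q = math.prod(q_primes)
    val = (-1) ** len(q_primes)             # mu(q) for squarefree q
    for p in q_primes:
        a_p = (a * pow(q // p, -1, p)) % p  # CRT component of a at p
        val *= hat(omega[p], p, a_p)
    return val

omega = {3: {0, 1}, 5: {0, 3}, 7: {0, 5}}
q_primes = [3, 5, 7]
q = math.prod(q_primes)
alpha = math.prod(len(omega[p]) / p for p in q_primes)
for n in range(1, 200):
    if all(n % p not in omega[p] for p in q_primes):   # n unsifted at 3, 5, 7
        s = sum(beta(q_primes, omega, a) * e(a * n / q)
                for a in range(q) if math.gcd(a, q) == 1)
        assert abs(s - alpha) < 1e-9       # the detector fires with value alpha(q)
```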
For notational simplicity, we write

$$\alpha(q)=\prod_{p\mid q}\frac{|\Omega_p|}{p}.$$

Thus the final detector for $S$ is

$$A(n)=\sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}\beta(q,a)\, e\Bigl(\frac{an}{q}\Bigr),$$

and we have

$$A(n)=\sum_{q\le Q}\alpha(q)\quad\text{for all } n\in S.$$
Calling this quantity $B$, the resulting inequality after applying (***), following the sketch above, is

$$|S|\le \Delta\,\frac{A}{B^2},$$

with

$$A=\sum_{q\le Q}\ \sum_{\substack{a\bmod q\\ (a,q)=1}}|\beta(q,a)|^2.$$
To compute this, we invoke the orthogonality of additive characters, and their compatibility with the Chinese Remainder Theorem: we have first

$$\sum_{\substack{a\bmod q\\ (a,q)=1}}|\beta(q,a)|^2=\prod_{p\mid q}\ \sum_{a\not\equiv 0\ (\mathrm{mod}\ p)}|\hat{f}(a)|^2,$$

and then, using the (discrete) Parseval formula

$$\sum_{a\bmod p}|\hat{f}(a)|^2=\frac{1}{p}\sum_{x\bmod p}|f(x)|^2=\frac{|\Omega_p|}{p},$$

we get by removing the contribution of $a=0$ that

$$A=\sum_{q\le Q}\ \prod_{p\mid q}\frac{|\Omega_p|(p-|\Omega_p|)}{p^2}.$$
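The local computation here (Parseval minus the $0$-th harmonic) is also easy to confirm numerically; a quick check of my own, for a few arbitrary choices of $p$ and $\Omega_p$:

```python
import cmath
import math

def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def hat(omega, p, a):
    """Fourier coefficient of the characteristic function of omega mod p."""
    return sum(e(-a * x / p) for x in omega) / p

# Parseval minus the 0-th harmonic: the nonzero coefficients satisfy
#   sum_{a != 0} |f_hat(a)|^2 = |Omega_p| (p - |Omega_p|) / p^2.
for p, omega in [(5, {0, 3}), (11, {0, 9}), (13, {1, 2, 5})]:
    total = sum(abs(hat(omega, p, a)) ** 2 for a in range(1, p))
    expected = len(omega) * (p - len(omega)) / p ** 2
    assert abs(total - expected) < 1e-9
```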
Now we have an inequality

$$|S|\le \Delta\,\Bigl(\sum_{q\le Q}\ \prod_{p\mid q}\frac{|\Omega_p|(p-|\Omega_p|)}{p^2}\Bigr)\Bigl(\sum_{q\le Q}\ \prod_{p\mid q}\frac{|\Omega_p|}{p}\Bigr)^{-2},$$

which is similar, but not quite identical, to (**). In general, it is somewhat weaker, though barely so for problems like the twin-prime example above, where the sets $\Omega_p$ are quite small (i.e., for what are called small sieves...).

To improve this inequality and get (**), we notice that we can still try other amplifiers. In particular, the inequality we obtained is not invariant under multiplying the coefficients $\beta(q,a)$ by constants $\lambda(q)$ depending only on $q$. Indeed, if we consider now

$$\beta_1(q,a)=\lambda(q)\,\beta(q,a),$$

we replace $B$ with

$$B_1=\sum_{q\le Q}\lambda(q)\,\alpha(q),$$

while $A$ is replaced with

$$A_1=\sum_{q\le Q}\lambda(q)^2\ \prod_{p\mid q}\frac{|\Omega_p|(p-|\Omega_p|)}{p^2},$$

and we can try to minimize the ratio $A_1/B_1^2$ when the coefficients $\lambda(q)$ are allowed to vary. This is the same classical problem that occurs in the Selberg sieve (see for instance the write-up by Ben Green of the application of the Selberg sieve to the twin-prime problem, already mentioned, for an introduction): minimize a quadratic form subject to a linear constraint (given by a different quadratic form), and it is easily and elegantly solved: we casually apply Cauchy’s inequality,

$$B_1^2=\Bigl(\sum_{q\le Q}\lambda(q)\,\alpha(q)\Bigr)^2\le A_1\ \sum_{q\le Q}\ \prod_{p\mid q}\frac{|\Omega_p|}{p-|\Omega_p|}=A_1\,H,$$

so that, first of all, we cannot do better (i.e., make the ratio smaller) than

$$\frac{A_1}{B_1^2}\ge \frac{1}{H},$$

and second, by the equality case in Cauchy’s inequality, we *can* do something this good by taking

$$\lambda(q)=\prod_{p\mid q}\frac{p}{p-|\Omega_p|}.$$
This leads to the optimal value of the ratio and therefore proves the arithmetic large sieve inequality (**), as described previously.
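This optimization step can also be confirmed numerically. The sketch below is mine (the names and sample data are invented): for twin-type sets $|\Omega_p|=2$ at a few odd primes, it checks that the claimed $\lambda(q)$ attains the value $1/H$ of the ratio, and that random other choices do no better.

```python
import random

def squarefree_data(primes, Q, omega_size):
    """(q, alpha(q), w(q), lam_opt(q)) for squarefree q <= Q, where
    alpha(q) = prod |O_p|/p, w(q) = prod |O_p|(p - |O_p|)/p^2 (so A_1 =
    sum lam^2 w) and lam_opt(q) = prod p/(p - |O_p|) is the claimed optimum."""
    data = [(1, 1.0, 1.0, 1.0)]
    for p in primes:
        k = omega_size(p)
        data += [(q * p, a * k / p, w * k * (p - k) / p ** 2, l * p / (p - k))
                 for (q, a, w, l) in data if q * p <= Q]
    return data

def ratio(data, lam):
    """A_1 / B_1^2 for the weights lam(q)."""
    A1 = sum(lam(q) ** 2 * w for (q, _, w, _) in data)
    B1 = sum(lam(q) * a for (q, a, _, _) in data)
    return A1 / B1 ** 2

data = squarefree_data([3, 5, 7], 40, lambda p: 2)   # twin-type sets, p odd
H = sum(a * a / w for (_, a, w, _) in data)          # alpha^2/w = prod |O|/(p-|O|)
opt = {q: l for (q, _, _, l) in data}
assert abs(ratio(data, lambda q: opt[q]) - 1 / H) < 1e-9   # optimum attains 1/H
random.seed(2)
for _ in range(5):   # any other real choice of lam does no better
    pert = {q: opt[q] * (1 + random.uniform(0, 0.5)) for q in opt}
    assert ratio(data, lambda q: pert[q]) >= 1 / H - 1e-9
```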

A final remark: the simplicity of the argument suggests that it is probably not new. There are (in Montgomery’s survey article, for instance) a number of earlier derivations of the arithmetic large sieve inequality from the dual form of the analytic inequality. However, those I have seen (I must say I had not really looked at any of them until after writing the gist of my argument...) are based on, or inspired by, the connection with the Selberg sieve, and are therefore less elementary (and less motivated). Still, the ingredients always look much the same, and I think this write-up is more a question of re-arranging them than a really new proof.