Here is a sketch of the argument mentioned in the previous post (which arose from discussions with Étienne Fouvry, Philippe Michel, Paul Nelson, and others; any mistakes in the presentation are entirely mine).
Theorem. We have
(1) we put
and denote by the expected main term;
(2) the parameters are , , the modulus is , the moduli and are coprime and squarefree, and is the usual polynomial associated to an admissible tuple.
If we take and , we get a non-trivial result for as large as .
(In fact, the special shape of will play no role in this argument, and any non-constant polynomial will work just as well.)
More precisely, I will give a proof which is, except for its terseness, essentially complete for and of special type. We anticipate only technical adjustments to cover the general case, and we will of course write this down carefully.
Before starting, a natural question may come to mind: given that qui peut le plus, peut le moins (whoever can do more can do less), can one give an analogous result for the usual divisor function? Recall that, for the latter, the (individual) exponent of distribution has been known to be at least 2/3 for a long time (by work of Linnik and Selberg independently, both using the Weil bound for Kloosterman sums). This exponent has not been improved, even on average over (although Fouvry succeeded, on average over , in covering the range ) despite much effort. However, Fouvry and Iwaniec (with an Appendix by Katz to treat yet another complete exponential sum over finite fields) proved already twenty years ago that one could improve it to if one averages for a fixed over special moduli with ; this gives in particular a nice earlier illustration of the usefulness of factorable moduli for this type of question.
So, to work. For fixed and , we begin by applying the Poisson summation formula to the three “smooth” variables (the smoothing is hidden in the notation ). The simultaneous zero frequencies give the main term, as they should, and the other degenerate cases are easier to handle than the contribution of the non-zero frequencies, so the main secondary term for a given and is given by
where the dual lengths are , so that the total number of frequencies is , and where is a normalized hyper-Kloosterman sum modulo :
(Below I will usually not repeat the range of summations when they are unchanged from one line to the next.)
Now we sum over and , move the sum over outside, and apply the Cauchy-Schwarz inequality to the sum, for a fixed , where is modulo . To prepare for this step, we use the Chinese Remainder Theorem to split the condition , and to factor the hyper-Kloosterman sum as a sum modulo times one modulo .
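The factorization of the hyper-Kloosterman sum used here can be checked numerically on small moduli. The sketch below assumes the usual normalization Kl_3(a; q) = (1/q) Σ e((x+y+z)/q), summed over x, y, z modulo q with xyz ≡ a, which may differ from the exact expression in the formula above by a change of argument. The function names `kl3` and `kl3_via_crt` and the values r = 5, s = 7, a = 3 are illustrative choices only. Under that normalization, the Chinese Remainder Theorem gives Kl_3(a; rs) = Kl_3(a s̄³; r) · Kl_3(a r̄³; s), where s̄ is the inverse of s modulo r and r̄ is the inverse of r modulo s.

```python
import cmath
from math import gcd


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def kl3(a, q):
    """Assumed normalization: (1/q) * sum of e((x+y+z)/q) over xyz = a (mod q).

    Only the case gcd(a, q) = 1 is handled, so x, y, z are all units.
    """
    if gcd(a, q) != 1:
        raise ValueError("a must be coprime to q")
    units = [x for x in range(1, q) if gcd(x, q) == 1]
    total = 0
    for x in units:
        for y in units:
            # z is forced by xyz = a (mod q); pow(..., -1, q) needs Python 3.8+
            z = a * pow(x * y, -1, q) % q
            total += e((x + y + z) / q)
    return total / q


def kl3_via_crt(a, r, s):
    """Product of the sums modulo r and modulo s, for coprime r and s."""
    s_bar = pow(s, -1, r)  # inverse of s modulo r
    r_bar = pow(r, -1, s)  # inverse of r modulo s
    return kl3(a * s_bar**3 % r, r) * kl3(a * r_bar**3 % s, s)


# Illustrative values (not taken from the post): two small primes and a unit a.
r, s, a = 5, 7, 3
lhs = kl3(a, r * s)
rhs = kl3_via_crt(a, r, s)
assert abs(lhs - rhs) < 1e-9

# Deligne's bound |Kl_3(a; p)| <= 3 for primes p gives |Kl_3(a; rs)| <= 9 here.
assert abs(lhs) <= 9 + 1e-9
```

The brute-force sum takes about φ(q)² steps, so this is only meant as a sanity check for small moduli, not as a way to evaluate these sums in practice.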
The contribution of a fixed is
and we can bound
for some coefficients which are bounded.
The point of this is that we have smoothed the variable by eliminating its multiplicity, and that the range of this variable can be quite long; as long as , completing the sum in the Pólya-Vinogradov (or Poisson) style will be useful.
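For readers who have not seen the completion technique, here is a minimal numerical sketch of the underlying identity, with a generic periodic function in place of the actual sums of this argument. For f periodic modulo q, with f̂(h) = Σ f(x) e(−hx/q) over x modulo q, the sum of f(n) over an interval equals (1/q) Σ_h f̂(h) Σ_n e(hn/q), with n ranging over the same interval. The names `completed_sum` and `direct_sum`, the modulus 13, and the test function are hypothetical and chosen only for illustration.

```python
import cmath


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def fourier_coefficients(f, q):
    """fhat(h) = sum over x mod q of f(x) * e(-h*x/q), for h = 0, ..., q-1."""
    return [sum(f(x) * e(-h * x / q) for x in range(q)) for h in range(q)]


def completed_sum(f, q, start, length):
    """Sum of f(n mod q) over start <= n < start + length, via completion."""
    fhat = fourier_coefficients(f, q)
    total = 0
    for h in range(q):
        # Incomplete sum of the additive character over the interval;
        # for h != 0 this geometric sum is small compared with `length`.
        geom = sum(e(h * n / q) for n in range(start, start + length))
        total += fhat[h] * geom
    return total / q


def direct_sum(f, q, start, length):
    """The same sum, computed term by term."""
    return sum(f(n % q) for n in range(start, start + length))


# Hypothetical test data: an arbitrary function on residues modulo 13.
q = 13
f = lambda x: pow(x, 3, q) - 6
assert abs(completed_sum(f, q, 5, 40) - direct_sum(f, q, 5, 40)) < 1e-9
```

The frequency h = 0 contributes (length/q) times the complete sum of f, while for h ≠ 0 the geometric sum is at most 1/(2‖h/q‖) in absolute value. This is why completion pays off once the interval is long enough relative to the modulus and the complete sums f̂(h) exhibit cancellation.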
Now we continue with . It is here that it simplifies matters to have and to proceed as if and were primes (this is a technicality which, experience shows, should cause no loss in the final, complete analysis).
So we consider the sum over . The diagonal contribution where is .
In the non-diagonal terms, we distinguish whether or not. If not, we complete the two hyper-Kloosterman sums. This gives two complete exponential sums modulo and modulo . The latter is the Friedlander-Iwaniec sum, in its incarnation as "Borel" correlations of hyper-Kloosterman sums (see the remark at the end of our note on these sums; this identification is already in Heath-Brown's paper, and Philippe realized recently that he had also encountered them in a paper on lower bounds for exponential sums.)
Both sums give square root cancellation (using Deligne’s work, of course) except for the sum if . But we may just push these to the second case. Thus, the contribution of these non-exceptional terms gives
(counting the number of complete sums in the -interval).
On the other hand, the exceptional are still controlled by the diagonal terms (because of the condition ; there is a minor trick involved here if the cubic roots of unity exist modulo , but I'll gloss over that.)
Now we can gather everything, and one checks that we end up with a bound
We need this to be for any , and we see that we succeed as long as
as stated in the theorem (the second condition is implied by the first if ). And as mentioned just afterwards, if we have , this gives a good distribution up to (note that epsilons may change from one inequality to the next.)
Remark. As the reader can see, we use neither Weyl shifts nor cancellation in Ramanujan sums. The latter might appear in a more precise analysis, however, and give some extra gain.