{"id":120,"date":"2008-10-04T17:25:54","date_gmt":"2008-10-04T16:25:54","guid":{"rendered":"https:\/\/wpethzprd.ethz.ch\/kowalski\/2008\/10\/04\/projection-to-subspaces-and-a-theorem-of-chebychev\/"},"modified":"2024-02-20T13:40:57","modified_gmt":"2024-02-20T11:40:57","slug":"projection-to-subspaces-and-a-theorem-of-chebychev","status":"publish","type":"post","link":"https:\/\/blogs.ethz.ch\/kowalski\/2008\/10\/04\/projection-to-subspaces-and-a-theorem-of-chebychev\/","title":{"rendered":"Projection to subspaces and a theorem of Chebychev"},"content":{"rendered":"<p>I just taught at ETH the basic result of the theory of Hilbert spaces that states that, given such a space <em>H<\/em>, a vector <em>v<\/em> in <em>H <\/em>and a closed, non-empty, <em>convex<\/em> subset <em>C<\/em>, there exists a unique vector <em>y<\/em> in <em>C<\/em> such that<\/p>\n<p>$latex ||v-y||=\\inf_{w\\in C}{||v-w||},$<\/p>\n<p>and which deserves to be called the orthogonal projection of <em>v<\/em> on <em>C<\/em>, as the picture tries to illustrate:<\/p>\n<p><a href=\"https:\/\/blogs.ethz.ch\/kowalski\/files\/2008\/10\/projection.png\" title=\"Projection\"><img decoding=\"async\" src=\"https:\/\/blogs.ethz.ch\/kowalski\/files\/2008\/10\/projection.png\" alt=\"Projection\" width=\"500\" \/><\/a><\/p>\n<p>Having done this, I looked for counterexamples to illustrate that this natural statement does not extend to the Banach space setting; it is easy enough to find situations where the minimum is achieved more than once (in fact, infinitely often), but slightly more challenging to find one where the minimum is <em>not<\/em> achieved at all. Here is one (it was in one of the books I use for my course): consider the Banach space <em>c<sub>0<\/sub><\/em> of sequences (indexed by <em>n&gt;0<\/em>) of complex numbers converging to 0, with the sup norm<\/p>\n<p>$latex ||(u_n)||=\\sup_{n\\geq 1}{|u_n|}.$<\/p>\n<p>Consider then the linear map<\/p>\n<p>$latex \\lambda(u)=\\sum_{n\\geq 1}{2^{-n}u_n}$<\/p>\n<p>on <em>c<sub>0<\/sub><\/em>. It is clear that this is a continuous linear map since it is bounded by 1 on the unit ball, and a moment&#8217;s thought shows that in fact we have<\/p>\n<p>$latex |\\lambda(u)|&lt;1\\quad\\text{ if } \\quad ||u||\\leq 1$<\/p>\n<p>because <em>equality<\/em> would require that <em>u<\/em> be a constant sequence, and this is impossible if the sequence is to converge to zero, except (of course) when the constant is zero, but then <em>\u03bb(0)=0<\/em> anyway&#8230;.<\/p>\n<p>This is a case where a supremum is not achieved, and so it is easily transformed to what we want by taking, e.g., <em>v=0<\/em> and the convex set<\/p>\n<p>$latex C=\\{v\\in c_0\\,\\mid\\, \\lambda(v)=1\\}.$<\/p>\n<p>Then there does not exist a vector in <em>C<\/em> minimizing the distance to the origin.<\/p>\n<p>Now, this is probably not too surprising to anyone, but while preparing these counterexamples for exercises for the students of my course, I was reminded of a beautiful approximation result of Chebychev which is probably not well-known (outside the numerical analysis community, where it seems to be standard knowledge) and which gives a particular (important) case <em>outside of the Hilbert space context<\/em> where the minimization process works.<\/p>\n<p>Consider the (real) Banach space <em>B<\/em> of real-valued continuous functions on [0,1] with the supremum norm (i.e., the norm of uniform convergence of functions). Fix then a degree <em>d&gt;0<\/em>. In <em>B<\/em>, the subspace <em>Pol<sub>d<\/sub><\/em> of real polynomials of degree at most <em>d<\/em> is a closed (non-empty) convex subset. The approximation problem is therefore, for a given continuous function <em>f<\/em>, to find the polynomials <em>P<\/em> of degree at most <em>d<\/em> which minimize the distance to <em>f<\/em>:<\/p>\n<p>$latex ||f-P||=\\sup_{0\\leq x\\leq 1}\\ \\ |f(x)-P(x)|$<\/p>\n<p>must be equal to<\/p>\n<p>$latex D_d=\\inf_{P\\in \\text{Pol}_d}{||f-P||}.$<\/p>\n<p>It is not very difficult to prove the existence of at least one such polynomial <em>P<\/em> (the idea being to show that the minimization can be restricted to a <em>compact<\/em> subset of the subspace <em>Pol<sub>d<\/sub><\/em>). It is more difficult to prove unicity, and even more subtle to find the following characterization (which was found by Chebychev): a polynomial <em>P<\/em> (of degree at most <em>d<\/em>) is the best approximation to <em>f<\/em> in this sense <em>if and only if<\/em>, there exist at least <em>d+2<\/em> points, say<\/p>\n<p>$latex x_1&lt;x_2&lt;\\ldots&lt;x_n,\\quad n\\geq d+2,$<\/p>\n<p>such that<\/p>\n<p>$latex |f(x_i)-P(x_i)|=||f-P||,\\quad\\text{ for all } i,$<\/p>\n<p>and moreover the signs of <em>f(x<sub>i<\/sub>)-P(x<sub>i<\/sub>)<\/em> alternate when <em>i<\/em> increases. Below, I call the existence of points with these two properties the <em>&#8220;d+2<\/em> alternation property&#8221;; note that it says nothing about the size of the distance from <em>f<\/em> to <em>P<\/em>. Here is an illustration:<\/p>\n<p><a href=\"https:\/\/blogs.ethz.ch\/kowalski\/files\/2008\/10\/cheby.png\" title=\"Chebychev approximation\"><img decoding=\"async\" src=\"https:\/\/blogs.ethz.ch\/kowalski\/files\/2008\/10\/cheby.png\" alt=\"Chebychev approximation\" \/><\/a><\/p>\n<p>the red function <em>f<\/em> is the cosine function, and the blue graph is the best approximation of degree (at most) 3; the gray line is the difference <em>cos(x)-P(x)<\/em>, and the 5 alternating extrema are clearly visible. (I used <strong>Maple<\/strong> to construct the approximation, and <strong>Sage<\/strong> to plot the graph).<\/p>\n<p>The unicity seems to be typically proved by first showing that there is at most one polynomial with the <em>d+2<\/em> alternation property (which is fairly easy; in fact, alternation with <em>d+1<\/em> points would be enough), and then showing that minimizing polynomials are characterized by it. The most technical part is the proof that a minimizing polynomial has the required property (if there are less than <em>d+2<\/em> alternations, one shows how to improve the approximation).<\/p>\n<p>But the converse (an alternating polynomial is minimizing) looks also quite difficult at first since, as we observed, the defining condition doesn&#8217;t say anything about the size of<\/p>\n<p>$latex ||f-P||,$<\/p>\n<p>only that it is achieved at <em>d+2<\/em> points. But here is the trick: suppose another polynomial <em>Q<\/em> (of degree at most <em>d<\/em>) is such that<\/p>\n<p>$latex (*)\\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ \\ ||f-Q||&lt;||f-P||.$<\/p>\n<p>Then, at the points <em>x<sub>i<\/sub><\/em>, the difference <em>Q-P<\/em> has <em>the same sign<\/em> as <em>f-P<\/em> (by writing<\/p>\n<p>$latex Q(x_i)-P(x_i)=(f(x_i)-P(x_i)) &#8211; (f(x_i)-Q(x_i)),$<\/p>\n<p>and the second summand is not big enough, by (*), to provoke a change of sign). By the assumption of alternation of signs of the <em>d+2<\/em> values<\/p>\n<p>$latex f(x_i)-P(x_i),$<\/p>\n<p>(which hasn&#8217;t been used yet), this means the polynomial <em>Q-P<\/em>, of degree at most <em>d<\/em>, has at least <em>d+1<\/em> changes of signs, hence that many zeros, so we must have <em>Q=P<\/em>, which is a contradiction.<\/p>\n<p>Incidentally, the only reason I know this result is that it was the subject of one of the entrance exams at the French <em>\u00c9cole Polytechnique<\/em>, and that this exam text was (because of this) used by the teacher of my <em>Math. Sp\u00e9<\/em> for one of our weekly take-home exams; the last questions were set up in a challenging way (which I don&#8217;t remember, but it must have been the proof of the clever part above), and the result was fixed in my mind because I succeeded in finding the trick&#8230;<\/p>\n<p>For a full account, see for instance <a href=\"http:\/\/books.google.ch\/books?id=y77n2ySMJHUC&amp;printsec=frontcover&amp;dq=analysis+of+numerical+methods&amp;hl=en&amp;sig=ACfU3U0T0ol7FRF3n_TXmGafixzcCgKGYQ#PPA224,M1\">Analysis of Numerical Methods<\/a>, by Isaacson and Keller (which I only have in my personal library because it was in the bargains bin in the English bookstore in Bordeaux one day, for some mysterious reason).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I just taught at ETH the basic result of the theory of Hilbert spaces that states that, given such a space H, a vector v in H and a closed, non-empty, convex subset C, there exists a unique vector y in C such that $latex ||v-y||=\\inf_{w\\in C}{||v-w||},$ and which deserves to be called the orthogonal &hellip; <a href=\"https:\/\/blogs.ethz.ch\/kowalski\/2008\/10\/04\/projection-to-subspaces-and-a-theorem-of-chebychev\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Projection to subspaces and a theorem of Chebychev<\/span><\/a><\/p>\n","protected":false},"author":625,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-120","post","type-post","status-publish","format-standard","hentry","category-blogroll"],"_links":{"self":[{"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/posts\/120","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/users\/625"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/comments?post=120"}],"version-history":[{"count":0,"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/posts\/120\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/media?parent=120"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/categories?post=120"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.ethz.ch\/kowalski\/wp-json\/wp\/v2\/tags?post=120"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}