Chebyshev approximation and Clenshaw-Curtis quadrature

Newton-Cotes integration and Gaussian integration are staples of numerical textbooks. Gaussian integration is said to be better than Newton-Cotes because a Newton-Cotes rule on N+1 points can integrate polynomials exactly up to degree N (degree N+1 if N is even), whereas a Gaussian rule on N+1 points can integrate polynomials exactly up to degree 2N+1.

Although this is true, it is not a particularly useful measure. The real reason Gaussian integration is better than Newton-Cotes is that, for smooth integrands, Gaussian approximations approach the exact value geometrically in N. In general this is not the case for Newton-Cotes.

We can see this by looking at Runge’s function

    \[ f(x)=\frac{1}{1+25x^2} \]

over the interval [-1,1], for which

    \[ \int^1_{-1}f(x)\,dx=\frac{2}{5}\arctan(5)\approx 0.5493603068. \]
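The examples below are PanAxiom sessions; as a cross-check, here is the same setup in plain Python (standard library only), with the exact value computed from the arctangent formula above:

```python
from math import atan

def f(x):
    """Runge's function 1/(1 + 25x^2)."""
    return 1.0 / (1.0 + 25.0 * x**2)

# Exact integral over [-1, 1]: (2/5) * arctan(5)
exact = (2.0 / 5.0) * atan(5.0)
print(exact)  # ≈ 0.5493603068
```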

So let’s check out Newton-Cotes of different orders.

We shall start with 5 points; the rule so determined is called Boole’s rule. Since this rule must be exact for x^n for each n=0,1,2,3,4, we can find the weights by solving a Vandermonde system:

(1) -> N:=4
(2) -> xs:=[-1+2*k/N for k in 0..N]
(3) -> M := matrix [[xs.i^j for i in 1..N+1] for j in 0..N]
(4) -> V:= vector [((-1)^k+1)/(k+1) for k in 0..N]

We now need to solve the equation M\mathbf{x}=V; this is best done in PanAxiom by computing the row echelon form of the concatenation of M and V:

(5) -> MV:=rowEchelon(horizConcat(M,V))
(6) -> ws:=column(MV,N+2)

    \[ \left[\frac{7}{45},\frac{32}{45},\frac{4}{15},\frac{32}{45},\frac{7}{45}\right] \]

These are the weights of Boole’s rule. Applied to the quadrature of Runge’s function:

(7) -> nc5:=numeric reduce(+,[f(xs.i)*ws.i for i in 1..N+1])

    \[ 0.4748010610\_0795755968 \]

and this value is within about 0.075 of the exact value. The results are far worse for higher values of N. If we go through all the above steps for N=12 and N=24, the results (to 10 decimal places) are

    \[ -0.0625873032 \]


    \[ -20.4255502076 \]

respectively. And this is what is to be expected of Newton-Cotes rules: that higher rules in general provide very poor results. (This is why, in practice, you use a lower order rule, such as Simpson’s rule, on a sequence of small sub-intervals).
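The whole experiment can be sketched in Python with NumPy (an illustration of the same Vandermonde-solve approach, not the PanAxiom session itself):

```python
import numpy as np

def newton_cotes(f, N):
    """Closed Newton-Cotes rule on N+1 equispaced nodes in [-1, 1].
    The weights come from the same Vandermonde solve as in the text."""
    xs = np.linspace(-1.0, 1.0, N + 1)
    M = np.vander(xs, increasing=True).T       # M[j, i] = xs[i]**j
    V = np.array([((-1.0)**k + 1.0) / (k + 1.0) for k in range(N + 1)])
    ws = np.linalg.solve(M, V)                 # quadrature weights
    return float(np.dot(ws, f(xs)))

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
for N in (4, 12, 24):
    print(N, newton_cotes(runge, N))
```

Boole's rule (N=4) reproduces 0.4748…, while N=12 and N=24 wander ever further from the true value, just as in the PanAxiom computation.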

So let’s try Gauss. The nodes and weights can be found from Legendre polynomials:

(1) -> N:=5
(2) -> xs:= map(rhs,solve(legendreP(N,x),1.0e-20))
(3) -> function(D(legendreP(N,x),x),derivP,x)
(4) -> ws:=[2/(1-z^2)/(derivP(z))^2 for z in xs]
(5) -> reduce(+,[f(xs.i)*ws.i for i in 1..N])

    \[ 0.7069479204 \]

This is not so good – it’s even worse than the result with Boole’s rule. But if the same commands as above are entered with N=13 and N=25, the results are

    \[ 0.5552465239 \]


    \[ 0.5494101495 \]

respectively, and this last is within 0.00005 of the exact value. And in general, this is what we’d expect: a higher order Gaussian rule leads to better approximations. And in fact the value for N=101 can be found to be within about 3.9\times 10^{-19} of the exact value.
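NumPy provides Gauss-Legendre nodes and weights directly, so the same sequence of approximations can be checked without solving for the Legendre zeros by hand (a Python sketch, not the PanAxiom session):

```python
import numpy as np

def gauss_legendre(f, n):
    """n-point Gauss-Legendre rule on [-1, 1]; nodes are the zeros of P_n."""
    xs, ws = np.polynomial.legendre.leggauss(n)
    return float(np.dot(ws, f(xs)))

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
exact = 0.4 * np.arctan(5.0)

for n in (5, 13, 25, 101):
    approx = gauss_legendre(runge, n)
    print(n, approx, abs(approx - exact))
```

The error shrinks geometrically with n, in contrast to the Newton-Cotes behaviour above.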

As any numerical analyst will tell you, approximating a function by a polynomial through equidistantly spaced points is not likely to work well: such a polynomial will develop too many wiggles to fit the points, and the resulting fit (except at the points!) will be poor.

For example, here is a plot of Runge’s function, and an interpolating polynomial at 13 points:


As you can see, the fit is terrible, especially at the end points. It’s even worse for 25 points:


At this scale the original function appears as little more than a straight line. Since Newton-Cotes rules simply integrate these interpolating polynomials, these graphs show clearly why high-order rules can fail so dramatically.

Gaussian rules, although numerically excellent, have their own issues: the nodes are hard to compute, and in general have to be stored at full precision before you can start using them. A further problem is that Gaussian nodes aren’t nested: one Gaussian rule will not contain as a subset the nodes of a lower-order rule (aside from the zero-th order rule, the midpoint rule). This makes error estimation difficult, and accounts for the necessity of Kronrod extensions, where the Gaussian nodes are augmented by a further set of nodes.

The best of both worlds would be an integration rule based on interpolation through a set of points, but which performs better than Newton-Cotes rules, and for which the nodes are nested. And in fact such a magical beast does exist: it’s called Clenshaw-Curtis quadrature (although in fact first discussed by the Hungarian mathematician Leopold Fejér some 30 years previously), and is based on integrating an interpolating polynomial through the values of \cos(k\pi/N) for k=0,1,2,\ldots,N. These values are the extrema of the Chebyshev polynomial (of the first kind) T_N(x), or the zeros of the Chebyshev polynomial (of the second kind) U_{N-1}(x) augmented with -1 and 1.

It can be shown that an interpolating polynomial through such points is close to the best possible polynomial approximation, in the sense of minimizing the maximum error.
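This difference can be quantified with a small Python/NumPy sketch comparing the maximum interpolation error over [-1,1] for equispaced versus Chebyshev points (the 2001-point evaluation grid is just an arbitrary sampling choice):

```python
import numpy as np

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
n = 13  # number of interpolation points

equi = np.linspace(-1.0, 1.0, n)
cheb = -np.cos(np.arange(n) * np.pi / (n - 1))  # Chebyshev extreme points

grid = np.linspace(-1.0, 1.0, 2001)

def max_interp_error(nodes):
    # Interpolating polynomial of degree n-1 through (nodes, runge(nodes))
    coeffs = np.polyfit(nodes, runge(nodes), len(nodes) - 1)
    return float(np.max(np.abs(np.polyval(coeffs, grid) - runge(grid))))

print(max_interp_error(equi))   # large: the Runge phenomenon
print(max_interp_error(cheb))   # far smaller
```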

We can see the vast improvement over interpolating with equidistant points by using 13 points:


and by using 25 points:


The last polynomial hugs the original function very closely, and the generally close fit is mirrored by their integrals, which are

    \[ 0.56020415227386 \]


    \[ 0.54945166115339 \]

respectively, with differences of about 0.01 and 9\times 10^{-5} from the exact value. With 101 points, the interpolating polynomial can be integrated between -1 and 1 to obtain a value which differs from the exact value by about 5.6\times 10^{-16}.
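Integrating those Chebyshev-point interpolants can be sketched in Python, reusing the Vandermonde solve from the Newton-Cotes computation with only the nodes changed:

```python
import numpy as np

def clenshaw_curtis(f, N):
    """Clenshaw-Curtis rule on the N+1 points -cos(k*pi/N), k = 0..N.
    Weights found by the same Vandermonde solve used for Newton-Cotes."""
    xs = -np.cos(np.arange(N + 1) * np.pi / N)
    M = np.vander(xs, increasing=True).T
    V = np.array([((-1.0)**k + 1.0) / (k + 1.0) for k in range(N + 1)])
    ws = np.linalg.solve(M, V)
    return float(np.dot(ws, f(xs)))

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
print(clenshaw_curtis(runge, 12))   # ≈ 0.560204 (13 points)
print(clenshaw_curtis(runge, 24))   # ≈ 0.549452 (25 points)
```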

Given that the nodes are fixed for a given N, the weights for Clenshaw-Curtis quadrature can also be predetermined, and there are plenty of different expressions for them. If the value \cos(\pi/N) is known exactly, then we can develop closed-form expressions for the weights. Suppose we take N=6, for which the nodes will be

    \[ \left[-\cos(0),-\cos(\pi/6),-\cos(2\pi/6),\ldots,-\cos(6\pi/6)\right]=\left[-1,-\frac{\sqrt{3}}{2},-\frac{1}{2},0,\frac{1}{2},\frac{\sqrt{3}}{2},1\right]. \]

We would want the integral to be exact for all polynomials up to degree 6, so that with weights w_k we have

    \[ w_0(-1)^k+w_1\left(-\frac{\sqrt{3}}{2}\right)^k+w_2\left(-\frac{1}{2}\right)^k+w_3(0)^k+w_4\left(\frac{1}{2}\right)^k+w_5\left(\frac{\sqrt{3}}{2}\right)^k+w_6(1)^k=\frac{(-1)^k+1}{k+1} \]

for k=0,1,2,3,4,5,6. Again this can be done with a linear computation:

(1) -> N:=6
(2) -> xs:=[-cos(k*%pi/N) for k in 0..N]
(3) -> M:=matrix [[xs.i^j for i in 1..N+1] for j in 0..N]
(4) -> V:=vector [((-1)^k+1)/(k+1) for k in 0..N]
(5) -> inverse(M)*V

    \[ \left[\frac{1}{35},\frac{16}{63},\frac{16}{35}, \frac{164}{315}, \frac{16}{35},\frac{16}{63}, \frac{1}{35}\right] \]
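The same rational weights can be recovered numerically in Python (a floating-point cross-check of the symbolic PanAxiom computation):

```python
import numpy as np

# Solve the N = 6 Clenshaw-Curtis weight system numerically and compare
# with the closed-form rational values found above.
N = 6
xs = -np.cos(np.arange(N + 1) * np.pi / N)
M = np.vander(xs, increasing=True).T          # M[j, i] = xs[i]**j
V = np.array([((-1.0)**k + 1.0) / (k + 1.0) for k in range(N + 1)])
ws = np.linalg.solve(M, V)

expected = [1/35, 16/63, 16/35, 164/315, 16/35, 16/63, 1/35]
print(np.allclose(ws, expected))  # True
```

Note that the weights sum to 2, the length of the interval, as any interpolatory rule's weights must.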

You can check (and it can be proved) that Clenshaw-Curtis weights are always positive, which makes Clenshaw-Curtis quadrature much more stable than Newton-Cotes, whose weights oscillate wildly, especially as N increases. To bring this point home, again choose N=100. We shall compute both the Newton-Cotes and the Clenshaw-Curtis weights, and plot them against the nodes. First Newton-Cotes:

(1) -> N:=100
(2) -> digits(100)
(3) -> xs := [-1.0+2.0*k/N for k in 0..N]
(4) -> M := matrix [[xs.i^j for i in 1..N+1] for j in 0..N];
(5) -> V:Vector Float := vector [((-1)^k+1)/(k+1) for k in 0..N];
(6) -> ws := column(rowEchelon(horizConcat(M,V)),N+2);
(7) -> draw(xs,ws)


As you see, the weights oscillate wildly, and the greatest absolute value is about 2.4\times 10^{24}. For Clenshaw-Curtis, all we have to do is change the definition of the nodes; everything else is the same as above:

(1) -> N:=100
(2) -> digits(100)
(3) -> xs:List Float := [-cos(k*%pi/N) for k in 0..N]
(4) -> M := matrix [[xs.i^j for i in 1..N+1] for j in 0..N];
(5) -> V:Vector Float := vector [((-1)^k+1)/(k+1) for k in 0..N];
(6) -> ws := column(rowEchelon(horizConcat(M,V)),N+2);
(7) -> draw(xs,ws)


And as you also see, the weights are very nicely behaved. As a non-expert in the field of numerical analysis, I find it totally remarkable that so seemingly simple a change as the placement of the nodes can have such a profound impact. It also makes me wonder why Clenshaw-Curtis quadrature is hardly mentioned in numerical textbooks.
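The contrast in weight signs already shows up at modest size, which can be verified in Python without the 100-digit arithmetic needed above (negative Newton-Cotes weights first appear around 9 points):

```python
import numpy as np

def interp_weights(xs):
    """Weights of the interpolatory quadrature rule on nodes xs over [-1, 1]."""
    n = len(xs)
    M = np.vander(xs, increasing=True).T
    V = np.array([((-1.0)**k + 1.0) / (k + 1.0) for k in range(n)])
    return np.linalg.solve(M, V)

N = 10
nc_ws = interp_weights(np.linspace(-1.0, 1.0, N + 1))           # equispaced
cc_ws = interp_weights(-np.cos(np.arange(N + 1) * np.pi / N))   # Chebyshev

print(nc_ws.min() < 0)  # True: Newton-Cotes weights already change sign
print(cc_ws.min() > 0)  # True: Clenshaw-Curtis weights stay positive
```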

3 thoughts on “Chebyshev approximation and Clenshaw-Curtis quadrature”

  1. Just a few cents on Newton-Cotes.

    It is possible to build stable Newton-Cotes rules using least squares instead of interpolation.
    Some time ago I studied this idea from a signal-processing/Fourier-domain point of view. Numerical instability is equivalent to noise amplification in the frequency domain, which can be measured quantitatively, and thus we can have much better control of it. For example, we can balance accuracy against numerical stability, and build an infinite number of different rules with different properties.

    I am not sure if it is interesting, but here are my brief notes:
    Closed type rules:

    Open type rules:

    Also there are some ideas on how to improve their accuracy without increasing the number of knots:

    Thank you for using the QuickLaTeX :)!

    1. Thanks for your comments – and for the links! Also note that some years ago I wrote a small blog post about adjusting Newton-Cotes rules by finite differences to obtain simpler formulas. However, your own posts look mathematically more complete than mine!
