Newton-Cotes integration and Gaussian integration are staples of numerical analysis textbooks. And Gaussian integration is said to be better than Newton-Cotes rules because a Newton-Cotes rule on $n+1$ points can integrate polynomials up to degree $n$ (or $n+1$ if $n$ is even), whereas a Gaussian rule on $n$ points can integrate polynomials up to degree $2n-1$.
Although this is true, it is not a particularly useful measure. The real reason that Gaussian integration is better than Newton-Cotes is that Gaussian integral approximations approach the exact value geometrically in $n$. In general this is not the case for Newton-Cotes.
We can see this by looking at Runge's function
$$f(x) = \frac{1}{1+25x^2}$$
over the interval $[-1,1]$, for which
$$\int_{-1}^{1} f(x)\,dx = \frac{2}{5}\arctan 5 \approx 0.5493603068.$$
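For reference, Runge's function and this exact value are easy to set up in Python (a sketch to accompany the PanAxiom sessions below; the name `f` mirrors the one used there):

```python
import math

def f(x):
    """Runge's function 1/(1 + 25x^2)."""
    return 1.0 / (1.0 + 25.0 * x * x)

# An antiderivative of f is arctan(5x)/5, so the integral over [-1, 1] is:
exact = 2.0 * math.atan(5.0) / 5.0
print(exact)  # about 0.549360
```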
So let’s check out Newton-Cotes of different orders.
We shall start with 5 points; the rule so determined is called Boole's rule. Since this rule is to be exact for $f(x)=x^k$ for each $k=0,1,\dots,4$, we can find the weights by using a Vandermonde matrix:
(1) -> N:=4
(2) -> xs:=[-1+2*k/N for k in 0..N]
(3) -> M := matrix [[xs.i^j for i in 1..N+1] for j in 0..N]
(4) -> V:= vector [((-1)^k+1)/(k+1) for k in 0..N]
We now need to solve the equation $Mw=V$; this is best done in PanAxiom by computing the row echelon form of the concatenation of $M$ and $V$:
(5) -> MV:=rowEchelon(horizConcat(M,V))
(6) -> ws:=column(MV,N+2)
These are the weights of Boole's rule. Applied to the quadrature of Runge's function (having first defined f(x) == 1/(1+25*x^2)):
(7) -> nc5:=numeric reduce(+,[f(xs.i)*ws.i for i in 1..N+1])
and this value is within about $0.075$ of the exact value. The results are far worse for higher values of $N$: going through all the above steps with $N=8$, for example, gives a value of about $0.30$, in error by roughly $0.25$, and larger values of $N$ fare worse still. And this is what is to be expected of Newton-Cotes rules: higher-order rules in general provide very poor results. (This is why, in practice, you use a lower-order rule, such as Simpson's rule, on a sequence of small sub-intervals.)
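The whole computation is easy to mirror in Python, solving the Vandermonde system in exact rational arithmetic; this sketch (a translation of the method, not the PanAxiom code itself) reproduces Boole's weights and shows the error growing with $N$:

```python
import math
from fractions import Fraction

def newton_cotes(N):
    """Nodes -1 + 2k/N and their Newton-Cotes weights, computed exactly."""
    xs = [Fraction(-1) + Fraction(2 * k, N) for k in range(N + 1)]
    n = N + 1
    # Augmented system: row j demands  sum_i w_i * x_i^j = integral of x^j
    A = [[x ** j for x in xs] + [Fraction((-1) ** j + 1, j + 1)]
         for j in range(n)]
    for c in range(n):  # Gauss-Jordan elimination over the rationals
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                m = A[r][c]
                A[r] = [a - m * b for a, b in zip(A[r], A[c])]
    return xs, [A[r][n] for r in range(n)]

exact = 2 * math.atan(5) / 5
errs = {}
for N in (4, 8, 16):
    xs, ws = newton_cotes(N)
    approx = sum(float(w) / (1 + 25 * float(x) ** 2) for x, w in zip(xs, ws))
    errs[N] = abs(approx - exact)
    print(N, approx)

# N = 4 recovers Boole's weights (7, 32, 12, 32, 7)/45 and a value near
# 0.4748; the error then grows as N does, instead of shrinking.
```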
So let’s try Gauss. The nodes and weights can be found from Legendre polynomials:
(1) -> N:=5
(2) -> xs:= map(rhs,solve(legendreP(N,x),1.0e-20))
(3) -> function(D(legendreP(N,x),x),derivP,x)
(4) -> ws:=[2/(1-z^2)/(derivP(z))^2 for z in xs]
(5) -> reduce(+,[f(xs.i)*ws.i for i in 1..N])
This is not so good – it's even worse than the result with Boole's rule: the error here is about $0.16$. But if the same commands as above are entered with larger values of $N$, the results improve rapidly, agreeing with the exact value to more and more decimal places. And in general, this is what we'd expect: a higher-order Gaussian rule leads to better approximations, with the error decreasing geometrically in $N$.
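The same Gauss-Legendre construction can be sketched in Python, with nodes found by Newton's method on $P_N$ and the weights from the formula $2/\bigl((1-x^2)P_N'(x)^2\bigr)$ used above (the helper names here are my own):

```python
import math

def legendre_and_deriv(n, x):
    """Return (P_n(x), P_n'(x)) using the three-term recurrence."""
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    dp = n * (x * p - p_prev) / (x * x - 1.0)  # valid for |x| < 1
    return p, dp

def gauss_rule(n):
    """Gauss-Legendre nodes and weights on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard initial guess
        for _ in range(50):  # Newton's method on P_n
            p, dp = legendre_and_deriv(n, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        _, dp = legendre_and_deriv(n, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

exact = 2 * math.atan(5) / 5

def gauss_quad(n):
    xs, ws = gauss_rule(n)
    return sum(w / (1 + 25 * x * x) for x, w in zip(xs, ws))

print(gauss_quad(5))   # about 0.707: worse than Boole's rule here
print(gauss_quad(20))  # much closer to the exact 0.549360...
```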
As any numerical analyst will tell you, approximating a function by a polynomial through equidistantly spaced points is not likely to work well: such a polynomial will develop large wiggles between the points, and the resulting fit (except at the points themselves!) will be poor.
For example, here is a plot of Runge’s function, and an interpolating polynomial at 13 points:
As you can see, the fit is terrible, especially at the end points. It’s even worse for 25 points:
At this scale the original function has been squashed to what looks like a straight line. Since Newton-Cotes rules simply integrate these interpolating polynomials, these graphs clearly show why high-order rules can fail so dramatically.
Gaussian rules, although numerically excellent, have their own issues: the nodes are hard to compute, and in general have to be stored at maximum precision before you can start using them. A further problem is that Gaussian nodes aren't nested: the nodes of one Gaussian rule will not contain as a subset the nodes of a lower-order rule (aside from the zeroth-order rule, the midpoint rule). This makes error estimation difficult, and accounts for the necessity of Kronrod extensions, where the Gaussian nodes are augmented by a further set of nodes.
The best of both worlds would be an integration rule based on interpolation through a set of points, but which performs better than Newton-Cotes rules, and for which the nodes are nested. And in fact such a magical beast does exist: it's called Clenshaw-Curtis quadrature (although in fact first discussed by the Hungarian mathematician Leopold Fejér some 30 years previously), and is based on integrating an interpolating polynomial through the values of $f$ at the points $\cos(k\pi/N)$ for $k=0,1,\dots,N$. These points are the extrema of the Chebyshev polynomial (of the first kind) $T_N(x)$, or the zeros of the Chebyshev polynomial (of the second kind) $U_{N-1}(x)$ augmented with $-1$ and $1$.
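The nesting property is easy to check numerically: doubling $N$ keeps every old node. A short Python sketch (illustrative only):

```python
import math

def cc_nodes(N):
    """Clenshaw-Curtis nodes cos(k*pi/N) for k = 0..N."""
    return [math.cos(k * math.pi / N) for k in range(N + 1)]

coarse = cc_nodes(8)
fine = cc_nodes(16)

# Every node of the 9-point rule reappears in the 17-point rule.
nested = all(any(abs(c - f) < 1e-12 for f in fine) for c in coarse)
print(nested)  # True
```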
It can be shown that an interpolating polynomial through such points is almost the best possible interpolating polynomial, in the sense of nearly minimizing the maximum error over the interval.
We can see the vast improvement over interpolating with equidistant points by using 13 points:
and by using 25 points:
The last polynomial hugs the original function very closely, and the generally close fit is mirrored by their integrals, both of which are close to the exact value, the 25-point result markedly more so. With 101 points, the interpolating polynomial can be integrated between $-1$ and $1$ to obtain a value which differs from the exact value by less than $10^{-8}$.
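To see this convergence in a program, the weights can be obtained by solving the same kind of Vandermonde system as before, now at the Chebyshev points (a floating-point Python sketch; for large $N$ one would use higher precision, or better formulas for the weights):

```python
import math

def cc_rule(N):
    """Clenshaw-Curtis nodes -cos(k*pi/N) and weights, via a Vandermonde solve."""
    xs = [-math.cos(k * math.pi / N) for k in range(N + 1)]
    n = N + 1
    A = [[x ** j for x in xs] + [((-1) ** j + 1) / (j + 1)] for j in range(n)]
    for c in range(n):  # Gauss-Jordan with partial pivoting
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for r in range(n):
            if r != c:
                m = A[r][c]
                A[r] = [a - m * b for a, b in zip(A[r], A[c])]
    return xs, [A[r][n] for r in range(n)]

exact = 2 * math.atan(5) / 5
errs = {}
for N in (12, 24):
    xs, ws = cc_rule(N)
    val = sum(w / (1 + 25 * x * x) for x, w in zip(xs, ws))
    errs[N] = abs(val - exact)
    print(N + 1, "points, error", errs[N])  # the error shrinks as N grows
```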
Given that the nodes are fixed for a given $N$, the weights for Clenshaw-Curtis quadrature can also be predetermined, and there are plenty of different expressions for them. If the node values $\cos(k\pi/N)$ are known exactly, then we can develop closed-form expressions for the weights. Suppose we take $N=6$, for which the nodes will be
$$-1,\quad -\frac{\sqrt{3}}{2},\quad -\frac{1}{2},\quad 0,\quad \frac{1}{2},\quad \frac{\sqrt{3}}{2},\quad 1.$$
We would want the integral to be exact for all polynomials up to degree 6, so that with weights $w_0, w_1, \dots, w_6$ we have
$$\sum_{k=0}^{6} w_k x_k^j = \int_{-1}^{1} x^j\,dx = \frac{(-1)^j+1}{j+1}$$
for $j = 0, 1, \dots, 6$. Again this can be done with a linear computation:
(1) -> N:=6
(2) -> xs:=[-cos(k*%pi/N) for k in 0..N]
(3) -> M:=matrix [[xs.i^j for i in 1..N+1] for j in 0..N]
(4) -> V:=vector [((-1)^k+1)/(k+1) for k in 0..N]
(5) -> inverse(M)*V
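Here is the same $N=6$ computation sketched in Python (floating point rather than exact arithmetic); the computed weights come out symmetric, all positive, and summing to 2:

```python
import math

N = 6
xs = [-math.cos(k * math.pi / N) for k in range(N + 1)]  # -1, -sqrt(3)/2, ..., 1

# Augmented Vandermonde system: exactness on x^j for j = 0..6
n = N + 1
A = [[x ** j for x in xs] + [((-1) ** j + 1) / (j + 1)] for j in range(n)]
for c in range(n):  # Gauss-Jordan with partial pivoting
    p = max(range(c, n), key=lambda r: abs(A[r][c]))
    A[c], A[p] = A[p], A[c]
    piv = A[c][c]
    A[c] = [a / piv for a in A[c]]
    for r in range(n):
        if r != c:
            m = A[r][c]
            A[r] = [a - m * b for a, b in zip(A[r], A[c])]
ws = [A[r][n] for r in range(n)]
print(ws)  # symmetric, all positive, summing to 2
```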
You can find (and it can be proved) that Clenshaw-Curtis weights are always positive, which makes Clenshaw-Curtis quadrature much more stable than Newton-Cotes, where the weights oscillate wildly in sign and magnitude as $N$ increases. To bring this point home, again choose $N=100$. We shall compute both the Newton-Cotes and the Clenshaw-Curtis weights, and plot them against the nodes. First Newton-Cotes:
(1) -> N:=100
(2) -> digits(100)
(3) -> xs := [-1.0+2.0*k/N for k in 0..N]
(4) -> M := matrix [[xs.i^j for i in 1..N+1] for j in 0..N];
(5) -> V:Vector Float := vector [((-1)^k+1)/(k+1) for k in 0..N];
(6) -> ws := column(rowEchelon(horizConcat(M,V)),N+2);
(7) -> draw(xs,ws)
As you see, the weights oscillate wildly, reaching enormous absolute values. For Clenshaw-Curtis, all we have to do is change the definition of the nodes; everything else is the same as above:
(1) -> N:=100
(2) -> digits(100)
(3) -> xs:List Float := [-cos(k*%pi/N) for k in 0..N]
(4) -> M := matrix [[xs.i^j for i in 1..N+1] for j in 0..N];
(5) -> V:Vector Float := vector [((-1)^k+1)/(k+1) for k in 0..N];
(6) -> ws := column(rowEchelon(horizConcat(M,V)),N+2);
(7) -> draw(xs,ws)
And as you also see, the weights are very nicely behaved. As a non-expert in the field of numerical analysis, I find it totally remarkable that so seemingly simple a change as the placement of the nodes can have such a profound impact. It also makes me wonder why Clenshaw-Curtis quadrature is hardly mentioned in numerical textbooks.
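The contrast can also be checked in ordinary double-precision Python, though only at a more modest size (the $N=100$ run above needs the 100-digit arithmetic; $N=20$ already shows the pattern):

```python
import math

def rule_weights(nodes):
    """Quadrature weights on the given nodes, by the same Vandermonde solve."""
    n = len(nodes)
    A = [[x ** j for x in nodes] + [((-1) ** j + 1) / (j + 1)]
         for j in range(n)]
    for c in range(n):  # Gauss-Jordan with partial pivoting
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for r in range(n):
            if r != c:
                m = A[r][c]
                A[r] = [a - m * b for a, b in zip(A[r], A[c])]
    return [A[r][n] for r in range(n)]

N = 20
nc_ws = rule_weights([-1 + 2 * k / N for k in range(N + 1)])
cc_ws = rule_weights([-math.cos(k * math.pi / N) for k in range(N + 1)])

print(min(nc_ws), max(nc_ws))  # Newton-Cotes: large, with negative weights
print(min(cc_ws), max(cc_ws))  # Clenshaw-Curtis: all positive and modest
```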