The Aberth-Ehrlich method
Recall from the previous post that the Weierstrass-Durand-Kerner (WDK) method finds all roots of a polynomial equation in a manner similar to Newton's method, and so converges quadratically. The Aberth-Ehrlich method (more commonly called Aberth's method), is similar, but converges cubically, and as we will see, is similar to Halley's method for root finding.
Derivation from Halley's method
Halley's root finding method is sometimes called the most often rediscovered method in numerical analysis. It is an iterative method defined by
\[ x \leftarrow x - \frac{f(x)f'(x)}{f'(x)^2 - \dfrac{1}{2}f(x)f''(x)}. \]
See the Wikipedia page for the derivation.
We can divide through by \(f(x)f'(x)\) to obtain the formulation
\[ x \leftarrow x - \frac{1}{\dfrac{f'(x)}{f(x)} - \dfrac{1}{2}\dfrac{f''(x)}{f'(x)}} = x - \left[\frac{f'(x)}{f(x)} - \frac{1}{2}\frac{f''(x)}{f'(x)}\right]^{-1}. \]
which is very slightly easier to write,
Suppose now we are trying to find all roots simultaneously of a monic quartic polynomial \(p(x)\), and that the current approximations are \(a,b,c,d\). And as before, define the temporary function
\[ t(x) = (x-a)(x-b)(x-c)(x-d). \]
Then we have
\[\begin{align*} t'(x) &= (x-a)(x-b)(x-c)+(x-a)(x-b)(x-d)\\ &\qquad +(x-a)(x-c)(x-d)+(x-b)(x-c)(x-d) \end{align*}\]
and
\[\begin{align*} t''(x) &= 2\bigl((x-a)(x-b)+(x-a)(x-c)+(x-a)(x-d)\bigr.\\ &\qquad +\bigl.(x-b)(x-c)+(x-b)(x-d)+(x-c)(x-d)\bigr). \end{align*}\]
Then
\[\begin{align*} \frac{t''(a)}{t'(a)} &= \frac{2\left((x-b)(x-c)+(x-b)(x-d)+(x-c)(x-d)\right)}{(x-b)(x-c)(x-d)}\\ & = 2\left(\frac{1}{x-b}+\frac{1}{x-c}+\frac{1}{x-d}\right). \end{align*}\]
Substituting this into the second fraction in the denominator above produces the iteration:
\[ a\leftarrow a - \left[\frac{p'(a)}{p(a)}-\left(\frac{1}{x-b}+\frac{1}{x-c}+\frac{1}{x-d}\right)\right]^{-1}. \]
More generally, if the current approximations are \(a_1,a_2,\ldots,a_n\), then
\[ a_k \leftarrow a_k - \left[\frac{p'(a_k)}{p(a_k)}-\sum\limits_{\substack{i=1\\i\ne k}}^n\frac{1}{a_k-a_i}\right]^{-1}. \]
This is the Aberth-Ehrlich method.
It will be seen that our derivation is quite general, and will work for polynomials of any degree. If the polynonmial is not monic, or if we don't want to divide through by the leading coefficient, the only difference is that we define
\[ t(x) = a_0(x-a)(x-b)(x-c)(x-d) \]
where \(a_0\) is the leading coefficient; in this case the coefficient of \(x^4\).
Implementation
As previously, we'll use PARI/GP, and write the method in the form
\[ a_k \leftarrow a_k - \left[\frac{p'(a_k)}{p(a_k)}-\frac{1}{2}\frac{t''(a_k)}{t'(a_k)}\right]^{-1} \]
where \(t(x)\) is the product of \(x - x_k\) for all current root approximations \(a_k\).
The function definition is very similar to that for WDK (note that we've included the use of the leading coefficient):
aberth(f,eps,N = 200) =
{
local(deg,xs,df,ys,dt,dfrac,single_diffs,mean_diffs_vector,mean_diffs,count);
deg = poldegree(f);
lead = pollead(f);
xs = vector(deg,k,(0.6 + 0.8*I)^k); \\ starting values
df = deriv(f); \\ derivative of f
ys = vector(deg); \\ next computed values
single_diffs = vector(deg);
mean_diffs = 1.0;
mean_diffs_vector = Vec([1.0]);
count = 1;
while(mean_diffs > eps && count < N,
t = lead*prod(k = 1,deg,x - xs[k]); \\ create temporary function t(x)
dt = deriv(t,x); ddt = deriv(dt,x); \\ compute its first two derivatives
for(k = 1, deg,
dtk = subst(dt,'x,xs[k]); ddtk = subst(ddt,'x,xs[k]);
fk = subst(f,'x,xs[k]); dfk = subst(df,'x,xs[k]);
ys[k] = xs[k] - 1/(dfk/fk - 0.5*ddtk/dtk);
single_diffs[k] = abs(ys[k] - xs[k]);
xs[k] = ys[k]
);
mean_diffs = vecsum(single_diffs)/deg;
mean_diffs_vector = concat(mean_diffs_vector,[mean_diffs]);
count = count + 1;
);
return([count,mean_diffs_vector,xs]);
}
To see how fast it converges, we'll try it out on a fifth degree polynomial:
\p2000
s = aberth(x^5 + x^2 - 7,0.1^1500);
printoutput(s)
which produces:
number of iterations: 12
1.188550919
0.9079189478
0.09240861551
0.001667886271
7.282133417 e-9
6.098846060 e-25
3.582735741 e-73
7.263000773 e-218
6.050900797 e-652
3.498902657 e-1954
1.384316604 + 0.e-3032I
0.5300508632 + 1.457706660I
-1.222209165 + 0.7797478348I
-1.222209165 - 0.7797478348I
0.5300508632 - 1.457706660I
You can see that the accuracy basically triples each time.