However, Newton's method loses some of its luster around repeated roots. Consider the iteration

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$
On inspection, points where $f'(x) = 0$ seem to be problematic. However, when these points are also roots, the division may no longer be a problem. The case where

$$f(x) = f'(x) = 0$$

is exactly the case of a repeated root.
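As a quick sanity check, here is a small sketch (using the double root $x = 1$ of the example $f(x) = (x - 1)^2$) showing that both the numerator and the denominator of the Newton correction vanish at a repeated root:

```python
# A double root makes both f and its derivative vanish.
# Here f(x) = (x - 1)^2, so f'(x) = 2(x - 1).
def f(x):
    return (x - 1.0) ** 2

def fprime(x):
    return 2.0 * (x - 1.0)

# At the repeated root x = 1, the numerator AND denominator of the
# Newton correction f(x) / f'(x) are both zero.
print(f(1.0), fprime(1.0))  # 0.0 0.0
```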
For a practical example, consider $f(x) = x^2 - 2x + 1 = (x - 1)^2$, which has corresponding iteration function

$$x_{n+1} = x_n - \frac{\left(x_n - 1\right)^2}{2\left(x_n - 1\right)} = \frac{x_n + 1}{2}.$$
Starting with $x_0 = 2$, in exact arithmetic (i.e. no rounding) we'll always have $x_n = 1 + 2^{-n}$ since

$$x_{n+1} = \frac{x_n + 1}{2} = \frac{\left(1 + 2^{-n}\right) + 1}{2} = 1 + 2^{-(n+1)}.$$
This sequence converges to $1$ and the error halves every term. However, in double precision, the sequence stops once $n = 27$: at that point $x_{28} = x_{27} = 1 + 2^{-27}$.
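Taking the iteration to be $x_{n+1} = (x_n + 1)/2$ with $x_0 = 2$ (values consistent with the 27-bit analysis below), the exact-arithmetic behavior can be checked with Python's `fractions` module:

```python
from fractions import Fraction

# Iterate x_{n+1} = (x_n + 1) / 2 exactly, starting from x_0 = 2.
x = Fraction(2)
for n in range(5):
    x = (x + 1) / 2
    # The error relative to the root x = 1 halves every term.
    assert x - 1 == Fraction(1, 2 ** (n + 1))
print(x)  # 33/32, i.e. 1 + 2^-5
```

Because every term is rational, the exact sequence never reaches the root; it just keeps halving the distance to it.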
However, if we told the computer that $f(x) = (x - 1)^2$, there would be no reason to use Newton's method: the roots are clearly $1$ and $1$. Instead, the "data" the computer is given are the coefficients in $f(x) = x^2 - 2x + 1$ and each term is computed in floating point

$$x_{n+1} = x_n \ominus \left(f(x_n) \oslash f'(x_n)\right)$$
where the numerator and denominator are given (with the help of Horner's method) by

$$f(x_n) = \left(x_n \ominus 2\right) \otimes x_n \oplus 1, \qquad f'(x_n) = 2 \otimes x_n \ominus 2.$$
Each floating point operation ($\oplus$, $\ominus$, $\otimes$, $\oslash$) rounds the "exact" result to the nearest floating point number. For $n \leq 26$, there is no rounding needed and we get the same values that we got in exact arithmetic.
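One way to spot-check this is to run the floating point recurrence for $f(x) = x^2 - 2x + 1$ side by side with exact rational arithmetic; a minimal sketch:

```python
from fractions import Fraction

def float_step(x):
    # Horner-style evaluation of f(x) = x^2 - 2x + 1 and f'(x) = 2x - 2.
    fx = (x - 2.0) * x + 1.0
    fpx = 2.0 * x - 2.0
    return x - fx / fpx

x = 2.0
for n in range(1, 27):
    x = float_step(x)
    # Up through n = 26, no rounding occurs: the double is exactly 1 + 2^-n.
    assert Fraction(x) == 1 + Fraction(1, 2**n)
```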
The real problem happens when $n = 27$. At that point

$$\left(x_{27} \ominus 2\right) \otimes x_{27} = \left(2^{-27} - 1\right)\left(1 + 2^{-27}\right) = 2^{-54} - 1$$

and the exact result needs to be rounded into floating point:

$$\mathrm{fl}\left(2^{-54} - 1\right) = -1,$$

which forces $f(x_{27}) = -1 \oplus 1 = 0$. After this point, the iteration stalls and every term after $x_{27}$ is equal to $1 + 2^{-27}$.
Thus when the algorithm terminates on a computer (since two consecutive terms are identical), the actual error is $2^{-27}$.
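The stall itself takes only a few lines of floating point arithmetic to reproduce, using the Horner form of $f(x) = x^2 - 2x + 1$:

```python
x = 1.0 + 2.0**-27

# Horner evaluation of f(x) = x^2 - 2x + 1: the intermediate product
# (2^-27 - 1)(1 + 2^-27) = 2^-54 - 1 rounds to exactly -1.0 ...
fx = (x - 2.0) * x + 1.0
print(fx)  # 0.0

# ... so the Newton correction is 0 / f'(x) = 0 and the iterate repeats.
fpx = 2.0 * x - 2.0
print(x - fx / fpx == x)  # True
```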
This error is much too large! Using the standard floating point type (an IEEE 754 double), the machine precision is $\varepsilon = 2^{-52}$ and our error $2^{-27}$ is approximately $\sqrt{\varepsilon}$. In other words, we expected 52 bits of precision and only get 27!
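The full analysis can be confirmed end to end by iterating until a value repeats; a minimal sketch with $f(x) = x^2 - 2x + 1$ and $x_0 = 2$:

```python
def newton_step(x):
    # f and f' for f(x) = x^2 - 2x + 1, evaluated in Horner form.
    fx = (x - 2.0) * x + 1.0
    fpx = 2.0 * x - 2.0
    return x - fx / fpx

seen = []
x = 2.0
while x not in seen:
    seen.append(x)
    x = newton_step(x)

# 28 distinct iterates x_0, ..., x_27, stalled at 1 + 2^-27.
print(len(seen), x - 1.0)  # 28 7.450580596923828e-09
```

The loop runs the exact-arithmetic sequence faithfully for 27 steps, then the rounding described above freezes it in place.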
In general, a root repeated $m$ times (i.e. with multiplicity $m$) will only be accurate to roughly $52/m$ bits, a very large penalty!
We can see this phenomenon in action. We define the following function to perform one Newton step using Horner's method:
```python
def horner_newton(coeffs, val):
    """Perform one Newton step on the polynomial given by ``coeffs``.

    ``coeffs`` are ordered from highest degree to lowest; the polynomial
    and its derivative are both evaluated via Horner's method.
    """
    degree = len(coeffs) - 1
    deriv_factor = float(degree)
    fx = fpx = 0.0
    for i in range(degree):
        fx = val * fx + coeffs[i]
        fpx = val * fpx + deriv_factor * coeffs[i]
        deriv_factor -= 1.0
    # Add the last term to fx.
    fx = val * fx + coeffs[-1]
    return val - fx / fpx
```
Applying it to $f(x) = x^4 - 4 x^2 + 4 = \left(x^2 - 2\right)^2$, which has two double roots $\pm\sqrt{2}$, we get
```python
>>> import math
>>> values = [1.0]
>>> coeffs = [1, 0, -4, 0, 4]
>>> next_val = horner_newton(coeffs, values[-1])
>>> while next_val not in values:
...     values.append(next_val)
...     next_val = horner_newton(coeffs, values[-1])
...
>>> len(values)
28
>>> values[-1]
1.4142135634821
>>> math.log(abs(values[-1] - math.sqrt(2)), 2)
-29.74808712981571
```
i.e. the error is around $2^{-30}$, consistent with the roughly $52/2 = 26$ bits we expect for a double root.
Trying out $f(x) = x^3 - 3 x^2 + 3 x - 1 = (x - 1)^3$, which has a triple root at $x = 1$, we get
```python
>>> import math
>>> values = [2.0]
>>> coeffs = [1, -3, 3, -1]
>>> next_val = horner_newton(coeffs, values[-1])
>>> while next_val not in values:
...     values.append(next_val)
...     next_val = horner_newton(coeffs, values[-1])
...
>>> len(values)
31
>>> values[-1]
1.0000046685597561
>>> math.log(abs(values[-1] - 1), 2)
-17.70859102007032
```
with an error around $2^{-52/3} \approx 2^{-17.3}$, as predicted.