Finding zeros of any old function is a common task, and Newton's method is one of the best tools for carrying it out. I've even written a post that used this method.

However, Newton's method loses some of its luster around repeated roots. Consider the iteration

$x_{n + 1} = x_n - \frac{f(x_n)}{f'(x_n)}$

On inspection, points where $f'(x_n) = 0$ seem to be problematic. However, when these points are also roots, the division $0 / 0$ may no longer be a problem. The case where

$f(x_n) = f'(x_n) = 0$

is **exactly** the case of a repeated root.

For a practical example, consider $f(x) = (x - 1)^2$ which has corresponding iteration function

$g(x) = x - \frac{(x - 1)^2}{2(x - 1)} = \frac{x + 1}{2}.$

Starting with $x_0 = 1 + 2^{-0} = 2,$ in exact arithmetic (i.e. no rounding) we'll always have $x_n = 1 + 2^{-n}$ since

$\frac{1 + x_n}{2} = \frac{2 + 2^{-n}}{2} = 1 + 2^{-n-1}.$

This sequence converges to $1$ and the error halves every term. However, in double precision the sequence stops once $n = 53$: at that point the exact value $1 + 2^{-53}$ rounds to $1$ (i.e. $1 \oplus 2^{-53} = 1$), and $1$ is a fixed point of the iteration.
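We can watch this happen directly (a quick sketch iterating the simplified map $g(x) = (x + 1)/2$ in double precision):

```
# Iterate g(x) = (x + 1) / 2 in double precision, starting
# from x_0 = 2. In exact arithmetic this never reaches 1.
x = 2.0
steps = 0
while x != 1.0:
    x = (x + 1.0) / 2.0
    steps += 1
print(steps)  # 53: rounding lands the iterate exactly on the root
```

The loop terminates because rounding eventually collapses $1 + 2^{-53}$ onto $1$, after which the map stays put.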

However, if we told the computer that $f(x) = (x - 1)^2,$ there would be no reason to use Newton's method: the roots are clearly $1$ and $1$. Instead, the "data" the computer is given are the coefficients of $x^2 - 2x + 1$, and each term is computed in floating point:

$x_{n + 1} = x_n \ominus \left(N_n \oslash D_n\right),$

where the numerator and denominator are given (with the help of Horner's method) by

$N_n = \left((x_n \ominus 2) \otimes x_n\right) \oplus 1, \quad D_n = \left(2 \otimes x_n\right) \ominus 2.$

Each floating point operation $(\otimes, \oslash, \oplus, \ominus)$ rounds the "exact" result to the nearest floating point number. For $x_1, \, \ldots, \, x_{27},$ there is no rounding needed and we get the same values that we got in exact arithmetic.

The **real problem** happens when $n = 27$. At that point

$(x_n \ominus 2) \otimes x_n = \left(-\left(1 - 2^{-27}\right)\right) \otimes \left(1 + 2^{-27}\right)$

and the exact result $-\left(1 - 2^{-54}\right)$ needs to be rounded into floating point:

$N_{27} = \left(-(1 \ominus 2^{-54})\right) \oplus 1 = \left(-1\right) \oplus 1 = 0$

which forces $x_{28} = x_{27} \ominus \left(0 \oslash D_{27}\right) = x_{27}$. After this point the iteration stalls: every subsequent term equals $x_{27}$.
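We can confirm the stall numerically (a small sketch; the helper name `quadratic_step` is mine, not from the post):

```
def quadratic_step(x):
    # One Newton step for x^2 - 2x + 1, with numerator and
    # denominator evaluated by Horner's method.
    numerator = (x - 2.0) * x + 1.0
    denominator = 2.0 * x - 2.0
    return x - numerator / denominator

x = 2.0
while quadratic_step(x) != x:
    x = quadratic_step(x)

# The iteration stalls at 1 + 2**-27, not at the root 1.
print(x == 1.0 + 2.0 ** -27)  # True
```

The fixed point is exactly $1 + 2^{-27}$ because the Horner-evaluated numerator rounds to $0$ there while the denominator is the (nonzero) value $2^{-26}$.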

Thus when the algorithm terminates on a computer (because two consecutive terms are identical), the actual error is $2^{-27}$. This error is **much too large**! Using the standard floating point type (`double`), the machine precision is $\varepsilon = 2^{-52}$ and our error is approximately $\sqrt{\varepsilon}$. In other words, we **expected** 52 bits of precision and **only got** 27!

In general, a root repeated $m$ times — i.e. with **multiplicity** $m$ — will only be accurate to roughly $\frac{52}{m}$ bits, a very large penalty! Intuitively, near such a root $r$ we have $\left|f(x)\right| \approx C\left|x - r\right|^m$, but evaluating $f$ in floating point carries absolute noise on the order of $\varepsilon$, so the root can't be located more precisely than $\left|x - r\right| \approx \varepsilon^{1/m}$.

We can see this phenomenon in action. First, define the following function to do one Newton step using Horner's method:

```
def horner_newton(coeffs, val):
    degree = len(coeffs) - 1
    deriv_factor = float(degree)
    fx = fpx = 0.0
    for i in range(degree):
        fx = val * fx + coeffs[i]
        fpx = val * fpx + deriv_factor * coeffs[i]
        deriv_factor -= 1.0
    # Add the last term to fx.
    fx = val * fx + coeffs[-1]
    return val - fx / fpx
```

Applying it to $(x^2 - 2)^2 = x^4 - 4x^2 + 4$ — which has two double roots — we get

```
>>> import math
>>> values = [1.0]
>>> coeffs = [1, 0, -4, 0, 4]
>>> next_val = horner_newton(coeffs, values[-1])
>>> while next_val not in values:
...     values.append(next_val)
...     next_val = horner_newton(coeffs, values[-1])
...
>>> len(values)
28
>>> values[-1]
1.4142135634821
>>> math.log(abs(values[-1] - math.sqrt(2)), 2)
-29.74808712981571
```

i.e. the error is around $2^{-29} \approx \sqrt{\varepsilon}$.

Trying it out with a triple root $(x - 1)^3 = x^3 - 3 x^2 + 3x - 1$ we get

```
>>> values = [2.0]
>>> coeffs = [1, -3, 3, -1]
>>> next_val = horner_newton(coeffs, values[-1])
>>> while next_val not in values:
...     values.append(next_val)
...     next_val = horner_newton(coeffs, values[-1])
...
>>> len(values)
31
>>> values[-1]
1.0000046685597561
>>> math.log(abs(values[-1] - 1), 2)
-17.70859102007032
```

with an error around $2^{-17} \approx \sqrt[3]{\varepsilon}$ as predicted.
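As a further check on the $52/m$ prediction, we can try a quadruple root. The sketch below inlines the same Horner-based step rather than reusing `horner_newton`, applies it to $(x - 1)^4 = x^4 - 4x^3 + 6x^2 - 4x + 1$, and should see only about $52/4 = 13$ bits of accuracy:

```
import math

# Coefficients of (x - 1)^4, a root of multiplicity 4.
coeffs = [1.0, -4.0, 6.0, -4.0, 1.0]

def newton_step(val):
    # Same Horner-based Newton step as horner_newton above.
    degree = len(coeffs) - 1
    deriv_factor = float(degree)
    fx = fpx = 0.0
    for c in coeffs[:-1]:
        fx = val * fx + c
        fpx = val * fpx + deriv_factor * c
        deriv_factor -= 1.0
    fx = val * fx + coeffs[-1]
    return val - fx / fpx

values = [2.0]
next_val = newton_step(values[-1])
while next_val not in values:
    values.append(next_val)
    next_val = newton_step(values[-1])

# Expect roughly -13, i.e. about 52 / 4 bits of accuracy.
print(math.log(abs(values[-1] - 1.0), 2))
```

The exact number of bits recovered varies a little with the rounding noise, but it should land near $\varepsilon^{1/4} = 2^{-13}$ rather than anywhere close to full precision.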