Four Root-Finding Algorithms
for 1-D Functions

Dr. Phillip M. Feldman, 29 Dec, 2014


0. Abstract

Three of the most widely used numerical algorithms for finding roots of 1-D functions are the bisection method, the modified Regula Falsi method (which I refer to simply as Regula Falsi), and Newton's method (also known as the Newton-Raphson method).

The bisection method is efficient for functions that are rough (irregular), while Regula Falsi and Newton's method are efficient for functions that are smooth. Many functions that are encountered in engineering and applied physics are rough at large scale but smooth at small scale. This suggests that an adaptive algorithm that combines bisection with Regula Falsi might perform better than either one of these alone.

It is sometimes necessary to repeatedly solve for roots of functions that have similar general behavior, i.e., one may have a family of functions such that the scale at which the transition from rough to smooth occurs is essentially the same for all members of the family. In such cases, a root-finding algorithm that provides tunable parameters can offer significant run-time performance gains.

I compare the bisection and secant algorithms against a novel root-finding algorithm that is both adaptive and tunable. While no root-finding algorithm is universally best, for a number of simple example functions, my algorithm performs better than either bisection or secant. Python implementations of all of these algorithms are provided.

1. Introduction

Practical problems in engineering, science, finance, and other domains often involve finding roots, i.e., finding the value or values of \(x\)—the input to a function \(f\) of a single variable—such that the output of the function is zero. A problem in which the desired output is a constant value other than zero, or in which the outputs of two single-variable functions are to be made equal to one another, can be trivially reformulated as one in which we seek the root of a single function: we simply define a new function as the difference between the original function and the target constant (or between the two functions). (In a more general form of the problem, the function may have multiple inputs and multiple outputs, and one must find a combination of inputs that simultaneously drives all outputs to zero. I do not consider such problems here.)

In some applications of root-finding, the function is expressed in a simple form and directly evaluated. Consider the following geometry problem: "A right triangle has sides of lengths \(a\), \(b\), and \(c\), where \(c= 1\) is the length of the hypotenuse and \(b= 3 a\). Find the larger of the two non-right angles, using neither the Pythagorean Theorem nor inverse trigonometric functions". This problem can be solved by finding the root of the following function of \(\theta\), the unknown angle in degrees:

    \(\sin((\pi/180) \cdot \theta)- 3 \cdot \cos((\pi/180) \cdot \theta)= 0\).

(The answer is approximately 71.56505 degrees.)
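
As a quick sanity check, the following short, self-contained Python snippet—my own illustration, not part of the find_roots module—locates this root by repeatedly halving a bracketing interval (the helper name g is mine):

import math

def g(theta):
   # f(theta) = sin(theta) - 3*cos(theta), with theta in degrees
   return math.sin(math.radians(theta)) - 3.0*math.cos(math.radians(theta))

# g(0) = -3 and g(90) = 1, so the interval [0, 90] brackets the root.
lo, hi= 0.0, 90.0
for _ in range(50):
   mid= 0.5*(lo + hi)
   if g(lo)*g(mid) <= 0.0:
      hi= mid
   else:
      lo= mid

print(0.5*(lo + hi))   # prints approximately 71.56505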

In other applications of root-finding, the function has no closed-form expression. Examples include functions that are evaluated by performing numerical integration or via simulation. In such cases, the evaluation of \(f\) may be expensive, i.e., require a large amount of calculation.

Many numerical algorithms have been developed for finding the roots of a function in one variable. Some of these, such as Newton's method, require calculating the derivative (slope) of the function, either from an analytical formula or via numerical approximation. Such algorithms may fail if the derivative is undefined at one or more points. Here, I consider only algorithms that do not depend on derivatives, i.e., \(f\) must be continuous, but need not have a derivative anywhere.

Inputs to the typical root-finding algorithm include:

  • one or two starting values of \(x\),
  • tolerances on \(x\), \(f\), or both, and
  • a limit on the number of iterations.

The algorithm iterates until either the tolerances are satisfied or the iteration limit has been reached.

Desirable properties of a root-finding algorithm include the following:

  1. guaranteed convergence to a root when given a pair of starting values of \(x\) that bracket a root,
  2. rapid convergence, and
  3. low complexity.

The complexity of the root-finding algorithm itself is typically not very important. When \(f\) is a simple function that can be evaluated with minimal computation, almost any root-finding algorithm that works (converges to the desired root) is acceptable. In the more interesting situation where the evaluation of \(f\) is expensive—this is the case that I consider here—speed of convergence may become critical. As we shall see, properties 2 and 3 are not in general simultaneously achievable, i.e., one may be forced to sacrifice one for the sake of the other.

2. The Bisection Method

Inputs to the bisection method (and most other root-finding algorithms) include two starting values of \(x\), \(a\) and \(b\), which are sometimes called initial guesses. The bisection algorithm requires that \(f(a)\) and \(f(b)\) have opposite signs. (This can be viewed as a drawback of this algorithm.) Under the assumption that \(f\) is continuous, \(a\) and \(b\) bracket a root, i.e., the interval from \(\min(a, b)\) to \(\max(a, b)\) is guaranteed to contain a root. More precisely, this interval is guaranteed to contain an odd number of roots.

In the \(n\)th iteration, the interval between the current \(a\) and \(b\) is bisected, i.e., the midpoint \(x_n= (a+b)/2\) is calculated. \(f_n= f(x_n)\) is evaluated, the signs of the products \(f_n \cdot f(a)\) and \(f_n \cdot f(b)\) are examined to determine which subinterval contains the root, and the values of \(a\) and \(b\) are then updated accordingly. (If the root falls exactly at the midpoint, i.e., \(f_n= 0\), it does not matter which subinterval we choose.) The following figure illustrates the operation of the algorithm.

Figure 1: Diagram of the Bisection Algorithm

The algorithm iterates until tolerances are satisfied. Most root-finding algorithms count iterations and terminate if the count exceeds a specified limit. The above figure does not include such a test.

My Python implementation of this algorithm (see download link below) is called find_root_bisection.
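
For reference, here is a minimal sketch of the bisection loop as just described. It is illustrative only, and is not the actual find_root_bisection code; the function name, the simple tolerance tests, and the iteration-limit handling are my own:

def bisection_sketch(f, a, b, xtol=1e-6, ftol=1e-6, max_steps=100):
   fa, fb= f(a), f(b)
   if fa*fb > 0.0:
      raise ValueError("f(a) and f(b) must have opposite signs.")
   for _ in range(max_steps):
      x= 0.5*(a + b)   # bisect the current interval
      fx= f(x)
      # Terminate if either tolerance is satisfied.
      if abs(fx) <= ftol or abs(b - a) <= xtol:
         return x
      # Keep the subinterval whose endpoint values have opposite signs.
      if fa*fx <= 0.0:
         b, fb= x, fx
      else:
         a, fa= x, fx
   raise RuntimeError("Iteration limit reached without convergence.")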

3. The Secant Method

The secant method (Ref. 1), like the bisection method, takes initial guesses \(a\) and \(b\) as inputs. In each iteration, \(f(a)\) and \(f(b)\) are evaluated and a straight line is fitted through the points \((a, f(a))\) and \((b, f(b))\):

   \(y= f(a) + (x-a) \cdot \frac{f(b)-f(a)}{b-a}\)

We can now set \(y= 0\) and solve for the value \(c\) where this line intersects the x-axis:

   \(c= a - f(a) \cdot \frac{b-a}{f(b)-f(a)}\).

\(a\) and \(b\) are then updated as follows:

   \(a \leftarrow b\)
   \(b \leftarrow c\).

When the secant method is given a pair \((a, b)\) that brackets a root sufficiently closely—it is difficult to specify how close is close enough—the algorithm will not only converge to that root, but converge more rapidly than bisection search. (I will say more about this later). But, while the bisection method guarantees convergence, the secant method does not.

My Python implementation of this algorithm (see download link below) is called find_root_secant.
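
The update rule can be sketched as follows, using the formulas above. Again, this is illustrative only, and is not the actual find_root_secant code:

def secant_sketch(f, a, b, xtol=1e-6, ftol=1e-6, max_steps=100):
   fa, fb= f(a), f(b)
   for _ in range(max_steps):
      if fb == fa:
         raise RuntimeError("The slope is too close to zero!")
      # Intersection of the line through (a, f(a)) and (b, f(b)) with the x-axis:
      c= a - fa*(b - a)/(fb - fa)
      fc= f(c)
      if abs(fc) <= ftol or abs(c - b) <= xtol:
         return c
      a, fa= b, fb   # a <- b
      b, fb= c, fc   # b <- c
   raise RuntimeError("Iteration limit reached without convergence.")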

4. The Regula Falsi Method

The Regula Falsi method (Ref. 2) is similar to the secant method, but differs as follows.

If the pair \((a, b)\) brackets a root, the updated pair is chosen according to the following rules:

  • If \(f(a) \cdot f(c)\) is negative, \([\min(a, c), \max(a, c)]\) brackets a root, and the updated pair is \((a, c)\).
  • Otherwise, the updated pair is \((b, c)\).

If the pair \((a, b)\) does not bracket a root, the updated pair is \((b, c)\).

The benefit of this modification is that if at any iteration the algorithm brackets a root, it will continue to bracket that same root, i.e., convergence is guaranteed. As we shall see, the price of this reliability is that the fast convergence of the secant method is sacrificed.

My Python implementation of this algorithm (see download link below) is called find_root_Regula_Falsi.
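
A corresponding sketch differs from the secant sketch only in the update rules; as before, this is my illustration rather than the actual find_root_Regula_Falsi code:

def regula_falsi_sketch(f, a, b, xtol=1e-6, ftol=1e-6, max_steps=100):
   fa, fb= f(a), f(b)
   for _ in range(max_steps):
      if fb == fa:
         raise RuntimeError("The slope is too close to zero!")
      c= a - fa*(b - a)/(fb - fa)   # same secant-line step as before
      fc= f(c)
      if abs(fc) <= ftol or abs(c - b) <= xtol:
         return c
      if fa*fb < 0.0 and fa*fc < 0.0:
         b, fb= c, fc   # (a, c) brackets a root: keep a
      else:
         a, fa= b, fb   # otherwise update as in the secant method
         b, fb= c, fc
   raise RuntimeError("Iteration limit reached without convergence.")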

5. An Adaptive, Tunable Root-Finding Algorithm

My root-finding algorithm is a hybrid of the Regula Falsi and bisection methods. Each iteration of the algorithm uses either bisection or Regula Falsi. The following parameters control the behavior of the algorithm:

  • a and b: These are the starting values of \(x\).

  • contraction_factor: This parameter controls the application of the bisection method. When a pair of consecutive x's brackets a root, and the ratio of the new interval's length to that of the old exceeds this factor, the algorithm applies the bisection method at the next iteration and then reverts to using Regula Falsi. Allowed values of contraction_factor range from 0.5 to 1. The default value is \(0.7071 \approx 1/\sqrt{2}\), chosen so that the algorithm will take no more than twice as many steps as bisection in the worst case. This value appears to typically work well, but may be tuned as needed.

  • use_bisection: The algorithm uses the bisection method in the initial use_bisection steps. This argument equals zero by default.

My goals in designing this algorithm were the following:

  1. to provide, like Regula Falsi, more reliable convergence than the secant method, and in particular, to provide guaranteed convergence to a root when the initial \(a\) and \(b\) are such that \(f(a)\) and \(f(b)\) have opposite signs (which means that \(a\) and \(b\) bracket a root).

  2. to provide faster convergence than bisection search when f() is close to a straight line, or becomes close to a straight line after a small number of bisection steps.

  3. to provide faster convergence than Regula Falsi when \(f\) is irregular on a large scale but smooth at smaller scales.

My Python implementation of this algorithm (see download link below) is called find_root.
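
To make the control flow concrete, here is a rough sketch of how the pieces described above might fit together. This is my reconstruction from the description, not the actual find_root implementation; handling of non-bracketing inputs, verbose output, and call counting is omitted, and the name find_root_sketch is mine:

def find_root_sketch(f, a, b, xtol=1e-6, ftol=1e-6,
                     contraction_factor=0.7071, use_bisection=0, max_steps=100):
   # NOTE: a simplified reconstruction of the hybrid logic, not find_root itself.
   fa, fb= f(a), f(b)
   bisect_next= False
   for step in range(1, max_steps+1):
      if step <= use_bisection or bisect_next or fb == fa:
         c= 0.5*(a + b)                 # bisection step
         bisect_next= False
      else:
         c= a - fa*(b - a)/(fb - fa)    # Regula Falsi step
      fc= f(c)
      old_length= abs(b - a)
      # Update (a, b) using the Regula Falsi rules from Section 4.
      if fa*fb < 0.0 and fa*fc < 0.0:
         b, fb= c, fc
      else:
         a, fa= b, fb
         b, fb= c, fc
      if abs(fc) <= ftol or abs(b - a) <= xtol:
         return c
      # If a bracketing interval contracted too slowly, bisect next time.
      if fa*fb < 0.0 and abs(b - a) > contraction_factor*old_length:
         bisect_next= True
   raise RuntimeError("Iteration limit reached without convergence.")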

6. Examples and Discussion

6.1 Example #1

Consider the following function, which has a simple root at x= 0.7:

    \(f(x)= 2(x-0.7) + 0.03(x-0.7)^3\)

from find_roots import *

def f(x):
   return 2.*(x-0.7) + 0.03*(x-0.7)**3

print("\nTesting `find_root_bisection` ...")
x= find_root_bisection(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

print("\nTesting `find_root_secant` ...")
x= find_root_secant(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

print("\nTesting `find_root` with default value of `contraction_factor` ...")
x= find_root(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

print("\nTesting `find_root` with `contraction_factor= 1.0` ...")
x= find_root(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, contraction_factor=1.0, verbose=True)
print('%6f' % x)

Testing `find_root_bisection` ...
Convergence achieved after 23 steps and 25 calls.
0.700000

Testing `find_root_secant` ...
Convergence achieved after 4 steps and 6 calls.
0.700000

Testing `find_root` with default value of `contraction_factor` ...
After step 1.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 2, following 3 calls, using bisection.
After step 3.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 4, following 5 calls, using bisection.
After step 5.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 6, following 7 calls, using bisection.
After step 7.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 8, following 9 calls, using bisection.
After step 9.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 10, following 11 calls, using bisection.
After step 11.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 12, following 13 calls, using bisection.
Convergence achieved after 14 steps and 16 calls.
0.700000

Testing `find_root` with `contraction_factor= 1.0` ...
Convergence achieved after 11 steps and 13 calls.
0.700000

Discussion

  • find_root_secant achieves the most rapid convergence, followed by find_root, with find_root_bisection in last place. The performance of find_root is slightly improved by using a non-default value of the contraction_factor parameter.

6.2 Example #2

Consider the following function, which has a root of multiplicity 4 at x= 0.7:

    \(f(x)= (x-0.7)^4\)

The function touches but does not cross the x-axis.

from find_roots import *

def f(x):
   return (x-0.7)**4

print('\nTesting `find_root_bisection` ...')
try:
   x= find_root_bisection(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
   print('%6f' % x)
except Exception as ex:
   print(ex)

print('\nTesting `find_root_secant` ...')
x= find_root_secant(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

print('\nTesting `find_root` with default value of `contraction_factor` ...')
x= find_root(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

Testing `find_root_bisection` ...
This function requires that f(x1) and f(x2) have opposite signs.  You may try using `find_root`, which does not have this restriction.

Testing `find_root_secant` ...
Convergence achieved after 52 steps and 54 calls.
0.699996

Testing `find_root` with default value of `contraction_factor` ...
Convergence achieved after 52 steps and 54 calls.
0.699996

Discussion

  • find_root_secant and find_root converge in exactly the same number of steps, while find_root_bisection fails because \(f(x_1)\) and \(f(x_2)\) are both positive.
  • So far, find_root appears to have no advantage over find_root_secant.

6.3 Example #3

Consider the following function, which is piecewise-linear with three sections:

     \(f(x)= \left \{ \begin{array}{cl} -1 & x < -1 \\ x & -1 \le x \le 1 \\ 1 & x > 1 \end{array} \right .\)

from find_roots import *
import numpy

def f(x):
   return numpy.clip(x, -1, 1)

print('\nTesting `find_root_bisection` ...')
try:
   x= find_root_bisection(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
   print('%6f' % x)
except Exception as ex:
   print(ex)

print("\nTesting `find_root_secant` ...")
try:
   x= find_root_secant(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
   print('%6f' % x)
except Exception as ex:
   print(ex)

print("\nTesting `find_root` with default value of `contraction_factor` ...")
x= find_root(f, 0.6, 6.0, xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

Testing `find_root_bisection` ...
This function requires that f(x1) and f(x2) have opposite signs.  You may try using `find_root`, which does not have this restriction.

Testing `find_root_secant` ...
At step 5, the slope is too close to zero!

Testing `find_root` with default value of `contraction_factor` ...
Convergence achieved after 6 steps and 8 calls.
0.000000

Discussion

  • find_root_bisection and find_root_secant both fail.
  • find_root achieves rapid convergence.

6.4 Example #4

Consider the following function: \(f(x)= x e^{-|x|}\). \(f\) is continuous and has a first derivative everywhere. The function has peaks on either side of the single root, which occurs at x= 0.

from find_roots import *
from numpy import exp

def f(x):
   return x * exp(-abs(x))

print('\nTesting `find_root_bisection` ...')
x= find_root_bisection(f, -0.5, 10., xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

print("\nTesting `find_root_secant` ...")
try:
   x= find_root_secant(f, -0.5, 10., xtol=1e-6, ftol=1e-6, verbose=True)
   print('%6f' % x)
except Exception as ex:
   print(ex)

print("\nTesting `find_root` with default value of use_bisection ...")
x= find_root(f, -0.5, 10., xtol=1e-6, ftol=1e-6, verbose=True)
print('%6f' % x)

print("\nTesting `find_root` with use_bisection=2 ...")
x= find_root(f, -0.5, 10., xtol=1e-6, ftol=1e-6, use_bisection=2, verbose=True)
print('%6f' % x)

Testing `find_root_bisection` ...
Convergence achieved after 24 steps and 26 calls.
-0.000000

Testing `find_root_secant` ...
At step 191, the slope is too close to zero!

Testing `find_root` with default value of use_bisection ...
After step 1.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 2, following 3 calls, using bisection.
After step 3.  Converging too slowly.  ==> Temporarily switching to bisection.
At step 4, following 5 calls, using bisection.
Convergence achieved after 11 steps and 13 calls.
-0.000000

Testing `find_root` with use_bisection=2 ...
At step 1, following 2 calls, using bisection.
At step 2, following 3 calls, using bisection.
Convergence achieved after 9 steps and 11 calls.
-0.000000

Discussion

  • find_root_secant fails.

  • find_root and find_root_bisection both converge.

  • find_root with default parameters requires half as many function calls as find_root_bisection.

  • With find_root, the number of function calls is modestly reduced by performing two bisection search steps at the outset.

7. Conclusions

  • find_root and find_root_secant are more flexible and convenient to use than find_root_bisection because they do not require that the initial values of x bracket a root.

  • find_root is more reliable than find_root_secant.

  • In situations where find_root and find_root_bisection are both applicable, find_root often converges more rapidly.

8. Source Code


9. References

  1. http://en.wikipedia.org/wiki/Secant_method

  2. Joanna M. Papakonstantinou and Richard A. Tapia, "Origin and Evolution of the Secant Method in One Dimension", The American Mathematical Monthly, Vol. 120, No. 6, June–July 2013, pp. 500-518.