|
|
|
|
Mastering Algorithms with CBy Kyle Loudon1st Edition August 1999 1-56592-453-3, Order Number: 4533 560 pages, $34.95, Includes Diskette |
Chapter 13 Numerical Methods
Numerical methods are algorithms in numerical analysis. Numerical analysis is the study of problems in which numbers and approximation play an especially significant role. Computers are particularly well-suited to problems in numerical analysis because many such problems, while essentially involving common mathematical operations, require a lot of them. In the early days of computing, scientists monopolized computers with problems like this, which were far too intensive to be carried out by hand. Even today, problems in numerical analysis still occupy a good part of the cycles of some of the largest computers in the world. Hence, numerical analysis is a vast subject, and many numerical methods are as complicated and specific as the mathematical problems they solve. This chapter presents three numerical methods that are relatively simple but applicable to a wide variety of problems.This chapter covers:
- Polynomial interpolation
- A method of approximating values of a function for which values are known at only a few points. Fundamental to this method is the construction of an interpolating polynomial pn(z) of degree ≤ n, where n + 1 is the number of points for which values are known.
- Least-squares estimation
- A method of determining estimators b1 and b0 for a function y (x) = b1x + b0 so that y (x) is a best-fit line through a set of n points (x0, y0), . . ., (xn - 1, yn - 1). A best-fit line using least-squares estimation minimizes the sum of squared vertical distances between each point (xi , yi) , i = 0, . . ., n - 1, and a corresponding point (xi , y (xi ) ) along y (x).
- Solution of equations
- The process of finding roots of equations having the form f (x) = 0. Whereas for some equations it is possible to determine exact roots, a great deal of the time a method of approximation must be used.
Some applications of numerical methods are:
- Linear regression models
- Statistical models in which there is a linear-form relationship between an independent variable x and a variable y that depends on it. Least-squares estimators help to predict values of y for values of x we have not observed experimentally.
- Curve fitting
- The process of fitting a curve to a number of points. If the points for which we have values are located at meaningful places on the curve we are trying to fit, and we know values at enough points, interpolation helps us draw a smooth curve.
- Scatter plots
- Statistical tools that help ascertain the relationship between an independent variable x and a variable y that depends on it. Using least-squares estimators to draw a best-fit line through a linear-form scatter plot helps with this.
- Approximating functions
- The process of determining the value of a function at points for which exact values are not known. This can be done by constructing an interpolating polynomial of the appropriate degree.
- Function tables
- Tables containing values of computationally expensive functions or models of complicated physical phenomena. Often it is too costly to compute and store values of a function with the granularity required at some later time. Thus, we store a limited number of points and interpolate between them.
- Scientific computing
- An area in which solving equations is one of the most fundamental problems routinely performed.
Description of Polynomial Interpolation
There are many problems that can be described in terms of a function. However, often this function is not known, and we must infer what we can about it from only a small number of points. To do this, we interpolate between the points. For example, in Figure 13-1, the known points along f (x) are x0, . . ., x8, shown by circular black dots. Interpolation helps us get a good idea of the value of the function at points z0, z1, and z2, shown by white squares. This section presents polynomial interpolation.
Figure 13-1. Interpolation with nine points to find the value of a function at other points
![]()
Fundamental to polynomial interpolation is the construction of a special polynomial called an interpolating polynomial. To appreciate the significance of this polynomial, let's look at some principles of polynomials in general. First, a polynomial is a function of the form:
![]()
where a0, . . ., an are coefficients. Polynomials of this form are said to have degree n, provided an is nonzero. This is the power form of a polynomial, which is especially common in mathematical discussions. However, other forms of polynomials are more convenient in certain contexts. For example, a form particularly relevant to polynomial interpolation is the Newton form:
![]()
where a0, . . ., an are coefficients and c1, . . ., cn are centers. Notice how when c1, . . ., cn are all 0, the Newton form of a polynomial reduces to the power form above.
Constructing an Interpolating Polynomial
Now that we understand a bit about polynomials, let's look at how to construct the polynomial that interpolates a function f (x). To interpolate f (x), a polynomial pn(z) of degree ≤ n is constructed using n + 1 points, x0, . . ., xn, known along f (x). The points x0, . . ., xn are called interpolation points. Using pn(z), we approximate the value of f (x) at x=z. Interpolation requires that point z be on the interval [x0, xn]. pn(z) is constructed using the formula:
![]()
where x0, . . ., xn are the points along f (x) for which values are known, and f [ x0], . . ., f [x0, . . ., xn] are divided differences, which are derived from x0, . . ., xn and the values of f (x) at these points. This is called the Newton formula for interpolating polynomials. Notice its similarities with the Newton form of polynomials in general. Divided differences are computed using the formula:
![]()
Notice that this formula shows that for divided differences when i < k we must have computed other divided differences beforehand. For example, to compute f [x0, x1, x2, x3], values are required for f [x1, x2, x3] and f [x0, x1, x2] in the numerator. Fortunately, we can use a divided-difference table to help compute divided differences in a systematic manner (see Figure 13-2).
A divided-difference table consists of several rows. The top row stores x0, . . ., xn. The second row stores values for f [x0], . . ., f [xn]. To compute each divided difference in the remainder of the table, we draw a diagonal from each divided difference back to f [xi ] and f [xj ] (shown as dotted lines for f [x1, x2, x3] in Figure 13-2). To get xi and xj in the denominator, we then proceed straight up from f [xi ] and f [xj ]. The two divided differences in the numerator are those immediately above the one being computed. When the table is complete, the coefficients for the interpolating polynomial are the divided differences at the far left of each row, beginning with the second row (shown in light gray in Figure 13-2).
Figure 13-2. A divided-difference table for determining the coefficients of an interpolating polynomial of degree 3
![]()
Evaluating an Interpolating Polynomial
Once we have determined the coefficients of the interpolating polynomial pn(z), we evaluate the polynomial once for each point at which we would like to know the value of f. For example, say we know the values of f at four points: x0 = -3.0, f (x0) = -5.0; x1 = -2.0, f (x1) = -1.1; x2 = 2.0, f (x2) = 1.9; and x3 = 3.0, f (x3) = 4.8; and we would like to know the value of f at z0 = -2.5, z1 = 0.0, z2 = 1.0, and z3 = 2.5. Since we know four points along f, the interpolating polynomial will have a degree of 3. Figure 13-3 is the divided-difference table for determining the coefficients of p3(z).
Figure 13-3. A divided-difference table producing the coefficients -5.0, 3.9, -0.63, and 0.1767
![]()
Once we have obtained the coefficients from the divided-difference table, we construct p3(z) using the Newton formula for interpolating polynomials presented earlier. Using the coefficients from Figure 13-3, the interpolating polynomial is:
![]()
Next, we evaluate this polynomial once at each point z. For example, at z0 = -2.5 we perform the following calculation:
![]()
The value of f at z1, z2, and z3 is determined in a similar manner. The results are tabulated and plotted in Figure 13-4.
Figure 13-4. Interpolating a function f (x) using the polynomial p3(z) presented in the text
![]()
Now that we have an understanding of how to interpolate a function, it is important to briefly mention the subject of error. As with any approximation method, it is important to understand that an interpolating polynomial usually has some amount of error associated with it. To minimize this error, qualitatively speaking, we must construct an interpolating polynomial using enough points along f (x), and ones properly spaced, so that the resulting polynomial gives an accurate impression of the function's character. Naturally, quantitative methods do exist for bounding the error associated with interpolation, but this book will not address them (see the related topics at the end of the chapter).
Interface for Polynomial Interpolation
interpol
int interpol (const double *x, const double *fx, int n, double *z, double *pz,
int m);Return Value
0 if interpolating is successful, or -1 otherwise.
Description
Determines the value of a function at specified points using polynomial interpolation. Points at which values are known are specified by the caller in x. The known values of the function at each point in x are specified in fx. Points at which values are to be determined are specified in z. The values calculated for the points passed in z are returned in pz. The number of values in x and fx is specified as n. The number of points in z (and thus returned in pz) is specified as m. It is the responsibility of the caller to manage the storage associated with x, fx, z, and pz.
Complexity
O (mn2), where m is the number of values to determine and n is the number of points at which values are known.
Implementation and Analysis of Polynomial Interpolation
Polynomial interpolation works fundamentally by determining the value of an interpolating polynomial at a number of desired points. To obtain this polynomial, first we must determine its coefficients by computing divided differences.
The interpol operation begins by allocating space for the divided differences as well as for the coefficients to be determined (see Example 13-1). Note that since the entries in each row in a divided-difference table depend only on the entries computed in the row before it (see Figures 13-2 and 13-3), we do not have to keep all of the table around at once. Thus, we allocate space only for the largest row, which has n entries. Next, we initialize the first row in the table with the values in fx. This is so that we are ready to compute what equates to the third row of the divided-difference table. (Nothing needs to be done for the first two rows because these entries are already stored in x and fx.) The final initialization step is to store the value of fx[0] in coeff[0] since this is the first coefficient of the interpolating polynomial.
The process of computing divided differences revolves around a single nested loop, which uses the formula for divided differences discussed earlier in the chapter. In terms of Figures 13-2 and 13-3, the outer loop, k, counts the number of rows for which entries must be computed (excluding the rows for x and fx). The inner loop, i, controls which entry is being computed in the current row. As we complete the entries in each row, the value in table [0] becomes the next coefficient for the interpolating polynomial. Thus, we store this value in coeff[k]. Once we have determined all coefficients for the interpolating polynomial, we evaluate the polynomial at each point in z. The results are stored in pz.
The runtime complexity of interpol is O (mn2), where m is the number of values in z (and values returned in pz), and n is the number of values in x (and fx). The factor n2 comes from the following. For each iteration of the loop controlled by j, we multiply one factor more than the previous term into the current term. Thus, when j is 1, term requires one multiplication; when j is 2, term requires two multiplications, and so forth until when j is n - 1, term requires n - 1 multiplications. Effectively, this becomes a summation from 1 to n - 1, which results in a running time of T (n) = (n (n + 1)/2) - n, times some constant amount of time. (This is from the well-known formula for summing an arithmetic series from 1 to n.) In O-notation, this simplifies to O (n2). The factor m in O (mn2) comes from evaluating the interpolating polynomial once for each point in z. The first nested loop, in which divided differences are computed, is O (n2). Thus, this term is not significant relative to mn2, which has the additional factor m.
Example 13-1: Implementation of Polynomial Interpolation/***************************************************************************** * * * ----------------------------- interpol.c ---------------------------------* * * *****************************************************************************/ #include <stdlib.h> #include <string.h> #include "nummeths.h" /***************************************************************************** * * * ------------------------------ interpol -------------------------------- * * * *****************************************************************************/ int interpol(const double *x, const double *fx, int n, double *z, double *pz, int m) { double term, *table, *coeff; int i, j, k; /***************************************************************************** * * * Allocate storage for the divided-difference table and coefficients. * * * *****************************************************************************/ if ((table = (double *)malloc(sizeof(double) * n)) == NULL) return -1; if ((coeff = (double *)malloc(sizeof(double) * n)) == NULL) { free(table); return -1; } /***************************************************************************** * * * Initialize the coefficients. * * * *****************************************************************************/ memcpy(table, fx, sizeof(double) * n); /***************************************************************************** * * * Determine the coefficients of the interpolating polynomial. * * * *****************************************************************************/ coeff[0] = table[0]; for (k = 1; k < n; k++) { for (i = 0; i < n - k; i++) { j = i + k; table[i] = (table[i + 1] - table[i]) / (x[j] - x[i]); } coeff[k] = table[0]; } free(table); /***************************************************************************** * * * Evaluate the interpolating polynomial at the specified points. * * * *****************************************************************************/ for (k = 0; k < m; k++) { pz[k] = coeff[0]; for (j = 1; j < n; j++) { term = coeff[j]; for (i = 0; i < j; i++) term = term * (z[k] - x[i]); pz[k] = pz[k] + term; } } free(coeff); return 0; }Description of Least-Squares Estimation
Least-squares estimation determines estimators b1 and b0 for a function y (x) = b1x + b0 so that y (x) is a best-fit line through a set of n points (x0, y0 ), . . ., (xn - 1, yn - 1). A best-fit line using least-squares estimation minimizes the sum of squared vertical distances between each point (xi, yi ), i = 0, . . ., n - 1 and a corresponding point (xi, y (xi )) along y (x). This is one way of defining a line so that each point (xi, yi ) is as close as possible to it.
Perhaps the most important application of least-squares estimation is to make inferences about a linear-form relationship between two variables. Given an independent variable x and a variable y that depends on it, estimators b1 and b0 allow us to calculate the expected value of y at values of x for which we have not actually observed y. This is particularly meaningful when x and y are related by a statistical relationship, which is an inexact relationship. For example, imagine how the number of new employees hired each month at a consulting firm is related to the number of hours the firm bills. Generally, as the firm hires more employees, it will bill more hours. However, there is not an exact number of hours it bills for a given number of employees. Contrast this with a functional relationship, which is exact. For example, a functional relationship might be one between the amount of money the firm charges for a project and the time the project requires. This relationship is exact if we assume that given a project of a certain length, there is an exact amount of money the firm will charge.
To understand how least-squares estimation works, recall that the distance r between two points (x1, y1) and (x2, y2) is defined as:
![]()
Since the points (xi, yi ) and (xi, y (xi )) have the same x-coordinate, the line between them is vertical. Consequently, this formula tells us that the distance between these points is simply the difference in y-coordinates, or yi - y (xi ) . This difference is called the deviation of yi at xi .
Consider for a moment why the squared deviations are used to compute b1 and b0 , and not simply the deviations themselves. The reason is primarily anachronistic. When we minimize the sum of the errors, we end up with simultaneous equations that are linear. Before computers, these were the easiest to solve. Another justification can be made on the basis of probability theory. Simply stated, the probability that b1 and b0 are optimal for the observed values of (xi, yi ) is proportional to a negative exponential containing the sum of all (yi - y (xi ))2. Thus, when we minimize the summation of squared deviations, we maximize the probability that b1 and b0 are good estimators as well. Yet another justification is that by squaring the deviations, more emphasis is given to larger deviations. Since in a normal distribution there are fewer instances of large deviations, this gives more weight to the deviations that occur less frequently.
To compute b1 and b0, we use the following formulas, where x and y are the coordinates of n points. These are derived from the simultaneous equations we mentioned above but did not show. The Σ (sigma) symbol in the formulas is used as a concise way of saying "sum all."
![]()
Figure 13-5 illustrates computing b1 and b0 for a set of n = 9 points (x0, y0 ), . . ., (x8, y8 ). The results of the calculations that need to be performed appear in the table. Using the values from the table, b1 and b0 are calculated using:
![]()
Figure 13-5. Least-squares estimation and the best-fit line that results
![]()
Substituting these values into y (x) = b1x + b0 yields y (x) = 0.5519x - 0.0249. Figure 13-5 plots this line with the points used to determine it. From the standpoint of least-squares estimation, no other line is a better fit for the data than this one.
Interface for Least-Squares Estimation
lsqe
void lsqe(const double *x, const double *y, int n, double *b1, double *b0);
Return Value
None.
Description
Uses least-squares estimation to obtain b1 and b0 in y (x) = b1x + b0 so that y (x) is a best-fit line through a set of points. The x-coordinates of the points are specified in x. The y-coordinates are specified in y. The number of points is specified in n. The operation returns the appropriate values in b1 and b0.
Complexity
O (n), where n is the number of points used in determining b1 and b0.
Implementation and Analysis
of Least-Squares EstimationThe implementation of least-squares estimation presented here requires us to do little more than compute a few summations and apply the results to the formulas presented earlier. The operation begins by summing all values for xi in sumx, all values for yi in sumy, all values of xi 2 in sumx2, and all values of xi yi in sumxy (see Example 13-2). Once we have completed this, we compute b1 and b0 using the formulas presented earlier.
The runtime complexity of lsqe is O (n), where n is the number of points used to determine b1 and b0. This is because a single loop that iterates n times is used to compute the summations.
Example 13-2: Implementation of Least-Squares Estimation/***************************************************************************** * * * -------------------------------- lsqe.c -------------------------------- * * * *****************************************************************************/ #include <math.h> #include "nummeths.h" /***************************************************************************** * * * --------------------------------- lsqe --------------------------------- * * * *****************************************************************************/ void lsqe(const double *x, const double *y, int n, double *b1, double *b0) { double sumx, sumy, sumx2, sumxy; int i; /***************************************************************************** * * * Compute the required summations. * * * *****************************************************************************/ sumx = 0.0; sumy = 0.0; sumx2 = 0.0; sumxy = 0.0; for (i = 0; i < n; i++) { sumx = sumx + x[i]; sumy = sumy + y[i]; sumx2 = sumx2 + pow(x[i], 2.0); sumxy = sumxy + (x[i] * y[i]); } /***************************************************************************** * * * Compute the least-squares estimators. * * * *****************************************************************************/ *b1 = (sumxy - ((sumx * sumy)/(double)n)) / (sumx2-(pow(sumx,2.0)/(double)n)); *b0 = (sumy - ((*b1) * sumx)) / (double)n; return; }Description of the Solution of Equations
One of the most fundamental problems in scientific computing is solving equations of the form f (x) = 0. This is often referred to as finding the roots, or zeros, of f (x). Here, we are interested in the real roots of f (x), as opposed to any complex roots it might have. Specifically, we will focus on finding real roots when f (x) is a polynomial.
Finding Roots with Newton's Method
Although factoring and applying formulas are simple ways to determine the roots of polynomial equations, a great majority of the time polynomials are of a large enough degree and sufficiently complicated that we must turn to some method of approximation. One of the best approaches is Newton's method. Fundamentally, Newton's method looks for a root of f (x) by moving closer and closer to it through a series of iterations. We begin by choosing an initial value x = x0 that we think is near the root we are interested in. Then, we iterate using the formula:
![]()
until xi + 1 is a satisfactory approximation. In this formula, f (x) is the polynomial for which we are trying to find a root, and f ' (x) is the derivative of f (x).
Computing the Derivative of a Polynomial
The derivative of a function is fundamental to calculus and can be described in many ways. For now, let's simply look at a formulaic description, specifically for polynomials. To compute the derivative of a polynomial, we apply to each of its terms one of two formulas:
![]()
where k is a constant, r is a rational number, and x is an unknown. The symbol d /dx means "derivative of," where x is the variable in the polynomial. For each term that is a constant, we apply the first formula; otherwise, we apply the second. For example, suppose we have the function:
![]()
In order to compute f ' (x), the derivative of f (x), we apply the second formula to the first three terms of the polynomial, and the first formula to the last term, as follows:
![]()
Sometimes it is necessary to compute higher-order derivatives as well, which are derivatives of derivatives. For example, the second derivative of f (x), written f '' (x), is the derivative of f ' (x). Similarly, the third derivative of f (x), written f ''' (x), is the derivative of f '' (x), and so forth. Thus, to compute the second derivative of f (x) in the previous equation, we compute the derivative of f ' (x), as follows:
![]()
Understanding the First and Second Derivative
Now let's look at what derivatives really mean. To use Newton's method properly, it is important to understand the meaning of the first and second derivative in particular.
The value of the first derivative of f (x) at some point x = x0 indicates the slope of f (x) at point x0; that is, whether f (x) is increasing (sloping upward from left to right) or decreasing (sloping downward). If the value of the derivative is positive, f (x) is increasing; if the value is negative, f (x) is decreasing; if the value is zero, f (x) is neither increasing nor decreasing. The magnitude of the value indicates how fast f (x) is increasing or decreasing. For example, Figure 13-6a depicts a function whose value increases within the shaded regions; thus, these are the regions where the first derivative is positive. The plot of the first derivative crosses the x-axis at the points where the slope of f (x) changes sign.
The value of the second derivative of f (x) at some point x = x0 indicates the concavity of f (x) at point x0, that is, whether the function is opening upward or downward. The magnitude of the value indicates how extreme the concavity is. In Figures 13-6a and 13-6c, the dotted line indicates the point at which the concavity of the function changes sign. This is the point at which the plot of the second derivative crosses the x-axis.
Another way to think of the value of the derivative of f (x) at some point x = c is as the slope of the line tangent to f (x) at c, expressed in point-slope form. The point-slope form of a line is:
![]()
Thus, if f (x) = x 3 - x 2 - 3x + 1.8 as shown in Figure 13-6a, the equation of the line tangent to f (x) at c = 1.5 as can be determined as follows. Figure 13-6d is a plot of this line along with f (x).
![]()
Figure 13-6. The meaning of the first and second derivatives of f (x)
![]()
Selecting an Initial Point for Newton's Method
Now that we understand a little about derivatives, let's return to Newton's method. Paramount to Newton's method is the proper selection of an initial iteration point x0. In order for Newton's method to converge to the root we are looking for, the initial iteration point must be "near enough" and on the correct side of the root we are seeking. There are two rules that must be followed to achieve this:
- Determine an interval [a, b ] for x0 where one and only one root exists. To do this, choose a and b so that the signs of f (a) and f (b) are not the same and f ' (x) does not change sign. If f (a) and f (b) have different signs, the interval contains at least one root. If the sign of f ' (x) does not change on [a, b ], the interval contains only one root because the function can only increase or decrease on the interval.
- Choose either x0 = a or x0 = b so that f (x0) has the same sign as f '' (x) on the interval [a, b ]. This also implies that f '' (x) does not change sign on the interval. Recall that the second derivative of f (x) is an indication of concavity. If f '' (x) does not change sign and x0 is chosen so that f (x0) has the same sign as f '' (x), each successive iteration of Newton's method will converge closer to the root on the interval [a, b ] (see Figure 13-7).
In each of the four parts of Figure 13-7, f (x) is shown as a heavy line, and a and b are shown as vertical dotted lines. If f (a) matches the criteria just given, iteration begins at a and tangent lines slope from a toward the root to which we would like to converge. If f (b) matches the criteria above, iteration begins at b and tangent lines slope from b toward the root to which we would like to converge.
Figure 13-7. Convergence of Newton's method
![]()
How Newton's Method Works
As an example, suppose we would like to find the roots of f (x) = x 3 - x2 - 3x + 1.8. Figure 13-8 illustrates that this function appears to have three roots: one on the interval [-2, -1], another on the interval [0, -1], and a third on the interval [2, 3]. Once we have an idea of the number and location of a function's roots, we test each interval against the rules for selecting an initial iteration point. To do this, we need to know the following:
![]()
Using this information, we see that the interval [-2, -1] satisfies the first rule because f (-2) = -4.2 and f (-1) = 2.8, and f ' (x) does not change sign on the interval: it is always positive. Considering this, we know there is, in fact, one and only one root on the interval [-2, -1]. To satisfy the second rule, we see that f '' (x) does not change sign on the interval: it is negative. We select x0 = -2 as the initial iteration point since f (-2) = -4.2 is also negative. Figure 13-8 illustrates calculating the root on this interval to within 0.0001 of its actual value. We end up iterating five times to obtain this approximation.
Figure 13-8. Calculating the three real roots of f (x) = x3 - x2 - 3x + 1.8 = 0 to within 0.0001 of their actual values
![]()
Moving to the root on the interval [0, 1], we see that the first rule is satisfied just as for the previous interval. However, the sign of f '' (x) is not constant on this interval; therefore, the interval does not satisfy the second rule. Suspecting that the root is closer to 1 than 0, we try the interval [0.5, 1] next, which corrects the problem. The first rule is satisfied because f (0.5) = 0.175 and f (1) = -1.2, and f ' (x) does not change sign on the interval: it is negative. To complete the second rule, we select x0 = 0.5 since f (0.5) = 0.175 is positive and has the same sign as f '' (x) over the interval [0.5, 1]. Figure 13-8 illustrates calculating the root on this interval to within 0.0001 of its actual value. We end up iterating four times to obtain this approximation. Calculating the third root proceeds in a similar manner.
Interface for the Solution of Equations
root
int root(double (*f)(double x), double (*g)(double x), double *x, int *n,
double delta)Return Value
0 if a root is found, -1 otherwise.
Description
Computes the root of f to which Newton's method converges given an initial iteration point. This point is specified in x[0]. The derivative of f is specified in g. The argument n is the maximum number of iterations to perform. The argument delta is the difference between successive approximations at which to stop iterating. Upon return, successive values of x calculated during the iteration process are returned in the x array. Upon return, n contains the number of values in array x. It is the responsibility of the caller to manage the storage associated with x.
Complexity
O (n), where n is the maximum number of iterations the caller wishes to perform.
Implementation and Analysis of the Solution of Equations
Recall that solving an equation of the form f (x) = 0 means finding its roots. The root operation locates the real root to which Newton's method converges given an initial iteration point.
The root operation revolves around a single loop (see Example 13-3), which calculates successive approximations using the Newton iteration formula. In the implementation presented here, f is the function for which we are approximating the root, and g is the derivative of f. After each iteration, we determine whether the current approximation of the root is satisfactory. An approximation is deemed satisfactory when the difference between it and that of the previous iteration is less than delta. If after n iterations a satisfactory root still has not been found, root terminates.
The runtime complexity of root is O (n), where n is the maximum number of iterations the caller wishes to perform. The worst case occurs when we do not find the root we are looking for.
Example 13-3: Implementation for the Solution of Equations/***************************************************************************** * * * -------------------------------- root.c -------------------------------- * * * *****************************************************************************/ #include <math.h> #include "nummeths.h" /***************************************************************************** * * * --------------------------------- root --------------------------------- * * * *****************************************************************************/ int root(double (*f)(double x), double (*g)(double x), double *x, int *n, double delta) { int satisfied, i; /***************************************************************************** * * * Use Newton's method to find a root of f. * * * *****************************************************************************/ i = 0; satisfied = 0; while (!satisfied && i + 1 < *n) { /************************************************************************** * * * Determine the next iteration of x. * * * **************************************************************************/ x[i + 1] = x[i] - (f(x[i]) / g(x[i])); /************************************************************************** * * * Determine whether the desired approximation has been obtained. * * * **************************************************************************/ if (fabs(x[i + 1] - x[i]) < delta) satisfied = 1; /************************************************************************** * * * Prepare for the next iteration. * * * **************************************************************************/ i++; } /***************************************************************************** * * * Even without iterating, indicate that one value has been stored in x. * * * *****************************************************************************/ if (i == 0) *n = 1; else *n = i + 1; /***************************************************************************** * * * Return whether a root was found or the maximum iterations were reached. * * * *****************************************************************************/ if (satisfied) return 0; else return -1; }Questions and Answers
- In the discussion of polynomial interpolation, we stated that we need to choose enough points to give an accurate impression of the function we are interpolating. What happens if we do not use enough points?
- Interpolating a function with not enough points, or poorly placed points, leads to an interpolating polynomial that does not accurately reflect the function we think we are interpolating. A simple example is interpolating a quadratic polynomial (a parabola when plotted) with only two points. Interpolation with two points results in a line, which is far from a parabola!
- Using the guidelines presented in this chapter, how many interpolation points should we use to interpolate the function f (x) = x 5 + 2.8x 3 - 3.3x 2 - x + 4.1?
- When interpolating a function that we know is a polynomial itself, we can get a good impression of the function by using n + 1 well-placed points, where n is the degree of the polynomial. In this example, the polynomial has a degree of 5, so we should use six well-placed interpolation points. This results in an interpolating polynomial that has the same degree as f (x).
- Recall that to approximate a root of an equation using Newton's method, we select an interval [a, b] on which the root exists and iterate closer and closer to it. What if we choose this interval much larger than needed, but in such a way that both rules mentioned in this chapter are still satisfied?
- The discussion of Newton's method mentioned two rules that must be satisfied in order to guarantee the algorithm's success: we need to determine an interval [a, b ] where one and only one root exists; and we need to choose x0, the initial iteration point, so that f (x0) has the same sign as f '' (x) over the interval. Provided these rules are satisfied, the interval [a, b ] can be as large as we would like to make it. However, Newton's method will require more iterations to converge if we use an interval that is excessively large. Therefore, typically a relatively small interval convenient to the problem should be chosen.
- In the implementation of root, what symptoms might we notice if we have violated one the rules of Newton's method that guarantee convergence?
- If we follow the rules presented in this chapter, Newton's method guarantees convergence to the root that exists on the interval [a, b ] containing the initial iteration point, x0. Various symptoms help to determine when we have violated these rules. For example, successive approximations that appear to be diverging instead of converging indicate a problem. Another symptom is convergence to a root other than the one we expect. For example, suppose we think there is a root near -2 (perhaps by plotting the function), but we end up finding a root near 9. In order to relay these symptoms back to the caller, root returns both an array of the approximations obtained in successive iterations of Newton's method and an array of values for f (x) computed using the approximations. Normally, successive values for f (x) should approach 0. The parameter n of the root operation provides a way to keep a divergent series from running too long.
Related Topics
- Error approximation
- An important part of more substantial work with numerical methods. Numerical analysis is replete with approximation methods, and inherent in any approximation is some amount of error. Often it is important to quantify this.
- Derivatives of functions
- A fundamental part of calculus. The numerical methods presented in this chapter required only a primitive understanding of derivatives. However, for many numerical methods, a more complete understanding of derivatives and calculus is essential.
- Muller's method
- An algorithm for finding both the real and complex roots of equations. Complex roots are complex numbers, which result from taking the square root of negative numbers. This chapter focused on finding real roots.
Back to: Mastering Algorithms with C
© 2001, O'Reilly & Associates, Inc.