Method Introduction
There are various methods for finding local minima of a function. This search for a locally lowest value is called numerical optimization, and it can take place in one dimension or in several; the multidimensional case is usually harder.
Here, only one-dimensional optimization is studied, and the methods adopted are the bisection method and Newton's method.
Both methods are iterative and only provide approximations, whose accuracy is controlled by a convergence criterion. Each method needs an initial estimate, and the methods themselves give no guidance on how to choose it. The following sections therefore study how the initial guess changes the convergence behavior of these two methods.
The function used here is f(x) = exp(-x)cos(x). The ranges selected for the bisection method are (1.2, 2.4), (1.4, 2.6), (1.6, 2.8), (1.8, 3.0), (2.0, 3.2), and (2.2, 3.4); the boundary values of each range serve as its initial estimates. The initial estimates selected for Newton's method are 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, and 3.4. These coincide with the boundary values used for the bisection method, which is designed to allow a direct comparison between the two methods.
A reference local minimum for the studied function is x = 2.356194490…; how many decimal places we keep is determined by the precision we set. The convergence criterion used here is 0.000001 (1E-06). The range selection for bisection stops at (2.2, 3.4) (the step between successive ranges is 0.2); otherwise the reference value would fall outside the range.
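To make the procedure concrete (the post itself gives no code, so all names below are mine), here is a minimal Python sketch of the bisection method as applied here: a minimum of f(x) = exp(-x)cos(x) is a zero of f'(x) = -exp(-x)(cos x + sin x), and the bracket is halved around the sign change of f' until its half-width drops below the criterion.

```python
import math

def fprime(x):
    # f(x) = exp(-x) * cos(x)  =>  f'(x) = -exp(-x) * (cos(x) + sin(x))
    return -math.exp(-x) * (math.cos(x) + math.sin(x))

def bisect_min(a, b, tol=1e-6):
    """Halve the bracket [a, b] around the sign change of f' until its
    half-width drops below tol; return the midpoint and the step count."""
    fa = fprime(a)
    steps = 0
    while (b - a) / 2.0 > tol:
        m = (a + b) / 2.0
        if fa * fprime(m) <= 0:    # sign change lies in the left half
            b = m
        else:                      # sign change lies in the right half
            a, fa = m, fprime(m)
        steps += 1
    return (a + b) / 2.0, steps

x_min, n_steps = bisect_min(1.2, 2.4)
print(x_min, n_steps)   # x_min is close to 2.356194
```

With a bracket of width 1.2 and a tolerance of 1E-06, this always takes 20 halvings, regardless of where the bracket sits.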
One subtlety is that the bisection method takes two initial estimates, which form the bracket for later iterations, while Newton's method needs only one. So when comparing the two methods, we must decide which boundary value to hand to Newton's method. Since Newton's method converges to a different local extremum than bisection when started at 3.0, 3.2, or 3.4, the smaller boundary value of each range (1.2, 1.4, 1.6, 1.8, 2.0, 2.2) is adopted for Newton's method in the range-by-range comparison.
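Newton's method can be sketched the same way: it iterates x_{k+1} = x_k - f'(x_k)/f''(x_k) with f''(x) = 2 exp(-x) sin(x), stopping when the step falls below the criterion. Again this is my own minimal sketch (including the max_iter guard), not the code used to produce the post's data.

```python
import math

def fprime(x):
    # f'(x) = -exp(-x) * (cos(x) + sin(x))
    return -math.exp(-x) * (math.cos(x) + math.sin(x))

def fsecond(x):
    # f''(x) = 2 * exp(-x) * sin(x)
    return 2.0 * math.exp(-x) * math.sin(x)

def newton_min(x0, tol=1e-6, max_iter=100):
    """Newton iteration on f'(x) = 0; stop when |step| < tol."""
    x = x0
    for steps in range(1, max_iter + 1):
        step = fprime(x) / fsecond(x)
        x -= step
        if abs(step) < tol:        # |x_k - x_(k-1)| below the criterion
            return x, steps
    raise RuntimeError("no convergence within max_iter steps")

x_min, n_steps = newton_min(1.2)
print(x_min, n_steps)   # x_min is close to 2.356194
```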
Calculation Results
The difference per step, ε, is defined as the x value in the current iteration minus the x value in the previous iteration.
The following figures show the calculation results. Full data can be obtained through the link attached at the end of this post.
Comparison of convergence behavior between the bisection method and Newton's method
Range (1.2, 2.4)
Range (1.2, 2.4) is chosen for the bisection method; the local minimum found is 2.356194.
For Newton's method, 1.2 is the initial estimate. The local minimum found is 2.356194.
Fig 1 shows the convergence behavior of the bisection method and Newton's method. Fig 2 shows a portion of Fig 1 to reveal the convergence detail.
Fig 1. difference of each step ε vs iteration steps at range (1.2, 2.4)
Fig 2. part of Fig 1 in order to show details of convergence
Both methods converge to the same local minimum under the convergence criterion of 1E-06, but Newton's method needs fewer steps to converge. The same pattern can be found for the following ranges.
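This pattern can be checked numerically for all six ranges. The sketch below is self-contained with my own function names, so the step counts may differ slightly from the post's spreadsheet; it counts the iterations each method needs under the 1E-06 criterion, taking each range's lower boundary as Newton's initial estimate, as described earlier.

```python
import math

def fprime(x):
    # f'(x) = -exp(-x) * (cos(x) + sin(x))
    return -math.exp(-x) * (math.cos(x) + math.sin(x))

def fsecond(x):
    # f''(x) = 2 * exp(-x) * sin(x)
    return 2.0 * math.exp(-x) * math.sin(x)

def bisect_steps(a, b, tol=1e-6):
    """Count halvings of the bracket around the sign change of f'."""
    fa, steps = fprime(a), 0
    while (b - a) / 2.0 > tol:
        m = (a + b) / 2.0
        if fa * fprime(m) <= 0:
            b = m
        else:
            a, fa = m, fprime(m)
        steps += 1
    return steps

def newton_steps(x0, tol=1e-6, max_iter=100):
    """Count Newton iterations on f'(x) = 0 until |step| < tol."""
    x = x0
    for steps in range(1, max_iter + 1):
        step = fprime(x) / fsecond(x)
        x -= step
        if abs(step) < tol:
            return steps
    raise RuntimeError("no convergence")

ranges = [(1.2, 2.4), (1.4, 2.6), (1.6, 2.8), (1.8, 3.0), (2.0, 3.2), (2.2, 3.4)]
counts = [(a, b, bisect_steps(a, b), newton_steps(a)) for a, b in ranges]
for a, b, nb, nn in counts:
    print(f"({a}, {b}): bisection {nb} steps, Newton {nn} steps")
```

In every range, the Newton count comes out smaller than the bisection count.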
Range (1.4, 2.6)
Range (1.4, 2.6) is chosen for bisection method. The local minimum is 2.356194.
For Newton’s method, the initial estimate is 1.4. The local minimum is 2.356194.
Fig 3 and 4 show the convergence properties of the two methods.
Fig 3. difference of each step ε vs iteration steps at range (1.4, 2.6)
Fig 4. part of Fig 3 in order to show details of convergence
Range (1.6, 2.8)
Range (1.6, 2.8) is chosen for bisection method. The local minimum is 2.356194.
For Newton’s method, the initial estimate is 1.6. The local minimum is 2.356194.
Fig 5 and 6 show the convergence properties of the two methods.
Fig 5. difference of each step ε vs iteration steps at range (1.6, 2.8)
Fig 6. part of Fig 5 in order to show details of convergence
Range (1.8, 3.0)
Range (1.8, 3.0) is chosen for bisection method. The local minimum is 2.356194.
For Newton’s method, the initial estimate is 1.8. The local minimum is 2.356194.
Fig 7 and 8 show the convergence properties of the two methods.
Fig 7. difference of each step ε vs iteration steps at range (1.8, 3.0)
Fig 8. part of Fig 7 in order to show details of convergence
Range (2.0, 3.2)
Range (2.0, 3.2) is chosen for bisection method. The local minimum is 2.356194.
For Newton’s method, the initial estimate is 2.0. The local minimum is 2.356194.
Fig 9 and 10 show the convergence properties of the two methods.
Fig 9. difference of each step ε vs iteration steps at range (2.0, 3.2)
Fig 10. part of Fig 9 in order to show details of convergence
Range (2.2, 3.4)
Range (2.2, 3.4) is chosen for bisection method. The local minimum is 2.356194.
For Newton’s method, the initial estimate is 2.2. The local minimum is 2.356194.
Fig 11 and 12 show the convergence properties of the two methods.
Fig 11. difference of each step ε vs iteration steps at range (2.2, 3.4)
Fig 12. part of Fig 11 in order to show details of convergence
Convergence behavior of the bisection method and Newton's method, studied separately
The Bisection method
Fig 13 shows the convergence behavior of the bisection method over the different ranges. When only one local minimum exists within the ranges, optimization over different ranges follows the same convergence path, which can be confirmed from the data in the attachment.
Fig 13. difference of each step ε vs iteration steps for bisection method at different ranges
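This is expected from the mechanics of bisection: if the iterate is taken as the bracket midpoint, |ε| is simply the bracket width divided by 4, 8, 16, …, independent of where the bracket sits. A small sketch (my own naming, assuming midpoint iterates) shows that two equal-width ranges produce the same |ε| sequence:

```python
import math

def fprime(x):
    # f'(x) = -exp(-x) * (cos(x) + sin(x))
    return -math.exp(-x) * (math.cos(x) + math.sin(x))

def eps_magnitudes(a, b, tol=1e-6):
    """Return |x_k - x_(k-1)| for the sequence of bracket midpoints."""
    fa = fprime(a)
    prev = (a + b) / 2.0
    eps = []
    while (b - a) / 2.0 > tol:
        if fa * fprime(prev) <= 0:   # keep the half containing the sign change
            b = prev
        else:
            a, fa = prev, fprime(prev)
        mid = (a + b) / 2.0
        eps.append(abs(mid - prev))
        prev = mid
    return eps

e1 = eps_magnitudes(1.2, 2.4)
e2 = eps_magnitudes(2.2, 3.4)
print(e1[:3], e2[:3])   # both sequences halve step by step, starting from 0.3
```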
Newton’s method
Besides 1.2, 1.4, 1.6, 1.8, 2.0, and 2.2, Newton's method reaches the same local minimum 2.356194 from the initial estimates 2.4, 2.6, and 2.8, so these initial guesses are included in the comparison shown in Fig 14. Only near the local minimum can we say that convergence is faster when the initial estimate is closer to it; when the initial estimate is far from the local minimum, this rule breaks down. For example, convergence is faster from the initial estimate 1.2 than from 2.8, even though 2.8 is closer to 2.356194. The explanation is that Newton's method follows the tangent of the first-order derivative of the original function to find points where that derivative is zero, so the search is determined by both the initial estimate and the shape of the derivative curve.
Fig 14. difference of each step ε vs iteration steps for Newton’s method at different initial estimates
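This comparison can be reproduced by running the Newton iteration from each initial estimate and recording the step count (again a sketch with my own names; the exact counts in the post's spreadsheet may differ slightly depending on how the first step is counted):

```python
import math

def fprime(x):
    # f'(x) = -exp(-x) * (cos(x) + sin(x))
    return -math.exp(-x) * (math.cos(x) + math.sin(x))

def fsecond(x):
    # f''(x) = 2 * exp(-x) * sin(x)
    return 2.0 * math.exp(-x) * math.sin(x)

def newton(x0, tol=1e-6, max_iter=100):
    """Newton iteration on f'(x) = 0; return (limit, step count)."""
    x = x0
    for steps in range(1, max_iter + 1):
        step = fprime(x) / fsecond(x)
        x -= step
        if abs(step) < tol:
            return x, steps
    raise RuntimeError("no convergence")

results = {x0: newton(x0) for x0 in (1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8)}
for x0, (xm, n) in results.items():
    print(f"x0 = {x0}: minimum {xm:.6f} in {n} steps")
```

All nine starting points end at the same minimum; only the step counts differ, and not monotonically with distance from 2.356194.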
Another local minimum found by Newton’s method
When the initial estimate is 3.0, 3.2, or 3.4, Newton's method converges to values different from 2.356194. Results are shown in Table 1. A convergence value can be a local minimum only when the second-order derivative there is positive, so for the initial estimates 3.0, 3.2, and 3.4 the convergence values are actually local maxima. Some negative initial estimates were then tried, and a new local minimum was found at -3.926990.
| initial estimate | convergence value | second-order derivative |
| --- | --- | --- |
| 1.2 – 2.8 | 2.356194 | 0.134 |
| 3.0 | -63.617251 | -0.613 |
| 3.2 | 11.780972 | -1.081 |
| 3.4 | 5.497787 | -0.005 |
| -3.0 | 2.356194 | 0.134 |
| -4.0 | -3.926990 | 71.776 |

Table 1. Some trials to find other local minima
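The classification in Table 1 can be reproduced with a plain Newton sketch, checking the sign of f''(x) = 2 exp(-x) sin(x) at the convergence value (my own naming throughout; the second-derivative magnitudes printed here need not match the column in Table 1, only the signs):

```python
import math

def fprime(x):
    # f'(x) = -exp(-x) * (cos(x) + sin(x))
    return -math.exp(-x) * (math.cos(x) + math.sin(x))

def fsecond(x):
    # f''(x) = 2 * exp(-x) * sin(x)
    return 2.0 * math.exp(-x) * math.sin(x)

def newton(x0, tol=1e-6, max_iter=200):
    """Newton iteration on f'(x) = 0; return the convergence value."""
    x = x0
    for _ in range(max_iter):
        step = fprime(x) / fsecond(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence")

kinds = {}
for x0 in (3.0, 3.2, 3.4, -3.0, -4.0):
    xc = newton(x0)
    # positive second derivative => local minimum, negative => local maximum
    kinds[x0] = (xc, "minimum" if fsecond(xc) > 0 else "maximum")
    print(f"x0 = {x0}: converges to {xc:.6f} ({kinds[x0][1]})")
```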
Conclusion
Comparing how fast convergence happens ('fast' in this post means needing fewer iteration steps) for the two methods makes sense only if they converge to the same value. So the initial estimates are chosen so that the bisection method and Newton's method converge to the same value. Under this condition, Newton's method converges faster, in other words, needs fewer iteration steps than the bisection method. Small changes to the initial estimate that do not change the convergence value still lead to the same observation: Newton's method converges in fewer iteration steps.
Considering the bisection method on its own, if the ranges chosen for the initial estimates contain only one minimum, changing the initial estimates does not change the convergence path, as can be seen in Fig 13.
For Newton's method, an initial estimate close to the local minimum tends to need fewer iteration steps, but a relatively distant initial estimate does not guarantee a convergence with more steps; the convergence behavior also depends on the function itself. For the function studied in this post, exp(-x)cos(x), we can consider variations such as exp(-ax)cos(x) or exp(-x)cos(ax), where a is a positive number. The function has two parts, 'exp' and 'cos': the first mainly determines the function's magnitude, especially when x is far from 0, and the second determines the oscillation period. Introducing the parameter a therefore produces different function curves and different local minima.
Here are some plots showing the curves of f(x) = exp(-x)cos(x) and its two variations in different ranges.
Fig 15. f(x) in range (1, 3), we can see the local minimum at 2.356194.
Fig 16. f(x) in range (-5, 5), we can see the local minimum at -3.926990.
Fig 17. f(x) in range (-100, 100), we can see another local minimum near -100.
Fig 18. f(x) and its variations: exp(-2x)cos(x) and exp(-x)cos(2x) in range (-5, 5)
Fig 19. f(x) and its variations: exp(-2x)cos(x) and exp(-x)cos(2x) in range (-100, 100)
From these plots, we can verify the local minima mentioned above for f(x) and see that f(x) and its variations have different curve shapes, which would affect the convergence of Newton's method.
Another observation is that if the initial estimate is far enough away, a new extremum might be found; whether it is a local minimum is determined by the sign of its second-order derivative.
Supporting information
The calculation data can be obtained through this link:
https://psu.box.com/s/h1ha1f482o6qbj0j3idk4b5nrmbxkhld