Mathematically, sufficient conditions for a local minimum of a smooth function are quite straightforward: is a local minimum if and the Hessian is positive definite. (It is a necessary condition that the Hessian be positive semidefinite.) The conditions for a root are even simpler. However, when the function is being evaluated on a computer where its value is only known, at best, to a certain precision, and practically only a limited number of function evaluations are possible, it is necessary to use error estimates to decide when a search has become close enough to a minimum or a root, and to compute the solution only to a finite tolerance. For the most part, these estimates suffice quite well, but in some cases, they can be in error, usually due to unresolved fine scale behavior of the function.
Tolerances affect how close a search will try to get to a root or local minimum before terminating the search. Assuming that the function itself has some error (as is typical when it is computed with numerical values), it is not typically possible to locate the position of a minimum much better than to half of the precision of the numbers being worked with. This is because of the quadratic nature of local minima. Near the bottom of a parabola, the height varies quite slowly as you move across from the minimum. Thus, if there is any error noise in the function, it will typically mask the actual rise of the parabola over a width roughly equal to the square root of the noise. This is best seen with an example.
From the sequence of plots, it is clear that for changes of order , which is about half of machine precision and smaller, errors in the function are masking the actual shape of the curve near the minimum. With just sampling of the function at that precision, there is no way to be sure if a given point gives the smallest local value of the function or not to any closer tolerance.
The value of the derivative, if it is computed symbolically, is much more reliable, but for the general case, it is not sufficient to rely only on the value of the derivative; the search needs to find a local minimal value of the function where the derivative is small to satisfy the tolerances in general. Note also that if symbolic derivatives of your function cannot be computed and finite differences or a derivative-free method is used, the accuracy of the solution may degrade further.
For the reason of this limitation, the default tolerances for the Find functions are all set to be half of the final working precision. Depending on how much error the function has, this may or may not be achievable, but in most cases it is a reasonable goal. You can adjust the tolerances using the AccuracyGoal and PrecisionGoal options. When AccuracyGoal->ag and PrecisionGoal->pg, this defines tolerances and .
Given and FindMinimum tries to find a value such that . Of course, since the exact position of the minimum, , is not known, the quantity is estimated. This is usually done based on past steps and derivative values. To match the derivative condition at a minimum, the additional requirement is imposed. For FindRoot, the corresponding condition is that just the residual be small at the root: .
In multiple dimensions, the situation is even more complicated since there can be more error in some directions than others, such as when a minimum is found along a relatively narrow valley, as in the Freudenstein–Roth problem. For searches such as this, often the search parameters are scaled, which in turn affects the error estimates. Nonetheless, it is still typical that the quadratic shape of the minimum affects the realistically achievable tolerances.
When you need to find a root or minimum beyond the default tolerances, it may be necessary to increase the final working precision. You can do this with the WorkingPrecision option. When you use WorkingPrecision->prec, the search starts at the precision of the starting values and is adaptively increased up to prec as the search converges. By default, WorkingPrecision->MachinePrecision, so machine numbers are used, which are usually much faster. Going to higher precision can take significantly more time, but can get you much more accurate results if your function is defined in an appropriate way. For very high-precision solutions, Newton's method is recommended because its quadratic convergence rate significantly reduces the number of steps ultimately required.
It is important to note that increasing the setting of the WorkingPrecision option does no good if the function is defined with lower-precision numbers. In general, for WorkingPrecision->prec to be effective, the numbers used to define the function should be exact or at least of precision prec. When possible, the precision of numbers in the function is artificially raised to prec using SetPrecision so that convergence still works, but this is not always possible. In any case, when the functions and derivatives are evaluated numerically, the precision of the results is raised to prec if necessary so that the internal arithmetic can be done with prec digit precision. Even so, the actual precision or accuracy of the root or minimum and its position is limited by the accuracy in the function. This is especially important to keep in mind when using FindFit, where data is usually only known up to a certain precision.
If you specify WorkingPrecision->prec, but do not explicitly specify the AccuracyGoal and PrecisionGoal options, then their default settings of Automatic will be taken to be AccuracyGoal->prec/2 and PrecisionGoal->prec/2. This leads to the smallest tolerances that can realistically be expected in general, as discussed earlier.
|WorkingPrecision||MachinePrecision||the final working precision, prec, to use; precision is adaptively increased from the smaller of prec and the precision of the starting conditions to prec|
|AccuracyGoal||Automatic||setting ag determines an absolute tolerance by tola=10-ag; when Automatic, ag=prec/2|
|PrecisionGoal||Automatic||setting pg determines an absolute tolerance by tolr=10-pg; when Automatic, pg=prec/2|
A search will sometimes converge slowly. To prevent slow searches from going on indefinitely, the Find commands all have a maximum number of iterations (steps) that will be allowed before terminating. This can be controlled with the option MaxIterations that has the default value MaxIterations->100. When a search terminates with this condition, the command will issue the cvmit message.
The Levenberg–Marquardt method is converging slowly on this problem because the residual is nonzero near the minimum and the second-order part of the Hessian is needed. While the method eventually does converge in just under 400 steps, perhaps a better option is to use a method which may converge faster.
In a larger calculation, one possibility when hitting the iteration limit is to use the final search point, which is returned, as a starting condition for continuing the search, ideally with another method.