Termination Conditions
Mathematically, sufficient conditions for a local minimum of a smooth function are quite straightforward: x^* is a local minimum if ∇f(x^*) = 0 and the Hessian ∇^2 f(x^*) is positive definite. (It is a necessary condition that the Hessian be positive semidefinite.) The conditions for a root are even simpler. However, when the function
f is being evaluated on a computer, where its value is known, at best, only to a certain precision, and where practically only a limited number of function evaluations are possible, it is necessary to use error estimates to decide when a search has gotten close enough to a minimum or a root, and to compute the solution only to a finite tolerance. For the most part, these estimates work quite well, but in some cases they can be in error, usually due to unresolved fine-scale behavior of the function.
Tolerances affect how close a search will try to get to a root or local minimum before terminating. Assuming that the function itself has some error (as is typical when it is computed numerically), it is usually not possible to locate the position of a minimum to much better than half of the precision of the numbers being worked with. This is because of the quadratic nature of local minima: near the bottom of a parabola, the height varies quite slowly as you move away from the minimum. Thus, if there is any error or noise in the function, it will typically mask the actual rise of the parabola over a width roughly equal to the square root of the noise. This is best seen with an example.
This loads a package that contains some utility functions. 
The following command displays a sequence of plots showing the minimum of the function over successively smaller ranges. The curve computed with machine numbers is shown in black; the actual curve (computed with 100 digits of precision) is shown in blue.
Out[2]=  

From the sequence of plots, it is clear that for changes of order 10^{-8}, which is about half of machine precision, and smaller, errors in the function are masking the actual shape of the curve near the minimum. With only sampling of the function at that precision, there is no way to be sure whether a given point gives the smallest local value of the function to any closer tolerance.
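The masking effect can be reproduced directly by sampling a quadratic written in a form that suffers cancellation; the function and offsets here are illustrative assumptions, not the function from the plots above.

```mathematica
(* Illustrative sketch: evaluate x^2 - 2 x + 1, which is exactly (x - 1)^2
   with minimum 0 at x = 1, near the minimum at machine precision and exactly *)
machineValues = Table[x^2 - 2 x + 1 /. x -> 1. + 10.^-k, {k, 6, 10}];
exactValues = N[Table[x^2 - 2 x + 1 /. x -> 1 + 10^-k, {k, 6, 10}]];
(* for k >= 8 the machine values are dominated by cancellation error of order
   machine epsilon, so they no longer track the exact values of order 10^(-2 k) *)
{machineValues, exactValues}
```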
The value of the derivative, if it is computed symbolically, is much more reliable, but in the general case it is not sufficient to rely on the derivative alone; to satisfy the tolerances, the search needs to find a locally minimal value of the function where the derivative is also small. Note also that if symbolic derivatives of your function cannot be computed and finite differences or a derivative-free method is used, the accuracy of the solution may degrade further.
Root finding can suffer from the same inaccuracies in the function. While the problem is typically not as severe for roots, some of the error estimates are based on a merit function, which does have a quadratic shape.
Because of this limitation, the default tolerances for the "Find" functions are all set to half of the final working precision. Depending on how much error the function has, this may or may not be achievable, but in most cases it is a reasonable goal. You can adjust the tolerances using the
AccuracyGoal and
PrecisionGoal options. When
AccuracyGoal -> ag and
PrecisionGoal -> pg, this defines tolerances
tol_{a} = 10^{-ag} and
tol_{r} = 10^{-pg}.
Given tol_{a} and tol_{r}, FindMinimum tries to find a value x_{k} such that ||x_{k} - x^{*}|| ≤ max(tol_{a}, ||x_{k}|| tol_{r}). Of course, since the exact position of the minimum x^{*} is not known, the quantity ||x_{k} - x^{*}|| is estimated. This is usually done based on past steps and derivative values. To match the derivative condition at a minimum, the additional requirement ||∇f(x_{k})|| ≤ tol_{a} is imposed. For FindRoot, the corresponding condition is just that the residual be small at the root: ||f|| ≤ tol_{a}.
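As a sketch of how these conditions can be checked after the fact, consider a root whose exact location is known; the function and goal here are chosen for illustration.

```mathematica
(* find a root of Sin[x] near 3 with a 10-digit accuracy goal *)
sol = FindRoot[Sin[x], {x, 3}, AccuracyGoal -> 10];
(* check both conditions: the position error relative to the known root Pi,
   and the size of the residual *)
{Abs[(x /. sol) - Pi], Abs[Sin[x /. sol]]}
```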
This finds the root to at least 12 digits of accuracy, or within a tolerance of 10^{-12}. The precision goal of Infinity means that tol_{r} = 0, so it does not have any effect in the formula. (Note that you cannot similarly set the accuracy goal to Infinity, since that is always used for the size of the residual.)
Out[3]=  

This shows that the result satisfied the requested error tolerances.
Out[4]=  

This tries to find the minimum of the function sin(x) cos(x) to 8 digits of accuracy. FindMinimum gives a warning message because of the error in the function, as seen in the plots.
Out[5]=  

This shows that though the value at the minimum was found to be essentially machine epsilon, the position was only found to the order of 10^{-8} or so.
Out[6]=  

In multiple dimensions, the situation is even more complicated, since there can be more error in some directions than others, such as when a minimum is found along a relatively narrow valley, as in the Freudenstein-Roth problem. For searches such as this, the search parameters are often scaled, which in turn affects the error estimates. Nonetheless, it is still typical that the quadratic shape of the minimum limits the realistically achievable tolerances.
When you need to find a root or minimum beyond the default tolerances, it may be necessary to increase the final working precision. You can do this with the
WorkingPrecision option. When you use
WorkingPrecision -> prec, the search starts at the precision of the starting values and is adaptively increased up to
prec as the search converges. By default,
WorkingPrecision -> MachinePrecision, so machine numbers are used, which are usually much faster. Going to higher precision can take significantly more time, but can give much more accurate results if your function is defined in an appropriate way. For very high-precision solutions,
Newton's method is recommended because its quadratic convergence rate significantly reduces the number of steps ultimately required.
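For instance, a high-precision root computation might look like the following; the particular function, starting point, and precision are illustrative assumptions.

```mathematica
(* with an exact starting value, the search can be carried to 50-digit
   working precision, so the default tolerances are 25 digits; Newton's
   method roughly doubles the number of correct digits at each step *)
sol = FindRoot[Sin[x], {x, 3}, WorkingPrecision -> 50, Method -> "Newton"];
(* the position error relative to the known root Pi *)
Abs[(x /. sol) - Pi]
```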
It is important to note that increasing the setting of the
WorkingPrecision option does no good if the function is defined with lower-precision numbers. In general, for
WorkingPrecision -> prec to be effective, the numbers used to define the function should be exact or at least of precision
prec. When possible, the precision of numbers in the function is artificially raised to
prec using
SetPrecision so that convergence still works, but this is not always possible. In any case, when the functions and derivatives are evaluated numerically, the precision of the results is raised to
prec if necessary so that the internal arithmetic can be done with
prec-digit precision. Even so, the actual precision or accuracy of the root or minimum and its position is limited by the accuracy of the function. This is especially important to keep in mind when using
FindFit, where data is usually only known to a certain precision.
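The effect of SetPrecision on a machine number can be seen directly; the padding digits come from the binary representation of the number, not from new information.

```mathematica
(* 1.3 is a machine number with a small binary representation error;
   raising its precision exposes that error rather than adding information *)
SetPrecision[1.3, 25]
(* the corresponding exact number raised to 25 digits has no such error *)
SetPrecision[13/10, 25]
```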
Here is a function defined using machine numbers. 
Even with higher working precision, the minimum cannot be resolved better because the actual function still has the same errors as shown in the plots. The derivatives were specified to keep other things consistent with the computation at machine precision shown previously.
Out[8]=  

Here is the computation done with 20-digit precision when the function does not have machine numbers.
Out[9]=  

If you specify
WorkingPrecision -> prec but do not explicitly specify the
AccuracyGoal and
PrecisionGoal options, then their default settings of
Automatic will be taken to be
AccuracyGoal -> prec/2 and
PrecisionGoal -> prec/2. This leads to the smallest tolerances that can realistically be expected in general, as discussed earlier.
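In other words, under the defaults just described, the two calls below request the same tolerances; the function and the choice of 40 digits are illustrative.

```mathematica
(* with WorkingPrecision -> 40, the Automatic goal settings default to
   20 digits, so these two calls are equivalent *)
FindRoot[Sin[x], {x, 3}, WorkingPrecision -> 40]
FindRoot[Sin[x], {x, 3}, WorkingPrecision -> 40,
 AccuracyGoal -> 20, PrecisionGoal -> 20]
```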
This shows that the value at the minimum was actually found even better than the default 25-digit tolerances.
Out[11]=  

The following table shows a summary of the options affecting precision and tolerance.
option name         default value
WorkingPrecision    MachinePrecision   the final working precision, prec, to use; precision is adaptively increased up to prec as the search converges
AccuracyGoal        Automatic          the accuracy goal; Automatic is taken to be prec/2, giving tolerance tol_a = 10^{-prec/2}
PrecisionGoal       Automatic          the precision goal; Automatic is taken to be prec/2, giving tolerance tol_r = 10^{-prec/2}

Precision and tolerance options in the "Find" functions.
A search will sometimes converge slowly. To prevent slow searches from going on indefinitely, the "Find" commands all have a maximum number of iterations (steps) that will be allowed before terminating. This can be controlled with the
MaxIterations option, which has the default value
MaxIterations -> 100. When a search terminates with this condition, the command issues the
cvmit message.
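A sketch of what hitting the limit looks like; the objective function and the deliberately small limit here are illustrative assumptions.

```mathematica
(* with a very small iteration limit, the search is unlikely to converge;
   when the limit is reached, FindMinimum issues the cvmit message and
   returns the best point found so far *)
FindMinimum[Cos[x] + x^2/100, {x, 10}, MaxIterations -> 2]
```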
This attempts to solve the problem with the default method, which is the Levenberg-Marquardt method, since the function is a sum of squares.
Out[13]=  

The Levenberg-Marquardt method converges slowly on this problem because the residual is nonzero near the minimum and the second-order part of the Hessian is needed. While the method does eventually converge in just under 400 steps, a better option may be to use a method that converges faster.
Out[44]=  
In a larger calculation, one possibility when hitting the iteration limit is to use the final search point, which is returned, as a starting point for continuing the search, ideally with another method.
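Restarting from the returned point can be sketched as follows; the objective function and the switch to Newton's method are assumptions for illustration.

```mathematica
(* an assumed example objective; stop the search early on purpose *)
f[x_] := (x^2 - 3)^2 + (x - 1)^2;
{val, sol} = Quiet[FindMinimum[f[x], {x, 10}, MaxIterations -> 3],
   FindMinimum::cvmit];
(* sol has the form {x -> xfinal}; use that value as the new start,
   here continuing with Newton's method *)
FindMinimum[f[x], {x, x /. sol}, Method -> "Newton"]
```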