Trust Region Methods
A trust region method maintains a region around the current search point within which the quadratic model (1) for local minimization is "trusted" to be correct, and steps are chosen to stay within this region. The size of the region is adjusted during the search, based on how well the model agrees with actual function evaluations.
Very typically, the trust region is taken to be an ellipse such that $\|D p\| \le \Delta$, where $D$ is a diagonal scaling (often taken from the diagonal of the approximate Hessian) and $\Delta$ is the trust region radius, which is updated at each step.
When the step based on the quadratic model alone lies within the trust region, then, assuming the function value gets smaller, that step will be chosen. Thus, just as with line search methods, the step control does not interfere with the convergence of the algorithm near to a minimum where the quadratic model is good. When the step based on the quadratic model lies outside the trust region, a step just up to the boundary of the trust region is chosen, such that the step is an approximate minimizer of the quadratic model on the boundary of the trust region.
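The step selection described above can be sketched in Python. This is only an illustration, not Mathematica's implementation: it uses the textbook dogleg construction as one possible way to approximate the model minimizer on the boundary, and it assumes a positive definite Hessian approximation `B`. The names `scaled_norm`, `dogleg_step`, `p_model`, and `d` are hypothetical.

```python
import math

def scaled_norm(p, d):
    """Return ||D p|| for a diagonal scaling D stored as the list d."""
    return math.sqrt(sum((di * pi) ** 2 for di, pi in zip(d, p)))

def dogleg_step(g, B, p_model, d, delta):
    """Pick a step p with ||D p|| <= delta for the model q(p) = g.p + p.B.p/2.

    g: gradient, B: symmetric positive definite matrix (list of rows),
    p_model: unconstrained model minimizer (-B^-1 g), supplied by the caller,
    d: positive diagonal scaling entries, delta: trust region radius.
    """
    # Case 1: the model step already lies inside the trust region.
    if scaled_norm(p_model, d) <= delta:
        return list(p_model)

    # Work in scaled variables s = D p, where the region is a sphere.
    n = len(g)
    gs = [g[i] / d[i] for i in range(n)]
    Bs_gs = [sum(B[i][j] / (d[i] * d[j]) * gs[j] for j in range(n))
             for i in range(n)]
    gg = sum(v * v for v in gs)
    gBg = sum(gs[i] * Bs_gs[i] for i in range(n))  # > 0 by assumption on B

    # Minimizer of the model along the scaled steepest descent direction.
    s_u = [-(gg / gBg) * v for v in gs]
    norm_u = math.sqrt(sum(v * v for v in s_u))

    if norm_u >= delta:
        # Case 2: follow steepest descent only up to the boundary.
        s = [-delta * v / math.sqrt(gg) for v in gs]
    else:
        # Case 3: move from s_u toward the scaled model step until
        # the boundary ||s|| = delta is reached.
        s_n = [d[i] * p_model[i] for i in range(n)]
        v = [s_n[i] - s_u[i] for i in range(n)]
        a = sum(x * x for x in v)
        b = 2.0 * sum(s_u[i] * v[i] for i in range(n))
        c = norm_u ** 2 - delta ** 2
        tau = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
        s = [s_u[i] + tau * v[i] for i in range(n)]

    # Map back to the original variables: p = D^-1 s.
    return [s[i] / d[i] for i in range(n)]
```

When the model step fits inside the region it is returned unchanged, which matches the remark that step control does not interfere near a minimum where the model is good.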
Once a step $p_k$ is chosen, the function is evaluated at the new point, and the actual function value is checked against the value predicted by the quadratic model. What is actually computed is the ratio of actual to predicted reduction,

$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{q_k(0) - q_k(p_k)},$$

where $q_k$ denotes the quadratic model at the current point $x_k$.

If $\rho_k$ is close to 1, then the quadratic model is quite a good predictor and the region can be increased in size. On the other hand, if $\rho_k$ is too small, the region is decreased in size. When $\rho_k$ is below a threshold, $\eta$, the step is rejected and recomputed. You can control this threshold with the method option "AcceptableStepRatio"->$\eta$. Typically the value of $\eta$ is quite small to avoid rejecting steps that would be progress toward a minimum. However, if obtaining the quadratic model at a point is quite expensive (e.g. evaluating the Hessian takes a relatively long time), a larger value of $\eta$ will reduce the number of Hessian evaluations, but it may increase the number of function evaluations.
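The accept/reject test and the radius adjustment can be sketched as follows. The default `eta=1e-4` mirrors the documented default of "AcceptableStepRatio"; the thresholds 0.25 and 0.75 and the factors 0.25 and 2.0 are common textbook choices used here as assumptions, not the values Mathematica uses. The function names are hypothetical.

```python
def reduction_ratio(f_old, f_new, q_old, q_new):
    """Ratio of the actual reduction in f to the reduction predicted by the model."""
    return (f_old - f_new) / (q_old - q_new)

def update_trust_region(rho, delta, step_on_boundary, eta=1e-4,
                        shrink_below=0.25, grow_above=0.75):
    """Return (accept_step, new_delta) for a given ratio rho.

    eta plays the role of "AcceptableStepRatio".
    shrink_below, grow_above, and the factors 0.25 and 2.0 are
    illustrative placeholders.
    """
    # A step is rejected only when rho falls below the threshold eta.
    accept = rho >= eta
    if rho < shrink_below:
        # Poor agreement between model and function: shrink the region.
        new_delta = 0.25 * delta
    elif rho > grow_above and step_on_boundary:
        # Good agreement and the region was limiting the step: grow it.
        new_delta = 2.0 * delta
    else:
        new_delta = delta
    return accept, new_delta
```

With a small `eta`, a step that achieves only a tenth of the predicted reduction is still accepted even though the region shrinks; raising `eta` to 0.2 would reject the same step and reuse the existing model with a smaller radius.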
To start the trust region algorithm, an initial radius $\Delta_0$ needs to be determined. By default, Mathematica uses the size of the step based on the model (1), restricted by a fairly loose relative step size limit. However, in some cases, this may take you out of the region you are primarily interested in, so you can specify a starting radius $\Delta_0$ using the option "StartingScaledStepSize"->$\Delta_0$. The option contains Scaled in its name because the trust region radius works through the diagonal scaling $D$, so this is not an absolute step size.
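To see why the radius is not an absolute step size, the following small sketch computes the largest move allowed along each coordinate axis when the constraint is $\|D p\| \le \Delta$. The helper name and the sample scaling values are made up for illustration.

```python
def max_component_steps(d, delta):
    """Largest |p_i| allowed by ||D p|| <= delta when all other components are zero.

    d holds the positive diagonal entries of the scaling D.
    """
    return [delta / di for di in d]

# Hypothetical scaling: the second variable is weighted 100 times more heavily.
limits = max_component_steps([1.0, 100.0], 2.0)
# limits == [2.0, 0.02]
```

The same radius therefore permits a step of 2 in the first variable but only 0.02 in the second.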
[In[1], In[2]: input not preserved. Out[2]: plot not preserved.]
The plot looks quite bad because the search has extended over such a large region that the fine structure of the function cannot really be seen on that scale.
[In[3]: input not preserved. Out[3]: plot not preserved.]
It is also possible to set an overall maximum bound for the trust region radius by using the option "MaxScaledStepSize"->$\Delta_{\max}$ so that for any step, $\Delta_k \le \Delta_{\max}$.
Trust region methods can also have difficulties with functions that are not smooth because of numerical roundoff in the function computation. When the function is not sufficiently smooth, the radius of the trust region keeps being reduced until it is effectively zero.
[In[4]: input not preserved. Out[4]: plot not preserved.]
The message means that the size of the trust region has become effectively zero relative to the size of the search point, so steps taken would have negligible effect. Note: On some platforms, due to subtle differences in machine arithmetic, the message may not show up. This is because the reasons leading to the message have to do with numerical uncertainty, which can vary between different platforms.
[In[6]: input not preserved. Out[6]: plot not preserved.]
The plot along one direction makes it fairly clear why no more improvement is possible. Part of the reason the Levenberg-Marquardt method gets into trouble in this situation is that convergence is relatively slow because the residual is nonzero at the minimum. With Newton's method, the convergence is faster, and the full quadratic model allows for a better estimate of step size, so that FindMinimum can have more confidence that the default tolerances have been satisfied.
[In[52]: input not preserved. Out[52]: plot not preserved.]
The following table summarizes the options for controlling trust region step control.
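| option name | default value | |
| "AcceptableStepRatio" | 1/10000 | the threshold $\eta$ on the ratio of actual to predicted reduction, below which a step is rejected |
| "MaxScaledStepSize" | ∞ | the value $\Delta_{\max}$ that bounds the trust region radius for all steps |
| "StartingScaledStepSize" | Automatic | the initial trust region size $\Delta_0$ |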
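To show how the three settings interact, here is a minimal one-dimensional trust region Newton loop in Python. It is a sketch under stated assumptions, not a reproduction of FindMinimum: the parameters `start_radius`, `max_radius`, and `accept_ratio` are hypothetical stand-ins for "StartingScaledStepSize", "MaxScaledStepSize", and "AcceptableStepRatio", the scalar `scale` stands in for the diagonal scaling, and the radius update constants are textbook-style placeholders. The test function $\sqrt{1+x^2}$ is chosen because its second derivative is positive everywhere.

```python
import math

def f(x):
    return math.sqrt(1.0 + x * x)

def grad(x):
    return x / math.sqrt(1.0 + x * x)

def hess(x):
    return (1.0 + x * x) ** -1.5  # positive for every x

def trust_region_newton_1d(x, start_radius=1.0, max_radius=math.inf,
                           accept_ratio=1e-4, scale=1.0,
                           tol=1e-8, max_iter=100):
    """Minimize f from x; returns (x, final radius, list of accepted steps)."""
    delta = min(start_radius, max_radius)
    steps = []
    for _ in range(max_iter):
        g, h = grad(x), hess(x)
        if abs(g) < tol:
            break
        # Minimizer of the quadratic model g*p + h*p*p/2 (h > 0 here).
        p = -g / h
        # The radius bounds the scaled step |scale * p|.
        limit = delta / scale
        on_boundary = abs(p) >= limit
        if on_boundary:
            p = math.copysign(limit, p)
        predicted = -(g * p + 0.5 * h * p * p)
        actual = f(x) - f(x + p)
        rho = actual / predicted
        # Adjust the radius (0.25, 0.75, and the factors are assumptions).
        if rho < 0.25:
            delta *= 0.25
        elif rho > 0.75 and on_boundary:
            delta = min(2.0 * delta, max_radius)
        # Accept the step only if the ratio reaches the threshold.
        if rho >= accept_ratio:
            x += p
            steps.append(p)
    return x, delta, steps

x_min, final_radius, taken = trust_region_newton_1d(3.0, max_radius=0.5)
```

For this function the raw Newton update maps $x$ to $-x^3$, so an unrestricted iteration started at 3 moves away from the minimum at 0; limiting each step by the radius keeps the search moving toward it, and `max_radius=0.5` guarantees that no accepted step is longer than 0.5.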







