This is documentation for Mathematica 6, which was
based on an earlier version of the Wolfram Language.
View current documentation (Version 11.1)

Gauss-Newton Methods

For minimization problems for which the objective function is a sum of squares,
it is often advantageous to use the special structure of the problem. Time and effort can be saved by computing the residual function r (x), and its derivative, the Jacobian J (x). The Gauss-Newton method is an elegant way to do this. Rather than using the complete second-order Hessian matrix for the quadratic model, the Gauss-Newton method uses Bk = JkTJk in (1) such that the step pk is computed from the formula
where Jk = J (xk), and so on. Note that this is an approximation to the full Hessian, which is JTJ +rj2rj. In the zero residual case, where r=0 is the minimum, or when r varies nearly as a linear function near the minimum point, the approximation to the Hessian is quite good and the quadratic convergence of "Newton's method" is commonly observed.
Objective functions, which are sums of squares, are quite common, and, in fact, this is the form of the objective function when FindFit is used with the default value of the NormFunction option. One way to view the Gauss-Newton method is in terms of least-squares problems. Solving the Gauss-Newton step is the same as solving a linear least-squares problem, so applying a Gauss-Newton method is in effect applying a sequence of linear least-squares fits to a nonlinear function. With this view, it makes sense that this method is particularly appropriate for the sort of nonlinear fitting that FindFit does.
This loads a package that contains some utility functions.
Click for copyable input
This uses the unconstrained problems package to set up the classic Rosenbrock function, which has a narrow curved valley.
Click for copyable input
When Mathematica encounters a problem that is expressly a sum of squares, such as the Rosenbrock example, or a function that is the dot product of a vector with itself, the Gauss-Newton method will be used automatically.
This shows the steps taken by FindMinimum with the Gauss-Newton method for the trust of region Newton Rosenbrock's function.
Click for copyable input
If you compare this with the same example done with "Newton's method", you can see that it was done with fewer steps and evaluations because the Gauss-Newton method is taking advantage of the special structure of the problem. The convergence rate near the minimum is just as good as for Newton's method because the residual is zero at the minimum.
The Levenberg-Marquardt method is a Gauss-Newton method with "trust region" step control (though it was originally proposed before the general notion of trust regions had been developed). You can request this method specifically by using the FindMinimum option Method→"LevenbergMarquardt" or equivalently Method→"GaussNewton".
Sometimes it is awkward to express a function so that it will explicitly be a sum of squares or a dot product of a vector with itself. In these cases, it is possible to use the "Residual" method option to specify the residual directly. Similarly, you can specify the derivative of the residual with the "Jacobian" method option. Note that when the residual is specified through the "Residual" method option, it is not checked for consistency with the first argument of FindMinimum. The values returned will depend on the value given through the option.
This finds the minimum of Rosenbrock's function using the specification of the residual.
Click for copyable input
option name
default value
"Residual"AutomaticAllows you to directly specify the residual r such that f=1/2r.r
"EvaluationMonitor"AutomaticAn expression that is evaluated each time the residual is evaluated
"Jacobian"AutomaticAllows you to specify the (matrix) derivative of the residual
"StepControl""TrustRegion"Must be "TrustRegion", but allows you to change control parameters through method options

Method options for Method->"LevenbergMarquardt".

Another natural way of setting up sums of squares problems in Mathematica is with FindFit, which computes nonlinear fits to data. A simple example follows.
Here is a model function.
Click for copyable input
Here is some data generated by the function with some random perturbations added.
Click for copyable input
This finds a nonlinear least-squares fit to the model function.
Click for copyable input
This shows the fit model with the data.
Click for copyable input
In the example, FindFit internally constructs a residual function and Jacobian, which are in turn used by the Gauss-Newton method to find the minimum of the sum of squares, or the nonlinear least-squares fit. Of course, FindFit can be used with other methods, but because a residual function that evaluates rapidly can be constructed, it is often faster than the other methods.