Documentation  / Mathematica / Built-in Functions  / Advanced Documentation / Linear Algebra / Matrix Computations /

 

Least Squares Solutions

This section follows on from the previous section. It is concerned with finding solutions to the matrix equation A x = b, where A is an m×n matrix with m > n (i.e., when there are more equations than unknowns). In this case there may well be no solution, and LinearSolve will issue an error message.

In[1]:=

Out[3]=
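As an illustration of such a situation (sketched here in NumPy rather than Mathematica; the matrix and right-hand side are assumptions chosen for the example), an overdetermined system has an exact solution only when appending the right-hand side to the matrix does not increase its rank:

```python
import numpy as np

# Illustrative overdetermined system: 3 equations, 2 unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 4.0])

# An exact solution exists only if appending b to the columns of A
# does not increase the rank.
rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)  # 2 3, so the system is inconsistent
```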

In these cases it is possible to find a best-fit solution that minimizes the norm of the residual, || A x - b ||_p. A particularly common choice of p is 2; this generates a least squares solution. These are often used because the function || A x - b ||_2 is differentiable in x and because the 2-norm is preserved under orthogonal transformations.
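A least squares solution can be sketched in NumPy (the system below is an illustrative assumption); `numpy.linalg.lstsq` returns the x that minimizes || A x - b ||_2, so any other candidate gives a residual at least as large:

```python
import numpy as np

# Illustrative overdetermined system: 3 equations, 2 unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 4.0])

# lstsq returns the x that minimizes the 2-norm of the residual A x - b.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
r_opt = np.linalg.norm(A @ x - b)

# Perturbing the minimizer can only increase the residual norm.
r_pert = np.linalg.norm(A @ (x + 0.1) - b)
print(r_opt <= r_pert)  # True
```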

If the rank of A is n (i.e., A has full column rank), it can be shown (Golub and Van Loan) that there is a unique solution to the least squares problem, and it solves the linear system

	A^T A x = A^T b

These are called the normal equations. Although they can be solved directly to get a least squares solution, this is not recommended: if the matrix A is ill-conditioned, then the product A^T A will be even more ill-conditioned, since its 2-norm condition number is the square of that of A. Ways of finding least squares solutions for full rank matrices are explored in Examples: Full Rank Least Squares Solutions.
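A NumPy sketch of the normal equations (example data assumed) confirms that they reproduce the least squares solution, and the last line checks the conditioning caveat:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # full column rank
b = np.array([1.0, 2.0, 4.0])

# Normal equations: (A^T A) x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reference solution from an orthogonalization-based method.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_normal, x_lstsq))  # True

# The conditioning caveat: cond(A^T A) is the square of cond(A).
print(np.isclose(np.linalg.cond(A.T @ A), np.linalg.cond(A) ** 2))  # True
```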

It should be noted that best-fit solutions that minimize other norms, such as the 1- and Infinity-norms, are also possible. A number of ways to find solutions to overdetermined systems that are suitable for special types of matrix or minimize other norms are given in the section Examples: Minimization of 1 and Infinity Norms.

A general way to find a least squares solution to an overdetermined system is to use a singular value decomposition to form a matrix known as the pseudo-inverse of the matrix. In Mathematica this is computed with PseudoInverse. This technique works even if the input matrix is rank deficient. The basis of the technique is given below.
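The construction can be sketched in NumPy: take the singular value decomposition and invert only the singular values that are effectively nonzero (the matrix and the cutoff below are illustrative choices, not part of the original example):

```python
import numpy as np

# Illustrative rank-deficient matrix: second column is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

# Thin singular value decomposition: A = U . diag(s) . V^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Invert only the singular values that are effectively nonzero.
cutoff = 1e-12 * s.max()
s_inv = np.array([1.0 / si if si > cutoff else 0.0 for si in s])

# Pseudo-inverse: A+ = V . diag(s_inv) . U^T.
A_pinv = Vt.T @ np.diag(s_inv) @ U.T

print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True
```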

Here is an overdetermined system; you can see that the matrix is rank deficient.

In[4]:=

Out[6]=

The least squares solution can still be found using PseudoInverse.

In[7]:=

Out[7]=

This shows how close the solution actually is.

In[8]:=

Out[8]=
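The rank-deficient case can be sketched in NumPy with `numpy.linalg.pinv` (the matrix and vector below are assumptions): the residual of the pseudo-inverse solution is orthogonal to the column space, and among all least squares solutions this one has minimum norm.

```python
import numpy as np

# Illustrative rank-deficient overdetermined system.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
b = np.array([1.0, 1.0, 1.0])

# Minimum-norm least squares solution via the pseudo-inverse.
x = np.linalg.pinv(A) @ b
residual = A @ x - b

# At the least squares optimum, the residual is orthogonal
# to the column space of A (the normal equations hold).
print(np.allclose(A.T @ residual, 0.0))  # True

# Among all least squares solutions, x has minimum norm:
# adding any null-space vector can only increase the norm.
v = np.array([2.0, -1.0])   # A @ v == 0
print(np.linalg.norm(x) < np.linalg.norm(x + v))  # True
```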

Data Fitting

Scientific measurements often result in collections of ordered pairs of data, (t_1, y_1), (t_2, y_2), …, (t_m, y_m). In order to make predictions at times other than those that were measured, it is common to try to fit a curve through the data points. If the curve is a straight line, then the aim is to find unknown coefficients α and β for which

	y_i = α + β t_i

for all the data pairs. This can be written in matrix form as

	{{1, t_1}, {1, t_2}, …, {1, t_m}} . {α, β} = {y_1, y_2, …, y_m}

and is immediately recognized as being equivalent to solving an overdetermined system of linear equations.
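A straight-line fit set up this way can be sketched in NumPy (the measurements below are hypothetical):

```python
import numpy as np

# Hypothetical measurements (t_i, y_i) lying near a straight line.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix with one row (1, t_i) per measurement.
M = np.column_stack([np.ones_like(t), t])

# Least squares fit of y_i ~= alpha + beta t_i.
(alpha, beta), *_ = np.linalg.lstsq(M, y, rcond=None)
print(alpha, beta)  # intercept and slope of the fitted line
```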

Fitting a more general curve, for example the polynomial

	y = c_0 + c_1 t + c_2 t^2 + … + c_(n-1) t^(n-1),

is equivalent to the matrix equation

	{{1, t_1, t_1^2, …}, {1, t_2, t_2^2, …}, …, {1, t_m, t_m^2, …}} . {c_0, c_1, c_2, …, c_(n-1)} = {y_1, y_2, …, y_m}

In this case the left-hand matrix is a Vandermonde matrix. In fact, any function that is linear in the unknown coefficients can be used.
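A polynomial fit through a Vandermonde matrix can be sketched in NumPy with `numpy.vander` (the data below are constructed from an exact quadratic, for illustration):

```python
import numpy as np

# Data constructed from an exact quadratic, for illustration.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 - 2.0 * t + t**2

# Vandermonde design matrix with columns 1, t, t^2.
V = np.vander(t, 3, increasing=True)

# The fitted coefficients recover the quadratic exactly.
coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)
print(coeffs)  # close to [1, -2, 1]
```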

An example of how to find best-fit parameters will now be given, starting with the measurements shown below.

In[1]:=

These can be plotted graphically.

In[2]:=

You could try to solve this with the PseudoInverse technique introduced above. First, split the data into t and y components.

In[3]:=

Now form the left-hand side matrix.

In[5]:=

Out[6]//MatrixForm=

The best-fit parameters can be found as follows.

In[7]:=

Out[7]=

Mathematica provides the function FindFit, which can find best-fit solutions to data. The solution it gives agrees with the one found with PseudoInverse.

In[8]:=

Out[8]=

Finally, a plot is made that shows the data points and the fitted curve.

In[9]:=

The function FindFit is quite general; it can also fit functions to data that are not linear in the parameters. More information can be found in The Mathematica Book.