
Solving Linear Systems

One of the most important uses of matrices is to represent and solve linear systems. This section discusses how to solve linear systems with *Mathematica*, making extensive use of LinearSolve, the main function provided for this purpose.

Solving a linear system involves solving a matrix equation a.x == b. Because a is an m×n matrix, this is a set of m linear equations in n unknowns.

When m == n the system is said to be square. If m > n, there are more equations than unknowns and the system is overdetermined. If m < n, there are fewer equations than unknowns and the system is underdetermined.

m == n    square
m > n     overdetermined
m < n     underdetermined

Classes of linear systems represented by rectangular matrices.

Note that even though you could solve the matrix equation by computing the inverse via x = Inverse[a].b, this is not a recommended method. You should use a function that is designed to solve linear systems directly; in *Mathematica* this is provided by LinearSolve.
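To see why a direct solver is preferred, the underlying process can be sketched outside *Mathematica*. The following pure-Python function (illustrative only, not *Mathematica* code) implements Gaussian elimination with partial pivoting, which is essentially what a dense direct solver does; it needs roughly a third of the arithmetic required to form the full inverse and is numerically more stable.

```python
def solve(a, b):
    """Solve a.x == b by Gaussian elimination with partial pivoting.

    A direct solver works like this internally; it needs about n^3/3
    multiplications, versus roughly n^3 for forming the full inverse.
    """
    n = len(a)
    # work on copies so the inputs are not destroyed
    a = [row[:] for row in a]
    x = b[:]
    for k in range(n):
        # partial pivoting: bring the largest remaining entry to the diagonal
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        x[k], x[p] = x[p], x[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            x[i] -= m * x[k]
    # back substitution through the upper triangle
    for k in range(n - 1, -1, -1):
        x[k] = (x[k] - sum(a[k][j] * x[j] for j in range(k + 1, n))) / a[k][k]
    return x

print(solve([[1.0, 2.0], [3.0, 4.0]], [5.0, 6.0]))  # approximately [-4.0, 4.5]
```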

LinearSolve[a, b]    solve the matrix equation a.x == b

Solving linear systems with LinearSolve.

This uses LinearSolve to solve a matrix equation.

In[1]:=

Out[3]=

The solution can be tested to demonstrate that it solves the problem.

In[4]:=

Out[4]=

You can always generate the linear equations from the input matrices.

In[5]:=

Out[5]=

Solve can then be used to find the solution.

In[6]:=

Out[6]=

LinearSolve also works when the right-hand side of the matrix equation is itself a matrix.

In[7]:=

Out[9]=

In[10]:=

Out[10]=

If the system is overdetermined, LinearSolve will find a solution if it exists.

In[11]:=

In[13]:=

Out[13]=

If no solution can be found, the system is inconsistent. In this case it might still be useful to find a best-fit solution. This is often done with a least squares technique, which is explored in the next section.
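The idea behind a least squares fit can be sketched in a few lines. The following pure-Python example (an illustrative analogue, not *Mathematica* code; the helper name lstsq_2col is invented here) solves the normal equations (Transpose[a].a).x == Transpose[a].b for a two-column overdetermined system using Cramer's rule.

```python
def lstsq_2col(a, b):
    """Least-squares fit for an m x 2 system a.x ~= b via the normal
    equations (a^T.a).x == a^T.b, solved here with Cramer's rule."""
    m = len(a)
    ata = [[sum(a[i][r] * a[i][c] for i in range(m)) for c in range(2)]
           for r in range(2)]
    atb = [sum(a[i][r] * b[i] for i in range(m)) for r in range(2)]
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
    return [(atb[0] * ata[1][1] - atb[1] * ata[0][1]) / det,
            (atb[1] * ata[0][0] - atb[0] * ata[1][0]) / det]

# fit y = c0 + c1*t through three points (t, y) = (0,1), (1,2), (2,4);
# the exact system is inconsistent, so only a best fit exists
a = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 4.0]
print(lstsq_2col(a, b))  # best-fit intercept and slope
```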

If the system is underdetermined, there is either no solution or an infinite number of solutions; when solutions exist, LinearSolve returns one of them.

In[14]:=

In[16]:=

Out[16]=

A system is consistent if and only if the rank of the matrix a equals the rank of the matrix formed by adding b as an extra column to a (known as the augmented matrix). The following demonstrates that the previous system is consistent.

In[17]:=

Out[17]=
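The rank test itself is just Gaussian elimination with a count of the pivots. The following pure-Python sketch (illustrative only, not *Mathematica*'s MatrixRank) applies it to a system whose second equation is a multiple of the first.

```python
def rank(m, tol=1e-10):
    """Rank of a matrix: the number of pivots found by Gaussian elimination."""
    m = [row[:] for row in m]           # work on a copy
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[p][c]) < tol:
            continue                    # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            f = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= f * m[r][j]
        r += 1
    return r

a = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]   # second row is twice the first
b = [1.0, 2.0]                           # consistent right-hand side
aug = [row + [rhs] for row, rhs in zip(a, b)]
print(rank(a) == rank(aug))              # True: the system is consistent
```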

LinearSolve can work with all the different types of matrices that *Mathematica* supports, including symbolic and exact, machine-number, and arbitrary-precision matrices. It also works with both dense and sparse matrices. When a sparse matrix is used in LinearSolve, special techniques suitable for sparse matrices are used and the result is the solution vector.

In[18]:=

Out[20]=

Options for LinearSolve.

LinearSolve has three options that allow users more control. The Modulus and ZeroTest options are used by symbolic techniques and are discussed in the section Symbolic and Exact Matrices. The Method option allows you to choose methods that are known to be suitable for a particular problem; the default setting of Automatic means that LinearSolve chooses a method based on the input.

Singular Matrices

A matrix is singular if its inverse does not exist. One way to test for singularity is to compute a determinant; this will be zero if the matrix is singular. For example, the following matrix is singular.

In[1]:=

Out[2]=

For many right-hand sides there is no solution.

In[3]:=

Out[4]=

However, for certain values there will be a solution.

In[5]:=

Out[6]=

For the first example, the rank of the matrix does not equal that of the augmented matrix. This confirms that the system is not consistent and cannot be solved by LinearSolve.

In[7]:=

Out[7]=

For the second example, the rank of the matrix does equal that of the augmented matrix. This confirms that the system is consistent and can be solved by LinearSolve.

In[8]:=

Out[8]=

In those cases where a solution cannot be found, it is still possible to find a solution that makes a best-fit to the problem. One important class of best-fit solutions involves least squares solutions and is discussed in a later section.

Homogeneous Equations

The homogeneous matrix equation a.x == 0 involves a zero right-hand side.

This equation has a nonzero solution if the matrix is singular. To test whether a matrix is singular, you can compute the determinant.

In[1]:=

Out[2]=

To find the solution of the homogeneous equation you can use the function NullSpace. This returns a set of orthogonal vectors, each of which solves the homogeneous equation. In the example below, there is only one vector.

In[3]:=

Out[3]=

This demonstrates that the solution in fact solves the homogeneous equation.

In[4]:=

Out[4]=

The function Chop can be used to replace approximate numbers that are close to zero with the exact integer 0.

In[5]:=

Out[5]=

The solution to the homogeneous equation can be used to form an infinite number of solutions to the inhomogeneous equation. This solves an inhomogeneous equation.

In[6]:=

Out[7]=

The solution does in fact solve the equation.

In[8]:=

Out[8]=

If you add to sol an arbitrary factor times the homogeneous solution, this new vector also solves the matrix equation.

In[9]:=

Out[9]=
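This family of solutions is easy to verify numerically. The following pure-Python check (illustrative only, not *Mathematica* code; the matrix and vectors are chosen for the example) confirms that adding any multiple of a null-space vector to a particular solution still solves the equation.

```python
# singular matrix with a one-dimensional null space (second row = 2 * first)
a = [[1.0, 2.0],
     [2.0, 4.0]]
null_vec = [-2.0, 1.0]            # a . null_vec == 0
particular = [1.0, 0.0]           # solves a . x == [1, 2]

def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

print(matvec(a, null_vec))        # [0.0, 0.0]
for t in (0.0, 1.0, -3.5):
    # every particular + t * null_vec also solves a . x == [1, 2]
    x = [particular[j] + t * null_vec[j] for j in range(2)]
    print(matvec(a, x))           # always [1.0, 2.0]
```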

Estimating and Calculating Accuracy

An important way to quantify the accuracy of the solution of a linear system is to compute the condition number κ(a). For a suitable choice of matrix norm, this is defined as κ(a) = ||a|| ||Inverse[a]||.

It can be shown that for the matrix equation a.x == b, the relative error in x is κ(a) times the relative error in a and b. Thus, the condition number quantifies the accuracy of the solution. If the condition number of a matrix is large, the matrix is said to be ill-conditioned. You cannot expect a good solution from an ill-conditioned system. For some extremely ill-conditioned systems it is not possible to obtain any solution.
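The definition can be applied directly. The following pure-Python sketch (illustrative only, not *Mathematica*'s MatrixConditionNumber) computes the infinity-norm condition number of a 2×2 matrix from the definition.

```python
def inf_norm(m):
    """Infinity norm of a matrix: the maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in m)

def inverse_2x2(m):
    """Explicit inverse of a 2 x 2 matrix (fine at this size)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def condition_number(m):
    """kappa(m) = ||m|| * ||Inverse[m]|| in the infinity norm."""
    return inf_norm(m) * inf_norm(inverse_2x2(m))

print(condition_number([[1.0, 0.0], [0.0, 1.0]]))     # identity: 1.0, the best possible
print(condition_number([[1.0, 1.0], [1.0, 1.0001]]))  # nearly singular: very large
```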

In *Mathematica* an approximation of the condition number is computed with the function MatrixConditionNumber, which is loaded from the package MatrixManipulation.

In[1]:=

This computes the condition number of a matrix.

In[2]:=

Out[3]=

This matrix is singular and the condition number is Infinity.

In[4]:=

Out[5]=

This matrix has a large condition number and is said to be ill-conditioned.

In[6]:=

Out[7]=

If a matrix is multiplied by itself, the condition number increases.

In[8]:=

Out[9]=

When you solve a matrix equation involving an ill-conditioned matrix, the result may not be as accurate.

In[10]:=

Out[12]=

The solution is less accurate for the matrix mat2.

In[13]:=

Out[14]=

The solution to these problems is to avoid constructing matrices that may be ill-conditioned, for example, by avoiding multiplying matrices by themselves. Another way to solve such problems more accurately is to use *Mathematica*'s arbitrary-precision computation; this is discussed in the section Arbitrary-Precision Matrices.

In[15]:=

Out[17]=

Symbolic and Exact Matrices

LinearSolve works for all the different types of matrices that can be represented in *Mathematica*. These are described in more detail in Matrix Types. Here is an example of working with purely symbolic matrices.

In[1]:=

Out[2]//MatrixForm=

In[3]:=

Out[4]=

The original matrix can be multiplied into the solution.

In[5]:=

Out[5]=

The result of multiplication needs some post-processing with Simplify to reach the expected value. This demonstrates the need for intermediate processing in symbolic linear algebra computation. Without care these intermediate results can become extremely large and make the computation run slowly.

In[6]:=

Out[6]=

In order to simplify the intermediate expressions, the ZeroTest option might be useful.

LinearSolve can also be used to solve exact problems.

In[7]:=

Out[8]//MatrixForm=

In[9]:=

Out[10]=

In[11]:=

Out[11]=
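Exact computation can be mimicked outside *Mathematica* with rational arithmetic. The following pure-Python sketch (illustrative only, not *Mathematica* code) repeats Gaussian elimination over exact rationals using the standard library's Fraction type, so the result has no rounding error at all.

```python
from fractions import Fraction

def solve_exact(a, b):
    """Gaussian elimination over exact rationals: no rounding error."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    x = [Fraction(v) for v in b]
    for k in range(n):
        # any nonzero pivot works; there is no rounding to worry about
        p = next(i for i in range(k, n) if a[i][k] != 0)
        a[k], a[p] = a[p], a[k]
        x[k], x[p] = x[p], x[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            x[i] -= m * x[k]
    for k in range(n - 1, -1, -1):
        x[k] = (x[k] - sum(a[k][j] * x[j] for j in range(k + 1, n))) / a[k][k]
    return x

print(solve_exact([[1, 2], [3, 4]], [5, 6]))  # [Fraction(-4, 1), Fraction(9, 2)]
```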

There are a number of methods that are specific to symbolic and exact computation: CofactorExpansion, DivisionFreeRowReduction, and OneStepRowReduction. These are discussed in the section on Symbolic Methods.

Row Reduction

Row reduction involves adding multiples of rows together so as to produce zero elements when possible. The final matrix is in reduced row echelon form as described previously. If the input matrix is a non-singular square matrix, the result will be an identity matrix.
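The reduction itself can be sketched in a few lines. The following pure-Python function (illustrative only, not *Mathematica*'s RowReduce) performs Gauss-Jordan elimination in exact arithmetic; a non-singular square input reduces to the identity, as described above.

```python
from fractions import Fraction

def rref(m):
    """Reduced row echelon form by Gauss-Jordan elimination, exactly."""
    m = [[Fraction(x) for x in row] for row in m]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue                    # no pivot in this column
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [v / piv for v in m[r]]  # scale the pivot row to 1
        for i in range(rows):
            if i != r and m[i][c] != 0: # clear the rest of the column
                f = m[i][c]
                m[i] = [m[i][j] - f * m[r][j] for j in range(cols)]
        r += 1
    return m

# a non-singular square matrix reduces to the identity
print(rref([[1, 2], [3, 4]]))
```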

In[1]:=

Out[1]=

This can be used as a way to solve a system of equations.

In[2]:=

Out[2]=

This can be compared with the following use of LinearSolve.

In[3]:=

Out[3]=

Saving the Factorization

Many applications of linear systems involve the same matrix a but different right-hand sides b. Because a significant part of the effort involved in solving the system goes into processing a, it is common to save the factorization and use it to solve repeated problems. In *Mathematica*, this is done by using a one-argument form of LinearSolve; this returns a LinearSolveFunction that you can apply to different right-hand sides to obtain each solution.
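The mechanism can be sketched outside *Mathematica*: factor once, then return a function that reuses the factorization. The following pure-Python example (illustrative only; real solvers always pivot, which is omitted here for brevity) stores an LU factorization in a closure.

```python
def linear_solve_function(a):
    """Factor a once (LU, no pivoting for clarity) and return a function
    that solves a.x == b for any right-hand side b.

    Caution: without pivoting this sketch fails if a zero pivot appears."""
    n = len(a)
    lu = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]              # multiplier stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]

    def solve(b):
        y = b[:]
        for i in range(n):                    # forward substitution with L
            y[i] -= sum(lu[i][j] * y[j] for j in range(i))
        for i in range(n - 1, -1, -1):        # back substitution with U
            y[i] = (y[i] - sum(lu[i][j] * y[j] for j in range(i + 1, n))) / lu[i][i]
        return y

    return solve

f = linear_solve_function([[4.0, 3.0], [6.0, 3.0]])
print(f([1.0, 0.0]))   # one right-hand side
print(f([0.0, 1.0]))   # another, reusing the same factorization
```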

In[1]:=

Out[2]=

When you apply the LinearSolveFunction to a particular right-hand side, the solution is obtained.

In[3]:=

Out[4]=

This solves the matrix equation.

In[5]:=

Out[5]=

A different right-hand side yields a different solution.

In[6]:=

Out[7]=

This new solution solves this matrix equation.

In[8]:=

Out[8]=

The one-argument form of LinearSolve works in a completely equivalent way to the two-argument form. It works with the same range of input matrices, for example, returning the expected results for symbolic, exact, or sparse matrix input. It also accepts the same options.

One issue with the one-argument form is that for certain input matrices the factorization cannot be saved. For example, if the system is overdetermined, a solution is not certain to exist. In this case a message is generated warning you that the factorization will be repeated each time the function is applied to a particular right-hand side.

In[9]:=

Out[10]=

For this right-hand side there is a solution, and this is returned.

In[11]:=

Out[12]=

However, for this right-hand side there is no solution, and an error is encountered.

In[13]:=

Out[14]=

Methods

LinearSolve provides a number of different techniques to solve matrix equations that are specialized to particular problems. You can select between these using the option Method. In this way a uniform interface is provided to all the functionality that *Mathematica* provides for solving matrix equations.

The default setting of the option Method is Automatic. As is typical for *Mathematica*, this means that the system will make an automatic choice of method to use. For LinearSolve if the input matrix is numerical and dense, then a method using LAPACK routines is used; if it is numerical and sparse, a multifrontal direct solver is used. If the matrix is symbolic then specialized symbolic routines are used.

The details of the different methods are now described.

LAPACK

LAPACK is the default method for solving dense numerical matrices. When the matrix is square and non-singular the routines dgesv, dlange, and dgecon are used for real matrices and zgesv, zlange, and zgecon for complex matrices. When the matrix is non-square or singular dgelss is used for real matrices and zgelss for complex matrices. More information on LAPACK is available in the references section.

If the input matrix uses arbitrary-precision numbers, then LAPACK algorithms extended for arbitrary-precision computation are used.

Multifrontal

The Multifrontal method is a direct solver used by default if the input matrix is sparse; it can also be selected explicitly with the Method option.

This reads in a sample sparse matrix stored in the Matrix Market format. More information on the importing of matrices is in the section Import and Export of Sparse Matrices.

In[1]:=

Out[1]=

In[2]:=

This solves the matrix equation using the sample matrix.

In[3]:=

If the input matrix to the Multifrontal method is dense, it is converted to a sparse matrix.

The implementation of the Multifrontal method uses the UMFPACK library.

Krylov

The Krylov method is an iterative solver that is suitable for large sparse linear systems, such as those arising from numerical solving of PDEs. In *Mathematica* two Krylov methods are implemented: Conjugate Gradient (for symmetric positive definite matrices) and BiCGSTAB (for non-symmetric systems). It is possible to set the method that is used and a number of other parameters by using appropriate sub-options.

Sub-options of the Krylov method.

The default method for Krylov, BiCGSTAB, is more expensive per iteration but more generally applicable. The ConjugateGradient method is suitable for symmetric positive definite systems, for which it always converges to a solution (though the convergence may be slow). If the matrix is not symmetric positive definite, ConjugateGradient may fail to converge to a solution. In this example, the matrix is not symmetric positive definite and the ConjugateGradient method does not converge.

In[1]:=

Out[3]=

The default method, BiCGSTAB, converges and returns the solution.

In[4]:=

Out[4]=

This tests the solution that was found.

In[5]:=

Out[5]=
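The Conjugate Gradient iteration itself is short. The following pure-Python sketch (illustrative only, not *Mathematica*'s implementation) shows why Krylov methods suit large sparse systems: the matrix is only ever used in matrix-vector products.

```python
def conjugate_gradient(a, b, tol=1e-12, max_iter=100):
    """Conjugate Gradient for a symmetric positive definite matrix a.
    The matrix enters only through matrix-vector products."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual b - a.x with x = 0
    p = r[:]
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        if rs_new < tol:
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

# a small symmetric positive definite system
a = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
print(conjugate_gradient(a, b))   # close to the exact solution [1/11, 7/11]
```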

Here is an example that shows the benefit of the Krylov method and the use of a preconditioner. First, a function that builds a structured banded sparse matrix is defined.

In[6]:=

This demonstrates the structure of the matrices generated by this function.

In[7]:=

This builds a larger system, with a 10000×10000 matrix.

In[9]:=

This will use the default sparse solver, a multifrontal direct solver.

In[12]:=

Out[12]=

The Krylov method is faster.

In[13]:=

Out[13]=

The use of the ILU preconditioner is even faster. Currently, the preconditioner can only work if the matrix has nonzero diagonal elements.

In[14]:=

Out[14]=

At present, only the ILU preconditioner is built-in. You can still define your own preconditioner by defining a function that is applied to the input matrix. An example that involves solving a diagonal matrix is shown below. First the problem is set up and solved without a preconditioner.

In[15]:=

Out[18]=

Now, a preconditioner function fun is used; there is a significant improvement to the speed of the computation.

In[19]:=

Out[20]=

Generally, a problem will not be structured so that it can benefit so much from such a simple preconditioner. However, this example is useful since it shows how to create and use your own preconditioner.

If the input matrix to the Krylov method is dense, the result is still found because the method is based on matrix/vector multiplication.

The Krylov method can be used to solve systems that are too large for a direct solver. However, it is not a general solver, being particularly suitable for those problems that have some form of diagonal dominance.

Cholesky

The Cholesky method is suitable for solving symmetric positive definite systems.

In[1]:=

Out[1]=

In[2]:=

In[3]:=

Out[3]=

The Cholesky method is faster for this matrix.

In[4]:=

Out[4]=

The Cholesky method is also more stable. This can be demonstrated with the following matrix.

In[5]:=

For this matrix the default LU decomposition method reports a potential error.

In[8]:=

Out[8]=

The Cholesky method succeeds.

In[9]:=

Out[9]=

For dense matrices the Cholesky method uses LAPACK functions such as dpotrf and dpotrs for real matrices and zpotrf and zpotrs for complex matrices. For sparse matrices the Cholesky method uses the TAUCS library.
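The factorization behind the method can be sketched directly. The following pure-Python function (illustrative only, not the LAPACK or TAUCS code) computes the lower-triangular Cholesky factor l with l.Transpose[l] == a; exploiting symmetry in this way takes roughly half the work of a general LU factorization.

```python
import math

def cholesky(a):
    """Lower-triangular l with l . l^T == a, for a symmetric positive
    definite matrix a. Fails (negative sqrt argument or zero division)
    if a is not positive definite."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)   # diagonal entry
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]  # below-diagonal entry
    return l

print(cholesky([[4.0, 2.0], [2.0, 5.0]]))  # [[2.0, 0.0], [1.0, 2.0]]
```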

Symbolic Methods

There are a number of methods that are specific to symbolic and exact computation: CofactorExpansion, DivisionFreeRowReduction, and OneStepRowReduction. These can be demonstrated for the following symbolic matrix.

In[1]:=

The computation is repeated for all three methods.

In[3]:=

Out[3]=

In[4]:=

Out[4]=

In[5]:=

Out[5]=