
Matrix Decompositions

This section will discuss a number of standard techniques for working with matrices. These are often used as building blocks for solving matrix problems. The decompositions fall into the categories of factorizations, orthogonal transformations, and similarity transformations.

LU Decomposition

The LU decomposition of a matrix is frequently used as part of a Gaussian elimination process for solving a matrix equation. In *Mathematica* there is no need to use an LU decomposition to solve a matrix equation, because the function LinearSolve does this for you, as discussed in the section Solving Linear Systems. Note that if you want to save the factorization of a matrix, so that it can be used to solve the same left-hand side, you can use the one argument form of LinearSolve, as demonstrated in the section Saving the Factorization.

The LU decomposition with partial pivoting of a square non-singular matrix A involves the computation of the unique matrices L, U, and P such that

P A = L U

where L is lower triangular with ones on the diagonal, U is upper triangular, and P is a permutation matrix. Once it is found, it can be used to solve the matrix equation A x = b by finding the solution to the following equivalent system.

L U x = P b

This can be done by first solving for y in the following.

L y = P b

Then solve for x to get the desired solution.

U x = y

Because both L and U are triangular matrices it is particularly easy to solve these systems.
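The two triangular solves can be illustrated numerically. The following is a minimal sketch using NumPy and SciPy rather than *Mathematica*, with a made-up matrix and right-hand side; note that SciPy's lu returns the factorization as A = P L U, so its permutation is transposed below to match the P A = L U convention used here.

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

# A made-up non-singular matrix and right-hand side
a = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([1.0, 2.0, 3.0])

# SciPy's lu gives a == p @ l @ u, so p.T plays the role of P in P A = L U
p, l, u = lu(a)
pb = p.T @ b                             # permuted right-hand side P b

# First solve L y = P b (forward substitution) ...
y = solve_triangular(l, pb, lower=True)
# ... then solve U x = y (back substitution)
x = solve_triangular(u, y, lower=False)

print(np.allclose(a @ x, b))             # the solution satisfies A x = b
```

Because each solve only sweeps once through a triangular matrix, the cost is O(n^2) once the factorization is known.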

Although in *Mathematica* you do not need to use the LU decomposition for solving matrix equations, *Mathematica* provides the function LUDecomposition for computing it.

LUDecomposition returns a list of three elements. The first element is a combination of upper and lower triangular matrices, the second element is a vector specifying rows used for pivoting (a permutation vector which is equivalent to the permutation matrix), and the third element is an estimate of the condition number.

In this example, the three results of the LUDecomposition computation are shown.

In[1]:=

Out[2]=

One of the features of the LU decomposition is that the lower and upper triangular matrices are stored in the same matrix. The individual L and U factors can be obtained as follows. Note how the computations use a vectorized operation for element-by-element multiplication with a sparse matrix. This makes the process more efficient; these techniques are discussed further in the section Programming Efficiency.

This generates the L matrix.

In[3]:=

Out[7]//MatrixForm=

This generates the U matrix.

In[8]:=

Out[11]//MatrixForm=
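The same combined storage scheme is used by other numerical libraries. Here is a sketch using SciPy's lu_factor (a made-up matrix; the masking with tril/triu plays the role of the element-by-element multiplication described above):

```python
import numpy as np
from scipy.linalg import lu_factor

a = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

# lu_factor stores L (strictly below the diagonal, unit diagonal implicit)
# and U (on and above the diagonal) in a single matrix
lu, piv = lu_factor(a)

# Extract the triangular factors by masking
l = np.tril(lu, k=-1) + np.eye(3)   # strict lower part, plus unit diagonal
u = np.triu(lu)                     # upper part including the diagonal

# Apply the pivot row swaps recorded in piv (row i swapped with row piv[i])
pa = a.copy()
for i, j in enumerate(piv):
    pa[[i, j]] = pa[[j, i]]

print(np.allclose(pa, l @ u))       # the permuted matrix equals L U
```

The pivot vector here is a sequence of row interchanges rather than a permutation, but it carries the same information as a permutation matrix.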

The original matrix was non-singular, so this is a true factorization. Therefore multiplying L and U recreates the original matrix permuted with the row pivot vector.

In[12]:=

Out[12]=

The following subtracts the product of the L and U matrices from the permuted original matrix. The difference is very close to zero.

In[13]:=

Out[13]=

The diagonal elements of the U matrix are known as the pivots. If any zero pivots emerge during the computation of the LU factorization, the matrix is singular. In such a case LUDecomposition produces a warning.

In[14]:=

Out[15]=
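The zero-pivot test can be sketched numerically as well; this made-up matrix is singular because its third row is the sum of the first two:

```python
import numpy as np
from scipy.linalg import lu

# A made-up singular matrix (row 3 = row 1 + row 2)
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])

_, _, u = lu(a)

# The pivots are the diagonal elements of U; a (numerically) zero
# pivot signals that the matrix is singular
pivots = np.diag(u)
print(np.isclose(pivots, 0).any())   # True: the matrix is singular
```

In floating-point arithmetic the pivot is rarely exactly zero, so a tolerance-based comparison is used instead of an exact test.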

Cholesky Decomposition

The Cholesky decomposition of a symmetric positive definite matrix A is a factorization into a unique upper triangular matrix U such that

A = U^T U

This factorization has a number of uses; one is that, because it is a triangular factorization, it can be used to solve systems of equations involving symmetric positive definite matrices. (The way that a triangular factorization can be used to solve a matrix equation is shown above in the section on LU Decomposition.) If a matrix is known to be symmetric positive definite, the Cholesky factorization is preferred over the LU factorization because it is faster to compute. If you want to solve a matrix equation using the Cholesky factorization, you can do this directly from LinearSolve using the Cholesky method, as described in a previous section.
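The factor-then-solve pattern can be sketched numerically; note that NumPy's cholesky returns the lower factor L with A = L L^T, whose transpose is the upper factor U with A = U^T U used here (the matrix is made up, constructed as B^T B + I so that it is certainly positive definite):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# A made-up symmetric positive definite matrix: B^T B + I
b = np.array([[1.0, 2.0], [3.0, 4.0]])
a = b.T @ b + np.eye(2)

# NumPy returns the lower factor L with A = L L^T; its transpose
# is the upper triangular factor of the text
l = np.linalg.cholesky(a)
print(np.allclose(l @ l.T, a))

# cho_factor/cho_solve use the factorization to solve A x = rhs
rhs = np.array([1.0, 2.0])
x = cho_solve(cho_factor(a), rhs)
print(np.allclose(a @ x, rhs))
```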

The Cholesky factorization can be computed in *Mathematica* with the function CholeskyDecomposition.

In[1]:=

Out[3]//MatrixForm=

This reconstructs the original matrix.

In[4]:=

Out[4]=

If the matrix is not positive definite then the Cholesky decomposition does not exist.

In[5]:=

Out[6]=

There are a number of equivalent definitions of positive definiteness, such as all of the eigenvalues being positive.

In[7]:=

Out[8]=

One way to test if a matrix is positive definite is to see if the Cholesky decomposition exists.
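This test can be sketched as a small helper; the function name positive_definite_q is a made-up illustration, using the fact that NumPy's cholesky raises an error when the factorization does not exist:

```python
import numpy as np

def positive_definite_q(a: np.ndarray) -> bool:
    """Test positive definiteness by attempting a Cholesky factorization."""
    try:
        np.linalg.cholesky(a)
        return True
    except np.linalg.LinAlgError:
        return False

print(positive_definite_q(np.array([[2.0, 1.0], [1.0, 2.0]])))   # True
print(positive_definite_q(np.array([[1.0, 2.0], [2.0, 1.0]])))   # False (eigenvalues 3 and -1)
```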

Note that if the matrix is complex, the factorization is defined by the conjugate transpose. The following computes the Cholesky decomposition of a complex matrix.

In[9]:=

Out[10]=

This demonstrates that the factorization is correct.

In[11]:=

Out[11]//MatrixForm=

Cholesky and LU Factorizations

The Cholesky factorization is related to the LU factorization as

A = L D L^T

where D is the diagonal matrix of pivots. This can be demonstrated as follows.

In[1]:=

Out[4]//MatrixForm=

Now you can compute the L matrix.

In[5]:=

Out[9]//MatrixForm=

This step computes the matrix D of pivots.

In[10]:=

Out[12]//MatrixForm=

Finally, this computes L √D; its transpose is equal to the Cholesky factor.

In[13]:=

Out[13]//MatrixForm=
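The relation A = L D L^T, with D the diagonal matrix of pivots, can also be checked numerically. This sketch uses a made-up positive definite matrix that is diagonally dominant, so that partial pivoting performs no row swaps and the symmetric L D L^T form is obtained directly:

```python
import numpy as np
from scipy.linalg import lu

# A made-up, diagonally dominant, symmetric positive definite matrix,
# so the LU factorization needs no row swaps: A = L U
a = np.array([[4.0, 1.0],
              [1.0, 3.0]])

p, l, u = lu(a)
d = np.diag(np.diag(u))           # D: the diagonal matrix of pivots

# A = L D L^T ...
print(np.allclose(a, l @ d @ l.T))
# ... and (L sqrt(D))^T is the upper Cholesky factor
g = (l @ np.sqrt(d)).T
print(np.allclose(g, np.linalg.cholesky(a).T))
```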

Gram-Schmidt Orthogonalization

Gram-Schmidt orthogonalization generates an orthonormal basis from an arbitrary basis. An orthonormal basis is a basis, {q_1, q_2, …, q_n}, for which

q_i · q_j = 1 if i = j, and 0 otherwise

In *Mathematica* a Gram-Schmidt orthogonalization can be computed from a set of vectors with the package function GramSchmidt, which is defined in the package LinearAlgebra`Orthogonalization`.
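The algorithm itself is short. Here is a sketch of modified Gram-Schmidt in Python (the function name and the three example vectors are made up; the input vectors are assumed to be linearly independent):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormal basis spanning the same space as `vectors`
    (modified Gram-Schmidt; assumes the inputs are independent)."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for q in basis:
            w = w - (q @ w) * q          # subtract the projection onto q
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

# Three made-up vectors that all tend to lie in the same direction
vs = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.9], [1.0, 0.9, 1.0]]
q = gram_schmidt(vs)

# Each row has unit norm and is orthogonal to the others: q q^T = I
print(np.allclose(q @ q.T, np.eye(3)))
```

The modified variant (projecting against each accepted vector in turn) is numerically more stable than the classical formulation.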

This loads the package.

In[1]:=

This creates a set of three vectors that form a basis for R^3.

In[2]:=

A plot visualizes the vectors; they all tend to lie in the same direction.

In[5]:=

This computes an orthonormal basis.

In[6]:=

Out[6]=

The orthonormal vectors are obviously much more spread out.

In[7]:=

The vectors v1, v2, and v3 are orthonormal; thus the dot product of each vector with itself is 1.

In[8]:=

Out[8]=

In addition, the dot product of each vector with any of the other vectors is 0.

In[9]:=

Out[9]=

This uses Outer to compare all vectors with all other vectors.

In[10]:=

Out[10]=

QR Decomposition

The QR decomposition of a rectangular matrix A with linearly independent columns involves the computation of matrices Q and R such that

A = Q R

where the columns of Q form an orthonormal basis for the columns of A and R is upper triangular. It can be computed in *Mathematica* with the function QRDecomposition.

This computes the QR decomposition of a sample matrix. The result is a list of the two matrices.

In[1]:=

Out[2]=

In fact the first element that is returned is a list of the orthonormal columns of Q. To generate the Q matrix it is necessary to compute the transpose, as shown below.

In[3]:=

Out[4]=

This is a matrix factorization, so multiplying Q and R recreates the original matrix.

In[5]:=

Out[5]=

In addition, the columns of Q are orthonormal. Because QRDecomposition returns the list of columns, each can be extracted with a single part index. The following two examples demonstrate the orthonormal properties of the columns.

In[6]:=

Out[6]=

In[7]:=

Out[7]=

The QR decomposition is also defined for rectangular matrices. In this case the dimensions of Q and R depend on the dimensions of the input matrix.

In[8]:=

Out[10]=

For an m×n input matrix where m ≤ n, QRDecomposition returns a list of two matrices: an m×m matrix Q and an m×n matrix R. When the matrix represents an overdetermined system of equations, that is, m > n, QRDecomposition returns the n orthonormal columns of Q (each of length m) and an n×n matrix R. (Note that to generate the matrix Q, it is necessary to transpose the result from QRDecomposition.) This is demonstrated in the following example.

In[11]:=

Out[13]=

This is often referred to as the thin QR decomposition; see for example Golub and van Loan. If you wish to compute the full QR decomposition, it is possible to pad R out with extra rows of zeros and Q with extra orthonormal columns. This can be done with the following function.

In[14]:=

The Q and R computed by this function are m×m and m×n, respectively.

In[15]:=

Out[17]=

In[18]:=

Out[18]=
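The thin and full forms can also be compared numerically; NumPy's qr exposes both directly through its mode argument (the 4×2 matrix is a made-up example):

```python
import numpy as np

# A made-up overdetermined case: m = 4 rows, n = 2 columns
a = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])

# Thin QR: Q is m x n, R is n x n
q_thin, r_thin = np.linalg.qr(a, mode='reduced')
print(q_thin.shape, r_thin.shape)    # (4, 2) (2, 2)

# Full QR: Q is padded to m x m with extra orthonormal columns,
# R is padded to m x n with extra rows of zeros
q_full, r_full = np.linalg.qr(a, mode='complete')
print(q_full.shape, r_full.shape)    # (4, 4) (4, 2)

print(np.allclose(q_full @ r_full, a))
```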

Solving Systems of Equations

For a square non-singular matrix, the QR decomposition can be used to solve the matrix equation A x = b, just as the LU decomposition can. However, the QR decomposition is also useful for solving the matrix equation when the matrix is rectangular.

One particular application of the QR decomposition is to find least squares solutions to overdetermined systems, by solving the system of normal equations

A^T A x = A^T b

Because A = Q R and Q^T Q = I, the left-hand side can be rewritten as

A^T A x = R^T Q^T Q R x = R^T R x

Thus the normal equations simplify to

R^T R x = R^T Q^T b

and because R^T is non-singular this simplifies to

R x = Q^T b

Because R is triangular this is a particularly easy system to solve; sample code to implement this technique is given in the section Examples: Least Squares QR.
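The final triangular system R x = Q^T b can be sketched numerically (a made-up 4-equation, 2-unknown system; the result is checked against a standard least squares routine):

```python
import numpy as np
from scipy.linalg import solve_triangular

# A made-up overdetermined system A x = b (4 equations, 2 unknowns)
a = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([1.0, 2.0, 2.0, 3.0])

# Thin QR, then one triangular solve of R x = Q^T b
q, r = np.linalg.qr(a)
x = solve_triangular(r, q.T @ b, lower=False)

# Agrees with the least squares solution computed directly
print(np.allclose(x, np.linalg.lstsq(a, b, rcond=None)[0]))
```

Solving via QR avoids forming A^T A explicitly, which would square the condition number of the problem.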

Singular Value Decomposition

The singular value decomposition of a rectangular matrix A involves the computation of orthogonal matrices U and V and a diagonal matrix S such that

A = U S V^T

The diagonal elements of the matrix S are called the singular values of A. In *Mathematica*, the function SingularValueList computes the singular values and SingularValueDecomposition computes the full singular value decomposition.

In[1]:=

Out[2]=

In[3]:=

Out[4]=

This is a factorization so the original matrix can be restored.

In[5]:=

Out[5]//MatrixForm=

Because the matrices U and V are orthogonal, they can be used as an orthogonal transformation of the original matrix to generate the diagonal matrix S of singular values.

In[6]:=

Out[6]//MatrixForm=

If the matrix is singular then some of the singular values will be zero.

In[7]:=

Out[9]//MatrixForm=

SingularValueList only returns the nonzero singular values.

In[10]:=

Out[10]=

Note that if the matrix is complex, the definition of the singular value decomposition uses the conjugate transpose.

There are many applications of the singular value decomposition. The singular values of a matrix A give information on the linear transformation represented by A. For example, the action of A on a unit sphere generates an ellipsoid with semiaxes given by the singular values. The singular values can be used to compute the rank of a matrix; the number of nonzero singular values is equal to the rank. The largest singular value gives the 2-norm of the matrix, and the columns of V that correspond to zero singular values span the null space of the matrix.
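These three applications (rank, 2-norm, null space) can be sketched numerically with a made-up rank-deficient matrix:

```python
import numpy as np

# A made-up rank-deficient 3 x 3 matrix (row 3 = row 1 + row 2)
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])

u, s, vt = np.linalg.svd(a)

# Rank: the number of singular values above a small tolerance
tol = max(a.shape) * np.finfo(float).eps * s.max()
rank = int((s > tol).sum())
print(rank == np.linalg.matrix_rank(a))       # True

# 2-norm: the largest singular value
print(np.isclose(s[0], np.linalg.norm(a, 2))) # True

# Null space: the columns of V (rows of V^T) for zero singular values
null = vt[rank:]
print(np.allclose(a @ null.T, 0))             # True
```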

Certain applications of singular values do not require all of the singular values to be computed. *Mathematica* provides a mechanism for obtaining only some singular values.

This returns the smallest singular value of the input matrix; because it is zero, this demonstrates that the matrix is not full rank.

In[11]:=

Out[12]=

Generalized Singular Values

For an m×n matrix A and a p×n matrix B the generalized singular values are given by the pair of factorizations

A = U C X^T
B = V S X^T

where X is n×n, C is m×n, and S is p×n; U and V are orthogonal, and X is invertible.

These are some sample matrices.

In[1]:=

This returns the generalized singular values of the matrices matA and matB.

In[3]:=

Out[3]=

This computes the full generalized singular value decomposition of the matrices matA and matB.

In[4]:=

This demonstrates the identity for matA.

In[5]:=

Out[5]//MatrixForm=

This demonstrates the identity for matB.

In[6]:=

Out[6]//MatrixForm=

Finally, the generalized singular values are computed by dividing the corresponding diagonal elements.

In[7]:=

Out[7]=

Options

The functions SingularValueList and SingularValueDecomposition both take a Tolerance option.

The option controls the size at which singular values are treated as being zero. By default, singular values smaller than *tol* times the largest singular value are dropped, where *tol* is 100×$MachineEpsilon for machine-number matrices. For arbitrary-precision matrices, the default tolerance is determined by the precision of the matrix.
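The effect of such a relative tolerance can be sketched as follows (the function name and tolerance settings are made up for illustration):

```python
import numpy as np

def singular_value_list(a, tol):
    """Return the singular values of `a` that exceed tol times
    the largest singular value (an illustration of the Tolerance idea)."""
    s = np.linalg.svd(a, compute_uv=False)
    return s[s > tol * s.max()]

# A made-up matrix with singular values 1 and 1e-12
a = np.array([[1.0, 0.0], [0.0, 1.0e-12]])

print(len(singular_value_list(a, 1e-10)))   # 1: the tiny value is dropped
print(len(singular_value_list(a, 1e-14)))   # 2: a smaller tolerance keeps it
```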

The smallest singular value of this matrix is just larger than the default setting for the tolerance. It is not treated as being equivalent to a zero singular value.

In[1]:=

Out[2]=

Increasing the setting for the tolerance causes the small singular value to be treated as being zero.

In[3]:=

Out[3]=

Schur Decomposition

The Schur decomposition of a square matrix A involves finding a unitary matrix Q that can be used for a similarity transformation of A to form a block upper triangular matrix T with 1×1 and 2×2 blocks on the diagonal (the 2×2 blocks correspond to complex conjugate pairs of eigenvalues for a real matrix A). A block upper triangular matrix of this form is called upper quasi-triangular.

The Schur decomposition always exists, so the similarity transformation of a matrix to upper quasi-triangular form always exists. This contrasts with the eigensystem similarity transformation, used to diagonalize a matrix, which does not always exist.

This computes the Schur decomposition of a sample matrix. The result is a list of the two matrices.

In[1]:=

Out[3]=

The matrix Q can be used for a similarity transformation of the original matrix to generate the upper triangular result T.

In[4]:=

Out[4]//MatrixForm=

For this particular matrix, a similarity transformation that generates an even simpler form can be found, because the matrix can be diagonalized.

In[5]:=

Out[7]//MatrixForm=

This matrix cannot be diagonalized because the matrix of eigenvectors is singular.

In[8]:=

Out[10]=

However, the Schur decomposition can be found and the matrix transformed to an upper triangular form.

In[11]:=

Out[12]=

In this example, the matrix has complex eigenvalues.

In[13]:=

Out[14]=

Now the resulting matrix has a 2×2 block on the diagonal.

In[15]:=

Out[16]=

The matrix Q can be used for a similarity transformation on the matrix to generate the upper quasi-triangular result. Note that an upper triangular result (with 1×1 blocks on the diagonal) that may involve complex numbers can be obtained by using the option RealBlockForm. When the result is upper triangular (i.e., has 1×1 blocks on the diagonal), the eigenvalues of the matrix always appear on the diagonal.

In[17]:=

Out[17]//MatrixForm=

Note that if the matrix is complex the definition of the Schur decomposition uses the conjugate transpose and returns an upper triangular result. This is demonstrated for the following complex matrix.

In[18]:=

Out[20]=

This demonstrates that the result satisfies the definition of the Schur decomposition.

In[21]:=

Out[21]//MatrixForm=

The diagonal of the triangular factor T contains the eigenvalues of the original matrix.

In[22]:=

Out[22]=

Generalized Schur Decomposition

For matrices A and B, the generalized Schur decomposition is defined as the pair of factorizations

A = Q S P^H
B = Q T P^H

where Q and P are unitary (^H denotes the conjugate transpose), T is upper triangular, and S is upper quasi-triangular.
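This pair of factorizations corresponds to the QZ decomposition available in SciPy; here is a sketch with made-up matrices (note that SciPy names the two unitary factors Q and Z rather than Q and P):

```python
import numpy as np
from scipy.linalg import qz

# Made-up square matrices
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

# QZ (generalized Schur): A = Q S Z^H and B = Q T Z^H, Q and Z unitary
s, t, q, z = qz(a, b, output='complex')

print(np.allclose(q @ s @ z.conj().T, a))
print(np.allclose(q @ t @ z.conj().T, b))
# In the complex form both S and T are upper triangular
print(np.allclose(np.tril(s, -1), 0) and np.allclose(np.tril(t, -1), 0))
```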

These are some sample matrices.

In[1]:=

This returns the generalized Schur decomposition.

In[3]:=

Out[3]=

This demonstrates the results are consistent with the definition of the decomposition.

In[4]:=

Out[4]//MatrixForm=

In[5]:=

Out[5]//MatrixForm=

For real input, an upper triangular result (which may involve complex numbers) can be obtained with the option RealBlockForm.

Options

SchurDecomposition takes two options.

Options for SchurDecomposition.

The option Pivoting can be used to allow pivoting and scaling to improve the quality of the result. When it is set to True, pivoting and scaling may be used, and an extra matrix that represents the changes to the input matrix is returned; the definition of the Schur decomposition is extended to include this matrix.

The use of pivoting and scaling is now demonstrated for the following matrix. When Pivoting is set to True, pivoting and scaling are used if necessary, and an extra matrix (which here only represents scaling) is returned.

In[1]:=

Out[3]=

This demonstrates the transformation to an upper triangular form.

In[4]:=

Out[4]//MatrixForm=

The diagonal elements of the triangular factor are the eigenvalues of the original matrix.

In[5]:=

Out[5]=

When the Schur decomposition is computed without pivoting and scaling, the diagonal elements of the triangular factor are not as close to the eigenvalues of the matrix. This demonstrates the utility of the Pivoting option.

In[6]:=

Out[7]//MatrixForm=

The option RealBlockForm controls the generation of the upper quasi-triangular result. If it is set to True, a result that may have 2×2 blocks on the diagonal is generated. If it is set to False, the result is upper triangular with 1×1 blocks on the diagonal (but may involve complex numbers). This is demonstrated for the following real matrix, which has complex eigenvalues and a Schur decomposition with 2×2 blocks on the diagonal.

In[8]:=

Out[9]=

Setting RealBlockForm to False generates a matrix that is upper triangular; both Q and T are complex.

In[10]:=

Out[11]=

This generates the upper triangular result.

In[12]:=

Out[12]//MatrixForm=

Jordan Decomposition

The Jordan decomposition of a square matrix involves finding a non-singular matrix that can be used for a similarity transformation of the matrix to generate a matrix (known as the Jordan form) that has a particularly simple triangular structure.

The Jordan decomposition always exists, but it is hard to compute accurately with floating-point arithmetic. However, computation with exact arithmetic avoids these problems.
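The exact-arithmetic approach can be sketched outside *Mathematica* with SymPy, whose jordan_form returns P and J with M = P J P^-1 computed over the rationals (the matrix is a made-up example with a defective eigenvalue):

```python
from sympy import Matrix

# A made-up matrix with exact integer entries; the eigenvalue 2
# is defective (a 2 x 2 Jordan block), the eigenvalue 3 is simple
m = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])

# jordan_form returns P and J with M = P J P^-1, computed exactly
p, j = m.jordan_form()
print(p * j * p.inv() == m)                 # True: exact similarity

# J carries the eigenvalues on its diagonal
diag = [j[i, i] for i in range(3)]
print(sorted(diag) == [2, 2, 3])            # True
```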

This demonstrates the Jordan form for a sample matrix.

In[1]:=

Out[3]//MatrixForm=

The Jordan form has the eigenvalues of the matrix along its diagonal. Eigenvalues that do not have a complete set of eigenvectors (defective eigenvalues) give rise to blocks with 1s just above the diagonal. The Jordan form of the above matrix is shown below.

In[4]:=

Out[4]//MatrixForm=

The Jordan form shows that there are two eigenvalues: -1 and 2. The eigenvalue -1 is repeated twice and has a complete set of eigenvectors. The eigenvalue 2 is repeated four times: it appears once with its own eigenvector, and then in a block of size three that has only one eigenvector. This is demonstrated when the eigensystem of the matrix is computed.

In[5]:=

Out[5]=