Additional functionality related to this tutorial has been introduced in subsequent versions of Mathematica. For the latest information, see Matrices and Linear Algebra.

Performance of Linear Algebra Computation

This tutorial covers issues related to performance.

Packed Arrays

One of the challenges of building Mathematica is to make sure that the system is general enough to support symbolic computation and fast enough so that numerical computation is efficient. These and other goals are described under "Design Principles of Mathematica". One technique that Mathematica provides is to use specialized representations for important types of Mathematica expression. In the case of linear algebra this involves what is known as Packed Array Technology, and it was first introduced in Mathematica 4.0.

An example shows a vector of real numbers.
In[1]:=
Click for copyable input
Out[2]=
In[3]:=
Click for copyable input
Out[3]=
Now insert a symbolic quantity into the vector. Updating elements of vectors and matrices is discussed under "Setting Pieces of a Matrix".
In[4]:=
Click for copyable input
Out[4]=
When the byte count is measured the vector is shown to use more memory.
In[5]:=
Click for copyable input
Out[5]=
In addition, a mathematical operation on the vector is much slower.
In[6]:=
Click for copyable input
Out[6]=

The first vector uses less memory and supports faster computations because it is built from a packed array. Packed arrays are a representation for rectangular tensors of machine integers, machine reals, and complex numbers with real and imaginary parts that are machine reals.

machinesized integers
machineprecision reals
machineprecision complex numbers

Packed array types.

There are a number of functions that are useful for working with packed arrays. tests if its argument is a packed array. This demonstrates that the result of the Table computation is a packed array.
In[7]:=
Click for copyable input
Out[8]=
If a real number is inserted into the array, the result is still packed.
In[9]:=
Click for copyable input
Out[10]=
If a computation is done with the packed array, the result is packed.
In[11]:=
Click for copyable input
Out[12]=
However, if a general symbolic quantity is inserted into the packed array, the result is not packed.
In[13]:=
Click for copyable input
Out[14]=
All the elements of the packed array must be of the same type. Therefore, if the integer 0 is inserted into a packed array of reals, the result is not packed.
In[15]:=
Click for copyable input
Out[17]=

Packed Array Functions

A number of functions for working with packed arrays are summarized.

Developer`PackedArrayQ[expr]return True if expr is a packed array
Developer`ToPackedArray[expr]convert expr to a packed array if possible
Developer`ToPackedArray[expr,t]convert expr to a packed array of type t if possible
Developer`FromPackedArray[expr]convert expr from a packed array
Developer`PackedArrayForm[expr]print packed arrays in expr in a summary form

Packed array functions.

The function can be used to make a packed array.
In[1]:=
Click for copyable input
Out[2]=
If the elements of the list are not all of the same type the result will not be a packed array.
In[3]:=
Click for copyable input
Out[4]=
By specifying a type of Real, the integers are coerced to reals and the result is packed.
In[5]:=
Click for copyable input
Out[6]=
This shows that is a packed array of three reals.
In[7]:=
Click for copyable input
Out[7]=
This does not make a packed array because the argument is not a rectangular tensor.
In[8]:=
Click for copyable input
Out[9]=

Packed Array Operations

The important detail of packed arrays is that they work exactly like lists except for the packed array specific functions described in "Packed Array Functions". For example, SameQ on the packed and nonpacked version of a list will return True.
In[1]:=
Click for copyable input
Out[2]=
The packed and nonpacked versions will behave the same in the pattern matcher.
In[3]:=
Click for copyable input
Out[3]=
In[4]:=
Click for copyable input
Out[4]=

Many functions can work more efficiently with packed arrays compared with unpacked lists, but the results will be the same, whether a packed or an unpacked version is used. In order to maintain consistency, the system must, on occasion, unpack a packed array. Because this is a potential source of inefficiency, a message is available to let you know that something has become unpacked. You can enable this message with a system option.

This sets the system option and creates a packed array to use as an example.
In[5]:=
Click for copyable input
Out[7]=
Now if a real number is inserted into the packed array, it must be unpacked and a message is produced. This message can warn you that your code may not run as efficiently.
It is usually a good idea to turn the message off again.
In[9]:=
Click for copyable input

Packed Array Summary

The main point about packed arrays is that Mathematica uses them automatically where appropriate. This is very much the case for the linear algebra functions, which naturally work with the same internal representation. Thus many functions return packed arrays even if the input is not packed. For example, the function Dot returns a packed array because internally it works with the packed version.
In[1]:=
Click for copyable input
Out[2]=

To make use of packed arrays, it is typically hard to exploit functions such as . The main issue is to make sure that you have a uniformity of elements in your matrices; for example, that you are not mixing integers and reals.

Programming Efficiency

This section discusses techniques for writing efficient Mathematica programs that solve linear algebra problems.

Measuring Performance

If you want to study the performance of your Mathematica code there are a number of timing related functions you can use.

Timing[expr]evaluate expr and return a list of the CPU time needed, together with the result obtained
AbsoluteTiming[expr]evaluate expr giving the absolute time taken

Timing Mathematica operations.

The Timing function is particularly useful because it returns the CPU time needed for the computation.
In[1]:=
Click for copyable input
Out[2]=

Another useful function is TimeConstrained.

TimeConstrained[expr,t]try to evaluate expr, aborting the calculation after t seconds
TimeConstrained[expr,t,failexpr]return failexpr if the time constraint is not met

Timeconstrained calculation.

This helps you to develop and test your code, shutting it down if it takes more than a preset limit.
In[3]:=
Click for copyable input
Out[4]=

Vectorizing Loops

A common operation involves some type of iteration over the elements of a matrix. Because the matrices may be large it is obviously important to do this efficiently.

One of the most important ways to write efficient code is to avoid explicit part references, especially in inner loops. This is a major source of inefficiency in user code and can be avoided by using functions that work on all the elements of a matrix or a vector at once. These vectorized operations allow Mathematica to use highly optimized code that runs faster.

List Creation

This problem involves creating a list and then initializing it with certain values. For example, computing

        vi  Sin[i]

One way to do this would be to create the list and then to use a For loop that iterates over its elements to set them to a value.

    n = 10;    
    v = Table[0, {n}];
    For[i = 1, i <= n, i++, v[[i]] = Sin[2.0 Pi i]];

Slow way to initialize a list.

It is much faster to create the list and initialize it all at once; it is also neater.

    n = 10;
    v = Table[ Sin[2.0 Pi i], {i, 1, n}]

Faster way to initialize a list.

An even faster way is to use the optimized function Range to create the list and then to use vectorized operations.

    n = 10;
    v = Sin[2.0 Pi Range[1., n]]

An even faster way to initialize a list.

Commands such as Table and Range are very optimized for building lists and matrices. They are discussed under "Building Matrices".

List Updating

This problem involves updating the elements of a list; for example, computing the Sin of every element.

        vi  Sin[vi]

One way to do this involves iterating over the elements of the list replacing each with the updated version.

    Do[v[[i]] = Sin[v[[i]]], {i, n}];

Slow way to update a list.

It is much faster to compute the new values by using a vectorized operation. The code is also neater and tidier.

    v = Sin[v]

Faster way to update a list.

Another form of updating requires another argument that is not constant across the list.

        vi  vi/i^2

One way to do this involves iterating over the elements of the list, dividing each by the square of the index.

    n = Length[v];
    Do[ v[[i]] = v[[i]]/i^2 , {i, n}];

Slow way to update a list.

It is faster to compute the list of updates and then carry out a vectorized division.

    d=Range[n]^2;
    v=v/d

Faster way to update a list.

The use of listable operations is discussed under "Listability".

Using Built-in Support

Mathematica has a very wide range of functions, including many for list processing that have important applications for writing matrix computations. Before you start to work it is worth looking to see if any of the built-in functions will do what you want. For example, if what you are doing involves a convolution or correlation then you should use the appropriate functions; they will almost certainly be faster.

Another important point to remember is that Mathematica can represent a wide range of mathematical objects. For example, Mathematica can work with the actual linear equations that matrices are used to represent. Sometimes the equational form can be more expressive and convenient and this can lead to greater efficiency. The topic is discussed under "Converting Equations to Sparse Arrays".

One area of built-in functions with many applications for linear algebra computations are those for structural manipulation of lists. Mathematica functions such as Part, Take, Drop, Transpose, PadLeft, PadRight, RotateLeft, and RotateRight are all heavily optimized and have many applications. These are discussed in more detail under "Structural Operations". It is usually good if you can write your computations to use these functions.

A worked example will now be given that discusses different techniques to extend a matrix by tiling copies. For example, the input matrix

            

should be padded out to return the following matrix.

            

Three different ways to solve the problem are demonstrated.

A Slow Way

The slowest way to do this is to write this in a step-by-step approach directly manipulating every element.
In[1]:=
Click for copyable input
In[2]:=
Click for copyable input
Out[3]//MatrixForm=
In[4]:=
Click for copyable input
Out[5]=

This way suffers from several speed deficiencies, the most severe being the inner loops over the part indices. These would be much faster if they were replaced by vectorized code. It was explained in "Measuring Performance" that explicit part references in an inner loop can lead to inefficient code.

One thing to note about the function is that it will work when given a sparse matrix as input, but it will return a dense matrix as output.
In[6]:=
Click for copyable input
Out[7]=

A Faster Way

A much faster way to replicate the matrix is to write it to use Mathematica vectorizing functions.
In[1]:=
Click for copyable input
In[2]:=
Click for copyable input
Out[3]=
In[4]:=
Click for copyable input
Out[5]=

This method is much faster; it uses the function Part, which is very flexible and heavily optimized. It is described further under "Getting Pieces of a Matrix".

This function works for a sparse matrix as input, and it returns the result as a sparse array which maintains equivalent sparsity.
In[6]:=
Click for copyable input
Out[7]=

Also Fast but Neater

Another way to extend the matrix, which involves no programming, is to use the built-in function PadRight.
In[1]:=
Click for copyable input
Out[2]=
In[3]:=
Click for copyable input
Out[4]=

This method is also neater and is probably the fastest for small input matrices. For large matrices, it is similar to the speed of the Part example already shown. The use of PadRight is described under "Extending Matrices".

This function works for a sparse matrix as input, and it returns the result as a sparse array which maintains equivalent sparsity.
In[5]:=
Click for copyable input
Out[6]=

Matrix Contents

Matrices in Mathematica can be constructed from all the different types of object that Mathematica holds. They can contain machine-precision floating-point numbers, arbitrary-precision floating-point numbers, complex floating-point numbers, integers, rational numbers, and general symbolic quantities. The different types are discussed under "Matrix Types".

That Mathematica can work with these different types of matrix is a great strength of the system. However, it has a number of implications for the efficiency of numerical floating-point computation: first from a mix of symbolic and numerical entries, secondly from a mix of different types of numbers, and thirdly from integer matrices.

Mixed Symbolic/Numerical Matrices

If you mix symbolic and numerical elements in a matrix, the linear algebra functions will treat the matrix as symbolic as discussed under "Mixed Mode Matrices". It should be noted that techniques for working with symbolic matrices are typically slower than the heavily optimized techniques available for numerical matrices.

An example here compares matrix multiplication of two 200×200 matrices. The second computation, which involves a symbolic matrix, is much slower.
In[1]:=
Click for copyable input
Out[2]=
In[3]:=
Click for copyable input
Out[4]=

The solution is to make sure that your matrices only contain numbers.

Mixed Numerical Type Matrices

If you mix different types of numbers in your matrices the linear algebra functions will treat the matrix as numerical and will coerce the matrix to the lowest precision. This is discussed under "Mixed Mode Matrices". For linear algebra functions this will not have as severe an impact on performance as when working with symbolic matrices (this was demonstrated in the previous section).

This example compares matrix multiplication of two 200×200 matrices. The second matrix, which contains an integer, is slower for matrix/matrix multiplication. If the operation was something more costly, this difference might not be significant.
In[1]:=
Click for copyable input
Out[2]=
In[3]:=
Click for copyable input
Out[4]=
The cost of mixed numerical matrices comes because Mathematica cannot use its efficient storage techniques, as discussed in the section "Packed Arrays". This is demonstrated in the following example. Here, a vector of numbers is allocated and this is updated by some process. Because the updated values will be real numbers and the initial vector contained integers, the process cannot use packed arrays.
In[5]:=
Click for copyable input
Out[7]=
In[8]:=
Click for copyable input
Out[8]=
In this version the vector is initialized with real numbers. Now the computation can use packed arrays throughout and it runs faster.
In[9]:=
Click for copyable input
Out[11]=
In[12]:=
Click for copyable input
Out[12]=

The solution to this problem is to make sure you initialize with real numbers if you are going to work with real numbers.

Integer Matrices

A final type of problem involves working with purely integer matrices. Mathematica makes a distinction between working with integer and approximate real matrices; this is discussed under "Matrices in Mathematica". Because an integer matrix will use symbolic computation techniques, it will be slower. Of course, if Mathematica did not use integer matrix techniques with matrices of integers it would not be so general.
In[1]:=
Click for copyable input
Out[2]=
In[3]:=
Click for copyable input
Out[5]=
The difference between the two computations can be compared. The eigenvalues of the integer matrix are returned with the exact values; in this case this involves radicals. The eigenvalues of the real matrix are returned with approximate real values.
In[6]:=
Click for copyable input
Out[6]=
In[7]:=
Click for copyable input
Out[7]=

Of course, it is a strength of Mathematica that it can return the exact result for a computation involving an integer matrix. But it should be noted that many purely numerical systems would return the approximate result.

The solution is that to work with matrices of approximate real numbers, you should start with approximate real numbers.

Expression Efficiency

"Matrices as Mathematica Expressions" explained that Mathematica represents matrices as expressions, a uniform data type that is used throughout the system. It described the advantages that came from this uniformity in the system. There are a number of efficiency implications that arise from the use of Mathematica expressions. This section discusses these.

Updating of Matrices

A general principle of Mathematica expressions is that they cannot be changed directly. This immutability of expressions is very useful; it allows an expression to be shared in many locations and each use need not be concerned that some other part of the code will modify the expression. This means that when an expression is updated (this is described under "Setting Pieces of a Matrix"), it may be necessary to make a copy. In this example, a vector is made and assigned to two different symbols.
In[1]:=
Click for copyable input
Now change the contents of one symbol.
In[3]:=
Click for copyable input
Out[4]=
However, the other one is left unchanged. This is because the expression was copied before modification.
In[5]:=
Click for copyable input
Out[5]=
This copying can have an implication for efficiency, especially if you are iterating over the elements of a matrix and updating it. If this is done and you also have other copies of the matrix, it will need to be copied. The principle is demonstrated in the following two examples. In this first example there are no extra copies of the vector and so it does not need to be copied and the loop runs faster.
In[6]:=
Click for copyable input
Out[7]=
In this example there are extra copies of the vector and so it does need to be copied at each step; the loop runs more slowly.
In[8]:=
Click for copyable input
Out[9]=
Another slow way to update elements is to use ReplacePart. This also has to copy at every step.
In[10]:=
Click for copyable input
Out[11]=

If you use the vectorizing operations described previously, this is a good way to avoid the iteration over the matrix and minimize the number of copying operations. If you really cannot avoid iterating over the matrix, it is a good idea to keep the loop as simple and tidy as possible so that you can avoid extra copies of the matrix.

The time for the overall process will scale with the square of the length of the list for the technique that has to copy at every step. With no copying the time will scale linearly. The longer the input list, the greater the impact, and the performance of the code will start to degrade.

Appending to Matrices

In Mathematica, expressions are implemented as arrays. This allows fast access to any particular elements of an expression. Because matrices are regular Mathematica expressions, they are also arrays. While arrays are very fast for accessing elements, they are slower for adding elements. Typically this requires a copy, and it should be avoided. The example that follows shows the cost of adding elements.

In[1]:=
Click for copyable input
Out[2]=
It is much more efficient to generate the entire vector at once.
In[3]:=
Click for copyable input
Out[3]=
If the arguments are not known all at once, other techniques may be used. These are demonstrated next. First, build a list of random numbers.
In[4]:=
Click for copyable input
The first technique is using AppendTo.
In[6]:=
Click for copyable input
Out[7]=
It is much faster to accumulate the list, inserting empty lists that can be removed at the end with Flatten.
In[8]:=
Click for copyable input
Out[8]=
It may be inconvenient to use Flatten because this might interfere with the structure of the list. In this case it might be convenient to use Sow and Reap. This is more flexible than the Flatten approach.
In[9]:=
Click for copyable input
Out[9]=
All three methods generate the same result.
In[10]:=
Click for copyable input
Out[10]=

The time for the overall process will scale with the square of the length of the list for the technique that has to copy at every step. With no copying the time will scale linearly. The longer the input list, the greater the impact, and the performance of the code will start to degrade.