MATHEMATICA TUTORIAL

Stiffness Detection

Overview

Many differential equations exhibit some form of stiffness, which restricts the step size and hence effectiveness of explicit solution methods.

A number of implicit methods have been developed over the years to circumvent this problem.

For the same step size, implicit methods can be substantially less efficient than explicit methods, due to the overhead associated with the intrinsic linear algebra.

This cost can be offset by the fact that, in certain regions, implicit methods can take substantially larger step sizes.

Several attempts have been made to provide user-friendly codes that automatically attempt to detect stiffness at runtime and switch between appropriate methods as necessary.

A number of strategies that have been proposed to automatically equip a code with a stiffness detection device are outlined here.

Particular attention is given to the problem of estimation of the dominant eigenvalue of a matrix in order to describe how stiffness detection is implemented in NDSolve.

Numerical examples illustrate the effectiveness of the strategy.

Initialization

Load some packages with predefined examples and utility functions.
In[1]:=
Click for copyable input

Introduction

Consider the numerical solution of initial value problems:

Stiffness is a combination of problem, solution method, initial condition and local error tolerances.

Stiffness limits the effectiveness of explicit solution methods due to restrictions on the size of steps that can be taken.

Stiffness arises in many practical systems as well as in the numerical solution of partial differential equations by the method of lines.

Example

The van der Pol oscillator is a non-conservative oscillator with nonlinear damping and is an example of a stiff system of ordinary differential equations:

with ;

consider initial conditions

and solve over the interval .

The method uses a pair of extrapolation methods by default:

  • Explicit modified midpoint (Gragg smoothing), double-harmonic sequence 2, 4, 6, ...
  • Linearly implicit Euler, sub-harmonic sequence 2, 3, 4, ...

Solution

This loads the problem from a package.
In[4]:=
Click for copyable input
Solve the system numerically using a nonstiff method.
Solve the system using a method that switches when stiffness occurs.
In[6]:=
Click for copyable input
Here is a plot of the two solution components. The sharp peaks (in blue) extend out to about 450 in magnitude and have been cropped.
In[7]:=
Click for copyable input
Out[7]=

Stiffness can often occur in regions that follow rapid transients.

This plots the step sizes taken against time.
In[8]:=
Click for copyable input
Out[8]=

The problem is that when the solution is changing rapidly, there is little point using a stiff solver, since local accuracy is the dominant issue.

For efficiency, it would be useful if the method could automatically detect regions where local accuracy (and not stability) is important.

Linear Stability

Linear stability theory arises from the study of Dahlquist's scalar linear test equation:

as a simplified model for studying the initial value problem (1).

Stability is characterized by analyzing a method applied to (2) to obtain

where and is the (rational) stability function.

The boundary of absolute stability is obtained by considering the region:

Explicit Euler Method

The explicit or forward Euler method

applied to (3) gives

The shaded region represents instability, where TemplateBox[{{R, (, z, )}}, Abs]>1.

In[9]:=
Click for copyable input
Out[9]=

The linear stability boundary (LSB) is often taken as the intersection with the negative real axis.

For the explicit Euler method .

For an eigenvalue of , linear stability requirements mean that the step size needs to satisfy , which is a very mild restriction.

However, for an eigenvalue of , linear stability requirements mean that the step size needs to satisfy , which is a very severe restriction.

Example

This example shows the effect of stiffness on the step size sequence when using an explicit Runge-Kutta method to solve a stiff system.

This system models a chemical reaction.
In[10]:=
Click for copyable input
The system is solved by disabling the built-in stiffness detection.
In[11]:=
Click for copyable input
The step size sequence starts to oscillate when the stability boundary is reached.
In[12]:=
Click for copyable input
Out[12]=
  • A large number of step rejections often has a negative impact on performance.
    • The large number of steps taken adversely affects the accuracy of the computed solution.
    The built-in detection does an excellent job of locating when stiffness occurs.

Implicit Euler Method

The implicit or backward Euler method:

applied to (4) gives:

The method is unconditionally stable for the entire left half-plane.

In[14]:=
Click for copyable input
Out[14]=

This means that to maintain stability there is no longer a restriction on the step size.

The drawback is that an implicit system of equations now has to be solved at each integration step.

Type Insensitivity

A type-insensitive solver recognizes and responds efficiently to stiffness at each step and so is insensitive to the (possibly changing) type of the problem.

One of the most established solvers of this class is LSODA [H83], [P83].

Later generations of LSODA such as CVODE no longer incorporate a stiffness detection device. The reason is because LSODA uses norm bounds to estimate the dominant eigenvalue and these bounds, as will be seen later, can be quite inaccurate.

The low order of A()-stable BDF methods means that LSODA and CVODE are not very suitable for solving systems with high accuracy or systems where the dominant eigenvalue has a large imaginary part. Alternative methods, such as those based on extrapolation of linearly implicit schemes, do not suffer from these issues.

Much of the work on stiffness detection was carried out in the 1980s and 1990s using standalone FORTRAN codes.

New linear algebra techniques and efficient software have since become available and these are readily accessible in Mathematica.

Stiffness can be a transient phenomenon, so detecting nonstiffness is equally important [S77], [B90].

"StiffnessTest" Method Option

There are several approaches that can be used to switch from a nonstiff to a stiff solver.

Direct Estimation

A convenient way of detecting stiffness is to directly estimate the dominant eigenvalue of the Jacobian of the problem (see [S77], [P83], [S83], [S84a], [S84c], [R87] and [HW96]).

Such an estimate is often available as a by-product of the numerical integration and so it is reasonably inexpensive.

If denotes an approximation to the eigenvector corresponding to the dominant eigenvalue of the Jacobian, with sufficiently small, then by the mean value theorem a good approximation to the leading eigenvalue is

Richardson's extrapolation provides a sequence of refinements that yields a quantity of this form, as do certain explicit Runge-Kutta methods.

Cost is at most two function evaluations, but often at least one of these is available as a by-product of the numerical integration, so it is reasonably inexpensive.

Let denote the linear stability boundary—the intersection of the linear stability region with the negative real axis.

The product gives an estimate that can be compared to the linear stability boundary of a method in order to detect stiffness:

where is a safety factor.

Description

The methods , , and have the option , which can be used to identify whether the method applied with the specified AccuracyGoal and PrecisionGoal tolerances to a given problem is stiff.

The method option itself accepts a number of options that implement a weak form of (5) where the test is allowed to fail a specified number of times.

The reason for this is that some problems can be only mildly stiff in a certain region and an explicit integration method may still be efficient.

"NonstiffTest" Method Option

The method has the option , which is used to switch back from a stiff method to a nonstiff method.

The following settings are allowed for the option :

Switching to a Nonstiff Solver

An approach that is independent of the stiff method is used.

Given the Jacobian J (or an approximation), compute one of:

Norm Bound:

Spectral Radius: rho(J)=max TemplateBox[{{lambda, _, i}}, Abs]

Dominant Eigenvalue .

Many linear algebra techniques focus on solving a single problem to high accuracy.

For stiffness detection, a succession of problems with solutions to one or two digits is adequate.

For a numerical discretization

consider a sequence of matrices in some sub-interval(s)

The spectra of the succession of matrices often change very slowly from step to step.

The goal is to find a way of estimating (bounds on) dominant eigenvalues of succession of matrices that:

  • Costs less than the work carried out in the linear algebra at each step in the stiff solver.
  • Takes account of the step-to-step nature of the solver.

NormBound

A simple and efficient technique of obtaining a bound on the dominant eigenvalue is to use the norm of the Jacobian TemplateBox[{J, p}, Norm2] where typically or .

The method has complexity , which is less than the work carried out in the stiff solver.

This is the approach used by LSODA.

  • Norm bounds for dense matrices overestimate and the bounds become worse as the dimension increases.
    • Norm bounds can be tight for sparse or banded matrices of quite large dimension.

    The setting of the option computes TemplateBox[{J, 1}, Norm2] and TemplateBox[{J, infty}, Norm2] and returns the smaller of the two values.

Example

The following Jacobian matrix arises in the numerical solution of the van der Pol system using a stiff solver.
In[2]:=
Click for copyable input
Bounds based on norms overestimate the spectral radius by more than an order of magnitude.
In[3]:=
Click for copyable input
Out[3]=

Direct Eigenvalue Computation

For small problems () it can be efficient just to compute the dominant eigenvalue directly.

  • Hermitian matrices use the LAPACK function xgeev
    • General matrices use the LAPACK function xsyevr

    The setting of the option computes the dominant eigenvalue of using the same LAPACK routines as Eigenvalues.

    For larger problems the cost of direct eigenvalue computation is , which becomes prohibitive when compared to the cost of the linear algebra work in a stiff solver.

    A number of iterative schemes have been implemented for this purpose. These effectively work by approximating the dominant eigenspace in a smaller subspace and using dense eigenvalue methods for the smaller problem.

The Power Method

Shampine has proposed the use of the power method for estimating the dominant eigenvalue of the Jacobian [S91].

The power method is perhaps not a very well-respected method, but has received a resurgence of interest due to its use in Google's page ranking.

The power method can be used when

  • has n linearly independent eigenvectors (diagonalizable).
  • The eigenvalues can be ordered in magnitude as .
  • is the dominant eigenvalue of .

Description

Given a starting vector , compute

The Rayleigh quotient is used to compute an approximation to the dominant eigenvalue:

In practice, the approximate eigenvector is scaled at each step:

Properties

The power method converges linearly with rate

which can be slow.

In particular, the method does not converge when applied to a matrix with a dominant complex conjugate pair of eigenvalues.

Generalizations

The power method can be adapted to overcome the issue of equimodular eigenvalues (e.g. NAPACK).

However the modification does not generally address the issue of the slow rate of convergence for clustered eigenvalues.

There are two main approaches to generalizing the power method:

  • Subspace iteration for small to medium dimensions
    • Arnoldi iteration for large dimensions

    Although the methods work quite differently, there are a number of core components that can be shared and optimized.

    Subspace and Krylov iteration cost operations.

    They project an matrix to an matrix, where .

    The small matrix represents the dominant eigenspace and approximation uses dense eigenvalue routines.

Subspace Iteration

Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on m vectors at each step.

Start with an orthonormal set of vectors , where usually :

Form the product with the matrix :

In order to prevent all vectors from converging to multiples of the same dominant eigenvector of , they are orthonormalized:

The orthonormalization step is expensive compared to the matrix product.

Rayleigh-Ritz Projection

Input: matrix and an orthonormal set of vectors

  • Compute the Rayleigh quotient
    • Compute the Schur decomposition

    The matrix S has small dimension .

    Note that the Schur decomposition can be computed in real arithmetic when using a quasi upper triangular matrix .

Convergence

Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on vectors at each step.

SRRIT converges linearly with rate

In particular, the rate for the dominant eigenvalue is

Therefore it can be beneficial to take, for example, or more even if we are only interested in the dominant eigenvalue.

Error Control

A relative error test on successive approximations' dominant eigenvalue is

This is not sufficient since it can be satisfied when convergence is slow.

If or then the ^(th) column of is not uniquely determined.

The residual test used in SRRIT is:

where , is the ^(th) column of , and is the ^(th) column of .

This is advantageous since it works for equimodular eigenvalues.

The first column position of the upper triangular matrix is tested because of the use of an ordered Schur decomposition.

Implementation

There are several implementations of subspace iteration.

  • Subspace iteration with Chebyshev acceleration [S84b], [DS93]

    The implementation for use in is based on:

    • Schur Rayleigh-Ritz iteration [BS97]

    "An attractive feature of SRRIT is that it displays monotonic consistency, that is, as the convergence tolerance decreases so does the size of the computed residuals" [LS96].

    SRRIT makes use of an ordered Schur decomposition where eigenvalues of the largest modulus appear in the upper-left entries.

    Modified Gram-Schmidt with reorthonormalization is used to form , which is faster than Householder transformations.

    The approximate dominant subspace at integration time is used to start the iteration at the next integration step :

KrylovIteration

Given an matrix whose columns comprise an orthogonal basis of a given subspace :

The Rayleigh-Ritz procedure consists of computing and solving the associated eigenproblem .

The approximate eigenpairs of the original problem , satisfy and , which are called Ritz values and Ritz vectors.

The process works best when the subspace approximates an invariant subspace of .

This process is effective when is equal to the Krylov subspace associated with a matrix and a given initial vector as

Description

The method of Arnoldi is a Krylov-based projection algorithm that computes an orthogonal basis of the Krylov subspace and produces a projected matrix with .

Input: matrix , the number of steps , an initial vector of norm 1.

Output: with

In the case of Arnoldi, has an unreduced upper Hessenberg form (upper triangular with an additional nonzero subdiagonal).

Orthogonalization is usually carried out by means of a Gram-Schmidt procedure.

The quantities computed by the algorithm satisfy

The residual gives an indication of proximity to an invariant subspace and the associated norm indicates the accuracy of the computed Ritz pairs:

Restarting

The Ritz pairs converge quickly if the initial vector is rich in the direction of the desired eigenvalues.

When this is not the case then a restarting strategy is required in order to avoid excessive growth in both work and memory.

There are several strategies for restarting, in particular:

  • Explicit restart—a new starting vector is a linear combination of a subset of the Ritz vectors.
    • Implicit restart—a new starting vector is formed from the Arnoldi process combined with an implicitly shifted QR algorithm.

    Explicit restart is relatively simple to implement, but implicit restart is more efficient since it retains the relevant eigeninformation of the larger problem. However implicit restart is difficult to implement in a numerically stable way.

    An alternative that is much simpler to implement, but achieves the same effect as implicit restart, is a Krylov-Schur method [S01].

Implementation

A number of software implementations are available, in particular:

    The implementation in is based on Krylov-Schur iteration.

Automatic Strategy

The setting uses an amalgamation of the methods as follows.

  • For direct eigenvalue computation is used. Either or is used, depending on which method is active.
  • For subspace iteration is used with a default basis size of . If the method succeeds then the resulting basis is used to start the method at the next integration step.
  • If subspace iteration fails to converge after iterations then the dominant vector is used to start the Krylov method with a default basis size of . Subsequent integration steps use the Krylov method, starting with the resulting vector from the previous step.
  • If Krylov iteration fails to converge after iterations, then norm bounds are used for the current step. The next integration step will continue to try to use Krylov iteration.
  • Since they are so inexpensive, norm bounds are always computed when subspace or Krylov iteration is used and the smaller of the absolute values is used.

Step Rejections

Caching of the time of evaluation ensures that the dominant eigenvalue estimate is not recomputed for rejected steps.

Stiffness detection is also performed for rejected steps since:

  • Step rejections often occur for nonstiff solvers when working near the stability boundary.
  • Step rejections often occur for stiff solvers when resolving fast transients.

Iterative Method Options

The iterative methods of have options that can be modified:

In[20]:=
Click for copyable input
Out[20]=
In[21]:=
Click for copyable input
Out[21]=

The default tolerance aims for just one correct digit, but often obtains substantially more accurate values—especially after a few successful iterations at successive steps.

The default values limiting the number of iterations are:

  • For subspace iteration .
    • For Krylov iteration .

    If these values are set too large then a convergence failure becomes too costly.

    In difficult problems, it is better to share the work of convergence across steps. Since the methods effectively refine the basis vectors from the previous step, there is a reasonable chance of convergence in subsequent steps.

Latency and Switching

It is important to incorporate some form of latency in order to avoid a cycle where the method continually tries to switch between stiff and nonstiff methods.

The options and of and are used for this purpose.

The default settings allow switching to be quite reactive, which is appropriate for one-step integration methods.

  • is carried out at the end of a step with a nonstiff method. When either value of the option is reached, a step rejection occurs and the step is recomputed with a stiff method.
  • is preemptive. It is performed before a step is taken with a stiff solve using the Jacobian matrix from the previous step.

Examples

Van der Pol

Select an example system.
In[22]:=
Click for copyable input

StiffnessTest

The system is integrated successfully with the given method and the default option settings for .
In[23]:=
Click for copyable input
Out[23]=
A longer integration is aborted and a message is issued when the stiffness test condition is not satisfied.
In[24]:=
Click for copyable input
Out[24]=
Using a unit safety factor and specifying that only one stiffness failure is allowed effectively gives a strict test. The specification uses the nested method option syntax.
In[25]:=
Click for copyable input
Out[25]=

NonstiffTest

For such a small system, direct eigenvalue computation is used.

The example serves as a good test that the overall stiffness switching framework is behaving as expected.

Set up a function to monitor the switch between stiff and nonstiff methods and the step size taken. Data for the stiff and nonstiff solvers is put in separate lists by using a different tag for Sow.
In[26]:=
Click for copyable input
Solve the system and collect the data for the method switching.
In[28]:=
Click for copyable input
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red).
In[30]:=
Click for copyable input
Out[30]=
Compute the number of nonstiff and stiff steps taken (including rejected steps).
In[31]:=
Click for copyable input
Out[31]=

CUSP

The cusp catastrophe model for the nerve impulse mechanism [Z72]:

Combining with the van der Pol oscillator gives rise to the CUSP system [HW96]:

where

and and .

Discretization of the diffusion terms using the method of lines is used to obtain a system of ODEs of dimension .

Unlike the van der Pol system, because of the size of the problem, iterative methods are used for eigenvalue estimation.

Step Size and Order Selection

Select the problem to solve.
In[32]:=
Click for copyable input
Set up a function to monitor the type of method used and step size. Additionally, the order of the method is included as a tooltip.
In[33]:=
Click for copyable input
Collect the data for the order of the method as the integration proceeds.
In[35]:=
Click for copyable input
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red). A tooltip shows the order of the method at each step.
In[37]:=
Click for copyable input
Out[37]=
Compute the total number of nonstiff and stiff steps taken (including rejected steps).
In[39]:=
Click for copyable input
Out[39]=

Jacobian Example

Define a function to collect the first few Jacobian matrices.
In[41]:=
Click for copyable input
In[43]:=
Click for copyable input

A switch to a stiff method occurs near 0.00113425 and the first test for nonstiffness occurs at the next step .

Graphical illustration of the Jacobian .
In[45]:=
Click for copyable input
Out[45]=
Define a function to compute and display the first few eigenvalues of , , ... and the norm bounds.
In[46]:=
Click for copyable input
In[47]:=
Click for copyable input
Out[47]=

Norm bounds are quite sharp in this example.

Korteweg-deVries

The Korteweg-deVries partial differential equation is a mathematical model of waves on shallow water surfaces:

We consider boundary conditions

and solve over the interval .

Discretization using the method of lines is used to form a system of 192 ODEs.

Step Sizes

Select the problem to solve.
In[48]:=
Click for copyable input
The backward differentiation formula methods used in LSODA run into difficulties solving this problem.
In[49]:=
Click for copyable input
Out[49]=
A plot shows that the step sizes rapidly decrease.
In[50]:=
Click for copyable input
Out[50]=
In contrast, immediately switches to using the linearly implicit Euler method, which needs very few integration steps.
In[51]:=
Click for copyable input
Out[51]=
In[52]:=
Click for copyable input
Out[52]=

The extrapolation methods never switch back to a nonstiff solver once the stiff solver is chosen at the beginning of the integration.

Therefore this is a form of worst case example for the nonstiff detection.

Despite this, the cost of using subspace iteration is only a few percent of the total integration time.

Compute the time taken with switching to a nonstiff method disabled.
In[53]:=
Click for copyable input
Out[53]=

Jacobian Example

Collect data for the first few Jacobian matrices using the previously defined monitor function.
In[54]:=
Click for copyable input
Graphical illustration of the initial Jacobian .
In[56]:=
Click for copyable input
Out[56]=
Compute and display the first few eigenvalues of , , ... and the norm bounds.
In[57]:=
Click for copyable input
Out[57]=

Norm bounds overestimate slightly, but more importantly they give no indication of the relative size of real and imaginary parts.

Option Summary

StiffnessTest

option name
default value
"MaxRepetitions"{3,5}specify the maximum number of successive and total times that the stiffness test (6) is allowed to fail
"SafetyFactor"specify the safety factor to use in the right-hand side of the stiffness test (7)

Options of the method option .

NonstiffTest

option name
default value
"MaxRepetitions"{2,∞}specify the maximum number of successive and total times that the stiffness test (8) is allowed to fail
"SafetyFactor"specify the safety factor to use in the right-hand side of the stiffness test (9)

Options of the method option .

New to Mathematica? Find your learning path »
Have a question? Ask support »