Stiffness Detection
Overview
Many differential equations exhibit some form of stiffness, which restricts the step size and hence effectiveness of explicit solution methods.
A number of implicit methods have been developed over the years to circumvent this problem.
For the same step size, implicit methods can be substantially less efficient than explicit methods, due to the overhead associated with the intrinsic linear algebra.
This cost can be offset by the fact that, in certain regions, implicit methods can take substantially larger step sizes.
Several attempts have been made to provide user-friendly codes that automatically attempt to detect stiffness at runtime and switch between appropriate methods as necessary.
A number of strategies that have been proposed to automatically equip a code with a stiffness detection device are outlined here.
Particular attention is given to the problem of estimating the dominant eigenvalue of a matrix, in order to describe how stiffness detection is implemented in NDSolve.
Numerical examples illustrate the effectiveness of the strategy.
Initialization
Load some packages with predefined examples and utility functions.
Introduction
Consider the numerical solution of initial value problems:
$$y'(t) = f(t, y(t)), \quad y(0) = y_0 \qquad (1)$$
Stiffness is a combination of problem, solution method, initial condition and local error tolerances.
Stiffness limits the effectiveness of explicit solution methods due to restrictions on the size of steps that can be taken.
Stiffness arises in many practical systems as well as in the numerical solution of partial differential equations by the method of lines.
Example
The van der Pol oscillator is a non-conservative oscillator with nonlinear damping and is an example of a stiff system of ordinary differential equations:
with

;
consider initial conditions
and solve over the interval

.
The method

uses a pair of extrapolation methods by default:
- Explicit modified midpoint (Gragg smoothing), double-harmonic sequence 2, 4, 6, ...
- Linearly implicit Euler, sub-harmonic sequence 2, 3, 4, ...
Solution
This loads the problem from a package.
Solve the system numerically using a nonstiff method.
Solve the system using a method that switches when stiffness occurs.
Here is a plot of the two solution components. The sharp peaks (in blue) extend out to about 450 in magnitude and have been cropped.
Stiffness can often occur in regions that follow rapid transients.
This plots the step sizes taken against time.
The problem is that when the solution is changing rapidly, there is little point using a stiff solver, since local accuracy is the dominant issue.
For efficiency, it would be useful if the method could automatically detect regions where local accuracy (and not stability) is important.
Linear Stability
Linear stability theory arises from the study of Dahlquist's scalar linear test equation:
$$y'(t) = \lambda\, y(t), \quad \lambda \in \mathbb{C}, \quad \operatorname{Re}(\lambda) < 0 \qquad (2)$$
as a simplified model for studying the initial value problem (1).
Stability is characterized by analyzing a method applied to (2) to obtain
$$y_{n+1} = R(z)\, y_n$$
where $z = h \lambda$ and $R(z)$ is the (rational) stability function.
The boundary of absolute stability is obtained by considering the region:
$$|R(z)| = 1$$
Explicit Euler Method
The explicit or forward Euler method
$$y_{n+1} = y_n + h f(t_n, y_n)$$
applied to (2) gives $R(z) = 1 + z$.
The shaded region represents instability, where $|R(z)| > 1$.
The linear stability boundary (LSB) is often taken as the intersection with the negative real axis.
For the explicit Euler method, $\mathrm{LSB} = -2$.
For a real negative eigenvalue $\lambda$, linear stability requires the step size to satisfy $h \le 2/|\lambda|$.
For an eigenvalue of small magnitude this is a very mild restriction.
However, for an eigenvalue of large magnitude it is a very severe restriction.
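The step size restriction can be checked numerically. The following is a minimal sketch in Python (not NDSolve code) that applies explicit Euler to the test equation $y' = \lambda y$, where each step multiplies the solution by $1 + h\lambda$. The eigenvalue `lam = -1000` and the function name `explicit_euler` are illustrative assumptions, not values from the text.

```python
def explicit_euler(lam, h, steps, y0=1.0):
    """Integrate y' = lam*y with explicit Euler.

    Each step multiplies y by the stability function R(h*lam) = 1 + h*lam.
    """
    y = y0
    for _ in range(steps):
        y = y + h * lam * y
    return y

lam = -1000.0               # illustrative eigenvalue (an assumption)
h_max = 2.0 / abs(lam)      # largest step with |1 + h*lam| <= 1

inside = explicit_euler(lam, 0.9 * h_max, 200)    # |R| = 0.8: the solution decays
outside = explicit_euler(lam, 1.1 * h_max, 200)   # |R| = 1.2: the solution blows up

print(abs(inside), abs(outside))
```

A step only 10% above the limit already produces growth of many orders of magnitude, which is why an explicit method is forced to keep the step size small even when the solution itself is smooth.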
Example
This example shows the effect of stiffness on the step size sequence when using an explicit Runge-Kutta method to solve a stiff system.
This system models a chemical reaction.
The system is solved by disabling the built-in stiffness detection.
The step size sequence starts to oscillate when the stability boundary is reached.
- A large number of step rejections often has a negative impact on performance.
Implicit Euler Method
The implicit or backward Euler method:
$$y_{n+1} = y_n + h f(t_{n+1}, y_{n+1})$$
applied to (2) gives $R(z) = 1/(1 - z)$.
The method is unconditionally stable for the entire left half-plane.
This means that to maintain stability there is no longer a restriction on the step size.
The drawback is that an implicit system of equations now has to be solved at each integration step.
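As a rough illustration, the sketch below applies implicit Euler to the scalar test equation, where the implicit equation can be solved in closed form. For a general system each step would require a linear or nonlinear solve. The eigenvalue, step size and sample points are assumptions chosen for the example.

```python
def implicit_euler(lam, h, steps, y0=1.0):
    """Integrate y' = lam*y with implicit Euler.

    Each step solves y_new = y + h*lam*y_new, that is y_new = y / (1 - h*lam).
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 - h * lam)
    return y

def stability_function(z):
    """R(z) = 1/(1 - z) for implicit Euler; z may be complex."""
    return 1.0 / (1.0 - z)

lam = -1000.0                            # illustrative eigenvalue (an assumption)
decayed = implicit_euler(lam, 1.0, 5)    # h = 1 is 500 times the explicit Euler limit 2/|lam|

# Sample points in the left half-plane: all are damped.
samples = [complex(-0.1, 50.0), complex(-5.0, 0.0), complex(-1e6, 1e3)]
all_damped = all(abs(stability_function(z)) < 1.0 for z in samples)
```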
Type Insensitivity
A type-insensitive solver recognizes and responds efficiently to stiffness at each step and so is insensitive to the (possibly changing) type of the problem.
One of the most established solvers of this class is LSODA [H83], [P83].
Later generations of LSODA such as CVODE no longer incorporate a stiffness detection device. This is because LSODA uses norm bounds to estimate the dominant eigenvalue and these bounds, as will be seen later, can be quite inaccurate.
The low order of A($\alpha$)-stable BDF methods means that LSODA and CVODE are not very suitable for solving systems with high accuracy or systems where the dominant eigenvalue has a large imaginary part. Alternative methods, such as those based on extrapolation of linearly implicit schemes, do not suffer from these issues.
Much of the work on stiffness detection was carried out in the 1980s and 1990s using standalone FORTRAN codes.
New linear algebra techniques and efficient software have since become available and these are readily accessible in Mathematica.
Stiffness can be a transient phenomenon, so detecting nonstiffness is equally important [S77], [B90].
"StiffnessTest" Method Option
There are several approaches that can be used to switch from a nonstiff to a stiff solver.
Direct Estimation
A convenient way of detecting stiffness is to directly estimate the dominant eigenvalue of the Jacobian $J$ of the problem (see [S77], [P83], [S83], [S84a], [S84c], [R87] and [HW96]).
Such an estimate is often available as a by-product of the numerical integration and so it is reasonably inexpensive.
If $v$ denotes an approximation to the eigenvector corresponding to the dominant eigenvalue of the Jacobian, with $\|v\|$ sufficiently small, then by the mean value theorem a good approximation to the leading eigenvalue is
$$\tilde{\lambda} = \frac{\| f(t, y + v) - f(t, y) \|}{\| v \|}$$
Richardson's extrapolation provides a sequence of refinements that yields a quantity of this form, as do certain explicit Runge-Kutta methods.
Cost is at most two function evaluations, but often at least one of these is available as a by-product of the numerical integration, so it is reasonably inexpensive.
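Let $\mathrm{LSB}$ denote the linear stability boundary, that is, the intersection of the linear stability region with the negative real axis.
The product $h \tilde{\lambda}$ of the step size and the eigenvalue estimate $\tilde{\lambda}$ gives an estimate that can be compared to the linear stability boundary of a method in order to detect stiffness:
$$|h \tilde{\lambda}| \le s\, |\mathrm{LSB}| \qquad (5)$$
where $s$ is a safety factor.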
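The estimate and the comparison against the stability boundary can be sketched as follows. This is an illustrative Python sketch, not the NDSolve implementation: the function names, the linear test problem, the perturbation vector, the boundary value -2 (explicit Euler) and the safety factor 0.8 are all assumptions made for the example.

```python
import math

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

def estimate_dominant_eigenvalue(f, t, y, v):
    """Difference quotient ||f(t, y+v) - f(t, y)|| / ||v|| for a small perturbation v."""
    f0 = f(t, y)
    f1 = f(t, [yi + vi for yi, vi in zip(y, v)])
    return norm2([a - b for a, b in zip(f1, f0)]) / norm2(v)

def passes_stiffness_test(h, lam_est, lsb, safety):
    """True when |h*lam_est| <= safety*|lsb|, so the step is not flagged as stiff."""
    return abs(h * lam_est) <= safety * abs(lsb)

def f(t, y):
    # Hypothetical linear problem with eigenvalues -1 and -1000.
    return [-y[0], -1000.0 * y[1]]

# v points along the dominant eigenvector, so the estimate is close to 1000.
lam_est = estimate_dominant_eigenvalue(f, 0.0, [1.0, 1.0], [0.0, 1e-6])

small_step_ok = passes_stiffness_test(1e-3, lam_est, -2.0, 0.8)   # about 1.0 <= 1.6
large_step_ok = passes_stiffness_test(1e-2, lam_est, -2.0, 0.8)   # about 10.0 > 1.6
```

The estimate costs two evaluations of `f`, one of which an integrator would normally already have available.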
Description
Three of the NDSolve integration methods have the option "StiffnessTest", which can be used to identify whether the method applied with the specified AccuracyGoal and PrecisionGoal tolerances to a given problem is stiff.
The method option "StiffnessTest" itself accepts a number of options that implement a weak form of (5) where the test is allowed to fail a specified number of times.
The reason for this is that some problems can be only mildly stiff in a certain region and an explicit integration method may still be efficient.
"NonstiffTest" Method Option
The method that switches between solvers has the option "NonstiffTest", which is used to switch back from a stiff method to a nonstiff method.
The settings allowed for the option "NonstiffTest" are described in the following sections.
Switching to a Nonstiff Solver
An approach that is independent of the stiff method is used.
Given the Jacobian $J$ (or an approximation), compute one of:
- Norm bound: $\|J\|$
- Spectral radius: $\rho(J) = \max_i |\lambda_i|$
- Dominant eigenvalue: $\lambda_i$ such that $|\lambda_i| = \rho(J)$
Many linear algebra techniques focus on solving a single problem to high accuracy.
For stiffness detection, a succession of problems with solutions to one or two digits is adequate.
For a numerical discretization
consider a sequence

of matrices in some sub-interval(s)
The spectra of the succession of matrices often change very slowly from step to step.
The goal is to find a way of estimating (bounds on) dominant eigenvalues of

succession of matrices

that:
- Costs less than the work carried out in the linear algebra at each step in the stiff solver.
- Takes account of the step-to-step nature of the solver.
NormBound
A simple and efficient technique for obtaining a bound on the dominant eigenvalue is to use the norm of the Jacobian
$$\rho(J) \le \|J\|_p$$
where typically $p = 1$ or $p = \infty$.
The method has complexity $O(n^2)$, which is less than the work carried out in the stiff solver.
This is the approach used by LSODA.
- Norm bounds for dense matrices tend to overestimate, and the bounds become worse as the dimension increases.
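A small sketch shows how loose a norm bound can be. The 2 x 2 matrix below is a made-up example with the same structure as a second-order oscillator Jacobian; it is not the matrix used in the text, and the helper names are assumptions.

```python
import cmath

def norm_1(J):
    """Maximum absolute column sum."""
    return max(sum(abs(J[i][j]) for i in range(len(J))) for j in range(len(J[0])))

def norm_inf(J):
    """Maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in J)

def spectral_radius_2x2(J):
    """Largest eigenvalue modulus of a 2 x 2 matrix via the characteristic polynomial."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))

J = [[0.0, 1.0], [-1000.0, -1.0]]   # illustrative matrix (an assumption)

rho = spectral_radius_2x2(J)         # sqrt(1000), about 31.6
bound_1 = norm_1(J)                  # 1000.0
bound_inf = norm_inf(J)              # 1001.0
```

Both norms are valid upper bounds, but here they exceed the spectral radius by a factor of about 30.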
Example
The following Jacobian matrix arises in the numerical solution of the van der Pol system using a stiff solver.
Bounds based on norms overestimate the spectral radius by more than an order of magnitude.
Direct Eigenvalue Computation
For small problems (

) it can be efficient just to compute the dominant eigenvalue directly.
- Hermitian matrices use the LAPACK function xsyevr
- General matrices use the LAPACK function xgeev
The setting

of the option

computes the dominant eigenvalue of

using the same LAPACK routines as
Eigenvalues.
For larger problems the cost of direct eigenvalue computation is $O(n^3)$, which becomes prohibitive when compared to the cost of the linear algebra work in a stiff solver.
A number of iterative schemes have been implemented for this purpose. These effectively work by approximating the dominant eigenspace in a smaller subspace and using dense eigenvalue methods for the smaller problem.
The Power Method
Shampine has proposed the use of the power method for estimating the dominant eigenvalue of the Jacobian [S91].
The power method is perhaps not a very well-respected method, but has received a resurgence of interest due to its use in Google's page ranking.
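The power method can be used when:
- $A \in \mathbb{R}^{n \times n}$ has $n$ linearly independent eigenvectors (diagonalizable).
- The eigenvalues can be ordered in magnitude as $|\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_n|$.
- $\lambda_1$ is the dominant eigenvalue of $A$.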
Description
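Given a starting vector $v_0$, compute
$$v_k = A\, v_{k-1}$$
The Rayleigh quotient is used to compute an approximation to the dominant eigenvalue:
$$\lambda_1^{(k)} = \frac{v_{k-1}^{*} A\, v_{k-1}}{v_{k-1}^{*} v_{k-1}} = \frac{v_{k-1}^{*} v_k}{v_{k-1}^{*} v_{k-1}}$$
In practice, the approximate eigenvector is scaled at each step:
$$\hat{v}_k = \frac{v_k}{\|v_k\|}$$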
Properties
The power method converges linearly with rate
$$\left| \lambda_2 / \lambda_1 \right|$$
In particular, the method does not converge when applied to a matrix with a dominant complex conjugate pair of eigenvalues.
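The iteration described above can be sketched in a few lines. This is a plain Python illustration under stated assumptions: the test matrix, starting vector, tolerance and function names are invented for the example, and the stopping rule is a simple relative change test on successive Rayleigh quotients.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def power_method(A, v0, tol=1e-10, max_iter=1000):
    """Return (eigenvalue estimate, unit vector, iterations used)."""
    norm = math.sqrt(dot(v0, v0))
    v = [x / norm for x in v0]
    lam = None
    for k in range(1, max_iter + 1):
        w = matvec(A, v)
        lam_new = dot(v, w)              # Rayleigh quotient, since ||v|| = 1
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]        # scale the iterate at each step
        if lam is not None and abs(lam_new - lam) <= tol * abs(lam_new):
            return lam_new, v, k
        lam = lam_new
    return lam, v, max_iter

# Illustrative matrix with eigenvalues -100 and -1, so the rate is 0.01.
A = [[-100.0, 1.0], [0.0, -1.0]]
lam, vec, iterations = power_method(A, [1.0, 1.0])
```

With well-separated eigenvalues only a handful of iterations are needed. If the two largest eigenvalues had nearly equal modulus, the same loop would need many more iterations, and with a real starting vector it cannot settle on either member of a dominant complex conjugate pair.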
Generalizations
The power method can be adapted to overcome the issue of equimodular eigenvalues (e.g. NAPACK).
However the modification does not generally address the issue of the slow rate of convergence for clustered eigenvalues.
There are two main approaches to generalizing the power method:
- Subspace iteration for small to medium dimensions
- Arnoldi iteration for large dimensions
Although the methods work quite differently, there are a number of core components that can be shared and optimized.
Subspace and Krylov iteration cost $O(n^2 m)$ operations, where $m$ is the dimension of the subspace.
The small matrix represents the dominant eigenspace and approximation uses dense eigenvalue routines.
Subspace Iteration
Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on $m$ vectors at each step.
Start with an orthonormal set of vectors $Q^{(0)} \in \mathbb{R}^{n \times m}$, where usually $m \ll n$:
$$Q^{(0)} = [q_1, \ldots, q_m]$$
Form the product with the matrix $A$:
$$Z^{(k)} = A\, Q^{(k-1)}$$
In order to prevent all vectors from converging to multiples of the same dominant eigenvector $x_1$ of $A$, they are orthonormalized:
$$Q^{(k)} R^{(k)} = Z^{(k)}$$
The orthonormalization step is expensive compared to the matrix product.
Rayleigh-Ritz Projection
Input: matrix $A$ and an orthonormal set of vectors $Q$
- Compute the Rayleigh quotient $B = Q^{*} A\, Q$
Convergence
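SRRIT converges linearly with rate
$$\left| \lambda_{m+1} / \lambda_i \right|, \quad i = 1, \ldots, m$$
In particular, the rate for the dominant eigenvalue is
$$\left| \lambda_{m+1} / \lambda_1 \right|$$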
Therefore it can be beneficial to take, for example,

or more even if we are only interested in the dominant eigenvalue.
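The multiply, orthonormalize and project steps can be sketched with a basis of two vectors. This is a simplified Python illustration, not SRRIT: it uses modified Gram-Schmidt without reorthonormalization, a fixed iteration count instead of a residual test, and a closed-form 2 x 2 eigenvalue formula in place of a Schur decomposition. The test matrix and all names are assumptions.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def orthonormalize(Z):
    """Modified Gram-Schmidt on the columns of Z."""
    basis = []
    for c in transpose(Z):
        for q in basis:
            proj = sum(qi * ci for qi, ci in zip(q, c))
            c = [ci - proj * qi for qi, ci in zip(q, c)]
        norm = math.sqrt(sum(ci * ci for ci in c))
        basis.append([ci / norm for ci in c])
    return transpose(basis)

def eig_2x2(B):
    tr = B[0][0] + B[1][1]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def subspace_iteration(A, Q0, iterations):
    Q = orthonormalize(Q0)
    for _ in range(iterations):
        Q = orthonormalize(matmul(A, Q))           # Z = A Q, then orthonormalize
    B = matmul(transpose(Q), matmul(A, Q))         # Rayleigh quotient B = Q^T A Q
    return eig_2x2(B), Q

# Illustrative matrix with eigenvalues 10i, -10i and -1.
A = [[0.0, -10.0, 0.0], [10.0, 0.0, 0.0], [0.0, 0.0, -1.0]]
Q0 = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
ritz, Q = subspace_iteration(A, Q0, 12)
```

With two basis vectors the dominant complex conjugate pair is recovered, a case in which the plain power method does not converge. The convergence rate here is $|\lambda_3/\lambda_1| = 0.1$ per iteration.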
Error Control
A relative error test on successive approximations of the dominant eigenvalue is
This is not sufficient since it can be satisfied when convergence is slow.
If

or

then the


column of

is not uniquely determined.
The residual test used in SRRIT is:
where

,

is the


column of

, and

is the


column of

.
This is advantageous since it works for equimodular eigenvalues.
The first column position of the upper triangular matrix

is tested because of the use of an ordered Schur decomposition.
Implementation
There are several implementations of subspace iteration.
- Subspace iteration with Chebyshev acceleration [S84b], [DS93]
- Schur Rayleigh-Ritz iteration [BS97]
"An attractive feature of SRRIT is that it displays monotonic consistency, that is, as the convergence tolerance decreases so does the size of the computed residuals" [LS96].
SRRIT makes use of an ordered Schur decomposition where eigenvalues of the largest modulus appear in the upper-left entries.
Modified Gram-Schmidt with reorthonormalization is used to form

, which is faster than Householder transformations.
The approximate dominant subspace

at integration time

is used to start the iteration at the next integration step

:
KrylovIteration
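Given an $n \times m$ matrix $V$ whose columns $v_i$ comprise an orthogonal basis of a given subspace $\mathcal{V}$:
$$V^{T} V = I, \quad \operatorname{span}\{v_1, v_2, \ldots, v_m\} = \mathcal{V}$$
The Rayleigh-Ritz procedure consists of computing $H = V^{T} A\, V$ and solving the associated eigenproblem $H y_i = \theta_i y_i$.
The approximate eigenpairs of the original problem $\tilde{\lambda}_i$, $\tilde{x}_i$ satisfy $\tilde{\lambda}_i = \theta_i$ and $\tilde{x}_i = V y_i$, which are called Ritz values and Ritz vectors.
The process works best when the subspace $\mathcal{V}$ approximates an invariant subspace of $A$.
This process is effective when $\mathcal{V}$ is equal to the Krylov subspace associated with a matrix $A$ and a given initial vector $v$ as
$$\mathcal{K}_m(A, v) = \operatorname{span}\{v, A v, A^2 v, \ldots, A^{m-1} v\}$$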
Description
The method of Arnoldi is a Krylov-based projection algorithm that computes an orthogonal basis of the Krylov subspace and produces a projected $m \times m$ matrix $H$ with $m \ll n$.
Input: matrix $A$, the number of steps $m$, an initial vector $v_1$ of norm 1.
Output: $(V_m, H_m, f, \beta)$ with $\beta = \|f\|_2$
In the case of Arnoldi, $H$ has an unreduced upper Hessenberg form (upper triangular with an additional nonzero subdiagonal).
Orthogonalization is usually carried out by means of a Gram-Schmidt procedure.
The quantities computed by the algorithm satisfy
$$A\, V_m = V_m H_m + f\, e_m^{*}$$
The residual $f$ gives an indication of proximity to an invariant subspace and the associated norm $\beta$ indicates the accuracy of the computed Ritz pairs $(\theta_i, V_m y_i)$:
$$\| A V_m y_i - \theta_i V_m y_i \|_2 = \beta\, |e_m^{*} y_i|$$
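A minimal Arnoldi process with modified Gram-Schmidt orthogonalization can be sketched as below. This is an illustrative Python version without restarting, and the matrix, starting vector and breakdown tolerance are assumptions. It stores the basis vectors as a list `V` and the Hessenberg matrix `H` with one extra row, so that the residual is `f = H[m][m-1] * V[m]`.

```python
import math

def arnoldi(A, v1, m):
    """Run m Arnoldi steps from the unit vector v1.

    Returns the basis vectors V (up to m+1 of them) and the (m+1) x m
    upper Hessenberg matrix H with A v_j = sum_i H[i][j] v_i.
    """
    V = [list(v1)]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = [sum(a * x for a, x in zip(row, V[j])) for row in A]
        for i in range(j + 1):                    # modified Gram-Schmidt
            H[i][j] = sum(wi * vi for wi, vi in zip(w, V[i]))
            w = [wi - H[i][j] * vi for wi, vi in zip(w, V[i])]
        H[j + 1][j] = math.sqrt(sum(wi * wi for wi in w))
        if H[j + 1][j] < 1e-14:                   # invariant subspace reached
            break
        V.append([wi / H[j + 1][j] for wi in w])
    return V, H

# Illustrative nonsymmetric matrix and starting vector of norm 1.
A = [[2.0, -1.0, 0.0, 0.0],
     [1.0, 3.0, -1.0, 0.0],
     [0.0, 1.0, 4.0, -1.0],
     [0.0, 0.0, 1.0, 5.0]]
V, H = arnoldi(A, [0.5, 0.5, 0.5, 0.5], 3)
beta = H[3][2]    # norm of the residual f after 3 steps
```

The eigenvalues of the leading 3 x 3 block of `H` would be the Ritz values. A small `beta` signals that the Krylov subspace is close to invariant.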
Restarting
The Ritz pairs converge quickly if the initial vector $v_1$ is rich in the direction of the desired eigenvalues.
When this is not the case then a restarting strategy is required in order to avoid excessive growth in both work and memory.
There are several strategies for restarting, in particular:
- Explicit restart—a new starting vector is a linear combination of a subset of the Ritz vectors.
- Implicit restart—a new starting vector is formed from the Arnoldi process combined with an implicitly shifted QR algorithm.
Explicit restart is relatively simple to implement, but implicit restart is more efficient since it retains the relevant eigeninformation of the larger problem. However implicit restart is difficult to implement in a numerically stable way.
An alternative that is much simpler to implement, but achieves the same effect as implicit restart, is a Krylov-Schur method [S01].
Implementation
A number of software implementations are available, in particular:
Automatic Strategy
The

setting uses an amalgamation of the methods as follows.
- For
direct eigenvalue computation is used. Either
or
is used, depending on which method is active.
- For
subspace iteration is used with a default basis size of
. If the method succeeds then the resulting basis is used to start the method at the next integration step.
- If subspace iteration fails to converge after
iterations then the dominant vector is used to start the Krylov method with a default basis size of
. Subsequent integration steps use the Krylov method, starting with the resulting vector from the previous step.
- If Krylov iteration fails to converge after
iterations, then norm bounds are used for the current step. The next integration step will continue to try to use Krylov iteration.
- Since they are so inexpensive, norm bounds are always computed when subspace or Krylov iteration is used and the smaller of the absolute values is used.
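The fallback chain described in this list can be sketched as a small dispatch function. This is a hypothetical Python outline, not the NDSolve code: the estimators are passed in as callables that return `None` on convergence failure, the size cutoff `direct_limit` is a required argument because the actual thresholds are not reproduced here, and the reuse of basis vectors between steps is left out.

```python
def automatic_estimate(n, direct_limit, direct, subspace, krylov, norm_bound, state):
    """Return a magnitude estimate for the dominant eigenvalue.

    state is a dict that remembers, across integration steps, whether
    Krylov iteration has taken over from subspace iteration.
    """
    if n <= direct_limit:
        return abs(direct())
    bound = norm_bound()                 # inexpensive, so always computed
    if not state.get("use_krylov"):
        value = subspace()
        if value is None:                # subspace iteration failed: switch for good
            state["use_krylov"] = True
            value = krylov()
    else:
        value = krylov()
    if value is None:                    # Krylov failed: norm bound for this step only
        return bound
    return min(abs(value), bound)

state = {}
# Small system: direct computation (the cutoff 10 is a made-up value).
small = automatic_estimate(4, 10, lambda: -7.0, lambda: None, lambda: None, lambda: 99.0, state)
# Larger system: subspace iteration fails, Krylov iteration succeeds.
first = automatic_estimate(100, 10, lambda: None, lambda: None, lambda: -50.0, lambda: 80.0, state)
# A later step where Krylov iteration fails: fall back to the norm bound.
later = automatic_estimate(100, 10, lambda: None, lambda: None, lambda: None, lambda: 80.0, state)
```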
Step Rejections
Caching of the time of evaluation ensures that the dominant eigenvalue estimate is not recomputed for rejected steps.
Stiffness detection is also performed for rejected steps since:
- Step rejections often occur for nonstiff solvers when working near the stability boundary.
- Step rejections often occur for stiff solvers when resolving fast transients.
Iterative Method Options
The iterative methods of

have options that can be modified:
The default tolerance aims for just one correct digit, but often obtains substantially more accurate values—especially after a few successful iterations at successive steps.
The default values limiting the number of iterations are:
- For subspace iteration
.
- For Krylov iteration
.
If these values are set too large then a convergence failure becomes too costly.
In difficult problems, it is better to share the work of convergence across steps. Since the methods effectively refine the basis vectors from the previous step, there is a reasonable chance of convergence in subsequent steps.
Latency and Switching
It is important to incorporate some form of latency in order to avoid a cycle where the switching method continually tries to switch between stiff and nonstiff methods.
The options "MaxRepetitions" and "SafetyFactor" of "StiffnessTest" and "NonstiffTest" are used for this purpose.
The default settings allow switching to be quite reactive, which is appropriate for one-step integration methods.
- "StiffnessTest" is carried out at the end of a step with a nonstiff method. When either value of the option "MaxRepetitions" is reached, a step rejection occurs and the step is recomputed with a stiff method.
- "NonstiffTest" is preemptive. It is performed before a step is taken with a stiff solver, using the Jacobian matrix from the previous step.
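One way to picture the latency mechanism is a counter that tracks successive and total test failures and signals a switch when either limit is reached. The sketch below is a hypothetical Python illustration of that bookkeeping; the class and function names are assumptions, and the limits `(3, 5)` and `(2, infinity)` are the "MaxRepetitions" defaults listed in the option summary.

```python
class RepetitionCounter:
    """Track successive and total failures of a test."""

    def __init__(self, max_successive, max_total):
        self.max_successive = max_successive
        self.max_total = max_total
        self.successive = 0
        self.total = 0

    def record(self, failed):
        """Record one test outcome; return True when a switch should occur."""
        if failed:
            self.successive += 1
            self.total += 1
        else:
            self.successive = 0          # a pass resets only the successive count
        return (self.successive >= self.max_successive
                or self.total >= self.max_total)

def first_switch(counter, outcomes):
    """Index of the first step at which a switch is signaled, or None."""
    for step, failed in enumerate(outcomes):
        if counter.record(failed):
            return step
    return None

by_successive = first_switch(RepetitionCounter(3, 5), [True, False, True, True, True])
by_total = first_switch(RepetitionCounter(3, 5), [True, True, False, True, True, False, True])
never = first_switch(RepetitionCounter(2, float("inf")), [True, False, True, False, True, False])
```

Isolated failures are tolerated, which matches the idea that a mildly stiff region does not by itself justify a change of method.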
Examples
Van der Pol
Select an example system.
StiffnessTest
The system is integrated successfully with the given method and the default option settings for "StiffnessTest".
A longer integration is aborted and a message is issued when the stiffness test condition is not satisfied.
Using a unit safety factor and specifying that only one stiffness failure is allowed effectively gives a strict test. The specification uses the nested method option syntax.
NonstiffTest
For such a small system, direct eigenvalue computation is used.
The example serves as a good test that the overall stiffness switching framework is behaving as expected.
Set up a function to monitor the switch between stiff and nonstiff methods and the step size taken. Data for the stiff and nonstiff solvers is put in separate lists by using a different tag for Sow.
Solve the system and collect the data for the method switching.
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red).
Compute the number of nonstiff and stiff steps taken (including rejected steps).
CUSP
The cusp catastrophe model for the nerve impulse mechanism [Z72]:
Combining with the van der Pol oscillator gives rise to the CUSP system [HW96]:
and

and

.
Discretization of the diffusion terms using the method of lines is used to obtain a system of ODEs of dimension

.
Unlike the van der Pol system, because of the size of the problem, iterative methods are used for eigenvalue estimation.
Step Size and Order Selection
Select the problem to solve.
Set up a function to monitor the type of method used and step size. Additionally, the order of the method is included as a tooltip.
Collect the data for the order of the method as the integration proceeds.
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red). A tooltip shows the order of the method at each step.
Compute the total number of nonstiff and stiff steps taken (including rejected steps).
Jacobian Example
Define a function to collect the first few Jacobian matrices.
A switch to a stiff method occurs near

and the first test for nonstiffness occurs at the next step

.
Graphical illustration of the Jacobian

.
Define a function to compute and display the first few eigenvalues of

,

, ... and the norm bounds.
Norm bounds are quite sharp in this example.
Korteweg-deVries
The Korteweg-deVries partial differential equation is a mathematical model of waves on shallow water surfaces:
We consider boundary conditions
and solve over the interval

.
Discretization using the method of lines is used to form a system of 192 ODEs.
Step Sizes
Select the problem to solve.
The backward differentiation formula methods used in LSODA run into difficulties solving this problem.
A plot shows that the step sizes rapidly decrease.
In contrast, the switching method immediately switches to using the linearly implicit Euler method, which needs very few integration steps.
The extrapolation methods never switch back to a nonstiff solver once the stiff solver is chosen at the beginning of the integration.
Therefore this is a form of worst case example for the nonstiff detection.
Despite this, the cost of using subspace iteration is only a few percent of the total integration time.
Compute the time taken with switching to a nonstiff method disabled.
Jacobian Example
Collect data for the first few Jacobian matrices using the previously defined monitor function.
Graphical illustration of the initial Jacobian

.
Compute and display the first few eigenvalues of

,

, ... and the norm bounds.
Norm bounds overestimate slightly, but more importantly they give no indication of the relative size of real and imaginary parts.
Option Summary
StiffnessTest
| option name | default value | description |
| "MaxRepetitions" | {3,5} | specify the maximum number of successive and total times that the stiffness test (5) is allowed to fail |
| "SafetyFactor" |  | specify the safety factor to use in the right-hand side of the stiffness test (5) |
Options of the method option "StiffnessTest".
NonstiffTest
| option name | default value | description |
| "MaxRepetitions" | {2,∞} | specify the maximum number of successive and total times that the stiffness test (5) is allowed to fail |
| "SafetyFactor" |  | specify the safety factor to use in the right-hand side of the stiffness test (5) |
Options of the method option "NonstiffTest".