Stiffness Detection
Overview
Many differential equations exhibit some form of stiffness which restricts the step-size and hence effectiveness of explicit solution methods.
A number of implicit methods have been developed over the years to circumvent this problem.
For the same step size, implicit methods can be substantially less efficient than explicit methods, due to the overhead associated with the intrinsic linear algebra.
This cost can offset by the fact that, in certain regions, implicit methods can take substantially larger step sizes.
Several attempts have been made to provide user-friendly codes that automatically attempt to detect stiffness at runtime and switch between appropriate methods as necessary.
A number of strategies that have been proposed to automatically equip a code with a stiffness detection device are outlined here.
Particular attention is given to the problem of estimation of the dominant eigenvalue of a matrix in order to describe how stiffness detection is implemented in
NDSolve.
Numerical examples illustrate the effectiveness of the strategy.
Initialization
Load some packages with predefined examples and utility functions. |
Introduction
Consider the numerical solution of initial value problems:
Stiffness is a combination of problem, solution method, initial condition and local error tolerances.
Stiffness limits the effectiveness of explicit solution methods due to restrictions on the size of steps that can be taken.
Stiffness arises in many practical systems as well as in the numerical solution of partial differential equations by the method of lines.
Example
The van der Pol oscillator is a non-conservative oscillator with nonlinear damping and is an example of a stiff system of ordinary differential equations:
with
= 3/1000;
Consider initial conditions:
and solve over the interval
t
[0, 10].
The method
"StiffnessSwitching" uses a pair of extrapolation methods by default:
- Explicit modified midpoint (Gragg smoothing), double-harmonic sequence 2, 4, 6,...
- Linearly implicit Euler, sub-harmonic sequence 2, 3, 4,...
Solution
This loads the problem from a package. |
Solve the system numerically using a nonstiff method. |
Solve the system using a method that switches when stiffness occurs. |
Here is a plot of the two solution components. The sharp peaks (in blue) extend out to about 450 in magnitude and have been cropped.
| Out[7]= |  |
|
Stiffness can often occur in regions that follow rapid transients.
This plots the step sizes taken against time.
| Out[8]= |  |
|
The problem is that when the solution is changing rapidly, there is little point using a stiff solver, since local accuracy is the dominant issue.
For efficiency, it would be useful if the method could automatically detect regions where local accuracy (and not stability) is important.
Linear Stability
Linear stability theory arises from the study of Dahlquist's scalar linear test equation:
as a simplified model for studying the initial value problem (
1).
Stability is characterized by analyzing a method applied to (
2) to obtain
where
z = h 
and
R(z) is the (rational) stability function.
The boundary of absolute stability is obtained by considering the region:
Explicit Euler Method
The explicit or forward Euler method:
The shaded region represents instability, where
|R (z)|>1.
| Out[9]= |  |
The Linear Stability Boundary is often taken as the intersection with the negative real axis.
For the explicit Euler method LSB
= -2.
For an eigenvalue of
= -1, linear stability requirements mean that the step-size needs to satisfy
h < 2, which is a very mild restriction.
However, for an eigenvalue of
= -106, linear stability requirements mean that the step size needs to satisfy
h < 2
10-6, which is a very severe restriction.
Example
This example shows the effect of stiffness on the step-size sequence when using an explicit Runge-Kutta method to solve a stiff system.
This system models a chemical reaction. |
The system is solved by disabling the built-in stiffness detection. |
The step-size sequence starts to oscillate when the stability boundary is reached.
| Out[12]= |  |
|
- A large number of step rejections often has a negative impact on performance.
- The large number of steps taken adversely affects the accuracy of the computed solution.
The built-in detection does an excellent job of locating when stiffness occurs. |
Implicit Euler Method
The implicit or backward Euler method:
The method is unconditionally stable for the entire left half-plane.
| Out[14]= |  |
This means that to maintain stability there is no longer a restriction on the step size.
The drawback is that an implicit system of equations now has to be solved at each integration step.
Type Insensitivity
A type-insensitive solver recognizes and responds efficiently to stiffness at each step and so is insensitive to the (possibly changing) type of the problem.
One of the most established solvers of this class is LSODA [
H83], [
P83].
Later generations of LSODA such as CVODE no longer incorporate a stiffness detection device. The reason is because LSODA use norm bounds to estimate the dominant eigenvalue and these bounds, as will be seen later, can be quite inaccurate.
The low order of A(

)-stable BDF methods means that LSODA and CVODE are not very suitable for solving systems with high accuracy or systems where the dominant eigenvalue has a large imaginary part. Alternative methods, such as those based on extrapolation of linearly implicit schemes, do not suffer from these issues.
Much of the work on stiffness detection was carried out in the 1980s and 1990s using standalone FORTRAN codes.
New linear algebra techniques and efficient software have since become available and these are readily accessible in
Mathematica.
Stiffness can be a transient phenomenon, so detecting nonstiffness is equally important [
S77], [
B90].
"StiffnessTest" Method Option
There are several approaches that can be used to switch from a nonstiff to a stiff solver.
Direct Estimation
A convenient way of detecting stiffness is to directly estimate the dominant eigenvalue of the Jacobian
J of the problem (see [
S77], [
P83], [
S83], [
S84a], [
S84c], [
R87] and [
HW96]).
Such an estimate is often available as a by-product of the numerical integration and so it is reasonably inexpensive.
If
v denotes an approximation to the eigenvector corresponding to dominant eigenvalue of the Jacobian, with
v
sufficiently small, then by the mean value theorem a good approximation to the leading eigenvalue is:
Richardson's extrapolation provides a sequence of refinements that yields a quantity of this form, as do certain explicit Runge-Kutta methods.
Cost is at most two function evaluations, but often at least one of these is available as a by-product of the numerical integration, so it is reasonably inexpensive.
Let
LSB denote the linear stability boundary—the intersection of the linear stability region with the negative real axis.
The product

gives an estimate that can be compared to the linear stability boundary of a method in order to detect stiffness:
where
s is a safety factor.
Description
The methods
"DoubleStep",
"Extrapolation", and
"ExplicitRungeKutta" have the option
"StiffnessTest", which can be used to identify whether the method applied with the specified
AccuracyGoal and
PrecisionGoal tolerances to a given problem is stiff.
The method option
"StiffnessTest" itself accepts a number of options that implement a weak form of (
5) where the test is allowed to fail a specified number of times.
The reason for this is that some problems can be only mildly stiff in a certain region and an explicit integration method may still be efficient.
"NonstiffTest" Method Option
The
"StiffnessSwitching" method has the option
"NonstiffTest", which is used to switch back from a stiff method to a nonstiff method.
The following settings are allowed for the option
"NonstiffTest"
- None or False (perform no test).
Switching to a Nonstiff Solver
An approach that is independent of the stiff method is used.
Given the Jacobian
J (or an approximation) compute one of:
Norm Bound:
J 
Spectral Radius:
(J) = max |
i|
Many linear algebra techniques focus on solving a single problem to high accuracy.
For stiffness detection, a succession of problems with solutions to one or two digits are adequate.
For a numerical discretization
0 =
t0 < t1<
< tn
= T
consider a sequence
k of matrices in some sub-interval(s)
Jti
,
Jti+1
, ...
Jti+k-1
The spectra of the succession of matrices often changes very slowly from step to step.
The goal is to find a way of estimating (bounds on) dominant eigenvalues of
a succession of matrices Jti that:
- Costs less than the work carried out in the linear algebra at each step in the stiff solver.
- Takes account of the step-to-step nature of the solver.
NormBound
A simple and efficient technique of obtaining a bound on the dominant eigenvalue is to use the norm of the Jacobian
J
p where typically
p = 1 or
p = 
.
The method has complexity
O (n2), which is less than the work carried out in the stiff solver.
This is the approach used by LSODA.
- Norm bounds for dense matrices overestimate and the bounds become worse as the dimension increases.
- Norm bounds can be tight for sparse or banded matrices of quite large dimension.
The setting
"NormBound" of the option
"NonstiffTest" computes
J
1 and
J 

and returns the smaller of the two values.
Example
The following Jacobian matrix arises in the numerical solution of the van der Pol system using a stiff solver. |
Bounds based on norms overestimate the spectral radius by more than an order of magnitude.
| Out[19]= |  |
| Out[17]= |  |
|
Direct Eigenvalue Computation
For small problems (
n ≤ 32) it can be efficient just to compute the dominant eigenvalue directly.
- Hermitian matrices use the LAPACK function xgeev
- General matrices use the LAPACK function xsyevr
The setting
"Direct" of the option
"NonstiffTest" computes the dominant eigenvalue of
J using the same LAPACK routines as
Eigenvalues.
For larger problems the cost of direct eigenvalue computation is
O (n3) which becomes prohibitive when compared to the cost of the linear algebra work in a stiff solver.
A number of iterative schemes have been implemented for this purpose. These effectively work by approximating the dominant eigenspace in a smaller subspace and using dense eigenvalue methods for the smaller problem.
The Power Method
Shampine has proposed the use of the power method for estimating the dominant eigenvalue of the Jacobian [
S91].
The power method is perhaps not a very well-respected method, but has received a resurgence of interest due to its use in Google's page ranking.
The power method can be used when
- A

n
n has n linearly independent eigenvectors (diagonalizable)
- The eigenvalues can be ordered in magnitude as
1
>
2
≥
≥ 
n
-
1 is the dominant eigenvalue of A.
Description
Given a starting vector
v0
n compute
The Rayleigh quotient is used to compute an approximation to the dominant eigenvalue:
In practice, the approximate eigenvector is scaled at each step:
Properties
The power method converges linearly with rate:
In particular, the method does not converge when applied to a matrix with a dominant complex conjugate pair of eigenvalues.
Generalizations
The power method can be adapted to overcome the issue of equimodular eigenvalues (e.g. NAPACK)
However the modification does not generally address the issue of the slow rate of convergence for clustered eigenvalues.
There are two main approaches to generalizing the power method:
- Subspace iteration for small to medium dimensions
- Arnoldi iteration for large dimensions
Although the methods work quite differently, there are a number of core components that can be shared and optimized.
Subspace and Krylov iteration cost
O (n2 m) operations.
They project an
n
n matrix to an
m
m matrix, where
m << n.
The small matrix represents the dominant eigenspace and approximation uses dense eigenvalue routines.
Subspace Iteration
Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on
m vectors at each step.
Start with an
orthonormal set of vectors
V (0) =
n
m, where usually
m << n:
Form the product with the matrix
A:
In order to prevent all vectors from converging to multiples of the same dominant eigenvector
v1 of
A, they are orthonormalized:
The orthonormalization step is expensive compared to the matrix product.
Rayleigh-Ritz Projection
Input: matrix
A and an orthonormal set of vectors
V
- Compute the Rayleigh quotient S = V*A V
- Compute the Schur decomposition U*S U = T
The matrix
S has small dimension
m
m.
Note that the Schur decomposition can be computed in real arithmetic when
S
m
m using a quasi upper-triangular matrix
T.
Convergence
Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on
m vectors at each step.
SRRIT converges linearly with rate:
In particular the rate for the dominant eigenvalue is:
Therefore it can be beneficial to take e.g.
m = 3 or more even if we are only interested in the dominant eigenvalue.
Error Control
A relative error test on successive approximations, dominant eigenvalue is:
This is not sufficient since it can be satisfied when convergence is slow.
If
or

i
=
i+1
then the
ith column of
is not uniquely determined.
The residual test used in SRRIT is:
where

,

is the
ith column of
and

is the
ith column of
T (k).
This is advantageous since it works for equimodular eigenvalues.
The first column position of the upper triangular matrix
T (k) is tested because of the use of an ordered Schur decomposition.
Implementation
There are several implementations of subspace iteration.
- Subspace iteration with Chebyshev acceleration [S84b], [DS93]
The implementation for use in
"NonstiffTest" is based on:
- Schur Rayleigh-Ritz iteration [BS97]
"An attractive feature of SRRIT is that it displays monotonic consistency, that is, as the convergence tolerance decreases so does the size of the computed residuals" [
LS96].
SRRIT makes use of an ordered Schur decomposition where eigenvalues of largest modulus appear in the upper-left entries.
Modified Gram-Schmidt with reorthonormalization is used to form
Q (k), which is faster than Householder transformations.
The approximate dominant subspace

at integration time
ti is used to start the iteration at the next integration step
ti+1:
KrylovIteration
Given an
n
m matrix
V whose columns
vi comprise an orthogonal basis of a given subspace

:
The Rayleigh-Ritz procedure consists of computing
H = VTA V and solving the associated eigenproblem
H yi=
iyi.
The approximate eigenpairs of the original problem

,

satisfy

and
, which are called Ritz values and Ritz vectors.
The process works best when the subspace

approximates an invariant subspace of
A.
This process is effective when

is equal to the Krylov subspace associated with a matrix
A and a given initial vector
x as:
Description
The method of Arnoldi is a Krylov-based projection algorithm that computes an orthogonal basis of the Krylov subspace and produces a projected
m
m matrix
H with
m << n.
Input: matrix
A, the number of steps
m, an initial vector
v1 of norm
1
In the case of Arnoldi,
H has an unreduced upper Hessenberg form (upper triangular with an additional nonzero subdiagonal).
Orthogonalization is usually carried out by means of a Gram-Schmidt procedure.
The quantities computed by the algorithm satisfy:
The residual
f gives an indication of proximity to an invariant subspace and the associated norm

indicates the accuracy of the computed Ritz pairs:
Restarting
The Ritz pairs converge quickly if the initial vector
x is rich in the direction of the desired eigenvalues.
When this is not the case then a restarting strategy is required in order to avoid excessive growth in both work and memory.
There are a several of strategies for restarting, in particular:
- Explicit restart — a new starting vector is a linear combination of a subset of the Ritz vectors.
- Implicit restart — a new starting vector is formed from the Arnoldi process combined with an implicitly shifted QR algorithm.
Explicit restart is relatively simple to implement, but implicit restart is more efficient since it retains the relevant eigeninformation of the larger problem. However implicit restart is difficult to implement in a numerically stable way.
An alternative which is much simpler to implement, but achieves the same effect as implicit restart, is a Krylov-Schur method [
S01].
Implementation
A number of software implementations are available, in particular:
The implementation in
"NonstiffTest" is based on Krylov-Schur Iteration.
Automatic Strategy
The
"Automatic" setting uses an amalgamation of the methods as follows.
- For n ≤ 2*m direct eigenvalue computation is used. Either m = min (n, msi) or m = min (n, mki) is used depending on which method is active.
- For n > 2*m subspace iteration is used with a default basis size of msi = 8. If the method succeeds then the resulting basis is used to start the method at the next integration step.
- If subspace iteration fails to converge after maxsi iterations then the dominant vector is used to start the Krylov method with a default basis size of mki = 16. Subsequent integration steps use the Krylov method, starting with the resulting vector from the previous step.
- If Krylov iteration fails to converge after maxki iterations then norm bounds are used for the current step. The next integration step will continue to try to use Krylov iteration.
- Since they are so inexpensive, norm bounds are always computed when subspace or Krylov iteration is used and the smaller of the absolute values is used.
Step Rejections
Caching of the time of evaluation ensures that the dominant eigenvalue estimate is not recomputed for rejected steps.
Stiffness detection is also performed for rejected steps since:
- Step rejections often occur for nonstiff solvers when working near the stability boundary
- Step rejections often occur for stiff solvers when resolving fast transients
Iterative Method Options
The iterative methods of
"NonstiffTest" have options that can be modified:
| Out[20]= |  |
| Out[21]= |  |
The default tolerance aims for just one correct digit, but often obtains substantially more accurate values—especially after a few successful iterations at successive steps.
The default values limiting the number of iterations are:
- For subspace iteration maxsi = max (25, n/ (2msi)).
- For Krylov iteration maxki = max (50, n/mki).
If these values are set too large then a convergence failure becomes too costly.
In difficult problems, it is better to share the work of convergence across steps. Since the methods effectively refine the basis vectors from the previous step, there is a reasonable chance of convergence in subsequent steps.
Latency and Switching
It is important to incorporate some form of latency in order to avoid a cycle where the
"StiffnessSwitching" method continually tries to switch between stiff and nonstiff methods.
The options
"MaxRepetitions" and
"SafetyFactor" of
"StiffnessTest" and
"NonstiffTest" are used for this purpose.
The default settings allow switching to be quite reactive, which is appropriate for one-step integration methods.
- "StiffnessTest" is carried out at the end of a step with a nonstiff method. When either value of the option "MaxRepetitions" is reached, a step rejection occurs and the step is recomputed with a stiff method.
- "NonstiffTest" is preemptive. It is performed before a step is taken with a stiff solve using the Jacobian matrix from the previous step.
Examples
Van der Pol
Select an example system. |
StiffnessTest
The system is integrated successfully with the given method and the default option settings for "StiffnessTest".
| Out[23]= |  |
|
A longer integration is aborted and a message is issued when the stiffness test condition is not satisfied.
| Out[24]= |  |
|
Using a unit safety factor and specifying that only one stiffness failure is allowed effectively gives a strict test. The specification uses the nested method option syntax.
| Out[25]= |  |
|
NonstiffTest
For such a small system, direct eigenvalue computation is used.
The example serves as a good test that the overall stiffness switching framework is behaving as expected.
Set up a function to monitor the switch between stiff and nonstiff methods and the step size taken.
Data for the stiff and nonstiff solvers is put in separate lists by using a different tag for " Sow". |
Solve the system and collect the data for the method switching. |
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red).
| Out[30]= |  |
|
Compute the number of nonstiff and stiff steps taken (including rejected steps).
| Out[31]= |  |
|
CUSP
The cusp catastrophe model for the nerve impulse mechanism [
Z72]:
Combining with the van der Pol oscillator gives rise to the CUSP system [
HW96]:
and
= 1/144 and
= 10-4.
Discretization of the diffusion terms using the method of lines is used to obtain a system of ODEs of dimension
3n = 96.
Unlike the van der Pol system, because of the size of the problem, iterative methods are used for eigenvalue estimation.
Step Size and Order Selection
Select the problem to solve. |
Set up a function to monitor the type of method used and step size. Additionally the order of the method is included as a Tooltip. |
Collect the data for the order of the method as the integration proceeds. |
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red). A Tooltip shows the order of the method at each step.
| Out[37]= |  |
|
Compute the total number of nonstiff and stiff steps taken (including rejected steps).
| Out[39]= |  |
|
Jacobian Example
Define a function to collect the first few Jacobian matrices. |
A switch to a stiff method occurs near
0.00113425 and the first test for nonstiffness occurs at the next step
tk
0.00127887.
Graphical illustration of the Jacobian Jtk.
| Out[45]= |  |
|
Define a function to compute and display the first few eigenvalues of Jtk, Jtk+1,... and the norm bounds.
| Out[47]= |  |
|
Norm bounds are quite sharp in this example.
Korteweg-deVries
The Korteweg-deVries partial differential equation is a mathematical model of waves on shallow water surfaces:
We consider boundary conditions:
and solve over the interval
t
[0, 1].
Discretization using the method of lines is used to form a system of 192 ODEs.
Step Sizes
Select the problem to solve. |
The Backward Differentiation Formula methods used in LSODA run into difficulties solving this problem.
| Out[49]= |  |
|
A plot shows that the step sizes rapidly decrease.
| Out[50]= |  |
|
In contrast StiffnessSwitching immediately switches to using the linearly implicit Euler method which needs very few integration steps.
| Out[51]= |  |
| Out[52]= |  |
|
The extrapolation methods never switch back to a nonstiff solver once the stiff solver is chosen at the beginning of the integration.
Therefore this is a form of worst case example for the nonstiff detection.
Despite this, the cost of using subspace iteration is only a few percent of the total integration time.
Compute the time taken with switching to a nonstiff method disabled.
| Out[53]= |  |
|
Jacobian Example
Collect data for the first few Jacobian matrices using the previously defined monitor function. |
Graphical illustration of the initial Jacobian Jt0.
| Out[56]= |  |
|
Compute and display the first few eigenvalues of Jtk, Jtk+1,... and the norm bounds.
| Out[57]= |  |
|
Norm bounds overestimate slightly, but more importantly they give no indication of the relative size of real and imaginary parts:
Option Summary
StiffnessTest
| | |
| "MaxRepetitions" | {3,5} | specify the maximum number of successive and total times that the stiffness test (6) is allowed to fail |
| "SafetyFactor" |  | specify the safety factor to use in the right-hand side of the stiffness test (7) |
Options of the method option "StiffnessTest".
NonstiffTest
| | |
| "MaxRepetitions" | {2, } | specify the maximum number of successive and total times that the stiffness test (8) is allowed to fail |
| "SafetyFactor" |  | specify the safety factor to use in the right-hand side of the stiffness test (9) |
Options of the method option "NonstiffTest".