Mathematica >
Mathematica 教程

Stiffness Detection

Overview

Many differential equations exhibit some form of stiffness which restricts the step-size and hence effectiveness of explicit solution methods.
A number of implicit methods have been developed over the years to circumvent this problem.
For the same step size, implicit methods can be substantially less efficient than explicit methods, due to the overhead associated with the intrinsic linear algebra.
This cost can offset by the fact that, in certain regions, implicit methods can take substantially larger step sizes.
Several attempts have been made to provide user-friendly codes that automatically attempt to detect stiffness at runtime and switch between appropriate methods as necessary.
A number of strategies that have been proposed to automatically equip a code with a stiffness detection device are outlined here.
Particular attention is given to the problem of estimation of the dominant eigenvalue of a matrix in order to describe how stiffness detection is implemented in NDSolve.
Numerical examples illustrate the effectiveness of the strategy.

Initialization

Load some packages with predefined examples and utility functions.
In[1]:=
Click for copyable input

Introduction

Consider the numerical solution of initial value problems:
Stiffness is a combination of problem, solution method, initial condition and local error tolerances.
Stiffness limits the effectiveness of explicit solution methods due to restrictions on the size of steps that can be taken.
Stiffness arises in many practical systems as well as in the numerical solution of partial differential equations by the method of lines.

Example

The van der Pol oscillator is a non-conservative oscillator with nonlinear damping and is an example of a stiff system of ordinary differential equations:
with CurlyEpsilon = 3/1000;
Consider initial conditions:
and solve over the interval t Element [0, 10].
The method "StiffnessSwitching" uses a pair of extrapolation methods by default:
  • Explicit modified midpoint (Gragg smoothing), double-harmonic sequence 2, 4, 6,...
  • Linearly implicit Euler, sub-harmonic sequence 2, 3, 4,...

Solution

This loads the problem from a package.
In[4]:=
Click for copyable input
Solve the system numerically using a nonstiff method.
Solve the system using a method that switches when stiffness occurs.
In[6]:=
Click for copyable input
Here is a plot of the two solution components. The sharp peaks (in blue) extend out to about 450 in magnitude and have been cropped.
In[7]:=
Click for copyable input
Out[7]=
Stiffness can often occur in regions that follow rapid transients.
This plots the step sizes taken against time.
In[8]:=
Click for copyable input
Out[8]=
The problem is that when the solution is changing rapidly, there is little point using a stiff solver, since local accuracy is the dominant issue.
For efficiency, it would be useful if the method could automatically detect regions where local accuracy (and not stability) is important.

Linear Stability

Linear stability theory arises from the study of Dahlquist's scalar linear test equation:
as a simplified model for studying the initial value problem (1).
Stability is characterized by analyzing a method applied to (2) to obtain
where z = h Lambda and R(z) is the (rational) stability function.
The boundary of absolute stability is obtained by considering the region:

Explicit Euler Method

The explicit or forward Euler method:
applied to (3) gives:
The shaded region represents instability, where |R (z)|>1.
In[9]:=
Click for copyable input
Out[9]=
The Linear Stability Boundary is often taken as the intersection with the negative real axis.
For the explicit Euler method LSB = -2.
For an eigenvalue of Lambda = -1, linear stability requirements mean that the step-size needs to satisfy h < 2, which is a very mild restriction.
However, for an eigenvalue of Lambda = -106, linear stability requirements mean that the step size needs to satisfy h < 2Cross10-6, which is a very severe restriction.

Example

This example shows the effect of stiffness on the step-size sequence when using an explicit Runge-Kutta method to solve a stiff system.
This system models a chemical reaction.
In[10]:=
Click for copyable input
The system is solved by disabling the built-in stiffness detection.
In[11]:=
Click for copyable input
The step-size sequence starts to oscillate when the stability boundary is reached.
In[12]:=
Click for copyable input
Out[12]=
  • A large number of step rejections often has a negative impact on performance.
  • The large number of steps taken adversely affects the accuracy of the computed solution.
The built-in detection does an excellent job of locating when stiffness occurs.

Implicit Euler Method

The implicit or backward Euler method:
applied to (4) gives:
The method is unconditionally stable for the entire left half-plane.
In[14]:=
Click for copyable input
Out[14]=
This means that to maintain stability there is no longer a restriction on the step size.
The drawback is that an implicit system of equations now has to be solved at each integration step.

Type Insensitivity

A type-insensitive solver recognizes and responds efficiently to stiffness at each step and so is insensitive to the (possibly changing) type of the problem.
One of the most established solvers of this class is LSODA [H83], [P83].
Later generations of LSODA such as CVODE no longer incorporate a stiffness detection device. The reason is because LSODA use norm bounds to estimate the dominant eigenvalue and these bounds, as will be seen later, can be quite inaccurate.
The low order of A(Alpha)-stable BDF methods means that LSODA and CVODE are not very suitable for solving systems with high accuracy or systems where the dominant eigenvalue has a large imaginary part. Alternative methods, such as those based on extrapolation of linearly implicit schemes, do not suffer from these issues.
Much of the work on stiffness detection was carried out in the 1980s and 1990s using standalone FORTRAN codes.
New linear algebra techniques and efficient software have since become available and these are readily accessible in Mathematica.
Stiffness can be a transient phenomenon, so detecting nonstiffness is equally important [S77], [B90].

"StiffnessTest" Method Option

There are several approaches that can be used to switch from a nonstiff to a stiff solver.

Direct Estimation

A convenient way of detecting stiffness is to directly estimate the dominant eigenvalue of the Jacobian J of the problem (see [S77], [P83], [S83], [S84a], [S84c], [R87] and [HW96]).
Such an estimate is often available as a by-product of the numerical integration and so it is reasonably inexpensive.
If v denotes an approximation to the eigenvector corresponding to dominant eigenvalue of the Jacobian, with LeftDoubleBracketingBarvRightDoubleBracketingBar sufficiently small, then by the mean value theorem a good approximation to the leading eigenvalue is:
Richardson's extrapolation provides a sequence of refinements that yields a quantity of this form, as do certain explicit Runge-Kutta methods.
Cost is at most two function evaluations, but often at least one of these is available as a by-product of the numerical integration, so it is reasonably inexpensive.
Let LSB denote the linear stability boundary—the intersection of the linear stability region with the negative real axis.
The product gives an estimate that can be compared to the linear stability boundary of a method in order to detect stiffness:
where s is a safety factor.

Description

The methods "DoubleStep", "Extrapolation", and "ExplicitRungeKutta" have the option "StiffnessTest", which can be used to identify whether the method applied with the specified AccuracyGoal and PrecisionGoal tolerances to a given problem is stiff.
The method option "StiffnessTest" itself accepts a number of options that implement a weak form of (5) where the test is allowed to fail a specified number of times.
The reason for this is that some problems can be only mildly stiff in a certain region and an explicit integration method may still be efficient.

"NonstiffTest" Method Option

The "StiffnessSwitching" method has the option "NonstiffTest", which is used to switch back from a stiff method to a nonstiff method.
The following settings are allowed for the option "NonstiffTest"
  • None or False (perform no test).
  • "NormBound".
  • "Direct".
  • "SubspaceIteration".
  • "KrylovIteration".
  • "Automatic".

Switching to a Nonstiff Solver

An approach that is independent of the stiff method is used.
Given the Jacobian J (or an approximation) compute one of:
Norm Bound: LeftDoubleBracketingBar J RightDoubleBracketingBar
Spectral Radius: Rho (J) = max |Lambdai|
Dominant Eigenvalue Lambdai : |Lambdai| > | Lambdaj |
Many linear algebra techniques focus on solving a single problem to high accuracy.
For stiffness detection, a succession of problems with solutions to one or two digits are adequate.
For a numerical discretization
0 = t0 < t1< CenterEllipsis < tn = T
consider a sequence k of matrices in some sub-interval(s)
Jti , Jti+1 , ... Jti+k-1
The spectra of the succession of matrices often changes very slowly from step to step.
The goal is to find a way of estimating (bounds on) dominant eigenvalues of a succession of matrices Jti that:
  • Costs less than the work carried out in the linear algebra at each step in the stiff solver.
  • Takes account of the step-to-step nature of the solver.

NormBound

A simple and efficient technique of obtaining a bound on the dominant eigenvalue is to use the norm of the Jacobian LeftDoubleBracketingBar J RightDoubleBracketingBarp where typically p = 1 or p = Infinity.
The method has complexity O (n2), which is less than the work carried out in the stiff solver.
This is the approach used by LSODA.
  • Norm bounds for dense matrices overestimate and the bounds become worse as the dimension increases.
  • Norm bounds can be tight for sparse or banded matrices of quite large dimension.
The setting "NormBound" of the option "NonstiffTest" computes LeftDoubleBracketingBar J RightDoubleBracketingBar1 and LeftDoubleBracketingBar J RightDoubleBracketingBarInfinity and returns the smaller of the two values.

Example

The following Jacobian matrix arises in the numerical solution of the van der Pol system using a stiff solver.
In[18]:=
Click for copyable input
Bounds based on norms overestimate the spectral radius by more than an order of magnitude.
In[19]:=
Click for copyable input
Out[19]=
Out[17]=

Direct Eigenvalue Computation

For small problems (n ≤ 32) it can be efficient just to compute the dominant eigenvalue directly.
  • Hermitian matrices use the LAPACK function xgeev
  • General matrices use the LAPACK function xsyevr
The setting "Direct" of the option "NonstiffTest" computes the dominant eigenvalue of J using the same LAPACK routines as Eigenvalues.
For larger problems the cost of direct eigenvalue computation is O (n3) which becomes prohibitive when compared to the cost of the linear algebra work in a stiff solver.
A number of iterative schemes have been implemented for this purpose. These effectively work by approximating the dominant eigenspace in a smaller subspace and using dense eigenvalue methods for the smaller problem.

The Power Method

Shampine has proposed the use of the power method for estimating the dominant eigenvalue of the Jacobian [S91].
The power method is perhaps not a very well-respected method, but has received a resurgence of interest due to its use in Google's page ranking.
The power method can be used when
  • AElementDoubleStruckCapitalCn Cross n has n linearly independent eigenvectors (diagonalizable)
  • The eigenvalues can be ordered in magnitude as LeftBracketingBar Lambda1RightBracketingBar > LeftBracketingBar Lambda2 RightBracketingBarCenterEllipsisLeftBracketingBarLambdanRightBracketingBar
  • Lambda1 is the dominant eigenvalue of A.

Description

Given a starting vector v0ElementDoubleStruckCapitalCn compute
The Rayleigh quotient is used to compute an approximation to the dominant eigenvalue:
In practice, the approximate eigenvector is scaled at each step:

Properties

The power method converges linearly with rate:
which can be slow.
In particular, the method does not converge when applied to a matrix with a dominant complex conjugate pair of eigenvalues.

Generalizations

The power method can be adapted to overcome the issue of equimodular eigenvalues (e.g. NAPACK)
However the modification does not generally address the issue of the slow rate of convergence for clustered eigenvalues.
There are two main approaches to generalizing the power method:
  • Subspace iteration for small to medium dimensions
  • Arnoldi iteration for large dimensions
Although the methods work quite differently, there are a number of core components that can be shared and optimized.
Subspace and Krylov iteration cost O (n2 m) operations.
They project an nCrossn matrix to an mCrossm matrix, where m << n.
The small matrix represents the dominant eigenspace and approximation uses dense eigenvalue routines.

Subspace Iteration

Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on m vectors at each step.
Start with an orthonormal set of vectors V (0) =DoubleStruckCapitalCn Cross m, where usually m << n:
Form the product with the matrix A:
In order to prevent all vectors from converging to multiples of the same dominant eigenvector v1 of A, they are orthonormalized:
The orthonormalization step is expensive compared to the matrix product.

Rayleigh-Ritz Projection

Input: matrix A and an orthonormal set of vectors V
  • Compute the Rayleigh quotient S = V*A V
  • Compute the Schur decomposition U*S U = T
The matrix S has small dimension m Cross m.
Note that the Schur decomposition can be computed in real arithmetic when SElementDoubleStruckCapitalRm Cross m using a quasi upper-triangular matrix T.

Convergence

Subspace (or simultaneous) iteration generalizes the ideas in the power method by acting on m vectors at each step.
SRRIT converges linearly with rate:
In particular the rate for the dominant eigenvalue is:
Therefore it can be beneficial to take e.g. m = 3 or more even if we are only interested in the dominant eigenvalue.

Error Control

A relative error test on successive approximations, dominant eigenvalue is:
This is not sufficient since it can be satisfied when convergence is slow.
If
or LeftBracketingBarLambdaiRightBracketingBar =LeftBracketingBarLambdai+1RightBracketingBar then the ith column of
is not uniquely determined.
The residual test used in SRRIT is:
where , is the ith column of
and is the ith column of T (k).
This is advantageous since it works for equimodular eigenvalues.
The first column position of the upper triangular matrix T (k) is tested because of the use of an ordered Schur decomposition.

Implementation

There are several implementations of subspace iteration.
  • Subspace iteration with Chebyshev acceleration [S84b], [DS93]
The implementation for use in "NonstiffTest" is based on:
  • Schur Rayleigh-Ritz iteration [BS97]
"An attractive feature of SRRIT is that it displays monotonic consistency, that is, as the convergence tolerance decreases so does the size of the computed residuals" [LS96].
SRRIT makes use of an ordered Schur decomposition where eigenvalues of largest modulus appear in the upper-left entries.
Modified Gram-Schmidt with reorthonormalization is used to form Q (k), which is faster than Householder transformations.
The approximate dominant subspace at integration time ti is used to start the iteration at the next integration step ti+1:

KrylovIteration

Given an n Cross m matrix V whose columns vi comprise an orthogonal basis of a given subspace ScriptCapitalV:
The Rayleigh-Ritz procedure consists of computing H = VTA V and solving the associated eigenproblem H yi= Thetaiyi.
The approximate eigenpairs of the original problem , satisfy and
, which are called Ritz values and Ritz vectors.
The process works best when the subspace ScriptCapitalV approximates an invariant subspace of A.
This process is effective when ScriptCapitalV is equal to the Krylov subspace associated with a matrix A and a given initial vector x as:

Description

The method of Arnoldi is a Krylov-based projection algorithm that computes an orthogonal basis of the Krylov subspace and produces a projected m Cross m matrix H with m << n.
Input: matrix A, the number of steps m, an initial vector v1 of norm 1
Output: (Vm, Hm, f, Beta) with Beta =LeftDoubleBracketingBarfRightDoubleBracketingBar2
In the case of Arnoldi, H has an unreduced upper Hessenberg form (upper triangular with an additional nonzero subdiagonal).
Orthogonalization is usually carried out by means of a Gram-Schmidt procedure.
The quantities computed by the algorithm satisfy:
The residual f gives an indication of proximity to an invariant subspace and the associated norm Beta indicates the accuracy of the computed Ritz pairs:

Restarting

The Ritz pairs converge quickly if the initial vector x is rich in the direction of the desired eigenvalues.
When this is not the case then a restarting strategy is required in order to avoid excessive growth in both work and memory.
There are a several of strategies for restarting, in particular:
  • Explicit restart — a new starting vector is a linear combination of a subset of the Ritz vectors.
  • Implicit restart — a new starting vector is formed from the Arnoldi process combined with an implicitly shifted QR algorithm.
Explicit restart is relatively simple to implement, but implicit restart is more efficient since it retains the relevant eigeninformation of the larger problem. However implicit restart is difficult to implement in a numerically stable way.
An alternative which is much simpler to implement, but achieves the same effect as implicit restart, is a Krylov-Schur method [S01].

Implementation

A number of software implementations are available, in particular:
The implementation in "NonstiffTest" is based on Krylov-Schur Iteration.

Automatic Strategy

The "Automatic" setting uses an amalgamation of the methods as follows.
  • For n ≤ 2*m direct eigenvalue computation is used. Either m = min (n, msi) or m = min (n, mki) is used depending on which method is active.
  • For n > 2*m subspace iteration is used with a default basis size of msi = 8. If the method succeeds then the resulting basis is used to start the method at the next integration step.
  • If subspace iteration fails to converge after maxsi iterations then the dominant vector is used to start the Krylov method with a default basis size of mki = 16. Subsequent integration steps use the Krylov method, starting with the resulting vector from the previous step.
  • If Krylov iteration fails to converge after maxki iterations then norm bounds are used for the current step. The next integration step will continue to try to use Krylov iteration.
  • Since they are so inexpensive, norm bounds are always computed when subspace or Krylov iteration is used and the smaller of the absolute values is used.

Step Rejections

Caching of the time of evaluation ensures that the dominant eigenvalue estimate is not recomputed for rejected steps.
Stiffness detection is also performed for rejected steps since:
  • Step rejections often occur for nonstiff solvers when working near the stability boundary
  • Step rejections often occur for stiff solvers when resolving fast transients

Iterative Method Options

The iterative methods of "NonstiffTest" have options that can be modified:
In[20]:=
Click for copyable input
Out[20]=
In[21]:=
Click for copyable input
Out[21]=
The default tolerance aims for just one correct digit, but often obtains substantially more accurate values—especially after a few successful iterations at successive steps.
The default values limiting the number of iterations are:
  • For subspace iteration maxsi = max (25, n/ (2msi)).
  • For Krylov iteration maxki = max (50, n/mki).
If these values are set too large then a convergence failure becomes too costly.
In difficult problems, it is better to share the work of convergence across steps. Since the methods effectively refine the basis vectors from the previous step, there is a reasonable chance of convergence in subsequent steps.

Latency and Switching

It is important to incorporate some form of latency in order to avoid a cycle where the "StiffnessSwitching" method continually tries to switch between stiff and nonstiff methods.
The options "MaxRepetitions" and "SafetyFactor" of "StiffnessTest" and "NonstiffTest" are used for this purpose.
The default settings allow switching to be quite reactive, which is appropriate for one-step integration methods.
  • "StiffnessTest" is carried out at the end of a step with a nonstiff method. When either value of the option "MaxRepetitions" is reached, a step rejection occurs and the step is recomputed with a stiff method.
  • "NonstiffTest" is preemptive. It is performed before a step is taken with a stiff solve using the Jacobian matrix from the previous step.

Examples

Van der Pol

Select an example system.
In[22]:=
Click for copyable input

StiffnessTest

The system is integrated successfully with the given method and the default option settings for "StiffnessTest".
In[23]:=
Click for copyable input
Out[23]=
A longer integration is aborted and a message is issued when the stiffness test condition is not satisfied.
In[24]:=
Click for copyable input
Out[24]=
Using a unit safety factor and specifying that only one stiffness failure is allowed effectively gives a strict test. The specification uses the nested method option syntax.
In[25]:=
Click for copyable input
Out[25]=

NonstiffTest

For such a small system, direct eigenvalue computation is used.
The example serves as a good test that the overall stiffness switching framework is behaving as expected.
Set up a function to monitor the switch between stiff and nonstiff methods and the step size taken. Data for the stiff and nonstiff solvers is put in separate lists by using a different tag for "Sow".
In[26]:=
Click for copyable input
Solve the system and collect the data for the method switching.
In[28]:=
Click for copyable input
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red).
In[30]:=
Click for copyable input
Out[30]=
Compute the number of nonstiff and stiff steps taken (including rejected steps).
In[31]:=
Click for copyable input
Out[31]=

CUSP

The cusp catastrophe model for the nerve impulse mechanism [Z72]:
Combining with the van der Pol oscillator gives rise to the CUSP system [HW96]:
where
and Sigma = 1/144 and CurlyEpsilon = 10-4.
Discretization of the diffusion terms using the method of lines is used to obtain a system of ODEs of dimension 3n = 96.
Unlike the van der Pol system, because of the size of the problem, iterative methods are used for eigenvalue estimation.

Step Size and Order Selection

Select the problem to solve.
In[32]:=
Click for copyable input
Set up a function to monitor the type of method used and step size. Additionally the order of the method is included as a Tooltip.
In[33]:=
Click for copyable input
Collect the data for the order of the method as the integration proceeds.
In[35]:=
Click for copyable input
Plot the step sizes taken using an explicit solver (blue) and an implicit solver (red). A Tooltip shows the order of the method at each step.
In[37]:=
Click for copyable input
Out[37]=
Compute the total number of nonstiff and stiff steps taken (including rejected steps).
In[39]:=
Click for copyable input
Out[39]=

Jacobian Example

Define a function to collect the first few Jacobian matrices.
In[41]:=
Click for copyable input
In[43]:=
Click for copyable input
A switch to a stiff method occurs near 0.00113425 and the first test for nonstiffness occurs at the next step tk TildeTilde0.00127887.
Graphical illustration of the Jacobian Jtk.
In[45]:=
Click for copyable input
Out[45]=
Define a function to compute and display the first few eigenvalues of Jtk, Jtk+1,... and the norm bounds.
In[46]:=
Click for copyable input
In[47]:=
Click for copyable input
Out[47]=
Norm bounds are quite sharp in this example.

Korteweg-deVries

The Korteweg-deVries partial differential equation is a mathematical model of waves on shallow water surfaces:
We consider boundary conditions:
and solve over the interval t Element [0, 1].
Discretization using the method of lines is used to form a system of 192 ODEs.

Step Sizes

Select the problem to solve.
In[48]:=
Click for copyable input
The Backward Differentiation Formula methods used in LSODA run into difficulties solving this problem.
In[49]:=
Click for copyable input
Out[49]=
A plot shows that the step sizes rapidly decrease.
In[50]:=
Click for copyable input
Out[50]=
In contrast StiffnessSwitching immediately switches to using the linearly implicit Euler method which needs very few integration steps.
In[51]:=
Click for copyable input
Out[51]=
In[52]:=
Click for copyable input
Out[52]=
The extrapolation methods never switch back to a nonstiff solver once the stiff solver is chosen at the beginning of the integration.
Therefore this is a form of worst case example for the nonstiff detection.
Despite this, the cost of using subspace iteration is only a few percent of the total integration time.
Compute the time taken with switching to a nonstiff method disabled.
In[53]:=
Click for copyable input
Out[53]=

Jacobian Example

Collect data for the first few Jacobian matrices using the previously defined monitor function.
In[54]:=
Click for copyable input
Graphical illustration of the initial Jacobian Jt0.
In[56]:=
Click for copyable input
Out[56]=
Compute and display the first few eigenvalues of Jtk, Jtk+1,... and the norm bounds.
In[57]:=
Click for copyable input
Out[57]=
Norm bounds overestimate slightly, but more importantly they give no indication of the relative size of real and imaginary parts:

Option Summary

StiffnessTest

option namedefault value
"MaxRepetitions"{3,5}specify the maximum number of successive and total times that the stiffness test (6) is allowed to fail
"SafetyFactor"specify the safety factor to use in the right-hand side of the stiffness test (7)

Options of the method option "StiffnessTest".

NonstiffTest

option namedefault value
"MaxRepetitions"{2,Infinity}specify the maximum number of successive and total times that the stiffness test (8) is allowed to fail
"SafetyFactor"specify the safety factor to use in the right-hand side of the stiffness test (9)

Options of the method option "NonstiffTest".

Ask a question about this page  |  Suggest an improvement  |  Leave a message for the team