Plot Diagnostics for Fitted Models—Wolfram Documentation

How to | Plot Diagnostics for Fitted Models

Diagnostics are an important part of analyzing models of data. Plots of residuals and leverage or influence measures provide valuable insight into whether assumptions of the model are reasonable and whether there are data points that exert too much influence on the fitting. You can create these types of graphics in the Wolfram Language by using results from functions such as LinearModelFit together with built‐in plotting functions.

First, define a dataset to work with:

Wolfram Language code:

data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};

A linear model is a model that is linear in the parameters that need to be estimated. A model of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2$ is quadratic in $x$, but is a linear model in the parameters $\beta_0$, $\beta_1$, and $\beta_2$.

Use LinearModelFit to fit this model:

Wolfram Language code: lm = LinearModelFit[data1, {x, x ^ 2}, x]

You can evaluate the FittedModel object at values of x, so the fitted curve can be plotted by passing lm[x] to Plot.

Plot the fitted function:

Wolfram Language code: p1 = Plot[lm[x], {x, 0, 5}]
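Because the FittedModel object behaves like a function, it can also be evaluated at a single value or converted to an ordinary expression. The following is a minimal sketch that repeats the data and the fit from above so that it runs on its own; the evaluation point 2.5 is an arbitrary choice made for illustration.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

lm[2.5]                   (* predicted response at x = 2.5 (arbitrary point) *)
Normal[lm]                (* fitted polynomial as an expression in x *)
lm["BestFitParameters"]   (* coefficients of the basis functions 1, x, x^2 *)
```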

You can plot the data using ListPlot and use Show to display the data along with the fitted curve:

Wolfram Language code: Show[ListPlot[data1], p1]

You can ask the FittedModel object for a number of diagnostics and results.
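As a rough sketch of how to discover what is available, the "Properties" property lists the names the model accepts, and a list of names returns several results at once. The particular properties requested here are just examples; the data and fit are repeated so the block runs on its own.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

lm["Properties"]                       (* names of all available properties *)
lm[{"RSquared", "AdjustedRSquared"}]   (* request several properties at once *)
lm["ParameterTable"]                   (* estimates, standard errors, t statistics, p-values *)
```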

For a linear model, the residuals should look like white noise: random, normally distributed values with mean 0 and constant variance. Any trend in the residuals may indicate a need to modify the model.

Use the "FitResiduals" property with the FittedModel object (lm) to obtain the residuals and plot them for each data point:

Wolfram Language code: ListPlot[lm["FitResiduals"]]
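The white-noise expectation can also be checked numerically and against normal quantiles. The sketch below assumes the usual definition of a residual (observed response minus predicted response) and uses QuantilePlot, which compares a list against a normal distribution by default. The variable name res is introduced here for illustration only.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

res = lm["FitResiduals"];

(* Residuals are the observed responses minus the predicted responses *)
Chop[res - (data1[[All, 2]] - lm["PredictedResponse"])]

(* With a constant term in the model, the mean should be numerically close to 0 *)
Mean[res]

(* Points near the reference line suggest approximately normal residuals *)
QuantilePlot[res]
```

With only 15 points, a quantile plot gives a rough impression rather than a formal test of normality.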

As with any other graphic, you can use additional options to add features such as frames, labels, and filling.

Use the Frame option to add a frame around the residual plot and the Filling option to draw stems from the points to the axis:

Wolfram Language code: ListPlot[lm["FitResiduals"], Frame -> True, Filling -> Axis]

Standardized residuals can be useful because they effectively remove the overall scale in the residuals. In the case of linear and nonlinear regression, standardized residuals should look like white noise with variance equal to 1.

Generate a plot of standardized residuals for the fitted linear model:

Wolfram Language code: ListPlot[lm["StandardizedResiduals"], Frame -> True, Filling -> Axis]
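Since standardized residuals have unit scale, horizontal guide lines make unusually large values easier to spot. The sketch below draws lines at -2 and 2; that cutoff is a common rule of thumb assumed here for illustration and is not part of the original example. The name sres is likewise introduced only for this block.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

sres = lm["StandardizedResiduals"];

(* Guide lines at -2 and 2 (assumed rule-of-thumb band) *)
ListPlot[sres, Frame -> True, Filling -> Axis, PlotRange -> All,
 GridLines -> {None, {-2, 2}}]

(* Indices of data points whose standardized residual falls outside the band, if any *)
Select[Range[Length[sres]], Abs[sres[[#]]] > 2 &]
```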

While the previous residual plots display the residuals for each data point in order, it can also be useful to plot residuals against predictor variables. In the following example, x is the predictor variable.

You can plot the residuals against a predictor variable by creating pairs from the data values and the associated residuals. The predictor values are the first elements of each data point in data1:

Wolfram Language code: xvals = data1[[All, 1]]

Construct pairs of x values and residuals using Transpose, and then plot the pairs using ListPlot:

Wolfram Language code: ListPlot[Transpose[{xvals, lm["FitResiduals"]}]]

Add a frame, stems, and labels to the plot:

Wolfram Language code:

ListPlot[Transpose[{data1[[All, 1]], lm["FitResiduals"]}], Frame -> True, Filling -> 0, FrameLabel -> {"x", "residual"}, PlotLabel -> "Residual vs. Predictor"]
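The same pairing technique works for plotting residuals against the fitted values instead of a predictor, which is a common companion diagnostic. This is a sketch that uses the "PredictedResponse" property for the fitted values; the labels are chosen here for illustration.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

(* Pair each fitted value with its residual *)
pairs = Transpose[{lm["PredictedResponse"], lm["FitResiduals"]}];

ListPlot[pairs, Frame -> True, Filling -> 0,
 FrameLabel -> {"fitted value", "residual"},
 PlotLabel -> "Residual vs. Fitted"]
```

A funnel shape in such a plot would suggest that the variance changes with the size of the response.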

Similarly, measures of the influence or leverage of points in the dataset can be plotted to assess if any individual points have too much influence on the fitting.

You can, for instance, plot Cook distances for each data point. The Cook distance for a data point provides a measure of that point's influence on the overall fit:

Wolfram Language code: ListPlot[lm["CookDistances"], Filling -> 0]
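To go from the plot to a list of candidate points, the distances can be compared against a cutoff. The sketch below uses 4/n, a widely quoted rule of thumb that is an assumption here rather than something the original example prescribes. It also plots the leverages from the "HatDiagonal" property. The names cook, n, and threshold are illustrative.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

cook = lm["CookDistances"];
n = Length[data1];
threshold = 4/n;  (* assumed rule-of-thumb cutoff, not a built-in setting *)

(* Indices of points whose Cook distance exceeds the cutoff, if any *)
Select[Range[n], cook[[#]] > threshold &]

(* Cook distances with the cutoff drawn as a horizontal guide line *)
ListPlot[cook, Frame -> True, Filling -> 0, PlotRange -> All,
 GridLines -> {None, {threshold}}]

(* Leverage of each point: the diagonal of the hat matrix *)
ListPlot[lm["HatDiagonal"], Frame -> True, Filling -> 0, PlotRange -> All]
```

The leverages sum to the number of fitted parameters (three for this model), so a point with leverage well above 3/n carries more than an average share.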

Other pointwise diagnostics include "FitDifferences" and "BetaDifferences", which are often referred to as DFFITS and DFBETAS, respectively. These values can be visualized to assess other types of influence.

Plotting the "FitDifferences" provides a measure of each point's influence on the predicted values:

Wolfram Language code: ListPlot[lm["FitDifferences"], Filling -> 0]
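"BetaDifferences" can be plotted in a similar way. The sketch below assumes the property returns one vector per data point, with one entry per fitted parameter, so Transpose yields one series per parameter. The legend labels are illustrative names for the basis functions 1, x, and x^2.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data1 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm = LinearModelFit[data1, {x, x^2}, x];

dfbetas = lm["BetaDifferences"];
Dimensions[dfbetas]   (* assumed layout: {number of points, number of parameters} *)

(* One series per parameter, showing each point's influence on that estimate *)
ListPlot[Transpose[dfbetas], Frame -> True, Filling -> 0, PlotRange -> All,
 PlotLegends -> {"1", "x", "x^2"}]
```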

Diagnostic plots can be obtained in a similar manner for nonlinear and generalized linear models fitted using NonlinearModelFit, GeneralizedLinearModelFit, LogitModelFit, and ProbitModelFit.

The same values used for the linear model will now be used for nonlinear and generalized linear fitting:

Wolfram Language code:

data2 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};

For example, use Exp with NonlinearModelFit to fit an exponential model to the data:

Wolfram Language code: nlm = NonlinearModelFit[data2, a Exp[b + c x ^ (1 / 2)], {a, b, c}, x]

You can again use ListPlot and Plot to visualize the fitting:

Wolfram Language code: Show[ListPlot[data2], Plot[nlm[x], {x, 0, 5}]]

As in the linear examples, you can also plot pointwise diagnostics such as residuals using ListPlot:

Wolfram Language code: ListPlot[nlm["FitResiduals"], Filling -> 0]
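Because both fits use the same data, their residuals can be overlaid for comparison. This is a sketch only: the name lm2 and the legend labels are introduced for illustration, and the residual sums of squares are computed directly from the "FitResiduals" values.

```wolfram
(* Same data and models as above, repeated so this block is self-contained *)
data2 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
lm2 = LinearModelFit[data2, {x, x^2}, x];
nlm = NonlinearModelFit[data2, a Exp[b + c x^(1/2)], {a, b, c}, x];

(* Overlay the residuals of the quadratic and exponential fits *)
ListPlot[{lm2["FitResiduals"], nlm["FitResiduals"]},
 Frame -> True, Filling -> 0, PlotRange -> All,
 PlotLegends -> {"quadratic", "exponential"}]

(* Residual sum of squares for each fit *)
Total[#^2] & /@ {lm2["FitResiduals"], nlm["FitResiduals"]}
```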

The next example fits a model of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2$, but assumes the responses (the $y$ values in the data) follow an inverse Gaussian distribution. This is specified by setting the ExponentialFamily option in GeneralizedLinearModelFit to "InverseGaussian":

Wolfram Language code: glm = GeneralizedLinearModelFit[data2, {x, x ^ 2}, x, ExponentialFamily -> "InverseGaussian", LinkFunction -> Identity]

Unlike in the linear and nonlinear regression models, the error variance is no longer expected to be constant. For an inverse Gaussian model, the variance of the errors is expected to increase with the mean response (the variance function is proportional to the cube of the mean).
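One way to look for this behavior is to plot residuals against the predicted responses. The sketch below shows the raw fit residuals next to the Pearson residuals, which are scaled by the estimated variance at each point and so should show a more even spread if the assumed distribution is reasonable. The variable names and labels are illustrative.

```wolfram
(* Same data and model as above, repeated so this block is self-contained *)
data2 = {{1.36, 15.36}, {1.69, 16.12}, {4.91, 104.62}, {4.85, 110.06}, {4.41, 99.07}, {2.15, 23.83}, {2.98, 48.86}, {1.19, 16.29}, {4.08, 79.91}, {0.05, 0.89}, {3.69, 70.13}, {2.34, 32.07}, {2.32, 29.39}, {1.77, 26.41}, {0.47, 6.64}};
glm = GeneralizedLinearModelFit[data2, {x, x^2}, x, ExponentialFamily -> "InverseGaussian", LinkFunction -> Identity];

predicted = glm["PredictedResponse"];

(* Raw residuals against predicted response: spread may grow with the response *)
ListPlot[Transpose[{predicted, glm["FitResiduals"]}], Frame -> True, Filling -> 0,
 FrameLabel -> {"predicted response", "fit residual"}]

(* Pearson residuals against predicted response: spread should be more even *)
ListPlot[Transpose[{predicted, glm["PearsonResiduals"]}], Frame -> True, Filling -> 0,
 FrameLabel -> {"predicted response", "Pearson residual"}]
```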

Various types of transformed residuals can be used with these models, and you may want to view several at once. First, specify the residuals that you want to view together:

Wolfram Language code: glmresids = glm[{"FitResiduals", "AnscombeResiduals", "PearsonResiduals", "StandardizedPearsonResiduals"}];

Just as in linear and nonlinear regression, "FitResiduals" are the differences between the response and predicted values. "AnscombeResiduals" are fit residuals transformed so that they should be closer to normal noise for the assumed distribution. "PearsonResiduals" are scaled by the estimated variance at each point, while "StandardizedPearsonResiduals" additionally scale out the overall variance.

Next, define the labels for each residual plot you are about to create:

Wolfram Language code: labels = {"fit", "Anscombe", "Pearson", "standardized Pearson"};

Create plots for the different types of residuals and add the labels to each. The output is suppressed here using a semicolon (;) and the plots are displayed in a grid in the next step:

Wolfram Language code: plots = MapThread[ListPlot[#1, Frame -> True, PlotLabel -> #2]&, {glmresids, labels}];

Use Partition to make a 2×2 array and then display the plots in a grid using GraphicsGrid:

Wolfram Language code: GraphicsGrid[Partition[plots, 2], ImageSize -> 400, PlotLabel -> "Types of Residuals"]
