This is documentation for Mathematica 7; its content is based on an earlier version of the Wolfram Language.

Statistical Model Analysis

When fitting models to data, it is often useful to analyze how well the model fits the data and how well the fit meets the assumptions behind it. For a number of common statistical models, this is accomplished in Mathematica by way of fitting functions that construct FittedModel objects.
FittedModel    represents a symbolic fitted model

Object for fitted model information.

FittedModel objects can be evaluated at a point or queried for results and diagnostic information. Diagnostics vary somewhat across model types. Available model fitting functions fit linear, generalized linear, and nonlinear models.
LinearModelFit               constructs a linear model
GeneralizedLinearModelFit    constructs a generalized linear model
LogitModelFit                constructs a binomial logistic regression model
ProbitModelFit               constructs a binomial probit regression model
NonlinearModelFit            constructs a nonlinear least squares model

Functions that generate FittedModel objects.

This fits a linear model assuming x values 1, 2, ....
Here is the functional form of the fitted model.
This evaluates the model for x = 2.5.
Here is a shortened list of available results for the linear fitted model.
The major difference between model fitting functions such as LinearModelFit and functions such as Fit and FindFit is the ability to easily obtain diagnostic information from the FittedModel objects. The results are accessible without re-fitting the model.
This gives the residuals for the fitting.
Here multiple results are obtained at once.
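The original input cells for the steps above are not reproduced on this page. A minimal sketch of the same workflow follows; the data values and the names data and lm are hypothetical and chosen only for illustration.

```mathematica
(* Hypothetical sample data; the x values are taken to be 1, 2, ... *)
data = {1.2, 2.9, 5.1, 7.2, 8.8, 11.1};

(* Fit the straight line b0 + b1 x; a constant term is included by default *)
lm = LinearModelFit[data, x, x];

Normal[lm]                   (* functional form of the fitted model *)
lm[2.5]                      (* evaluate the fitted model at x = 2.5 *)
lm["Properties"] // Short    (* shortened list of available results *)
lm["FitResiduals"]           (* residuals, obtained without re-fitting *)
lm[{"RSquared", "EstimatedVariance"}]   (* several results at once *)
```

All of the queries after the first line reuse the stored fit, which is the main practical difference from Fit and FindFit.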
Fitting options relevant to property computations can be passed to FittedModel objects to override defaults.
This gives default 95% confidence intervals.
Here 90% intervals are obtained.
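As an illustration of passing an option to an existing FittedModel object, the following sketch requests parameter confidence intervals at the default level and then at 90%. The data and the name lm are hypothetical.

```mathematica
(* Hypothetical sample data; the x values are taken to be 1, 2, ... *)
data = {1.2, 2.9, 5.1, 7.2, 8.8, 11.1};
lm = LinearModelFit[data, x, x];

(* Default 95% intervals *)
lm["ParameterConfidenceIntervals"]

(* 90% intervals, overriding the default for this query only *)
lm["ParameterConfidenceIntervals", ConfidenceLevel -> 0.90]
```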
Typical data for these model fitting functions takes the same form as data in other fitting functions such as Fit and FindFit.
{y1,y2,...}                                data points with a single predictor variable taking values 1, 2, ...
{{x11,x12,...,y1},{x21,x22,...,y2},...}    data points with explicit coordinates

Data specifications.
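The two data forms can be compared directly. In this sketch the response values are hypothetical; the second fit supplies the coordinates 1, 2, ... explicitly, so both fits are expected to give the same parameters.

```mathematica
(* Hypothetical responses *)
ys = {2.1, 3.9, 6.2, 7.8, 10.1};

(* Form 1: responses only, predictor values implied as 1, 2, ... *)
fit1 = LinearModelFit[ys, x, x];

(* Form 2: explicit {x, y} coordinates *)
fit2 = LinearModelFit[Transpose[{Range[Length[ys]], ys}], x, x];

(* Difference of the parameter estimates; expected to vanish up to rounding *)
fit1["BestFitParameters"] - fit2["BestFitParameters"]
```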

Linear Models

Linear models with assumed independent normally distributed errors are among the most common models for data. Models of this type can be fitted using the LinearModelFit function.
LinearModelFit[{y1,y2,...},{f1,f2,...},x]    obtain a linear model with basis functions fi and a single predictor variable x
LinearModelFit[{{x11,x12,...,y1},{x21,x22,...,y2}},{f1,f2,...},{x1,x2,...}]    obtain a linear model of multiple predictor variables xi
LinearModelFit[{m,v}]    obtain a linear model based on a design matrix m and response vector v

Linear model fitting.

Linear models have the form $\hat{y} = \beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots$, where $\hat{y}$ is the fitted or predicted value, the $\beta_i$ are parameters to be fitted, and the $f_i$ are functions of the predictor variables $x_i$. The models are linear in the parameters $\beta_i$. The $f_i$ can be any functions of the predictor variables. Quite often the $f_i$ are simply the predictor variables $x_i$.
This fits a linear model to the first 20 primes.
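A sketch of such a fit is given below, together with two variations on the calling forms listed above. The choice of the extra basis function x Log[x] is an illustrative assumption, and the design matrix is assumed to carry its own column of ones for the constant term.

```mathematica
(* The first 20 primes as responses; x takes the values 1, 2, ..., 20 *)
primes = N[Prime[Range[20]]];

(* Straight-line fit: the constant basis function is included by default *)
lmLine = LinearModelFit[primes, x, x];

(* Basis functions need not be the predictor itself *)
lmLog = LinearModelFit[primes, {x, x Log[x]}, x];

(* Design matrix form of the straight-line fit: columns are 1 and x *)
m = Table[{1, i}, {i, 20}];
lmMatrix = LinearModelFit[{m, primes}];

{Normal[lmLine], Normal[lmLog], lmMatrix["BestFitParameters"]}
```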
Options for model specification and for model analysis are available.
option name                  default value    description
ConfidenceLevel              95/100           confidence level to use for parameters and predictions
IncludeConstantBasis         True             whether to include a constant basis function
LinearOffsetFunction         None             known offset in the linear predictor
NominalVariables             None             variables considered as nominal or categorical
VarianceEstimatorFunction    Automatic        function for estimating the error variance
Weights                      Automatic        weights for data elements
WorkingPrecision             Automatic        precision used in internal computations

Options for LinearModelFit.

The Weights option specifies weight values for weighted linear regression. The NominalVariables option specifies which predictor variables should be treated as nominal or categorical. With NominalVariables->All, the model is an analysis of variance (ANOVA) model. With NominalVariables->$\{x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n\}$, the model is an analysis of covariance (ANCOVA) model with all but the ith predictor treated as nominal. Nominal variables are represented by a collection of binary variables indicating equality and inequality to the observed nominal categorical values for the variable.
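The following sketch shows both options on hypothetical data. The group codes 1, 2, 3, the variable names g and x, and the weight values are assumptions made for illustration.

```mathematica
(* Hypothetical data in the form {group, x, y}; group is a category coded 1, 2, 3 *)
data = {{1, 1., 2.1}, {1, 2., 2.9}, {1, 3., 4.2},
        {2, 1., 4.0}, {2, 2., 5.1}, {2, 3., 5.8},
        {3, 1., 6.2}, {3, 2., 6.9}, {3, 3., 8.1}};

(* ANCOVA-style model: g is treated as nominal, x as a numeric predictor *)
ancova = LinearModelFit[data, {g, x}, {g, x}, NominalVariables -> g];
Normal[ancova]

(* Weighted regression of y on x alone, with one weight per data point *)
weighted = LinearModelFit[data[[All, {2, 3}]], x, x,
   Weights -> {1, 1, 1, 2, 2, 2, 3, 3, 3}];
weighted["BestFitParameters"]
```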
ConfidenceLevel, VarianceEstimatorFunction, and WorkingPrecision are relevant to the computation of results after the initial fitting. These options can be set within LinearModelFit to specify the default settings for results obtained from the FittedModel object. These options can also be set within an already constructed FittedModel object to override the option values originally given to LinearModelFit.
Here are the default and mean squared error variance estimates.
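A sketch of this comparison is shown below. The data is hypothetical, and the mean of the squared residuals is used as the alternative variance estimator.

```mathematica
(* Hypothetical sample data; the x values are taken to be 1, 2, ... *)
data = {1.2, 2.9, 5.1, 7.2, 8.8, 11.1};
lm = LinearModelFit[data, x, x];

(* Default estimate, then the mean squared error of the residuals *)
{lm["EstimatedVariance"],
 lm["EstimatedVariance", VarianceEstimatorFunction -> (Mean[#^2] &)]}
```

The second value is obtained from the already constructed FittedModel object; the model is not fitted again.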
IncludeConstantBasis, LinearOffsetFunction, NominalVariables, and Weights are relevant only to the fitting. Setting these options within an already constructed FittedModel object will have no further impact on the result.
A major feature of the model fitting framework is the ability to obtain results after the fitting. The full list of available results can be obtained from the "Properties" value.
This is the number of properties available for linear models.
The properties include basic information about the data, fitted model, and numerous results and diagnostics.
"BestFit"              fitted function
"BestFitParameters"    parameter estimates
"Data"                 the input data or design matrix and response vector
"DesignMatrix"         design matrix for the model
"Function"             best-fit pure function
"Response"             response values in the input data

Properties related to data and the fitted function.

The "BestFitParameters" property gives the fitted parameter values $\{\beta_0, \beta_1, \ldots\}$. "BestFit" is the fitted function $\beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots$ and "Function" gives the fitted function as a pure function. The "DesignMatrix" is the design or model matrix for the data. "Response" gives the list of the response or y values from the original data.
"FitResiduals"             difference between actual and predicted responses
"StandardizedResiduals"    fit residuals divided by the standard error for each residual
"StudentizedResiduals"     fit residuals divided by single deletion error estimates

Types of residuals.

Residuals give a measure of the point-wise difference between the fitted values and the original responses. "FitResiduals" gives the differences between the observed and fitted values, $\{y_1 - \hat{y}_1, y_2 - \hat{y}_2, \ldots\}$. "StandardizedResiduals" and "StudentizedResiduals" are scaled forms of the residuals. The ith standardized residual is $(y_i - \hat{y}_i)/\sqrt{\hat{\sigma}^2 (1 - h_{ii})/w_i}$, where $\hat{\sigma}^2$ is the estimated error variance, $h_{ii}$ is the ith diagonal element of the hat matrix, and $w_i$ is the weight for the ith data point. The ith studentized residual uses the same formula with $\hat{\sigma}^2$ replaced by $\hat{\sigma}_{(i)}^2$, the variance estimate omitting the ith data point.
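The scaled residuals can be rebuilt from other properties as a consistency check. This sketch uses hypothetical data and assumes unit weights, so the weight drops out of the formula.

```mathematica
(* Hypothetical sample data; the x values are taken to be 1, 2, ... *)
data = {1.2, 2.9, 5.1, 7.2, 8.8, 11.1};
lm = LinearModelFit[data, x, x];

r = lm["FitResiduals"];                  (* observed minus fitted values *)
h = lm["HatDiagonal"];                   (* diagonal of the hat matrix *)
s2 = lm["EstimatedVariance"];            (* estimated error variance *)
s2del = lm["SingleDeletionVariances"];   (* variance estimates omitting point i *)

(* Unit weights assumed *)
standardized = r/Sqrt[s2 (1 - h)];
studentized = r/Sqrt[s2del (1 - h)];

(* Both differences are expected to be zero up to rounding *)
Max[Abs[standardized - lm["StandardizedResiduals"]]]
Max[Abs[studentized - lm["StudentizedResiduals"]]]
```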
"ANOVATable"analysis of variance table
"ANOVATableDegreesOfFreedom"degrees of freedom from the ANOVA table
"ANOVATableEntries"unformatted array of values from the table
"ANOVATableFStatistics"F statistics from the table
"ANOVATableMeanSquares"mean square errors from the table
"ANOVATablePValues"p-values from the table
"ANOVATableSumsOfSquares"sums of squares from the table
"CoefficientOfVariation"response mean divided by the estimated standard deviation
"EstimatedVariance"estimate of the error variance
"PartialSumOfSquares"changes in model sum of squares as nonconstant basis functions are removed
"SequentialSumOfSquares"the model sum of squares partitioned componentwise

Properties related to the sum of squared errors.

"ANOVATable" gives a formatted analysis of variance table for the model. "ANOVATableEntries" gives the numeric entries in the table and the remaining ANOVATable properties give the elements of columns in the table so individual parts of the table can easily be used in further computations.
This gives a formatted ANOVA table for the fitted model.
Here are the elements of the MS column of the table.
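A short sketch of these two queries, using the first 20 primes as illustrative data (the name lm is hypothetical):

```mathematica
(* Straight-line fit to the first 20 primes *)
lm = LinearModelFit[N[Prime[Range[20]]], x, x];

(* Formatted analysis of variance table *)
lm["ANOVATable"]

(* The MS column as a plain list, ready for further computation *)
lm["ANOVATableMeanSquares"]
```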
"CorrelationMatrix"parameter correlation matrix
"CovarianceMatrix"parameter covariance matrix
"EigenstructureTable"eigenstructure of the parameter correlation matrix
"EigenstructureTableEigenvalues"eigenvalues from the table
"EigenstructureTableEntries"unformatted array of values from the table
"EigenstructureTableIndexes"index values from the table
"EigenstructureTablePartitions"partitioning from the table
"ParameterConfidenceIntervals"parameter confidence intervals
"ParameterConfidenceIntervalTable"table of confidence interval information for the fitted parameters
"ParameterConfidenceIntervalTableEntries"unformatted array of values from the table
"ParameterConfidenceRegion"ellipsoidal parameter confidence region
"ParameterErrors"standard errors for parameter estimates
"ParameterPValues"p-values for parameter t statistics
"ParameterTable"table of fitted parameter information
"ParameterTableEntries"unformatted array of values from the table
"ParameterTStatistics"t statistics for parameter estimates
"VarianceInflationFactors"list of inflation factors for the estimated parameters

Properties and diagnostics for parameter estimates.

"CovarianceMatrix" gives the covariance between fitted parameters. The matrix is $\hat{\sigma}^2 (X^{T} W X)^{-1}$, where $\hat{\sigma}^2$ is the variance estimate, $X$ is the design matrix, and $W$ is the diagonal matrix of weights. "CorrelationMatrix" is the associated correlation matrix for the parameter estimates. "ParameterErrors" is equivalent to the square root of the diagonal elements of the covariance matrix.
"ParameterTable" and "ParameterConfidenceIntervalTable" contain information about the individual parameter estimates, tests of parameter significance, and confidence intervals.
Here is some data.
This fits a model using both predictor variables.
These are the formatted parameter and parameter confidence interval tables.
Here 99% confidence intervals are used in the table.
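The data used in the original example is not reproduced on this page. A sketch with hypothetical data in the form {x1, x2, y} follows; the names data2 and lm2 are assumptions.

```mathematica
(* Hypothetical data with two predictor variables: {x1, x2, y} *)
data2 = {{1, 2, 4.1}, {2, 1, 4.8}, {3, 4, 9.2}, {4, 3, 9.9},
         {5, 6, 14.1}, {6, 5, 15.2}, {7, 7, 18.8}, {8, 9, 23.1}};

(* Model using both predictor variables *)
lm2 = LinearModelFit[data2, {x1, x2}, {x1, x2}];

(* Formatted tables at the default 95% level *)
lm2[{"ParameterTable", "ParameterConfidenceIntervalTable"}]

(* The same interval table at the 99% level *)
lm2["ParameterConfidenceIntervalTable", ConfidenceLevel -> 0.99]

(* t statistics are estimates divided by standard errors; expected to be zero up to rounding *)
lm2["BestFitParameters"]/lm2["ParameterErrors"] - lm2["ParameterTStatistics"]
```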
The Estimate column of these tables is equivalent to "BestFitParameters". The t statistics are the estimates divided by the standard errors. Each p-value is the two-sided p-value for the t statistic and can be used to assess whether the parameter estimate is statistically significantly different from 0. Each confidence interval gives the upper and lower bounds for the parameter confidence interval at the level prescribed by the ConfidenceLevel option. The various ParameterTable and ParameterConfidenceIntervalTable properties can be used to get the columns or the unformatted array of values from the table.
"VarianceInflationFactors" is used to measure the multicollinearity between basis functions. The ith inflation factor is equal to $1/(1 - R_i^2)$, where $R_i^2$ is the coefficient of determination from fitting the ith basis function to a linear function of the other basis functions. With IncludeConstantBasis->True, the first inflation factor is for the constant term.
"EigenstructureTable" gives the eigenvalues, condition indices, and variance partitions for the nonconstant basis functions. The Index column gives the square root of the ratios of the eigenvalues to the largest eigenvalue. The column for each basis function gives the proportion of variation in that basis function explained by the associated eigenvector. "EigenstructureTablePartitions" gives the values in the variance partitioning for all basis functions in the table.
"BetaDifferences"            DFBETAS measures of influence on parameter values
"CatcherMatrix"              catcher matrix
"CookDistances"              list of Cook distances
"CovarianceRatios"           COVRATIO measures of observation influence
"DurbinWatsonD"              Durbin-Watson d statistic for autocorrelation
"FitDifferences"             DFFITS measures of influence on predicted values
"FVarianceRatios"            FVARATIO measures of observation influence
"HatDiagonal"                diagonal elements of the hat matrix
"SingleDeletionVariances"    list of variance estimates with the ith data point omitted

Properties related to influence measures.

Point-wise measures of influence are often employed to assess whether individual data points have a large impact on the fitting. The hat matrix and catcher matrix play important roles in such diagnostics. The hat matrix is the matrix $H$ such that $\hat{y} = H y$, where $y$ is the observed response vector and $\hat{y}$ is the predicted response vector. "HatDiagonal" gives the diagonal elements of the hat matrix. "CatcherMatrix" is the matrix $C$ such that $\hat{\beta} = C y$, where $\hat{\beta}$ is the fitted parameter vector.
"FitDifferences" gives the DFFITS values that provide a measure of influence of each data point on the fitted or predicted values. The ith DFFITS value is given by $\sqrt{h_{ii}/(1 - h_{ii})}\, r_{ti}$, where $h_{ii}$ is the ith hat diagonal and $r_{ti}$ is the ith studentized residual.
"BetaDifferences" gives the DFBETAS values that provide measures of influence of each data point on the parameters in the model. For a model with p parameters, the ith element of "BetaDifferences" is a list of length p with the jth value giving the measure of the influence of data point i on the jth parameter in the model. The ith "BetaDifferences" vector can be written as $\left\{ c_{1i}/\sqrt{\sum_k c_{1k}^2}, \ldots, c_{pi}/\sqrt{\sum_k c_{pk}^2} \right\} r_{ti}/\sqrt{1 - h_{ii}}$, where $c_{jk}$ is the j,kth element of the catcher matrix.
"CookDistances" gives the Cook distance measures of leverage. The ith Cook distance is given by $(h_{ii}/(1 - h_{ii}))\, r_{si}^2/p$, where $r_{si}$ is the ith standardized residual.
The ith element of "CovarianceRatios" is given by $(\hat{\sigma}_{(i)}^2/\hat{\sigma}^2)^p/(1 - h_{ii})$ and the ith "FVarianceRatios" value is equal to $\hat{\sigma}_{(i)}^2/(\hat{\sigma}^2 (1 - h_{ii}))$, where $\hat{\sigma}_{(i)}^2$ is the ith single deletion variance.
The Durbin-Watson d statistic "DurbinWatsonD" is used for testing the existence of a first-order autoregressive process. The d statistic is equivalent to $\sum_{i=2}^{n} (r_i - r_{i-1})^2 / \sum_{i=1}^{n} r_i^2$, where $r_i$ is the ith residual.
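Several of these measures can be recomputed from simpler properties. The sketch below does so for the DFFITS values, the Cook distances, and the Durbin-Watson statistic, using hypothetical data with two predictors and assuming unit weights.

```mathematica
(* Hypothetical data with two predictor variables: {x1, x2, y} *)
data2 = {{1, 2, 4.1}, {2, 1, 4.8}, {3, 4, 9.2}, {4, 3, 9.9},
         {5, 6, 14.1}, {6, 5, 15.2}, {7, 7, 18.8}, {8, 9, 23.1}};
lm2 = LinearModelFit[data2, {x1, x2}, {x1, x2}];

h = lm2["HatDiagonal"];
r = lm2["FitResiduals"];
rs = lm2["StandardizedResiduals"];
rt = lm2["StudentizedResiduals"];
p = Length[lm2["BestFitParameters"]];   (* number of fitted parameters *)

dffits = Sqrt[h/(1 - h)] rt;
cook = (h/(1 - h)) rs^2/p;
dw = Total[Differences[r]^2]/Total[r^2];

(* Each difference is expected to be zero up to rounding *)
Max[Abs[dffits - lm2["FitDifferences"]]]
Max[Abs[cook - lm2["CookDistances"]]]
dw - lm2["DurbinWatsonD"]
```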
This plots the Cook distances for the bivariate model.
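One possible way to produce such a plot is sketched here with hypothetical bivariate data; the plot options are illustrative choices.

```mathematica
(* Hypothetical data with two predictor variables: {x1, x2, y} *)
data2 = {{1, 2, 4.1}, {2, 1, 4.8}, {3, 4, 9.2}, {4, 3, 9.9},
         {5, 6, 14.1}, {6, 5, 15.2}, {7, 7, 18.8}, {8, 9, 23.1}};
lm2 = LinearModelFit[data2, {x1, x2}, {x1, x2}];

(* One Cook distance per data point, drawn as stems from zero *)
ListPlot[lm2["CookDistances"], Filling -> 0, PlotRange -> All]
```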
"MeanPredictionBands"confidence bands for mean predictions
"MeanPredictionConfidenceIntervals"confidence intervals for the mean predictions
"MeanPredictionConfidenceIntervalTable"table of confidence intervals for the mean predictions
"MeanPredictionConfidenceIntervalTableEntries"unformatted array of values from the table
"MeanPredictionErrors"standard errors for mean predictions
"PredictedResponse"fitted values for the data
"SinglePredictionBands"confidence bands based on single observations
"SinglePredictionConfidenceIntervals"confidence intervals for the predicted response of single observations
"SinglePredictionConfidenceIntervalTable"table of confidence intervals for the predicted response of single observations
"SinglePredictionConfidenceIntervalTableEntries"unformatted array of values from the table
"SinglePredictionErrors"standard errors for the predicted response of single observations

Properties of predicted values.

Tabular results for confidence intervals are given by "MeanPredictionConfidenceIntervalTable" and "SinglePredictionConfidenceIntervalTable". These include the observed and predicted responses, standard error estimates, and confidence intervals for each point. Mean prediction confidence intervals are often referred to simply as confidence intervals and single prediction confidence intervals are often referred to as prediction intervals.
"MeanPredictionBands" and "SinglePredictionBands" give functions of the predictor variables.
Here is the mean prediction table.
This gives the 90% mean prediction intervals.
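A sketch of these prediction queries on hypothetical single-predictor data follows. The final plot, which overlays the fit and its mean prediction bands on the data, is an illustrative addition.

```mathematica
(* Hypothetical sample data; the x values are taken to be 1, 2, ... *)
data = {1.2, 2.9, 5.1, 7.2, 8.8, 11.1};
lm = LinearModelFit[data, x, x];

(* Table of observed and predicted responses, standard errors, and intervals *)
lm["MeanPredictionConfidenceIntervalTable"]

(* 90% mean prediction intervals *)
lm["MeanPredictionConfidenceIntervals", ConfidenceLevel -> 0.90]

(* The bands are functions of the predictor variable x *)
bands = lm["MeanPredictionBands"];
Show[ListPlot[data], Plot[Evaluate[Flatten[{lm[x], bands}]], {x, 1, 6}]]
```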
"AdjustedRSquared"R2 adjusted for the number of model parameters
"AIC"Akaike Information Criterion
"BIC"Bayesian Information Criterion
"RSquared"coefficient of determination R2

Goodness of fit measures.

Goodness-of-fit measures are used to assess how well a model fits or to compare models. The coefficient of determination "RSquared" is the ratio of the model sum of squares to the total sum of squares. "AdjustedRSquared" penalizes for the number of parameters in the model and is given by $1 - \frac{n-1}{n-p}(1 - R^2)$ for a model with n data points and p parameters.
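The adjustment can be checked by hand. This sketch uses hypothetical data and assumes a model that includes the constant term, with p counting all fitted parameters.

```mathematica
(* Hypothetical sample data; the x values are taken to be 1, 2, ... *)
data = {1.2, 2.9, 5.1, 7.2, 8.8, 11.1};
lm = LinearModelFit[data, x, x];

n = Length[data];
p = Length[lm["BestFitParameters"]];   (* constant term plus slope *)
r2 = lm["RSquared"];

(* Hand-computed value next to the built-in property *)
{1 - (n - 1)/(n - p) (1 - r2), lm["AdjustedRSquared"]}
```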
"AIC" and "BIC" are likelihood-based goodness-of-fit measures. Both are equal to -2 times the log-likelihood for the model plus k p where p is the number of parameters to be estimated including the estimated variance. For "AIC" k is 2, and for "BIC" k is log (n).

Generalized Linear Models

The linear model can be seen as a model where each response value y is an observation from a normal distribution with mean value $\hat{\mu} = \beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots$. The generalized linear model extends to models of the form $g(\hat{\mu}) = \beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots$, with each y assumed to be an observation from a distribution of known exponential family form with mean $\mu$, and g being an invertible function over the support of the exponential family. Models of this sort can be obtained via GeneralizedLinearModelFit.
GeneralizedLinearModelFit[{y1,y2,...},{f1,f2,...},x]    obtain a generalized linear model with basis functions fi and a single predictor variable x
GeneralizedLinearModelFit[{{x11,x12,...,y1},{x21,x22,...,y2}},{f1,f2,...},{x1,x2,...}]    obtain a generalized linear model of multiple predictor variables xi
GeneralizedLinearModelFit[{m,v}]    obtain a generalized linear model based on a design matrix m and response vector v

Generalized linear model fitting.

The invertible function g is called the link function and the linear combination $\beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots$ is referred to as the linear predictor. Common special cases include the linear regression model with the identity link function and Gaussian or normal exponential family distribution, logit and probit models for probabilities, Poisson models for count data, and gamma and inverse Gaussian models.
The error variance is a function of the prediction and is defined by the distribution up to a constant $\phi$, which is referred to as the dispersion parameter. The error variance for a fitted value $\hat{\mu}$ can be written as $\hat{\phi}\, v(\hat{\mu})$, where $\hat{\phi}$ is an estimate of the dispersion parameter obtained from the observed and predicted response values, and $v(\hat{\mu})$ is the variance function associated with the exponential family evaluated at the value $\hat{\mu}$.
This fits a linear regression model.
This fits a canonical gamma regression model to the same data.
Here are the functional forms of the models.
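A sketch of these three steps is given below. The positive response values are hypothetical and were chosen so that their reciprocals are roughly linear in x, which suits the canonical (inverse) link of the gamma family.

```mathematica
(* Hypothetical positive responses; the x values are taken to be 1, 2, ... *)
data = {6.5, 5.2, 3.9, 3.4, 2.8, 2.55, 2.2, 2.05};

(* Default settings give a Gaussian model with identity link, i.e. linear regression *)
glmGaussian = GeneralizedLinearModelFit[data, x, x];

(* Gamma regression with the default (canonical) link *)
glmGamma = GeneralizedLinearModelFit[data, x, x, ExponentialFamily -> "Gamma"];

(* Functional forms of the two fitted models *)
{Normal[glmGaussian], Normal[glmGamma]}
```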
Logit and probit models are common binomial models for probabilities. The link function for the logit model is $\log(\mu/(1 - \mu))$ and the link for the probit model is the inverse CDF for a standard normal distribution, $\sqrt{2}\, \operatorname{erf}^{-1}(2\mu - 1)$. Models of this type can be fitted via GeneralizedLinearModelFit with ExponentialFamily->"Binomial" and the appropriate LinkFunction or via LogitModelFit and ProbitModelFit.
LogitModelFit[data,funs,vars]     obtain a logit model with basis functions funs and predictor variables vars
LogitModelFit[{m,v}]              obtain a logit model based on a design matrix m and response vector v
ProbitModelFit[data,funs,vars]    obtain a probit model fit to data
ProbitModelFit[{m,v}]             obtain a probit model fit to a design matrix m and response vector v

Logit and probit model fitting.
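The two routes to a binomial model can be compared as in the sketch below. The 0/1 responses are hypothetical and deliberately not separable by x, so that the fits are well defined.

```mathematica
(* Hypothetical binary responses in the form {x, y} *)
data = {{1, 0}, {2, 0}, {3, 1}, {4, 0}, {5, 1}, {6, 0}, {7, 1}, {8, 1}};

logit = LogitModelFit[data, x, x];
probit = ProbitModelFit[data, x, x];

(* The same models specified through GeneralizedLinearModelFit *)
logitGLM = GeneralizedLinearModelFit[data, x, x,
   ExponentialFamily -> "Binomial", LinkFunction -> "LogitLink"];
probitGLM = GeneralizedLinearModelFit[data, x, x,
   ExponentialFamily -> "Binomial", LinkFunction -> "ProbitLink"];

(* Parameter estimates from each pair are expected to agree *)
{logit["BestFitParameters"], logitGLM["BestFitParameters"]}
{probit["BestFitParameters"], probitGLM["BestFitParameters"]}

(* Fitted probability at x = 4.5 *)
logit[4.5]
```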

Parameter estimates are obtained via iteratively reweighted least squares with weights obtained from the variance function of the assumed distribution. Options for GeneralizedLinearModelFit include options for iterative fitting such as PrecisionGoal, options for model specification such as LinkFunction, and options for further analysis such as ConfidenceLevel.
option name                    default value            description
AccuracyGoal                   Automatic                the accuracy sought
ConfidenceLevel                95/100                   confidence level to use for parameters and predictions
CovarianceEstimatorFunction    "ExpectedInformation"    estimation method for the parameter covariance matrix
DispersionEstimatorFunction    Automatic                function for estimating the dispersion parameter
ExponentialFamily              Automatic                exponential family distribution for y
IncludeConstantBasis           True                     whether to include a constant basis function
LinearOffsetFunction           None                     known offset in the linear predictor
LinkFunction                   Automatic                link function for the model
MaxIterations                  Automatic                maximum number of iterations to use
NominalVariables               None                     variables considered as nominal or categorical
PrecisionGoal                  Automatic                the precision sought
Weights                        Automatic                weights for data elements
WorkingPrecision               Automatic                precision used in internal computations

Options for GeneralizedLinearModelFit.

The options for LogitModelFit and ProbitModelFit are the same as for GeneralizedLinearModelFit except that ExponentialFamily and LinkFunction are defined by the logit or probit model and so are not options to LogitModelFit and ProbitModelFit.
ExponentialFamily can be "Binomial", "Gamma", "Gaussian", "InverseGaussian", "Poisson", or "QuasiLikelihood". Binomial models are valid for responses from 0 to 1. Poisson models are valid for non-negative integer responses. Gaussian or normal models are valid for real responses. Gamma and inverse Gaussian models are valid for positive responses. Quasi-likelihood models define the distributional structure in terms of a variance function $v(\mu)$ such that the log of the quasi-likelihood function for the ith data point is given by $\int_{y_i}^{\hat{\mu}_i} \frac{y_i - \mu}{v(\mu)}\, d\mu$. The variance function for a "QuasiLikelihood" model can be optionally set via ExponentialFamily->{"QuasiLikelihood", "VarianceFunction"->fun} where fun is a pure function to be applied to fitted values.
DispersionEstimatorFunction defines a function for estimating the dispersion parameter $\phi$. The estimate $\hat{\phi}$ is analogous to $\hat{\sigma}^2$ in linear and nonlinear regression models.
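As a sketch of the quasi-likelihood specification, the following fits hypothetical count-like data with a variance function equal to the mean. The explicit log link is an assumption made here so that the example does not depend on the default link for quasi-likelihood models.

```mathematica
(* Hypothetical count-like data in the form {x, y} *)
counts = {{1, 2}, {2, 3}, {3, 6}, {4, 7}, {5, 8}, {6, 9}, {7, 10}, {8, 12}, {9, 15}};

(* Quasi-likelihood model with variance function v(mu) = mu *)
quasi = GeneralizedLinearModelFit[counts, x, x,
   ExponentialFamily -> {"QuasiLikelihood", "VarianceFunction" -> (# &)},
   LinkFunction -> "LogLink"];

quasi["BestFitParameters"]
quasi["EstimatedDispersion"]
```

With this variance function the model resembles a Poisson regression in which the dispersion is estimated rather than fixed.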
ExponentialFamily, IncludeConstantBasis, LinearOffsetFunction, LinkFunction, NominalVariables, and Weights all define some aspect of the model structure and optimization criterion and can only be set within GeneralizedLinearModelFit. All other options can be set either within GeneralizedLinearModelFit or passed to the FittedModel object when obtaining results and diagnostics. Options set in evaluations of FittedModel objects take precedence over settings given to GeneralizedLinearModelFit at the time of the fitting.
This gives 95% and 99% confidence intervals for the parameters in the gamma model.
"BestFit"fitted function
"BestFitParameters"parameter estimates
"Data"the input data or design matrix and response vector
"DesignMatrix"design matrix for the model
"Function"best fit pure function
"LinearPredictor"fitted linear combination
"Response"response values in the input data

Properties related to data and the fitted function.

"BestFitParameters" gives the parameter estimates for the basis functions. "BestFit" gives the fitted function $g^{-1}(\beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots)$, and "LinearPredictor" gives the linear combination $\beta_0 + \beta_1 f_1 + \beta_2 f_2 + \cdots$. "DesignMatrix" is the design or model matrix for the basis functions.
"Deviances"                                deviances
"DevianceTable"                            deviance table
"DevianceTableDegreesOfFreedom"            degrees of freedom differences from the table
"DevianceTableDeviances"                   deviance differences from the table
"DevianceTableEntries"                     unformatted array of values from the table
"DevianceTableResidualDegreesOfFreedom"    residual degrees of freedom from the table
"DevianceTableResidualDeviances"           residual deviances from the table
"EstimatedDispersion"                      estimated dispersion parameter
"NullDeviance"                             deviance for the null model
"NullDegreesOfFreedom"                     degrees of freedom for the null model
"ResidualDeviance"                         difference between the model deviance and null deviance
"ResidualDegreesOfFreedom"                 difference between the model degrees of freedom and null degrees of freedom

Properties related to dispersion and model deviances.

Deviances and deviance tables generalize the model decomposition given by analysis of variance in linear models. The deviance for a single data point is $2(\mathcal{L}_m(y_i) - \mathcal{L}_m(\hat{\mu}_i))$, where $\mathcal{L}_m$ is the log-likelihood function for the fitted model. "Deviances" gives a list of the deviance values for all data points. The sum of all deviances gives the model deviance. The model deviance can be decomposed just as sums of squares are in an ANOVA table for linear models.
Here is some data with two predictor variables.
This fits the data to an inverse Gaussian model.
Here is the deviance table for the model.
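The data of the original example is not reproduced on this page. The sketch below uses hypothetical positive responses whose inverse squares are roughly linear in the two predictors, which suits the default link of the inverse Gaussian family; the names data and ig are assumptions.

```mathematica
(* Hypothetical data with two predictor variables: {x, y, response} *)
data = {{1, 1, 2.45}, {1, 2, 2.27}, {2, 1, 2.15}, {2, 2, 2.02},
        {3, 1, 1.94}, {3, 2, 1.84}, {4, 1, 1.78}, {4, 2, 1.70}};

(* Inverse Gaussian model with the default (canonical) link *)
ig = GeneralizedLinearModelFit[data, {x, y}, {x, y},
   ExponentialFamily -> "InverseGaussian"];

(* Formatted deviance table *)
ig["DevianceTable"]

(* Point-wise deviances summed, shown next to the residual and null deviances *)
{Total[ig["Deviances"]], ig["ResidualDeviance"], ig["NullDeviance"]}
```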
As with sums of squares, deviances are additive. The Deviance column of the table gives the increase in the model deviance when the given basis function is added. The Residual Deviance column gives the difference between the model deviance and the deviance for the submodel containing all previous terms in the table. For large samples, the increase in deviance is approximately $\chi^2$ distributed with degrees of freedom equal to those for the basis function in the table.
"NullDeviance" is the deviance for the null model, the constant model equal to the mean of all observed responses for models including a constant or $g^{-1}(0)$ if a constant term is not included.
As with "ANOVATable", a number of properties are included to extract the columns or unformatted array of entries from "DevianceTable".
"AnscombeResiduals"                Anscombe residuals
"DevianceResiduals"                deviance residuals
"FitResiduals"                     difference between actual and predicted responses
"LikelihoodResiduals"              likelihood residuals
"PearsonResiduals"                 Pearson residuals
"StandardizedDevianceResiduals"    standardized deviance residuals
"StandardizedPearsonResiduals"     standardized Pearson residuals
"WorkingResiduals"                 working residuals

Types of residuals.

"FitResiduals" is the list of residuals, differences between the observed and predicted responses. Given the distributional assumptions, the magnitude of the residuals is expected to change as a function of the predicted response value. Various types of scaled residuals are employed in the analysis of generalized linear models.
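If $d_i$ and $r_i$ are the deviance and residual for the ith data point, the ith deviance residual is given by $r_{di} = \operatorname{sgn}(r_i) \sqrt{d_i}$. The ith Pearson residual is defined as $r_{pi} = r_i/\sqrt{v(\hat{\mu}_i)}$, where v is the variance function for the exponential family distribution. Standardized deviance residuals and standardized Pearson residuals include division by $\sqrt{\hat{\phi} (1 - h_{ii})}$, where $h_{ii}$ is the ith diagonal of the hat matrix. "LikelihoodResiduals" values combine deviance and Pearson residuals. The ith likelihood residual is given by $r_{li} = \operatorname{sgn}(r_i) \sqrt{h_{ii}\, r_{spi}^2 + (1 - h_{ii})\, r_{sdi}^2}$, where $r_{spi}$ and $r_{sdi}$ are the ith standardized Pearson and standardized deviance residuals.
"AnscombeResiduals" provide a transformation of the residuals toward normality, so a plot of these residuals should be expected to look roughly like white noise. The ith Anscombe residual can be written as $\int_{\hat{\mu}_i}^{y_i} v(\mu)^{-1/3}\, d\mu \,/\, v(\hat{\mu}_i)^{1/6}$.
"WorkingResiduals" gives the residuals from the last step of the iterative fitting. The ith working residual can be obtained as $r_i\, \partial g(\mu)/\partial \mu$ evaluated at $\mu = \hat{\mu}_i$.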
This plots the residuals and Anscombe residuals for the inverse Gaussian model.
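A sketch of such a plot is given below for a hypothetical inverse Gaussian fit. It also rebuilds the Pearson residuals by hand, assuming the variance function of the inverse Gaussian family is the cube of the mean and that no dispersion scaling enters the unstandardized residuals.

```mathematica
(* Hypothetical data with two predictor variables: {x, y, response} *)
data = {{1, 1, 2.45}, {1, 2, 2.27}, {2, 1, 2.15}, {2, 2, 2.02},
        {3, 1, 1.94}, {3, 2, 1.84}, {4, 1, 1.78}, {4, 2, 1.70}};
ig = GeneralizedLinearModelFit[data, {x, y}, {x, y},
   ExponentialFamily -> "InverseGaussian"];

r = ig["FitResiduals"];         (* observed minus predicted responses *)
mu = ig["PredictedResponse"];   (* fitted values *)

(* Pearson residuals by hand; the difference is expected to be near zero under the stated assumptions *)
pearson = r/Sqrt[mu^3];
Max[Abs[pearson - ig["PearsonResiduals"]]]

(* Raw residuals and Anscombe residuals side by side *)
{ListPlot[r, Filling -> 0], ListPlot[ig["AnscombeResiduals"], Filling -> 0]}
```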
"CorrelationMatrix"asymptotic parameter correlation matrix
"CovarianceMatrix"asymptotic parameter covariance matrix
"ParameterConfidenceIntervals"parameter confidence intervals
"ParameterConfidenceIntervalTable"table of confidence interval information for the fitted parameters
"ParameterConfidenceIntervalTableEntries"unformatted array of values from the table
"ParameterConfidenceRegion"ellipsoidal parameter confidence region
"ParameterTableEntries"unformatted array of values from the table
"ParameterErrors"standard errors for parameter estimates
"ParameterPValues"p-values for parameter z-statistics
"ParameterTable"table of fitted parameter information
"ParameterZStatistics"z-statistics for parameter estimates

Properties and diagnostics for parameter estimates.

"CovarianceMatrix" gives the covariance between fitted parameters and is very similar to the definition for linear models. With CovarianceEstimatorFunction->"ExpectedInformation" the expected information matrix obtained from the iterative fitting is used. The matrix is $\hat{\phi} (X^{T} W X)^{-1}$, where $X$ is the design matrix, and $W$ is the diagonal matrix of weights from the final stage of the fitting. The weights include both weights specified via the Weights option and the weights associated with the distribution's variance function. With CovarianceEstimatorFunction->"ObservedInformation" the matrix is given by $-\hat{\phi}\, I^{-1}$, where $I$ is the observed Fisher information matrix, which is the Hessian of the log-likelihood function with respect to parameters of the model.
"CorrelationMatrix" is the associated correlation matrix for the parameter estimates. "ParameterErrors" is equivalent to the square root of the diagonal elements of the covariance matrix. "ParameterTable" and "ParameterConfidenceIntervalTable" contain information about the individual parameter estimates, tests of parameter significance, and confidence intervals. The test statistics for generalized linear models asymptotically follow normal distributions.
"CookDistances"    list of Cook distances
"HatDiagonal"      diagonal elements of the hat matrix

Properties related to influence measures.

"CookDistances" and "HatDiagonal" extend the leverage measures from linear regression to generalized linear models. The hat matrix from which the diagonal elements are extracted is defined using the final weights of the iterative fitting.
The Cook distance measures of leverage are defined as in linear regression with standardized residuals replaced by standardized Pearson residuals. The ith Cook distance is given by $(h_{ii}/(1 - h_{ii}))\, r_{spi}^2/p$, where $r_{spi}$ is the ith standardized Pearson residual and p is the number of parameters.
"PredictedResponse"    fitted values for the data

Properties of predicted values.

"AdjustedLikelihoodRatioIndex"    Ben-Akiva and Lerman's adjusted likelihood ratio index
"AIC"                             Akaike Information Criterion
"BIC"                             Bayesian Information Criterion
"CoxSnellPseudoRSquared"          Cox and Snell's pseudo $R^2$
"CraggUhlerPseudoRSquared"        Cragg and Uhler's pseudo $R^2$
"EfronPseudoRSquared"             Efron's pseudo $R^2$
"LikelihoodRatioIndex"            McFadden's likelihood ratio index
"LikelihoodRatioStatistic"        likelihood ratio
"LogLikelihood"                   log likelihood for the fitted model
"PearsonChiSquare"                Pearson's $\chi^2$ statistic

Goodness of fit measures.

"LogLikelihood" is the log-likelihood for the fitted model. "AIC" and "BIC" are penalized log-likelihood measures $-2\mathcal{L} + k p$, where $\mathcal{L}$ is the log-likelihood for the fitted model, p is the number of parameters estimated including the dispersion parameter, and k is 2 for "AIC" and $\log(n)$ for "BIC" for a model of n data points. "LikelihoodRatioStatistic" is given by $2(\mathcal{L} - \mathcal{L}_0)$, where $\mathcal{L}_0$ is the log-likelihood for the null model.
A number of the goodness of fit measures generalize $R^2$ from linear regression as either a measure of explained variation or as a likelihood-based measure. "CoxSnellPseudoRSquared" is given by $1 - (e^{\mathcal{L}_0 - \mathcal{L}})^{2/n}$. "CraggUhlerPseudoRSquared" is a scaled version of Cox and Snell's measure, $(1 - (e^{\mathcal{L}_0 - \mathcal{L}})^{2/n})/(1 - (e^{\mathcal{L}_0})^{2/n})$. "LikelihoodRatioIndex" involves the ratio of log-likelihoods, $1 - \mathcal{L}/\mathcal{L}_0$, and "AdjustedLikelihoodRatioIndex" adjusts by penalizing for the number of parameters, $1 - (\mathcal{L} - p)/\mathcal{L}_0$. "EfronPseudoRSquared" uses the sum of squares interpretation of $R^2$ and is given as $1 - \sum_i r_i^2 / \sum_i (y_i - \bar{y})^2$, where $r_i$ is the ith residual and $\bar{y}$ is the mean of the responses $y_i$.
"PearsonChiSquare" is equal to $\sum_i r_{pi}^2$, where the $r_{pi}$ are Pearson residuals.

Nonlinear Models

A nonlinear least squares model is an extension of the linear model where the model need not be a linear combination of basis functions. The errors are still assumed to be independent and normally distributed. Models of this type can be fitted using the NonlinearModelFit function.
NonlinearModelFit[{y1,y2,...},form,{β1,...},x]    obtain a nonlinear model of the function form with parameters βi and a single predictor variable x
NonlinearModelFit[{{x11,...,y1},{x21,...,y2}},form,{β1,...},{x1,...}]    obtain a nonlinear model as a function of multiple predictor variables xi
NonlinearModelFit[data,{form,cons},{β1,...},{x1,...}]    obtain a nonlinear model subject to the constraints cons

Nonlinear model fitting.

Nonlinear models have the form $\hat{y} = f(\beta_1, \ldots, \beta_p; x_1, \ldots, x_m)$, where $\hat{y}$ is the fitted or predicted value, the $\beta_i$ are parameters to be fitted, and the $x_i$ are predictor variables. As with any nonlinear optimization problem, a good choice of starting values for the parameters may be necessary. Starting values can be given using the same parameter specifications as for FindFit.
This fits a nonlinear model to a sequence of square roots.
In[25]:=
Out[25]=
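The original input for this cell did not survive in this copy of the page. As a minimal sketch of such a fit, the following assumes the model form Log[a + b x^2] and the symbol names nlm and nlm2; these are illustrative choices and may differ from the original example.

```mathematica
(* Data: the square roots of 1, ..., 10; x values default to 1, 2, ... *)
data = N[Array[Sqrt, 10]];

(* The model form Log[a + b x^2] is an assumption made for this sketch *)
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x]

(* The same fit with explicit starting values, in the FindFit style *)
nlm2 = NonlinearModelFit[data, Log[a + b x^2], {{a, 1}, {b, 1}}, x]

(* Evaluate the fitted model at a point *)
nlm[2.5]
```

If the iteration wanders into a region where a + b x^2 is negative, different starting values or the constrained form {form, cons} may be needed.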
Options for model fitting and for model analysis are available.
option name                 default value
AccuracyGoal                Automatic       the accuracy sought
ConfidenceLevel             95/100          confidence level to use for parameters and predictions
EvaluationMonitor           None            expression to evaluate whenever expr is evaluated
MaxIterations               Automatic       maximum number of iterations to use
Method                      Automatic       method to use
PrecisionGoal               Automatic       the precision sought
StepMonitor                 None            expression to evaluate whenever a step is taken
VarianceEstimatorFunction   Automatic       function for estimating the error variance
Weights                     Automatic       weights for data elements
WorkingPrecision            Automatic       precision used in internal computations

Options for NonlinearModelFit.

General numeric options such as AccuracyGoal, Method, and WorkingPrecision are the same as for FindFit.
The Weights option specifies weight values for weighted nonlinear regression. The optimal fit minimizes the weighted sum of squared errors.
All other options can be relevant to computation of results after the initial fitting. They can be set within NonlinearModelFit for use in the fitting and to specify the default settings for results obtained from the FittedModel object. These options can also be set within an already constructed FittedModel object to override the option values originally given to NonlinearModelFit.
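A short sketch of how these options might be used follows. The data, the model form Log[a + b x^2], the weight values, and the confidence levels are all assumptions chosen for illustration.

```mathematica
data = N[Array[Sqrt, 10]];

(* Set a default confidence level when the model is fitted *)
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x,
   ConfidenceLevel -> 0.9];

(* Uses the 90% level given at fitting time *)
nlm["ParameterConfidenceIntervals"]

(* Overrides the level for this one query, without refitting *)
nlm["ParameterConfidenceIntervals", ConfidenceLevel -> 0.99]

(* Weighted fit: one hypothetical weight per data point *)
wnlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x,
   Weights -> 1/Range[10]];
wnlm["BestFitParameters"]
```

The 99% intervals should be wider than the 90% intervals, because only the confidence level changes between the two queries.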
"BestFit"             fitted function
"BestFitParameters"   parameter estimates
"Data"                the input data
"Function"            best fit pure function
"Response"            response values in the input data

Properties related to data and the fitted function.

Basic properties of the data and fitted function for nonlinear models behave like the same properties for linear and generalized linear models, with the exception that "BestFitParameters" returns a list of rules, as FindFit does.
This gives the fitted function and rules for the parameter estimates.
In[26]:=
Out[26]=
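As a sketch of how these properties relate, the following refits an assumed model (the form Log[a + b x^2] and the names fit and params are illustrative) and substitutes the parameter rules back into the model form.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

(* Fitted expression and a list of rules for the parameters *)
{fit, params} = nlm[{"BestFit", "BestFitParameters"}]

(* Substituting the rules into the model form reproduces the fitted expression *)
Log[a + b x^2] /. params

(* The pure function and the FittedModel object evaluate the same fit *)
{nlm["Function"][2.5], nlm[2.5]}
```

Because the parameters come back as rules, they can be applied with /. to any expression written in terms of a and b.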
Many diagnostics for nonlinear models extend or generalize concepts from linear regression. These extensions often rely on linear approximations or large sample approximations.
"FitResiduals"            difference between actual and predicted responses
"StandardizedResiduals"   fit residuals divided by the standard error for each residual
"StudentizedResiduals"    fit residuals divided by single deletion error estimates

Types of residuals.

As in linear regression, "FitResiduals" gives the differences between the observed and fitted values, $y_i - \hat{y}_i$, and "StandardizedResiduals" and "StudentizedResiduals" are scaled forms of these differences.
The ith standardized residual is $(y_i - \hat{y}_i)/\sqrt{\hat{\sigma}^2 (1 - h_{ii})/w_i}$, where $\hat{\sigma}^2$ is the estimated error variance, $h_{ii}$ is the ith diagonal element of the hat matrix, and $w_i$ is the weight for the ith data point. The ith studentized residual is obtained by replacing $\hat{\sigma}^2$ with the ith single deletion variance $\hat{\sigma}_{(i)}^2$. For nonlinear models a first-order approximation is used for the design matrix, which is needed to compute the hat matrix.
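The scaling of the residuals can be checked by hand from other properties. This sketch assumes unit weights and an illustrative model form, Log[a + b x^2]; the names manualStd and manualStud are hypothetical.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

{res, var, h, del} = nlm[{"FitResiduals", "EstimatedVariance",
    "HatDiagonal", "SingleDeletionVariances"}];

(* Standardized residuals with unit weights, w_i = 1 *)
manualStd = res/Sqrt[var (1 - h)];
Max[Abs[manualStd - nlm["StandardizedResiduals"]]]

(* Studentized residuals: use the single deletion variances instead *)
manualStud = res/Sqrt[del (1 - h)];
Max[Abs[manualStud - nlm["StudentizedResiduals"]]]
```

Both differences are expected to be at the level of rounding error if the properties follow the scaling described above.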
"ANOVATable"                   analysis of variance table
"ANOVATableDegreesOfFreedom"   degrees of freedom from the ANOVA table
"ANOVATableEntries"            unformatted array of values from the table
"ANOVATableMeanSquares"        mean square errors from the table
"ANOVATableSumsOfSquares"      sums of squares from the table
"EstimatedVariance"            estimate of the error variance

Properties related to the sum of squared errors.

"ANOVATable" provides a decomposition of the variation in the data attributable to the fitted function and to the errors or residuals.
This gives the ANOVA table for the nonlinear model.
In[27]:=
Out[27]=
The uncorrected total sum of squares is the sum of squared responses, while the corrected total is the sum of squared differences between the responses and their mean value.
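The two totals can be computed directly from the responses for comparison with the table. This is a sketch with an assumed model form, Log[a + b x^2]; the variable names are illustrative.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

nlm["ANOVATable"]
nlm["ANOVATableSumsOfSquares"]

(* Uncorrected total: sum of squared responses.
   Here the responses are square roots of 1, ..., 10, so this is 55 up to rounding *)
uncorrected = Total[data^2]

(* Corrected total: squared deviations from the mean response *)
corrected = Total[(data - Mean[data])^2]

(* Error sum of squares, from the residuals *)
errorSS = Total[nlm["FitResiduals"]^2]
```

These hand-computed values can be compared against the entries returned by "ANOVATableSumsOfSquares".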
"CorrelationMatrix"                         asymptotic parameter correlation matrix
"CovarianceMatrix"                          asymptotic parameter covariance matrix
"ParameterBias"                             estimated bias in the parameter estimates
"ParameterConfidenceIntervals"              parameter confidence intervals
"ParameterConfidenceIntervalTable"          table of confidence interval information for the fitted parameters
"ParameterConfidenceIntervalTableEntries"   unformatted array of values from the table
"ParameterConfidenceRegion"                 ellipsoidal parameter confidence region
"ParameterErrors"                           standard errors for parameter estimates
"ParameterPValues"                          p-values for parameter t statistics
"ParameterTable"                            table of fitted parameter information
"ParameterTableEntries"                     unformatted array of values from the table
"ParameterTStatistics"                      t statistics for parameter estimates

Properties and diagnostics for parameter estimates.

"CovarianceMatrix" gives the approximate covariance between fitted parameters. The matrix is $\hat{\sigma}^2 (X^T W X)^{-1}$, where $\hat{\sigma}^2$ is the variance estimate, $X$ is the design matrix for the linear approximation to the model, and $W$ is the diagonal matrix of weights. "CorrelationMatrix" is the associated correlation matrix for the parameter estimates. "ParameterErrors" is equivalent to the square roots of the diagonal elements of the covariance matrix.
"ParameterTable" and "ParameterConfidenceIntervalTable" contain information about the individual parameter estimates, tests of parameter significance, and confidence intervals obtained using the error estimates.
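The covariance matrix can be rebuilt from the linear approximation as a check. This sketch assumes unit weights (so W is the identity), an illustrative model form Log[a + b x^2], and takes the design matrix to be the gradient of the model with respect to the parameters at each data point; the names grad, design, and manualCov are hypothetical.

```mathematica
data = N[Array[Sqrt, 10]];
form = Log[a + b x^2];
nlm = NonlinearModelFit[data, form, {a, b}, x];

(* Gradient of the model with respect to {a, b} at the fitted parameters *)
grad = D[form, {{a, b}}] /. nlm["BestFitParameters"];

(* Design matrix of the linear approximation: one row per x = 1, ..., 10 *)
design = Table[grad, {x, 1, 10}];

(* Variance estimate times Inverse[X^T X], with unit weights *)
manualCov = nlm["EstimatedVariance"] Inverse[Transpose[design].design];
Max[Abs[Flatten[manualCov - nlm["CovarianceMatrix"]]]]

(* Standard errors are the square roots of the diagonal *)
Sqrt[Diagonal[nlm["CovarianceMatrix"]]] - nlm["ParameterErrors"]
```

Differences near zero indicate that the hand-built matrix agrees with the property; any larger discrepancy would suggest the assumptions of this sketch do not match the internal computation.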
"CurvatureConfidenceRegion"      confidence region for curvature diagnostics
"FitCurvatureTable"              table of curvature diagnostics
"FitCurvatureTableEntries"       unformatted array of values from the table
"MaxIntrinsicCurvature"          measure of maximum intrinsic curvature
"MaxParameterEffectsCurvature"   measure of maximum parameter effects curvature

Curvature diagnostics.

The first-order approximation used for many diagnostics is equivalent to the model being linear in the parameters. If the parameter space near the parameter estimates is sufficiently flat, the linear approximations and any results that rely on first-order approximations can be deemed reasonable. Curvature diagnostics are used to assess whether the approximate linearity is reasonable. "FitCurvatureTable" is a table of curvature diagnostics.
"MaxIntrinsicCurvature" and "MaxParameterEffectsCurvature" are scaled measures of the normal and tangential curvatures of the parameter spaces at the best-fit parameter values. "CurvatureConfidenceRegion" is a scaled measure of the radius of curvature of the parameter space at the best-fit parameter values. If the normal and tangential curvatures are small relative to the value of the "CurvatureConfidenceRegion", the linear approximation is considered reasonable. Some rules of thumb suggest comparing the values directly, while others suggest comparing with half the "CurvatureConfidenceRegion".
Here is the curvature table for the nonlinear model.
In[28]:=
Out[28]=
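The rules of thumb described above can be applied directly to the individual curvature properties. This is a sketch with an assumed model form, Log[a + b x^2]; the names intrinsic, paramEffects, and region are illustrative.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

nlm["FitCurvatureTable"]

{intrinsic, paramEffects, region} =
  nlm[{"MaxIntrinsicCurvature", "MaxParameterEffectsCurvature",
    "CurvatureConfidenceRegion"}]

(* Direct comparison with the confidence region value *)
{intrinsic < region, paramEffects < region}

(* Stricter comparison with half the confidence region value *)
{intrinsic < region/2, paramEffects < region/2}
```

A result of True in each position suggests that the linear approximation is reasonable under the corresponding rule of thumb.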
"HatDiagonal"               diagonal elements of the hat matrix
"SingleDeletionVariances"   list of variance estimates with the ith data point omitted

Properties related to influence measures.

The hat matrix is the matrix $H$ such that $\hat{y} = H y$, where $y$ is the observed response vector and $\hat{y}$ is the predicted response vector. "HatDiagonal" gives the diagonal elements of the hat matrix. As with other properties, $H$ uses the design matrix for the linear approximation to the model.
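The ith element of "SingleDeletionVariances" is equivalent to $\hat{\sigma}_{(i)}^2 = \left((n - p)\,\hat{\sigma}^2 - r_i^2/(1 - h_{ii})\right)/(n - p - 1)$, where $n$ is the number of data points, $p$ is the number of parameters, $h_{ii}$ is the ith hat diagonal, $\hat{\sigma}^2$ is the variance estimate for the full dataset, and $r_i$ is the ith residual.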
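The single deletion variances can be recomputed from the residuals and hat diagonals. This sketch assumes unit weights and an illustrative model form, Log[a + b x^2]; the name manualDel is hypothetical.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

{res, var, h, del} = nlm[{"FitResiduals", "EstimatedVariance",
    "HatDiagonal", "SingleDeletionVariances"}];
n = Length[data];
p = Length[nlm["BestFitParameters"]];

(* The trace of a projection matrix equals its rank, here the number of parameters *)
Total[h]

(* Variance estimate with the ith point omitted, from full-data quantities *)
manualDel = ((n - p) var - res^2/(1 - h))/(n - p - 1);
Max[Abs[manualDel - del]]
```

No refitting is needed: each deletion variance follows from the full-data variance, the residual, and the hat diagonal for that point.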
"MeanPredictionBands"                              confidence bands for mean predictions
"MeanPredictionConfidenceIntervals"                confidence intervals for the mean predictions
"MeanPredictionConfidenceIntervalTable"            table of confidence intervals for the mean predictions
"MeanPredictionConfidenceIntervalTableEntries"     unformatted array of values from the table
"MeanPredictionErrors"                             standard errors for mean predictions
"PredictedResponse"                                fitted values for the data
"SinglePredictionBands"                            confidence bands based on single observations
"SinglePredictionConfidenceIntervals"              confidence intervals for the predicted response of single observations
"SinglePredictionConfidenceIntervalTable"          table of confidence intervals for the predicted response of single observations
"SinglePredictionConfidenceIntervalTableEntries"   unformatted array of values from the table
"SinglePredictionErrors"                           standard errors for the predicted response of single observations

Properties of predicted values.

Tabular results for confidence intervals are given by "MeanPredictionConfidenceIntervalTable" and "SinglePredictionConfidenceIntervalTable". These results are analogous to those for linear models obtained via LinearModelFit, again with first-order approximations used for the design matrix.
"MeanPredictionBands" and "SinglePredictionBands" give functions of the predictor variables.
Here the fitted function and mean prediction bands are obtained.
In[29]:=
Out[29]=
This plots the fitted curve and confidence bands.
In[30]:=
Out[30]=
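The inputs for these two cells are not preserved in this copy of the page. A minimal sketch of the same steps, assuming the model form Log[a + b x^2] and the illustrative names fit and bands, might look as follows.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

(* Fitted expression and {lower, upper} band expressions in x *)
{fit, bands} = nlm[{"BestFit", "MeanPredictionBands"}]

(* Data points together with the fitted curve and the two band curves *)
Show[
 ListPlot[data],
 Plot[Evaluate[{fit, bands[[1]], bands[[2]]}], {x, 1, 10}]
]
```

Replacing "MeanPredictionBands" with "SinglePredictionBands" gives bands for single observations, which are wider because they also account for the error variance of an individual response.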
"AdjustedRSquared"   R² adjusted for the number of model parameters
"AIC"                Akaike Information Criterion
"BIC"                Bayesian Information Criterion
"RSquared"           coefficient of determination R²

Goodness of fit measures.

"AdjustedRSquared", "AIC", "BIC", and "RSquared" are all direct extensions of the measures as defined for linear models. The coefficient of determination "RSquared" is the ratio of the model sum of squares to the total sum of squares. "AdjustedRSquared" penalizes for the number of parameters in the model and is given by $1 - \frac{n - 1}{n - p}(1 - R^2)$, where $n$ is the number of data points and $p$ is the number of parameters.
"AIC" and "BIC" are equal to $-2$ times the log-likelihood for the model plus $k\,p$, where $p$ is the number of parameters to be estimated including the estimated variance. For "AIC" $k$ is 2, and for "BIC" $k$ is $\log(n)$.
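The information criteria can be approximated by hand for comparison with the properties. This sketch rests on several assumptions: an illustrative model form Log[a + b x^2], a Gaussian log-likelihood, and the maximum likelihood variance estimate sse/n. If the hand-computed values disagree with the properties, the properties are authoritative.

```mathematica
data = N[Array[Sqrt, 10]];
nlm = NonlinearModelFit[data, Log[a + b x^2], {a, b}, x];

n = Length[data];
p = Length[nlm["BestFitParameters"]] + 1;  (* parameters plus the variance *)
sse = Total[nlm["FitResiduals"]^2];

(* Gaussian log-likelihood at the maximum likelihood variance sse/n (assumed) *)
logLik = -n/2 (Log[2 Pi] + Log[sse/n] + 1);

manualAIC = -2 logLik + 2 p
manualBIC = -2 logLik + Log[n] p

(* Values reported by the fitted model *)
nlm[{"AIC", "BIC", "RSquared", "AdjustedRSquared"}]
```

For n = 10 data points Log[n] exceeds 2, so the "BIC" penalty is the larger of the two.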