Wolfram Language & System Documentation Center

ShapiroWilkTest

ShapiroWilkTest[data]

tests whether data is normally distributed using the Shapiro–Wilk test.

ShapiroWilkTest[data,"property"]

returns the value of "property".

Details and Options

ShapiroWilkTest performs the Shapiro–Wilk goodness-of-fit test with null hypothesis H₀ that data was drawn from a NormalDistribution and alternative hypothesis Hₐ that it was not.
By default, a probability value or p-value is returned.
A small p-value suggests that it is unlikely that the data is normally distributed.
The data can be univariate {x₁,x₂,…} or multivariate {{x₁,y₁,…},{x₂,y₂,…},…}.
The Shapiro–Wilk test effectively compares the order statistics of data to the theoretical order statistics of a NormalDistribution.
ShapiroWilkTest[data,"HypothesisTestData"] returns a HypothesisTestData object htd that can be used to extract additional test results and properties using the form htd["property"].
ShapiroWilkTest[data,"property"] can be used to directly give the value of "property".
Properties related to the reporting of test results include:

	"PValue"	p-value
	"PValueTable"	formatted version of "PValue"
	"ShortTestConclusion"	a short description of the conclusion of a test
	"TestConclusion"	a description of the conclusion of a test
	"TestData"	test statistic and p-value
	"TestDataTable"	formatted version of "TestData"
	"TestStatistic"	test statistic
	"TestStatisticTable"	formatted version of "TestStatistic"

The following properties are independent of which test is being performed.
Properties related to the data distribution include:

	"FittedDistribution"	fitted distribution of data
	"FittedDistributionParameters"	distribution parameters of data

The following options can be given:

	Method	Automatic	the method to use for computing p-values
	SignificanceLevel	0.05	cutoff for diagnostics and reporting

For a test for goodness of fit, a cutoff α is chosen such that the null hypothesis H₀ is rejected only if p < α. The value of α used for the "TestConclusion" and "ShortTestConclusion" properties is controlled by the SignificanceLevel option. By default, α is set to 0.05.
With the setting Method->"MonteCarlo", n datasets s_i of the same length as the input are generated under H₀ using the fitted distribution. The EmpiricalDistribution from ShapiroWilkTest[s_i,"TestStatistic"] is then used to estimate the p-value.
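The Monte Carlo procedure and the rejection rule described above can be sketched by hand. This is an illustration under stated assumptions, not the function's internal implementation: the sample, the number of replicates (1000), and the names data, fitted, wObs, wSim, pMC and alpha are all chosen here for the example.

```wolfram
SeedRandom[1];
(* illustrative sample; any univariate numeric list would do *)
data = RandomVariate[StudentTDistribution[3], 50];

(* fit a normal distribution to the data, i.e. the distribution under H0 *)
fitted = EstimatedDistribution[data, NormalDistribution[mu, sigma]];

(* observed test statistic *)
wObs = ShapiroWilkTest[data, "TestStatistic"];

(* statistics of datasets simulated under H0, same length as the input *)
wSim = Table[
   ShapiroWilkTest[RandomVariate[fitted, Length[data]], "TestStatistic"],
   {1000}];

(* small values of W indicate non-normality, so the estimate is a lower-tail probability *)
pMC = CDF[EmpiricalDistribution[wSim], wObs];

(* reject H0 only if the p-value falls below the cutoff *)
alpha = 0.05;
If[pMC < alpha, "reject H0", "do not reject H0"]
```

The estimate depends on the random seed and on the number of replicates, so it will generally differ slightly from the value ShapiroWilkTest reports.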

Examples


Basic Examples (2)

Perform a Shapiro–Wilk test for normality:
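A minimal sketch of such a call, assuming simulated standard normal data (the sample and the name data are illustrative, not the page's original input):

```wolfram
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 100];

(* returns the p-value by default *)
ShapiroWilkTest[data]
```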

Perform a test for multivariate normality:

The full test table:

The test statistic and p-value:
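The multivariate steps above might look like the following sketch. The bivariate sample, its covariance matrix, and the name mdata are assumptions made for illustration.

```wolfram
SeedRandom[1];
(* illustrative bivariate normal sample *)
mdata = RandomVariate[
   MultinormalDistribution[{0, 0}, {{1, 0.3}, {0.3, 1}}], 100];

(* p-value of the test for multivariate normality *)
ShapiroWilkTest[mdata]

(* formatted table with the test statistic and p-value *)
ShapiroWilkTest[mdata, "TestDataTable"]

(* the same two values as a plain list *)
ShapiroWilkTest[mdata, "TestData"]
```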

Scope (6)

Testing (3)

Perform a Shapiro–Wilk test for normality:

The p-value for the normal data is large compared to the p-value for the non-normal data:
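One way to set up such a comparison, assuming a normal sample and a skewed exponential sample (both datasets and their names are illustrative):

```wolfram
SeedRandom[1];
normalData = RandomVariate[NormalDistribution[], 200];
skewedData = RandomVariate[ExponentialDistribution[1], 200];

(* p-values for the normal and the non-normal sample, in that order *)
{ShapiroWilkTest[normalData], ShapiroWilkTest[skewedData]}
```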

Test for multivariate normality:

Create a HypothesisTestData object for repeated property extraction:

The properties available for extraction:
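A short sketch of this workflow, assuming simulated data (the names data and htd are illustrative):

```wolfram
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 100];

(* compute the test once and keep the result object *)
htd = ShapiroWilkTest[data, "HypothesisTestData"];

(* list the properties that can be extracted *)
htd["Properties"]

(* extract individual properties without recomputing the test *)
htd["PValue"]
htd["TestStatistic"]
```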

Reporting (3)

Tabulate the results of the Shapiro–Wilk test:

The full test table:

A p-value table:

The test statistic:

Retrieve the entries from a Shapiro–Wilk test table for custom reporting:
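The reporting properties above could be used as in the following sketch. The data, the variable names, and the Grid layout in the last step are assumptions for illustration.

```wolfram
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 100];
htd = ShapiroWilkTest[data, "HypothesisTestData"];

(* formatted tables *)
htd["TestDataTable"]
htd["PValueTable"]
htd["TestStatisticTable"]

(* raw entries: test statistic and p-value *)
{w, p} = htd["TestData"];

(* a custom report built from the raw entries *)
Grid[{{"W", "p-value"}, {w, p}}, Frame -> All]
```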

Report test conclusions using "ShortTestConclusion" and "TestConclusion":

The conclusion may differ at a different significance level:
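A sketch of the conclusion properties at two significance levels, assuming a heavy-tailed illustrative sample (the data and the level 0.001 are chosen here, not taken from the page):

```wolfram
SeedRandom[1];
data = RandomVariate[StudentTDistribution[5], 100];

(* conclusions at the default significance level of 0.05 *)
ShapiroWilkTest[data, "ShortTestConclusion"]
ShapiroWilkTest[data, "TestConclusion"]

(* the same test reported at a stricter level *)
ShapiroWilkTest[data, "ShortTestConclusion", SignificanceLevel -> 0.001]
```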

Options (3)

Method (3)

Use Monte Carlo-based methods or a computation formula:

Set the number of samples to use for Monte Carlo-based methods:

The Monte Carlo estimate converges to the true p-value with increasing samples:

Set the random seed used in Monte Carlo-based methods:

The seed affects the state of the generator and has some effect on the resulting p-value:
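The Method settings discussed in this subsection might be written as follows. The suboption names "MonteCarloSamples" and "RandomSeed" are given as used by the related hypothesis test functions and should be treated as assumptions here; the data and the sample counts are illustrative.

```wolfram
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 50];

(* default method versus Monte Carlo *)
ShapiroWilkTest[data, Method -> Automatic]
ShapiroWilkTest[data, Method -> "MonteCarlo"]

(* number of Monte Carlo samples (suboption name assumed) *)
ShapiroWilkTest[data, Method -> {"MonteCarlo", "MonteCarloSamples" -> 10^4}]

(* p-values for several random seeds (suboption name assumed) *)
Table[
 ShapiroWilkTest[data, Method -> {"MonteCarlo", "RandomSeed" -> seed}],
 {seed, 1, 3}]
```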

Applications (2)

A power curve for the Shapiro–Wilk test:

Visualize the approximate power curve:

Estimate the power of the Shapiro–Wilk test when the underlying distribution is a CauchyDistribution[0,1], the test size is 0.05, and the sample size is 12:
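A simulation-based sketch of this power estimate: the power is approximated by the fraction of Cauchy samples for which the test rejects at the stated size. The number of trials (2000) and the variable names are assumptions.

```wolfram
SeedRandom[1];
n = 12;         (* sample size *)
alpha = 0.05;   (* test size *)
trials = 2000;  (* number of simulated samples, chosen for illustration *)

(* p-values for samples drawn from the alternative distribution *)
pvals = Table[
   ShapiroWilkTest[RandomVariate[CauchyDistribution[0, 1], n]],
   {trials}];

(* estimated power: proportion of rejections *)
N[Count[pvals, p_ /; p < alpha]/trials]
```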

The boiling point of water was measured at varying altitudes in the Alps. The barometric pressure was recorded for each boiling point. Determine if a linear model is appropriate for use in predicting boiling points given pressure:

A plot of the model and the data:

For the model to be appropriate, the residuals should be normally distributed:

A QuantilePlot confirms that the linear model is not appropriate for this data:

Properties & Relations (3)

ShapiroWilkTest compares the order statistics of the data to their expectations under H₀:

Expected values of the order statistics and an estimate of their covariance matrix:

These are used to compute weights:

The statistic using the estimated covariance matrix is slightly different from the reported value:
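For reference, the textbook form of the Shapiro–Wilk statistic is given below. The notation is introduced here and does not come from this page: m is the vector of expected standard normal order statistics, V is their covariance matrix, a is the weight vector, and x_(i) is the i-th order statistic of the data.

```latex
a = \frac{m^{\mathsf T} V^{-1}}{\sqrt{m^{\mathsf T} V^{-1} V^{-1} m}},
\qquad
W = \frac{\left( \sum_{i=1}^{n} a_i \, x_{(i)} \right)^{2}}{\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^{2}}
```

Values of W close to 1 are consistent with normality, while small values indicate a departure from it.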

For tests of multivariate normality, a transformation to univariate data is made:

The data has been transformed to approximate univariate normal data:

Perform the test on the transformed data:

The result agrees with a test of the original data:

The Shapiro–Wilk test works with the values only when the input is a TimeSeries:
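A sketch of this check, assuming a time series built from illustrative values with integer time stamps (the names vals and ts are hypothetical):

```wolfram
SeedRandom[1];
vals = RandomVariate[NormalDistribution[], 100];

(* time series of {time, value} pairs *)
ts = TimeSeries[Transpose[{Range[100], vals}]];

(* compare the test on the time series with the test on its values *)
ShapiroWilkTest[ts] == ShapiroWilkTest[ts["Values"]]
```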

Possible Issues (1)

The Shapiro–Wilk test requires sample sizes below 5000 for p-values to be valid:

Neat Examples (1)

Compute the statistic when the null hypothesis H₀ is true:

The test statistic given a particular alternative:

Compare the distributions of the test statistics:
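The three steps above can be sketched as a small simulation. The sample size (50), the number of replicates (1000), and the choice of a Student t alternative are assumptions made for illustration.

```wolfram
SeedRandom[1];

(* test statistics when the null hypothesis is true *)
wNull = Table[
   ShapiroWilkTest[RandomVariate[NormalDistribution[], 50], "TestStatistic"],
   {1000}];

(* test statistics under one particular alternative *)
wAlt = Table[
   ShapiroWilkTest[RandomVariate[StudentTDistribution[3], 50], "TestStatistic"],
   {1000}];

(* compare the two sampling distributions *)
SmoothHistogram[{wNull, wAlt},
 PlotLegends -> {"normal", "Student t, 3 degrees of freedom"}]
```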
