BUILT-IN MATHEMATICA SYMBOL
DistributionFitTest
- DistributionFitTest performs a goodness-of-fit hypothesis test with null hypothesis H0 that data was drawn from a population with distribution dist and alternative hypothesis Ha that it was not.
- By default, a probability value or p-value is returned.
- A small p-value suggests that it is unlikely that the data came from dist.
- The dist can be any symbolic distribution with numeric and symbolic parameters or a dataset.
- The data can be univariate {x1, x2, ...} or multivariate {{x1, y1, ...}, {x2, y2, ...}, ...}.
- DistributionFitTest[data, dist, Automatic] will choose the most powerful test that applies to data and dist for a general alternative hypothesis.
- DistributionFitTest[data, dist, All] will choose all tests that apply to data and dist.
- DistributionFitTest[data, dist, "test"] reports the p-value according to "test".
- Many of the tests use the CDF F(x) of the test distribution dist and the empirical CDF F̂(x) of the data, as well as their difference d(x) = F̂(x) - F(x) and d̄ = Expectation[d(x), ...]. The CDFs F̂(x) and F(x) should be the same under the null hypothesis H0.
- The following tests can be used for univariate or multivariate distributions:
| "AndersonDarling" | distribution, data | based on Expectation[d(x)^2/(F(x) (1 - F(x)))] |
| "CramerVonMises" | distribution, data | based on Expectation[d(x)^2] |
| "JarqueBeraALM" | normality | based on skewness and kurtosis |
| "KolmogorovSmirnov" | distribution, data | based on sup_x Abs[d(x)] |
| "Kuiper" | distribution, data | based on sup_x d(x) - inf_x d(x) |
| "PearsonChiSquare" | distribution, data | based on expected and observed histogram |
| "ShapiroWilk" | normality | based on quantiles |
| "WatsonUSquare" | distribution, data | based on Expectation[(d(x) - d̄)^2] |
- The following tests can be used for multivariate distributions:
| "DistanceToBoundary" | uniformity | based on distance to uniform boundaries |
| "MardiaCombined" | normality | combined Mardia skewness and kurtosis |
| "MardiaKurtosis" | normality | based on multivariate kurtosis |
| "MardiaSkewness" | normality | based on multivariate skewness |
| "SzekelyEnergy" | data | based on Newton's potential energy |
- DistributionFitTest[data, dist, "HypothesisTestData"] returns a HypothesisTestData object htd that can be used to extract additional test results and properties using the form htd["property"].
- DistributionFitTest[data, dist, "property"] can be used to directly give the value of "property".
- Properties related to the reporting of test results include:
| "AllTests" | list of all applicable tests |
| "AutomaticTest" | test chosen if Automatic is used |
| "DegreesOfFreedom" | the degrees of freedom used in a test |
| "PValue" | list of p-values |
| "PValueTable" | formatted table of p-values |
| "ShortTestConclusion" | a short description of the conclusion of a test |
| "TestConclusion" | a description of the conclusion of a test |
| "TestData" | list of pairs of test statistics and p-values |
| "TestDataTable" | formatted table of p-values and test statistics |
| "TestStatistic" | list of test statistics |
| "TestStatisticTable" | formatted table of test statistics |
- The following properties are independent of which test is being performed.
- Properties related to the data distribution include:
| "FittedDistribution" | fitted distribution of data |
| "FittedDistributionParameters" | distribution parameters of data |
- The following options can be given:
- For a test for goodness of fit, a cutoff α is chosen such that the null hypothesis H0 is rejected only if p < α. The value of α used for the "TestConclusion" and "ShortTestConclusion" properties is controlled by the SignificanceLevel option. By default, α is set to 0.05.
- With the setting Method->"MonteCarlo", datasets si of the same length as the input are generated under H0 using the fitted distribution. The EmpiricalDistribution of the test statistics from DistributionFitTest[si, dist, {"TestStatistic", test}] is then used to estimate the p-value.
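The test-selection, property, and option forms described in the notes above can be sketched as follows. This is a minimal illustration, not output from the original page: the sample data, the seed, and the choice of NormalDistribution[0, 1] as the test distribution are assumptions made only for the example.

```mathematica
(* Assumed sample data: 200 standard normal draws with a fixed seed *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[0, 1], 200];
dist = NormalDistribution[0, 1];

(* p-value from the automatically selected test *)
DistributionFitTest[data, dist, Automatic]

(* p-value from one named test *)
DistributionFitTest[data, dist, "KolmogorovSmirnov"]

(* which test Automatic selects, and which tests apply *)
DistributionFitTest[data, dist, "AutomaticTest"]
DistributionFitTest[data, dist, "AllTests"]

(* test conclusion at a 1% significance level instead of the default 5% *)
DistributionFitTest[data, dist, "TestConclusion", SignificanceLevel -> 0.01]

(* p-value estimated by Monte Carlo simulation *)
DistributionFitTest[data, dist, Method -> "MonteCarlo"]
```

Because the data is random, the p-values differ from run to run unless the seed is fixed.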
Test some data for normality:
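The input for this example is not shown. A minimal sketch, assuming a sample of 1000 standard normal draws with a fixed seed (both are illustrative choices):

```mathematica
(* Assumed sample data; the seed only makes the sketch repeatable *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 1000];

(* With no distribution given, the data is tested for normality *)
DistributionFitTest[data]
```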
Create a HypothesisTestData object for further property extraction:
The full test table:
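A sketch of these two steps, again with assumed sample data; the variable name htd follows the notation used in the notes above:

```mathematica
(* Assumed sample data *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 1000];

(* Keep the full test object so properties can be extracted later *)
htd = DistributionFitTest[data, Automatic, "HypothesisTestData"];

(* Formatted table of statistics and p-values for all applicable tests *)
htd["TestDataTable", All]
```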
Compare the histogram of the data to the PDF of the test distribution:
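One way to draw this comparison is sketched below. The sample data and the plot range {x, -4, 4} are assumptions chosen for standard normal data:

```mathematica
(* Assumed sample data and test object *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 1000];
htd = DistributionFitTest[data, Automatic, "HypothesisTestData"];

(* Overlay the PDF of the fitted distribution on a density-scaled histogram *)
Show[
 Histogram[data, Automatic, "ProbabilityDensity"],
 Plot[PDF[htd["FittedDistribution"], x], {x, -4, 4}, PlotStyle -> Thick]]
```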
Test the fit of a set of data to a particular distribution:
Extract the Anderson-Darling test table:
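A sketch of this example, assuming data drawn from WeibullDistribution[1, 2] and tested against the same distribution; the distribution and its parameters are illustrative, not taken from the original input:

```mathematica
(* Assumed sample data from a Weibull distribution *)
SeedRandom[2];
data = RandomVariate[WeibullDistribution[1, 2], 500];
dist = WeibullDistribution[1, 2];

(* p-value for the fit of data to dist *)
DistributionFitTest[data, dist]

(* Test table restricted to the Anderson-Darling test *)
DistributionFitTest[data, dist, {"TestDataTable", "AndersonDarling"}]
```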
Verify the test results with ProbabilityPlot:
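A sketch under the same assumed Weibull data. Points lying close to the reference line indicate agreement between the data and the test distribution:

```mathematica
(* Assumed sample data from a Weibull distribution *)
SeedRandom[2];
data = RandomVariate[WeibullDistribution[1, 2], 500];

(* Plot the empirical CDF of data against the CDF of the test distribution *)
ProbabilityPlot[data, WeibullDistribution[1, 2]]
```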
Test data for goodness of fit to a multivariate distribution:
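A minimal multivariate sketch, assuming a bivariate normal test distribution with illustrative mean and covariance values:

```mathematica
(* Assumed bivariate normal distribution and sample data *)
SeedRandom[3];
dist = MultinormalDistribution[{0, 0}, {{1, 0.5}, {0.5, 1}}];
data = RandomVariate[dist, 300];

(* p-value for the fit of the two-column data to dist *)
DistributionFitTest[data, dist]
```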
Plot the marginal PDFs of the test distribution against the data to confirm the test results:
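One possible way to make this comparison is sketched below. It uses the same assumed bivariate normal setup, and the plot range is an illustrative choice:

```mathematica
(* Assumed bivariate normal distribution and sample data *)
SeedRandom[3];
dist = MultinormalDistribution[{0, 0}, {{1, 0.5}, {0.5, 1}}];
data = RandomVariate[dist, 300];

(* For each coordinate, overlay the marginal PDF on a histogram of that column *)
Table[
 Show[
  Histogram[data[[All, i]], Automatic, "ProbabilityDensity"],
  Plot[PDF[MarginalDistribution[dist, i], x], {x, -4, 4}, PlotStyle -> Thick]],
 {i, 2}]
```

Matching marginals are necessary but not sufficient for a good multivariate fit, so this plot complements the test rather than replacing it.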
New in 8