Hypothesis Testing Package
This package contains functions for testing hypotheses about data and for computing confidence intervals from data. Tests and confidence intervals for means, differences of means, variances, and ratios of variances are included, along with p-values and confidence intervals for distributions related to the normal distribution.
Hypothesis Tests
The tests in this package are parametric tests. Given a null hypothesis value θ₀ for a parameter θ and an estimate θ̂ of θ obtained from data, hypothesis test functions give the probability of observing a value at least as extreme as θ̂ if θ₀ is the true value of θ. A sufficiently small probability, or p-value, provides evidence that the true value of θ is significantly different from θ₀.
MeanTest and MeanDifferenceTest provide tests of means based on the Central Limit Theorem.
| MeanTest[list, μ₀] | performs a test with null hypothesis μ = μ₀ |
| MeanDifferenceTest[list1, list2, Δμ₀] | performs a test with null hypothesis μ₁ - μ₂ = Δμ₀ |
Hypothesis tests for means.
This tests whether the population mean is significantly different from 35.
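The input for this example is not shown. As a minimal sketch, assuming the package loads under the context HypothesisTesting` and using a made-up sample named data1 (the values are hypothetical, not the documentation's data), the call might look like this:

```mathematica
(* Load the package; assumes HypothesisTesting` is available in your installation *)
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data1 = {34.2, 37.1, 33.8, 36.5, 38.0, 35.4, 36.9, 34.7, 37.6, 35.8};

(* Test the null hypothesis that the population mean is 35 *)
MeanTest[data1, 35]
```

With no options given, the result is a one-sided p-value based on Student's t distribution, since the variance is estimated from the sample.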
Assumptions about variances of the populations from which the data were sampled will affect the distribution of the test statistic. The KnownVariance and EqualVariances options can be used to specify assumptions about population variances.
Option for MeanTest and MeanDifferenceTest.
Tests of the mean and of the difference of means are based on a normal distribution if the population variances are assumed known.
Tests of the mean are based on Student's t distribution with n-1 degrees of freedom when the population variance must be estimated from a list of n elements.
This tests whether the population mean is significantly different from 35, assuming the population variance is 8.
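A sketch of how the KnownVariance option might be supplied for this test. The sample data1 below is hypothetical and stands in for the data that is not shown:

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data1 = {34.2, 37.1, 33.8, 36.5, 38.0, 35.4, 36.9, 34.7, 37.6, 35.8};

(* Same null hypothesis (mean = 35), but the population variance is taken
   as known and equal to 8, so the test uses a normal distribution *)
MeanTest[data1, 35, KnownVariance -> 8]
```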
Option for MeanDifferenceTest.
Tests of the difference of means are also based on Student's t distribution if the variances are not known. If the variances are assumed equal, MeanDifferenceTest is based on Student's t distribution with Length[list1]+Length[list2]-2 degrees of freedom. If the population variances are not assumed equal, Welch's approximation for the degrees of freedom is used.
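For reference, Welch's approximation is usually written as below. This is the standard textbook form, not a formula taken from the package documentation; the symbols s_1^2, s_2^2 (sample variances) and n_1, n_2 (sample sizes) are introduced here for illustration, and it is assumed that the package computes the same quantity.

```latex
\nu \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
```

When the variances are assumed equal, the degrees of freedom are simply n_1 + n_2 - 2, matching Length[list1]+Length[list2]-2 above.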
Here is a second dataset.
This tests whether the difference between the means of the two populations is significantly different from 0.
This tests the same hypothesis with the additional assumption that the population variances are the same.
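The two difference-of-means tests can be compared side by side. The following is a sketch under the assumption of two made-up samples, data1 and data2 (hypothetical values, not the documentation's data):

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical samples, chosen only for illustration *)
data1 = {34.2, 37.1, 33.8, 36.5, 38.0, 35.4, 36.9, 34.7, 37.6, 35.8};
data2 = {33.1, 35.9, 36.2, 32.8, 34.5, 35.0, 33.7, 36.4};

(* Null hypothesis: the difference between the population means is 0.
   Variances are not assumed equal, so Welch's approximation applies *)
MeanDifferenceTest[data1, data2, 0]

(* Same hypothesis, additionally assuming equal population variances *)
MeanDifferenceTest[data1, data2, 0, EqualVariances -> True]
```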
VarianceTest and VarianceRatioTest provide tests of variances for normally distributed samples.
Hypothesis tests for variances.
The variance test statistic follows a χ² distribution and the variance ratio test statistic follows an F-ratio distribution.
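As a reminder of where these distributions come from, the usual textbook statistics for normally distributed samples are sketched below. The symbols are introduced here for illustration: s^2 is a sample variance from n observations, \sigma_0^2 is the null hypothesis variance, and r_0 is the null hypothesis ratio. It is assumed, not confirmed by the text, that the package uses these forms.

```latex
\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2} \sim \chi^2_{\,n-1},
\qquad
F = \frac{s_1^2 / s_2^2}{r_0} \sim F_{\,n_1-1,\;n_2-1}
```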
Here is another set of data.
This tests whether the variance of the population from which these data were sampled is significantly different from 8.
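A sketch of the corresponding call, assuming a made-up sample named data (hypothetical values standing in for the dataset that is not shown):

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data = {41.0, 38.2, 44.7, 36.5, 42.3, 39.9, 45.1, 37.8, 40.6, 43.4};

(* Test the null hypothesis that the population variance is 8 *)
VarianceTest[data, 8]
```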
By default, hypothesis test functions return a one-sided p-value. If the parameter estimate is smaller than the null hypothesis value, the p-value is the probability of observing a value less than or equal to the parameter estimate. If the estimate is greater than the null hypothesis value, the p-value is the probability of observing a value greater than or equal to the parameter estimate.
Additional options are included to obtain two-sided p-values and to obtain more information about the test results.
Options for all hypothesis test functions.
A two-sided test can be requested using TwoSided->True. More details about a test can be obtained using FullReport->True. A full report includes the parameter estimate, the test statistic, and the distribution of the test statistic. With SignificanceLevel->α, a conclusion of the test at significance level α is given, stating whether or not the null hypothesis is rejected at that level of significance.
This is the full report for a mean test of data1.
This is a two-sided variance test of data.
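The three reporting options can be tried as in the sketch below. The samples data1 and data, the null values 35 and 8, and the significance level 0.05 are assumptions made for illustration only:

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical samples, chosen only for illustration *)
data1 = {34.2, 37.1, 33.8, 36.5, 38.0, 35.4, 36.9, 34.7, 37.6, 35.8};
data = {41.0, 38.2, 44.7, 36.5, 42.3, 39.9, 45.1, 37.8, 40.6, 43.4};

(* Full report: parameter estimate, test statistic and its distribution *)
MeanTest[data1, 35, FullReport -> True]

(* Two-sided p-value for a variance test *)
VarianceTest[data, 8, TwoSided -> True]

(* Also state whether the null hypothesis is rejected at level 0.05 *)
MeanTest[data1, 35, SignificanceLevel -> 0.05]
```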
Given a test statistic in terms of the normal, chi-square, Student's t, or F-ratio distribution, a p-value can be computed using the appropriate p-value function. For example, NormalPValue computes a p-value for a test statistic using a normal distribution with mean zero and unit variance. A two-sided p-value can be obtained by setting TwoSided->True.
| NormalPValue[teststat] | gives the p-value for teststat in terms of the normal distribution with mean 0 and unit variance |
| StudentTPValue[teststat,dof] | gives the p-value for teststat in terms of Student's t distribution with dof degrees of freedom |
| ChiSquarePValue[teststat,dof] | gives the p-value for teststat in terms of the χ² distribution with dof degrees of freedom |
| FRatioPValue[teststat,numdof,dendof] | gives the p-value for teststat in terms of the F-ratio distribution with numdof numerator and dendof denominator degrees of freedom |
Functions for p-values of test statistics.
This is the lower tail probability at -1.96 for a normal distribution with mean 0 and unit variance.
A TwoSidedPValue gives the probability of the absolute value of the test statistic being at least as extreme as 1.96.
A p-value is not always equivalent to the cumulative distribution function.
A one-sided p-value has a maximum value of 0.5.
A two-sided p-value is twice the one-sided p-value.
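The p-value properties described above can be explored with inputs along the following lines. This is a sketch rather than the original inputs, which are not shown; the statistic values 1.96 and 0 are chosen here for illustration:

```mathematica
Needs["HypothesisTesting`"]

(* Lower tail probability at -1.96 for the standard normal distribution *)
NormalPValue[-1.96]

(* Two-sided p-value for the same test statistic *)
NormalPValue[-1.96, TwoSided -> True]

(* For a positive statistic the one-sided p-value is an upper tail
   probability, so it is expected to differ from the CDF value *)
{NormalPValue[1.96], CDF[NormalDistribution[0, 1], 1.96]}

(* A statistic of 0 lies at the center of the distribution, where the
   one-sided p-value should reach its maximum *)
NormalPValue[0]
```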
Confidence Intervals
A confidence interval gives bounds within which a parameter value is expected to lie with a certain probability. Interval estimation of a parameter is often useful in observing the accuracy of an estimator as well as in making statistical inferences about the parameter in question.
MeanCI and MeanDifferenceCI provide confidence intervals of means and differences of means based on the Central Limit Theorem.
| MeanCI[list] | gives a confidence interval for the population mean estimated from list |
| MeanDifferenceCI[list1,list2] | gives a confidence interval for the difference between the population means estimated from list1 and list2 |
Confidence intervals for means.
Here is a list of sample values.
This gives a 95% confidence interval for the mean.
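A minimal sketch of this computation, assuming a made-up sample named data1 (the values are hypothetical and stand in for the list that is not shown):

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data1 = {2.1, 1.6, 2.4, 1.9, 2.7, 1.4, 2.2, 2.0, 1.7, 2.5};

(* Confidence interval for the population mean at the default level *)
MeanCI[data1]
```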
Assumptions about variances of the populations from which the data were sampled will affect the distribution of the parameter estimate. The KnownVariance and EqualVariances options can be used to specify assumptions about population variances.
Option for MeanCI and MeanDifferenceCI.
Confidence intervals for the mean and for the difference between means are based on a normal distribution if the population variances are assumed known.
Intervals for the mean are based on Student's t distribution with n-1 degrees of freedom when the population variance must be estimated from a list of n elements.
This gives a confidence interval for the mean, assuming a population variance of 0.25.
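A sketch of the call with the variance supplied through KnownVariance, again using a hypothetical data1:

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data1 = {2.1, 1.6, 2.4, 1.9, 2.7, 1.4, 2.2, 2.0, 1.7, 2.5};

(* Population variance taken as known (0.25), so the interval is based
   on a normal distribution instead of Student's t distribution *)
MeanCI[data1, KnownVariance -> 0.25]
```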
Option for MeanDifferenceCI.
Confidence intervals for the difference between means are also based on Student's t distribution if the variances are not known. If the variances are assumed equal, MeanDifferenceCI is based on Student's t distribution with Length[list1]+Length[list2]-2 degrees of freedom. If the population variances are not assumed equal, Welch's approximation for the degrees of freedom is used.
This is a second dataset.
This gives a 95% confidence interval for the difference between means.
This gives the confidence interval assuming equal population variances.
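The two difference-of-means intervals might be obtained as in this sketch, which assumes two made-up samples data1 and data2 (hypothetical values, not the documentation's data):

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical samples, chosen only for illustration *)
data1 = {2.1, 1.6, 2.4, 1.9, 2.7, 1.4, 2.2, 2.0, 1.7, 2.5};
data2 = {1.2, 1.8, 1.5, 2.0, 1.1, 1.6, 1.9, 1.3};

(* Interval for the difference between means; variances not assumed
   equal, so Welch's approximation for the degrees of freedom applies *)
MeanDifferenceCI[data1, data2]

(* Same interval under the assumption of equal population variances *)
MeanDifferenceCI[data1, data2, EqualVariances -> True]
```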
VarianceCI and VarianceRatioCI provide confidence intervals for variances and ratios of variances for normally distributed samples.
| VarianceCI[list] | gives a confidence interval for the population variance estimated from list |
| VarianceRatioCI[list1,list2] | gives a confidence interval for the ratio of the population variances estimated from list1 and from list2 |
Confidence intervals for variances.
The variance confidence interval is based on a χ² distribution and the variance ratio confidence interval is based on an F-ratio distribution.
Here is a variance confidence interval for data1.
The default confidence level for confidence interval functions is 0.95. Other levels can be specified via the ConfidenceLevel option.
Option for all confidence interval functions.
Here is a 90% confidence interval for the population variance of the first sample.
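A sketch contrasting the default level with a 90% level, assuming a hypothetical sample data1:

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data1 = {2.1, 1.6, 2.4, 1.9, 2.7, 1.4, 2.2, 2.0, 1.7, 2.5};

(* Variance interval at the default confidence level *)
VarianceCI[data1]

(* Variance interval at the 90% confidence level *)
VarianceCI[data1, ConfidenceLevel -> 0.9]
```

The 90% interval should be narrower than the default one, since a lower confidence level requires less coverage.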
Given an estimate of the mean, variance, or ratio of variances, together with the necessary standard deviations or degrees of freedom, confidence intervals can also be obtained from the normal, chi-square, Student's t, or F-ratio distributions.
| NormalCI[mean,sd] | gives the confidence interval centered at mean with standard deviation sd |
| StudentTCI[mean,se,dof] | gives the confidence interval centered at mean with standard error se and dof degrees of freedom |
| ChiSquareCI[variance,dof] | gives the confidence interval for the population variance given the sample variance variance and dof degrees of freedom |
| FRatioCI[ratio,numdof,dendof] | gives the confidence interval for the ratio of population variances, given the ratio of sample variances ratio and where the sample variances in the numerator and denominator have numdof and dendof degrees of freedom |
Confidence intervals given sample estimates.
This calculates the mean of data1.
This estimates the standard error of the mean.
This is equivalent to MeanCI for data1.
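The three steps above can be sketched together as follows. The sample data1 and the variable names mean and se are hypothetical, introduced only for illustration:

```mathematica
Needs["HypothesisTesting`"]

(* Hypothetical sample values, chosen only for illustration *)
data1 = {2.1, 1.6, 2.4, 1.9, 2.7, 1.4, 2.2, 2.0, 1.7, 2.5};

(* Sample mean *)
mean = Mean[data1];

(* Estimated standard error of the mean *)
se = StandardDeviation[data1]/Sqrt[Length[data1]];

(* Interval from the estimate, its standard error and n-1 degrees of freedom *)
StudentTCI[mean, se, Length[data1] - 1]

(* Should agree with the interval computed directly from the data *)
MeanCI[data1]
```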