3.2.15 Statistical Distributions and Related Functions

There are standard Mathematica packages for evaluating functions related to common statistical distributions. Mathematica represents the statistical distributions themselves in the symbolic form name[param1, param2, ...], where param1, param2, ... are the parameters of the distribution. Functions such as Mean, which give properties of statistical distributions, take the symbolic representation of the distribution as an argument.
| BetaDistribution[α, β] | continuous beta distribution |
| CauchyDistribution[a, b] | Cauchy distribution with location parameter $a$ and scale parameter $b$ |
| ChiSquareDistribution[n] | chi-square distribution with $n$ degrees of freedom |
| ExponentialDistribution[λ] | exponential distribution with scale parameter $\lambda$ |
| ExtremeValueDistribution[α, β] | extreme value (Fisher-Tippett) distribution |
| FRatioDistribution[n1, n2] | $F$-ratio distribution with $n_1$ numerator and $n_2$ denominator degrees of freedom |
| GammaDistribution[α, λ] | gamma distribution with shape parameter $\alpha$ and scale parameter $\lambda$ |
| NormalDistribution[μ, σ] | normal (Gaussian) distribution with mean $\mu$ and standard deviation $\sigma$ |
| LaplaceDistribution[μ, β] | Laplace (double exponential) distribution with mean $\mu$ and variance parameter $\beta$ |
| LogNormalDistribution[μ, σ] | lognormal distribution with mean parameter $\mu$ and variance parameter $\sigma$ |
| LogisticDistribution[μ, β] | logistic distribution with mean $\mu$ and variance parameter $\beta$ |
| RayleighDistribution[σ] | Rayleigh distribution |
| StudentTDistribution[n] | Student $t$ distribution with $n$ degrees of freedom |
| UniformDistribution[min, max] | uniform distribution on the interval {min, max} |
| WeibullDistribution[α, β] | Weibull distribution |
Statistical distributions from the package Statistics`ContinuousDistributions`.

Most of the continuous statistical distributions commonly used are derived from the normal or Gaussian distribution NormalDistribution[μ, σ]. This distribution has probability density $\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. If you take independent random variables that follow any distribution with bounded variance, then the Central Limit Theorem shows that the distribution of the mean of a large number of these variables approaches a normal distribution.

The logarithmic normal distribution or lognormal distribution LogNormalDistribution[μ, σ] is the distribution followed by the exponential of a normal-distributed random variable. This distribution arises when many independent random variables are combined in a multiplicative fashion.

The chi-square distribution ChiSquareDistribution[n] is the distribution of the quantity $\sum_{i=1}^{n} x_i^2$, where the $x_i$ are independent random variables which follow a normal distribution with mean zero and unit variance. The chi-square distribution gives the distribution of variances of samples from a normal distribution.

The Student $t$ distribution StudentTDistribution[n] is the distribution followed by the ratio of a variable that follows the standard normal distribution to the square root of $\chi^2/n$, where $\chi^2$ is an independent variable that follows the chi-square distribution with $n$ degrees of freedom. The $t$ distribution characterizes the uncertainty in a mean when both the mean and variance are obtained from data.

The $F$-ratio distribution, $F$-distribution or variance ratio distribution FRatioDistribution[n1, n2] is the distribution of the ratio of two independent chi-square variables with $n_1$ and $n_2$ degrees of freedom, each divided by its number of degrees of freedom. The $F$-ratio distribution is used in the analysis of variance for comparing variances from different models.

The extreme value distribution ExtremeValueDistribution[α, β] is the limiting distribution for the smallest or largest values in large samples drawn from a variety of distributions, including the normal distribution.
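The relationships described above can be explored with a short session. This is only a sketch: it assumes the Statistics`ContinuousDistributions` package is available, as in the version of Mathematica this section describes, and the names mu, sigma, n and samples are illustrative rather than part of the package. It asks for the symbolic normal density, then checks by simulation that a sum of four squared standard normal variables has a mean near 4, the mean of the chi-square distribution with four degrees of freedom.

```mathematica
(* Load the package that defines the continuous distributions. *)
<<Statistics`ContinuousDistributions`

(* Symbolic density of a normal distribution; mu and sigma are placeholder symbols. *)
PDF[NormalDistribution[mu, sigma], x]

(* Mean and variance of a chi-square distribution with n degrees of freedom;
   mathematically these are n and 2 n. *)
Mean[ChiSquareDistribution[n]]
Variance[ChiSquareDistribution[n]]

(* Simulation: 1000 sums of 4 squared standard normal pseudorandom numbers. *)
samples = Table[
   Apply[Plus, Table[Random[NormalDistribution[0, 1]], {4}]^2],
   {1000}];

(* Sample mean of the simulated sums; this should be near 4. *)
Apply[Plus, samples]/Length[samples]
```

Because the simulated values are pseudorandom, the final result differs from run to run and only approximates the exact mean.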
| PDF[dist, x] | probability density function (frequency function) at $x$ |
| CDF[dist, x] | cumulative distribution function at $x$ |
| Quantile[dist, q] | $q$th quantile |
| Mean[dist] | mean |
| Variance[dist] | variance |
| StandardDeviation[dist] | standard deviation |
| Skewness[dist] | coefficient of skewness |
| Kurtosis[dist] | coefficient of kurtosis |
| CharacteristicFunction[dist, t] | characteristic function $\phi(t)$ |
| Random[dist] | pseudorandom number with specified distribution |
Functions of statistical distributions.

The cumulative distribution function (cdf) CDF[dist, x] is given by the integral of the probability density function for the distribution up to the point $x$. For the normal distribution, the cdf is usually denoted $\Phi(x)$. Cumulative distribution functions are used in evaluating probabilities for statistical hypotheses. For discrete distributions, the cdf is given by the sum of the probabilities up to the point $x$. The cdf is sometimes called simply the distribution function. The cdf at a particular point $x$ for a given distribution is often denoted $P(x \mid \theta_1, \theta_2, \ldots)$, where the $\theta_i$ are parameters of the distribution. The upper tail area is given in terms of the cdf by $Q(x \mid \theta_1, \theta_2, \ldots) = 1 - P(x \mid \theta_1, \theta_2, \ldots)$. Thus, for example, the upper tail area for a chi-square distribution with $\nu$ degrees of freedom is denoted $Q(\chi^2 \mid \nu)$ and is given by 1 - CDF[ChiSquareDistribution[nu], chi2].

The quantile Quantile[dist, q] is effectively the inverse of the cdf. It gives the value of x at which CDF[dist, x] reaches q. The median is given by Quantile[dist, 1/2]; quartiles, deciles and percentiles can also be expressed as quantiles. Quantiles are used in constructing confidence intervals for statistical parameter estimates.

The characteristic function CharacteristicFunction[dist, t] is given by $\phi(t) = \int p(x)\, e^{i t x}\, dx$, where $p(x)$ is the probability density for a distribution. The $n$th moment of a distribution about zero is given in terms of the $n$th derivative of $\phi$ by $i^{-n} \phi^{(n)}(0)$; this coincides with the $n$th central moment only when the mean of the distribution is zero.

Random[dist] gives pseudorandom numbers that follow the specified distribution. The numbers can be seeded as discussed in Section 3.2.4.

This loads the package which defines continuous statistical distributions.
In[1]:=
<<Statistics`ContinuousDistributions`

This represents a normal distribution with mean zero and unit variance.
In[2]:=
ndist = NormalDistribution[0, 1]
Out[2]=
NormalDistribution[0, 1]

Here is a symbolic result for the cumulative distribution function of the normal distribution.
In[3]:=
CDF[ndist, x]
Out[3]=
1/2 (1 + Erf[x/Sqrt[2]])

This gives the value of x at which the cdf of the normal distribution reaches the value 0.9.
In[4]:=
Quantile[ndist, 0.9] // N
Out[4]=
1.28155

Here is a list of five normal-distributed pseudorandom numbers.
In[5]:=
Table[ Random[ndist], {5} ]

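The connections between the cdf, the quantile, the upper tail area and the characteristic function can be checked in the same way. The following is a sketch under the assumption that the Statistics`ContinuousDistributions` package is available; the sample points 1.5 and 18.307 are arbitrary illustrative choices, and only functions listed in the table of functions above are used.

```mathematica
(* Load the package; harmless if it has already been loaded. *)
<<Statistics`ContinuousDistributions`

ndist = NormalDistribution[0, 1];

(* Quantile undoes CDF: this should return a value close to 1.5. *)
Quantile[ndist, CDF[ndist, 1.5]]

(* Lower quartile, median and upper quartile expressed as quantiles. *)
Map[Quantile[ndist, #] &, {1/4, 1/2, 3/4}] // N

(* Upper tail area Q(chi2 | nu) at chi2 = 18.307 with nu = 10 degrees of freedom.
   Standard chi-square tables list this point as the 5% critical value,
   so the result should be about 0.05. *)
1 - CDF[ChiSquareDistribution[10], 18.307]

(* Second moment about zero from the characteristic function: i^-2 phi''(0).
   For the standard normal distribution the mean is zero,
   so this equals the variance, 1. *)
I^(-2) D[CharacteristicFunction[ndist, t], {t, 2}] /. t -> 0
```

The tail-area form 1 - CDF[dist, x] is the quantity usually looked up in printed tables when testing a hypothesis, which is why it is worth comparing against a tabulated critical value.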
The package Statistics`DiscreteDistributions` defines the common discrete statistical distributions.

Most of the common discrete statistical distributions can be derived by considering a sequence of "trials", each with two possible outcomes, say "success" and "failure".

The Bernoulli distribution BernoulliDistribution[p] is the probability distribution for a single trial in which success, corresponding to value 1, occurs with probability $p$, and failure, corresponding to value 0, occurs with probability $1 - p$.

The binomial distribution BinomialDistribution[n, p] is the distribution of the number of successes that occur in $n$ independent trials when the probability for success in an individual trial is $p$. The probability of exactly $k$ successes is given by $\binom{n}{k} p^k (1 - p)^{n - k}$.

The negative binomial distribution NegativeBinomialDistribution[r, p] gives the distribution of the number of failures that occur in a sequence of trials before $r$ successes have occurred, given that the probability for success in each individual trial is $p$.

The geometric distribution GeometricDistribution[p] gives the distribution of the total number of trials before the first success occurs in a sequence of trials where the probability for success in each individual trial is $p$.

The hypergeometric distribution HypergeometricDistribution[n, nsucc, ntot] is used in place of the binomial distribution for experiments in which the $n$ trials correspond to sampling without replacement from a population of size $n_{tot}$ with $n_{succ}$ potential successes.

The discrete uniform distribution DiscreteUniformDistribution[n] represents an experiment with $n$ outcomes that occur with equal probabilities.
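As an illustration of the binomial case, the sketch below compares the package functions with the explicit formula. It assumes that the Statistics`DiscreteDistributions` package is available and that PDF, CDF, Mean and Variance accept discrete distributions in the same way as continuous ones; the name bdist and the values 10, 0.3 and 3 are arbitrary choices made for this example.

```mathematica
(* Load the package that defines the discrete distributions. *)
<<Statistics`DiscreteDistributions`

(* Ten independent trials, each with success probability 0.3. *)
bdist = BinomialDistribution[10, 0.3];

(* Probability of exactly 3 successes, from the package. *)
PDF[bdist, 3]

(* The same probability from the explicit formula
   Binomial[n, k] p^k (1 - p)^(n - k); the two results should agree. *)
Binomial[10, 3] 0.3^3 (1 - 0.3)^7

(* For a discrete distribution the cdf is the sum of the probabilities
   up to the given point, so these two results should also agree. *)
CDF[bdist, 3]
Sum[PDF[bdist, k], {k, 0, 3}]

(* Symbolic mean and variance; mathematically n p and n p (1 - p). *)
Mean[BinomialDistribution[n, p]]
Variance[BinomialDistribution[n, p]]
```

Comparing the package result with the hand-written formula is a quick way to confirm which parameter convention a distribution uses before relying on it.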