## 3.2.15 Statistical Distributions and Related Functions

There are standard Mathematica packages for evaluating functions related to common statistical distributions. Mathematica represents the statistical distributions themselves in the symbolic form name[param1, param2, ...], where param1, param2, ... are the parameters of the distribution. Functions such as Mean, which give properties of statistical distributions, take the symbolic representation of the distribution as an argument.

| Distribution | Description |
| --- | --- |
| BetaDistribution[α, β] | continuous beta distribution |
| CauchyDistribution[a, b] | Cauchy distribution with location parameter $a$ and scale parameter $b$ |
| ChiSquareDistribution[n] | chi-square distribution with $n$ degrees of freedom |
| ExponentialDistribution[λ] | exponential distribution with scale parameter $\lambda$ |
| ExtremeValueDistribution[α, β] | extreme value (Fisher-Tippett) distribution |
| FRatioDistribution[n1, n2] | $F$-ratio distribution with $n_1$ numerator and $n_2$ denominator degrees of freedom |
| GammaDistribution[α, λ] | gamma distribution with shape parameter $\alpha$ and scale parameter $\lambda$ |
| NormalDistribution[μ, σ] | normal (Gaussian) distribution with mean $\mu$ and standard deviation $\sigma$ |
| LaplaceDistribution[μ, β] | Laplace (double exponential) distribution with mean $\mu$ and variance parameter $\beta$ |
| LogNormalDistribution[μ, σ] | lognormal distribution with mean parameter $\mu$ and variance parameter $\sigma$ |
| LogisticDistribution[μ, β] | logistic distribution with mean $\mu$ and variance parameter $\beta$ |
| RayleighDistribution[σ] | Rayleigh distribution |
| StudentTDistribution[n] | Student $t$ distribution with $n$ degrees of freedom |
| UniformDistribution[min, max] | uniform distribution on the interval {min, max} |
| WeibullDistribution[α, β] | Weibull distribution |

Statistical distributions from the package Statistics`ContinuousDistributions`.

Most of the continuous statistical distributions commonly used are derived from the normal or Gaussian distribution NormalDistribution[μ, σ]. This distribution has probability density $\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. If you take independent random variables that all follow one distribution with bounded variance, then the Central Limit Theorem shows that the distribution of the mean of a large number of these variables approaches a normal distribution.
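As an illustration of the Central Limit Theorem statement above, suppose $X_1, \ldots, X_N$ are independent draws from one distribution with mean $m$ and finite variance $s^2$. These symbols are notation introduced here for the sketch and are not part of the package. Then, for large $N$:

$$
\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad \frac{\sqrt{N}\,\bigl(\bar{X}_N - m\bigr)}{s} \;\longrightarrow\; \text{a normal variable with mean } 0 \text{ and standard deviation } 1 .
$$

In other words, the sample mean is approximately distributed as NormalDistribution[m, s/Sqrt[N]], whatever the shape of the original distribution.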

The logarithmic normal distribution or lognormal distribution LogNormalDistribution[μ, σ] is the distribution followed by the exponential of a normal-distributed random variable. This distribution arises when many independent random variables are combined in a multiplicative fashion.
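The relation between the lognormal and normal distributions can be checked numerically. The following is a minimal sketch, assuming the package Statistics`ContinuousDistributions` is available and that the two parameters of LogNormalDistribution are the mean and standard deviation of the underlying normal variable; the sample point 2. is an arbitrary choice.

```mathematica
<<Statistics`ContinuousDistributions`

(* If X is normal, Exp[X] is lognormal, so the lognormal cdf at x
   should equal the normal cdf at Log[x]. *)
{CDF[LogNormalDistribution[0, 1], 2.],
 CDF[NormalDistribution[0, 1], Log[2.]]}
```

The two entries of the resulting list should agree to machine precision.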

The chi-square distribution ChiSquareDistribution[n] is the distribution of the quantity $\sum_{i=1}^{n} x_i^2$, where the $x_i$ are independent random variables which follow a normal distribution with mean zero and unit variance. The chi-square distribution gives the distribution of variances of samples from a normal distribution.
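A short worked derivation shows where the mean and variance of the chi-square distribution come from. It assumes the $x_i$ are independent standard normal variables, for which $E[x_i^2] = 1$ and $E[x_i^4] = 3$:

$$
E\!\left[\sum_{i=1}^{n} x_i^2\right] = n\,E[x_i^2] = n, \qquad
\operatorname{Var}\!\left[\sum_{i=1}^{n} x_i^2\right] = n\left(E[x_i^4] - E[x_i^2]^2\right) = n\,(3 - 1) = 2n .
$$

These values should match the results of Mean[ChiSquareDistribution[n]] and Variance[ChiSquareDistribution[n]].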

The Student $t$ distribution StudentTDistribution[n] is the distribution followed by the ratio of a variable that follows the standard normal distribution to $\sqrt{V/n}$, where $V$ is an independent variable that follows the chi-square distribution with $n$ degrees of freedom. The $t$ distribution characterizes the uncertainty in a mean when both the mean and variance are obtained from data.
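One way to see the extra uncertainty that the Student t distribution accounts for is to compare its quantiles with those of the normal distribution. This is a sketch only, assuming the package Statistics`ContinuousDistributions` is available; the choice of 5 degrees of freedom and the 0.975 level are illustrative, and tq and zq are names introduced here.

```mathematica
<<Statistics`ContinuousDistributions`

(* 97.5% points: Student t with 5 degrees of freedom versus the standard normal *)
tq = Quantile[StudentTDistribution[5], 0.975]
zq = Quantile[NormalDistribution[0, 1], 0.975]

(* the t quantile is the larger of the two (roughly 2.57 against 1.96) *)
tq > zq
```

As the number of degrees of freedom grows, the t quantile moves toward the normal one.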

The $F$-ratio distribution, $F$-distribution or variance ratio distribution FRatioDistribution[n1, n2] is the distribution of the ratio of two independent chi-square variables with $n_1$ and $n_2$ degrees of freedom, each divided by its number of degrees of freedom. The $F$-ratio distribution is used in the analysis of variance for comparing variances from different models.

The extreme value distribution ExtremeValueDistribution[α, β] is the limiting distribution for the smallest or largest values in large samples drawn from a variety of distributions, including the normal distribution.

| Function | Description |
| --- | --- |
| PDF[dist, x] | probability density function (frequency function) at $x$ |
| CDF[dist, x] | cumulative distribution function at $x$ |
| Quantile[dist, q] | $q$th quantile |
| Mean[dist] | mean |
| Variance[dist] | variance |
| StandardDeviation[dist] | standard deviation |
| Skewness[dist] | coefficient of skewness |
| Kurtosis[dist] | coefficient of kurtosis |
| CharacteristicFunction[dist, t] | characteristic function $\phi(t)$ |
| Random[dist] | pseudorandom number with specified distribution |

Functions of statistical distributions.

The cumulative distribution function (cdf) CDF[dist, x] is given by the integral of the probability density function for the distribution up to the point $x$. For the normal distribution, the cdf is usually denoted $\Phi(x)$. Cumulative distribution functions are used in evaluating probabilities for statistical hypotheses. For discrete distributions, the cdf is given by the sum of the probabilities up to the point $x$. The cdf is sometimes called simply the distribution function. The cdf at a particular point $x$ for a given distribution is often denoted $P(x \mid \theta_1, \theta_2, \ldots)$, where the $\theta_i$ are parameters of the distribution. The upper tail area is given in terms of the cdf by $Q(x \mid \theta_i) = 1 - P(x \mid \theta_i)$. Thus, for example, the upper tail area for a chi-square distribution with $\nu$ degrees of freedom is denoted $Q(\chi^2 \mid \nu)$ and is given by 1 - CDF[ChiSquareDistribution[nu], chi2].
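The upper tail area described above can be wrapped in a small helper. This is a minimal sketch, assuming the package Statistics`ContinuousDistributions` is available; upperTail is a hypothetical name introduced here, not a package function, and the test point 5.991 with 2 degrees of freedom is an illustrative choice.

```mathematica
<<Statistics`ContinuousDistributions`

(* upperTail is a hypothetical helper, not part of the package *)
upperTail[nu_, chi2_] := 1 - CDF[ChiSquareDistribution[nu], chi2]

(* for 2 degrees of freedom the cdf is 1 - Exp[-chi2/2],
   so the two values in this list should agree *)
{upperTail[2, 5.991], Exp[-5.991/2]}
```

Both values are close to 0.05, which is why 5.991 appears in tables as the 5% critical value for 2 degrees of freedom.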

The quantile Quantile[dist, q] is effectively the inverse of the cdf. It gives the value of $x$ at which CDF[dist, x] reaches $q$. The median is given by Quantile[dist, 1/2]; quartiles, deciles and percentiles can also be expressed as quantiles. Quantiles are used in constructing confidence intervals for statistical parameter estimates.
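The inverse relationship between Quantile and CDF can be illustrated with the quartiles of a standard normal distribution. This is a sketch under the assumption that the package Statistics`ContinuousDistributions` is available; the names ndist and quartiles are chosen here for illustration.

```mathematica
<<Statistics`ContinuousDistributions`

ndist = NormalDistribution[0, 1];

(* lower quartile, median and upper quartile *)
quartiles = Map[Quantile[ndist, #] &, {0.25, 0.5, 0.75}]

(* applying the cdf should recover the original probabilities *)
Map[CDF[ndist, #] &, quartiles]
```

The second list should reproduce {0.25, 0.5, 0.75} up to numerical rounding.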

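The characteristic function CharacteristicFunction[dist, t] is given by $\phi(t) = \int p(x) \exp(i t x)\,dx$, where $p(x)$ is the probability density for a distribution. The $n$th moment of a distribution about zero is given by the scaled $n$th derivative $i^{-n} \phi^{(n)}(0)$; for a distribution with mean zero this coincides with the $n$th central moment.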
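As a worked example of obtaining moments from derivatives of the characteristic function, consider a normal distribution with mean $\mu$ and standard deviation $\sigma$, whose characteristic function is the standard result $\phi(t) = \exp(i \mu t - \sigma^2 t^2 / 2)$. The derivatives at zero give the moments about the origin:

$$
\phi'(t) = (i\mu - \sigma^2 t)\,\phi(t) \;\Rightarrow\; i^{-1}\phi'(0) = \mu, \qquad
\phi''(t) = \left[(i\mu - \sigma^2 t)^2 - \sigma^2\right]\phi(t) \;\Rightarrow\; i^{-2}\phi''(0) = \mu^2 + \sigma^2 .
$$

Subtracting the square of the first moment from the second recovers the variance $\sigma^2$. When $\mu = 0$ the moments about the origin and the central moments are the same.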

Random[dist] gives pseudorandom numbers that follow the specified distribution. The numbers can be seeded as discussed in Section 3.2.4.

This loads the package which defines continuous statistical distributions.
 In[1]:=  <<Statistics`ContinuousDistributions`
This represents a normal distribution with mean zero and unit variance.
 In[2]:=  ndist = NormalDistribution[0, 1]
 Out[2]=  NormalDistribution[0, 1]
Here is a symbolic result for the cumulative distribution function of the normal distribution.
 In[3]:=  CDF[ndist, x]
 Out[3]=  (1 + Erf[x/Sqrt[2]])/2
This gives the value of x at which the cdf of the normal distribution reaches the value 0.9.
 In[4]:=  Quantile[ndist, 0.9] // N
 Out[4]=  1.28155
Here is a list of five normal-distributed pseudorandom numbers.
 In[5]:=  Table[ Random[ndist], {5} ]
 Out[5]=
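Because Random[dist] produces pseudorandom numbers, a run can be made repeatable by reseeding the generator before each use. The following is a minimal sketch, assuming the package Statistics`ContinuousDistributions` is available and that SeedRandom controls the generator used by Random[dist]; the seed 1234 and the names first and second are arbitrary choices made here.

```mathematica
<<Statistics`ContinuousDistributions`

ndist = NormalDistribution[0, 1];

(* reseed with the same value before each run *)
SeedRandom[1234]; first = Table[Random[ndist], {3}];
SeedRandom[1234]; second = Table[Random[ndist], {3}];

(* the two runs should contain identical numbers *)
first == second
```

Under these assumptions the comparison should give True, while omitting the second SeedRandom call would normally give different numbers.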

| Distribution | Description |
| --- | --- |
| BernoulliDistribution[p] | discrete Bernoulli distribution with mean $p$ |
| BinomialDistribution[n, p] | binomial distribution for $n$ trials with probability $p$ |
| DiscreteUniformDistribution[n] | discrete uniform distribution with $n$ states |
| GeometricDistribution[p] | discrete geometric distribution with mean $1/p - 1$ |
| HypergeometricDistribution[n, nsucc, ntot] | hypergeometric distribution for $n$ trials with $n_{succ}$ successes in a population of size $n_{tot}$ |
| NegativeBinomialDistribution[r, p] | negative binomial distribution of the number of failures before $r$ successes, with success probability $p$ |
| PoissonDistribution[mu] | Poisson distribution with mean $\mu$ |

Statistical distributions from the package Statistics`DiscreteDistributions`.

Most of the common discrete statistical distributions can be derived by considering a sequence of "trials", each with two possible outcomes, say "success" and "failure".

The Bernoulli distribution BernoulliDistribution[p] is the probability distribution for a single trial in which success, corresponding to value 1, occurs with probability $p$, and failure, corresponding to value 0, occurs with probability $1 - p$.

The binomial distribution BinomialDistribution[n, p] is the distribution of the number of successes that occur in $n$ independent trials when the probability for success in an individual trial is $p$. The probability of exactly $k$ successes is given by $\binom{n}{k} p^k (1-p)^{n-k}$.
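A concrete case makes the binomial formula easier to check by hand. This is a sketch, assuming the package Statistics`DiscreteDistributions` is available; the values of 10 trials and success probability 1/2 are illustrative, and bdist is a name introduced here.

```mathematica
<<Statistics`DiscreteDistributions`

bdist = BinomialDistribution[10, 1/2];

(* probability of exactly 3 successes:
   Binomial[10, 3] (1/2)^3 (1/2)^7 = 120/1024 = 15/128 *)
PDF[bdist, 3]

(* mean n p = 5 and variance n p (1 - p) = 5/2 *)
{Mean[bdist], Variance[bdist]}
```

Using exact rational input such as 1/2 rather than 0.5 keeps the results as exact fractions.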

The negative binomial distribution NegativeBinomialDistribution[r, p] gives the distribution of the number of failures that occur in a sequence of trials before $r$ successes have occurred, given that the probability for success in each individual trial is $p$.

The geometric distribution GeometricDistribution[p] gives the distribution of the total number of trials before the first success occurs in a sequence of trials where the probability for success in each individual trial is $p$.
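The mean of the geometric distribution follows directly from this description. As a sketch, write $k$ for the number of trials before the first success and $q = 1 - p$ for the failure probability (both symbols are introduced here for the derivation):

$$
P(k) = p\,q^{k} \quad (k = 0, 1, 2, \ldots), \qquad
\sum_{k=0}^{\infty} k\,p\,q^{k} = p\,\frac{q}{(1-q)^2} = \frac{1-p}{p} = \frac{1}{p} - 1 .
$$

For example, with $p = 1/4$ there are on average 3 failed trials before the first success.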

The hypergeometric distribution HypergeometricDistribution[n, nsucc, ntot] is used in place of the binomial distribution for experiments in which the $n$ trials correspond to sampling without replacement from a population of size $n_{tot}$ with $n_{succ}$ potential successes.
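Sampling without replacement leads to the standard counting formula for the hypergeometric probabilities. In the sketch below, $n$ is the number of draws, $n_{succ}$ the number of potential successes and $n_{tot}$ the population size; this notation is assumed here for illustration:

$$
P(k) = \frac{\binom{n_{succ}}{k}\,\binom{n_{tot} - n_{succ}}{n - k}}{\binom{n_{tot}}{n}}, \qquad
\text{e.g. } n = 2,\; n_{succ} = 4,\; n_{tot} = 52: \quad
P(2) = \frac{\binom{4}{2}\binom{48}{0}}{\binom{52}{2}} = \frac{6}{1326} = \frac{1}{221} .
$$

The example is the probability of drawing two aces in two cards from a standard deck. With replacement, the binomial distribution would instead give $(4/52)^2 = 1/169$.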

The discrete uniform distribution DiscreteUniformDistribution[n] represents an experiment with $n$ outcomes that occur with equal probabilities.
