This is documentation for Mathematica 3, which was
based on an earlier version of the Wolfram Language.

3.2.14 Statistical Distributions and Related Functions










There are standard Mathematica packages for evaluating functions related to common statistical distributions. Mathematica represents the statistical distributions themselves in the symbolic form name[$param_1$, $param_2$, ... ], where the $param_i$ are parameters for the distributions. Functions such as Mean, which give properties of statistical distributions, take the symbolic representation of the distribution as an argument.
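As a minimal sketch of this convention, the input below builds a symbolic normal distribution and passes it to two property functions. The symbols mu and sigma are placeholders chosen here for illustration, and the package is loaded first so that the sketch stands on its own.

```mathematica
(* Load the continuous distributions package used throughout this section. *)
<<Statistics`ContinuousDistributions`

(* A distribution is an ordinary symbolic expression; mu and sigma are placeholder symbols. *)
dist = NormalDistribution[mu, sigma]

(* Property functions take that symbolic expression as their argument. *)
Mean[dist]
Variance[dist]
```

For a normal distribution parameterized by its mean and standard deviation, one would expect these to give mu and sigma^2 respectively.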


Statistical distributions from the package Statistics`ContinuousDistributions`.































Most of the continuous statistical distributions commonly used are derived from the normal or Gaussian distribution NormalDistribution[$\mu$, $\sigma$], with mean $\mu$ and standard deviation $\sigma$. This distribution has probability density $\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. If you take independent random variables that follow any distribution with bounded variance, then the Central Limit Theorem shows that the mean of a large number of these variables always approaches a normal distribution.
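The Central Limit Theorem statement can be explored numerically. The sketch below is an illustration rather than part of the package: it averages uniform pseudorandom numbers and compares the result with what a normal approximation predicts. The sample sizes 100 and 1000 are arbitrary choices.

```mathematica
(* Each entry is the mean of 100 uniform pseudorandom numbers on [0, 1]. *)
means = Table[Apply[Plus, Table[Random[], {100}]]/100, {1000}];

(* Sample mean and sample variance of those 1000 means. *)
m = Apply[Plus, means]/Length[means]
v = Apply[Plus, (means - m)^2]/(Length[means] - 1)

(* Fraction of the means lying within one theoretical standard deviation of 1/2. *)
N[Count[means, u_ /; Abs[u - 0.5] < Sqrt[1./1200]]/Length[means]]
```

A uniform variable on [0, 1] has mean 1/2 and variance 1/12, so m should be near 0.5 and v near 1/1200. If the means are approximately normal, the last fraction should be roughly 0.68.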
The logarithmic normal distribution or lognormal distribution LogNormalDistribution[$\mu$, $\sigma$] is the distribution followed by the exponential of a normal-distributed random variable. This distribution arises when many independent random variables are combined in a multiplicative fashion.
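A quick way to see the relationship with the normal distribution is to compare cumulative probabilities. This sketch assumes that the two parameters of LogNormalDistribution are the mean and standard deviation of the underlying normal variable. The evaluation point 2.5 is arbitrary.

```mathematica
<<Statistics`ContinuousDistributions`

(* Probability that the exponential of a standard normal variable is at most 2.5. *)
CDF[LogNormalDistribution[0, 1], 2.5]

(* The same event, expressed for the normal variable itself. *)
CDF[NormalDistribution[0, 1], Log[2.5]]
```

Under that assumption the two results should agree.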
The chi-square distribution ChiSquareDistribution[n] is the distribution of the quantity $\sum_{i=1}^{n} x_i^2$, where the $x_i$ are independent random variables which follow a normal distribution with mean zero and unit variance. The chi-square distribution gives the distribution of suitably scaled variances of samples from a normal distribution.
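The definition as a sum of squares can be checked by simulation. In this sketch the number of degrees of freedom (4) and the number of samples (2000) are arbitrary illustrative choices.

```mathematica
<<Statistics`ContinuousDistributions`

n = 4;   (* degrees of freedom, chosen arbitrarily *)

(* Each sample is a sum of n squared standard normal pseudorandom numbers. *)
samples = Table[
    Apply[Plus, Table[Random[NormalDistribution[0, 1]], {n}]^2],
    {2000}];

(* Sample mean, to compare with the mean of the chi-square distribution. *)
Apply[Plus, samples]/Length[samples]
Mean[ChiSquareDistribution[n]]
```

The sample mean should come out close to the mean reported for ChiSquareDistribution[n], which for a chi-square distribution equals the number of degrees of freedom.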
The Student $t$ distribution StudentTDistribution[n] is the distribution followed by the ratio of a variable that follows the normal distribution to the square root of one that follows the chi-square distribution with $n$ degrees of freedom. The $t$ distribution characterizes the uncertainty in a mean when both the mean and variance are obtained from data.
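The description above leaves the normalization implicit. Written out in the standard textbook form, with $Z$ and $X$ as names introduced here for a standard normal variable and an independent chi-square variable with $n$ degrees of freedom, the ratio is:

```latex
\[
  T = \frac{Z}{\sqrt{X / n}},
  \qquad Z \sim N(0, 1), \quad X \sim \chi^2_n, \quad Z \text{ and } X \text{ independent}.
\]
```

The division by $n$ inside the square root is what makes $T$ approach a standard normal variable as $n$ grows.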
The $F$-ratio distribution, $F$-distribution or variance ratio distribution FRatioDistribution[$n_1$, $n_2$] is the distribution of the ratio of two chi-square variables with $n_1$ and $n_2$ degrees of freedom. The $F$-ratio distribution is used in the analysis of variance for comparing variances from different models.
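In the standard textbook definition each chi-square variable is first divided by its number of degrees of freedom. With $X_1$ and $X_2$ as names introduced here for the two independent chi-square variables, and $n_1$, $n_2$ for their degrees of freedom, the ratio reads:

```latex
\[
  F = \frac{X_1 / n_1}{X_2 / n_2},
  \qquad X_1 \sim \chi^2_{n_1}, \quad X_2 \sim \chi^2_{n_2}, \quad X_1 \text{ and } X_2 \text{ independent}.
\]
```

Each of the two scaled terms behaves like a variance estimate, which is why the ratio appears when variances are compared.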
The extreme value distribution ExtremeValueDistribution[$\alpha$, $\beta$] is the limiting distribution for the smallest or largest values in large samples drawn from a variety of distributions, including the normal distribution.


Functions of statistical distributions.

















































The cumulative distribution function (cdf) CDF[dist, x] is given by the integral of the probability density function for the distribution up to the point $x$. For the normal distribution, the cdf is usually denoted $\Phi(x)$. Cumulative distribution functions are used in evaluating probabilities for statistical hypotheses. For discrete distributions, the cdf is given by the sum of the probabilities up to the point $x$. The cdf is sometimes called simply the distribution function. The cdf at a particular point $x$ for a given distribution is often denoted $P(x \mid \theta_1, \theta_2, \ldots)$, where the $\theta_i$ are parameters of the distribution. The upper tail area is given in terms of the cdf by $Q(x \mid \theta_1, \theta_2, \ldots) = 1 - P(x \mid \theta_1, \theta_2, \ldots)$. Thus, for example, the upper tail area for a chi-square distribution with $\nu$ degrees of freedom is denoted $Q(\chi^2 \mid \nu)$ and is given by 1 - CDF[ChiSquareDistribution[nu], chi2].
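As a concrete instance of the upper tail area, the sketch below evaluates it for a chi-square distribution. The values 5 and 11.07 are illustrative choices: 11.07 is approximately the tabulated 5% critical value for 5 degrees of freedom.

```mathematica
<<Statistics`ContinuousDistributions`

nu = 5;         (* degrees of freedom, an illustrative choice *)
chi2 = 11.07;   (* observed chi-square value, an illustrative choice *)

(* Upper tail area: probability of a value at least as large as chi2. *)
q = 1 - CDF[ChiSquareDistribution[nu], chi2]
```

The result should be close to 0.05, the probability of seeing a chi-square value this large or larger by chance.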
The quantile Quantile[dist, q] is effectively the inverse of the cdf. It gives the value of $x$ at which CDF[dist, x] reaches $q$. The median is given by Quantile[dist, 1/2]; quartiles, deciles and percentiles can also be expressed as quantiles. Quantiles are used in constructing confidence intervals for statistical parameter estimates.
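The inverse relationship between Quantile and CDF can be checked directly. This sketch uses the standard normal distribution, and the probability levels 0.975 and 0.025 are illustrative choices of the kind used for a central 95% interval.

```mathematica
<<Statistics`ContinuousDistributions`

ndist = NormalDistribution[0, 1];

(* The point below which 97.5% of the probability lies. *)
x975 = Quantile[ndist, 0.975]

(* Applying the cdf to that point should recover 0.975, up to rounding. *)
CDF[ndist, x975]

(* The median, which is zero here by symmetry. *)
Quantile[ndist, 1/2]

(* End points of a central 95% interval. *)
{Quantile[ndist, 0.025], Quantile[ndist, 0.975]}
```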
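The characteristic function CharacteristicFunction[dist, t] is given by $\phi(t) = \int p(x) \exp(i t x)\, dx$, where $p(x)$ is the probability density for a distribution. The $n$th moment of a distribution is given by $i^{-n} \phi^{(n)}(0)$, where $\phi^{(n)}$ is the $n$th derivative of $\phi$. This moment is taken about the origin, so it coincides with the central moment only when the mean is zero.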
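The link between derivatives of the characteristic function and moments follows from differentiating under the integral sign. Here $\phi(t)$ denotes the characteristic function and $p(x)$ the probability density. The steps below assume that the moments involved exist.

```latex
\begin{align*}
  \phi(t) &= \int p(x)\, e^{i t x}\, dx, \\
  \phi^{(n)}(t) &= \int (i x)^n\, p(x)\, e^{i t x}\, dx, \\
  \phi^{(n)}(0) &= i^n \int x^n\, p(x)\, dx = i^n\, E[x^n], \\
  E[x^n] &= i^{-n}\, \phi^{(n)}(0).
\end{align*}
```

For the standard normal distribution, $\phi(t) = e^{-t^2/2}$, so $\phi''(0) = -1$ and the second moment is $i^{-2} \cdot (-1) = 1$, in agreement with unit variance.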
Random[dist] gives pseudorandom numbers that follow the specified distribution. The numbers can be seeded as discussed in Section 3.2.3.
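A short sketch of seeding, assuming that Random[dist] draws on the same generator that SeedRandom resets. The seed value 1234 is arbitrary.

```mathematica
<<Statistics`ContinuousDistributions`

ndist = NormalDistribution[0, 1];

(* Reset the generator, then draw three normal-distributed numbers. *)
SeedRandom[1234]; a = Table[Random[ndist], {3}];

(* Reset to the same seed and draw again. *)
SeedRandom[1234]; b = Table[Random[ndist], {3}];

(* Under the stated assumption the two sequences are identical. *)
a == b
```

Reseeding in this way is useful when a simulation has to be reproduced exactly.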

  • This loads the package which defines continuous statistical distributions.
  • In[1]:= <<Statistics`ContinuousDistributions`

  • This represents a normal distribution with mean zero and unit variance.
  • In[2]:= ndist = NormalDistribution[0, 1]

    Out[2]= NormalDistribution[0, 1]

  • Here is a symbolic result for the cumulative distribution function of the normal distribution.
  • In[3]:= CDF[ndist, x]

    Out[3]= (1 + Erf[x/Sqrt[2]])/2







  • This gives the value of $x$ at which the cdf of the normal distribution reaches the value 0.9.
  • In[4]:= Quantile[ndist, 0.9] // N

    Out[4]= 1.28155

  • Here is a list of five normal-distributed pseudorandom numbers.
  • In[5]:= Table[ Random[ndist], {5} ]

    Out[5]=


Statistical distributions from the package Statistics`DiscreteDistributions`.











































Most of the common discrete statistical distributions can be derived by considering a sequence of "trials", each with two possible outcomes, say "success" and "failure".
The Bernoulli distribution BernoulliDistribution[p] is the probability distribution for a single trial in which success, corresponding to value 1, occurs with probability $p$, and failure, corresponding to value 0, occurs with probability $1 - p$.
The binomial distribution BinomialDistribution[n, p] is the distribution of the number of successes that occur in $n$ independent trials when the probability for success in an individual trial is $p$. The probability of $k$ successes is given by $\binom{n}{k} p^k (1-p)^{n-k}$.
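The binomial probabilities can be compared with the explicit formula $\binom{n}{k} p^k (1-p)^{n-k}$ for $k$ successes in $n$ trials. In this sketch the values n = 10 and p = 0.3 are illustrative choices.

```mathematica
<<Statistics`DiscreteDistributions`

n = 10; p = 0.3;   (* illustrative values *)
bdist = BinomialDistribution[n, p];

(* Difference between the package result and the explicit formula for each k. *)
Table[PDF[bdist, k] - Binomial[n, k] p^k (1 - p)^(n - k), {k, 0, n}]

(* The probabilities over all possible numbers of successes add up to one. *)
Sum[PDF[bdist, k], {k, 0, n}]

(* The mean number of successes, n p. *)
Mean[bdist]
```

The differences in the first result should all be zero up to numerical rounding.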
The negative binomial distribution NegativeBinomialDistribution[r, p] gives the distribution of the number of failures that occur in a sequence of trials before $r$ successes have occurred, given that the probability for success in each individual trial is $p$.
The geometric distribution GeometricDistribution[p] gives the distribution of the total number of trials before the first success occurs in a sequence of trials where the probability for success in each individual trial is $p$.
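Read literally, the descriptions above make the geometric distribution the special case of the negative binomial distribution in which one waits for a single success. The sketch below checks this, assuming the package follows those descriptions. The value p = 1/4 is an arbitrary choice.

```mathematica
<<Statistics`DiscreteDistributions`

p = 1/4;   (* arbitrary success probability *)

(* Mean number of trials before the first success. *)
Mean[GeometricDistribution[p]]

(* Waiting for r = 1 success describes the same experiment. *)
Mean[NegativeBinomialDistribution[1, p]]

(* Point-by-point comparison of the probabilities for small counts. *)
Table[PDF[GeometricDistribution[p], k] -
      PDF[NegativeBinomialDistribution[1, p], k], {k, 0, 5}]
```

If the two definitions coincide as described, the two means should be equal and every entry of the final list should be zero.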
The hypergeometric distribution HypergeometricDistribution[$n$, $n_{succ}$, $n_{tot}$] is used in place of the binomial distribution for experiments in which the $n$ trials correspond to sampling without replacement from a population of size $n_{tot}$ with $n_{succ}$ potential successes.
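For reference, the standard probability formula for sampling without replacement can be written as follows. The notation is introduced here for illustration: $n$ is the number of trials, $n_{tot}$ the population size, $n_{succ}$ the number of potential successes in the population, and $k$ the number of successes drawn.

```latex
\[
  P(k) = \frac{\dbinom{n_{succ}}{k} \dbinom{n_{tot} - n_{succ}}{n - k}}{\dbinom{n_{tot}}{n}}
\]
```

When $n_{tot}$ is much larger than $n$, drawing without replacement barely changes the success probability from one trial to the next, and these probabilities approach the binomial ones with $p = n_{succ} / n_{tot}$.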
The discrete uniform distribution DiscreteUniformDistribution[n] represents an experiment with $n$ outcomes that occur with equal probabilities.