Continuous Distributions
The functions described here are among the most commonly used continuous statistical distributions. You can compute their densities, means, variances, and other related properties. The distributions themselves are represented in the symbolic form name[param1, param2, ...]. Functions such as Mean, which give properties of statistical distributions, take the symbolic representation of the distribution as an argument.
"Discrete Distributions" describes many discrete statistical distributions.
Distributions related to the normal distribution.
The lognormal distribution LogNormalDistribution[μ, σ] is the distribution followed by the exponential of a normally distributed random variable. This distribution arises when many independent random variables are combined in a multiplicative fashion. The half-normal distribution HalfNormalDistribution[θ] is proportional to the distribution NormalDistribution[0, 1/(θ Sqrt[2/π])] limited to the domain $[0, \infty)$.
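The proportionality between the half-normal and normal densities can be worked out directly. The following is a sketch that uses only the relation stated above; the symbol $\sigma$ for the standard deviation of the matching normal distribution and $f_{\mathrm{half}}$ for the half-normal density are notation introduced here for illustration.

```latex
\sigma = \frac{1}{\theta\sqrt{2/\pi}}
\quad\Longrightarrow\quad
\frac{1}{2\sigma^{2}} = \frac{\theta^{2}}{\pi},
\qquad
\frac{1}{\sigma\sqrt{2\pi}} = \frac{\theta\sqrt{2/\pi}}{\sqrt{2\pi}} = \frac{\theta}{\pi}

% Normal density with mean 0 and this sigma:
\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^{2}/(2\sigma^{2})}
  = \frac{\theta}{\pi}\,e^{-\theta^{2}x^{2}/\pi}

% Restricting to x >= 0 keeps half of the probability mass,
% so renormalizing doubles the density:
f_{\mathrm{half}}(x) = \frac{2\theta}{\pi}\,e^{-\theta^{2}x^{2}/\pi},
\qquad x \ge 0
```

The factor of 2 is the constant of proportionality: the normal density is symmetric about zero, so exactly half of its mass lies on $[0, \infty)$.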
The inverse Gaussian distribution InverseGaussianDistribution[μ, λ], sometimes called the Wald distribution, is the distribution of first passage times in Brownian motion with positive drift.
Distributions related to normally distributed samples.
If $X_1, \ldots, X_\nu$ are independent normal random variables with unit variance and mean zero, then $\sum_{i=1}^{\nu} X_i^2$ has a $\chi^2$ distribution with $\nu$ degrees of freedom. If a normal variable is standardized by subtracting its mean and dividing by its standard deviation, then the sum of squares of such quantities follows this distribution. The $\chi^2$ distribution is most typically used when describing the variance of normal samples.
A variable that has a Student $t$ distribution can also be written as a function of normal random variables. Let $X$ and $Z$ be independent random variables, where $X$ has a standard normal distribution and $Z$ is a $\chi^2$ variable with $\nu$ degrees of freedom. In this case, $X/\sqrt{Z/\nu}$ has a $t$ distribution with $\nu$ degrees of freedom. The Student $t$ distribution is symmetric about the vertical axis, and characterizes the ratio of a normal variable to its standard deviation. When $\nu=1$, the $t$ distribution is the same as the Cauchy distribution.
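The special case $\nu=1$ can be checked by comparing densities. The following is a minimal sketch; the variable names tPDF and cauchyPDF and the sample points are hypothetical choices made for illustration, and the Cauchy parameters 0 and 1 are assumed to be the location and scale of the standard form.

```mathematica
(* Densities of the Student t distribution with 1 degree of freedom
   and of the Cauchy distribution with location 0 and scale 1. *)
tPDF = PDF[StudentTDistribution[1], x];
cauchyPDF = PDF[CauchyDistribution[0, 1], x];

(* If the two distributions coincide, the difference should
   simplify to 0 for real x. *)
FullSimplify[tPDF - cauchyPDF, Element[x, Reals]]

(* Numerical spot check at a few hypothetical sample points;
   each entry should be zero up to rounding error. *)
Table[(tPDF - cauchyPDF) /. x -> x0, {x0, {-2.5, 0., 0.7, 4.}}]
```

Both densities reduce to $1/(\pi(1+x^2))$, which is why the difference vanishes.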
The $F$-ratio distribution is the distribution of the ratio of two independent $\chi^2$ variables, each divided by its own degrees of freedom. It is commonly used when comparing the variances of two populations in hypothesis testing.
Distributions that are derived from normal distributions with nonzero means are called noncentral distributions.
The sum of the squares of $\nu$ normally distributed random variables with variance $\sigma^2=1$ and nonzero means follows a noncentral $\chi^2$ distribution NoncentralChiSquareDistribution[ν, λ]. The noncentrality parameter $\lambda$ is the sum of the squares of the means of the random variables in the sum. Note that in various places in the literature, $\lambda/2$ or $\sqrt{\lambda}$ is used as the noncentrality parameter.
The noncentral Student $t$ distribution NoncentralStudentTDistribution[ν, δ] describes the ratio $X/\sqrt{\chi_\nu^2/\nu}$, where $\chi_\nu^2$ is a central $\chi^2$ random variable with $\nu$ degrees of freedom, and $X$ is an independent normally distributed random variable with variance $\sigma^2=1$ and mean $\delta$.
The noncentral $F$-ratio distribution NoncentralFRatioDistribution[n, m, λ] is the distribution of the ratio of $\chi_n^2(\lambda)/n$ to $\chi_m^2/m$, where $\chi_n^2(\lambda)$ is a noncentral $\chi^2$ random variable with noncentrality parameter $\lambda$ and $n$ degrees of freedom and $\chi_m^2$ is a central $\chi^2$ random variable with $m$ degrees of freedom.
Piecewise linear distributions.
The triangular distribution TriangularDistribution[{a, b}, c] is a triangular distribution for $a<X<b$ with maximum probability density at $c$, where $a<c<b$. If $c$ is $(a+b)/2$, TriangularDistribution[{a, b}, c] is the symmetric triangular distribution TriangularDistribution[{a, b}].
The uniform distribution UniformDistribution[{min, max}], commonly referred to as the rectangular distribution, characterizes a random variable that is equally likely to take any value between min and max. An example of a uniformly distributed random variable is the location of a point chosen randomly on a line from min to max.
Other continuous statistical distributions.
If $X$ is uniformly distributed on $[-\pi, \pi]$, then the random variable $\tan(X)$ follows a Cauchy distribution CauchyDistribution[a, b], with $a=0$ and $b=1$.
When $\alpha=n/2$ and $\beta=2$, the gamma distribution GammaDistribution[α, β] describes the distribution of a sum of squares of $n$ unit normal random variables. This form of the gamma distribution is called a $\chi^2$ distribution with $n$ degrees of freedom. When $\alpha=1$, the gamma distribution takes on the form of the exponential distribution ExponentialDistribution[λ] with $\lambda=1/\beta$, often used in describing the waiting time between events.
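Both special cases of the gamma distribution can be spot-checked numerically by comparing densities. This is a sketch only: the names dof, scale, pts, chiSqDiff and expDiff and the chosen values are hypothetical, and the rate of the exponential distribution is assumed to be the reciprocal of the gamma scale parameter.

```mathematica
pts = {0.5, 1., 3., 8.};   (* hypothetical positive test points *)

(* Chi-square with dof degrees of freedom versus
   gamma with alpha = dof/2 and beta = 2. *)
dof = 5;
chiSqDiff = Table[
   PDF[ChiSquareDistribution[dof], x0] -
    PDF[GammaDistribution[dof/2, 2], x0],
   {x0, pts}]

(* Gamma with alpha = 1 and scale beta versus
   exponential with rate 1/beta. *)
scale = 3;
expDiff = Table[
   PDF[GammaDistribution[1, scale], x0] -
    PDF[ExponentialDistribution[1/scale], x0],
   {x0, pts}]
```

Every entry of both lists should be zero up to rounding error if the identifications hold.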
When $X_1$ and $X_2$ have independent gamma distributions with equal scale parameters, the random variable $X_1/(X_1+X_2)$ follows the beta distribution BetaDistribution[α, β], where $\alpha$ and $\beta$ are the shape parameters of the gamma variables $X_1$ and $X_2$, respectively.
The $\chi$ distribution ChiDistribution[ν] is followed by the square root of a $\chi^2$ random variable with $\nu$ degrees of freedom. For $\nu=1$, the $\chi$ distribution is identical to HalfNormalDistribution[θ] with $\theta=\sqrt{\pi/2}$. For $\nu=2$, the $\chi$ distribution is identical to the Rayleigh distribution RayleighDistribution[σ] with $\sigma=1$. For $\nu=3$, the $\chi$ distribution is identical to the Maxwell-Boltzmann distribution MaxwellDistribution[σ] with $\sigma=1$.
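The three special cases of the $\chi$ distribution can be compared numerically. The sketch below uses a hypothetical helper, pdfDiff, and hypothetical test points; it only evaluates densities with PDF and makes no claim about how the system represents these distributions internally.

```mathematica
testPts = {0.3, 1., 2.2};   (* hypothetical positive test points *)

(* Hypothetical helper: difference of two densities at the test points. *)
pdfDiff[d1_, d2_] := Table[PDF[d1, x0] - PDF[d2, x0], {x0, testPts}];

(* 1 degree of freedom: half-normal with theta = Sqrt[Pi/2]. *)
pdfDiff[ChiDistribution[1], HalfNormalDistribution[Sqrt[Pi/2]]]

(* 2 degrees of freedom: Rayleigh with sigma = 1. *)
pdfDiff[ChiDistribution[2], RayleighDistribution[1]]

(* 3 degrees of freedom: Maxwell-Boltzmann with sigma = 1. *)
pdfDiff[ChiDistribution[3], MaxwellDistribution[1]]
```

Each call should return a list of values that are zero up to rounding error.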
The Laplace distribution LaplaceDistribution[μ, β] is the distribution of the difference of two independent random variables with identical exponential distributions. The logistic distribution LogisticDistribution[μ, β] is frequently used in place of the normal distribution when a distribution with longer tails is desired.
The Pareto distribution ParetoDistribution[k, α] may be used to describe income, with k representing the minimum income possible.
The Weibull distribution WeibullDistribution[α, β] is commonly used in engineering to describe the lifetime of an object. The extreme value distribution ExtremeValueDistribution[α, β] is the limiting distribution for the largest values in large samples drawn from a variety of distributions, including the normal distribution. The limiting distribution for the smallest values in such samples is the Gumbel distribution, GumbelDistribution[α, β]. The names extreme value and Gumbel distribution are sometimes used interchangeably because the distributions of the largest and smallest extreme values are related by a linear change of variable. The extreme value distribution is also sometimes referred to as the log-Weibull distribution because of logarithmic relationships between an extreme value-distributed random variable and a properly shifted and scaled Weibull-distributed random variable.
| PDF[dist,x] | probability density function at x |
| CDF[dist,x] | cumulative distribution function at x |
| InverseCDF[dist,q] | the value of x such that CDF[dist, x] equals q |
| Quantile[dist,q] | qth quantile |
| Mean[dist] | mean |
| Variance[dist] | variance |
| StandardDeviation[dist] | standard deviation |
| Skewness[dist] | coefficient of skewness |
| Kurtosis[dist] | coefficient of kurtosis |
| CharacteristicFunction[dist,t] | characteristic function $\phi(t)$ |
| ExpectedValue[f,dist] | expected value of the pure function f in dist |
| ExpectedValue[f[x],dist,x] | expected value of f[x] for x in dist |
| RandomReal[dist] | pseudorandom number with specified distribution |
| RandomReal[dist,dims] | pseudorandom array with dimensionality dims, and elements from the specified distribution |
Functions of statistical distributions.
The cumulative distribution function (cdf) at x is given by the integral of the probability density function (pdf) up to x. The pdf can therefore be obtained by differentiating the cdf (perhaps in a generalized sense). In this package the distributions are represented in symbolic form. PDF[dist, x] evaluates the density at x if x is a numerical value, and otherwise leaves the function in symbolic form. Similarly, CDF[dist, x] gives the cumulative distribution. Domain[dist] gives the domain of PDF[dist, x] and CDF[dist, x].
The inverse cdf InverseCDF[dist, q] gives the value of x at which CDF[dist, x] reaches q. The median is given by InverseCDF[dist, 1/2]. Quartiles, deciles and percentiles are particular values of the inverse cdf. Inverse cdfs are used in constructing confidence intervals for statistical parameters. InverseCDF[dist, q] and Quantile[dist, q] are equivalent for continuous distributions.
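The relation between the cdf and its inverse can be illustrated with a short sketch. The exponential distribution with rate 2 and the names expDist and median are hypothetical choices made for this example.

```mathematica
expDist = ExponentialDistribution[2];   (* hypothetical example distribution *)

(* Median: the point at which the cdf reaches 1/2. *)
median = InverseCDF[expDist, 1/2]

(* Substituting the median back into the cdf should recover 1/2. *)
CDF[expDist, median]

(* Quartiles as particular values of the inverse cdf,
   computed here with Quantile. *)
Table[Quantile[expDist, q], {q, {1/4, 1/2, 3/4}}]
```

For this distribution the cdf is $1 - e^{-2x}$ for $x \ge 0$, so the median works out to $\log(2)/2$.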
The mean Mean[dist] is the expectation of the random variable distributed according to dist and is usually denoted by $\mu$. The mean is given by $\int x f(x)\,dx$, where $f(x)$ is the pdf of the distribution. The variance Variance[dist] is given by $\int (x-\mu)^2 f(x)\,dx$. The square root of the variance is called the standard deviation, and is usually denoted by $\sigma$.
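As a worked instance of these integrals, consider the gamma distribution with shape 3 and scale 1. The sketch assumes the usual shape-scale form of the gamma density, which for these parameters is $f(x) = x^2 e^{-x}/2$ on $x \ge 0$, and uses the identity $\int_0^\infty x^k e^{-x}\,dx = k!$.

```latex
\begin{aligned}
\mu &= \int_0^\infty x \cdot \frac{x^{2} e^{-x}}{2}\,dx
     = \frac{3!}{2} = 3, \\
\int_0^\infty x^{2} \cdot \frac{x^{2} e^{-x}}{2}\,dx
    &= \frac{4!}{2} = 12, \\
\int_0^\infty (x-\mu)^{2} f(x)\,dx
    &= 12 - \mu^{2} = 12 - 9 = 3, \\
\sigma &= \sqrt{3}.
\end{aligned}
```

The third line uses the expansion $\int (x-\mu)^2 f(x)\,dx = \int x^2 f(x)\,dx - \mu^2$, which follows from the definition of $\mu$ and the fact that $f$ integrates to 1.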
The Skewness[dist] and Kurtosis[dist] functions give shape statistics summarizing the asymmetry and the peakedness of a distribution, respectively. Skewness is given by $\frac{1}{\sigma^3}\int (x-\mu)^3 f(x)\,dx$ and kurtosis is given by $\frac{1}{\sigma^4}\int (x-\mu)^4 f(x)\,dx$.
The characteristic function CharacteristicFunction[dist, t] is given by $\phi(t)=\int f(x)\exp(itx)\,dx$. In the discrete case, $\phi(t)=\sum f(x)\exp(itx)$. Each distribution has a unique characteristic function, which is sometimes used instead of the pdf to define a distribution.
The expected value ExpectedValue[g, dist] of a function g is given by $\int f(x) g(x)\,dx$. In the discrete case, the expected value of g is given by $\sum f(x) g(x)$. ExpectedValue[g[x], dist, x] is equivalent to ExpectedValue[g, dist].
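The defining integral can be evaluated directly with Integrate and PDF. This sketch computes the expected value of a hypothetical function g for a hypothetical exponential distribution; it illustrates the formula rather than the internals of ExpectedValue, and the names rateDist, g and secondMoment are introduced only for this example.

```mathematica
rateDist = ExponentialDistribution[2];   (* hypothetical example distribution *)
g[x_] := x^2;                            (* hypothetical function of interest *)

(* Integral of f(x) g(x) over the support of the distribution,
   which is x >= 0 for the exponential distribution. *)
secondMoment = Integrate[PDF[rateDist, x]*g[x], {x, 0, Infinity}]

(* Cross-check against the identity E[X^2] = Var[X] + Mean[X]^2. *)
Variance[rateDist] + Mean[rateDist]^2
```

Per the text above, ExpectedValue[g, rateDist] is the corresponding built-in call, so it would be expected to agree with these two values.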
RandomReal[dist] gives pseudorandom numbers from the specified distribution.
This gives a symbolic representation of the gamma distribution with $\alpha=3$ and $\beta=1$.
Out[1]= [output not preserved]
Here is the cumulative distribution function evaluated at 10.
Out[2]= [output not preserved]
This is the cumulative distribution function. It is given in terms of the built-in function GammaRegularized.
Out[3]= [output not preserved]
Here is a plot of the cumulative distribution function.
Out[4]= [plot not preserved]
This is a pseudorandom array with elements distributed according to the gamma distribution.
Out[5]= [output not preserved]
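The inputs that produced the five results described above did not survive in this copy. The following is a plausible reconstruction based only on the captions. The variable name gdist, the plot range, and the array length are assumptions, not the original inputs.

```mathematica
(* Symbolic representation of the gamma distribution
   with shape 3 and scale 1. *)
gdist = GammaDistribution[3, 1]

(* Cumulative distribution function evaluated at 10;
   wrap the call in N[...] for a decimal approximation. *)
CDF[gdist, 10]

(* Cumulative distribution function as a function of the symbol x. *)
CDF[gdist, x]

(* Plot of the cdf; the range {x, 0, 10} is an assumed choice. *)
Plot[CDF[gdist, x], {x, 0, 10}]

(* Pseudorandom array from the distribution;
   the length 5 is an assumed choice. *)
RandomReal[gdist, 5]
```

For these parameters the cdf at 10 equals $1 - 61e^{-10}$, which gives a quick check on the second result. In later versions of the system, RandomVariate is the documented function for sampling from a distribution, so RandomVariate[gdist, 5] may be the safer call there.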