How to | Work with Statistical Distributions

Statistical distributions have applications in many fields, including the biological, social, and physical sciences. The Wolfram Language represents statistical distributions as symbolic objects. You can obtain properties, results, and random numbers for hundreds of built-in or custom distributions by applying built-in functions to these objects.

Statistical distributions are simply Wolfram Language objects:
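As a minimal sketch, assuming a normal distribution with symbolic parameters (the names mu and sigma are chosen here for illustration), a distribution object can be created and inspected like any other expression:

```wolfram
(* Assumption: a normal distribution with symbolic mean mu and standard deviation sigma *)
dist = NormalDistribution[mu, sigma]

(* The object stays symbolic; its head is the distribution name *)
Head[dist]
```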

You can use the PDF function to get the probability density function for the distribution:
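For illustration, assuming the same kind of symbolic normal distribution (the parameter names mu and sigma and the variable x are placeholders), the density could be requested as follows:

```wolfram
(* Assumption: normal distribution with symbolic parameters mu and sigma *)
(* PDF[dist, x] gives the probability density evaluated at x *)
density = PDF[NormalDistribution[mu, sigma], x]
```

The result is an ordinary symbolic expression in mu, sigma, and x.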

You can get numeric results by inserting numbers for the distribution parameters and the variable.

Compute the density for numeric values of the parameters and the variable:
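One way this could look, assuming a normal distribution and the illustrative values 0, 1, and 2 (none of which come from the text above):

```wolfram
(* Assumption: normal distribution; mu, sigma, and x are placeholder names *)
density = PDF[NormalDistribution[mu, sigma], x];

(* Substitute numbers with replacement rules; the machine-precision 2. forces a numeric result *)
numericDensity = density /. {mu -> 0, sigma -> 1, x -> 2.}

(* Equivalent direct form *)
PDF[NormalDistribution[0, 1], 2.]
```

Both forms should give the standard normal density at 2, roughly 0.054.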

Symbolic results can be used in other functions as well.

Here the density function is plotted for specified values of the parameters:
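A sketch of such a plot, assuming a standard normal distribution and a plotting range chosen only for illustration:

```wolfram
(* Assumption: mean 0, standard deviation 1, plotted over -4 <= x <= 4 *)
densityPlot = Plot[PDF[NormalDistribution[0, 1], x], {x, -4, 4}]
```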

You can directly obtain common properties such as the mean, variance, cumulative distribution function (CDF), and characteristic function using built-in functions.

This is the mean for a binomial distribution for 100 trials with success probability 0.3:
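The computation described here can be sketched with Mean applied to a BinomialDistribution object; the variance is included as an additional illustration:

```wolfram
(* Binomial distribution with n = 100 trials and success probability 0.3 *)
binomial = BinomialDistribution[100, 0.3];

mean = Mean[binomial]          (* n p, that is 100 * 0.3 *)
variance = Variance[binomial]  (* n p (1 - p), that is 100 * 0.3 * 0.7 *)
```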

Like a PDF, a characteristic function uniquely defines a distribution.

Obtain the general formula for the characteristic function of a Cauchy distribution:
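A possible form of this computation, assuming symbolic location and scale parameters named a and b and a variable t (all three names are placeholders):

```wolfram
(* Assumption: Cauchy distribution with location a and scale b *)
cf = CharacteristicFunction[CauchyDistribution[a, b], t]
```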

You can also compute more general expected values, which give the value expected for a given function applied to a random variable from a given distribution. The r-th raw moment is the expected value of the random variable raised to the r-th power.

Obtain the r-th raw moment for a Poisson-distributed random variable:
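As a sketch, assuming a Poisson distribution with symbolic mean mu and a random variable named x (both names are placeholders), the moment can be requested through Expectation or, equivalently, through Moment:

```wolfram
(* Assumption: Poisson distribution with mean mu; r is a symbolic moment order *)
generalMoment = Expectation[x^r, x \[Distributed] PoissonDistribution[mu]]

(* The same quantity through Moment, here for the concrete order r = 2 *)
secondMoment = Moment[PoissonDistribution[mu], 2]
```

For the second order, the result should simplify to mu + mu^2, the variance mu plus the squared mean.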

You can generate random numbers from distributions using RandomVariate.

These are 10 numbers simulated from a distribution with 15 degrees of freedom:
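A sketch of such a simulation, assuming a chi-square distribution; the text gives only the degrees of freedom, so the choice of distribution family is an assumption here:

```wolfram
(* Assumption: chi-square distribution; only the 15 degrees of freedom and the sample size 10 come from the text *)
sample = RandomVariate[ChiSquareDistribution[15], 10]
```

The numbers differ from run to run unless SeedRandom is called first.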

A geometric distribution describes the number of failures before the first success when each trial has the same probability of success.

Simulate 20 numbers from a geometric distribution with a specified success probability parameter:
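For illustration, assuming a success probability of 0.3 (a value chosen here, not taken from the text):

```wolfram
(* Assumption: success probability 0.3; each value counts failures before the first success *)
geometricSample = RandomVariate[GeometricDistribution[0.3], 20]
```

The values are non-negative integers, since the first trial may already succeed.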

You could even visualize a sample against a theoretical distribution because plots of data and functions can be combined.

Here gamma-distributed numbers are generated and stored to the symbol data:
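A minimal sketch of this step, assuming shape 5, scale 8, and a sample size of 500 (all three values are illustrative assumptions):

```wolfram
(* Optional: fix the seed so the sample is reproducible *)
SeedRandom[1];

(* Assumption: GammaDistribution[5, 8] (shape 5, scale 8) and 500 draws *)
data = RandomVariate[GammaDistribution[5, 8], 500];

(* Quick check of the sample size *)
Length[data]
```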

You can use Histogram to generate a histogram of these values on a probability density scale:

You can visualize the theoretical density function using Plot:

You can then use Show to display the two graphics together:
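The histogram, density plot, and combined graphic might be sketched together as follows. The gamma parameters, sample size, and plot range are assumptions made for illustration, and the data is regenerated so the example runs on its own:

```wolfram
(* Assumption: shape 5, scale 8, 500 draws; the seed only makes the run reproducible *)
SeedRandom[1];
data = RandomVariate[GammaDistribution[5, 8], 500];

(* Histogram scaled so that bar heights estimate the probability density *)
hist = Histogram[data, Automatic, "PDF"];

(* Theoretical density of the same distribution; the range 0 to 120 is chosen by eye *)
curve = Plot[PDF[GammaDistribution[5, 8], x], {x, 0, 120}, PlotStyle -> Thick];

(* Overlay the two graphics *)
combined = Show[hist, curve]
```

Show takes its plot range from the first graphic, so listing the histogram first keeps all bars visible.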

You might also want to estimate parameter values assuming a dataset follows a particular distribution. For instance, you could find the maximum likelihood estimate for parameters by using FindDistributionParameters:
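A sketch of the estimation, assuming gamma-distributed data generated with illustrative parameters and a gamma model with unknown parameters named a and b (placeholder names):

```wolfram
(* Assumption: sample data drawn from GammaDistribution[5, 8] *)
SeedRandom[1];
data = RandomVariate[GammaDistribution[5, 8], 500];

(* Maximum likelihood is the default estimator; the result is a list of rules for a and b *)
params = FindDistributionParameters[data, GammaDistribution[a, b]]
```

With 500 draws the estimates should land reasonably close to the generating values, though not exactly on them.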

The results can be packaged up into a distribution object using EstimatedDistribution:

The log-likelihood can also be computed using LogLikelihood with the estimated distribution:
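As a sketch, again assuming illustrative gamma-distributed data and placeholder parameter names a and b, the fitted distribution and its log-likelihood could be obtained like this:

```wolfram
(* Assumption: sample data drawn from GammaDistribution[5, 8] *)
SeedRandom[1];
data = RandomVariate[GammaDistribution[5, 8], 500];

(* Fit the parameters and return a distribution object with the estimates filled in *)
estimated = EstimatedDistribution[data, GammaDistribution[a, b]]

(* Log-likelihood of the data under the fitted distribution *)
logLik = LogLikelihood[estimated, data]
```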

The log-likelihood value is mainly meaningful in comparison with the log-likelihood values for other parameter values. Creating a ContourPlot near the obtained values can provide a qualitative comparison. Points on a given contour have the same log-likelihood.

Here a white point is placed at the optimal point:
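One way to sketch such a plot, assuming illustrative gamma-distributed data and a window of 20 percent around the estimates. The names aHat and bHat, the window size, and the point styling are all choices made here:

```wolfram
(* Assumption: sample data drawn from GammaDistribution[5, 8] *)
SeedRandom[1];
data = RandomVariate[GammaDistribution[5, 8], 500];

(* Maximum likelihood estimates for the two gamma parameters *)
params = FindDistributionParameters[data, GammaDistribution[a, b]];
{aHat, bHat} = {a, b} /. params;

(* Log-likelihood surface near the estimates, with a white point at the optimum *)
contours = ContourPlot[
  LogLikelihood[GammaDistribution[a, b], data],
  {a, 0.8 aHat, 1.2 aHat}, {b, 0.8 bHat, 1.2 bHat},
  Epilog -> {White, PointSize[Large], Point[{aHat, bHat}]}]
```

The plot may take a moment to render, because the log-likelihood is summed over the whole sample at every grid point.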