# BenfordDistribution

BenfordDistribution[b] represents a Benford distribution with base parameter b.

# Details

• BenfordDistribution is also known as the first-digit distribution.
• The probability for integer value $x$ in a Benford distribution is proportional to $\log(x+1) - \log(x)$ for $1 \le x \le b-1$, and is zero otherwise.
• BenfordDistribution allows $b$ to be any integer such that $b \ge 2$.
• BenfordDistribution can be used with such functions as Mean, CDF, and RandomVariate.
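
As a worked illustration of the normalization, the probability of an integer $x$ is proportional to $\log(x+1) - \log(x)$ on $1 \le x \le b-1$. The sum of these weights telescopes to $\log(b)$, which gives the usual form of Benford's law in base $b$ (the symbol $X$ for the random variable is introduced here for illustration only):

$$\sum_{x=1}^{b-1} \bigl[\log(x+1) - \log(x)\bigr] = \log(b) - \log(1) = \log(b)$$

$$P(X = x) = \frac{\log(x+1) - \log(x)}{\log(b)} = \log_b\!\left(1 + \frac{1}{x}\right), \qquad x = 1, \ldots, b-1$$

For $b = 10$ and $x = 1$ this gives $\log_{10}(2) \approx 0.301$.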

# Background & Context

• BenfordDistribution[b] represents a discrete statistical distribution defined at integer values $x = 1, 2, \ldots, b-1$, where the parameter b is an integer known as the base parameter satisfying $b \ge 2$. The Benford distribution is sometimes referred to as the first-digit distribution. It has a discrete probability density function (PDF) with monotonically decreasing values.
• The Benford distribution is associated with American physicist Frank Benford, whose eponymous "Benford's law" (sometimes also referred to as the Newcomb–Benford law in honor of Canadian-American mathematician Simon Newcomb, who published the result some 50 years before Benford) serves as the cornerstone for the distribution. Benford's law states that for base $b = 10$, the probability that the first digit of numbers in many classes of real-world datasets is 1 is not given by 1/9 ≈ 11.1% (as would be naively expected) but is actually closer to 30%. Furthermore, the overall probability that a digit $d$ occurs as an initial digit is approximately equal to $\log_{10}(1 + 1/d)$. (The result generalizes to other bases using the change of base formula for logarithms.) Benford's law has been observed to occur empirically across a large number of unrelated datasets, including catalogs of physical and mathematical constants, stock prices, population counts, and death rates. In general, Benford's distribution best approximates distributions of values spanning multiple orders of magnitude. It has also been extended to look at the frequency of second and later digits and at leading sequences of digits.
• RandomVariate can be used to give one or more machine- or arbitrary-precision (the latter via the WorkingPrecision option) pseudorandom variates from a Benford distribution. Distributed[x,BenfordDistribution[b]], written more concisely as x \[Distributed] BenfordDistribution[b], can be used to assert that a random variable x is distributed according to a Benford distribution. Such an assertion can then be used in functions such as Probability, NProbability, Expectation, and NExpectation.
• The probability density and cumulative distribution functions may be given using PDF[BenfordDistribution[b],x] and CDF[BenfordDistribution[b],x]. The mean, median, variance, raw moments, and central moments may be computed using Mean, Median, Variance, Moment, and CentralMoment, respectively. These quantities can be visualized using DiscretePlot.
• DistributionFitTest can be used to test if a given dataset is consistent with a Benford distribution, EstimatedDistribution to estimate a Benford parametric distribution from given data, and FindDistributionParameters to fit data to a Benford distribution. ProbabilityPlot can be used to generate a plot of the CDF of given data against the CDF of a symbolic Benford distribution, and QuantilePlot to generate a plot of the quantiles of given data against the quantiles of a symbolic Benford distribution.
• TransformedDistribution can be used to represent a transformed Benford distribution, CensoredDistribution to represent the distribution of values censored between upper and lower values, and TruncatedDistribution to represent the distribution of values truncated between upper and lower values. CopulaDistribution can be used to build higher-dimensional distributions that contain a Benford distribution, and ProductDistribution can be used to compute a joint distribution with independent component distributions involving Benford distributions.
• BenfordDistribution is related to a number of other probability distributions, including ZipfDistribution and ParetoDistribution. Other distributions are related to BenfordDistribution through Benford's law. For example, empirical testing of random numbers distributed according to ExponentialDistribution, WeibullDistribution, GammaDistribution, LogLogisticDistribution, and ExponentialPowerDistribution shows adherence to Benford's law, whereas random numbers generated according to UniformDistribution, HalfNormalDistribution, NormalDistribution, and GumbelDistribution do not. Several other distributions have relationships with Benford's law depending on their input parameters. For example, random numbers generated according to satisfy Benford's law, though the adherence decreases for as ν increases. Similarly, samples of FRatioDistribution[n,m] random variates obey Benford's law for small values of n and m, with decreasing adherence as n and m increase, and random variates distributed according to have increased agreement with Benford's law for large values of μ and ν (with perturbations of ν having a greater effect than perturbations of μ).
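
Some of the functions named in this list can be combined as in the following minimal sketch. Base 10, the event `x <= 3`, and the integrand `x^2` are illustrative assumptions and do not come from the original page.

```wolfram
(* Benford's law in base 10: first-digit probabilities Log10[1 + 1/d] *)
Table[{d, N[Log10[1 + 1/d]]}, {d, 1, 9}]

(* The same probabilities taken from the built-in distribution *)
Table[{d, N[PDF[BenfordDistribution[10], d]]}, {d, 1, 9}]

(* Assert that x is Benford distributed, then query a probability and an expectation *)
Probability[x <= 3, x \[Distributed] BenfordDistribution[10]]
NExpectation[x^2, x \[Distributed] BenfordDistribution[10]]
```

The two tables should agree entry by entry, which is a quick way to confirm that the distribution encodes the first-digit law.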

# Examples


## Basic Examples (5)

Probability mass function:
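
A minimal sketch of such an input. The symbolic form uses the base b directly; the numeric base 10 in the plot is an illustrative choice.

```wolfram
(* Probability mass function for a symbolic base b *)
PDF[BenfordDistribution[b], x]

(* Plot of the mass function in base 10 over the digits 1 through 9 *)
DiscretePlot[PDF[BenfordDistribution[10], x], {x, 1, 9}]
```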

Cumulative distribution function:
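
One possible way to write this, again assuming base 10 for the plot. The option `ExtentSize -> Right` draws each value as a step extending to the right, which suits a discrete CDF.

```wolfram
(* Cumulative distribution function for a symbolic base b *)
CDF[BenfordDistribution[b], x]

(* Step plot of the CDF in base 10 *)
DiscretePlot[CDF[BenfordDistribution[10], x], {x, 0, 10}, ExtentSize -> Right]
```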

Mean:
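
A short sketch, with base 10 used as an assumed numeric example:

```wolfram
(* Mean as an expression in the base b *)
Mean[BenfordDistribution[b]]

(* Numeric mean of the first digit in base 10 *)
N[Mean[BenfordDistribution[10]]]
```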

Variance:
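
The variance can be requested in the same way; base 10 is again only an illustrative value.

```wolfram
(* Variance as an expression in the base b *)
Variance[BenfordDistribution[b]]

(* Numeric variance of the first digit in base 10 *)
N[Variance[BenfordDistribution[10]]]
```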

Median:
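
A sketch under the same assumption of base 10 for the numeric case:

```wolfram
(* Median for a symbolic base b *)
Median[BenfordDistribution[b]]

(* In base 10 the CDF at x is Log10[x + 1]; it first reaches 1/2 at x = 3,
   because Log10[3] < 1/2 <= Log10[4] *)
Median[BenfordDistribution[10]]
```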

## Scope (6)

Generate a sample of pseudorandom numbers from a Benford distribution:
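
For instance, assuming base 10 and a sample size of 10^4 (both arbitrary choices, as is the name `data`):

```wolfram
(* Draw 10^4 pseudorandom first digits from a base-10 Benford distribution *)
data = RandomVariate[BenfordDistribution[10], 10^4];

(* Inspect the first few values *)
Take[data, 10]
```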

Compare its histogram to the PDF:
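
A self-contained sketch of the comparison. The sample is regenerated here so the snippet runs on its own; base 10 and the sample size are assumptions.

```wolfram
(* Sample of first digits from a base-10 Benford distribution *)
data = RandomVariate[BenfordDistribution[10], 10^4];

(* Histogram with unit-width bins, scaled as a PDF, overlaid with the exact mass function *)
Show[
  Histogram[data, {1}, "PDF"],
  DiscretePlot[PDF[BenfordDistribution[10], x], {x, 1, 9}, PlotStyle -> PointSize[Medium]]
]
```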

Distribution parameter estimation:

Estimate the distribution parameters from sample data:
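
A possible form of this step. The true base 12, the sample size, and the names `sample` and `edist` are illustrative assumptions.

```wolfram
(* Synthetic sample with a known base, used as stand-in data *)
sample = RandomVariate[BenfordDistribution[12], 10^3];

(* Estimate the base parameter b from the sample *)
edist = EstimatedDistribution[sample, BenfordDistribution[b]]
```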

Compare the density histogram of the sample with the PDF of the estimated distribution:
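
The comparison might look as follows. The snippet repeats the sampling and estimation steps so that it is runnable on its own; base 12 and the variable names are assumptions.

```wolfram
(* Synthetic sample and estimated distribution *)
sample = RandomVariate[BenfordDistribution[12], 10^3];
edist = EstimatedDistribution[sample, BenfordDistribution[b]];

(* Density histogram of the sample against the PDF of the estimate *)
Show[
  Histogram[sample, {1}, "PDF"],
  DiscretePlot[PDF[edist, x], {x, 1, Max[sample]}, PlotStyle -> PointSize[Medium]]
]
```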

Skewness is defined for $b > 2$:
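
A sketch that requests the skewness symbolically and then tabulates it over a range of bases. The range 3 to 20 is an arbitrary choice; base 2 is left out because the distribution is then concentrated on the single value 1 and has zero variance.

```wolfram
(* Skewness for a symbolic base b *)
Skewness[BenfordDistribution[b]]

(* Skewness as a function of the base *)
DiscretePlot[Skewness[BenfordDistribution[b]], {b, 3, 20}]
```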

Kurtosis is defined for $b > 2$:
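
The kurtosis can be explored in the same manner. The range of bases is again an assumption, chosen to avoid the degenerate case of base 2.

```wolfram
(* Kurtosis for a symbolic base b *)
Kurtosis[BenfordDistribution[b]]

(* Kurtosis as a function of the base *)
DiscretePlot[Kurtosis[BenfordDistribution[b]], {b, 3, 20}]
```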

Hazard function:
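
A minimal sketch, assuming base 10 for the plot. The plotted range stops at 8 as a precaution, since the survival probability vanishes at the largest digit and the hazard may not be finite there.

```wolfram
(* Hazard function for a symbolic base b *)
HazardFunction[BenfordDistribution[b], x]

(* Hazard function in base 10, away from the upper end of the support *)
DiscretePlot[HazardFunction[BenfordDistribution[10], x], {x, 1, 8}]
```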

Quantile function:
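
One way to write this, with base 10 and the quartile levels chosen purely for illustration:

```wolfram
(* Quantile function for a symbolic base b and level q *)
Quantile[BenfordDistribution[b], q]

(* Quartiles of the base-10 distribution *)
Quantile[BenfordDistribution[10], {1/4, 1/2, 3/4}]

(* The quantile function is a step function of q *)
Plot[Quantile[BenfordDistribution[10], q], {q, 0, 1}]
```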

## Applications (3)

Benford's distribution approximates distributions of values spanning multiple orders of magnitude. Consider a sample from a heavy-tailed distribution:

Find the order of magnitude between minimum and maximum:

Extract first digits:

Compare the histogram with the PDF of the corresponding BenfordDistribution:
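
The heavy-tailed steps above can be sketched as follows. The choice of `ParetoDistribution[1, 1/2]`, the sample size, and the helper `firstDigit` are assumptions made for this illustration and are not taken from the original page.

```wolfram
(* Assumed heavy-tailed example: Pareto distribution with minimum 1 and shape 1/2 *)
heavy = RandomVariate[ParetoDistribution[1, 1/2], 10^4];

(* Number of orders of magnitude spanned by the sample *)
Log10[Max[heavy]/Min[heavy]]

(* Hypothetical helper: leading decimal digit of a positive real number *)
firstDigit[v_] := First[First[RealDigits[v]]]
digits = firstDigit /@ heavy;

(* First-digit histogram against the base-10 Benford mass function *)
Show[
  Histogram[digits, {1}, "PDF"],
  DiscretePlot[PDF[BenfordDistribution[10], x], {x, 1, 9}, PlotStyle -> PointSize[Medium]]
]
```

With a sample that spans many orders of magnitude, the histogram would be expected to track the Benford probabilities closely.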

Now consider a sample from a light-tailed distribution:

Find the order of magnitude between minimum and maximum:

Compare the histogram with the PDF of the corresponding BenfordDistribution:
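
A corresponding sketch for the light-tailed case. `NormalDistribution[50, 5]` is an assumed example whose values stay positive in practice and span less than one order of magnitude; the helper `firstDigit` is hypothetical and is redefined so the snippet runs on its own.

```wolfram
(* Assumed light-tailed example, concentrated around 50 *)
light = RandomVariate[NormalDistribution[50, 5], 10^4];

(* Number of orders of magnitude spanned by the sample *)
Log10[Max[light]/Min[light]]

(* Hypothetical helper: leading decimal digit of a positive real number *)
firstDigit[v_] := First[First[RealDigits[v]]]
digits = firstDigit /@ light;

(* First-digit histogram against the base-10 Benford mass function *)
Show[
  Histogram[digits, {1}, "PDF"],
  DiscretePlot[PDF[BenfordDistribution[10], x], {x, 1, 9}, PlotStyle -> PointSize[Medium]]
]
```

Here the first digits cluster on a few values, so the histogram should differ visibly from the Benford probabilities.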

Check whether the population of the largest cities in the United States follows a Benford distribution:

The population of the 100 largest cities does not follow a Benford distribution very well:

Consider physical constants:

Find the first digits, not taking units into account:

The first digits are not uniformly distributed; it is more likely that their distribution follows Benford's law:

Check if the hypothesis can be rejected:
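
The testing step can be sketched independently of the data source. In the snippet below, `digits` is a synthetic stand-in for the extracted first digits, not actual physical-constant data, and the sample size is arbitrary; the uniform alternative on the digits 1 through 9 is included for contrast.

```wolfram
(* Synthetic stand-in for a list of extracted first digits; replace with real data *)
digits = RandomVariate[BenfordDistribution[10], 500];

(* p-value for the hypothesis that the digits follow a base-10 Benford distribution *)
DistributionFitTest[digits, BenfordDistribution[10]]

(* Verbal conclusion of the same test *)
DistributionFitTest[digits, BenfordDistribution[10], "TestConclusion"]

(* For contrast: test against a uniform distribution on the digits 1 through 9 *)
DistributionFitTest[digits, DiscreteUniformDistribution[{1, 9}]]
```

A small p-value indicates that the tested hypothesis can be rejected at the corresponding significance level.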