BenfordDistribution

BenfordDistribution[b]
represents a Benford distribution with base parameter b.

Details

Background

  • BenfordDistribution[b] represents a discrete statistical distribution defined at integer values $x = 1, 2, \ldots, b - 1$, where the parameter b is an integer known as the base parameter satisfying $b > 2$. The Benford distribution is sometimes referred to as the first-digit distribution. Its probability density function (PDF) is discrete and monotonically decreasing.
  • The Benford distribution is associated with American physicist Frank Benford, whose eponymous "Benford's law" (sometimes also referred to as the Newcomb-Benford law in honor of Canadian-American mathematician Simon Newcomb, who published the result some 50 years before Benford) serves as the cornerstone for the distribution. Benford's law states that for base $b = 10$, the probability that the first digit of numbers in many classes of real-world datasets is 1 is not given by 1/9 ≈ 11.1% (as would be naively expected) but is actually closer to 30%. Furthermore, the overall probability that a digit $d$ occurs as an initial digit is approximately equal to $\log_{10}(1 + 1/d)$. (The result generalizes to other bases using the change of base formula for logarithms.) Benford's law has been observed to occur empirically across a large number of unrelated datasets, including catalogs of physical and mathematical constants, stock prices, population counts, and death rates. In general, Benford's distribution best approximates distributions of values spanning multiple orders of magnitude. It has also been extended to look at the frequency of second and later digits and at leading sequences of digits.
  • RandomVariate can be used to give one or more machine- or arbitrary-precision (the latter via the WorkingPrecision option) pseudorandom variates from a Benford distribution. Distributed[x,BenfordDistribution[b]], written more concisely as x \[Distributed] BenfordDistribution[b], can be used to assert that a random variable x is distributed according to a Benford distribution. Such an assertion can then be used in functions such as Probability, NProbability, Expectation, and NExpectation.
  • The probability density and cumulative distribution functions may be given using PDF[BenfordDistribution[b],x] and CDF[BenfordDistribution[b],x]. The mean, median, variance, raw moments, and central moments may be computed using Mean, Median, Variance, Moment, and CentralMoment, respectively. These quantities can be visualized using DiscretePlot.
  • DistributionFitTest can be used to test if a given dataset is consistent with a Benford distribution, EstimatedDistribution to estimate a Benford parametric distribution from given data, and FindDistributionParameters to fit data to a Benford distribution. ProbabilityPlot can be used to generate a plot of the CDF of given data against the CDF of a symbolic Benford distribution, and QuantilePlot to generate a plot of the quantiles of given data against the quantiles of a symbolic Benford distribution.
  • TransformedDistribution can be used to represent a transformed Benford distribution, CensoredDistribution to represent the distribution of values censored between upper and lower values, and TruncatedDistribution to represent the distribution of values truncated between upper and lower values. CopulaDistribution can be used to build higher-dimensional distributions that contain a Benford distribution, and ProductDistribution can be used to compute a joint distribution with independent component distributions involving Benford distributions.
  • BenfordDistribution is related to a number of other probability distributions, including ZipfDistribution and ParetoDistribution. Other distributions are related to BenfordDistribution through Benford's law. For example, empirical testing of random numbers distributed according to ExponentialDistribution, WeibullDistribution, GammaDistribution, LogLogisticDistribution, and ExponentialPowerDistribution shows adherence to Benford's law, whereas random numbers generated according to UniformDistribution, HalfNormalDistribution, NormalDistribution, and GumbelDistribution do not. Several other distributions have relationships with Benford's law depending on their input parameters. For example, random numbers generated according to ChiSquareDistribution[1] satisfy Benford's law, though the adherence decreases for ChiSquareDistribution[ν] as ν increases. Similarly, samples of FRatioDistribution[n,m] random variates obey Benford's law for small values of n and m, with decreasing adherence as n and m increase, and random variates distributed according to LogNormalDistribution[μ,σ] have increased agreement with Benford's law for large values of μ and σ (with perturbations of σ having a greater effect than perturbations of μ).
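
The relationships described above can be sketched in the Wolfram Language. The following is a minimal illustration rather than a definitive recipe. It assumes base 10, a sample size of 10^4, and LogNormalDistribution[0, 3] as one example of data spanning many orders of magnitude. The variable names (benford, builtIn, sample, data, leadingDigits) are illustrative only.

```wolfram
(* First-digit probabilities from Benford's law in base 10 *)
benford = Table[Log10[1 + 1/d], {d, 1, 9}];

(* The same probabilities taken from the built-in distribution *)
builtIn = Table[PDF[BenfordDistribution[10], d], {d, 1, 9}];
Max[Abs[N[benford - builtIn]]]  (* expected to be 0 up to rounding *)

(* Probability that the leading digit is at most 3, using Distributed *)
Probability[x <= 3, x \[Distributed] BenfordDistribution[10]]

(* Relative frequencies of the digits in a pseudorandom sample *)
sample = RandomVariate[BenfordDistribution[10], 10^4];
{#1, N[#2/Length[sample]]} & @@@ Sort[Tally[sample]]

(* Leading digits of log-normal variates, tested against Benford *)
data = RandomVariate[LogNormalDistribution[0, 3], 10^4];
leadingDigits = RealDigits[#][[1, 1]] & /@ data;
DistributionFitTest[leadingDigits, BenfordDistribution[10]]
```

The final line returns a p-value. A value that is not small suggests the leading digits are consistent with a Benford distribution, although the result varies between pseudorandom samples.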

Examples

Basic Examples  (5)

Probability density function:

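A minimal sketch of evaluating and plotting the PDF. The base 10 used for the numerical value and the plot range are illustrative assumptions:

```wolfram
(* Symbolic PDF for a general base b *)
PDF[BenfordDistribution[b], x]

(* Numerical value at x = 1 in base 10, approximately 0.30103 *)
N[PDF[BenfordDistribution[10], 1]]

(* Plot the probabilities over the support 1, ..., 9 *)
DiscretePlot[PDF[BenfordDistribution[10], x], {x, 1, 9}, ExtentSize -> 1/2]
```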

Cumulative distribution function:

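A minimal sketch of evaluating and plotting the CDF, again assuming base 10 for the numerical parts:

```wolfram
(* Symbolic CDF for a general base b *)
CDF[BenfordDistribution[b], x]

(* Probability that the first digit is 1, 2, or 3 in base 10, approximately 0.60206 *)
N[CDF[BenfordDistribution[10], 3]]

(* Step plot of the CDF; ExtentSize -> Right extends each value to the next point *)
DiscretePlot[CDF[BenfordDistribution[10], x], {x, 0, 10}, ExtentSize -> Right]
```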

Mean:

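A minimal sketch of computing the mean, assuming base 10 for the numerical value. The last line cross-checks the result directly from the first-digit probabilities $\log_{10}(1 + 1/d)$:

```wolfram
(* Symbolic mean for a general base b *)
Mean[BenfordDistribution[b]]

(* Numerical mean in base 10, approximately 3.44024 *)
N[Mean[BenfordDistribution[10]]]

(* Cross-check: sum of d times its first-digit probability *)
N[Sum[d Log10[1 + 1/d], {d, 1, 9}]]
```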

Variance:

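A minimal sketch of computing the variance, assuming base 10 for the numerical value (approximately 6.06). The names p and digits are illustrative and are used only for the cross-check from the definition:

```wolfram
(* Symbolic variance for a general base b *)
Variance[BenfordDistribution[b]]

(* Numerical variance in base 10 *)
N[Variance[BenfordDistribution[10]]]

(* Cross-check from the definition E[d^2] - E[d]^2 *)
With[{p = Table[Log10[1 + 1/k], {k, 1, 9}], digits = Range[9]},
 N[p . digits^2 - (p . digits)^2]]
```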

Median:

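A minimal sketch of computing the median, assuming base 10. The second line lists the CDF at the first few digits to show that it first reaches 1/2 at x = 3:

```wolfram
(* Median in base 10; evaluates to 3 *)
Median[BenfordDistribution[10]]

(* CDF values at x = 1, 2, 3, 4 for comparison with 1/2 *)
N[Table[CDF[BenfordDistribution[10], x], {x, 1, 4}]]
```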

Introduced in 2010 (8.0)