This is documentation for Mathematica 8, which was
based on an earlier version of the Wolfram Language.

EmpiricalDistribution

EmpiricalDistribution[{x1, x2, …}]
represents an empirical distribution based on the data values xi.
EmpiricalDistribution[{{x1, y1, …}, {x2, y2, …}, …}]
represents a multivariate empirical distribution based on the data values {xi, yi, …}.
EmpiricalDistribution[{w1, w2, …} -> {d1, d2, …}]
represents an empirical distribution where data values di occur with weights wi.
  • The weights can be numeric or symbolic.
Create an empirical distribution of univariate data:
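The original input for this step is not preserved on this page. A minimal sketch, assuming a hypothetical sample of 1000 standard normal draws as the data:

```mathematica
(* Hypothetical sample; the page's original data is not available *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 1000];

(* Wrap the raw values in an empirical distribution object *)
dist = EmpiricalDistribution[data]
```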
Visualize distribution functions:
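One possible way to draw the PDF and CDF is sketched here, assuming a small hypothetical integer dataset so that the point masses fall on the integer grid used by DiscretePlot:

```mathematica
data = {1, 2, 2, 3, 5, 8};          (* hypothetical data *)
dist = EmpiricalDistribution[data];

(* The PDF consists of point masses, so sample it on the integers 0..9 *)
pdfPlot = DiscretePlot[PDF[dist, x], {x, 0, 9}];

(* The CDF is a step function that jumps at each observed value *)
cdfPlot = Plot[CDF[dist, x], {x, 0, 9}];

{pdfPlot, cdfPlot}
```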
Compute moments and quantiles:
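A short sketch of this step on hypothetical exact data, so that the results can be checked by hand:

```mathematica
data = {1, 2, 2, 3, 5, 8};          (* hypothetical data *)
dist = EmpiricalDistribution[data];

m = Mean[dist];                      (* 7/2 *)
v = Variance[dist];                  (* population variance of the data *)
q = Quantile[dist, {1/4, 1/2, 3/4}]; (* three quantiles at once *)

{m, v, q}
```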
Create an empirical distribution of bivariate data:
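For the bivariate case, the data is a list of pairs. A minimal sketch, assuming hypothetical data drawn from a correlated binormal distribution:

```mathematica
(* Hypothetical bivariate sample: 500 pairs with correlation 0.5 *)
SeedRandom[2];
data = RandomVariate[BinormalDistribution[{0, 0}, {1, 1}, 0.5], 500];

(* Each row of data is one observation {x, y} *)
dist = EmpiricalDistribution[data]
```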
Visualize the estimated CDF:
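A possible sketch of such a plot, using four hypothetical points so that the steps of the surface are easy to see:

```mathematica
data = {{0, 0}, {1, 0}, {1, 1}, {2, 3}};   (* hypothetical data *)
dist = EmpiricalDistribution[data];

(* The bivariate CDF takes a point {x, y} as its second argument *)
cdfPlot = Plot3D[CDF[dist, {x, y}], {x, -1, 3}, {y, -1, 4}, Exclusions -> None]
```

With this data, CDF[dist, {1, 1}] counts the three points with both coordinates at most 1, giving 3/4.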
Compute covariance and general moments:
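A sketch of this computation on the same kind of small hypothetical dataset; the order {1, 1} in Moment is chosen here only as an illustration of a mixed moment:

```mathematica
data = {{0, 0}, {1, 0}, {1, 1}, {2, 3}};   (* hypothetical data *)
dist = EmpiricalDistribution[data];

cov = Covariance[dist];      (* 2 x 2 covariance matrix *)
mixed = Moment[dist, {1, 1}]; (* E[x y] *)

{MatrixForm[cov], mixed}
```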
Create an empirical distribution of univariate data:
Larger datasets lead to better approximations of the underlying distribution:
Use exact numeric data:
Specify a list of weights corresponding to each data value:
Use symbolic weights:
A general moment of the distribution:
The CDF evaluated at 4:
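The weighted forms above can be sketched as follows. The data values and the symbols a, b, c are hypothetical stand-ins for the inputs lost from this page:

```mathematica
(* Numeric weights: the value 20 counts twice as much as 10 or 30 *)
numDist = EmpiricalDistribution[{1, 2, 1} -> {10, 20, 30}];
Mean[numDist]        (* weights are normalized, so this is 20 *)
CDF[numDist, 20]     (* 3/4 *)

(* Symbolic weights a, b, c attached to the values 1, 4, 9 *)
symDist = EmpiricalDistribution[{a, b, c} -> {1, 4, 9}];
Moment[symDist, r]   (* general moment of symbolic order r *)
CDF[symDist, 4]      (* CDF evaluated at 4 *)
```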
Create an empirical distribution of bivariate data:
Larger datasets produce smoother estimates:
Specify a list of weights for bivariate data:
Create an empirical distribution of data in higher dimensions:
Plot the univariate marginal CDFs:
Plot the bivariate marginal CDFs:
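One way the higher-dimensional steps above might look, assuming a hypothetical trivariate normal sample and assuming MarginalDistribution is used to extract the marginals:

```mathematica
(* Hypothetical 3-dimensional sample *)
SeedRandom[3];
data = RandomVariate[
   MultinormalDistribution[{0, 0, 0}, IdentityMatrix[3]], 300];
dist = EmpiricalDistribution[data];

(* Univariate marginal CDFs for coordinates 1, 2 and 3 *)
uniPlot = Plot[
   Evaluate[Table[CDF[MarginalDistribution[dist, i], x], {i, 3}]],
   {x, -3, 3}];

(* Bivariate marginal CDF for coordinates 1 and 2 *)
biPlot = Plot3D[
   CDF[MarginalDistribution[dist, {1, 2}], {x, y}],
   {x, -3, 3}, {y, -3, 3}];

{uniPlot, biPlot}
```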
Obtain empirical estimates of distribution functions:
PDF and HazardFunction are discrete:
CDF and SurvivalFunction are piecewise constant:
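The discrete and piecewise-constant behavior described above can be checked on a small hypothetical dataset:

```mathematica
data = {1, 2, 2, 3, 5, 8};          (* hypothetical data *)
dist = EmpiricalDistribution[data];

(* PDF is nonzero only at observed values *)
{PDF[dist, 2], PDF[dist, 2.5]}

(* CDF stays flat between observed values *)
{CDF[dist, 2], CDF[dist, 2.9]}

(* SurvivalFunction is 1 - CDF; HazardFunction is discrete like the PDF *)
{SurvivalFunction[dist, 2], HazardFunction[dist, 2]}
```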
Compute moments:
Special moments:
General moments:
Estimate the quantile function:
Special quantile values:
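A sketch of the quantile estimates, again on hypothetical data. Note that Median applied to the distribution is the 1/2 quantile, which can differ from Median applied to the raw list:

```mathematica
data = {1, 2, 2, 3, 5, 8};          (* hypothetical data *)
dist = EmpiricalDistribution[data];

Quantile[dist, 9/10]       (* a single quantile *)

(* Special quantile values *)
Median[dist]
Quartiles[dist]
InterquartileRange[dist]
```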
Generate a set of random numbers:
Compare the histogram to the PDF of the underlying density:
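A possible version of this comparison, assuming the underlying density is standard normal (a hypothetical choice, since the original input is missing):

```mathematica
SeedRandom[4];
data = RandomVariate[NormalDistribution[], 1000];  (* hypothetical data *)
dist = EmpiricalDistribution[data];

(* Draw from the empirical distribution; every draw is one of the data values *)
sample = RandomVariate[dist, 10^4];

(* Overlay the density of the distribution that generated the data *)
Show[
 Histogram[sample, Automatic, "PDF"],
 Plot[PDF[NormalDistribution[], x], {x, -4, 4}]
]
```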
Compute probabilities and expectations:
Generating functions:
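Probabilities, expectations and generating functions can be sketched on a small hypothetical dataset as follows. The event x > 2 and the function x^2 are arbitrary illustrations:

```mathematica
data = {1, 2, 2, 3, 5, 8};          (* hypothetical data *)
dist = EmpiricalDistribution[data];

(* Fraction of the data above 2, and the mean of the squares *)
p = Probability[x > 2, x \[Distributed] dist];
e = Expectation[x^2, x \[Distributed] dist];

(* Generating functions in the symbolic variable t *)
mgf = MomentGeneratingFunction[dist, t];
cf = CharacteristicFunction[dist, t];

{p, e, mgf, cf}
```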
Estimate bivariate distribution functions:
CDF and SurvivalFunction are piecewise constant:
Compute bivariate moments:
Special moments:
General moments:
Generate a set of random numbers:
Compare the distribution of data to a theoretical distribution:
Compare multivariate data to a theoretical distribution:
The difference:
Produce a smoothed representation with SmoothKernelDistribution:
Using HistogramDistribution with bin delimiters set to the data creates a linear interpolation of EmpiricalDistribution:
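The two alternatives above can be compared with the empirical CDF in one plot. This is a sketch on a hypothetical normal sample; the bin specification {Union[data]} passes the sorted distinct data values as explicit bin delimiters:

```mathematica
SeedRandom[5];
data = RandomVariate[NormalDistribution[], 200];   (* hypothetical data *)

emp = EmpiricalDistribution[data];
smooth = SmoothKernelDistribution[data];
hist = HistogramDistribution[data, {Union[data]}];

(* Step function, kernel-smoothed curve, and histogram-based curve *)
cmpPlot = Plot[{CDF[emp, x], CDF[smooth, x], CDF[hist, x]}, {x, -3, 3}]
```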
Ten letters published in 1861 under the name Quintus Curtius Snodgrass are claimed to have been authored by Mark Twain. Compare the word length distribution for the letters to some works by Mark Twain:
Comparison to the English language in general emphasizes the similarity:
A test for goodness of fit suggests, however, that Twain did not write the QCS letters:
Compare the distributions of winning times in Scottish hill races for those who take the high road and those who take the low road:
It appears that it is faster to take the low road:
The National Institutes of Health estimates that 2% of the population has a certain disease. A test for the disease is proposed that detects its presence 95% of the time with a false positive rate of 5%. Given that a patient tests positive, find the probability that he or she actually has the disease:
Equations for the unknown probabilities based on the information given:
Solve the equations assuming the probabilities sum to unity:
The probability a patient has the disease given a positive test result:
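The original inputs for this example are not preserved. As a hedged sketch, the joint weights can be written down directly from the figures in the text (prevalence 2%, detection rate 95%, false positive rate 5%) instead of solving equations for them. The encoding 1 = yes, 0 = no and the variable names d and t are assumptions made here:

```mathematica
(* Joint weights for {disease, test} outcomes *)
weights = {2/100*95/100, 2/100*5/100, 98/100*5/100, 98/100*95/100};
outcomes = {{1, 1}, {1, 0}, {0, 1}, {0, 0}};
dist = EmpiricalDistribution[weights -> outcomes];

(* P(disease and positive) divided by P(positive) *)
pBoth = Probability[d == 1 && t == 1, {d, t} \[Distributed] dist];
pPos = Probability[t == 1, {d, t} \[Distributed] dist];
posterior = pBoth/pPos
```

By Bayes' rule the result should equal 0.019/0.068, or about 0.279, so a positive test leaves the patient more likely healthy than ill.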
A group of 21 students was selected at random to participate in a new directed reading program. A control group of 23 students was educated with traditional methods. Reading test scores for students in the two groups were recorded following their programs. Perform a permutation-based test on the scores to determine if the directed reading program was successful:
The mean difference in test scores across the groups can be used as a test statistic:
Simulate the null distribution of the test statistic by randomly permuting the groups:
At the 5% level there is evidence that the new program made a difference:
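The permutation procedure described above can be sketched as follows. The scores are simulated placeholders, not the study data, and the helper permStat and the count of 5000 permutations are hypothetical choices:

```mathematica
(* Placeholder scores: 21 treated and 23 control students *)
SeedRandom[7];
treated = RandomVariate[NormalDistribution[52, 11], 21];
control = RandomVariate[NormalDistribution[42, 17], 23];

(* Test statistic: difference in group means *)
observed = Mean[treated] - Mean[control];

(* Shuffle the pooled scores and split them 21 / 23 again *)
pooled = Join[treated, control];
permStat[] := With[{p = RandomSample[pooled]},
   Mean[Take[p, 21]] - Mean[Drop[p, 21]]];

(* Empirical null distribution of the statistic *)
null = EmpiricalDistribution[Table[permStat[], {5000}]];

(* One-sided p-value: share of permutations at least as extreme *)
pValue = Probability[x >= observed, x \[Distributed] null]
```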
LocationTest could have been used to test the hypothesis directly:
Random number generation from an empirical distribution returns a bootstrapped sample:
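A brief sketch of bootstrapping with this property, assuming a small hypothetical dataset and 1000 resamples:

```mathematica
SeedRandom[6];
data = {2.1, 3.4, 3.9, 5.0, 6.2};   (* hypothetical data *)

(* 1000 resamples with replacement, each the size of the data *)
boot = RandomVariate[EmpiricalDistribution[data], {1000, Length[data]}];

(* Bootstrap estimate of the mean and its standard error *)
means = Mean /@ boot;
{Mean[means], StandardDeviation[means]}
```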
EmpiricalDistribution is a consistent estimator of the underlying distribution:
Moments and their equivalence to those of the data:
The population rather than the sample variance is used for empirical distributions:
Quantiles are equivalent to Quantile applied directly to the data:
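These three relations can be verified on a small hypothetical dataset:

```mathematica
data = {1, 2, 3, 4};                (* hypothetical data *)
dist = EmpiricalDistribution[data];
n = Length[data];

(* Moments agree with those of the data *)
Mean[dist] == Mean[data]

(* Variance of the distribution divides by n; Variance of a list divides by n - 1 *)
{Variance[dist], Variance[data]}
Variance[dist] == (n - 1)/n*Variance[data]

(* Quantiles agree with Quantile applied to the list *)
Quantile[dist, 3/4] == Quantile[data, 3/4]
```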
EmpiricalDistribution is equivalent to SurvivalDistribution with no censoring:
Use the union of data values as bin delimiters for HistogramDistribution:
The resulting PDF is a zero-order interpolation of the PDF for EmpiricalDistribution:
Applying N to exact data can reduce memory consumption:
The CDFs are equivalent:
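A way to check both statements, assuming hypothetical exact rational data. The actual byte counts depend on the version and are not asserted here:

```mathematica
exact = Range[1000]/7;              (* hypothetical exact data *)

dExact = EmpiricalDistribution[exact];
dApprox = EmpiricalDistribution[N[exact]];

(* Compare the storage used by the two distribution objects *)
ByteCount /@ {dExact, dApprox}

(* Largest difference between the two CDFs on a grid of test points *)
maxDiff = Max[Abs[
   Table[CDF[dExact, x] - CDF[dApprox, x], {x, 0.5, 140, 0.5}]]]
```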
New in 8