This is documentation for Mathematica 8, which was
based on an earlier version of the Wolfram Language.

# KernelMixtureDistribution

KernelMixtureDistribution[{x1,x2,...}] represents a kernel mixture distribution based on the data values xi.

KernelMixtureDistribution[{{x1,y1,...},{x2,y2,...},...}] represents a multivariate kernel mixture distribution based on data values {xi,yi,...}.

KernelMixtureDistribution[...,bw] represents a kernel mixture distribution with bandwidth bw.

KernelMixtureDistribution[...,bw,ker] represents a kernel mixture distribution with bandwidth bw and smoothing kernel ker.
• The probability density function for KernelMixtureDistribution for a value $x$ is given by $\frac{1}{n h}\sum_{i=1}^{n} k\!\left(\frac{x-x_i}{h}\right)$ for a smoothing kernel $k$ and bandwidth parameter $h$.
• The following bandwidth specifications bw can be given:
| Specification | Meaning |
| --- | --- |
| h | bandwidth to use |
| {"Standardized",h} | bandwidth in units of standard deviation |
| {"Adaptive",h,s} | adaptive bandwidth with initial bandwidth h and sensitivity s |
| Automatic | automatically computed bandwidth |
| "name" | use a named bandwidth selection method |
| {bwx,bwy,...} | separate bandwidth specifications for x, y, etc. |
• For multivariate densities, h can be a positive definite symmetric matrix.
• For adaptive bandwidths the sensitivity s must be a real number between 0 and 1 or Automatic. If Automatic is used, s is set to $1/d$, where $d$ is the dimensionality of the data.
• Possible named bandwidth selection methods include:

| Method | Meaning |
| --- | --- |
| "LeastSquaresCrossValidation" | uses the method of least-squares cross-validation |
| "Oversmooth" | 1.08 times wider than the standard Gaussian |
| "Scott" | uses Scott's rule to determine bandwidth |
| "SheatherJones" | uses the Sheather-Jones plugin estimator |
| "Silverman" | uses Silverman's rule to determine bandwidth |
| "StandardDeviation" | uses the standard deviation as bandwidth |
| "StandardGaussian" | optimal bandwidth for standard normal data |
• By default the "Silverman" method is used.
• The following kernel specifications ker can be given:
 "Biweight" "Cosine" "Epanechnikov" "Gaussian" "Rectangular" "SemiCircle" "Triangular" "Triweight" func
• In order for KernelMixtureDistribution to generate a true density estimate, the function fn should be a valid univariate probability density function.
• By default the "Gaussian" kernel is used.
• For multivariate densities, the kernel function ker can be specified as product and radial types using {"Product",ker} and {"Radial",ker} respectively. Product-type kernels are used if no type is specified.
• The precision used for density estimation is the minimum precision given in the bw and data.
• The following options can be given:
| Option | Default | Meaning |
| --- | --- | --- |
| MaxMixtureKernels | Automatic | max number of kernels to use |
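The density described in the notes above can be illustrated outside Mathematica. The following is a minimal Python sketch, assuming the standard univariate kernel density form f(x) = (1/(n h)) Σ k((x − x_i)/h) with a Gaussian kernel. The names `gaussian_kernel`, `kernel_mixture_pdf`, and `integrate` are hypothetical helpers for illustration, not part of any Wolfram API.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel k."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_mixture_pdf(x, data, h, kernel=gaussian_kernel):
    """Evaluate f(x) = 1/(n h) * sum_i k((x - x_i) / h)."""
    n = len(data)
    return sum(kernel((x - xi) / h) for xi in data) / (n * h)

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    w = (b - a) / steps
    return w * sum(f(a + (i + 0.5) * w) for i in range(steps))

# Illustrative data and bandwidth (assumed values, not from the documentation).
data = [-1.2, -0.4, 0.1, 0.3, 1.9]
total = integrate(lambda x: kernel_mixture_pdf(x, data, 0.5), -10.0, 10.0)
```

Because each kernel is itself a probability density, the equally weighted sum integrates to 1 up to numerical error, which is what `total` checks.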
Create a kernel density estimate of univariate data:
Use the resulting distribution to perform analysis including visualizing distribution functions:
Compute moments and quantiles:
Create a kernel density estimate of some bivariate data:
Visualize the estimated PDF and CDF:
Compute covariance and general moments:
Create symbolic representations of kernel density estimates:
Investigate symbolic properties:
## Scope
Create a smooth density estimate for some data:
Compute probabilities from the distribution:
Increase the bandwidth for smoother estimates:
Allow the bandwidth to vary adaptively with local density:
Identify features in data to aid in parametric model fitting:
The estimate suggests both the form and starting values for maximum likelihood estimation:
Use kernel density estimation in higher dimensions:
A four-dimensional kernel density estimate:
Sample from the distribution:
Explore properties of kernel density estimators using custom kernel functions:
Specify radial or product type kernels for multivariate estimates:
Estimate distribution functions:
The first few terms of the PDF and CDF:
Compute moments of the distribution:
Special moments:
General moments:
Moments can often be computed in closed form:
Compute a closed form expression for the variance with a symbolic adaptive bandwidth:
Quantile function:
Special quantile values:
Generate random numbers:
Compute probabilities and expectations:
Generating functions:
Estimate bivariate distribution functions:
Compute moments of a bivariate distribution:
Special moments:
General moments:
Generate random numbers:
Automatically select the bandwidth to use:
More data yields better approximations to the underlying distribution:
Explicitly specify the bandwidth to use:
Use two different bandwidths:
Larger bandwidths yield smoother estimates:
The bandwidth need not be numeric:
The PDF and CDF of the estimate:
Specify bandwidths in units of standard deviation:
Use bandwidths given as two different fractions of the standard deviation:
Allow the bandwidth to vary adaptively with local density:
Vary the local sensitivity from 0 (none) to 1 (full):
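As a rough illustration of how a sensitivity between 0 and 1 can act, here is a minimal Python sketch of one common textbook adaptive scheme: a fixed-bandwidth pilot estimate is computed first, and each data point then gets the local bandwidth h (pilot(x_i)/g)^(−s), where g is the geometric mean of the pilot values. Whether KernelMixtureDistribution uses exactly this scheme is an assumption, and all function names are hypothetical.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def fixed_pdf(x, data, h):
    """Fixed-bandwidth pilot estimate."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

def local_bandwidths(data, h, s):
    """Per-point bandwidths h * (pilot(x_i) / g) ** (-s), g = geometric mean."""
    pilot = [fixed_pdf(xi, data, h) for xi in data]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h * (p / g) ** (-s) for p in pilot]

def adaptive_pdf(x, data, h, s):
    """Kernel estimate in which each kernel uses its own local bandwidth."""
    hs = local_bandwidths(data, h, s)
    return sum(gaussian_kernel((x - xi) / hi) / hi
               for xi, hi in zip(data, hs)) / len(data)

# Illustrative data: a tight cluster plus one isolated point.
data = [0.0, 0.1, 0.2, 0.3, 5.0]
none = local_bandwidths(data, 0.5, 0.0)  # s = 0: every bandwidth stays at h
full = local_bandwidths(data, 0.5, 1.0)  # s = 1: isolated point gets a wider kernel
```

With s = 0 the estimate reduces to the fixed-bandwidth case, while larger s widens kernels in sparse regions and narrows them in dense ones.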
Setting the sensitivity to Automatic uses $1/d$, where $d$ is the dimension of the data:
The PDFs are equivalent:
Vary the initial bandwidth for an adaptive estimate:
Specify two different initial bandwidths:
Use any of several automatic bandwidth selection methods:
Silverman's method is used by default:
The PDFs are equivalent:
In the multivariate case, the bandwidth is a symmetric positive definite p × p matrix:
Giving a scalar h effectively uses h IdentityMatrix[p]:
Specifying diagonal elements effectively uses DiagonalMatrix[d]:
Any p × p matrix that could be symmetric positive definite can be given:
By default, Silverman's method is used to independently select bandwidths in each dimension:
Any automated method can be used to independently select diagonal bandwidth elements:
Methods used to estimate the diagonal need not be the same:
Use adaptive, oversmoothed, and constant bandwidths in the respective dimensions:
Plot the univariate marginal PDFs:
Give a scalar value to use the same bandwidth in all dimensions:
To use nonzero off-diagonal elements, give a fully specified bandwidth matrix:
The bandwidth matrix controls the variance and orientation of individual kernels:
Scalar bandwidths:
Dimension-wise bandwidths:
Fully specified bandwidth matrices:
Some named bandwidth methods follow a rule-of-thumb approach:
Formulas for some named bandwidth methods:
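A minimal Python sketch of textbook forms of a few such rules is given here. The normal-reference bandwidth (4/(3n))^(1/5) σ and the factor 1.08 for oversmoothing follow the descriptions of "StandardGaussian" and "Oversmooth"; the Silverman form 0.9 min(σ, IQR/1.34) n^(−1/5) is the usual textbook rule of thumb. The exact constants, standard deviation convention, and quantile convention used by KernelMixtureDistribution are assumptions here, and the function names are hypothetical.

```python
import math
import statistics

def standard_gaussian_bw(data):
    """Normal-reference bandwidth (4 / (3 n)) ** (1/5) * sd."""
    n = len(data)
    return (4.0 / (3.0 * n)) ** 0.2 * statistics.stdev(data)

def oversmooth_bw(data):
    """1.08 times the normal-reference bandwidth."""
    return 1.08 * standard_gaussian_bw(data)

def quantile(sorted_data, q):
    """Linearly interpolated quantile (one of several possible conventions)."""
    pos = q * (len(sorted_data) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])

def silverman_bw(data):
    """Textbook Silverman rule: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    s = sorted(data)
    iqr = quantile(s, 0.75) - quantile(s, 0.25)
    spread = min(statistics.stdev(data), iqr / 1.34)
    return 0.9 * spread * len(data) ** (-0.2)

# Illustrative data (assumed values).
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 3.0]
bandwidths = {
    "StandardGaussian": standard_gaussian_bw(data),
    "Oversmooth": oversmooth_bw(data),
    "Silverman": silverman_bw(data),
    "StandardDeviation": statistics.stdev(data),
}
```

All of these rules shrink with sample size at the rate n^(−1/5) except the plain standard deviation, which does not depend on n.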
The estimates are equivalent:
The method of least squares cross-validation:
The expectation of the PDF using a Gaussian kernel and bandwidth $h$:
The expectation of the PDF of the leave-one-out density estimator:
The bandwidth is found by minimizing the least squares cross-validation function over $h$:
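The procedure can be sketched in Python under the usual textbook definition LSCV(h) = ∫ f̂² dx − (2/n) Σ f̂₋ᵢ(x_i), where f̂₋ᵢ is the leave-one-out estimate. For a Gaussian kernel the first term has a closed form, since convolving two Gaussians of width h gives a Gaussian of width √2 h. The grid search below stands in for a proper numerical minimizer, and the function names are hypothetical.

```python
import math

def normal_pdf(t, sd):
    return math.exp(-0.5 * (t / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def lscv(h, data):
    """Least-squares cross-validation score for a Gaussian kernel."""
    n = len(data)
    # Integral of the squared estimate, in closed form.
    squared = sum(normal_pdf(a - b, math.sqrt(2.0) * h)
                  for a in data for b in data) / n ** 2
    # Mean of the leave-one-out estimates evaluated at the left-out points.
    leave_one_out = sum(normal_pdf(a - b, h)
                        for i, a in enumerate(data)
                        for j, b in enumerate(data) if i != j) / (n * (n - 1))
    return squared - 2.0 * leave_one_out

def select_bandwidth(data, grid):
    """Pick the grid value with the smallest cross-validation score."""
    return min(grid, key=lambda h: lscv(h, data))

# Illustrative data and search grid (assumed values).
data = [0.2, 0.5, 0.9, 1.1, 1.4, 2.0, 2.3, 3.1, 3.3, 4.0]
grid = [0.05 * k for k in range(1, 61)]
h_cv = select_bandwidth(data, grid)
```

The score costs O(n²) kernel evaluations per candidate bandwidth, which is one reason rule-of-thumb methods are often preferred as defaults.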
The method of Sheather and Jones uses a plug-in estimator to solve for the bandwidth:
The Sheather and Jones estimator:
The estimates are equivalent:
Specify any one of several kernel functions:
Define the kernel function as a pure function:
By default, the Gaussian kernel is used:
This is equivalent to using the PDF of a NormalDistribution:
Shapes of some univariate kernel functions:
Specify any one of several kernel functions for multivariate data:
Shapes of some bivariate product kernels:
Choose between product and radial-type kernel functions for multivariate data:
Computation of a single biweight kernel in two dimensions:
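To make the distinction concrete, here is a minimal Python sketch of a biweight kernel in two dimensions in both forms. The product form multiplies two univariate biweight kernels; the radial form applies the same shape to the distance from the origin, with the constant 3/π chosen so that it integrates to 1 over the unit disk. The normalization convention used by KernelMixtureDistribution is an assumption, and the function names are hypothetical.

```python
import math

def biweight_1d(u):
    """Univariate biweight kernel, supported on [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def biweight_product(u, v):
    """Product-type kernel: supported on the square [-1, 1] x [-1, 1]."""
    return biweight_1d(u) * biweight_1d(v)

def biweight_radial(u, v):
    """Radial-type kernel: supported on the unit disk."""
    r2 = u * u + v * v
    return 3.0 / math.pi * (1.0 - r2) ** 2 if r2 <= 1.0 else 0.0

def integrate_2d(f, steps=400):
    """Midpoint-rule integral of f over the square [-1, 1] x [-1, 1]."""
    w = 2.0 / steps
    pts = [-1.0 + (i + 0.5) * w for i in range(steps)]
    return w * w * sum(f(u, v) for u in pts for v in pts)

product_mass = integrate_2d(biweight_product)
radial_mass = integrate_2d(biweight_radial)
```

The two forms differ most near the corners of the square: at (0.9, 0.9) the product kernel is still positive, while the radial kernel is zero because the point lies outside the unit disk.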
Bandwidths have similar effect for both radial and product type kernels:
Scalar bandwidths stretch the kernel equally in each dimension:
Diagonal elements stretch the kernel independently along each axis:
Nonzero off-diagonal elements change the orientation:
The PDFs of the various kernel functions:
The efficiency of kernels under the assumption of normally distributed data:
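One classical way to quantify this is the efficiency relative to the Epanechnikov kernel, (3/(5√5)) / (√(∫u²k(u)du) ∫k(u)²du). The Python sketch below computes it by numerical integration for a few of the kernels listed above. This is the textbook measure and may differ from the definition used in the original example; the kernel formulas are the standard ones and the names are hypothetical.

```python
import math

KERNELS = {
    "Epanechnikov": lambda u: 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0,
    "Biweight": lambda u: 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0,
    "Triangular": lambda u: 1.0 - abs(u) if abs(u) <= 1.0 else 0.0,
    "Rectangular": lambda u: 0.5 if abs(u) <= 1.0 else 0.0,
    "Gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
}

def integrate(f, a=-8.0, b=8.0, steps=16000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    w = (b - a) / steps
    return w * sum(f(a + (i + 0.5) * w) for i in range(steps))

def efficiency(kernel):
    """Efficiency relative to the Epanechnikov kernel (1.0 is optimal)."""
    second_moment = integrate(lambda u: u * u * kernel(u))
    roughness = integrate(lambda u: kernel(u) ** 2)
    return (3.0 / (5.0 * math.sqrt(5.0))) / (math.sqrt(second_moment) * roughness)

efficiencies = {name: efficiency(k) for name, k in KERNELS.items()}
```

Under this measure even the rectangular kernel stays above 0.92, which is why the choice of bandwidth usually matters more than the choice of kernel.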
The built-in kernel functions all have relatively high statistical efficiency:
## Options
By default a kernel is placed at each data point for sample sizes less than 300:
For larger sample sizes, a maximum of 300 uniformly spaced kernels are used by default:
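One way to replace a large sample by a capped number of uniformly spaced kernels is linear binning, sketched below in Python. Each data point splits its weight between the two nearest grid centers, which preserves the sample mean. How KernelMixtureDistribution actually assigns weights to its kernels is not stated here, so this scheme and the name `linear_binning` are assumptions for illustration only.

```python
def linear_binning(data, max_kernels):
    """Return (centers, weights) using at most max_kernels kernel centers.

    Assumes max_kernels >= 2. Small samples keep one kernel per data point.
    """
    n = len(data)
    if n <= max_kernels:
        return list(data), [1.0 / n] * n
    lo, hi = min(data), max(data)
    if hi == lo:
        return [lo], [1.0]
    step = (hi - lo) / (max_kernels - 1)
    centers = [lo + k * step for k in range(max_kernels)]
    weights = [0.0] * max_kernels
    for x in data:
        pos = (x - lo) / step
        k = min(int(pos), max_kernels - 2)  # index of the left neighbor
        frac = pos - k                      # share given to the right neighbor
        weights[k] += (1.0 - frac) / n
        weights[k + 1] += frac / n
    return centers, weights

# Illustrative data: 1000 points reduced to 300 weighted centers.
data = [0.01 * i for i in range(1000)]
centers, weights = linear_binning(data, 300)
```

The density estimate then sums 300 weighted kernels instead of 1000 equally weighted ones, trading a small loss of resolution for faster evaluation.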
Specify the maximum number of kernels to use in the estimate:
Place at most 5 kernels:
A larger number of kernels gives a better estimate of the underlying distribution:
Place a kernel at each data point:
Vary the bandwidth used for the same number of kernels:
Specify the number of kernels to use in each dimension for bivariate data:
Place at most 10 and 100 kernels, respectively:
Set a different maximum number of kernels in each dimension:
Specify a maximum of 5 and 50 kernels or 50 and 5:
## Applications
Compare an estimated density to a theoretical model:
Use an adaptive bandwidth and many mixture kernels when high resolution is desired:
The moments for the model and the estimate are similar:
Estimate the distribution of daily point changes for Apple stocks on the NASDAQ:
Increase the MaxMixtureKernels option with heavy-tailed data for a smoother estimate:
Compute the probability of a 10% point change or more on a given day:
Estimate the distribution of snowfall in Buffalo, New York:
Different bandwidths yield different descriptions of the snowfall distribution:
Identify which of six measures might be most useful for identifying counterfeit bank notes:
Measure 6 appears to best separate the two classes of notes:
Using measure 6 as a classifier with a cutoff of 140.5, find the probability of misclassification:
Find the bandwidth that minimizes the mean squared error (MSE) of the PDF:
Use the bandwidth to estimate the PDF:
KernelMixtureDistribution can be used to create an elliptical distribution. Elliptical distributions are a generalization of multivariate normal distributions:
Using NormalDistribution for the marginal gives MultinormalDistribution:
Some other elliptical distributions:
The resulting density estimate integrates to unity:
The density is a weighted sum of kernel functions:
KernelMixtureDistribution is a consistent estimator of the underlying distribution:
The number of kernels actually used will be no larger than the sample size:
Placing at most 10000 kernels:
The number of terms corresponds to the number of kernels used:
As the bandwidth approaches infinity, the estimate approaches the shape of the kernel:
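The limit can be checked numerically. Viewed on the scale of the bandwidth, the estimate is h f(h u) = (1/n) Σ k(u − x_i/h), and each shift x_i/h vanishes as h grows. The Python sketch below assumes a Gaussian kernel and the standard kernel density form; the function names are hypothetical.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_mixture_pdf(x, data, h):
    """Evaluate f(x) = 1/(n h) * sum_i k((x - x_i) / h)."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

def rescaled_estimate(u, data, h):
    """h * f(h * u): the estimate measured in units of the bandwidth."""
    return h * kernel_mixture_pdf(h * u, data, h)

def max_gap(data, h, points=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Largest difference between the rescaled estimate and the kernel."""
    return max(abs(rescaled_estimate(u, data, h) - gaussian_kernel(u))
               for u in points)

# Illustrative data (assumed values).
data = [-1.2, -0.4, 0.1, 0.3, 1.9]
gap_small_h = max_gap(data, 1.0)
gap_large_h = max_gap(data, 1000.0)
```

For the large bandwidth the rescaled estimate is nearly indistinguishable from the kernel itself, so all detail in the data has been smoothed away.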
The kernel function needs to be a PDF:
The resulting density estimate is not a PDF:
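A small Python sketch shows why. With the standard form f(x) = (1/(n h)) Σ k((x − x_i)/h), the estimate integrates to whatever the kernel integrates to. Here the function exp(−u²), which integrates to √π rather than 1, is used as an example of an invalid kernel; the function names are hypothetical.

```python
import math

def kernel_mixture_pdf(x, data, h, kernel):
    """Evaluate 1/(n h) * sum_i kernel((x - x_i) / h)."""
    return sum(kernel((x - xi) / h) for xi in data) / (len(data) * h)

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    w = (b - a) / steps
    return w * sum(f(a + (i + 0.5) * w) for i in range(steps))

def valid_kernel(u):
    """Standard normal density: integrates to 1."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def invalid_kernel(u):
    """Unnormalized bump: integrates to sqrt(pi), not 1."""
    return math.exp(-u * u)

# Illustrative data and bandwidth (assumed values).
data = [-1.2, -0.4, 0.1, 0.3, 1.9]
mass_valid = integrate(lambda x: kernel_mixture_pdf(x, data, 0.5, valid_kernel), -10.0, 10.0)
mass_invalid = integrate(lambda x: kernel_mixture_pdf(x, data, 0.5, invalid_kernel), -10.0, 10.0)
```

Dividing a nonnegative kernel function by its integral is enough to turn it into a valid kernel.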
Automatic adaptive bandwidths may be too small with large samples:
Try increasing the initial bandwidth, MaxMixtureKernels, or decreasing the sensitivity:
A kernel must be placed at each data point with symbolic data:
Symbolic data cannot be used with some bandwidth selection methods:
Specify bandwidths that do not require estimation:
Some of the kernel functions are bounded and trigger exclusions in plots:
Set the Exclusions option to None to avoid spurious gaps and to decrease plot timings:
Use KernelMixtureDistribution to apply a Gaussian blur to a binarized image:
Compute a completely symbolic trivariate density estimate:
New in 8