LearnDistribution
LearnDistribution[{example1,example2,…}]
generates a LearnedDistribution[…] that attempts to represent an underlying distribution for the examples given.
Details and Options




- LearnDistribution can be used on many types of data, including numerical, nominal and image data.
- Each examplei can be a single data element, a list of data elements or an association of data elements. Examples can also be given as a Dataset or a Tabular object.
- LearnDistribution effectively assumes that each of the examplei is independently drawn from an underlying distribution, which LearnDistribution attempts to infer.
- LearnDistribution[examples] yields a LearnedDistribution[…] on which the following functions can be used:
    PDF[dist,…]	probability or probability density for data
    RandomVariate[dist]	random samples generated from the distribution
    SynthesizeMissingValues[dist,…]	fill in missing values according to the distribution
    RarerProbability[dist,…]	compute the probability to generate a sample with lower PDF than a given example
- The following options can be given:
    FeatureExtractor	Identity	how to extract features from which to learn
    FeatureNames	Automatic	feature names to assign for input data
    FeatureTypes	Automatic	feature types to assume for input data
    Method	Automatic	which modeling algorithm to use
    PerformanceGoal	Automatic	aspects of performance to try to optimize
    RandomSeeding	1234	what seeding of pseudorandom generators should be done internally
    TimeGoal	Automatic	how long to spend training the distribution
    TrainingProgressReporting	Automatic	how to report progress during training
    ValidationSet	Automatic	the set of data on which to evaluate the model during training
- Possible settings for PerformanceGoal include:
    "DirectTraining"	train directly on the full dataset, without model searching
    "Memory"	minimize storage requirements of the distribution
    "Quality"	maximize the modeling quality of the distribution
    "Speed"	maximize speed for PDF queries
    "SamplingSpeed"	maximize speed for generating random samples
    "TrainingSpeed"	minimize time spent producing the distribution
    Automatic	automatic tradeoff among speed, quality and memory
    {goal1,goal2,…}	automatically combine goal1, goal2, etc.
- Possible settings for Method include:
    "ContingencyTable"	discretize data and store each possible probability
    "DecisionTree"	use a decision tree to compute probabilities
    "GaussianMixture"	use a mixture of Gaussian (normal) distributions
    "KernelDensityEstimation"	use a kernel mixture distribution
    "Multinormal"	use a multivariate normal (Gaussian) distribution
- The following settings for TrainingProgressReporting can be used:
    "Panel"	show a dynamically updating graphical panel
    "Print"	periodically report information using Print
    "ProgressIndicator"	show a simple ProgressIndicator
    "SimplePanel"	dynamically updating panel without learning curves
    None	do not report any information
- Possible settings for RandomSeeding include:
    Automatic	automatically reseed every time the function is called
    Inherited	use externally seeded random numbers
    seed	use an explicit integer or string as a seed
- Only reversible feature extractors can be given in the option FeatureExtractor.
- LearnDistribution[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.
- All images are first conformed using ConformImages.
- Information[LearnedDistribution[…]] generates an information panel about the distribution and its estimated performances.
Examples
Basic Examples (3)
Summary of the most common use cases
Train a distribution on a numeric dataset:

https://wolfram.com/xid/0ywjl2umsqsa-5hhlm7

Generate a new example based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-38oizs

Compute the PDF of a new example:

https://wolfram.com/xid/0ywjl2umsqsa-ed58y3
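The three steps above can be sketched as follows; the numeric data here is illustrative, not the data behind the linked cells:

```wl
(* train a distribution on a small numeric sample *)
data = {1.2, 2.3, 1.1, 3.4, 2.8, 2.1, 1.7};
dist = LearnDistribution[data];

(* draw a new sample from the learned distribution *)
RandomVariate[dist]

(* evaluate the probability density at a new point *)
PDF[dist, 2.0]
```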

Train a distribution on a nominal dataset:

https://wolfram.com/xid/0ywjl2umsqsa-1giaj9

Generate a new example based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-67g5d3

Compute the probability of the examples "A" and "B":

https://wolfram.com/xid/0ywjl2umsqsa-wmcrl5
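A minimal sketch of the nominal case; the letters here stand in for whatever examples were used in the linked cells:

```wl
(* train on nominal data *)
dist = LearnDistribution[{"A", "A", "B", "A", "B", "A"}];

(* sample from the learned distribution *)
RandomVariate[dist]

(* probabilities of the two nominal values *)
PDF[dist, "A"]
PDF[dist, "B"]
```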

Train a distribution on a two-dimensional dataset:

https://wolfram.com/xid/0ywjl2umsqsa-9eb8g

Generate a new example based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-mxqvzs

Compute the probability of two examples:

https://wolfram.com/xid/0ywjl2umsqsa-q48rx3

Impute the missing value of an example:

https://wolfram.com/xid/0ywjl2umsqsa-u4nvqc
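The two-dimensional workflow, including imputation, might look like this sketch; the correlated Gaussian data is illustrative:

```wl
(* illustrative two-dimensional training data *)
data = RandomVariate[BinormalDistribution[0.5], 200];
dist = LearnDistribution[data];

(* sample and query the learned density *)
RandomVariate[dist]
PDF[dist, {0.3, -0.2}]

(* impute the missing second component given the first *)
SynthesizeMissingValues[dist, {0.3, Missing[]}]
```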

Scope (3)
Survey of the scope of standard use cases
Train a distribution on a dataset containing numeric and nominal variables:

https://wolfram.com/xid/0ywjl2umsqsa-4k23aj

Generate a new example based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-632fcq

Impute the missing value of an example:

https://wolfram.com/xid/0ywjl2umsqsa-7pqpk0

Train a distribution on colors:

https://wolfram.com/xid/0ywjl2umsqsa-uggcn4

Generate 100 new examples based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-hk1n8f

Compute the probability density of some colors:

https://wolfram.com/xid/0ywjl2umsqsa-w1cs1m
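A sketch of the color workflow, using randomly generated colors as stand-in training data:

```wl
(* illustrative color data *)
colors = RandomColor[100];
dist = LearnDistribution[colors];

(* generate 100 new colors from the learned distribution *)
RandomVariate[dist, 100]

(* probability density of a particular color *)
PDF[dist, Pink]
```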

Train a distribution on dates:

https://wolfram.com/xid/0ywjl2umsqsa-88g0s5

Generate 10 new examples based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-pj6kpg

Compute the probability density of some new dates:

https://wolfram.com/xid/0ywjl2umsqsa-85tleo

Options (6)
Common values & functionality for each option
FeatureTypes (1)
Specify that the data is nominal:

https://wolfram.com/xid/0ywjl2umsqsa-9he65q


https://wolfram.com/xid/0ywjl2umsqsa-rlcxrs

Without specification, the data is considered numerical:

https://wolfram.com/xid/0ywjl2umsqsa-165wx1


https://wolfram.com/xid/0ywjl2umsqsa-ge4l6x
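The contrast above can be sketched as follows, with illustrative integer data:

```wl
(* force the integers to be treated as nominal categories *)
dist = LearnDistribution[{1, 2, 3, 2, 1}, FeatureTypes -> "Nominal"];
PDF[dist, 2]

(* without the option, the same data is treated as numerical *)
dist2 = LearnDistribution[{1, 2, 3, 2, 1}];
PDF[dist2, 2.5]
```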

Method (2)
Train a "Multinormal" distribution on a numeric dataset:

https://wolfram.com/xid/0ywjl2umsqsa-y6t5a

Plot the PDF along with the training data:

https://wolfram.com/xid/0ywjl2umsqsa-jfmxrj
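A sketch of training with an explicit method and overlaying the learned density on the data; the standard-normal sample is illustrative:

```wl
(* illustrative two-dimensional data *)
data = RandomVariate[NormalDistribution[], {200, 2}];
dist = LearnDistribution[data, Method -> "Multinormal"];

(* learned density with the training points on top *)
Show[
 DensityPlot[PDF[dist, {x, y}], {x, -3, 3}, {y, -3, 3}],
 ListPlot[data, PlotStyle -> Red]
]
```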

Train a distribution on a two-dimensional dataset with all available methods ("Multinormal", "ContingencyTable", "KernelDensityEstimation", "DecisionTree" and "GaussianMixture"):

https://wolfram.com/xid/0ywjl2umsqsa-xaxv0z

https://wolfram.com/xid/0ywjl2umsqsa-uro8uq

https://wolfram.com/xid/0ywjl2umsqsa-4q6tc5
Visualize the probability density of these distributions:

https://wolfram.com/xid/0ywjl2umsqsa-fgbu12

TimeGoal (2)
Learn a distribution while specifying a total training time of 5 seconds:

https://wolfram.com/xid/0ywjl2umsqsa-idsajs


https://wolfram.com/xid/0ywjl2umsqsa-c4k8t4
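Specifying a time budget might look like this sketch; the data is illustrative, and the goal can also be given as a Quantity of time:

```wl
(* illustrative numeric data *)
data = RandomVariate[NormalDistribution[], 1000];

(* spend about 5 seconds in total on training *)
dist = LearnDistribution[data, TimeGoal -> 5]
```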

Load 1000 images of the "MNIST" dataset:

https://wolfram.com/xid/0ywjl2umsqsa-1w6ors

https://wolfram.com/xid/0ywjl2umsqsa-qpozuj

Learn its distribution while specifying a target training time of 3 seconds:

https://wolfram.com/xid/0ywjl2umsqsa-8yeble

The loss value obtained (cross-entropy) is about -0.43:

https://wolfram.com/xid/0ywjl2umsqsa-zarmrq

Learn its distribution while specifying a target training time of 30 seconds:

https://wolfram.com/xid/0ywjl2umsqsa-ljkkil

The loss value obtained (cross-entropy) is about -0.978:

https://wolfram.com/xid/0ywjl2umsqsa-hd7qga

Compare the learning curves for both trainings:

https://wolfram.com/xid/0ywjl2umsqsa-1xo0an

TrainingProgressReporting (1)

https://wolfram.com/xid/0ywjl2umsqsa-pac3cx
Show training progress interactively during training:

https://wolfram.com/xid/0ywjl2umsqsa-735k7e

Show training progress interactively without plots:

https://wolfram.com/xid/0ywjl2umsqsa-fy4ogh

Print training progress periodically during training:

https://wolfram.com/xid/0ywjl2umsqsa-me00oh
Show a simple progress indicator:

https://wolfram.com/xid/0ywjl2umsqsa-jyd4wa


https://wolfram.com/xid/0ywjl2umsqsa-uhwsza
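The reporting settings above can be sketched as follows, with illustrative data:

```wl
data = RandomVariate[ExponentialDistribution[1], 500];

(* periodic Print reports instead of the dynamic panel *)
dist = LearnDistribution[data, TrainingProgressReporting -> "Print"];

(* suppress all progress reporting *)
dist = LearnDistribution[data, TrainingProgressReporting -> None];
```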
Applications (4)
Sample problems that can be solved with this function

https://wolfram.com/xid/0ywjl2umsqsa-hamat6

https://wolfram.com/xid/0ywjl2umsqsa-qzzjzs

Train a distribution on the images:

https://wolfram.com/xid/0ywjl2umsqsa-mwubth

Generate 50 new examples based on the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-8e6w9u

Compare the probability density for an image of the training set, an image of a test set, a sample from the learned distribution, an image of another dataset and a random image:

https://wolfram.com/xid/0ywjl2umsqsa-tc046x

https://wolfram.com/xid/0ywjl2umsqsa-5qu85

Obtain the probability of generating a sample with a lower PDF for each of these images:

https://wolfram.com/xid/0ywjl2umsqsa-llovue


https://wolfram.com/xid/0ywjl2umsqsa-xwcdv7

Train a distribution directly from the Tabular object:

https://wolfram.com/xid/0ywjl2umsqsa-jc15fs


https://wolfram.com/xid/0ywjl2umsqsa-izb4jx

Generate several random samples:

https://wolfram.com/xid/0ywjl2umsqsa-mk6okz

Visualize random samples of the variables "PetalLength" and "SepalLength" from the distribution and compare them with the dataset:

https://wolfram.com/xid/0ywjl2umsqsa-28azfq

Load the Titanic survival dataset:

https://wolfram.com/xid/0ywjl2umsqsa-rb9dcz

Train a distribution on the dataset:

https://wolfram.com/xid/0ywjl2umsqsa-6srmog

Use the distribution and SynthesizeMissingValues to generate complete examples from incomplete ones:

https://wolfram.com/xid/0ywjl2umsqsa-72ab23


https://wolfram.com/xid/0ywjl2umsqsa-ftucl8
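Assuming the Titanic data has been loaded into a variable `titanic`, and assuming feature names like "class", "age" and "sex" (names chosen here for illustration), the imputation step might look like:

```wl
(* learn the joint distribution of the passenger features *)
dist = LearnDistribution[titanic];

(* complete a passenger record whose age is missing *)
SynthesizeMissingValues[dist,
 <|"class" -> "1st", "age" -> Missing[], "sex" -> "female"|>]
```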

Use the distribution to predict the survival probability of a given passenger:

https://wolfram.com/xid/0ywjl2umsqsa-o4dmz


https://wolfram.com/xid/0ywjl2umsqsa-ddaxd7

Train a distribution on a two-dimensional dataset:

https://wolfram.com/xid/0ywjl2umsqsa-g4novm

https://wolfram.com/xid/0ywjl2umsqsa-4ewt2p

Plot the PDF along with the training data:

https://wolfram.com/xid/0ywjl2umsqsa-dds9l3

Use SynthesizeMissingValues to impute missing values using the learned distribution:

https://wolfram.com/xid/0ywjl2umsqsa-og5slp

Obtain the histogram of possible imputed values:

https://wolfram.com/xid/0ywjl2umsqsa-feqqk1

Cite this as:
Text
Wolfram Research (2019), LearnDistribution, Wolfram Language function, https://reference.wolfram.com/language/ref/LearnDistribution.html (updated 2025).
CMS
Wolfram Language. 2019. "LearnDistribution." Wolfram Language & System Documentation Center. Wolfram Research. Last Modified 2025. https://reference.wolfram.com/language/ref/LearnDistribution.html.
APA
Wolfram Language. (2019). LearnDistribution. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/LearnDistribution.html
BibTeX
@misc{reference.wolfram_2025_learndistribution, author="Wolfram Research", title="{LearnDistribution}", year="2025", howpublished="\url{https://reference.wolfram.com/language/ref/LearnDistribution.html}", note={Accessed: 04-April-2025}}
BibLaTeX
@online{reference.wolfram_2025_learndistribution, organization={Wolfram Research}, title={LearnDistribution}, year={2025}, url={https://reference.wolfram.com/language/ref/LearnDistribution.html}, note={Accessed: 04-April-2025}}