Predict[{in1->out1,in2->out2,…}]
generates a PredictorFunction[…] based on the example input-output pairs given.

Predict[{in1,in2,…}->{out1,out2,…}]
generates the same result.

Predict[training,input]
attempts to predict the output associated with input from the training examples given.

Predict["name",input]
uses the built-in predictor function represented by "name".

Predict[predictor,opts]
takes an existing predictor function and modifies it with the new options given.

Details and Options

  • Predict can be used on many types of data, including numerical, textual, sounds and images, as well as combinations of these.
  • Each input_i can be a single data element, a list of data elements, an association of data elements or a Dataset object. In Predict[training,…], training can be a Dataset object.
  • Predict[training] returns a PredictorFunction[…] that can then be applied to specific data.
  • In Predict[…,input], input can be a single item or a list of items.
  • In Predict[…,input,prop], properties are as given in PredictorFunction[…]; they include:
      "Decision"        best prediction according to distribution and utility function
      "Distribution"    distribution of value conditioned on input
      "SHAPValues"      Shapley additive feature explanations for each example
      "Properties"      list of all properties available
  • "SHAPValues" assesses the contribution of features by comparing predictions with different sets of features removed and then synthesized. The option MissingValueSynthesis can be used to specify how the missing features are synthesized. SHAP explanations are given as deviations from the training output mean. "SHAPValues"->n can be used to control the number of samples used for the numeric estimation of SHAP explanations.
  • Examples of built-in predictor functions include:
      "NameAge"         age of a person, given their first name
  • The following options can be given:
    AnomalyDetector            None       anomaly detector used by the predictor
    AcceptanceThreshold        Automatic  rarer probability threshold for anomaly detector
    FeatureExtractor           Identity   how to extract features from which to learn
    FeatureNames               Automatic  feature names to assign for input data
    FeatureTypes               Automatic  feature types to assume for input data
    IndeterminateThreshold     0          below what probability density to return Indeterminate
    Method                     Automatic  which regression algorithm to use
    MissingValueSynthesis      Automatic  how to synthesize missing values
    PerformanceGoal            Automatic  aspects of performance to try to optimize
    RecalibrationFunction      Automatic  how to post-process predicted value
    RandomSeeding              1234       what seeding of pseudorandom generators should be done internally
    TargetDevice               "CPU"      the target device on which to perform training
    TimeGoal                   Automatic  how long to spend training the predictor
    TrainingProgressReporting  Automatic  how to report progress during training
    UtilityFunction            Automatic  utility as function of actual and predicted value
    ValidationSet              Automatic  data on which to validate the model generated
  • Possible settings for PerformanceGoal include:
    "DirectTraining"     train directly on the full dataset, without model searching
    "Memory"             minimize storage requirements of the predictor
    "Quality"            maximize accuracy of the predictor
    "Speed"              maximize speed of the predictor
    "TrainingSpeed"      minimize time spent producing the predictor
    Automatic            automatic tradeoff among speed, accuracy and memory
    {goal1,goal2,…}      automatically combine goal1, goal2, etc.
  • Possible settings for Method include:
    "DecisionTree"            predict using a decision tree
    "GradientBoostedTrees"    predict using an ensemble of trees trained with gradient boosting
    "LinearRegression"        predict from linear combinations of features
    "NearestNeighbors"        predict from nearest neighboring examples
    "NeuralNetwork"           predict using an artificial neural network
    "RandomForest"            predict from Breiman-Cutler ensembles of decision trees
    "GaussianProcess"         predict using a Gaussian process prior over functions
  • The following settings for TrainingProgressReporting can be used:
    "Panel"                show a dynamically updating graphical panel
    "Print"                periodically report information using Print
    "ProgressIndicator"    show a simple ProgressIndicator
    "SimplePanel"          dynamically updating panel without learning curves
    None                   do not report any information
  • Possible settings for RandomSeeding include:
    Automatic    automatically reseed every time the function is called
    Inherited    use externally seeded random numbers
    seed         use an explicit integer or string as a seed
  • Predict[{assoc1,assoc2,…}->"key",…] can be used to specify that the output is given by the value of "key" in each association assoc_i.
  • Predict[{list1,list2,…}->n,…] can be used to specify that the output is given by the value of part n in each list list_i.
  • Predict[Dataset[…]->part,…] can be used to specify that the outputs are given by the value of part of each row of the dataset.
  • Predict[FittedModel[…]] can be used to convert a fitted model into a PredictorFunction[…].
  • Predict[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.
  • In Predict[PredictorFunction[…],FeatureExtractor->fe], the FeatureExtractorFunction[…] fe will be prepended to the existing feature extractor.
  • Information can be used on the PredictorFunction[…] obtained.
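
The output-selection forms described above can be sketched as follows. The records, the key names and the symbols assocs, lists, pKey, pPart and pData are made up for illustration and are not taken from the original page:

```wolfram
(* Hypothetical records; "weight" plays the role of the output *)
assocs = {
   <|"age" -> 32, "height" -> 1.75, "weight" -> 70.|>,
   <|"age" -> 45, "height" -> 1.62, "weight" -> 61.|>,
   <|"age" -> 27, "height" -> 1.88, "weight" -> 84.|>,
   <|"age" -> 51, "height" -> 1.70, "weight" -> 73.|>};

(* Output given by the value of a key in each association *)
pKey = Predict[assocs -> "weight"];
pKey[<|"age" -> 30, "height" -> 1.80|>]

(* Same data as plain lists; part 3 of each list is the output *)
lists = Values /@ assocs;
pPart = Predict[lists -> 3];
pPart[{30, 1.80}]

(* A Dataset can be used in the same way *)
pData = Predict[Dataset[assocs] -> "weight"];

(* Summary of a trained predictor *)
Information[pKey]
```

With so few examples the predictions are not meaningful; the point is only the shape of the training argument.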



Basic Examples  (2)

Train a predictor function on a set of examples:

Predict the value of a new example, given its feature:

Get the conditional distribution of the value, given the example feature:

Plot this distribution:

Predict multiple examples:

Plot the predicted values as a function of the feature value and show the training examples:
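
The input cells for the steps above are not reproduced here. A minimal sketch of the whole sequence, assuming a small made-up training set (trainingData, p and dist are illustrative names):

```wolfram
(* Hypothetical training set: each rule is feature -> value *)
trainingData = {1 -> 1.3, 2 -> 2.4, 3 -> 4.4, 4 -> 5.1, 6 -> 7.3};
p = Predict[trainingData];

(* Predict the value for a new feature *)
p[3.4]

(* Conditional distribution of the value, given the feature *)
dist = p[3.4, "Distribution"];
Plot[PDF[dist, x], {x, 0, 10}, PlotRange -> All]

(* Predict several examples at once *)
p[{0.5, 3.4, 5.5}]

(* Predicted curve together with the training points *)
Show[
 Plot[p[x], {x, 0, 7}],
 ListPlot[List @@@ trainingData, PlotStyle -> Red]
]
```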

Train a predictor on a dataset with multiple features:

Predict the value of a new example, given its features:

Predict the value of a new example that has a missing feature:
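
A sketch of these three steps, assuming made-up data with one numerical and one nominal feature:

```wolfram
(* Hypothetical data: {numerical feature, nominal feature} -> value *)
p = Predict[{
    {1.4, "A"} -> 1.2, {1.5, "A"} -> 1.3, {2.3, "B"} -> 2.5,
    {5.4, "B"} -> 4.8, {3.1, "A"} -> 2.9, {4.2, "B"} -> 4.1}];

(* All features known *)
p[{2.5, "A"}]

(* First feature missing *)
p[{Missing[], "B"}]
```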

Scope  (8)

Train a predictor to predict the colored area of an image:

Predict the values of new examples:

Train a predictor on data where the feature is a sequence of tokens:

Predict a new example:

Train a predictor on a dataset with features and values in separate lists:

Obtain information about the predictor:

Train a nearest-neighbors predictor on a dataset containing missing features:

Predict the value of a new example:

Predict values on examples containing missing features:

Train a predictor on a dataset with named features. The order of the keys does not matter. Keys can be missing:

Predict a new example:

Predict examples containing missing features:

Construct a Dataset with a list of associations:

Train a predictor to predict the feature "age" as a function of the other features:

Once the predictor is trained, any input format can be used. Predict an example formatted as an association:

Find out the order of the features and predict an example formatted as a list:

Predict examples in a Dataset:

Create and visualize an artificial dataset from the expression Cos[x*y]:

Train a predictor on the dataset:

Visualize the prediction surface:

Use the built-in predictor "NameAge" to predict the age of a person from their first name:

Visualize the distribution of age for a given name:
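
A sketch of both steps, using the name "Claire" that appears later on this page. The built-in predictor may need to be downloaded on first use, and the plot range is an assumption:

```wolfram
(* Predicted age for a first name *)
Predict["NameAge", "Claire"]

(* Distribution of age for the same name *)
dist = Predict["NameAge", "Claire", "Distribution"];
Plot[PDF[dist, x], {x, 0, 100}, PlotRange -> All]
```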

Options  (23)

AcceptanceThreshold  (1)

Create a predictor with an anomaly detector:

Change the value of the acceptance threshold when evaluating the predictor:

Permanently change the value of the acceptance threshold in the predictor:

AnomalyDetector  (1)

Create a predictor and specify that an anomaly detector should be included:

Evaluate the predictor on a non-anomalous input:

Evaluate the predictor on an anomalous input:

The "Distribution" property is not affected by the anomaly detector:

Temporarily remove the anomaly detector from the predictor:

Permanently remove the anomaly detector from the predictor:
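
A sketch of the sequence above on made-up data. The input 1000. is assumed to lie far enough from the training inputs to be flagged, which depends on the detector that is learned:

```wolfram
(* Hypothetical training data on a straight line *)
data = Table[x -> 2. x + 1., {x, 0., 10., 0.5}];

(* Include an anomaly detector in the predictor *)
p = Predict[data, AnomalyDetector -> Automatic];

p[3.]       (* input inside the training range *)
p[1000.]    (* input far outside the training range *)

(* The "Distribution" property is still available *)
p[1000., "Distribution"]

(* Temporarily ignore the anomaly detector for one evaluation *)
p[1000., AnomalyDetector -> None]

(* Build a new predictor without the anomaly detector *)
pNoDetector = Predict[p, AnomalyDetector -> None];
pNoDetector[1000.]
```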

FeatureExtractor  (2)

Generate a predictor function using FeatureExtractor to preprocess the data using a custom function:

Add the "StandardizedVector" method to the preprocessing pipeline:

Use the predictor on new data:

Create a feature extractor and extract features from a dataset:

Train a predictor on the extracted features:

Join the feature extractor to the predictor:

The predictor can now be used on the initial input type:

FeatureNames  (2)

Train a predictor and give a name to each feature:

Use the association format to predict a new example:

The list format can still be used:
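
A sketch of this example, assuming made-up data and the hypothetical feature names "size" and "type":

```wolfram
(* Hypothetical data with two unnamed features *)
p = Predict[{
    {1.4, "A"} -> 1.2, {1.5, "A"} -> 1.3, {2.3, "B"} -> 2.5,
    {5.4, "B"} -> 4.8, {3.1, "A"} -> 2.9, {4.2, "B"} -> 4.1},
   FeatureNames -> {"size", "type"}];

(* Association format, using the assigned names *)
p[<|"size" -> 2.5, "type" -> "A"|>]

(* List format, in the order given by FeatureNames *)
p[{2.5, "A"}]
```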

Train a predictor on a training set with named features and use FeatureNames to set their order:

Features are ordered as specified:

Predict a new example from a list:

FeatureTypes  (2)

Train a predictor on textual and nominal data:

The first feature has been wrongly interpreted as a nominal feature:

Specify that the first feature should be considered textual:

Predict a new example:

Train a predictor with named features:

Both features have been considered numerical:

Specify that the feature "gender" should be considered nominal:

IndeterminateThreshold  (1)

Specify a probability density threshold when training the predictor:

Visualize the probability density for a given example:

As no value has a probability density above 0.5, no prediction is made:

Specifying a threshold when predicting supersedes the trained threshold:

Update the value of the threshold in the predictor:
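
A sketch of these steps on made-up data. Whether Indeterminate is actually returned depends on the predicted density, so no outputs are shown:

```wolfram
(* Hypothetical training data *)
data = {1 -> 1.3, 2 -> 2.4, 3 -> 4.4, 4 -> 5.1, 6 -> 7.3};

(* Threshold set at training time *)
p = Predict[data, IndeterminateThreshold -> 0.5];

(* Probability density for one example *)
dist = p[3.4, "Distribution"];
Plot[PDF[dist, x], {x, 0, 10}, PlotRange -> All]

(* Uses the trained threshold *)
p[3.4]

(* A threshold given at prediction time takes precedence *)
p[3.4, IndeterminateThreshold -> 0.1]

(* New predictor with an updated threshold *)
p2 = Predict[p, IndeterminateThreshold -> 0.1];
```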

Method  (4)

Train a linear predictor:

Train a nearest-neighbors predictor:

Plot the predicted value as a function of the feature for both predictors:
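
A sketch of the comparison, assuming a small made-up dataset (pLinear and pNearest are illustrative names):

```wolfram
(* Hypothetical training data *)
data = {1 -> 1.3, 2 -> 2.4, 3 -> 4.4, 4 -> 5.1, 6 -> 7.3};

pLinear = Predict[data, Method -> "LinearRegression"];
pNearest = Predict[data, Method -> "NearestNeighbors"];

(* Both prediction curves, with the training points overlaid *)
Show[
 Plot[{pLinear[x], pNearest[x]}, {x, 0, 7},
  PlotLegends -> {"LinearRegression", "NearestNeighbors"}],
 ListPlot[List @@@ data, PlotStyle -> Red]
]
```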

Train a random forest predictor:

Find the standard deviation of the residuals on a test set:

In this example, using a linear regression predictor increases the standard deviation of the residuals:

However, using a linear regression predictor reduces the training time:

Train a linear regression, neural network, and Gaussian process predictor:

These methods produce smooth predictors:

Train a random forest and nearest-neighbor predictor:

These methods produce non-smooth predictors:

Train a neural network, a random forest, and a Gaussian process predictor:

The Gaussian process predictor is smooth and handles small datasets well:

MissingValueSynthesis  (1)

Train a predictor with two input features:

Get the prediction for an example that has a missing value:

Set the missing value synthesis to replace each missing variable with its estimated most likely value given known values (which is the default behavior):

Replace missing variables with random samples conditioned on known values:

Averaging over many random imputations is usually the best strategy, and it also gives an estimate of the uncertainty caused by the imputation:

Specify a learning method during training to control how the distribution of data is learned:

Predict an example with missing values using the "KernelDensityEstimation" distribution to condition values:

Provide an existing LearnedDistribution at training to use it when imputing missing values during training and later evaluations:

Specify an existing LearnedDistribution to synthesize missing values for an individual evaluation:

Control both the learning method and the evaluation strategy by passing an association at training:

PerformanceGoal  (1)

Train a predictor with an emphasis on training speed:

Find the standard deviation of the residuals on a test set:

By default, a compromise between prediction speed and performance is sought:

With the same data, train a predictor with an emphasis on training speed and memory:

The predictor uses less memory, but is also less accurate:

RecalibrationFunction  (1)

Load the "BostonHomes" dataset:

Train a predictor with model calibration:

Visualize the comparison plot on a test set:

Remove the recalibration function from the predictor:

Visualize the new comparison plot:

TargetDevice  (1)

Train a predictor on the system's default GPU using a neural network and look at the AbsoluteTiming:

Compare the previous result with the one achieved by using the default CPU computation:

TimeGoal  (2)

Train a predictor while specifying a total training time of 3 seconds:

Load the "BostonHomes" dataset:

Train a predictor while specifying a target training time of 0.1 seconds:

The predictor reached a standard deviation of about 3.2:

Train a predictor while specifying a target training time of 5 seconds:

The standard deviation of the predictor is now around 2.7:

TrainingProgressReporting  (1)

Load the "WineQuality" dataset:

Show training progress interactively during training of a predictor:

Show training progress interactively without plots:

Print training progress periodically during training:

Show a simple progress indicator:

Do not report progress:
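
A sketch of the five settings, assuming the dataset is loaded through ExampleData. Each call retrains the predictor, and the dynamic panels only display in a notebook front end:

```wolfram
(* Load the training part of the dataset *)
trainingset = ExampleData[{"MachineLearning", "WineQuality"}, "TrainingData"];

(* Dynamically updating panel with learning curves *)
Predict[trainingset, TrainingProgressReporting -> "Panel"];

(* Panel without plots *)
Predict[trainingset, TrainingProgressReporting -> "SimplePanel"];

(* Periodic printed reports *)
Predict[trainingset, TrainingProgressReporting -> "Print"];

(* Simple progress indicator *)
Predict[trainingset, TrainingProgressReporting -> "ProgressIndicator"];

(* No reporting *)
p = Predict[trainingset, TrainingProgressReporting -> None];
```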

UtilityFunction  (2)

Train a predictor:

Visualize the probability density for a given example:

By default, the value with the highest probability density is predicted:

This corresponds to a Dirac delta utility function:

Define a utility function that penalizes predicted values that are smaller than the actual value:

Plot this function for a given actual value:

Train a predictor with this utility function:

The predictor's decision is now different, even though the probability density is unchanged:

Specifying a utility function when predicting supersedes the utility function specified at training:

Update the predictor utility:

Visualize the distribution of age for the name "Claire" with the built-in predictor "NameAge":

The most likely value of this distribution is the following:

Change the utility function to predict the mean value instead of the most likely value:

ValidationSet  (1)

Train a linear regression predictor on the "WineQuality" data:

Obtain the L2 regularization coefficient of the trained predictor:

Specify a validation set:

A different L2 regularization coefficient has been selected:

Applications  (5)

Train a predictor that predicts the median value of properties in a neighborhood of Boston, given some features of the neighborhood:

Generate a PredictorMeasurementsObject to analyze the performance of the predictor on a test set:

Visualize a scatter plot of the values of the test set as a function of the predicted values:

Compute the root mean square of the residuals:
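
A sketch of this workflow, assuming the data comes from the "BostonHomes" entry of ExampleData and that the "StandardDeviation" property is the measurement intended by the last step:

```wolfram
(* Training and test parts of the dataset *)
trainingset = ExampleData[{"MachineLearning", "BostonHomes"}, "TrainingData"];
testset = ExampleData[{"MachineLearning", "BostonHomes"}, "TestData"];

(* Train the predictor *)
p = Predict[trainingset];

(* Measure performance on the test set *)
pm = PredictorMeasurements[p, testset];

(* Scatter plot of actual against predicted values *)
pm["ComparisonPlot"]

(* Root mean square of the residuals *)
pm["StandardDeviation"]
```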

Load a dataset of the average monthly temperature as a function of the city, the year, and the month:

Visualize a sample of the dataset:

Train a linear predictor on the dataset:

Plot the predicted temperature distribution of the city "Lincoln" in 2020 for different months:

For every month, plot the predicted temperature and its error bar (standard deviation):

Load a dataset of wine quality as a function of the wines' physical properties:

Visualize a few data points:

Get a description of the variables in the dataset:

Visualize the distribution of the "alcohol" and "pH" variables:

Train a predictor on the training set:

Predict the quality of an unknown wine:

Create a function that predicts the quality of the unknown wine as a function of its pH and alcohol level:

Plot this function to get a hint on how to improve this wine:

Load a dataset of wine quality as a function of the wines' physical properties:

Train a predictor to estimate wine quality:

Examine an example bottle:

Predict the example bottle's quality:

Calculate how much higher or lower this bottle's predicted quality is than the mean:

Get an estimation for how much each feature impacted the predictor's output for this bottle:

Visualize these feature impacts:

Confirm that the Shapley values fully explain the predicted quality:

Learn a distribution of the data that treats each feature as independent:

Estimate SHAP value feature importance for 100 bottles of wine, using 5 samples for each estimation:

Calculate how important each feature is to the model:

Visualize the model's feature importance:

Visualize a nonlinear relationship between a feature's value and its impact on the model's prediction:

Generate images of gauges associated with their values:

Train a predictor on this dataset:

Predict the value of a gauge from its image:

Interact with the predictor using Dynamic:

Properties & Relations  (1)

The linear regression predictor without regularization and LinearModelFit can train equivalent models:
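
A sketch of such a comparison on made-up data. The suboption "L2Regularization" -> 0 is assumed to be the way to switch regularization off for the "LinearRegression" method:

```wolfram
(* Hypothetical training data *)
data = {1 -> 1.3, 2 -> 2.4, 3 -> 4.4, 4 -> 5.1, 6 -> 7.3};

(* Linear regression predictor without L2 regularization *)
p = Predict[data, Method -> {"LinearRegression", "L2Regularization" -> 0}];

(* Ordinary least-squares fit on the same points *)
lm = LinearModelFit[List @@@ data, x, x];

(* Compare the two predictions at one point *)
{p[3.5], lm[3.5]}
```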

Fit and NonlinearModelFit can also be equivalent:

Possible Issues  (1)

The RandomSeeding option does not always guarantee reproducibility of the result:

Train several predictors on the "WineQuality" dataset:

Compare the results when tested on a test set:

Neat Examples  (1)

Create a function to visualize the predictions of a given method after learning from 1D data:

Try the function with the "GaussianProcess" method on a simple dataset:

Visualize the prediction of other methods:

Wolfram Research (2014), Predict, Wolfram Language function, (updated 2021).

