Supervised Machine Learning
Supervised machine learning is the attempt to classify data or predict outcomes using mathematical models trained on labeled datasets. It is used to solve problems such as score estimation (customer satisfaction, quality assessment, …), forecasting (prices, agricultural yields, …) and data classification (spam detection, copyrighted or violent content, …). The Wolfram Language supports the most common supervised learning algorithms, packaged into high-level functions that automatically handle tasks such as missing data imputation, feature selection and extraction, model selection and cross-validation.
Classification
Classify — classify data into categories using a built-in classifier or learning from examples
ClassifierFunction — symbolic representation of a classifier to be applied to data
ClassifierMeasurements — performance on test data
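The three classification functions above are typically used together: train, apply, then measure. The following is a minimal sketch of that workflow; the training and test values are made up for illustration, and the method is left to automatic selection.

```wolfram
(* Train a classifier from labeled examples (illustrative, made-up data) *)
c = Classify[{1.0 -> "A", 1.5 -> "A", 2.0 -> "A", 4.0 -> "B", 4.5 -> "B", 5.0 -> "B"}];

(* c is a ClassifierFunction and can be applied to new data *)
c[1.2]

(* Ask for class probabilities instead of a single class *)
c[4.8, "Probabilities"]

(* Measure performance on held-out labeled examples (also made up) *)
cm = ClassifierMeasurements[c, {1.1 -> "A", 4.9 -> "B"}];
cm["Accuracy"]
```

With a dataset this small the measurements are not meaningful; the point is only the shape of the train, apply, measure sequence.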
Regression
Predict — predict values from data using a built-in predictor or learning from examples
PredictorFunction — symbolic representation of a predictor to be applied to data
PredictorMeasurements — performance on test data
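Regression follows the same pattern as classification, with numeric outputs in place of class labels. The sketch below assumes a small made-up dataset and automatic method selection.

```wolfram
(* Train a predictor from input -> value examples (illustrative, made-up data) *)
p = Predict[{1 -> 1.3, 2 -> 2.4, 3 -> 4.4, 4 -> 5.1, 5 -> 6.0}];

(* p is a PredictorFunction; apply it to a new input *)
p[3.5]

(* Request the predictive distribution rather than a point estimate *)
p[3.5, "Distribution"]

(* Measure performance on held-out examples (also made up) *)
pm = PredictorMeasurements[p, {1.5 -> 1.9, 4.5 -> 5.4}];
pm["StandardDeviation"]
```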
Object Detection
TrainImageContentDetector, TrainTextContentDetector — train custom detectors
ContentDetectorFunction — symbolic representation of a detector to be applied to data
Sequence Forecasting
SequencePredict — predict subsequent elements from sequence examples
SequencePredictorFunction — symbolic representation of a sequence predictor
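As a rough illustration of sequence prediction, the sketch below trains on a few made-up integer sequences and asks for the most likely continuation. The training sequences are hypothetical, and the properties available may vary between versions.

```wolfram
(* Train on example sequences (illustrative, made-up data) *)
sp = SequencePredict[{{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3, 4, 5}, {3, 4, 5}}];

(* sp is a SequencePredictorFunction; predict the next element of a new sequence *)
sp[{1, 2, 3}]

(* Ask for the next two elements instead of one *)
sp[{1, 2}, "NextElement" -> 2]
```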
Learning from Actions
BayesianMinimization — model-based minimization of arbitrary objective functions
ActiveClassification — learn a classifier by actively probing a system
ActivePrediction — learn a predictor by actively probing a system
ActiveClassificationObject ▪ ActivePredictionObject
Specific Supervised Learning Methods
Nearest, NearestNeighborGraph — find nearest neighbors
FindFit — find a generalized nonlinear fit
LinearModelFit ▪ LogitModelFit ▪ NonlinearModelFit ▪ GeneralizedLinearModelFit ▪ ProbitModelFit
TimeSeriesModelFit — fit a wide variety of types of time series
Interpolation — find an interpolation of values in a dataset
FindFormula — find a simple symbolic formula for data
FindSequenceFunction — find a function to reproduce a discrete sequence
FindHiddenMarkovStates — find the most probable path in a Markov model
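The functions in this group fit one specific kind of model rather than selecting a method automatically. The sketch below shows three of them on a small made-up dataset; the symbols x, a and b are placeholders chosen for this example.

```wolfram
(* Illustrative, made-up {x, y} pairs that are roughly linear *)
data = {{1, 2.1}, {2, 3.9}, {3, 6.2}, {4, 7.8}, {5, 10.1}};

(* Fit a linear model with basis function x in the variable x *)
lm = LinearModelFit[data, x, x];
lm["BestFit"]    (* fitted expression *)
lm["RSquared"]   (* goodness of fit *)
lm[6]            (* evaluate the fitted model at x = 6 *)

(* Fit the parameters a and b of an explicit model form *)
FindFit[data, a + b x, {a, b}, x]

(* Build a nearest-neighbor function and query it *)
nf = Nearest[{1, 2, 4, 8, 16}];
nf[5]
```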
Supervised Learning Methods
"DecisionTree" — use a decision tree
"LogisticRegression" — use probabilities from linear combinations of features
"RandomForest" — use Breiman–Cutler ensembles of decision trees
"SupportVectorMachine" — classify using a support vector machine
"GradientBoostedTrees" ▪ "NearestNeighbors" ▪ "Markov" ▪ ...
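These method names are passed as strings through the Method option of functions such as Classify and Predict. A minimal sketch, assuming a small made-up training set:

```wolfram
(* Illustrative, made-up training data *)
train = {1.0 -> "A", 1.2 -> "A", 1.5 -> "A", 2.0 -> "A",
         4.0 -> "B", 4.4 -> "B", 4.7 -> "B", 5.0 -> "B"};

(* Request a specific method instead of automatic selection *)
c1 = Classify[train, Method -> "LogisticRegression"];
c2 = Classify[train, Method -> "RandomForest"];

(* Check which method a trained classifier uses *)
Information[c1, "Method"]

(* Both classifiers are applied the same way *)
{c1[1.1], c2[4.9]}
```

When Method is omitted, the method is chosen automatically, so an explicit setting is mainly useful for comparing methods or for reproducing a known configuration.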
Machine Learning Options
AnomalyDetector — how to detect anomalies in input data
ComputeUncertainty — return values including uncertainty (as Around)
FeatureExtractor — how to extract features to learn from
FeatureTypes — feature types to assume for input data
MissingValuePattern — specify how missing values are represented in data
MissingValueSynthesis — how to synthesize missing values
PerformanceGoal — whether to optimize for memory, quality or speed
RandomSeeding — how to seed randomization
RecalibrationFunction — how to post-process model predictions
TimeGoal — how long to allocate for training etc.
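These options are given as rules when training. The sketch below combines a few of them; the data and the particular settings (seed value, time budget) are arbitrary choices for illustration.

```wolfram
(* Illustrative, made-up training data *)
data = {1 -> 1.3, 2 -> 2.4, 3 -> 4.4, 4 -> 5.1, 5 -> 6.0, 6 -> 7.2};

(* Favor model quality, fix the random seed, and allow about 10 seconds of training *)
p = Predict[data,
   PerformanceGoal -> "Quality",
   RandomSeeding -> 1234,
   TimeGoal -> 10];
p[3.5]

(* Treat integer inputs as categories rather than numbers *)
c = Classify[{1 -> "odd", 2 -> "even", 3 -> "odd", 4 -> "even"},
   FeatureTypes -> "Nominal"];
c[3]
```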