Classify

Classify[{example1class1,example2class2,}]

generates a ClassifierFunction[] based on the examples and classes given.

Classify[{example1,example2,}{class1,class2,}]

also generates a ClassifierFunction[] based on the examples and classes given.

Classify[class1{example11,example12,},class2{example21,},]

generates a ClassifierFunction[] based on an association of classes with their examples.

Classify[training,data]

attempts to classify data using a classifier function deduced from the training set given.

Classify["name",data]

attempts to classify data using the built-in classifier function represented by "name".

Classify[,data,prop]

gives the specified property of the classification associated with data.

Classify[classifier,opts]

takes an existing classifier function and modifies it with the new options given.

Details and Options

Examples

open allclose all

Basic Examples  (2)

Train a classifier function on labeled examples:

Use the classifier function to classify a new unlabeled example:

Obtain classification probabilities for this example:

Classify multiple examples:

Plot the probability that the class of an example is "A" as a function of the feature:

The training and the classification can be performed in one step:

Train a classifier with multiple features:

Classify a new example:

Classify an example that has missing features:

Get the probabilities for the most probable classes:

Scope  (16)

Custom Classifiers  (8)

Train a classifier on textual data:

Classify new examples:

Train a classifier on image examples, gathered by their class:

Classify new examples:

Obtain probabilities of a given class for new examples:

Train a classifier on data where the feature is a sequence of tokens:

Classify a new example:

Train a classifier on a dataset with features and classes in separate lists:

Obtain information about the classifier with Information:

Generate a ClassifierMeasurements[] object of the classifier applied to a test set:

Get the accuracy from the classifier measurements object:

Visualize the confusion matrix:

Train a classifier on a dataset with missing features:

Classify a new example:

Classify examples containing missing features:

Train a classifier on a dataset with named features. The order of the keys does not matter. Keys can be missing:

Classify a new example:

Classify examples containing missing features:

Construct a Dataset with a list of associations:

Train a classifier to predict the feature "gender" as function of the other features:

Once the classifier is trained, any input format can be used. Classify an example formatted as an association:

Find out the order of the features, and classify an example formatted as a list:

Classify examples in a Dataset:

Create an artificial dataset from three normally distributed clusters:

Train a classifier on this dataset:

Plot the training set and the probability distribution of each class as a function of the features:

Built-in Classifiers  (8)

Use the "Language" built-in classifier to detect the language in which a text is written:

Use it to detect the language of examples:

Obtain the probabilities for the most likely languages:

Restrict the classifier to some languages with the option ClassPriors:

Use the "FacebookTopic" built-in classifier to detect the topic of a Facebook post:

Classify multiple examples:

Unrecognized topics or languages will return Indeterminate:

Use the "CountryFlag" built-in classifier to recognize a country from its flag:

Use the "NameGender" built-in classifier to get the probable sex of a person from their first name:

Use the "NotablePerson" built-in classifier to determine what notable person is depicted in the given image:

Use the "Sentiment" built-in classifier to infer the sentiment of a social-media message:

Use the "Profanity" built-in classifier to return True if a text contains strong language:

Use the "Spam" built-in classifier to detect if an email is spam from its content:

Options  (21)

AcceptanceThreshold  (1)

Create a classifier with an anomaly detector:

Change the value of the acceptance threshold when evaluating the classifier:

Permanently change the value of the acceptance threshold in the classifier:

AnomalyDetector  (1)

Create a classifier and specify that an anomaly detector should be included:

Evaluate the classifier on an non-anomalous input:

Evaluate the classifier on an anomalous input:

The "Probabilities" property is not affected by the anomaly detector:

Temporarily remove the anomaly detector from the classifier:

Permanently remove the anomaly detector from the classifier:

ClassPriors  (1)

Train a classifier on an imbalanced dataset:

The training example 5False is classified as True:

Classify this example with a uniform prior over classes instead of the imbalanced training prior:

The class priors can be specified during the training:

The class priors of a classifier can also be changed after training:

FeatureExtractor  (3)

Train a FeatureExtractorFunction on a simple dataset:

Use the feature extractor function as a preprocessing step in Classify:

Train a classifier on texts preprocessed by custom functions and an extractor method:

Create a feature extractor and extract features from a dataset of texts:

Train a classifier on the extracted features:

Join the feature extractor to the classifier:

The classifier can now be used on the initial input type:

FeatureNames  (2)

Train a classifier and give a name to each feature:

Use the association format to predict a new example:

The list format can still be used:

Train a classifier on a training set with named features and use FeatureNames to set their order:

Features are ordered as specified:

Classify a new example from a list:

FeatureTypes  (2)

Train a classifier on data where the feature is intended to be a sequence of tokens:

Classify wrongly assumed that examples contained two different text features:

The following classification will output an error message:

Force Classify to interpret the feature as a "NominalSequence":

Classify a new example:

Train a classifier with named features:

Both features have been considered numerical:

Specify that the feature "gender" should be considered nominal:

IndeterminateThreshold  (1)

Specify a probability threshold when training the classifier:

Obtain class probabilities for an example:

As there are no class probabilities above 0.9, no prediction is made:

Specifying a threshold when classifying supersedes the trained threshold:

Update the value of the threshold in the classifier:

Method  (3)

Train a logistic classifier:

Train a random forest classifier:

Plot the probability of class "a" given the feature for both classifiers:

Train a nearest neighbors classifier:

Find the classification accuracy on a test set:

In this example, using a naive Bayes classifier reduces the classification accuracy:

However, using a naive Bayes classifier reduces the classification time:

MONK's problems consist of synthetic binary classification datasets used for comparing the performance of different classifiers. Generate the dataset for the second MONK problem:

Test the accuracy of each available classifier by training on 169 examples and testing on the entire dataset:

PerformanceGoal  (1)

Train a classifier with an emphasis on training speed:

Compute the classification accuracy on a test set:

By default, a compromise between classification speed and performance is sought:

With the same data, train a classifier with an emphasis on training speed and memory:

The classifier uses less memory, but is also less accurate:

TargetDevice  (1)

Train a classifier on the system's default GPU using a neural network and look at the AbsoluteTiming:

Compare the previous result with the one achieved by using the default CPU computation:

TimeGoal  (2)

Train a classifier while specifying a total training time of 5 seconds:

Load the "Mushroom" dataset:

Train a classifier while specifying a target training time of 0.1 seconds:

The classifier reached an accuracy of about 90%:

Train a classifier while specifying a target training time of 5 seconds:

The classifier reached an accuracy of about 99%:

TrainingProgreessReporting  (1)

Load the "UCILetter" dataset:

Show training progress interactively during training of a classifier:

Show training progress interactively without plots:

Print training progress periodically during training:

Show a simple progress indicator:

Do not report progress:

UtilityFunction  (1)

Train a classifier:

By default, the most probable class is predicted:

This corresponds to the following utility specification:

Train a classifier that penalizes examples of class "yes" being misclassified as "no":

The classifier decision is different despite the probabilities being unchanged:

Specifying a utility function when classifying supersedes the utility function specified at training:

Update the value of the utility function in the classifier:

ValidationSet  (1)

Train a logistic regression classifier on the Fisher iris data:

Obtain the L2 regularization coefficient of the trained classifier:

Specify a validation set:

A different L2 regularization coefficient has been selected:

Applications  (7)

Train a digit recognizer on 100 examples from the MNIST database of handwritten digits:

Use the classifier to recognize unseen digits:

Analyze probabilities of a misclassified example:

Train a classifier on 32 images of legendary creatures:

Use the classifier to recognize unseen creatures:

Train a classifier to recognize daytime from nighttime:

Test it on examples:

Train a classifier on the Fisher iris dataset to predict the species of Iris:

Predict the species of Iris from a list of features:

Test the accuracy of the classifier on a test set:

Generate a confusion matrix of the classifier on this test set:

Load the "Titanic" dataset, which contains a list of Titanic passengers with their age, sex, ticket class, and survival:

Visualize a sample of the dataset:

Train a logistic classifier on this dataset:

Calculate the survival probability of a 10-year-old girl traveling in third class:

Plot the survival probability as a function of age for some combinations of "class" and "sex":

Train a classifier that classifies movie review snippets as "positive" or "negative":

Classify an unseen movie review snippet:

Test the accuracy of the classifier on a test set:

Import examples of the writings of Shakespeare, Oscar Wilde, and Victor Hugo to train a classifier:

Generate an author classifier from these texts:

Find which author new texts are from:

Possible Issues  (1)

The RandomSeeding option does not always guarantee reproducibility of the result:

Train several classifiers on the "Titanic" dataset:

Compare the results when tested on a test set:

Neat Examples  (2)

Define and plot clusters sampled from normal distributions:

Blend colors to reflect the probability density of the different classes for each method:

Draw in the box to test a logistic classifier trained on the dataset ExampleData[{"MachineLearning","MNIST"}]:

Introduced in 2014
 (10.0)
 |
Updated in 2017
 (11.1)
2017
 (11.2)
2018
 (11.3)
2019
 (12.0)
2020
 (12.1)