AnomalyDetection

AnomalyDetection[{example1,example2,…}]

generates an AnomalyDetectorFunction[…] based on the examples given.

AnomalyDetection[LearnedDistribution[…]]

generates an anomaly detector based on the given distribution.

AnomalyDetection[<|True->{example11,example12,…},False->{example21,…}|>]

can be used to indicate which examples should be considered anomalous.

Details and Options

  • AnomalyDetection can be used on many types of data, including numerical, nominal and images.
  • Each examplei can be a single data element, a list of data elements or an association of data elements. Examples can also be given as a Dataset object.
  • AnomalyDetection[examples] yields an AnomalyDetectorFunction[…] that can detect anomalies, given new examples.
  • FindAnomalies[AnomalyDetectorFunction[…],data,…] can be used to find anomalies in data according to the given detector.
  • AnomalyDetection attempts to model the distribution of non-anomalous data in order to detect anomalies (i.e. "out-of-distribution" examples). Examples are considered anomalous when their RarerProbability value is below the value specified for AcceptanceThreshold.
  • When test data comes from the same distribution as training data, AcceptanceThreshold corresponds to the anomaly detection false-positive rate.
  • AnomalyDetection can be used with or without indicating which examples are anomalous (and which are not). Indicating which examples are anomalous helps with training the anomaly detector and allows it to determine a value for AcceptanceThreshold automatically.
  • In AnomalyDetection[<|True->{example11,example12,…},False->{example21,…}|>], True indicates that the corresponding examples are anomalous, and False that they are not. These labels can also be specified by AnomalyDetection[{example1,example2,…}->{True,False,…}] and AnomalyDetection[{example1->True,example2->False,…}].
  • AnomalyDetection[{example1,example2,…}->{i,j,…}] can be used to specify that examplei, examplej, etc. should be considered anomalous, and that the others should be considered non-anomalous.
  • AnomalyDetection[{example1,example2,…}->None] specifies that none of the examples are anomalous.
  • The following options can be given:
  • AcceptanceThreshold          0.001      RarerProbability threshold to consider an example anomalous
    FeatureExtractor             Identity   how to extract features from which to learn
    FeatureNames                 Automatic  feature names to assign for input data
    FeatureTypes                 Automatic  feature types to assume for input data
    Method                       Automatic  which modeling algorithm to use
    PerformanceGoal              Automatic  aspects of performance to optimize
    RandomSeeding                1234       what seeding of pseudorandom generators should be done internally
    TimeGoal                     Automatic  how long to spend training the detector
    TrainingProgressReporting    Automatic  how to report progress during training
    ValidationSet                Automatic  the set of data on which to evaluate the model during training
  • Possible settings for PerformanceGoal include:
  • "Memory"           minimize storage requirements of the detector
    "Quality"          maximize the modeling quality of the detector
    "Speed"            maximize speed for detecting new anomalies
    "TrainingSpeed"    minimize time spent producing the detector
    Automatic          automatic tradeoff among speed, quality and memory
    {goal1,goal2,…}    automatically combine goal1, goal2, etc.
  • Possible settings for Method are as given in LearnDistribution[…].
  • The following settings for TrainingProgressReporting can be used:
  • "Panel"                show a dynamically updating graphical panel
    "Print"                periodically report information using Print
    "ProgressIndicator"    show a simple ProgressIndicator
    "SimplePanel"          dynamically updating panel without learning curves
    None                   do not report any information
  • Possible settings for RandomSeeding include:
  • Automatic    automatically reseed every time the function is called
    Inherited    use externally seeded random numbers
    seed         use an explicit integer or string as a seed
  • AnomalyDetection[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.

Examples


Basic Examples  (2)

Train a detector function on a numeric dataset:

Use the trained detector to find examples that are considered anomalous:
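
The code for this example did not survive in this copy of the page. The two steps above might look like the following minimal sketch; the numeric values are made up for illustration and are not the original example data.

```wolfram
(* Illustrative training data: values clustered between roughly 1 and 7 *)
train = {1.2, 2.5, 3.2, 4.6, 5.0, 5.1, 5.3, 6.1, 6.8};

(* Train the detector; the result is an AnomalyDetectorFunction *)
ad = AnomalyDetection[train];

(* Return the elements of the new data that the detector flags as anomalous *)
FindAnomalies[ad, {1.4, -10.5, 5.7, 100.3}]
```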

Train an AnomalyDetectorFunction on a list of colors:

Attempt to find outliers in a list of colors using the trained anomaly detector:
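
A sketch of the color example, assuming a training set of randomly generated reddish colors (the generation scheme and the test colors are illustrative choices, not the original data):

```wolfram
SeedRandom[1];

(* Illustrative training data: 100 colors with a strong red channel *)
colors = Table[
   RGBColor[RandomReal[{0.6, 1}], RandomReal[{0, 0.3}], RandomReal[{0, 0.3}]],
   100];

(* Train the detector on the colors *)
ad = AnomalyDetection[colors];

(* Ask which of these colors look out of place relative to the training set *)
FindAnomalies[ad, {RGBColor[0.8, 0.1, 0.1], Blue, Green, RGBColor[0.7, 0.2, 0.1]}]
```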

Scope  (5)

Train an AnomalyDetectorFunction by labeling the anomalous examples with True, and False for the others:

Use the trained AnomalyDetectorFunction to find anomalies:
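
A sketch of labeled training, using the examples->labels form described under Details and Options. The data and labels are made up for illustration.

```wolfram
(* Illustrative data: most values lie near 0; the two large values are known anomalies *)
examples = {0.1, -0.3, 0.2, 8.5, 0.05, -0.15, -9.2, 0.3, -0.2, 0.12};

(* One label per example: True marks an anomaly, False a normal example *)
labels = {False, False, False, True, False, False, True, False, False, False};

(* Train with labels, which also lets the detector choose AcceptanceThreshold itself *)
ad = AnomalyDetection[examples -> labels];

(* Apply the detector to new values *)
FindAnomalies[ad, {0.15, 7.9, -0.1, -12.}]
```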

Train an AnomalyDetectorFunction by specifying that none of the examples are anomalous:

Use the trained AnomalyDetectorFunction to find anomalies:
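
A sketch of the ->None form, assuming a training set drawn from a standard normal distribution (an illustrative choice):

```wolfram
SeedRandom[1];

(* Illustrative training data: 200 samples from a standard normal distribution *)
train = RandomVariate[NormalDistribution[0, 1], 200];

(* Declare that none of the training examples are anomalous *)
ad = AnomalyDetection[train -> None];

(* Apply the detector to new values, two of which are far from the training data *)
FindAnomalies[ad, {0.3, -1.1, 6.5, 25.}]
```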

Train a distribution on colors:

Generate an AnomalyDetectorFunction based on the trained distribution:

Use the detector function to find out-of-distribution colors:
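
The three steps above can be sketched as follows, assuming a training set of randomly generated greenish colors (illustrative data only):

```wolfram
SeedRandom[1];

(* Illustrative training data: 100 colors with a strong green channel *)
colors = Table[
   RGBColor[RandomReal[{0, 0.3}], RandomReal[{0.5, 1}], RandomReal[{0, 0.3}]],
   100];

(* Learn a distribution over the colors; the result is a LearnedDistribution *)
ld = LearnDistribution[colors];

(* Build an anomaly detector from the learned distribution *)
ad = AnomalyDetection[ld];

(* Look for colors that do not fit the learned distribution *)
FindAnomalies[ad, {Green, Red, Darker[Green], Blue}]
```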

Train an AnomalyDetectorFunction on a two-dimensional array of pseudorandom real numbers:

Use the trained AnomalyDetectorFunction to find anomalies in new examples with FindAnomalies:

Use the trained AnomalyDetectorFunction to find anomalies and their corresponding positions:
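
A sketch of these steps on illustrative data. The "AnomalyPositions" property name is taken from memory of the FindAnomalies documentation and should be treated as an assumption to verify.

```wolfram
SeedRandom[1];

(* Illustrative training data: 200 points in the unit square *)
train = RandomReal[1, {200, 2}];
ad = AnomalyDetection[train];

(* New examples: two inside the unit square, two far outside it *)
test = {{0.5, 0.5}, {10., -3.}, {0.2, 0.9}, {-4., 7.}};

(* The anomalous examples themselves *)
FindAnomalies[ad, test]

(* Positions of the anomalous examples in test (property name assumed) *)
FindAnomalies[ad, test, "AnomalyPositions"]
```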

Obtain a random sample of training and test datasets of images:

Add anomalous examples to corrupt the datasets:

Train a "supervised" anomaly detector by specifying the position of the known anomalies in the training set:

Use the trained anomaly detector on the test set:
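
A sketch of the supervised workflow above. It assumes the "MNIST" resource is available through ResourceData and uses random-noise images as stand-in anomalies; both choices are illustrative and may differ from the original example.

```wolfram
SeedRandom[1];

(* Sample 300 digit images; the resource stores rules of the form image -> label *)
digits = RandomSample[Keys[ResourceData["MNIST", "TrainingData"]], 300];
train = digits[[;; 200]];
test = digits[[201 ;;]];

(* Corrupt both sets with 5 random-noise images of the same size *)
corruptedTrain = Join[train, Table[RandomImage[1, {28, 28}], 5]];
corruptedTest = Join[test, Table[RandomImage[1, {28, 28}], 5]];

(* The noise images occupy positions 201 to 205 of the corrupted training set *)
ad = AnomalyDetection[corruptedTrain -> Range[201, 205]];

(* Apply the detector to the corrupted test set *)
FindAnomalies[ad, corruptedTest]
```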

Options  (5)

AcceptanceThreshold  (1)

Create and visualize random 3D vectors with anomalies:

Train an anomaly detector function on the training set:

Use the anomaly detector function to find and visualize the anomalous examples in the test set:

Change the anomaly detection false-positive rate by specifying the AcceptanceThreshold:
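
A sketch of the steps above on synthetic data. The distributions, sizes and the threshold 0.05 are illustrative assumptions. Because an example is flagged when its RarerProbability falls below AcceptanceThreshold, a larger threshold flags more examples.

```wolfram
SeedRandom[1];

(* Training set: 500 standard normal 3D vectors *)
train = RandomVariate[NormalDistribution[0, 1], {500, 3}];

(* Test set: 200 similar vectors plus 10 widely scattered ones *)
test = Join[
   RandomVariate[NormalDistribution[0, 1], {200, 3}],
   RandomVariate[UniformDistribution[{-8, 8}], {10, 3}]];
ListPointPlot3D[test]

(* Train with the default AcceptanceThreshold and highlight detected anomalies *)
ad = AnomalyDetection[train];
anomalies = FindAnomalies[ad, test];
ListPointPlot3D[{test, anomalies}, PlotStyle -> {Gray, Red}]

(* Retrain with a larger threshold, which accepts a higher false-positive rate *)
adLoose = AnomalyDetection[train, AcceptanceThreshold -> 0.05];
FindAnomalies[adLoose, test]
```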

Method  (1)

Obtain training and test datasets of images:

Add "out-of-distribution" examples to the test set:

Train the anomaly detector using the "Multinormal" method:

Find anomalous examples in the test set:

Train the anomaly detector using the "KernelDensityEstimation" method and attempt to find anomalies:
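
A sketch comparing the two methods named above. It assumes the "MNIST" resource is available through ResourceData and uses random-noise images as the out-of-distribution examples; the original page may have used different data.

```wolfram
SeedRandom[1];

(* Sample digit images and split them into training and test sets *)
digits = RandomSample[Keys[ResourceData["MNIST", "TrainingData"]], 300];
train = digits[[;; 200]];

(* Append 5 random-noise images to the test set as out-of-distribution examples *)
test = Join[digits[[201 ;;]], Table[RandomImage[1, {28, 28}], 5]];

(* Detector based on the "Multinormal" method *)
adMultinormal = AnomalyDetection[train, Method -> "Multinormal"];
FindAnomalies[adMultinormal, test]

(* Detector based on the "KernelDensityEstimation" method *)
adKDE = AnomalyDetection[train, Method -> "KernelDensityEstimation"];
FindAnomalies[adKDE, test]
```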

PerformanceGoal  (1)

Load the Fisher's Irises dataset with its numerical attributes:

Train an anomaly detector function by specifying the PerformanceGoal:

Compare the training time for anomaly detector functions with different performance goals:
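
A sketch of the comparison above. It loads the iris measurements through ExampleData, which is an assumption about the data source; timings vary by machine, so no expected values are shown.

```wolfram
(* Numerical attributes only: the dataset stores rules of the form features -> species *)
iris = Keys[ExampleData[{"MachineLearning", "FisherIris"}, "TrainingData"]];

(* Train with an explicit performance goal *)
ad = AnomalyDetection[iris, PerformanceGoal -> "Quality"];

(* Training time in seconds for each of two goals *)
timings = AssociationMap[
  First[AbsoluteTiming[AnomalyDetection[iris, PerformanceGoal -> #];]] &,
  {"Quality", "TrainingSpeed"}]
```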

TimeGoal  (1)

Obtain a dataset of images and train an anomaly detector function by specifying the time goal:

Obtain the training time of the anomaly detection:
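
A sketch of the steps above, assuming the "MNIST" resource as the image source and a time goal of 3 seconds (both illustrative). The training time is measured here with AbsoluteTiming rather than read from the detector.

```wolfram
SeedRandom[1];

(* Sample 500 digit images *)
images = RandomSample[Keys[ResourceData["MNIST", "TrainingData"]], 500];

(* Train with a time goal in seconds and record the wall-clock training time *)
{time, ad} = AbsoluteTiming[AnomalyDetection[images, TimeGoal -> 3]];

(* Elapsed training time in seconds; TimeGoal is a target, not a hard limit *)
time
```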

TrainingProgressReporting  (1)

Obtain a dataset of images:

Show training progress interactively without the plots:

Print the training progress periodically during training:

Show a simple progress indicator:
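
A sketch of the three reporting modes above, using the setting names listed under Details and Options. The "MNIST" resource is an assumed image source.

```wolfram
SeedRandom[1];

(* Sample 500 digit images *)
images = RandomSample[Keys[ResourceData["MNIST", "TrainingData"]], 500];

(* Dynamically updating panel without learning curves *)
AnomalyDetection[images, TrainingProgressReporting -> "SimplePanel"];

(* Periodic progress messages printed during training *)
AnomalyDetection[images, TrainingProgressReporting -> "Print"];

(* A simple progress indicator *)
AnomalyDetection[images, TrainingProgressReporting -> "ProgressIndicator"];
```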

Applications  (1)

Obtain training and test datasets of images:

Add anomalous examples to the test set:

Train an anomaly detector on the training set:

Find anomalous examples in the test set:

Introduced in 2019 (12.0) | Updated in 2020 (12.1)