AnomalyDetection
AnomalyDetection[{example1,example2,…}]
generates an AnomalyDetectorFunction[…] based on the examples given.
AnomalyDetection[LearnedDistribution[…]]
generates an anomaly detector based on the given distribution.
AnomalyDetection[<|True->{example11,example12,…},False->{example21,…}|>]
can be used to indicate which examples should be considered anomalous.
Details and Options
- AnomalyDetection attempts to model the distribution of non-anomalous data in order to detect anomalies (i.e. "out-of-distribution" examples).
- Examples are considered anomalous when their RarerProbability value is below the value specified for AcceptanceThreshold.
- AnomalyDetection can be used on many types of data, including numerical, nominal and images.
- Each example can be a single data element, a list of data elements or an association of data elements. Examples can also be given as a Dataset or a Tabular object.
- Anomalous data can also be specified using the following syntax:
  <|True->{e11,e12,…},False->{e21,…}|>    association of anomalous (True) and non-anomalous data
  {e1,e2,…}->{True,False,…}               rule between examples and anomaly specifications
  {e1->True,e2->False,…}                  list of anomaly specification rules
  {e1,e2,…}->{i,j,…}                      anomalies at position i, j, …
  {e1,e2,…}->None                         no anomalous examples
- AnomalyDetection[examples] yields an AnomalyDetectorFunction[…] that can detect anomalies, given new examples.
- FindAnomalies[AnomalyDetectorFunction[…],data,…] can be used to find anomalies in data according to the given detector.
- When test data comes from the same distribution as training data, AcceptanceThreshold corresponds to the anomaly detection false-positive rate.
- AnomalyDetection can be used with or without indicating which examples are anomalous (and which are not). Indicating which examples are anomalous helps with training the anomaly detector and allows it to determine a value for AcceptanceThreshold automatically.
- In AnomalyDetection[<|True->{example11,example12,…},False->{example21,…}|>], True indicates that the corresponding examples are anomalous, and False that they are not. These labels can also be specified by AnomalyDetection[{example1,example2,…}->{True,False,…}] and AnomalyDetection[{example1->True,example2->False,…}].
- AnomalyDetection[{example1,example2,…}->{i,j,…}] can be used to specify that the examples at positions i, j, etc. should be considered anomalous, and that the others should be considered non-anomalous.
- AnomalyDetection[{example1,example2,…}->None] specifies that none of the examples are anomalous.
- The following options can be given:
  AcceptanceThreshold          0.001        RarerProbability threshold to consider an example anomalous
  FeatureExtractor             Identity     how to extract features from which to learn
  FeatureNames                 Automatic    feature names to assign for input data
  FeatureTypes                 Automatic    feature types to assume for input data
  Method                       Automatic    which modeling algorithm to use
  PerformanceGoal              Automatic    aspects of performance to optimize
  RandomSeeding                1234         what seeding of pseudorandom generators should be done internally
  TimeGoal                     Automatic    how long to spend training the detector
  TrainingProgressReporting    Automatic    how to report progress during training
  ValidationSet                Automatic    the set of data on which to evaluate the model during training
- Possible settings for PerformanceGoal include:
  "Memory"           minimize storage requirements of the detector
  "Quality"          maximize the modeling quality of the detector
  "Speed"            maximize speed for detecting new anomalies
  "TrainingSpeed"    minimize time spent producing the detector
  Automatic          automatic tradeoff among speed, quality and memory
  {goal1,goal2,…}    automatically combine goal1, goal2, etc.
- Possible settings for Method are as given in LearnDistribution[…].
- The following settings for TrainingProgressReporting can be used:
  "Panel"                show a dynamically updating graphical panel
  "Print"                periodically report information using Print
  "ProgressIndicator"    show a simple ProgressIndicator
  "SimplePanel"          dynamically updating panel without learning curves
  None                   do not report any information
- Possible settings for RandomSeeding include:
  Automatic    automatically reseed every time the function is called
  Inherited    use externally seeded random numbers
  seed         use an explicit integer or string as a seed
- AnomalyDetection[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.
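As a hedged illustration of the notes above, the following sketch trains a detector with several of the listed options set explicitly and then inspects RarerProbability values. The data, the threshold 0.01 and the variable names are assumptions made for this example, not values taken from this page.

```wolfram
(* Illustrative data: 300 draws from a standard normal distribution (an assumption for this sketch) *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[], 300];

(* Train a detector with a few of the options listed above set explicitly *)
detector = AnomalyDetection[data,
   AcceptanceThreshold -> 0.01,
   PerformanceGoal -> "Speed",
   RandomSeeding -> 1234,
   TrainingProgressReporting -> None];

(* Examples whose RarerProbability falls below the threshold are reported as anomalous *)
FindAnomalies[detector, {0.1, -0.4, 6.5}]

(* RarerProbability can be inspected on a separately learned distribution of the same data;
   this is a different model, so its values need not match the detector's internal ones *)
dist = LearnDistribution[data];
RarerProbability[dist, {0.1, 6.5}]
```

A value far from the bulk of the training data, such as 6.5 here, would be expected to have a very small RarerProbability and therefore to be flagged.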
Examples
Basic Examples (2)
Train a detector function on a numeric dataset:
Use the trained detector to find examples that are considered anomalous:
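A minimal sketch of these two steps follows. The numbers are illustrative assumptions rather than this page's original inputs, and the names data and detector are chosen only for this example.

```wolfram
(* Illustrative numeric training data (made up for this sketch) *)
data = {1.2, 1.5, 1.3, 1.7, 1.4, 1.6, 1.1, 1.8, 1.5, 1.3};

(* Train the detector on the examples *)
detector = AnomalyDetection[data];

(* Return the new examples that the detector considers anomalous *)
FindAnomalies[detector, {1.4, 1.6, 25., -10.}]
```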
Train an AnomalyDetectorFunction on a list of colors:
Attempt to find outliers in a list of colors using the trained anomaly detector:
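The same workflow can be sketched for colors. The particular colors below are assumptions picked for illustration (mostly reddish tones for training), not the inputs originally shown on this page.

```wolfram
(* Illustrative training colors: a handful of reddish tones *)
colors = {Red, Darker[Red], Lighter[Red], Pink, RGBColor[0.9, 0.1, 0.1],
   RGBColor[0.8, 0.2, 0.2], RGBColor[1., 0.3, 0.3], RGBColor[0.7, 0., 0.1]};

(* Train the detector on the list of colors *)
detector = AnomalyDetection[colors];

(* Look for outliers among new colors; Blue and Green are far from the training tones *)
FindAnomalies[detector, {Red, Blue, Pink, Green}]
```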
Scope (8)
Train an AnomalyDetectorFunction by labeling the anomalous examples with True, and False for the others:
Specify the anomalies in an explicit list:
Specify the anomalies with a list of rules:
Specify only the position of anomalous examples:
Train an AnomalyDetectorFunction by specifying that none of the examples are anomalous:
Use the trained AnomalyDetectorFunction to find anomalies:
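The labeling forms described above can be sketched side by side as follows. All data values and variable names here are illustrative assumptions, chosen so that the anomalies sit at known positions.

```wolfram
(* Illustrative data: 100 typical values followed by 3 known anomalies *)
SeedRandom[1];
normal = RandomVariate[NormalDistribution[1, 0.2], 100];
anomalous = {9.5, -7.2, 12.};
examples = Join[normal, anomalous];
labels = Join[ConstantArray[False, 100], ConstantArray[True, 3]];

(* Association of anomalous (True) and non-anomalous (False) examples *)
d1 = AnomalyDetection[<|True -> anomalous, False -> normal|>];

(* Rule between the examples and an explicit list of labels *)
d2 = AnomalyDetection[examples -> labels];

(* List of rules, one per example *)
d3 = AnomalyDetection[Thread[examples -> labels]];

(* Only the positions of the anomalous examples *)
d4 = AnomalyDetection[examples -> {101, 102, 103}];

(* Declare that none of the training examples are anomalous *)
d5 = AnomalyDetection[normal -> None];

(* Use one of the trained detectors on new examples *)
FindAnomalies[d5, {1.1, 0.9, 6.3}]
```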
Train an AnomalyDetectorFunction on tabular data:
Apply the detector on a new table:
Train an AnomalyDetectorFunction on a two-dimensional array of pseudorandom real numbers:
Use the trained AnomalyDetectorFunction to find anomalies in new examples with FindAnomalies:
Use the trained AnomalyDetectorFunction to find anomalies and their corresponding positions:
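A hedged sketch of these steps is given below. The array size, the seed and the test points are assumptions, and the "AnomalyPositions" property name is taken from the FindAnomalies documentation as recalled, so it should be checked against that page.

```wolfram
(* Illustrative training set: 200 points drawn uniformly from the unit square *)
SeedRandom[1];
train = RandomReal[1, {200, 2}];

(* Train the detector on the two-dimensional array *)
detector = AnomalyDetection[train];

(* New examples: two inside the unit square, two far outside it *)
new = {{0.5, 0.5}, {0.2, 0.8}, {5., 5.}, {-3., 0.4}};

(* Return the anomalous examples *)
FindAnomalies[detector, new]

(* Return the positions of the anomalous examples (property name assumed as noted above) *)
FindAnomalies[detector, new, "AnomalyPositions"]
```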
Train a LearnedDistribution on colors:
Generate an AnomalyDetectorFunction based on the trained distribution:
Use the detector function to find out-of-distribution colors:
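These three steps can be sketched as follows, again with illustrative colors that are an assumption of this example rather than the page's original data.

```wolfram
(* Illustrative training colors: shades of blue *)
colors = {Blue, Darker[Blue], Lighter[Blue], RGBColor[0.1, 0.2, 0.9],
   RGBColor[0., 0.1, 0.8], RGBColor[0.2, 0.3, 1.], RGBColor[0.1, 0.1, 0.7]};

(* Learn a distribution over the colors *)
dist = LearnDistribution[colors];

(* Build an anomaly detector from the learned distribution *)
detector = AnomalyDetection[dist];

(* Find colors that are out of distribution *)
FindAnomalies[detector, {Blue, Orange, Darker[Blue], Yellow}]
```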
Options (5)
AcceptanceThreshold (1)
Create and visualize random 3D vectors with anomalies:
Train an anomaly detector function on the training set:
Use the anomaly detector function to find and visualize the anomalous examples in the test set:
Change the anomaly detection false-positive rate by specifying the AcceptanceThreshold:
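The effect of AcceptanceThreshold can be sketched as below. The distributions, sample sizes and the threshold 0.05 are assumptions for illustration, and the visualization steps are omitted.

```wolfram
(* Illustrative 3D data: standard normal vectors, plus a few distant points in the test set *)
SeedRandom[1];
train = RandomVariate[NormalDistribution[], {500, 3}];
test = Join[
   RandomVariate[NormalDistribution[], {100, 3}],
   RandomReal[{4, 6}, {5, 3}]];

(* Detector with the default AcceptanceThreshold of 0.001 *)
default = AnomalyDetection[train];

(* Detector with a larger threshold, which should flag more examples *)
loose = AnomalyDetection[train, AcceptanceThreshold -> 0.05];

(* Compare how many test examples each detector reports as anomalous *)
{Length[FindAnomalies[default, test]], Length[FindAnomalies[loose, test]]}
```

Since AcceptanceThreshold corresponds to the false-positive rate when test and training data share a distribution, the larger threshold is expected to report roughly 5% of the ordinary test points in addition to the distant ones.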
Method (1)
PerformanceGoal (1)
Load the Fisher's Irises dataset with its numerical attributes:
Train an anomaly detector function by specifying the PerformanceGoal:
Compare the training time for anomaly detector functions with different performance goals:
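A sketch of these steps, assuming the dataset is loaded through ExampleData and that only the numerical attributes (the left-hand sides of the rules) are kept. The variable names and the choice of goals compared are assumptions of this example.

```wolfram
(* Load Fisher's Irises and keep only the numerical attributes of each example *)
iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"];
features = iris[[All, 1]];

(* Train a detector with an explicit performance goal *)
detector = AnomalyDetection[features, PerformanceGoal -> "Quality"];

(* Compare training times (in seconds) for two different goals *)
Table[
 goal -> First[AbsoluteTiming[AnomalyDetection[features, PerformanceGoal -> goal];]],
 {goal, {"TrainingSpeed", "Quality"}}]
```

Timings depend on the machine and the Wolfram Language version, so no particular values are suggested here.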
TimeGoal (1)
Applications (3)
Obtain the Fisher's Iris dataset:
Train an anomaly detector assuming no out-of-distribution examples:
Use the detector on a new, unlabeled and partial measurement:
Obtain training and test datasets of images:
Add anomalous examples to the test set:
Train an anomaly detector on the training set:
Find anomalous examples in the test set:
Obtain a random sample of training and test datasets of images:
Add anomalous examples to corrupt the datasets:
Train a "supervised" anomaly detector by specifying the position of the known anomalies in the training set:
Text
Wolfram Research (2019), AnomalyDetection, Wolfram Language function, https://reference.wolfram.com/language/ref/AnomalyDetection.html (updated 2025).