FindAnomalies

FindAnomalies[{example1,example2,…}]

gives a list of the examplei that are considered anomalous with respect to the other examples.

FindAnomalies[examples,prop]

gives the specified property related to the anomaly computation.

FindAnomalies[examples,{prop1,prop2,…}]

gives the properties propi.

FindAnomalies[AnomalyDetectorFunction[…],data]

finds anomalies in data using the anomaly detector function given.

FindAnomalies[fun,data,props]

gives properties related to the anomaly computation.
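
As a rough sketch of the detector-based forms, the following trains a detector with AnomalyDetection and reuses it on new data. The training sample, the seed and the test values are illustrative assumptions, not taken from this page, and the result depends on the model that is fitted.

```wolfram
(* Hypothetical training data: 500 draws from a standard normal distribution *)
SeedRandom[1];
train = RandomVariate[NormalDistribution[0, 1], 500];

(* Build an AnomalyDetectorFunction from the training examples *)
detector = AnomalyDetection[train];

(* Hypothetical new data; 8.3 lies far outside the training range *)
newData = {0.1, -0.4, 8.3, 1.2};

(* Elements of newData judged anomalous by the detector *)
FindAnomalies[detector, newData]

(* Same computation, returning positions instead of the elements *)
FindAnomalies[detector, newData, "AnomalyPositions"]
```

Training the detector once and passing it to FindAnomalies avoids refitting the model for every new batch of data.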

Details and Options

  • FindAnomalies can be used on many types of data, including numerical, nominal and images.
  • Each examplei can be a single data element, a list of data elements or an association of data elements. Examples can also be given as a Dataset object.
  • FindAnomalies attempts to model the distribution of non-anomalous data in order to detect anomalies (i.e. "out-of-distribution" examples). Examples are considered anomalous when their RarerProbability is below the value specified for AcceptanceThreshold.
  • In FindAnomalies[AnomalyDetectorFunction[…],data], if data comes from the same distribution as the training examples for the detector, AcceptanceThreshold corresponds to the anomaly detection false-positive rate.
  • In FindAnomalies[…,props], possible properties include:
    "Anomalies"                    examples that are considered anomalous
    "AnomalyCount"                 number of examples that are considered anomalous
    "AnomalyBooleanList"           Boolean values indicating whether examples are anomalous
    "AnomalyPositions"             list of anomaly positions
    "AnomalyRarerProbabilities"    rarer probabilities of anomalous examples
    "NonAnomalies"                 examples that are considered nonanomalous
    "RarerProbabilities"           probability to generate a sample with lower PDF than data
  • The following options can be given:
    AcceptanceThreshold          0.001        RarerProbability threshold to consider an example anomalous
    FeatureExtractor             Identity     how to extract features from which to learn
    FeatureNames                 Automatic    feature names to assign for input data
    FeatureTypes                 Automatic    feature types to assume for input data
    Method                       Automatic    which modeling algorithm to use
    PerformanceGoal              Automatic    aspects of performance to optimize
    RandomSeeding                1234         what seeding of pseudorandom generators should be done internally
    TimeGoal                     Automatic    how long to spend training the detector
    TrainingProgressReporting    Automatic    how to report progress during training
    ValidationSet                Automatic    the set of data on which to evaluate the model during training
  • Possible settings for PerformanceGoal include:
    "Quality"            maximize the modeling quality of the detector
    "Speed"              maximize speed for detecting anomalies
    Automatic            automatic tradeoff among speed, quality and memory
    {goal1,goal2,…}      automatically combine goal1, goal2, etc.
  • Possible settings for Method are as given in LearnDistribution[…].
  • The following settings for TrainingProgressReporting can be used:
    "Panel"                show a dynamically updating graphical panel
    "Print"                periodically report information using Print
    "ProgressIndicator"    show a simple ProgressIndicator
    "SimplePanel"          dynamically updating panel without learning curves
    None                   do not report any information
  • FindAnomalies[…,FeatureExtractor->"Minimal"] indicates that the internal preprocessing should be as simple as possible.
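
The properties and options listed above can be combined in a single call. The sketch below uses a small hypothetical numeric sample; the values and the 0.05 threshold are assumptions chosen for illustration, and the results depend on the model that is fitted.

```wolfram
(* Hypothetical sample: mostly small values plus two much larger ones *)
data = {1.2, 2.5, 3.2, 107.6, 4.6, 5, 5.1, 204.2};

(* Ask for a single property: positions of the anomalous examples *)
FindAnomalies[data, "AnomalyPositions"]

(* Ask for several properties at once *)
FindAnomalies[data, {"AnomalyCount", "RarerProbabilities"}]

(* Raise the threshold above its default of 0.001, so that examples with
   a RarerProbability below 0.05 are treated as anomalous *)
FindAnomalies[data, AcceptanceThreshold -> 0.05]

(* Favor speed and suppress progress reporting during training *)
FindAnomalies[data, "Anomalies",
  PerformanceGoal -> "Speed", TrainingProgressReporting -> None]
```

A larger AcceptanceThreshold flags more examples, at the cost of more false positives.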

Examples


Basic Examples  (3)

Find anomalous examples on a numeric dataset:

In[1]:=
Out[1]=
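
A minimal sketch of such a call on numeric data follows. The sample values are hypothetical rather than the original cell contents, and which elements are returned depends on the fitted model.

```wolfram
(* Hypothetical numeric sample: small values plus two much larger ones *)
data = {1.2, 2.5, 3.2, 107.6, 4.6, 5, 5.1, 204.2};

(* Returns the elements judged anomalous with respect to the others *)
FindAnomalies[data]
```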

Find anomalous examples on a nominal dataset:

In[1]:=
Out[1]=
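
A comparable sketch for nominal data, again with hypothetical values. Whether the rare label "c" is flagged depends on the fitted model and on AcceptanceThreshold, which is raised to an assumed 0.1 here.

```wolfram
(* Hypothetical nominal sample: "a" and "b" are common, "c" occurs once *)
data = {"a", "b", "a", "b", "a", "b", "a", "b", "a", "b", "c", "a", "b"};

(* Use a looser threshold than the default 0.001 for this small sample *)
FindAnomalies[data, AcceptanceThreshold -> 0.1]
```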

Generate 100 colors following a given distribution:

In[1]:=
Out[1]=

Add out-of-distribution colors and attempt to detect them:

In[2]:=
Out[2]=

Scope  (1)

Options  (5)

Applications  (4)

Introduced in 2019 (12.0)