ClassifierMeasurements

ClassifierMeasurements[classifier,testset,prop]

gives measurements associated with property prop when classifier is evaluated on testset.

ClassifierMeasurements[…,{prop1,prop2,…}]

gives properties prop1, prop2, etc.

ClassifierMeasurements[classifier,testset]

yields a ClassifierMeasurementsObject[…] that can be applied to any property.

Details and Options

  • The classifier is either a ClassifierFunction object as generated by Classify, or a NetGraph, NetChain, etc. in which the output is a "Class" NetDecoder.
  • ClassifierMeasurements[…,opts] specifies that the classifier should use the options opts when applied to the test set. Possible options are as given in ClassifierFunction.
  • ClassifierMeasurementsObject[…][prop] can be used to obtain the property prop from a ClassifierMeasurementsObject. When repeated property lookups are required, this is typically more efficient.
  • ClassifierMeasurementsObject[…][prop,opts] specifies that the classifier should use the options opts when applied to the test set. These options supersede options given to ClassifierMeasurements.
  • ClassifierMeasurements has the same options as ClassifierFunction[…], with the following additions:
  • Weights | Automatic | weights to be associated with test set examples
    ComputeUncertainty | False | whether measures should be given with their statistical uncertainty
  • When ComputeUncertainty→True, numerical measures will be returned as Around[result,err], where err represents the standard error (corresponding to a 68% confidence interval) associated with the measure result.
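As a rough illustration of where such an uncertainty can come from, here is a minimal Python sketch that attaches a binomial standard error to an accuracy estimate. The helper name and the binomial approximation are assumptions for illustration only; the exact method ClassifierMeasurements uses is not specified here.

```python
import math

def accuracy_with_uncertainty(actual, predicted):
    """Accuracy together with its standard error, in the spirit of
    Around[result, err]: err approximates a 68% confidence interval
    via the binomial standard error sqrt(p(1-p)/n) (an assumption)."""
    n = len(actual)
    p = sum(a == b for a, b in zip(actual, predicted)) / n
    return p, math.sqrt(p * (1 - p) / n)

acc, err = accuracy_with_uncertainty(list("ABAA"), list("ABBA"))
print(acc)  # 0.75
```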
  • Possible settings for Weights include:
  • Automatic | associates weight 1 with all test examples
    {w1,w2,…} | associates weight wi with the i-th test example
  • Changing the weight of a test example from 1 to 2 is equivalent to duplicating the example.
  • Weights affect measures as well as their uncertainties.
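The equivalence between weighting and duplication can be sketched in Python. The `weighted_accuracy` helper below is hypothetical, written only to illustrate how example weights enter a measure such as "Accuracy":

```python
def weighted_accuracy(actual, predicted, weights=None):
    """Illustrative weighted accuracy: with weight 1 for every example
    this reduces to the plain accuracy."""
    if weights is None:
        weights = [1] * len(actual)
    total = sum(weights)
    correct = sum(w for a, p, w in zip(actual, predicted, weights) if a == p)
    return correct / total

actual    = ["A", "B", "A"]
predicted = ["A", "B", "B"]
# Changing the weight of the first example from 1 to 2 is equivalent
# to duplicating that example in the test set:
assert weighted_accuracy(actual, predicted, [2, 1, 1]) == \
       weighted_accuracy(actual + ["A"], predicted + ["A"])
```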
  • Properties returning a single numeric value related to classification abilities on the test set include:
  • "Accuracy" | fraction of correctly classified examples
    "Accuracy"→n | top-n accuracy
    "AccuracyBaseline" | accuracy obtained by always predicting the most common class
    "CohenKappa" | Cohen's kappa coefficient
    "Error" | fraction of incorrectly classified examples
    "GeometricMeanProbability" | geometric mean of the actual-class probabilities
    "LogLikelihood" | log-likelihood of the model given the test set
    "MeanCrossEntropy" | mean cross entropy over test examples
    "MeanDecisionUtility" | mean utility over test examples
    "Perplexity" | exponential of the mean cross entropy
    "RejectionRate" | fraction of examples classified as Indeterminate
    "ScottPi" | Scott's pi coefficient
  • Examples classified as Indeterminate are discarded when measuring properties related to classification abilities on the test set, such as "Accuracy", "Error", or "MeanCrossEntropy".
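Several of these measures are simple functions of the probability assigned to each example's actual class. The Python sketch below (hypothetical helper, Indeterminate handling omitted) shows how a few of them relate to one another, assuming the standard definitions:

```python
import math

def single_number_measures(actual_class_probs, correct_flags):
    """Illustrative sketch: compute a few of the measures above from
    the probability each example's actual class received and whether
    each example was classified correctly."""
    n = len(actual_class_probs)
    accuracy = sum(correct_flags) / n
    mean_cross_entropy = -sum(math.log(p) for p in actual_class_probs) / n
    return {
        "Accuracy": accuracy,
        "Error": 1 - accuracy,
        "MeanCrossEntropy": mean_cross_entropy,
        # "Perplexity" is the exponential of the mean cross entropy,
        # i.e. the reciprocal of "GeometricMeanProbability":
        "Perplexity": math.exp(mean_cross_entropy),
        "GeometricMeanProbability": math.exp(-mean_cross_entropy),
        "LogLikelihood": sum(math.log(p) for p in actual_class_probs),
    }

m = single_number_measures([0.9, 0.8, 0.4], [True, True, False])
```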
  • Confusion-matrix-related properties include:
  • "ConfusionMatrix" | counts cij of class i examples classified as class j
    "ConfusionMatrixPlot" | plot of the confusion matrix
    "ConfusionMatrixPlot"→{c1,c2,…} | confusion matrix plot restricted to classes c1, c2, etc.
    "ConfusionMatrixPlot"→n | confusion matrix plot for the worst n-class subset
    "ConfusionFunction" | function giving confusion matrix values
    "TopConfusions" | pairs of classes that are most confused
    "TopConfusions"→n | n most confused class pairs
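The counts behind these properties can be sketched in Python. The helpers below are hypothetical illustrations; in particular, `top_confusions` ranks directed (actual, predicted) pairs, which may differ from how "TopConfusions" groups class pairs:

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Counts c[(i, j)] of class-i examples classified as class j,
    in the spirit of the "ConfusionMatrix" property."""
    return Counter(zip(actual, predicted))

def top_confusions(actual, predicted, n=1):
    """Most frequent off-diagonal (actual, predicted) pairs."""
    counts = confusion_counts(actual, predicted)
    off_diag = {pair: c for pair, c in counts.items() if pair[0] != pair[1]}
    return sorted(off_diag, key=off_diag.get, reverse=True)[:n]

actual    = ["A", "A", "B", "B", "B"]
predicted = ["A", "B", "B", "A", "A"]
print(confusion_counts(actual, predicted)[("B", "A")])  # 2
```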
  • Timing-related properties include:
  • "EvaluationTime" | time needed to classify one example of the test set
    "BatchEvaluationTime" | marginal time to classify one example in a batch
  • Properties returning one value for each test-set example include:
  • "DecisionUtilities" | value of the utility function for each example
    "Probabilities" | actual-class classification probabilities for each example
  • Properties returning graphics include:
  • "Report" | panel reporting the main measurements
    "ROCCurve" | receiver operating characteristic (ROC) curve for each class
    "ProbabilityHistogram" | histogram of actual-class probabilities
    "AccuracyRejectionPlot" | plot of the accuracy as a function of the rejection rate
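The accuracy/rejection trade-off behind "AccuracyRejectionPlot" can be sketched in Python. This is an illustrative assumption about the mechanism: examples whose top class probability falls below an indeterminate threshold are rejected, and accuracy is measured on the remainder:

```python
def accuracy_rejection(max_probs, correct_flags, threshold):
    """Illustrative sketch: accuracy and rejection rate at a given
    indeterminate threshold. Rejected examples (top probability below
    the threshold) are discarded before measuring accuracy."""
    kept = [(c, p) for c, p in zip(correct_flags, max_probs) if p >= threshold]
    rejection_rate = 1 - len(kept) / len(max_probs)
    accuracy = (sum(c for c, _ in kept) / len(kept)) if kept else None
    return accuracy, rejection_rate

acc, rej = accuracy_rejection([0.95, 0.6, 0.55, 0.9],
                              [True, False, True, True], 0.7)
print(acc, rej)  # 1.0 0.5
```

Raising the threshold typically raises the accuracy on the kept examples at the cost of a higher rejection rate, which is exactly the curve the plot displays.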
  • Properties returning examples from the test set include:
  • "Examples" | all test examples
    "Examples"→{i,j} | all class i examples classified as class j
    "BestClassifiedExamples" | examples having the highest actual-class probability
    "WorstClassifiedExamples" | examples having the lowest actual-class probability
    "CorrectlyClassifiedExamples" | examples correctly classified
    "MisclassifiedExamples" | examples incorrectly classified
    "TruePositiveExamples" | true positive test examples for each class
    "FalsePositiveExamples" | false positive test examples for each class
    "TrueNegativeExamples" | true negative test examples for each class
    "FalseNegativeExamples" | false negative test examples for each class
    "IndeterminateExamples" | test examples classified as Indeterminate
    "LeastCertainExamples" | examples having the highest distribution entropy
    "MostCertainExamples" | examples having the lowest distribution entropy
  • Examples are given in the form inputi→classi, where classi is the actual class from the test set.
  • Properties such as "WorstClassifiedExamples" or "MostCertainExamples" output up to 10 examples. ClassifierMeasurementsObject[…][prop→n] can be used to obtain n examples instead.
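The selection behind "WorstClassifiedExamples" can be sketched in Python, assuming (as the property descriptions above state) that examples are ranked by the probability assigned to their actual class. The helper name is hypothetical:

```python
def worst_classified_examples(examples, actual_class_probs, n=10):
    """Illustrative sketch of "WorstClassifiedExamples": return up to n
    examples sorted by actual-class probability, lowest first."""
    order = sorted(range(len(examples)), key=lambda i: actual_class_probs[i])
    return [examples[i] for i in order[:n]]

# "BestClassifiedExamples" would be the same ranking, highest first.
print(worst_classified_examples(["x1", "x2", "x3"], [0.9, 0.1, 0.5], n=2))
```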
  • Properties returning one measure for each class include:
  • "AreaUnderROCCurve" | area under the ROC curve for each class
    "ClassMeanCrossEntropy" | mean cross entropy for each class
    "ClassRejectionRate" | rejection rate for each class
    "F1Score" | F1 score for each class
    "FalseDiscoveryRate" | false discovery rate for each class
    "FalseNegativeRate" | false negative rate for each class
    "FalsePositiveRate" | false positive rate for each class
    "MatthewsCorrelationCoefficient" | Matthews correlation coefficient for each class
    "NegativePredictiveValue" | negative predictive value for each class
    "Precision" | precision of classification for each class
    "Recall" | recall rate of classification for each class
    "Specificity" | specificity for each class
    "TruePositiveNumber" | number of true positive examples for each class
    "FalsePositiveNumber" | number of false positive examples for each class
    "TrueNegativeNumber" | number of true negative examples for each class
    "FalseNegativeNumber" | number of false negative examples for each class
  • ClassifierMeasurementsObject[…][prop→class] can be used to return only the measure associated with the specified class.
  • ClassifierMeasurementsObject[…][prop→<|class1→w1,class2→w2,…|>] can be used to return a weighted average of the class measures.
  • ClassifierMeasurementsObject[…][prop→f] can be used to apply the function f to the returned class measures (e.g. ClassifierMeasurementsObject[…][prop→Mean]).
  • Properties such as "Precision" or "Recall" give one measure for each possible "positive class". The "negative class" is the union of all the classes other than the positive class. For such properties, the measures for all possible positive classes can be averaged using ClassifierMeasurementsObject[…][prop→average], where average can be:
  • "MacroAverage" | takes the mean of the class measures
    "WeightedMacroAverage" | weights each class measure by the frequency of its class
    "MicroAverage" | joins the true positive/true negative etc. examples of all classes to give a single measure
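The three averaging modes can be sketched in Python for the "Precision" property, using the standard per-class definitions (the helper below is a hypothetical illustration, not the Wolfram implementation):

```python
def per_class_precision(actual, predicted):
    """Per-class precision plus its macro, weighted-macro and micro
    averages, mirroring the three averaging modes above."""
    classes = sorted(set(actual) | set(predicted))
    prec, tp_all, fp_all = {}, 0, 0
    for c in classes:
        tp = sum(a == c and p == c for a, p in zip(actual, predicted))
        fp = sum(a != c and p == c for a, p in zip(actual, predicted))
        prec[c] = tp / (tp + fp) if tp + fp else 0.0
        tp_all += tp
        fp_all += fp
    # "MacroAverage": plain mean of the per-class measures.
    macro = sum(prec.values()) / len(classes)
    # "WeightedMacroAverage": weight each measure by its class frequency.
    freq = {c: sum(a == c for a in actual) / len(actual) for c in classes}
    weighted = sum(prec[c] * freq[c] for c in classes)
    # "MicroAverage": pool true/false positives of all classes first.
    micro = tp_all / (tp_all + fp_all)
    return prec, macro, weighted, micro
```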
  • Other properties include:
  • "ClassifierFunction" | the ClassifierFunction[…] being measured
    "Properties" | list of measurement properties available

Examples


Basic Examples  (1)

Define a training set and a test set:

Create a classifier on the training set:

Measure the accuracy of the classifier on the test set:

Measure the F1 score of each class:

Obtain a function giving the number of test examples for each pair of actual class and predicted class:

Use the function to obtain the number of examples of class "B" predicted as "A":

Generate a ClassifierMeasurementsObject of the classifier with the test set:

Perform the previous measurements using the ClassifierMeasurementsObject:

Scope  (2)

Basic Uses  (1)

Create an artificial dataset of three normally distributed clusters:

Create a dataset from the clusters:

Separate the dataset into a training set and a test set:

Train a classifier on the training set:

Generate a ClassifierMeasurementsObject of the classifier with the test set:

Measure the accuracy from the ClassifierMeasurementsObject:

Visualize the confusion matrix:

Extract and visualize test examples of class Blue that have been classified as Yellow:

Neural Networks  (1)

Load the Fisher Iris flower dataset:

Define a network to classify the dataset:

Train the network using cross entropy loss:

Generate a ClassifierMeasurementsObject of the net with the test set:

Measure the accuracy:

Generate a confusion matrix plot:

Options  (6)

ClassPriors  (1)

Load the training set and test set of the "Satellite" dataset:

Train a classifier on the training set:

Visualize the confusion matrix obtained when the classifier has a different value of ClassPriors:

Perform the same operation by first generating a ClassifierMeasurementsObject:

IndeterminateThreshold  (1)

Load the training set and test set of the "Titanic" dataset:

Train a classifier on the training set:

Visualize the confusion matrix of the classifier on the test set:

Visualize the confusion matrix obtained when the classifier has a different value of IndeterminateThreshold:

Measure the accuracy of the classifier on the test set for different values of IndeterminateThreshold:

Visualize the accuracy-versus-rejection curve:

TargetDevice  (1)

Train a classifier using a neural network:

Measure the accuracy of the classifier on a test set for different settings of TargetDevice:

UtilityFunction  (1)

Load the training set and test set of the "Mushroom" dataset:

Train a classifier on a part of the training set:

Visualize the confusion matrix of the classifier on the test set:

Visualize the confusion matrix obtained when the classifier has a different value of UtilityFunction:

Perform the same operation by first generating a ClassifierMeasurementsObject:

ComputeUncertainty  (1)

Train a classifier that classifies movie review snippets as "positive" or "negative":

Generate a ClassifierMeasurementsObject using a test set:

Obtain a measure of the accuracy along with its uncertainty:

Obtain a measure of other properties along with their uncertainties:

Weights  (1)

Create a classifier on a training set:

Generate a measurement object while specifying the weights that each test example has:

Compute the accuracy:

Weights can also be modified when using the measurement object:

Uncertainties are also affected by weights:

Applications  (2)

Train a classifier on the Fisher Iris dataset to predict the species of Iris (setosa, versicolor, virginica) from four measured features:

Measure the accuracy of the classifier on a test set:

Generate a confusion matrix to visualize the actual and predicted classifications of the test set using the classifier:

Extract examples of class "versicolor" being misclassified as "virginica":

Return the confusion matrix as a set of associations:

Train a classifier on a sample of the MNIST dataset:

Generate a ClassifierMeasurementsObject of the classifier on the MNIST test set:

Visualize the confusion matrix:

Extract examples of 9 confused with 0:

Extract the 20 worst classified examples:

Compute the F-score of each class to find the classes for which the classifier should be improved:

Find the minimal value for the rejection threshold in order for the accuracy to be above 90%:

Visualize the confusion matrix and compute the F-scores with this rejection threshold:

Visualize the 3 most confused classes:

Introduced in 2014 (10.0) | Updated in 2017 (11.1), 2018 (11.3), 2019 (12.0)