Wolfram Language & System 10.4 (2016) | Legacy Documentation

This is documentation for an earlier version of the Wolfram Language.

ClusterClassify

ClusterClassify[data]
generates a ClassifierFunction[...] by partitioning data into clusters of similar elements.

ClusterClassify[data,n]
generates a ClassifierFunction[...] with exactly n clusters.

Details and Options

  • ClusterClassify works for a variety of data types, including numerical, textual, and image, as well as dates and times.
  • The following options can be given:
    CriterionFunction   Automatic   criterion for selecting a method
    DistanceFunction    Automatic   the distance function to use
    Method              Automatic   what method to use
    PerformanceGoal     Automatic   aspect of performance to optimize
    Weights             Automatic   what weight to give to each example
  • By default, the following distance functions are used for different types of elements:
    ColorDistance          colors
    EditDistance           strings
    EuclideanDistance      numeric data
    ImageDistance          images
    JaccardDissimilarity   Boolean data
  • The setting for DistanceFunction can be any distance or dissimilarity function, or a function f defining a distance between two values.
  • Possible settings for PerformanceGoal include:
    Automatic         automatic tradeoff between speed, accuracy, and memory
    "Memory"          minimize the storage requirements of the classifier
    "Quality"         maximize the accuracy of the classifier
    "Speed"           maximize the speed of the classifier
    "TrainingSpeed"   minimize the time spent producing the classifier
  • Possible settings for Method include:
    Automatic                   automatically select a method
    "Agglomerate"               single linkage clustering algorithm
    "DBSCAN"                    density-based spatial clustering of applications with noise
    "NeighborhoodContraction"   displace examples toward high-density region
    "JarvisPatrick"             Jarvis-Patrick clustering algorithm
    "KMeans"                    k-means clustering algorithm
    "MeanShift"                 mean-shift clustering algorithm
    "KMedoids"                  partitioning around medoids
    "SpanningTree"              minimum spanning tree-based clustering algorithm
    "Spectral"                  spectral clustering algorithm
  • Some of these methods can only be used when the number of clusters is specified.
  • Possible settings for CriterionFunction include:
    "StandardDeviation"   root-mean-square standard deviation
    "RSquared"            R-squared
    "Dunn"                Dunn index
    "CalinskiHarabasz"    Calinski-Harabasz index
    "DaviesBouldin"       Davies-Bouldin index
    Automatic             internal index
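
The options listed above can be combined in the usual option -> value form. The following is a minimal sketch rather than an example from this page: the points are made up for illustration, and the pure function is only one hypothetical way to write a city-block distance by hand.

```wolfram
(* Illustrative 2D points forming two visibly separated groups *)
pts = {{1.0, 1.0}, {1.2, 0.9}, {0.8, 1.1}, {5.0, 5.0}, {5.1, 4.8}, {4.9, 5.2}};

(* DistanceFunction: a built-in distance, then a hand-written pure function *)
c1 = ClusterClassify[pts, 2, DistanceFunction -> ManhattanDistance];
c2 = ClusterClassify[pts, 2, DistanceFunction -> (Total[Abs[#1 - #2]] &)];

(* Method: k-means, with the number of clusters given explicitly *)
c3 = ClusterClassify[pts, 2, Method -> "KMeans"];

(* CriterionFunction and PerformanceGoal, with no cluster count given *)
c4 = ClusterClassify[pts, CriterionFunction -> "Dunn", PerformanceGoal -> "Quality"];

(* Class number assigned to a new point by each classifier *)
#[{1.1, 1.0}] & /@ {c1, c2, c3, c4}
```

Class numbers are arbitrary labels, so two classifiers may number the same groups differently.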

Examples

Basic Examples  (3)

Train the ClassifierFunction on some numerical data:
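
A minimal sketch of this step, using made-up sample values rather than the page's original input:

```wolfram
(* Illustrative numbers drawn by hand around 1 and around 5 *)
data = {1.1, 1.3, 0.9, 1.6, 1.2, 5.2, 5.6, 4.9, 5.1, 5.4};

(* Let ClusterClassify choose the number of clusters itself *)
c = ClusterClassify[data]
```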

Use the classifier function to classify a new unlabeled example:

Obtain classification probabilities for this example:

Classify multiple examples:
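
The three uses of the classifier function described above (a single example, its class probabilities, and several examples at once) can be sketched as follows. The data and query values are illustrative assumptions, and the cluster count is fixed at 2 only to keep the sketch predictable.

```wolfram
(* Illustrative training data with two groups *)
data = {1.1, 1.3, 0.9, 1.6, 1.2, 5.2, 5.6, 4.9, 5.1, 5.4};
c = ClusterClassify[data, 2];

(* Class number of one new example *)
c[1.4]

(* Probability of each class for that example *)
c[1.4, "Probabilities"]

(* A list of examples is classified element by element *)
c[{1.0, 3.2, 5.5}]
```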

Plot the probabilities for the two different classes over an interval:
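
One way such a plot might be produced is sketched below. The data, the plotting range 0 to 7, and the legend text are assumptions made for illustration, and the class labels are taken to be the integers 1 and 2.

```wolfram
(* Illustrative training data with two groups *)
data = {1.1, 1.3, 0.9, 1.6, 1.2, 5.2, 5.6, 4.9, 5.1, 5.4};
c = ClusterClassify[data, 2];

(* Probability of each class as a function of the input value *)
Plot[{c[x, "Probability" -> 1], c[x, "Probability" -> 2]}, {x, 0, 7},
 PlotLegends -> {"class 1", "class 2"}]
```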

Train the ClassifierFunction on some colors by requiring the number of classes to be 5:
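
A minimal sketch of training on colors with a fixed number of classes; the random colors stand in for whatever color list the original example used:

```wolfram
(* 50 random colors as illustrative input *)
colors = RandomColor[50];

(* Require exactly 5 clusters *)
c = ClusterClassify[colors, 5];

(* Class number assigned to a new color *)
c[Red]
```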

Train the ClassifierFunction on some unlabeled data:

Gather the elements by their class number:
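
Because a classifier function maps each element to its class number, it can be passed directly to GatherBy. A sketch under the assumption of made-up numerical data:

```wolfram
(* Illustrative unlabeled data with two groups *)
data = {1.1, 1.3, 0.9, 1.6, 1.2, 5.2, 5.6, 4.9, 5.1, 5.4};
c = ClusterClassify[data, 2];

(* Sublists of elements that share a class number *)
GatherBy[data, c]
```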

Train the ClassifierFunction on some strings:

Gather the elements by their class number:
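
The same pattern applies to strings. The word list below is a made-up illustration, not the original input, and the cluster count of 2 is an assumption:

```wolfram
(* Illustrative words with two families of similar spellings *)
words = {"apple", "apples", "ample", "orange", "oranges", "range"};

(* Train on the strings, asking for two clusters *)
c = ClusterClassify[words, 2];

(* Group the words by the class number the classifier assigns *)
GatherBy[words, c]
```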

Introduced in 2016 (10.4)