FindClusters

FindClusters[{e1, e2, …}]
partitions the ei into clusters of similar elements.

FindClusters[{e1 -> v1, e2 -> v2, …}]
returns the vi corresponding to the ei in each cluster.

FindClusters[{e1, e2, …} -> {v1, v2, …}]
gives the same result.

FindClusters[<|label1 -> e1, label2 -> e2, …|>]
returns the labeli corresponding to the ei in each cluster.

FindClusters[data,n]
partitions data into at most n clusters.

Details and Options

  • FindClusters works for a variety of data types, including numerical, textual, and image, as well as dates and times.
  • The following options can be given:
    CriterionFunction    Automatic    criterion for selecting a method
    DistanceFunction     Automatic    the distance function to use
    Method               Automatic    what method to use
    PerformanceGoal      Automatic    aspect of performance to optimize
    Weights              Automatic    what weight to give to each example
  • By default, the following distance functions are used for different types of elements:
    ColorDistance           colors
    EditDistance            strings
    EuclideanDistance       numeric data
    ImageDistance           images
    JaccardDissimilarity    Boolean data
  • The setting for DistanceFunction can be any distance or dissimilarity function, or a function f defining a distance between two values.
  • Possible settings for PerformanceGoal include:
    Automatic    automatic tradeoff among speed, accuracy, and memory
    "Quality"    maximize the accuracy of the classifier
    "Speed"      maximize the speed of the classifier
  • Possible settings for Method include:
    Automatic                    automatically select a method
    "Agglomerate"                single-linkage clustering algorithm
    "DBSCAN"                     density-based spatial clustering of applications with noise
    "NeighborhoodContraction"    displace examples toward high-density region
    "JarvisPatrick"              Jarvis–Patrick clustering algorithm
    "KMeans"                     k-means clustering algorithm
    "MeanShift"                  mean-shift clustering algorithm
    "KMedoids"                   partitioning around medoids
    "SpanningTree"               minimum spanning tree-based clustering algorithm
    "Spectral"                   spectral clustering algorithm
    "GaussianMixture"            variational Gaussian mixture algorithm
  • The methods "KMeans" and "KMedoids" can only be used when the number of clusters is specified.
  • Possible settings for CriterionFunction include:
    "StandardDeviation"    root-mean-square standard deviation
    "RSquared"             R-squared
    "Dunn"                 Dunn index
    "CalinskiHarabasz"     Calinski–Harabasz index
    "DaviesBouldin"        Davies–Bouldin index
    Automatic              internal index
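
The option settings listed above might be combined as in the following sketch. The sample points and the particular settings chosen are illustrative assumptions rather than part of the reference text, and results can differ between versions, so no outputs are shown.

```wolfram
(* hypothetical sample data: two loose groups of 2D points *)
pts = {{1, 2}, {2, 1}, {1.5, 1.5}, {10, 10}, {11, 9}, {10.5, 9.5}};

(* use a built-in distance function in place of the default EuclideanDistance *)
FindClusters[pts, DistanceFunction -> ManhattanDistance]

(* "KMeans" requires the number of clusters, so n = 2 is given explicitly *)
FindClusters[pts, 2, Method -> "KMeans"]

(* use the Dunn index as the criterion for selecting a method *)
FindClusters[pts, CriterionFunction -> "Dunn"]

(* a pure function of two arguments can also serve as the distance *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25},
 DistanceFunction -> (Abs[#1 - #2] &)]
```

Each call returns a list of sublists, one sublist per cluster.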

Examples

Basic Examples (4)

Find clusters of nearby values:

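A minimal sketch of such a call, assuming a hypothetical list of integers as data:

```wolfram
(* hypothetical sample data; nearby values should end up in the same sublist *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}]
```

The result is a list of sublists, each holding the elements of one cluster.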

Find exactly four clusters:

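A sketch of the two-argument form, again with a hypothetical list of integers. The second argument is the requested number of clusters, which the usage statement for FindClusters[data,n] describes as an upper bound:

```wolfram
(* hypothetical sample data; the second argument requests four clusters *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}, 4]
```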

Represent clustered elements with the right-hand sides of each rule:

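A sketch of the rule form, assuming hypothetical integer elements paired with the symbols a through h. Clustering is carried out on the left-hand sides, and the corresponding right-hand sides are returned:

```wolfram
(* hypothetical data: numbers on the left are clustered, symbols on the right are returned *)
FindClusters[{1 -> a, 2 -> b, 10 -> c, 12 -> d, 3 -> e, 1 -> f, 13 -> g, 25 -> h}]

(* the equivalent list -> list form *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25} -> {a, b, c, d, e, f, g, h}]
```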

Represent clustered elements with the keys of the association:

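A sketch of the association form, assuming hypothetical keys a through h with integer values. Clustering is carried out on the values, and the corresponding keys are returned:

```wolfram
(* hypothetical data: values are clustered, keys are returned in each cluster *)
FindClusters[<|a -> 1, b -> 2, c -> 10, d -> 12, e -> 3, f -> 1, g -> 13, h -> 25|>]
```
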
Introduced in 2007 (6.0) | Updated in 2016 (11.0)