FindClusters

FindClusters[{e1,e2,…}]

partitions the ei into clusters of similar elements.

FindClusters[{e1->v1,e2->v2,…}]

returns the vi corresponding to the ei in each cluster.

FindClusters[{e1,e2,…}->{v1,v2,…}]

gives the same result.

FindClusters[<|label1->e1,label2->e2,…|>]

returns the labeli corresponding to the ei in each cluster.

FindClusters[data,n]

partitions data into at most n clusters.
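
The two rule-based forms above can be compared side by side. This is a minimal sketch: the names elements and values and their contents are illustrative assumptions, not data from this page.

```wolfram
(* Illustrative data; the names and contents are assumptions *)
elements = {1, 2, 10, 12, 3, 1, 13, 25};
values = {"a", "b", "c", "d", "e", "f", "g", "h"};

(* List-of-rules form: {e1 -> v1, e2 -> v2, ...} *)
FindClusters[Thread[elements -> values]]

(* Single-rule form: {e1, e2, ...} -> {v1, v2, ...} *)
FindClusters[elements -> values]
```

In both calls the clustering is computed on the elements, while the returned clusters contain the corresponding values.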

Details and Options

  • FindClusters works for a variety of data types, including numerical, textual, and image, as well as dates and times.
  • The following options can be given:
    CriterionFunction    Automatic    criterion for selecting a method
    DistanceFunction     Automatic    the distance function to use
    FeatureNames         Automatic    feature names to assign for input data
    FeatureTypes         Automatic    feature types to assume for input data
    Method               Automatic    what method to use
    PerformanceGoal      Automatic    aspect of performance to optimize
    Weights              Automatic    what weight to give to each example
  • By default, FindClusters will preprocess the data automatically unless a DistanceFunction is specified.
  • The setting for DistanceFunction can be any distance or dissimilarity function, or a function f defining a distance between two values.
  • Possible settings for PerformanceGoal include:
    Automatic    automatic tradeoff among speed, accuracy, and memory
    "Quality"    maximize the accuracy of the classifier
    "Speed"      maximize the speed of the classifier
  • Possible settings for Method include:
    Automatic                    automatically select a method
    "Agglomerate"                single-linkage clustering algorithm
    "DBSCAN"                     density-based spatial clustering of applications with noise
    "NeighborhoodContraction"    displace examples toward high-density region
    "JarvisPatrick"              Jarvis–Patrick clustering algorithm
    "KMeans"                     k-means clustering algorithm
    "MeanShift"                  mean-shift clustering algorithm
    "KMedoids"                   partitioning around medoids
    "SpanningTree"               minimum spanning tree-based clustering algorithm
    "Spectral"                   spectral clustering algorithm
    "GaussianMixture"            variational Gaussian mixture algorithm
  • The methods "KMeans" and "KMedoids" can only be used when the number of clusters is specified.
  • Possible settings for CriterionFunction include:
    "StandardDeviation"    root-mean-square standard deviation
    "RSquared"             R-squared
    "Dunn"                 Dunn index
    "CalinskiHarabasz"     Calinski–Harabasz index
    "DaviesBouldin"        Davies–Bouldin index
    Automatic              internal index
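
The option settings listed above are passed as rules after the data. The following is a minimal sketch of the syntax, not a recommended configuration; the data list is an illustrative assumption, and the particular settings are chosen only to show how each option is written.

```wolfram
(* Illustrative data; not taken from the original examples *)
data = {1, 2, 10, 12, 3, 1, 13, 25};

(* Use a built-in distance function *)
FindClusters[data, DistanceFunction -> ManhattanDistance]

(* Use a pure function f defining a distance between two values *)
FindClusters[data, DistanceFunction -> (Abs[#1 - #2] &)]

(* "KMeans" can only be used when the number of clusters is specified *)
FindClusters[data, 3, Method -> "KMeans"]

(* Choose a criterion for selecting a method and favor speed *)
FindClusters[data, CriterionFunction -> "Dunn", PerformanceGoal -> "Speed"]
```

As noted above, specifying DistanceFunction also turns off the automatic preprocessing of the data.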

Examples


Basic Examples  (4)

Find clusters of nearby values:
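
A minimal input of this kind, using an illustrative list of integers (an assumption, not necessarily the data of the original example):

```wolfram
(* Illustrative data: a few groups of nearby integers *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}]
```

The result is a list of sublists, one per cluster.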


Find exactly four clusters:
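
A sketch of such a call, assuming the same kind of illustrative integer list; the number of clusters is given as the second argument:

```wolfram
(* Illustrative data; the second argument sets the number of clusters *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}, 4]
```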


Represent clustered elements with the right-hand sides of each rule:
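
A sketch with illustrative rules, where the string labels on the right-hand sides are assumptions chosen for readability:

```wolfram
(* Clustering is done on the left-hand sides; the right-hand sides are returned *)
FindClusters[{1 -> "a", 2 -> "b", 10 -> "c", 12 -> "d",
  3 -> "e", 1 -> "f", 13 -> "g", 25 -> "h"}]
```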


Represent clustered elements with the keys of the association:
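
A sketch with an illustrative association, where the string keys are assumptions chosen for readability:

```wolfram
(* Clustering is done on the association's values; the keys are returned *)
FindClusters[<|"a" -> 1, "b" -> 2, "c" -> 10, "d" -> 12,
  "e" -> 3, "f" -> 1, "g" -> 13, "h" -> 25|>]
```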


Scope  (6)

Options  (13)

Applications  (3)

Properties & Relations  (2)

Neat Examples  (2)

See Also

ClusteringComponents  ClusterClassify  Classify  Partition  Split  Gather  Nearest  FindShortestTour  DistanceTransform  MeanShift

Tutorials

Introduced in 2007 (6.0) | Updated in 2016 (11.0)