ClusteringComponents

ClusteringComponents[array]
gives an array in which each element at the lowest level of array is replaced by an integer index representing the cluster in which the element lies.

ClusteringComponents[array,n]
finds at most n clusters.

ClusteringComponents[array,n,level]
finds clusters at the specified level in array.

ClusteringComponents[image]
finds clusters of pixels with similar values in image.

ClusteringComponents[image,n]
finds at most n clusters in image.
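
The calling forms above can be sketched as follows. The sample data (`values`, `points`, `img`) is hypothetical and chosen only so that the groups are easy to see; the cluster indices returned may differ between versions, so no output is shown.

```wolfram
(* Hypothetical sample data, not taken from the original page *)
values = {1.0, 1.2, 0.9, 10.1, 9.8, 10.3};

(* Cluster the individual numbers; the number of clusters is chosen automatically *)
ClusteringComponents[values]

(* Request at most 2 clusters *)
ClusteringComponents[values, 2]

(* Treat each row of a matrix as one element by clustering at level 1 *)
points = {{1, 2}, {1.1, 1.9}, {8, 9}, {8.2, 9.1}};
ClusteringComponents[points, 2, 1]

(* For an image, cluster pixels with similar values *)
img = Image[{{0., 0., 1., 1.}, {0., 0., 1., 1.}}];
ClusteringComponents[img, 2]
```

In each case the result has the shape of the input at the clustered level, with every element replaced by the integer index of its cluster.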

Details and Options

  • ClusteringComponents works for a variety of data types, including numerical, textual, and image, as well as dates and times.
  • The following options can be given:
      CriterionFunction    Automatic    criterion for selecting a method
      DistanceFunction     Automatic    the distance function to use
      Method               Automatic    what method to use
      PerformanceGoal      Automatic    aspect of performance to optimize
      Weights              Automatic    what weight to give to each example
  • By default, the following distance functions are used for different types of elements:
      ColorDistance           colors
      EditDistance            strings
      EuclideanDistance       numeric data
      ImageDistance           images
      JaccardDissimilarity    Boolean data
  • The setting for DistanceFunction can be any distance or dissimilarity function, or a function f defining a distance between two values.
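
As a minimal sketch of the DistanceFunction setting, the example below clusters a few hypothetical 2D points, first with the built-in ManhattanDistance and then with an equivalent pure function. The variable name `points` and the data are assumptions made for illustration only.

```wolfram
(* Hypothetical 2D points forming two loose groups *)
points = {{1, 2}, {1.1, 1.9}, {8, 9}, {8.2, 9.1}};

(* Use a built-in distance function instead of the default EuclideanDistance *)
ClusteringComponents[points, 2, 1, DistanceFunction -> ManhattanDistance]

(* Use a pure function f that takes two elements and returns their distance;
   here it computes the same city-block distance by hand *)
ClusteringComponents[points, 2, 1,
  DistanceFunction -> (Total[Abs[#1 - #2]] &)]
```

The level argument 1 makes each row a single element, so the distance function receives two coordinate vectors at a time.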
  • Possible settings for PerformanceGoal include:
      Automatic    automatic tradeoff among speed, accuracy, and memory
      "Quality"    maximize the accuracy of the clustering
      "Speed"      maximize the speed of the clustering
  • Possible settings for Method include:
      Automatic                    automatically select a method
      "Agglomerate"                single linkage clustering algorithm
      "DBSCAN"                     density-based spatial clustering of applications with noise
      "NeighborhoodContraction"    displace examples toward high-density region
      "JarvisPatrick"              Jarvis–Patrick clustering algorithm
      "KMeans"                     k-means clustering algorithm
      "MeanShift"                  mean-shift clustering algorithm
      "KMedoids"                   partitioning around medoids
      "SpanningTree"               minimum spanning tree-based clustering algorithm
      "Spectral"                   spectral clustering algorithm
      "GaussianMixture"            variational Gaussian mixture algorithm
  • The methods "KMeans" and "KMedoids" can only be used when the number of clusters is specified.
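
The restriction on "KMeans" and "KMedoids" can be illustrated with a short sketch. The data and the variable name `values` are hypothetical; the point is only that the cluster count is passed explicitly for the methods that require it, while a method such as "DBSCAN" can be called without one.

```wolfram
(* Hypothetical one-dimensional data with two obvious groups *)
values = {1.0, 1.2, 0.9, 10.1, 9.8, 10.3};

(* "KMeans" and "KMedoids" need the number of clusters, given here as 2 *)
ClusteringComponents[values, 2, Method -> "KMeans"]
ClusteringComponents[values, 2, Method -> "KMedoids"]

(* "DBSCAN" determines the number of clusters itself *)
ClusteringComponents[values, Method -> "DBSCAN"]
```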
  • Possible settings for CriterionFunction include:
      "StandardDeviation"    root-mean-square standard deviation
      "RSquared"             R-squared
      "Dunn"                 Dunn index
      "CalinskiHarabasz"     Calinski–Harabasz index
      "DaviesBouldin"        Davies–Bouldin index
      Automatic              internal index
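
A minimal sketch of choosing a criterion function by name, using one of the settings listed above. The data is hypothetical, and how strongly the criterion affects the result depends on the data and the method that is selected, so treat this as an illustration of the option syntax rather than of a particular outcome.

```wolfram
(* Hypothetical one-dimensional data with three loose groups *)
values = {1.0, 1.2, 0.9, 5.0, 5.3, 10.1, 9.8, 10.3};

(* Default: an internal index is used to select the method and clustering *)
ClusteringComponents[values]

(* Use the Dunn index as the selection criterion instead *)
ClusteringComponents[values, CriterionFunction -> "Dunn"]

(* Use the Davies–Bouldin index as the selection criterion *)
ClusteringComponents[values, CriterionFunction -> "DaviesBouldin"]
```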

Introduced in 2010 (8.0) | Updated in 2016 (11.0)