"Agglomerate" (Machine Learning Method)

Details & Suboptions

  • "Agglomerate" is a hierarchical clustering method. "Agglomerate" works well when clusters have similar densities and are isotropic; however, it can fail when clusters have different sizes and it is sensitive to the choice of the dissimilarity function.
  • The following plots show the results of the "Agglomerate" method applied to toy datasets:
  • "Agglomerate" constructs a hierarchy of clusters. At the beginning of the procedure, the method assigns every data point to its own cluster, then it iteratively merges similar clusters. At each iteration, the two most similar clusters are combined. Merging stops when the specified number of clusters is reached. When the number of clusters is not specified, merging stops when the dissimilarity of every remaining cluster pair is above a given threshold.
  • The following suboption can be given:
  • ClusterDissimilarityFunction    "Single"    linkage for the cluster dissimilarity
  • Cluster dissimilarity is defined by a linkage function, which can be specified by the suboption ClusterDissimilarityFunction. Possible settings for ClusterDissimilarityFunction include:
  • "Single"             distance between the clusters' closest points
    "Complete"           distance between the clusters' furthest points
    "Average"            average distance between the clusters' points
    "Centroid"           distance between the cluster means
    "Median"             distance between the cluster medians
    "Ward"               average squared distance between the clusters' points
    "WeightedAverage"    weighted-average intercluster dissimilarity
    f                    pure function
  • "Complete" linkage tends to break up large clusters, in contrast to "Average", "Centroid", "Median" and "Ward", which tend to find relatively compact, well-separated clusters of regular size.
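
The merge procedure and a few of the linkages described above can be sketched directly in the Wolfram Language. This is a naive illustration written for clarity and is not how the built-in method is implemented. The names pairDistances, singleLinkage, completeLinkage, averageLinkage, centroidLinkage and naiveAgglomerate are hypothetical helpers, and Euclidean distance between points is an assumption.

```wolfram
(* Hypothetical helpers for illustration only; none of these names are built in. *)

(* All pairwise Euclidean distances between the points of clusters a and b, as machine numbers. *)
pairDistances[a_List, b_List] :=
 Flatten[Outer[N[EuclideanDistance[#1, #2]] &, a, b, 1]]

singleLinkage[a_, b_] := Min[pairDistances[a, b]]      (* closest points *)
completeLinkage[a_, b_] := Max[pairDistances[a, b]]    (* furthest points *)
averageLinkage[a_, b_] := Mean[pairDistances[a, b]]    (* average over all point pairs *)
centroidLinkage[a_, b_] := N[EuclideanDistance[Mean[a], Mean[b]]]  (* cluster means *)

(* Start with one cluster per point, then repeatedly merge the pair of clusters
   with the smallest dissimilarity until only k clusters remain. *)
naiveAgglomerate[points_List, k_Integer?Positive, linkage_ : singleLinkage] :=
 Module[{clusters = List /@ points, pairs, best},
  While[Length[clusters] > k,
   (* index pairs of all current clusters *)
   pairs = Subsets[Range[Length[clusters]], {2}];
   (* the pair with minimal cluster dissimilarity; ties go to the first pair found *)
   best = First[MinimalBy[pairs, linkage[clusters[[#[[1]]]], clusters[[#[[2]]]]] &]];
   (* remove both clusters and append their union *)
   clusters = Append[Delete[clusters, List /@ best], Join @@ clusters[[best]]]];
  clusters]

(* Example call on made-up points, asking for three clusters with complete linkage. *)
naiveAgglomerate[{{0, 0}, {0, 1}, {5, 5}, {5, 6}, {10, 0}}, 3, completeLinkage]
```

The sketch recomputes every pairwise dissimilarity at each iteration, so it is only suitable for small inputs. It covers the fixed-count stopping rule only, not the threshold-based one.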

Examples

Basic Examples  (3)

Find clusters of nearby values using the "Agglomerate" method:
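
A minimal input of this form, assuming an arbitrary list of made-up sample values, could look like this:

```wolfram
(* Sample values are made up for illustration. *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}, Method -> "Agglomerate"]
```

Because no cluster count is given, the number of clusters is chosen by the method.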

Find four clusters of nearby values using the "Agglomerate" clustering method:
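
As a sketch with the same kind of made-up sample values, the cluster count is passed as the second argument of FindClusters:

```wolfram
(* Sample values are made up for illustration; 4 is the requested number of clusters. *)
FindClusters[{1, 2, 10, 12, 3, 1, 13, 25}, 4, Method -> "Agglomerate"]
```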

Obtain clusters in a list of colors using the "Agglomerate" method:
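
One possible input, assuming a small hand-picked list of colors rather than the list used on the original page:

```wolfram
(* The color list is an arbitrary example. *)
colors = {Red, Orange, Pink, Blue, Cyan, Darker[Blue], Green, Darker[Green]};
FindClusters[colors, Method -> "Agglomerate"]
```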

Create a ClassifierFunction to identify two clusters of images using the "Agglomerate" method:


Find the cluster assignments and gather the images by their corresponding clusters:

Options  (2)

"ClusterDissimilarityFunction"  (2)

Generate groups of two-dimensional, normally distributed data points:

Obtain nearby clusters using different ClusterDissimilarityFunction settings:
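
A hedged sketch of such a comparison follows. The data, the seed, the cluster count and the choice of linkages are all assumptions made for illustration. The suboption is written as a string, following the heading of this subsection; the Details section names it as the symbol ClusterDissimilarityFunction, so the exact spelling accepted should be checked against your version.

```wolfram
(* Made-up data: two groups of two-dimensional, normally distributed points. *)
SeedRandom[1];
data = Join[
   RandomVariate[MultinormalDistribution[{0, 0}, IdentityMatrix[2]], 50],
   RandomVariate[MultinormalDistribution[{5, 5}, IdentityMatrix[2]], 50]];

(* For each linkage, request two clusters and report the cluster sizes. *)
Table[
 link -> Length /@ FindClusters[data, 2,
    Method -> {"Agglomerate", "ClusterDissimilarityFunction" -> link}],
 {link, {"Single", "Complete", "Average", "Ward"}}]
```

Comparing cluster sizes rather than the clusters themselves keeps the output short; replacing Length /@ with ListPlot shows the groupings visually.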

Obtain nearby clusters in a list of colors using the "Agglomerate" method:

Obtain nearby clusters in a list of colors using different cluster dissimilarities: